Amazon Mechanical Turk

From Wikipedia, the free encyclopedia
Revision as of 19:13, 19 December 2022

Amazon Mechanical Turk (MTurk) is a crowdsourcing website for businesses to hire remotely located "crowdworkers" to perform discrete on-demand tasks that computers are currently unable to do. It is operated under Amazon Web Services, and is owned by Amazon.[1] Employers (known as requesters) post jobs known as Human Intelligence Tasks (HITs), such as identifying specific content in an image or video, writing product descriptions, or answering survey questions. Workers, colloquially known as Turkers or crowdworkers, browse among existing jobs and complete them in exchange for a fee set by the employer. To place jobs, the requesting programs use an open application programming interface (API), or the more limited MTurk Requester site.[2] As of April 2019, Requesters could register from 49 approved countries.[3]

History

The service was conceived by Venky Harinarayan in a US patent disclosure in 2001.[4] Amazon coined the term artificial artificial intelligence for processes that outsource parts of a computer program to humans, for those tasks that humans carry out much faster than computers. It is claimed that Jeff Bezos was responsible for the concept that led to Amazon's Mechanical Turk being developed to realize this process.[5]

The name Mechanical Turk was inspired by "The Turk", an 18th-century chess-playing automaton made by Wolfgang von Kempelen that toured Europe, beating both Napoleon Bonaparte and Benjamin Franklin. It was later revealed that this "machine" was not an automaton at all, but was, in fact, a human chess master hidden in the cabinet beneath the board and controlling the movements of a humanoid dummy. Likewise, the Mechanical Turk online service uses remote human labor hidden behind a computer interface to help employers perform tasks that are not possible using a true machine.

MTurk was launched publicly on November 2, 2005. Following its launch, the Mechanical Turk user base grew quickly. In early- to mid-November 2005, there were tens of thousands of jobs, all of them uploaded to the system by Amazon itself for some of its internal tasks that required human intelligence. HIT types have expanded to include transcribing, rating, image tagging, surveys, and writing.

In March 2007, there were reportedly more than 100,000 workers in over 100 countries.[6] This increased to over 500,000 registered workers from over 190 countries in January 2011.[7] In the same year, Techlist published an interactive map pinpointing the locations of 50,000 of their MTurk workers around the world.[8] By 2018, research had demonstrated that while there were over 100,000 workers available on the platform at any time, only around 2000 were actively working.[9]

Overview

A user of Mechanical Turk can be either a "Worker" (contractor) or a "Requester" (employer). Workers have access to a dashboard that displays three sections: total earnings, HIT status, and HIT totals. Workers set their own hours and are not under any obligation to accept any particular task. Amazon classifies Workers as contractors rather than employees and does not file forms or pay payroll taxes, allowing it to avoid minimum-wage, overtime, and workers' compensation obligations. Workers must report their income as self-employment income. In 2013, the average wage for the multiple microtasks assigned, if performed quickly, was about one dollar an hour, with each task averaging a few cents.[10] Workers can have a postal address anywhere in the world. Payment for completing tasks can be redeemed on Amazon.com via gift certificate (gift certificates are the only payment option available to international workers, apart from India) or be later transferred to a Worker's U.S. bank account.

Requesters can ask that Workers fulfill qualifications before engaging in a task, and they can set up a test in order to verify the qualification. They can also accept or reject the result sent by the Worker, which affects the Worker's reputation. As of April 2019, Requesters paid Amazon a minimum 20% commission on the price of successfully completed jobs, with increased amounts for additional services.[6] Requesters can use the Amazon Mechanical Turk API to programmatically integrate the results of that work directly into their business processes and systems. When employers set up their job, they must specify

  • how much they are paying for each completed HIT,
  • how many workers they want to work on each HIT,
  • the maximum time a worker may spend on a single task,
  • how much time workers have to complete the work,

as well as the specific details about the job they want to be completed.
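As an illustration, the settings a Requester specifies map onto the MTurk API's CreateHIT operation roughly as sketched below. The parameter names follow the real API, but the values are hypothetical, and the actual AWS call is shown only in a comment:

```python
# Hypothetical sketch of the parameters a Requester sets when creating a HIT.
# Keys follow the MTurk CreateHIT operation; values are made up for illustration.
hit_params = {
    "Title": "Categorize product images",
    "Description": "Choose the best category for each image.",
    "Reward": "0.05",                    # payment per completed HIT, in USD (a string in this API)
    "MaxAssignments": 3,                 # how many workers should complete each HIT
    "AssignmentDurationInSeconds": 600,  # maximum time a worker may spend on a single task
    "LifetimeInSeconds": 86400,          # how long the HIT remains available to workers
    "Question": "<QuestionForm>...</QuestionForm>",  # the task details, as XML
}

# With the AWS SDK for Python, this would be submitted as:
#   import boto3
#   mturk = boto3.client("mturk")
#   response = mturk.create_hit(**hit_params)
```

The assignment duration is deliberately shorter than the HIT lifetime: the former bounds one worker's session, while the latter bounds how long the job stays listed.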

Location of Turkers

Workers have been primarily located in the United States since the platform's inception[11] with demographics generally similar to the overall Internet population in the US.[12] Research shows that within the US workers are fairly evenly spread across states, proportional to each state’s share of the US population.[13] In addition, studies suggest between 15 and 30 thousand people within the US complete at least one HIT each month and about 4,500 new people join MTurk each month.[14]

In 2010, cash payments for Indian workers were introduced, which gave new and updated results on the demographics of workers, who remained primarily within the United States.[15] The researcher behind these statistics runs a website showing worker demographics, updated hourly. In May 2015, it showed that 80% of workers were located in the United States, with the remaining 20% located elsewhere in the world, most of whom were in India.[16] As of May 2019, it showed that approximately 60% of workers were located in the United States and 40% are located elsewhere in the world; approximately 30% are in India.[17]

Uses

Human-subject research

Since 2010, numerous researchers have explored the viability of Mechanical Turk as a way to recruit subjects for social science experiments. Researchers have generally found that while samples of respondents obtained through Mechanical Turk do not perfectly match all relevant characteristics of the US population, they are also not wildly misrepresentative.[18][19] As a result, thousands of papers that rely on data collected from Mechanical Turk workers are published each year, including hundreds in top-ranked academic journals.

Over the years, a challenge with using MTurk for human-subject research has been maintaining data quality. A study published in 2021 found that the types of quality control approaches used by researchers (such as checking for bots, VPN users, or workers willing to submit dishonest responses) can meaningfully influence survey results, demonstrating this through their impact on three common behavioral/mental healthcare screening tools.[20] Even though managing data quality requires work from researchers, there is a large body of research showing how to gather high-quality data from MTurk.[21][22][23]

The general consensus among researchers is that the service works best for recruiting a diverse sample; it is less successful with studies that require more precisely defined populations or that require a representative sample of the population as a whole.[24] Many papers have been published on the demographics of the MTurk population.[13][25][26] Much of this data shows that MTurk workers tend to be younger, more educated, more liberal, and slightly less wealthy than the US population overall.[27] Meanwhile, the cost of using MTurk is considerably lower than many other means of conducting surveys, so many researchers continue to use it.

Machine Learning

Supervised machine learning algorithms require large amounts of human-annotated data to be trained successfully. Machine learning researchers have hired Workers through Mechanical Turk to produce datasets such as SQuAD, a question answering dataset.[28]
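One common technique for turning redundant crowd annotations into a single training label is majority voting: each item is assigned to several workers and the most frequent answer wins. A minimal sketch, using entirely hypothetical annotations:

```python
from collections import Counter

def majority_label(worker_labels):
    """Return the most common label among redundant worker annotations."""
    counts = Counter(worker_labels)
    label, _ = counts.most_common(1)[0]
    return label

# Hypothetical annotations from three workers for two images.
annotations = {
    "img_001": ["cat", "cat", "dog"],
    "img_002": ["dog", "dog", "dog"],
}

# Collapse the redundant judgments into one label per item.
dataset = {item: majority_label(labels) for item, labels in annotations.items()}
# dataset == {"img_001": "cat", "img_002": "dog"}
```

Real annotation pipelines often go further (weighting workers by past accuracy, or resolving ties by adding more assignments), but the voting step above is the core idea.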

Missing persons searches

Since 2007, the service has been used to search for prominent missing individuals. It was first suggested during the search for James Kim, but his body was found before any technical progress was made. That summer, computer scientist Jim Gray disappeared on his yacht and Amazon's Werner Vogels, a personal friend, made arrangements for DigitalGlobe, which provides satellite data for Google Maps and Google Earth, to put recent photography of the Farallon Islands on Mechanical Turk. A front-page story on Digg attracted 12,000 searchers who worked with imaging professionals on the same data. The search was unsuccessful.[29]

In September 2007, a similar arrangement was repeated in the search for aviator Steve Fossett. Satellite data was divided into 85-square-meter sections, and Mechanical Turk users were asked to flag images with "foreign objects" that might be a crash site or other evidence that should be examined more closely.[30] This search was also unsuccessful. The satellite imagery was mostly within a 50-mile radius,[31] but the crash site was eventually found by hikers about a year later, 65 miles away.[32]

Artistic works

In addition to receiving growing interest from the social sciences, MTurk has also been used as a tool for artistic creation. One of the first artists to work with Mechanical Turk was xtine burrough, with The Mechanical Olympics (2008),[33][34] Endless Om (2015) and Mediations on Digital Labor (2015).[35][36] Other works include artist Aaron Koblin's Ten Thousand Cents (2008).

Third-party programming

Programmers have developed various browser extensions and scripts designed to simplify the process of completing jobs. Amazon has stated that it disapproves of scripts that completely automate the process and preclude the human element, out of concern that the task completion process (e.g., answering a survey) could be gamed with random responses, rendering the collected data worthless.[37] Accounts using so-called automated bots have been banned. There are also third-party services that extend the capabilities of MTurk.

API

Amazon makes available an application programming interface (API) to give users another access point into the MTurk system. The MTurk API lets a programmer access numerous aspects of MTurk like submitting jobs, retrieving completed work, and approving or rejecting that work.[38] In 2017, Amazon launched support for AWS Software Development Kits (SDK), allowing for nine new SDKs available to MTurk Users. MTurk is accessible via API from the following languages: Python, JavaScript, Java, .NET, Go, Ruby, PHP or C++.[39] Web sites and web services can use the API to integrate MTurk work into other web applications, providing users with alternatives to the interface Amazon has built for these functions.
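A typical use of the API is retrieving completed work and deciding what to approve. The sketch below operates on dictionaries shaped like the real ListAssignmentsForHIT response, but the data is hypothetical and the network calls are shown only in comments:

```python
# Sketch of post-processing results fetched through the MTurk API.
# A real script would first fetch assignments with:
#   mturk = boto3.client("mturk")
#   resp = mturk.list_assignments_for_hit(HITId="...")
#   assignments = resp["Assignments"]

def split_by_status(assignments):
    """Partition assignments into those awaiting review ("Submitted") and the rest."""
    pending = [a for a in assignments if a["AssignmentStatus"] == "Submitted"]
    done = [a for a in assignments if a["AssignmentStatus"] != "Submitted"]
    return pending, done

# Hypothetical assignments mimicking the API's response shape.
sample = [
    {"AssignmentId": "A1", "AssignmentStatus": "Submitted", "Answer": "<xml/>"},
    {"AssignmentId": "A2", "AssignmentStatus": "Approved", "Answer": "<xml/>"},
]
pending, done = split_by_status(sample)

# A Requester would then approve each pending item, e.g.:
#   for a in pending:
#       mturk.approve_assignment(AssignmentId=a["AssignmentId"])
```

Approving or rejecting through the API is what lets businesses fold the review step directly into their own systems rather than using Amazon's web interface.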

(Figure: entity-relationship diagram of Workers, HITs, and assignments)

Use case examples

Processing photos / videos

Amazon Mechanical Turk provides a platform for processing images, a task well suited to human intelligence. Requesters have created tasks asking workers to label objects found in an image, select the most relevant picture in a group of pictures, screen inappropriate content, and classify objects in satellite images. Crowdworkers have also digitized text from images, such as scanned forms filled out by hand.[40]

Data cleaning / verification

Companies with large online catalogues use Mechanical Turk to identify duplicates and verify details of item entries. Examples of fixing duplicates include identifying and removing duplicate yellow-pages directory listings and duplicate online product catalog entries. Examples of verifying details include checking restaurant information (e.g., phone numbers and hours) and finding contact information on web pages (e.g., author names and email addresses).[10][40]
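Duplicate identification is a matching problem: two listings with superficially different text may describe the same entity. On MTurk that judgment is made by humans; a requester might run a cheap automatic pass like the hypothetical sketch below first, and send only borderline pairs to workers:

```python
import difflib

def normalize(entry):
    """Crude normalization for comparing catalog listings (illustrative only)."""
    return " ".join(entry.lower().split())

def likely_duplicates(a, b, threshold=0.9):
    """Flag two listings as probable duplicates by string similarity."""
    ratio = difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold

print(likely_duplicates("Joe's Pizza  123 Main St", "joe's pizza 123 main st"))  # True
print(likely_duplicates("Joe's Pizza", "Thai Palace"))                           # False
```

The threshold and normalization here are placeholders; the point is that automated similarity scores leave a gray zone of near-matches, which is exactly the part of the task that gets posted as HITs.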

Information collection

The diversity and scale of Mechanical Turk's workforce allow information to be collected at a scale that would be difficult outside of a crowd platform. Mechanical Turk allows Requesters to amass a large number of responses to various types of surveys, from basic demographics to academic research. Other uses include writing comments, descriptions, and blog entries for websites, and searching for data elements or specific fields in large government and legal documents.[40]

Data processing

Companies use Mechanical Turk's crowd labor to understand and respond to different types of data. Common uses include editing and transcription of podcasts, translation, and matching search engine results.[10][40]

Research validity

The validity of research conducted with the Mechanical Turk worker pool has been questioned.[41][42] This is in large part due to the proprietary method that Mechanical Turk, like most other prominent panel providers, uses to select its workers.[43] The platform may also skew toward young and educated people compared to other panel providers, which calls into question the relevance of samples intended to represent the general population.[44] Since the method of selection is not shared with researchers, they cannot know the true demographics of the pool of participants. It is unclear whether Mechanical Turk uses fiscal, political, or educational limiters in its selection process. This may invalidate surveys or research done using the Mechanical Turk worker pool.[45][46]

Systems such as Mechanical Turk have been criticized for encouraging conformity among workers, effectively penalizing the workers who train these systems for having any independent perspective, and potentially reinforcing inherent prejudices.[47]

Labor issues

Mechanical Turk has been widely criticized for its interactions with and use of labor. Computer scientist Jaron Lanier notes how the design of Mechanical Turk "allows you to think of the people as software components" that conjures "a sense of magic, as if you can just pluck results out of the cloud at an incredibly low cost".[48] While a survey done by researchers at the University of Texas showed that the surveyed Workers were motivated by enjoyment and self-fulfillment,[49] these results may have been prejudiced by MTurk's Worker selection algorithms. A 2016 Pew Research study found that a quarter of online "gig workers" like those who work on Mechanical Turk do so because there are limited employment opportunities where they live.[50]

Monetary compensation

Because tasks are typically simple and repetitive and users are paid often only a few cents to complete them, some have criticized Mechanical Turk for exploiting and not compensating workers for the true value of the task they complete.[51] The minimum payment that Amazon allows for a task is one cent. The market for tasks is competitive and for some these tasks are their only available form of employment, particularly for the less educated. Because of the need to provide for themselves and a lack of other opportunities, many workers accept the low compensation for the completion of tasks. A study of 3.8 million tasks completed by 2,767 workers on Amazon's Mechanical Turk showed that "workers earned a median hourly wage of about $2 an hour" with 4 percent of workers earning more than $7.25 per hour. Since these workers are considered independent contractors, they are not protected by the Fair Labor Standards Act that guarantees minimum wage. By 2018, the increasing number of workers competing on the site reduced the total amount of work available. As workers search for tasks, they do not receive compensation nor do they receive additional compensation if a task takes longer than estimated by the requester.[50]
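The wage figures above come from dividing the payment for each task by the time spent on it. A sketch of that arithmetic, with an entirely hypothetical task log (note that unpaid time spent searching for tasks is excluded, which is one reason such figures can still overstate take-home pay):

```python
from statistics import median

def hourly_wage(reward_usd, seconds_worked):
    """Effective hourly wage for a single task."""
    return reward_usd / (seconds_worked / 3600)

# Hypothetical task log: (reward in USD, time spent in seconds).
log = [(0.05, 90), (0.10, 240), (0.25, 300)]

wages = [hourly_wage(reward, seconds) for reward, seconds in log]
print(round(median(wages), 2))  # 2.0
```

With these made-up numbers the median lands near the $2/hour range the cited study reports, but each individual task pays only a few cents, matching the article's description of typical HIT pricing.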

Professor Miriam Cherry, Saint Louis University School of Law, has argued that even independent contractors should be entitled to basic protections, saying that workers on Mechanical Turk are "no different than construction workers who show up at job sites and work for a day or two on a project". Those construction workers can still file a lawsuit under the Fair Labor Standards Act for wage theft, even though they are not considered employees.[50]

Fraud

The Nation magazine reported in 2014 that some Requesters had taken advantage of Workers by having them complete tasks, then rejecting their submissions in order to avoid paying them.[52]

In the Facebook–Cambridge Analytica data scandal, Mechanical Turk was one of the means of covertly gathering private information for a massive database.[53] Workers were paid a dollar or two to install a Facebook-connected app and answer personal questions. Although the survey task appeared to be part of a demographic or psychological research project, its actual purpose was to bait workers into revealing personal information about their identities that Facebook and Mechanical Turk had not already collected.

Labor relations

Others have criticized the marketplace for giving workers no ability to negotiate with employers. In response to growing criticism of payment evasion and lack of representation, a group developed a third-party platform called Turkopticon, which allows workers to give feedback on their employers so that other users can avoid potentially unscrupulous jobs and find superior employers.[54][55] Another platform, Dynamo, was created to let workers organize anonymously and run campaigns to improve their work environment, including the Guidelines for Academic Requesters and the Dear Jeff Bezos Campaign.[56][57][58][59] Amazon made it harder for workers to enroll in Dynamo by closing the requester account that provided workers with a code required for Dynamo membership. Amazon has also installed updates that prevent plugins that identify high-quality human intelligence tasks from functioning on the website.[50] Additionally, workers have complained that Amazon's payment system occasionally stops working, a major issue for workers who depend on daily payments.[50]

Related systems

Mechanical Turk is comparable in some respects to the now-discontinued Google Answers service. However, Mechanical Turk is a more general marketplace that can potentially help distribute any kind of work task all over the world. The Collaborative Human Interpreter (CHI) by Philipp Lenssen also suggested using distributed human intelligence to help computer programs perform tasks that computers cannot do well. MTurk could be used as the execution engine for the CHI.[citation needed]

In 2014, the Russian search company Yandex launched Toloka, a crowdsourcing platform similar to Mechanical Turk.[60]

See also

References

  1. ^ "Amazon Mechanical Turk, FAQ page". Retrieved 14 April 2017.
  2. ^ "Overview | Requester | Amazon Mechanical Turk". Requester.mturk.com. Retrieved 2011-11-28.
  3. ^ "Amazon Mechanical Turk". www.mturk.com.
  4. ^ Multiple sources:
  5. ^ "Artificial artificial intelligence". The Economist. 2006-06-10.
  6. ^ a b "Mturk pricing". AWS. Amazon. 2019. Retrieved 16 April 2019.
  7. ^ "AWS Developer Forums". Retrieved 14 November 2012.
  8. ^ Tamir, Dahn. "50000 Worldwide Mechanical Turk Workers". techlist. Retrieved September 17, 2014.
  9. ^ Djellel, Difallah; Filatova, Elena; Ipeirotis, Panos (2018). "Demographics and dynamics of mechanical turk workers" (PDF). Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining: 135–143. doi:10.1145/3159652.3159661. ISBN 9781450355810. S2CID 22339115.
  10. ^ a b c "Amazon Mechanical Turk: The Digital Sweatshop" Ellen Cushing Utne Reader January–February 2013
  11. ^ Panos Ipeirotis (March 19, 2008). "Mechanical Turk: The Demographics". New York University. Retrieved 2009-07-30.
  12. ^ Panos Ipeirotis (March 16, 2009). "Turker Demographics vs Internet Demographics". New York University. Retrieved 2009-07-30.
  13. ^ a b Litman, Leib (2020). Conducting online research on Amazon Mechanical Turk and beyond. Jonathan Robinson (1st ed.). Los Angeles. ISBN 978-1-5063-9111-3. OCLC 1180179545.
  14. ^ Robinson, Jonathan; Rosenzweig, Cheskie; Moss, Aaron J.; Litman, Leib (2019-12-16). Sudzina, Frantisek (ed.). "Tapped out or barely tapped? Recommendations for how to harness the vast and largely unused potential of the Mechanical Turk participant pool". PLOS ONE. 14 (12): e0226394. doi:10.1371/journal.pone.0226394. ISSN 1932-6203. PMC 6913990. PMID 31841534.
  15. ^ Panos Ipeirotis (March 9, 2010). "The New Demographics of Mechanical Turk". New York University. Retrieved 2014-03-24.
  16. ^ "MTurk Tracker". demographics.mturk-tracker.com. Retrieved 1 October 2015.
  17. ^ "MTurk Tracker". demographics.mturk-tracker.com. Retrieved 2 May 2019.
  18. ^ Casey, Logan; Chandler, Jesse; Levine, Adam; Proctor, Andrew; Sytolovich, Dara (2017). "Intertemporal Differences Among MTurk Workers: Time-Based Sample Variations and Implications for Online Data Collection". SAGE Open. 7 (2): 215824401771277. doi:10.1177/2158244017712774.
  19. ^ Levay, Kevin; Freese, Jeremy; Druckman, James (2016). "The Demographic and Political Composition of Mechanical Turk Samples". SAGE Open. 6: 215824401663643. doi:10.1177/2158244016636433.
  20. ^ Agley, Jon; Xiao, Yunyu; Nolan, Rachael; Golzarri-Arroyo, Lilian (2021). "Quality control questions on Amazon's Mechanical Turk (MTurk): A randomized trial of impact on the USAUDIT, PHQ-9, and GAD-7". Behavior Research Methods. 54 (2): 885–897. doi:10.3758/s13428-021-01665-8. ISSN 1554-3528. PMC 8344397. PMID 34357539.
  21. ^ Hauser, David; Paolacci, Gabriele; Chandler, Jesse J. (2018-09-01). "Common Concerns with MTurk as a Participant Pool: Evidence and Solutions". doi:10.31234/osf.io/uq45c.
  22. ^ Clifford, Scott; Jerit, Jennifer (2016). "Cheating on Political Knowledge Questions in Online Surveys: An Assessment of the Problem and Solutions". Public Opinion Quarterly. 80 (4): 858–887. doi:10.1093/poq/nfw030. ISSN 0033-362X.
  23. ^ Hauser, David J.; Moss, Aaron J.; Rosenzweig, Cheskie; Jaffe, Shalom N.; Robinson, Jonathan; Litman, Leib (2022-11-03). "Evaluating CloudResearch's Approved Group as a solution for problematic data quality on MTurk". Behavior Research Methods. doi:10.3758/s13428-022-01999-x. ISSN 1554-3528.
  24. ^ Chandler, Jesse.; Shapiro, Danielle (2016). "Conducting Clinical Research Using Crowdsourced Convenience Samples". Annual Review of Clinical Psychology. 12: 53–81. doi:10.1146/annurev-clinpsy-021815-093623. PMID 26772208.
  25. ^ Huff, Connor; Tingley, Dustin (2015-07-01). ""Who are these people?" Evaluating the demographic characteristics and political preferences of MTurk survey respondents". Research & Politics. 2 (3): 205316801560464. doi:10.1177/2053168015604648. ISSN 2053-1680.
  26. ^ Clifford, Scott; Jewell, Ryan M; Waggoner, Philip D (2015-10-01). "Are samples drawn from Mechanical Turk valid for research on political ideology?". Research & Politics. 2 (4): 205316801562207. doi:10.1177/2053168015622072. ISSN 2053-1680.
  27. ^ Chandler, Jesse; Rosenzweig, Cheskie; Moss, Aaron J.; Robinson, Jonathan; Litman, Leib (October 2019). "Online panels in social science research: Expanding sampling methods beyond Mechanical Turk". Behavior Research Methods. 51 (5): 2022–2038. doi:10.3758/s13428-019-01273-7. ISSN 1554-3528. PMC 6797699. PMID 31512174.
  28. ^ Rajpurkar, Pranav; Zhang, Jian; Lopyrev, Konstantin; Liang, Percy (2016). "SQuAD: 100,000+ Questions for Machine Comprehension of Text". arXiv:1606.05250 [cs.CL].
  29. ^ Steve Silberman (July 24, 2007). "Inside the High-Tech Search for a Silicon Valley Legend". Wired magazine. Retrieved 2007-09-16.
  30. ^ "AVweb Invites You to Join the Search for Steve Fossett". Avweb.com. 8 September 2007. Retrieved 2011-11-28.
  31. ^ "Official Mechanical Turk Steve Fossett Results". 2007-09-24. Retrieved 2012-08-14.
  32. ^ Jim Christie (October 1, 2008). "Hikers find Steve Fossett's ID, belongings". Reuters. Archived from the original on 20 December 2008. Retrieved 2008-11-27.
  33. ^ "Let's Get Physical".
  34. ^ "Mechanical Games, online sports video for turkers | Neural".
  35. ^ "Jail Benches and Amazon.com at SanTana's Grand Central Art Center | OC Weekly". Archived from the original on 2015-09-06. Retrieved 2019-04-16.
  36. ^ Project: http://www.missconceptions.net/mediations/
  37. ^ "Amazon Web Services Blog: Amazon Mechanical Turk Status Update". Aws.typepad.com. 2005-12-06. Retrieved 2011-11-28.
  38. ^ "Documentation Archive : Amazon Web Services". Developer.amazonwebservices.com. Archived from the original on 2009-04-10. Retrieved 2011-11-28.
  39. ^ "Amazon Mechanical Turk API Reference". Developer.amazonwebservices.com.
  40. ^ a b c d "Inside Amazon's clickworker platform: How half a million people are being paid pennies to train AI". TechRepublic. 16 December 2016.
  41. ^ Feldman, Gilad. "Running Experiments with Amazon Mechanical Turk | Gilad Feldman".
  42. ^ Landers, R. N.; Behrend, T. S. (2015). "Can I Use Mechanical Turk (MTurk) for a Research Study?". Industrial and Organizational Psychology. 8 (2).
  43. ^ Lazarus, Jeffrey V. (October 6, 2020). "COVID-SCORE: A global survey to assess public perceptions of government responses to COVID-19 (COVID-SCORE-10)". PLOS ONE. 15 (10). Data Collection. Bibcode:2020PLoSO..1540011L. doi:10.1371/journal.pone.0240011. PMC 7538106. PMID 33022023.
  44. ^ Kimball, Spencer H. (2018). "Survey Data Collection; Online Panel Efficacy. A Comparative Study of Amazon MTurk and Research Now SSI/ Survey Monkey/ Opinion Access" (PDF). Business Press. Archived (PDF) from the original on 2021-10-22. Retrieved October 22, 2021.
  45. ^ "External Validity - Generalizing Results in Research". explorable.com.
  46. ^ "Social Research Methods - Knowledge Base - External Validity". www.socialresearchmethods.net.
  47. ^ Naylor, Aliide (March 8, 2021). "Underpaid Workers Are Being Forced to Train Biased AI on Mechanical Turk". www.vice.com.
  48. ^ Jaron Lanier (2013). Who Owns the Future?. Simon and Schuster. ISBN 978-1-4516-5497-4.
  49. ^ Buhrmester, Michael; Kwang, Tracy; Gosling, Sam (2011). "Amazon's Mechanical Turk A New Source of Inexpensive, Yet High-Quality, Data?". Perspectives on Psychological Science. 6 (1): 3–5. doi:10.1177/1745691610393980. PMID 26162106. S2CID 6331667.
  50. ^ a b c d e Semuels, Alana (23 January 2018). "The Internet Is Enabling a New Kind of Poorly Paid Hell". The Atlantic. Retrieved 25 April 2019.
  51. ^ Schmidt, Florian Alexander (2013). "The Good, The Bad and the Ugly: Why Crowdsourcing Needs Ethics". 2013 International Conference on Cloud and Green Computing. pp. 531–535. doi:10.1109/CGC.2013.89. ISBN 978-0-7695-5114-2. S2CID 18798641.
  52. ^ Marvit, Moshe Z. (February 5, 2014). "How Crowdworkers Became the Ghosts in the Digital Machine". www.thenation.com.
  53. ^ New York Times (April 10, 2018). "Cambridge Analytica and the Coming Data Bust". The New York Times. Retrieved April 13, 2018.
  54. ^ Hal Hodson (February 7, 2013). "Crowdsourcing grows up as online workers unite". New Scientist. Retrieved May 21, 2015.
  55. ^ "turkopticon". turkopticon.ucsd.edu.
  56. ^ Mark Harris (December 3, 2014). "Amazon's Mechanical Turk workers protest: 'I am a human being, not an algorithm'". The Guardian. Retrieved October 6, 2015.
  57. ^ Fingas, Jon (December 3, 2014). "Amazon's Mechanical Turk workers want to be treated like humans". Engadget. Retrieved October 6, 2015.
  58. ^ James Vincent (December 4, 2014). "Amazon's Mechanical Turkers want to be recognized as 'actual human beings'". The Verge. Retrieved October 6, 2015.
  59. ^ Sarah Kessler (February 19, 2015). "What Does a Union Look Like in the Gig Economy?". Fast Company. Retrieved October 6, 2015.
  60. ^ "Yandex.Toloka".

Further reading

External links