Wisdom of the crowd

The wisdom of the crowd is the process of taking into account the collective opinion of a group of individuals rather than a single expert to answer a question. A large group's aggregated answers to questions involving quantity estimation, general world knowledge, and spatial reasoning have generally been found to be as good as, and often better than, the answer given by any of the individuals within the group. An intuitive and often-cited explanation for this phenomenon is that there is idiosyncratic noise associated with each individual judgment, and taking the average over a large number of responses will go some way toward canceling the effect of this noise.[1] This process, while not new to the information age, has been pushed into the mainstream spotlight by social information sites such as Wikipedia and Yahoo! Answers, and other web resources that rely on human opinion.[2]

The process, in the business world at least, was written about in detail by James Surowiecki in his book The Wisdom of Crowds.[3]

In the realm of justice, trial by jury can be understood as wisdom of the crowd, especially when compared to the alternative, trial by a judge, the single expert.

In the political domain, sortition is sometimes held up as an example of what wisdom of the crowd would look like: decision making would happen by a diverse group instead of by a fairly homogeneous political group or party.

Research within cognitive science has sought to model the relationship between wisdom of the crowd effects and individual cognition.

Classic examples

The classic wisdom-of-the-crowds finding involves point estimation of a continuous quantity. At a 1906 country fair in Plymouth, eight hundred people participated in a contest to estimate the weight of a slaughtered and dressed ox. Statistician Francis Galton observed that the mean of all eight hundred guesses, at 1197 pounds, was closer than any of the individual guesses to the true weight of 1198 pounds.[4] This has contributed to the insight in cognitive science that a crowd's individual judgments can be modeled as a probability distribution of responses with the mean centered near the true mean of the quantity to be estimated.[3]
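
The noise-canceling account described above lends itself to a minimal simulation. This is a sketch only: the Gaussian noise model and its parameters are invented for illustration, not Galton's data.

```python
import random

random.seed(1)

TRUE_WEIGHT = 1198   # pounds, the dressed weight of the ox
N_GUESSES = 800      # the size of Galton's crowd

# Assume each guess is the truth plus independent, zero-mean noise
# (real crowds may also carry a shared bias, which averaging cannot remove).
guesses = [TRUE_WEIGHT + random.gauss(0, 75) for _ in range(N_GUESSES)]

crowd_mean = sum(guesses) / len(guesses)
errors = sorted(abs(g - TRUE_WEIGHT) for g in guesses)

print(f"crowd mean: {crowd_mean:.1f} lb (error {abs(crowd_mean - TRUE_WEIGHT):.1f} lb)")
print(f"median individual error: {errors[N_GUESSES // 2]:.1f} lb")
# With independent zero-mean noise the crowd error shrinks roughly as
# sigma / sqrt(N), far below the typical individual's error.
```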

Definition of crowd

The term crowd, in this usage, refers to any group of people, such as a corporation, a group of researchers, or simply the entire general public. The group itself does not have to be cohesive; for example, a group of people answering questions on Yahoo! Answers may not know each other outside of that forum, or a group of people betting on a horse race may not know each other's bets, but they nevertheless form a crowd under this definition.

Benefits

The wisdom of the crowd applies to democratic journalism in that a group of non-experts determines what news is important, and people outside the group can then view the news based on those rankings. The social news sites Digg and Newsvine both fall into this category and rely heavily upon the wisdom of the crowd in creating their content.

Problems

Wisdom-of-the-crowds research routinely attributes the superiority of crowd averages over individual judgments to the elimination of individual noise,[5] an explanation that assumes independence of the individual judgments from each other.[3][6] Thus the crowd tends to make its best decisions if it is made up of diverse opinions and ideologies.

Miller and Steyvers reduced the independence of individual responses in a wisdom-of-the-crowds experiment by allowing limited communication between participants. Participants were asked to answer ordering questions for general knowledge questions such as the order of U.S. presidents. For half of the questions, each participant started with the ordering submitted by another participant (and was alerted to this fact), and for the other half, they started with a random ordering, and in both cases were asked to rearrange them (if necessary) to the correct order. Answers where participants started with another participant's ranking were on average more accurate than those from the random starting condition. Miller and Steyvers conclude that different item-level knowledge among participants is responsible for this phenomenon, and that participants integrated and augmented previous participants' knowledge with their own knowledge.[7]

Crowds tend to work best when there is a correct answer to the question being posed, such as a question about geography or mathematics.[8]

The wisdom of the crowd effect is easily undermined. Social influence can cause the average of the crowd answers to be wildly inaccurate, while the geometric mean and the median are far more robust.[9]
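
A small sketch (the numbers are invented for illustration) makes the contrast between these aggregators concrete: a handful of socially anchored answers drags the arithmetic mean far from the truth, while the median is nearly unmoved and the geometric mean moves far less than the mean.

```python
import math
import statistics

def geometric_mean(xs):
    # geometric mean via logarithms; assumes strictly positive estimates
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

truth = 100
independent = [87, 95, 102, 110, 93, 105, 99, 108]  # a diverse crowd near the truth
influenced = independent + [900, 950, 1000]         # a vocal faction anchors some answers high

for label, crowd in [("independent", independent), ("influenced", influenced)]:
    print(label,
          "mean=%.1f" % statistics.mean(crowd),
          "geo-mean=%.1f" % geometric_mean(crowd),
          "median=%.1f" % statistics.median(crowd))
# Under influence the arithmetic mean roughly triples, the geometric mean
# shifts far less, and the median barely moves from the truth of 100.
```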

Analogues with individual cognition: the "crowd within"

The insight that crowd responses to an estimation task can be modeled as a sample from a probability distribution invites comparisons with individual cognition. In particular, it is possible that individual cognition is probabilistic in the sense that individual estimates are drawn from an "internal probability distribution." If this is the case, then two or more estimates of the same quantity from the same person should average to a value closer to ground truth than either of the individual judgments, since the effect of statistical noise within each of these judgments is reduced. This of course rests on the assumption that the noise associated with each judgment is (at least somewhat) statistically independent. Another caveat is that individual probability judgments are often biased toward extreme values (e.g., 0 or 1). Thus any beneficial effect of multiple judgments from the same person is likely to be limited to samples from an unbiased distribution.[6]

Vul and Pashler (2008) asked participants for point estimates of continuous quantities associated with general world knowledge, such as "What percentage of the world's airports are in the United States?" Without being alerted to the procedure in advance, half of the participants were immediately asked to make a second, different guess in response to the same question, and the other half were asked to do this three weeks later. The average of a participant's two guesses was more accurate than either individual guess. Furthermore, the averages of guesses made in the three-week delay condition were more accurate than guesses made in immediate succession. One explanation of this effect is that guesses in the immediate condition were less independent of each other (an anchoring effect) and were thus subject to (some of) the same kind of noise. In general, these results suggest that individual cognition may indeed be subject to an internal probability distribution characterized by stochastic noise, rather than consistently producing the best answer based on all the knowledge a person has.[6]
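
This account can be sketched with a toy simulation. The correlation values below are assumptions chosen for illustration, not values fitted to the data; the point is only that correlated "immediate" guesses average away less noise than nearly independent "delayed" ones.

```python
import math
import random

random.seed(2)
TRUTH, SIGMA, TRIALS = 50.0, 10.0, 20000

def mse_of_two_guess_average(rho):
    """Mean squared error of averaging two guesses whose noise correlates at rho."""
    total = 0.0
    for _ in range(TRIALS):
        z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
        e1 = SIGMA * z1
        e2 = SIGMA * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        average = TRUTH + (e1 + e2) / 2
        total += (average - TRUTH) ** 2
    return total / TRIALS

print("single guess        :", SIGMA ** 2)                          # 100
print("immediate (rho=0.8): %.1f" % mse_of_two_guess_average(0.8))  # ~ sigma^2*(1+rho)/2 = 90
print("delayed   (rho=0.1): %.1f" % mse_of_two_guess_average(0.1))  # ~ 55
# Averaging two of one's own guesses always helps, but helps most when
# they are closest to independent -- matching the three-week-delay result.
```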

Hourihan and Benjamin (2010) tested the hypothesis that the estimate improvements observed by Vul and Pashler in the delayed responding condition were the result of increased independence of the estimates. To do this, Hourihan and Benjamin capitalized on variations in memory span among their participants. In support of this hypothesis, they found that averaging the repeated estimates of those with lower memory spans showed greater estimate improvements than averaging the repeated estimates of those with larger memory spans.[10]

Rauhut and Lorenz (2011) expanded on this research by again asking participants to make estimates of continuous quantities related to real-world knowledge – however, in this case participants were informed that they would make five consecutive estimates. This approach allowed the researchers to determine: (1) the number of times one needs to ask oneself in order to match the accuracy of asking others, and (2) the rate at which one's own repeated estimates improve accuracy compared to asking others. The authors concluded that asking oneself an infinite number of times does not surpass the accuracy of asking just one other individual. Overall, they found little support for a so-called “mental distribution” from which individuals draw their estimates; in fact, they found that in some cases asking oneself multiple times actually reduces accuracy. Ultimately, they argue that the results of Vul and Pashler (2008) overestimate the wisdom of the “crowd within”, as their results show that asking oneself more than three times actually reduces accuracy to levels below that reported by Vul and Pashler (who only asked participants to make two estimates).[11]
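
A stylized way to see why self-averaging hits a floor: if each of a person's n estimates has variance σ² and any two of them share noise correlation ρ > 0 (an illustrative equicorrelated model, not the authors' own analysis), the variance of the n-fold average is

\[
\operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{\sigma^2}{n} + \frac{n-1}{n}\,\rho\,\sigma^2
  \;\longrightarrow\; \rho\,\sigma^2 \qquad (n \to \infty),
\]

so the shared noise component never averages away. Averaging one's guess with that of a single independent other person gives variance σ²/2 in the same model, which beats unlimited self-asking whenever ρ exceeds one half.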

Müller-Trede (2011) attempted to investigate the types of questions for which utilizing the “crowd within” is most effective. He found that while accuracy gains were smaller than would be expected from averaging one's estimates with those of another individual, repeated judgments led to increases in accuracy for both year estimation questions (e.g., when was the thermometer invented?) and questions about estimated percentages (e.g., what percentage of internet users connect from China?). General numerical questions (e.g., what is the speed of sound, in kilometers per hour?), however, did not show improvement with repeated judgments, while averaging individual judgments with those of a random other did improve accuracy. This, Müller-Trede argues, is the result of the bounds implied by year and percentage questions.[12]

Dialectical Bootstrapping: Improving the estimates of the “crowd within.”

Herzog and Hertwig (2009) attempted to improve on the “wisdom of many in one mind” (i.e., the “crowd within”) by asking participants to use dialectical bootstrapping. Dialectical bootstrapping involves the use of dialectic (reasoned discussion that takes place between two or more parties with opposing views, in an attempt to determine the best answer) and bootstrapping (advancing oneself without the assistance of external forces). They posited that people should be able to make greater improvements on their original estimates by basing the second estimate on antithetical information. Therefore, these second estimates, based on different assumptions and knowledge than that used to generate the first estimate, would also have a different error (both systematic and random) than the first estimate – increasing the accuracy of the average judgment. From an analytical perspective, dialectical bootstrapping should increase accuracy so long as the dialectical estimate is not too far off and the errors of the first and dialectical estimates are different. To test this, Herzog and Hertwig asked participants to make a series of 40 date estimations regarding historical events (e.g., when electricity was discovered), without knowledge that they would be asked to provide a second estimate. Next, half of the participants were simply asked to make a second estimate. The other half were asked to use a consider-the-opposite strategy to make dialectical estimates (using their initial estimates as a reference point). Specifically, participants were asked to imagine that their initial estimate was off, consider what information may have been wrong, what this alternative information would suggest, whether that would have made their estimate an over- or underestimate, and finally, based on this perspective, what their new estimate would be. Results of this study revealed that while dialectical bootstrapping did not outperform the wisdom of the crowd (averaging each participant's first estimate with that of a random other participant), it did render better estimates than simply asking individuals to make two estimates.[13]
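
A toy numeric case (the values are invented) shows the arithmetic behind the analytical claim: two estimates whose errors differ, especially in sign, average to something better than either.

```python
truth = 1752                      # hypothetical year of a historical event
first, dialectical = 1772, 1740   # errors of +20 and -12 under differing assumptions

average = (first + dialectical) // 2
print(abs(first - truth), abs(dialectical - truth), abs(average - truth))
# -> 20 12 4: the averaged judgment beats both of the estimates it combines
```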

Hirt and Markman (1995) found that participants need not be limited to a consider-the-opposite strategy in order to improve judgments. Researchers asked participants to consider an alternative – operationalized as any plausible alternative (rather than simply focusing on the “opposite” alternative) – finding that simply considering an alternative improved judgments.[14]

Not all studies, however, have shown support for the “crowd within” improving judgments. Ariely and colleagues asked participants to provide responses based on their answers to true-false items and their confidence in those answers. They found that while averaging judgment estimates between individuals significantly improved estimates, averaging repeated judgment estimates made by the same individuals did not.[15]

Higher-dimensional problems and modeling

Although classic wisdom-of-the-crowds findings center on point estimates of single continuous quantities, the phenomenon also scales up to higher-dimensional problems that do not lend themselves to aggregation methods such as taking the mean. More complex models have been developed for these purposes. A few examples of higher-dimensional problems that exhibit wisdom-of-the-crowds effects include:

  • Combinatorial problems such as minimum spanning trees and the traveling salesperson problem, in which participants must find the shortest route between an array of points. Models of these problems either break the problem into common pieces (the local decomposition method of aggregation) or find solutions that are most similar to the individual human solutions (the global similarity aggregation method); a sketch of the global-similarity idea appears after this list.[1][16]
  • Ordering problems such as the order of the U.S. presidents or world cities by population. A useful approach in this situation is Thurstonian modeling, in which each participant has access to the ground truth ordering but with varying degrees of stochastic noise, leading to variance in the final ordering given by different individuals.[17][18][19][20]

  • Multi-armed bandit problems, in which participants choose from a set of alternatives with fixed but unknown reward rates with the goal of maximizing return after a number of trials. To accommodate mixtures of decision processes and individual differences in probabilities of winning and staying with a given alternative versus losing and shifting to another alternative, hierarchical Bayesian models have been employed which include parameters for individual people drawn from Gaussian distributions.[21]
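
As a concrete illustration of the global similarity aggregation method named in the first bullet, here is a minimal sketch: the crowd's answer is simply the submitted tour that most resembles everyone else's. The edge-overlap similarity measure and the toy data are assumptions for illustration; the published models are more elaborate.

```python
def tour_edges(tour):
    """Undirected edge set of a closed tour, e.g. [0, 1, 2] -> {{0,1}, {1,2}, {2,0}}."""
    n = len(tour)
    return {frozenset((tour[i], tour[(i + 1) % n])) for i in range(n)}

def global_similarity_pick(tours):
    """Return the submitted tour with the highest mean edge overlap with the others."""
    edge_sets = [tour_edges(t) for t in tours]

    def mean_overlap(i):
        shared = [len(edge_sets[i] & other) for j, other in enumerate(edge_sets) if j != i]
        return sum(shared) / len(shared)

    best = max(range(len(tours)), key=mean_overlap)
    return tours[best]

# Four participants route five cities; the first tour shares the most edges overall.
crowd = [[0, 1, 2, 3, 4], [0, 1, 2, 4, 3], [0, 2, 1, 3, 4], [0, 1, 3, 2, 4]]
print(global_similarity_pick(crowd))  # -> [0, 1, 2, 3, 4]
```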

See also

References

  1. Yi, S. K. M., Steyvers, M., Lee, M. D., and Dry, M. J. (April 2012). "The Wisdom of the Crowd in Combinatorial Problems". Cognitive Science, 36(3), 452–470. doi:10.1111/j.1551-6709.2011.01223.x.
  2. Baase, Sara (2007). A Gift of Fire: Social, Legal, and Ethical Issues for Computing and the Internet. 3rd edition. Prentice Hall. pp. 351–357. ISBN 0-13-600848-8.
  3. Surowiecki, James (2004). The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations. Doubleday. ISBN 978-0-385-50386-0.
  4. Galton, F. (1907). "Vox populi". Nature, 75, 450–451.
  5. Benhenda, Mostapha (2011). "A Model of Deliberation Based on Rawls's Political Liberalism". Social Choice and Welfare, 36, 121–178. http://ssrn.com/abstract=1616519
  6. Vul, E., and Pashler, H. (2008). "Measuring the Crowd Within: Probabilistic Representations Within Individuals". Psychological Science, 19(7), 645–647.
  7. Miller, B., and Steyvers, M. (in press). "The Wisdom of Crowds with Communication". In L. Carlson, C. Hölscher, and T. F. Shipley (Eds.), Proceedings of the 33rd Annual Conference of the Cognitive Science Society. Austin, TX: Cognitive Science Society.
  8. "The wisdom of crowds: Q & A with James Surowiecki". Random House. http://www.randomhouse.com/features/wisdomofcrowds/Q&A.html
  9. "How Social Influence can Undermine the Wisdom of Crowd Effect". Proceedings of the National Academy of Sciences, 2011. http://www.pnas.org/content/108/22/9020.full.pdf+html
  10. Hourihan, K. L., and Benjamin, A. S. (2010). "Smaller is better (when sampling from the crowd within): Low memory-span individuals benefit more from multiple opportunities for estimation". Journal of Experimental Psychology: Learning, Memory, and Cognition, 36, 1068–1074.
  11. Rauhut, H., and Lorenz, J. (2011). "The wisdom of the crowds in one mind: How individuals can simulate the knowledge of diverse societies to reach better decisions". Journal of Mathematical Psychology, 55, 191–197.
  12. Müller-Trede, J. (2011). "Repeated judgment sampling: Boundaries". Judgment and Decision Making, 6, 283–294.
  13. Herzog, S. M., and Hertwig, R. (2009). "The wisdom of many in one mind: Improving individual judgments with dialectical bootstrapping". Psychological Science, 20, 231–237.
  14. Hirt, E. R., and Markman, K. D. (1995). "Multiple explanation: A consider-an-alternative strategy for debiasing judgments". Journal of Personality and Social Psychology, 69, 1069–1086.
  15. Ariely, D., Au, W. T., Bender, R. H., Budescu, D. V., Dietz, C. B., Gu, H., and Zauberman, G. (2000). "The effects of averaging subjective probability estimates between and within judges". Journal of Experimental Psychology: Applied, 6, 130–147.
  16. Yi, S. K. M., Steyvers, M., Lee, M. D., and Dry, M. (2010). "Wisdom of Crowds in Minimum Spanning Tree Problems". Proceedings of the 32nd Annual Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum.
  17. Lee, M. D., Steyvers, M., de Young, M., and Miller, B. (January 2012). "Inferring Expertise in Knowledge and Prediction Ranking Tasks". Topics in Cognitive Science (4), 151–163. doi:10.1111/j.1756-8765.2011.01175.x.
  18. Lee, M. D., Steyvers, M., de Young, M., and Miller, B. J. (in press). "A model-based approach to measuring expertise in ranking tasks". In L. Carlson, C. Hölscher, and T. F. Shipley (Eds.), Proceedings of the 33rd Annual Conference of the Cognitive Science Society. Austin, TX: Cognitive Science Society.
  19. Steyvers, M., Lee, M. D., Miller, B., and Hemmer, P. (2009). "The Wisdom of Crowds in the Recollection of Order Information". In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, and A. Culotta (Eds.), Advances in Neural Information Processing Systems, 22, pp. 1785–1793. MIT Press.
  20. Miller, B., Hemmer, P., Steyvers, M., and Lee, M. D. (2009). "The Wisdom of Crowds in Ordering Problems". In Proceedings of the Ninth International Conference on Cognitive Modeling. Manchester, UK.
  21. Zhang, S., and Lee, M. D. (2010). "Cognitive models and the wisdom of crowds: A case study using the bandit problem". In R. Catrambone and S. Ohlsson (Eds.), Proceedings of the 32nd Annual Conference of the Cognitive Science Society, pp. 1118–1123. Austin, TX: Cognitive Science Society.