People have practiced collective decision-making here and there since antiquity. Many see modern social connectedness as offering great new possibilities for the concept. I agree, with a few giant caveats. I’m fond of the topic because I do some work in the field and because it is multidisciplinary, standing at the intersection of technology and society. I’ve written a couple of recent posts on related topics. A lawyer friend emailed me to say she was interested in my recent post on Yelp and crowd wisdom. She said the color-coded scatter plots were pretty, but she wondered whether I had a version with less whereas and more therefore. I’ll do that here and give some high points from some excellent studies I’ve read on the topic.
First, in my post on the Yelp data, I accepted that many studies have shown that crowds can be wise. When large random crowds respond individually to certain quantitative questions, the median or geometric mean (though not the arithmetic mean) is often more accurate than answers by panels of experts. This requires that crowd members know at least a little something about the matter they’re voting on.
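To see why the choice of aggregate matters, here is a minimal sketch in Python. It assumes, purely for illustration, that individual guesses scatter around the true value with multiplicative (log-normal) errors; the true value, crowd size, and spread are made-up parameters, not figures from any study.

```python
import math
import random
import statistics

def simulate_crowd(true_value, n, sigma, seed):
    """Draw n independent guesses with multiplicative (log-normal) errors.

    The log-normal error model is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    return [true_value * math.exp(rng.gauss(0.0, sigma)) for _ in range(n)]

def aggregates(guesses):
    """Return the arithmetic mean, median, and geometric mean of the guesses."""
    return {
        "arithmetic_mean": statistics.fmean(guesses),
        "median": statistics.median(guesses),
        "geometric_mean": statistics.geometric_mean(guesses),
    }

TRUE_VALUE = 1000.0  # hypothetical correct answer
guesses = simulate_crowd(TRUE_VALUE, n=5000, sigma=1.0, seed=1)
results = aggregates(guesses)

for name, value in results.items():
    print(f"{name:>16}: {value:9.1f}  (error {abs(value - TRUE_VALUE):7.1f})")
```

Under this assumption a handful of wild overestimates drag the arithmetic mean well above the true value, while the median and geometric mean stay close to it.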
Then my experiments with Yelp data confirmed what others found in more detailed studies of similar data:
- Yelp raters tend to give extreme ratings.
- Ratings are skewed toward the high end.
- Even a rater who rates high on average still rates many businesses very low.
- Many businesses in certain categories have bimodal distributions – few average ratings, many high and low ratings.
- Young businesses are more likely to show bimodal distributions; established ones show distributions skewed toward the high end.
I noted that these characteristics would reduce statisticians’ confidence in conclusions drawn from the data. I then speculated that social influence contributed to these characteristics of the data, which also appear in detailed published studies of Amazon, IMDb and other high-volume sites. Some of those studies actually quantified social influence.
Two of my favorite studies show how mild social influence can damage crowd wisdom, and how a bit more can destroy it altogether. Both studies are beautiful examples of design of experiments and analysis of data.
In one (Lorenz et al., full citation below), the experimenters posed six questions to twelve groups of twelve students. In half the groups, people answered questions with no knowledge of the other members’ responses. In the other groups the experimenters reported information on the group’s responses to all twelve people in that group. Each member in such groups could then give new answers. They repeated the process five times, allowing each member to revise and re-revise his response with knowledge about his group’s answers, and did statistical analyses on the results. The results showed that while the groups were initially wise, knowledge about the answers of others narrowed the range of answers. But this reduced range did not reduce collective error. This convergence is often called the social influence effect.
A related aspect of the change in a group’s answers might be termed the range reduction effect. It describes the fact that the correct answer moves progressively toward the periphery of the ordered group of answers as members revise their answers. A key consequence of this effect is that representatives of the crowd become less valuable in giving advice to external observers.
The most fascinating aspect of this study was the confidence effect. Communication of responses by other members of a group increased individual members’ confidence about their responses during convergence of their estimates – despite no increase in accuracy. One needn’t reach far to find examples in the form of unfounded guru status, overconfident but misled elitists, and Teflon financial advisors.
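The narrowing of answers without any gain in accuracy can be illustrated with a toy model. This is a sketch under my own assumptions, not the method used in the study: each member simply moves halfway toward the group mean in every round, and the initial answers and the correct answer are invented numbers.

```python
import statistics

def revise(estimates, weight):
    """One revision round: each member moves part of the way to the group mean.

    The update rule is an assumption of this sketch, not taken from the study.
    """
    group_mean = statistics.fmean(estimates)
    return [(1.0 - weight) * x + weight * group_mean for x in estimates]

def summarize(estimates, truth):
    """Spread of answers, error of the group mean, and whether the truth is bracketed."""
    low, high = min(estimates), max(estimates)
    return {
        "range": high - low,
        "collective_error": abs(statistics.fmean(estimates) - truth),
        "truth_inside_range": low <= truth <= high,
    }

TRUTH = 100.0  # hypothetical correct answer
# Twelve made-up initial answers; their mean is 105, so the group starts close to the truth.
estimates = [40.0, 55.0, 60.0, 70.0, 80.0, 85.0, 90.0, 95.0, 110.0, 130.0, 160.0, 285.0]

history = [summarize(estimates, TRUTH)]
for _ in range(5):  # five revision rounds
    estimates = revise(estimates, weight=0.5)
    history.append(summarize(estimates, TRUTH))

for round_number, row in enumerate(history):
    print(round_number, row)
```

In this toy model the group mean never changes, so the collective error stays put while the range halves each round. After a few rounds the correct answer lies outside the range of answers entirely, even though every member is now closer to every other member.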
In another favorite of the many studies quantifying social influence (Salganik et al.), the experimenters built a music site where visitors could listen to previously unreleased songs and download them. Visitors were randomly placed in one of eight isolated groups. All groups listened to songs, rated them, and were allowed to download a copy. In some of the groups visitors could see a download count of each song, though this information was not emphasized. The download count, where present, was a weak indicator of the preferences of other visitors. Ratings from groups with no download count information yielded a measurement of song quality as judged by a large population (14,000 participants total). Behavior of the groups with visible download counts allowed the experimenters to quantify the effect of mild social influence.
The results of the music experiment were profound. It showed that mild social influence contributes greatly to inequality of outcomes in the music market. More importantly, it showed, by comparing the isolated populations that could see download counts, that social influence introduces instability and unpredictability in the results. That is, wildly different “hits” emerged in otherwise identical groups when social influence was possible. In an identical parallel universe, Rihanna did just OK and Donnie Darko packed theaters for months.
Engineers and mathematicians might correctly see this instability as something like a third-order dynamic system, highly sensitive to initial conditions. The first vote cast in each group was the flapping of the butterfly’s wings in Brazil that set off a tornado in Texas.
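A small simulation gives a feel for how visible download counts can produce both inequality and unpredictability. This is a hedged sketch of a generic rich-get-richer model, not the experimenters’ design: the song qualities, the number of visitors, the `influence` parameter, and the rule that appeal grows in proportion to downloads are all assumptions chosen for illustration.

```python
import random

def run_world(qualities, n_visitors, influence, seed):
    """Simulate one isolated world and return the download count of each song."""
    rng = random.Random(seed)
    downloads = [0] * len(qualities)
    for _ in range(n_visitors):
        # Appeal is intrinsic quality, boosted by visible downloads when influence > 0.
        weights = [q * (1.0 + influence * d) for q, d in zip(qualities, downloads)]
        chosen = rng.choices(range(len(qualities)), weights=weights, k=1)[0]
        downloads[chosen] += 1
    return downloads

def gini(counts):
    """Gini coefficient: 0 means perfectly equal, values near 1 mean winner-take-all."""
    ordered = sorted(counts)
    n, total = len(ordered), sum(ordered)
    weighted = sum(rank * value for rank, value in enumerate(ordered, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n

# Hypothetical setup: 20 songs with evenly spaced qualities, 8 worlds per condition.
QUALITIES = [1.0 + i / 19 for i in range(20)]
SEEDS = range(8)

def summarize(influence):
    """Return the winning song in each world and the mean Gini across worlds."""
    worlds = [run_world(QUALITIES, 2000, influence, seed) for seed in SEEDS]
    winners = [world.index(max(world)) for world in worlds]
    mean_gini = sum(gini(world) for world in worlds) / len(worlds)
    return winners, mean_gini

independent_winners, independent_gini = summarize(influence=0.0)
social_winners, social_gini = summarize(influence=1.0)

print("no counts shown:", independent_winners, f"mean Gini {independent_gini:.2f}")
print("counts shown:   ", social_winners, f"mean Gini {social_gini:.2f}")
```

Each world in a condition starts from the same songs and differs only in the random order of early choices, so comparing the lists of winners across worlds shows how much of a “hit” is down to who happened to download first.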
This study’s authors point out the ramifications of their work on our thoughts about popular success. Hit songs, top movies and superstars are orders of magnitude more successful than their peers. This leads to the sentiment that superstars are fundamentally different from the rest. Yet the study’s results show that success was only weakly related to quality. The best songs were rarely unpopular, and the worst were rarely hits. Beyond that, anything could and did happen.
This probably explains why profit-motivated experts do so poorly at predicting which products will succeed, even minutes before a superstar emerges.
When information about a group is available, its members do not make decisions independently, but are influenced subtly or strongly by their peers. When more group information is present (stronger social influence), collective results become increasingly skewed and increasingly unpredictable.
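A standard result from statistics makes the cost of lost independence concrete. Suppose, as a simplifying assumption, that each of n members gives an estimate with the same variance σ² and that every pair of estimates shares the same correlation ρ. The variance of the group average is then:

```latex
\operatorname{Var}(\bar{x})
  = \frac{\sigma^{2}}{n}\bigl(1 + (n-1)\rho\bigr)
  \;\longrightarrow\; \rho\,\sigma^{2}
  \quad \text{as } n \to \infty
```

With ρ = 0 the error of the average shrinks toward zero as the crowd grows. With any positive ρ it levels off at ρσ², so beyond a point, adding more correlated voices buys no further accuracy.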
The wisdom of crowds comes from aggregation of independent input. It is a matter of statistics, not of social psychology. This crucial fact seems to be missed by many of the most distinguished champions of crowdsourcing, collective wisdom, crowd-based design and the like. Collective wisdom can be put to great use in crowdsourcing and collective decision making. The wisdom of crowds is real, and so is social influence; both can be immensely useful. Mixing the two makes a lot of sense in the many business cases where you seek bias and non-individualistic preferences, such as promoting consumer sales.
But extracting “truth” from a crowd is another matter. It is still entirely possible in some situations, under controlled conditions. In others, we’re left with the dilemma of encouraging information exchange while maintaining diversity, independence, and individuality. Too much social influence (which could be quite a small amount) in certain collective decisions about governance and the path forward might result in our arriving at a shocking place and having no idea how we got there. History provides some uncomfortable examples.
Jan Lorenz, Heiko Rauhut, Frank Schweitzer, and Dirk Helbing. “How social influence can undermine the wisdom of crowd effect,” Proceedings of the National Academy of Sciences, May 31 2011.
Matthew J. Salganik, Peter Sheridan Dodds, et al. “Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market,” Science, Feb 10 2006.