Snowball sampling

From Wikipedia, the free encyclopedia
For other uses, see Snowball (disambiguation).

In sociology and statistics research, snowball sampling[1] (or chain sampling, chain-referral sampling, referral sampling[2][3]) is a non-probability sampling technique in which existing study subjects recruit future subjects from among their acquaintances. The sample group is thus said to grow like a rolling snowball (similar to breadth-first search (BFS) in computer science). As the sample builds up, enough data are gathered to be useful for research. The technique is often used with hidden populations that are difficult for researchers to access, such as drug users or sex workers. Because sample members are not selected from a sampling frame, snowball samples, like BFS samples,[4][5] are subject to numerous biases. For example, people who have many friends are more likely to be recruited into the sample.
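The analogy with breadth-first search, and the bias toward people with many friends, can be illustrated with a short simulation. The following Python sketch is only an illustration under stated assumptions: the acquaintance network is synthetic, every recruit is assumed to refer all of their acquaintances, and the names `toy_network`, `snowball_sample` and `mean_degree` are hypothetical helpers rather than part of any published method.

```python
import random
from collections import deque


def toy_network(n, m, rng):
    """Build a made-up acquaintance network in which a few people have many ties.

    Each newcomer befriends m existing people, chosen with probability
    proportional to how many friends they already have (an assumption made
    only to obtain an uneven spread of friendships).
    """
    graph = {person: set() for person in range(n)}
    endpoints = []  # each person appears once per friendship they hold
    for a in range(m + 1):  # small fully connected starting group
        for b in range(a + 1, m + 1):
            graph[a].add(b)
            graph[b].add(a)
            endpoints += [a, b]
    for newcomer in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            graph[newcomer].add(target)
            graph[target].add(newcomer)
            endpoints += [newcomer, target]
    return graph


def snowball_sample(graph, seeds, max_size):
    """Recruit wave by wave: every recruit refers all of their acquaintances."""
    recruited = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(recruited) < max_size:
        person = queue.popleft()
        recruited.append(person)
        for friend in sorted(graph[person]):
            if friend not in seen:
                seen.add(friend)
                queue.append(friend)
    return recruited


def mean_degree(graph, people):
    """Average number of acquaintances among the given people."""
    return sum(len(graph[p]) for p in people) / len(people)


if __name__ == "__main__":
    rng = random.Random(0)
    network = toy_network(2000, 2, rng)
    sample = snowball_sample(network, seeds=[rng.randrange(2000)], max_size=200)
    print("mean ties, whole population:", mean_degree(network, list(network)))
    print("mean ties, snowball sample: ", mean_degree(network, sample))
```

Comparing the two printed averages shows how far the referral process over-represents well-connected people in this synthetic network; real referral chains are usually more selective than the exhaustive referral assumed here.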

It was widely believed that it was impossible to make unbiased estimates from snowball samples, but a variation of snowball sampling called respondent-driven sampling[6][7][8] has been shown to allow researchers to make asymptotically unbiased estimates from snowball samples under certain conditions. Snowball sampling and respondent-driven sampling also allow researchers to make estimates about the social network connecting the hidden population.

What is snowball sampling?

Snowball sampling uses a small pool of initial informants to nominate, through their social networks, other participants who meet the eligibility criteria and could potentially contribute to a specific study. The term "snowball sampling" reflects an analogy to a snowball increasing in size as it rolls downhill.[9]

Snowball sampling is a method of gathering information through extended associations that grow out of existing acquaintances: "Snowball sampling uses recommendations to find people with the specific range of skills that has been determined as being useful." An individual or a group receives information from different places through a mutual intermediary. The method is called snowball sampling because, as more relationships are built through mutual association, more connections can be made through those new relationships and more information can be shared and collected, much like a snowball that grows as it rolls and collects more snow. Snowball sampling is a useful tool for building networks and increasing the number of participants. However, the success of the technique depends greatly on the initial contacts and connections made. It is therefore important to connect with people who are well connected and reputable, both to create more opportunities for growth and to build a credible and dependable reputation.


  1. Draft a participation program (likely to be subject to change, but indicative).
  2. Approach stakeholders and ask for contacts.
  3. Gain contacts and ask them to participate.
  4. Community issues groups may emerge that can be included in the participation program.
  5. Continue the snowballing with contacts to gain more stakeholders if necessary.
  6. Ensure a diversity of contacts by widening the profile of persons involved in the snowballing exercise.

Applications of snowball sampling


The participants are likely to know others who share the characteristics that make them eligible for inclusion in the study.[10]

Applicable situation

Snowball sampling is quite suitable to use when members of a population are hidden and difficult to locate (e.g. samples of the homeless or users of illegal drugs) and these members are closely connected (e.g. organized crime, sharing similar interests, involvement in the same groups that are relevant to the project at hand).[10]

Application field

1. Social computing

Snowball sampling can be regarded as an evaluation sampling method in the social computing field. For example, in the interview phase, snowball sampling can be used to reach hard-to-reach populations. Participants or informants with whom contact has already been made can use their social networks to refer the researcher to other people who could potentially participate in or contribute to the study.

2. Expert information collection
Snowball sampling can be used to identify experts in a certain field such as medicine, manufacturing processes, or customer relation methods, and gather professional and valuable knowledge.
For instance, 3M used snowball sampling to call in specialists from all fields related to how a surgical drape could be applied to the body. Each expert involved can suggest another expert who may be able to offer more information.

Advantages and Disadvantages


1. Locating hidden populations: Surveyors can include people in the survey whom they would not otherwise have known about.

2. Locating people of a specific population: There are no lists or other obvious sources for locating members of the population (e.g. the homeless, users of illegal drugs).

3. Low cost: As subjects are used to locate the hidden population, the researcher invests less money and time in sampling. The snowball sampling method does not require complex planning, and the staffing required is considerably smaller than for other sampling methods.[11]


1. Community bias: The first participants will have a strong impact on the sample. Snowball sampling is inexact and can produce varied and inaccurate results. The method relies heavily on the skill of the individual conducting the sampling, and on that individual's ability to network vertically and find an appropriate sample. Success requires previous contacts within the target areas and the ability to keep information flowing throughout the target group.

2. Non-random: Snowball sampling contravenes many of the assumptions supporting conventional notions of random selection and representativeness.[12] However, random recruitment within social systems is beyond researchers' ability, so snowball sampling is inevitable in such systems.

3. Unknown sampling population size: There is no way to know the total size of the overall population.[10]

4. Anchoring: Another disadvantage of snowball sampling is the lack of definite knowledge as to whether the sample accurately reflects the target population. A sample built by targeting only a few select people is not always indicative of the actual trends within the result group. Identifying the appropriate person to conduct the sampling and locating the correct targets is a time-consuming process, such that the benefits only slightly outweigh the costs.

5. Lack of control over sampling method: As the subjects locate the hidden population, the researcher has very little control over the sampling method, which becomes mainly dependent on the original and subsequent subjects, who may add to the known sampling pool using a method outside of the researcher's control.


The best defense against weaknesses is to begin with a set of initial informants that are as diverse as possible.[10] Efforts to improve the main disadvantage of snowball sampling resulted in the Respondent Driven Sampling (RDS) method.[13] RDS augments the referral method by weighting the sample in order to compensate for the initial non-random selection, which may lead to the reduction of errors occurring in sampling by the referral method.[11]


Snowball sampling is a true multipurpose technique. Through its use, it is possible to make inferences about social networks and relations in areas in which sensitive, illegal, or deviant issues are involved. Equally important is its utility in exploring populations about which little is known. For example, Kaplan et al. have used snowball sampling to study the temporal and social contexts of heroin users.[14]


A total of 214 cases were selected by 45 independent snowball sampling operations. Graphically represented, these samples ranged in "length" (i.e., the number of cases in the sample) from two cases (1-stage snowball sample) to nine cases (8-stage snowball sample). Samples were collected with different "target" traits to saturate. In one subset, "foreign origin" was the trait to be saturated; in the other, it was "prostitution as occupation". In this example, three samples were selected from the data set for analysis to meet a criterion of holding the "length" of the three samples constant. A length of four cases was decided upon because these samples are complex enough to make statistical analysis practical, but short enough to allow clear and simple qualitative comparisons.


A field worker was instructed to start a snowball sample of a particular trait identified as characteristic of the heroin scene. At the zero stage the subject selected was asked to nominate other heroin users sharing that trait (the maximal number of nominees was 25). From the set of those nominated at each stage a simple random selection was made of a single individual. The field worker then attempted to make contact with that nominee. The amount of time in days (speed) required to make contact was recorded, as well as other specified traits of the individuals (e.g., sex, age, drug history, and patterns).
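The staged procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual protocol: the function name `run_chain`, the made-up subject labels, and the rule that someone already in the chain cannot be selected again are all assumptions added for the example. Only the cap of 25 nominees and the simple random selection of one nominee per stage come from the description.

```python
import random

MAX_NOMINEES = 25  # maximal number of nominees per stage, as described above


def run_chain(nominations, start, max_stages, rng):
    """Follow one snowball chain from the zero-stage subject.

    nominations maps each subject to the list of people they would name.
    At every stage one nominee is drawn by simple random selection.
    Assumption: people already in the chain are not eligible again.
    """
    chain = [start]
    current = start
    for _ in range(max_stages):
        named = nominations.get(current, [])[:MAX_NOMINEES]
        eligible = [person for person in named if person not in chain]
        if not eligible:
            break  # the subject could not nominate anyone new
        current = rng.choice(eligible)
        chain.append(current)
    return chain


if __name__ == "__main__":
    # Entirely made-up referral data, used only to exercise the function.
    example = {
        "subject-0": ["subject-1", "subject-2"],
        "subject-1": ["subject-3"],
        "subject-2": ["subject-3", "subject-4"],
        "subject-3": [],
        "subject-4": [],
    }
    print(run_chain(example, "subject-0", max_stages=8, rng=random.Random(1)))
```

A chain that completes k stages contains k + 1 cases, which matches the way "length" is counted in the example (a 1-stage sample has two cases).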


Figure 1. The result of one snowball sample

Figure 1 graphically shows the traits in this sample in terms of the days it took to find a specific nominee, the operationalization of speed. Sample one was started with a 27-year-old British heroin user whose referrals included a 19-year-old Italian. After this Italian nominee was selected, it took the field worker 4 days to find him. He then nominated, among others, another Italian, 22 years of age, who was found on the same day. The randomly selected nominee of this third user was a 27-year-old Belgian who could not nominate another.


After subjects were identified, both quantitative and qualitative analyses of the three samples were conducted, and graphic representations of the data were constructed and marked for relevant traits. Descriptive statistical comparisons were made for the entire data set as well as for the foreign and prostitute subgroups. Inferential statistics were also used to determine whether the distributions for age and the time it took for a field worker to locate a nominee (speed) were significant and whether the respective snowballs were drawn from populations with the same distributions. The second question was seen as especially appropriate for an "ascending" sampling strategy because it cannot be assumed that each snowball is drawn from the same population when only an "imperfect sampling frame", composed of a "special list" compiled by nominees, is available.[15] Because it has been seen as especially appropriate for small samples (as few as three cases) for which population parameters are unknown and cannot be confidently assumed, the nonparametric Kolmogorov-Smirnov (KS) test was used. Two-tailed KS tests were performed on the pooled data of the three samples (one-sample test) and on the between-snowballs (subgroups) data (two-sample test).
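The two-sample comparison mentioned above rests on the KS statistic, the largest vertical gap between the empirical cumulative distribution functions of the two samples. The Python sketch below computes only that statistic. The function name `ks_two_sample_statistic` is hypothetical, the ages are made-up values rather than data from the study, and no p-value is derived.

```python
from bisect import bisect_right


def ks_two_sample_statistic(sample_a, sample_b):
    """Return the largest gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    # Both empirical CDFs are step functions, so the gap only changes
    # at observed values; checking those points is sufficient.
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)  # share of sample A that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of sample B that is <= x
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


if __name__ == "__main__":
    # Made-up ages for two hypothetical snowballs of four cases each.
    snowball_one = [19, 22, 27, 27]
    snowball_two = [24, 29, 31, 35]
    print(ks_two_sample_statistic(snowball_one, snowball_two))
```

Turning the statistic into a significance level requires the distribution of the statistic under the null hypothesis, which matters especially for samples this small; statistical libraries such as SciPy provide this through `scipy.stats.ks_2samp`.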

Statistics on Snowball sampling

In the U.K., where telephone coverage is not as high as in the USA, Harris (1971) conducted a survey of the handicapped and impaired. He screened 250,000 households by mail questionnaire in the initial phase; a response rate of 85.6 percent was obtained. Cartwright (1964) sent a mail questionnaire to 29,400 persons to identify those who had been in hospital in the last six months, and obtained a response rate of 87 percent. Hunt (1978) sent a screening questionnaire seeking basic demographic details about the members of 11,500 households to identify a sample of the elderly, and achieved a response rate of 80 percent. Although these response rates are high, there is a high risk that non-respondents include a larger proportion of members of the rare population.[16]

Ethical issues in Snowball sampling

Ethical concerns prevented the research staff from directly contacting many potential respondents; consequently, program directors or personnel who knew of possible respondents would make initial contacts and then ask those who were willing to cooperate to personally contact the project. In each instance the newly recruited research assistant had to be trained to understand and accept the eligibility criteria of the research, which was often difficult because it violated some commonsense understandings concerning treatment and nontreatment. For example, many people define themselves as untreated in spite of possibly long stays in civil commitment programs because their commitments to these institutions were involuntary and/or because they had become readdicted upon release and then recovered at a later time.[17]

In one qualitative study, apprehension around feelings of compulsion is reviewed for potential ethical dilemmas, and recommendations for the research process are made.[18]

Improvements for snowball sampling

Snowball sampling is a recruitment method that uses participants' social networks to access specific populations. Kath Browne,[19] who used social networks to research non-heterosexual women, reports that recruiting through such networks makes these populations accessible. Snowball sampling is often used because the population under investigation is hard to approach, either because of low numbers of potential participants or because of the sensitivity of the topic. The author describes the recruitment technique of snowball sampling, which uses interpersonal relations and connections between people. Because it relies on social networks and interpersonal relations, snowball sampling shapes how individuals act and interact in focus groups, couple interviews and interviews. As a result, snowball sampling not only leads to the recruitment of particular samples; its use also shapes the accounts participants give of their lives. To help mitigate these risks, it is important not to rely on any single method of sampling to gather data about a target sector. To obtain the most accurate information, the organization conducting the study must do everything it can to ensure that the sampling is controlled. It is also imperative that the right personnel carry out the sampling, because one missed opportunity could skew the results.

Respondent-driven sampling

Respondent-driven sampling (RDS) is a newer approach to the study of hidden populations that is used to reduce the bias of snowball sampling. It involves both a field sampling technique and custom estimation procedures that correct for the presence of homophily on attributes in the population. The method employs a dual system of structured incentives to overcome some of the deficiencies of chain-referral samples. Like other chain-referral methods, RDS assumes that those best able to access members of hidden populations are their own peers.[20]
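One way to picture the estimation step is inverse-degree weighting, in which each respondent counts in proportion to one over the number of acquaintances they report. The Python sketch below illustrates that general idea only. It is an assumption made for this example and not necessarily the estimator used in the cited works; the function name `inverse_degree_proportion` is hypothetical and the respondent records are made up.

```python
def inverse_degree_proportion(respondents):
    """Estimate the share of the population that has a trait.

    respondents is a list of (has_trait, reported_degree) pairs. People with
    many acquaintances are more likely to be recruited, so each respondent
    is weighted by 1 / reported_degree to offset that advantage.
    """
    for _, degree in respondents:
        if degree <= 0:
            raise ValueError("every respondent must report at least one acquaintance")
    weight_with_trait = sum(1.0 / degree for has_trait, degree in respondents if has_trait)
    weight_total = sum(1.0 / degree for _, degree in respondents)
    return weight_with_trait / weight_total


if __name__ == "__main__":
    # Made-up sample: respondents with the trait report more acquaintances.
    sample = [(True, 20), (True, 10), (False, 2), (False, 4)]
    naive = sum(1 for has_trait, _ in sample if has_trait) / len(sample)
    print("unweighted share:", naive)
    print("weighted share:  ", inverse_degree_proportion(sample))
```

When every respondent reports the same number of acquaintances, the weighted and unweighted shares coincide, which is a quick sanity check on the weighting.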

Peer Esteem Snowballing (PEST)

Dimitrios C. Christopoulos argues that snowball sampling does not yield reliable results when applied to the general public, but that it is well suited to investigating small populations of experts. Compared with other snowballing techniques, PEST has several advantages:

a. reduces the selection bias inherent in initial seed samples for a snowball by advocating for a nominations phase that objectively identifies contact seeds for the first wave;

b. by analysing network data it provides an estimate of the population size, unbiased by any researcher defined population boundary;

c. by reporting the estimate of the sample size vis a vis the population, it provides a measure of relative significance (optimal sampling data can be reported in this context);

d. through a network analysis of referrals it allows for identifying clusters of experts that may be instrumental in explaining variations in their response profile;

e. allows for a referrals nominations strategy that, in certain cases, could improve response rates, while the nominations strategy acts as an ultimate validation of expertise for informants and therefore improves content validity.[21]


  1. ^ Goodman, L.A. (1961). "Snowball sampling". Annals of Mathematical Statistics. 32 (1): 148–170. doi:10.1214/aoms/1177705148. 
  2. ^ "Snowball Sampling".  (accessed 8 May 2011).
  3. ^ Snowball Sampling, Changing, (accessed 8 May 2011).
  4. ^ Kurant, M.; Markopoulou, A.; Thiran, P. (2010). On the bias of BFS (Breadth First Search). International Teletraffic Congress (ITC 22). arXiv:1004.1729. 
  5. ^ Kurant, M.; Markopoulou, A.; Thiran, P. (2011). "Towards Unbiased BFS Sampling". IEEE JSAC. 29 (9): 1799–1809. arXiv:1102.4599. doi:10.1109/jsac.2011.111005. 
  6. ^ Heckathorn, D.D. (1997). "Respondent-Driven Sampling: A New Approach to the Study of Hidden Populations". Social Problems. 44 (2): 174–199. doi:10.1525/sp.1997.44.2.03x0221m. 
  7. ^ Salganik, M.J. and D.D. Heckathorn (2004). "Sampling and Estimation in Hidden Populations Using Respondent-Driven Sampling". Sociological Methodology. 34 (1): 193–239. doi:10.1111/j.0081-1750.2004.00152.x. 
  8. ^ Heckathorn, D.D. (2002). "Respondent-Driven Sampling II: Deriving Valid Estimates from Chain-Referral Samples of Hidden Populations". Social Problems. 49 (1): 11–34. doi:10.1525/sp.2002.49.1.11. 
  9. ^ Morgan, David L. (2008). The SAGE Encyclopedia of Qualitative Research Methods. SAGE Publications, Inc. pp. 816–817. ISBN 9781412941631. 
  10. ^ a b c d Morgan, David L. (2008). The SAGE Encyclopedia of Qualitative Research Methods. SAGE Publications, Inc. pp. 816–817. ISBN 9781412941631. 
  11. ^ a b Voicu, Mirela-Cristina (2011). "USING THE SNOWBALL METHOD IN MARKETING RESEARCH ON HIDDEN POPULATIONS". Challenges of the Knowledge Society. 1: 1341–1351. 
  12. ^ Atkinson, Rowland; Flint, John (2004). Encyclopedia of Social Science Research Methods. SAGE Publications, Inc. pp. 1044–1045. ISBN 9780761923633. 
  13. ^ Heckathorn, Douglas D. (1997). "Respondent-Driven Sampling: A New Approach to the Study of Hidden Populations" (PDF). Social Problems. Retrieved 19 September 2016. 
  14. ^ Kaplan C D, Korf D, Sterk C. Temporal and social contexts of heroin-using populations an illustration of the snowball sampling technique[J]. The Journal of nervous and mental disease, 1987, 175(9): 566-574.
  15. ^ Kish L. Survey sampling[J]. 1965.
  16. ^ Kalton, G., & Anderson, D. W. (1986). Sampling rare populations. Journal of the Royal Statistical Society. Series A (General), 65-82.
  17. ^ Biernacki, Waldorf / SNOWBALL SAMPLING
  18. ^ Brace-Govan, Jan. "Issues in snowball sampling: The lawyer, the model and ethics." Qualitative Research Journal 4.1 (2004): 52.
  19. ^ Browne, Kath (2005). "Snowball sampling: using social networks to research non‐heterosexual women". International Journal of Social Research Methodology. 8 (1). 
  20. ^
  21. ^ Dimitrios C. Christopoulos (2010). "Peer Esteem Snowballing: A methodology for expert surveys". 

External links