Observer bias is a type of detection bias, defined as any systematic divergence from the truth during the observation and recording of data in studies. The definition can be expanded to include the systematic difference between the true value and the value observed, arising from variation among observers.
Observer bias is the tendency of observers to see what they expect or want to see rather than what is actually there. It is common in everyday life and is a significant problem in scientific research. Because observation is critical to scientific research, observer bias can have critical consequences as well. When such biases exist, studies can over- or underestimate the true value, which compromises the validity of their findings even if every other aspect of the design and procedure was appropriate.
Observational data forms the foundation of a significant body of knowledge. Observation is a method of data collection and falls into the category of qualitative research techniques. It has a number of benefits, including its simplicity as a data collection method and its usefulness for forming hypotheses. It also has many limitations, including a potential lack of reliability, poor validity, and faulty perception. Participant observation is widely used in sociological and anthropological studies, while systematic observation is used where researchers need to collect data without interacting directly with participants. The most common observation method is naturalistic observation, where subjects are observed in their natural environments with the goal of assessing behaviour in an intervention-free, natural setting.
Observer bias is especially probable when the investigator or researcher has vested interests in the outcome of the research or has strong preconceptions. Coupled with ambiguous underlying data and a subjective scoring method, these three factors contribute heavily to the incidence of observer bias.
Examples of cognitive biases include:
- Anchoring – a cognitive bias that causes humans to place too much reliance on the initial pieces of information they are provided with for a topic. This causes a skew in judgement and prevents humans and observers from updating their plans and predictions as appropriate.
- Bandwagon effect – the tendency for people to "jump on the bandwagon" with certain behaviours and attitudes, meaning that they adopt particular ways of doing things based on what others are doing.
- Bias blind spot – the tendency for people to recognize the impact of bias on others and their judgements, while simultaneously failing to acknowledge and recognize the impact that their own biases have on their own judgement.
- Confirmation bias – the tendency for people to look for, interpret, and recall information in such a way that their preconceived beliefs and values are affirmed.
- Guilt and innocence by association bias – the tendency for people to hold an assumption that individuals within a group share similar characteristics and behaviours, including those that would hail them as innocent or guilty.
- Halo effect – the tendency for positive impressions of a person, brand, company, product or the like in one area to influence an observer's opinions or feelings in other, unrelated areas.
- Framing effect – the tendency for people to form conclusions and opinions based on whether the pertinent information is presented to them with positive or negative connotations.
- Recency effect – the tendency for more recent pieces of information, ideas, or arguments to be remembered more clearly than those that preceded them.
Examples of observer bias extend back to the early 1900s. One of the first recorded cases of apparent observer bias came in 1904 with "Clever Hans", a horse whose owner, Wilhelm von Osten, claimed it could solve arithmetic problems. Von Osten would ask Clever Hans a series of arithmetic questions, and the horse would appear to answer by tapping out the number with its hoof. The psychologist Oskar Pfungst investigated and found that when the horse neared the correct number of taps, the owner would subconsciously react in a particular way, which signalled to Clever Hans to stop tapping. This only worked, however, when the owner himself knew the answer to the question. The case is an example of observer bias because the expectations of von Osten, the horse's owner, were the cause of Clever Hans's behaviour, resulting in faulty data.
One of the most notorious examples of observer bias is seen in the work of Cyril Burt, an English psychologist and geneticist who argued for the heritability of IQ. Burt believed, and because of his observer bias claimed to demonstrate through his research, that children from families of lower socioeconomic status were likely to have lower cognitive abilities than children from families of higher socioeconomic status. These findings had a considerable impact on the educational system in England throughout the 1960s, where middle- and upper-class children were sent to elite schools while children from lower socioeconomic backgrounds were sent to less desirable schools. After Burt's death, further research found that the data in his studies had been fabricated, which was presumed to be a result of his observer bias and the outcomes he intended to find.
Another key example of observer bias is a 1963 study, "Psychology of the Scientist: V. Three Experiments in Experimenter Bias", published by researchers Robert Rosenthal and Kermit L. Fode at the University of North Dakota. In this study, Rosenthal and Fode gave a group of twelve psychology students a total of sixty rats to run in experiments. The students were told either that they had "maze-bright" rats, bred to be exceptionally good at solving mazes, or that they had "maze-dull" rats, bred to be poor at solving mazes. They were then asked to run experiments with the rats and collect the data as they usually would. The rats were placed in T-shaped mazes where they had to run down the center and then turn left or right. One side of the maze was painted white and the other dark gray, and the rat's task was always to turn towards the dark gray side. Rats that turned towards the dark gray side received a reward, while rats that turned towards the white side did not. The students recorded how many times each rat turned towards the correct (dark gray) side, how many times it turned towards the incorrect (white) side, and how long it took to make a decision. They repeated the experiment ten times per day over five days, and in the end they found that the "maze-bright" rats were better both at completing the maze correctly and at completing it quickly. In fact, there were no "maze-bright" or "maze-dull" rats; the rats were all genetically identical to one another and had been randomly divided into the two categories. The two groups of students should have obtained the same results for both kinds of rats, but failed to do so because of observer bias.
The entire effect in the experiment was caused by the students' expectations: they expected that the "maze-bright" rats would perform better and that the "maze-dull" rats would perform worse. Rosenthal and Fode concluded that the results were caused by small, subtle biases on the part of the students, who were unaware that they were treating the rats differently. It is possible that they had slightly different criteria for when the two groups of rats finished the maze, that they tended to hit the stopwatch later for the "maze-dull" rats, or that they paid more attention to the "maze-bright" rats overall. In this way, the students, as observers, created what looked like a real result but was in reality entirely false.
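To illustrate how a small, unconscious timing difference could produce an apparent effect, the following sketch simulates rats with no real difference between groups and an observer who stops the watch slightly later for rats labelled "maze-dull". All numbers (the 0.5-second delay, the range of run times, the seed) are invented for illustration and are not taken from Rosenthal and Fode's data.

```python
import random
from statistics import mean

rng = random.Random(42)

# Every rat's true run time comes from the same distribution: no real group difference.
true_times = [rng.uniform(2.0, 4.0) for _ in range(60)]

# Labels are assigned at random, mirroring the random division described above.
labels = ["maze-bright"] * 30 + ["maze-dull"] * 30
rng.shuffle(labels)

STOPWATCH_DELAY = 0.5  # hypothetical extra seconds recorded for "maze-dull" rats


def recorded_time(true_time, label):
    """Time written down by a biased observer who knows the label."""
    return true_time + (STOPWATCH_DELAY if label == "maze-dull" else 0.0)


def group_gap(times):
    """Mean time of "maze-dull" rats minus mean time of "maze-bright" rats."""
    dull = [t for t, lab in zip(times, labels) if lab == "maze-dull"]
    bright = [t for t, lab in zip(times, labels) if lab == "maze-bright"]
    return mean(dull) - mean(bright)


recorded = [recorded_time(t, lab) for t, lab in zip(true_times, labels)]

print(f"true gap:     {group_gap(true_times):+.2f} s")
print(f"recorded gap: {group_gap(recorded):+.2f} s")
```

In this sketch the recorded gap exceeds the true gap by the assumed delay, so the apparent difference between groups comes from the observer rather than from the rats.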
Observer bias can be a significant issue in medical research and treatment. Observations that require subjective judgement have greater potential for variance than observations of objective data, where the risk of observer bias is much lower.
When observer bias is present in research, the data collection itself is affected. The findings are not accurate representations of reality, because they reflect the influence of the observers' biases. Researchers may, without intending to, subconsciously encourage certain results, which changes the findings and outcomes of the study. A researcher who has not taken steps to mitigate observer bias, and who is influenced by it, has a higher probability of making erroneous interpretations, which ultimately leads to inaccurate results.
Research has shown that when observer bias is present in outcome assessment, treatment effect estimates can be exaggerated by between one-third and two-thirds, which has significant implications for the validity of the findings of studies and procedures.
Preventative steps to mitigate observer bias
Bias is unfortunately an unavoidable problem in epidemiological and clinical research. However, there are a number of potential strategies and solutions for the reduction of observer bias, specifically in the areas of scientific studies and research across the medical field. The effects that bias has can be reduced through the use of strong operational definitions, along with masking, triangulation, and standardisation of procedures, and the continual monitoring of the objectivity of those conducting the experiments and observations.
Blinded protocols and double-blinded research can act as a corrective lens, reducing observer bias and thus increasing the reliability and accuracy of the data collected. Blind trials are often required for regulatory approval of medical devices and drugs, but are not common practice in empirical studies despite the research supporting their necessity. Double-blinding is done by ensuring that both the tester and the research participants lack information that could influence their behaviour. In a single-blind experiment, information that might otherwise skew the results or introduce bias is withheld from the participants, but the experimenter is fully aware of it.
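One way to picture masking in practice is to replace group labels with opaque codes before an assessor sees the records. The sketch below is illustrative only: the function `mask_allocations`, the participant IDs and the code format are hypothetical, and it covers only the assessor's side of blinding. A real trial would also keep the unmasking key with someone independent of outcome assessment.

```python
import random


def mask_allocations(allocations, seed=None):
    """Return (blinded_view, key).

    allocations: dict mapping participant ID -> group label (e.g. "drug", "placebo").
    blinded_view: list of opaque codes, safe to give to the assessor.
    key: dict mapping opaque code -> (participant ID, group), held by a third party.
    """
    rng = random.Random(seed)
    participants = list(allocations)
    rng.shuffle(participants)  # code order must not reveal group membership
    key = {}
    for index, participant in enumerate(participants, start=1):
        code = f"S{index:03d}"
        key[code] = (participant, allocations[participant])
    blinded_view = list(key)
    return blinded_view, key


# Hypothetical allocation for a small placebo-controlled trial.
allocations = {"p01": "drug", "p02": "placebo", "p03": "drug", "p04": "placebo"}
blinded_view, key = mask_allocations(allocations, seed=7)

# The assessor records outcomes against codes only.
scores = {code: None for code in blinded_view}

# After scoring is complete, the key holder unmasks the groups.
groups = {code: key[code][1] for code in scores}
```

The assessor works only from `blinded_view`, so expectations about which group a subject belongs to cannot shape the scores.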
An example of how observer bias can affect research, and how blinded protocols can help, can be seen in a trial for an anti-psychotic drug. Researchers who know which subjects received the placebo and which received the trial drug may later report that the group receiving the trial drug had a calmer disposition, because they expected that outcome. Similarly, if the participants in the trial were not blinded, they might report how they are feeling differently depending on whether they received the placebo or the trial drug.
A further example can be seen in schools. School-age boys generally outperform their female peers in science; however, there is evidence that this may be a result of how they are taught and treated by their teachers, who expect boys to perform better and thus subtly encourage them. The observers, in this case the teachers who conduct tests and evaluate the results, hold a preconceived belief that boys will outperform girls, which affects their behaviour.
To complement blind or masked protocols and research, further strategies including standardised training for observers and researchers about how to record findings can be useful in the mitigation of observer bias. Clear definition of methodology, tools and the time frames allocated for the collection of findings can assist in adequately training and preparing observers in a standardised manner. Further, identifying any potential conflicts of interest within observers before commencement of the research is essential in ensuring bias is minimised.
Finally, triangulation is a method that can be used to increase the validity and credibility of findings. Triangulation in research refers to the use of a variety of methods or data sources to develop a more comprehensive and accurate understanding of the subject at hand, and it can considerably increase confidence in a study. Triangulation can occur in a few ways, including the use of multiple observers. This is itself a form of reliability, called interobserver reliability, which is measured by the percentage of times that the observers agree.
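As a minimal sketch of the percentage-agreement measure described above, the following Python snippet compares two observers' codes for the same set of events. The function names and the example ratings are hypothetical. Cohen's kappa is included as an assumption beyond the text: it is a commonly used chance-corrected alternative, not something the passage above prescribes.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Share of items (0-100) on which two observers gave the same code."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return 100.0 * matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Agreement corrected for chance (an addition, not from the text above)."""
    n = len(ratings_a)
    observed = percent_agreement(ratings_a, ratings_b) / 100.0
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Expected agreement if each observer coded independently at their own base rates.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both observers used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes from two observers watching the same ten events.
observer_1 = ["calm", "calm", "agitated", "calm", "agitated",
              "calm", "calm", "agitated", "calm", "calm"]
observer_2 = ["calm", "calm", "agitated", "agitated", "agitated",
              "calm", "calm", "calm", "calm", "calm"]

print(percent_agreement(observer_1, observer_2))
print(round(cohens_kappa(observer_1, observer_2), 3))
```

Here the observers agree on 8 of 10 events, giving 80% agreement. The kappa value is lower because some of that agreement would be expected by chance alone.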
Hawthorne effect (observer effect)
Observer bias is commonly identified only in the observers; however, a related bias also exists in those being studied. Named after a series of experiments conducted by Elton Mayo between 1924 and 1932 at Western Electric's Hawthorne Works near Chicago, the Hawthorne effect describes how participants in a study change their behaviour because they are being observed.
In the Hawthorne studies, departmental output was found to increase each time a change was made, even when the change was a return to the original unfavourable conditions. The subjects were told that better lighting would improve productivity, and their beliefs about the impact of good lighting had a greater effect on their behaviour and output than the actual lighting levels. Researchers concluded that the workers were in fact responding to the attention of the supervisors, not to the changes in the experimental variables.
To prevent the Hawthorne effect, studies using hidden observation can be useful; however, participants would be required by law to know that they are taking part in the study, and that knowledge is thought to be enough to induce the effect. Making responses or study data completely anonymous reduces the likelihood that participants will alter their behaviour because they are being observed. Conducting research before the study to establish a baseline measure can also help prevent the Hawthorne effect from significantly biasing the study's results: with a baseline established, any participant bias that arises from being observed can be evaluated. Finally, a follow-up period makes it possible to examine whether a behaviour or change is sustained beyond the observation period.
- Observer-expectancy effect, when a researcher subconsciously influences the participants of an experiment
- Mahtani, Kamal; Spencer, Elizabeth A.; Brassey, Jon; Heneghan, Carl (2018-02-01). "Catalogue of bias: observer bias". BMJ Evidence-Based Medicine. 23 (1): 23–24. doi:10.1136/ebmed-2017-110884. ISSN 2515-446X. PMID 29367322. S2CID 46794082.
- Miettinen, Olli S. (2008-11-01). "M. Porta, S. Greenland & J. M. Last (eds): A Dictionary of Epidemiology. A Handbook Sponsored by the I.E.A.". European Journal of Epidemiology. 23 (12): 813–817. doi:10.1007/s10654-008-9296-5. ISSN 0393-2990. S2CID 41169767.
- Pronin, Emily (2007-01-01). "Perception and misperception of bias in human judgment". Trends in Cognitive Sciences. 11 (1): 37–43. doi:10.1016/j.tics.2006.11.001. ISSN 1364-6613. PMID 17129749. S2CID 2754235.
- Hróbjartsson, Asbjørn; Thomsen, Ann Sofia Skou; Emanuelsson, Frida; Tendal, Britta; Hilden, Jørgen; Boutron, Isabelle; Ravaud, Philippe; Brorson, Stig (2012-02-27). "Observer bias in randomised clinical trials with binary outcomes: systematic review of trials with both blinded and non-blinded outcome assessors". BMJ. 344: e1119. doi:10.1136/bmj.e1119. ISSN 0959-8138. PMID 22371859. S2CID 23296493.
- Tripepi, Giovanni; Jager, Kitty J.; Dekker, Friedo W.; Zoccali, Carmine (2010). "Selection Bias and Information Bias in Clinical Research". Nephron Clinical Practice. 115 (2): c94–c99. doi:10.1159/000312871. ISSN 1660-2110. PMID 20407272. S2CID 18856450.
- Tuyttens, F. A. M.; de Graaf, S.; Heerkens, J. L. T.; Jacobs, L.; Nalon, E.; Ott, S.; Stadig, L.; Van Laer, E.; Ampe, B. (2014-04-01). "Observer bias in animal behaviour research: can we believe what we score, if we score what we believe?". Animal Behaviour. 90: 273–280. doi:10.1016/j.anbehav.2014.02.007. ISSN 0003-3472. S2CID 53195951.
- Samhita, Laasya; Gross, Hans J (2013-11-09). "The "Clever Hans Phenomenon" revisited". Communicative & Integrative Biology. 6 (6): e27122. doi:10.4161/cib.27122. PMC 3921203. PMID 24563716.
- Gillie, Oliver (1977). "Did Sir Cyril Burt Fake His Research on Heritability of Intelligence? Part I". The Phi Delta Kappan. 58 (6): 469–471. ISSN 0031-7217. JSTOR 20298643.
- Rosenthal, Robert; Fode, Kermit L. (1963). "Psychology of the Scientist: V. Three Experiments in Experimenter Bias". Psychological Reports. 12 (2): 491. doi:10.2466/pr0.1963.12.2.491.
- Wilgenburg, Ellen van; Elgar, Mark A. (2013-01-23). "Confirmation Bias in Studies of Nestmate Recognition: A Cautionary Note for Research into the Behaviour of Animals". PLOS ONE. 8 (1): e53548. Bibcode:2013PLoSO...853548V. doi:10.1371/journal.pone.0053548. ISSN 1932-6203. PMC 3553103. PMID 23372659.
- West, Charles (February 1980). "Book Reviews: Achenbach, Thomas M. Research in Developmental Psychology: Concepts, Strategies, Methods. New York: The Free Press, 1978. 350 + xiii pp. $14.95". Educational Researcher. 9 (2): 16–17. doi:10.3102/0013189x009002016. ISSN 0013-189X. S2CID 145015499.
- Noble, Helen; Heale, Roberta (2019-07-01). "Triangulation in research, with examples". Evidence-Based Nursing. 22 (3): 67–68. doi:10.1136/ebnurs-2019-103145. ISSN 1367-6539. PMID 31201209. S2CID 189862202.
- Carter, Nancy; Bryant-Lukosius, Denise; DiCenso, Alba; Blythe, Jennifer; Neville, Alan J. (2014-08-26). "The Use of Triangulation in Qualitative Research". Oncology Nursing Forum. 41 (5): 545–547. doi:10.1188/14.ONF.545-547. PMID 25158659.
- Persell, Stephen D.; Doctor, Jason N.; Friedberg, Mark W.; Meeker, Daniella; Friesema, Elisha; Cooper, Andrew; Haryani, Ajay; Gregory, Dyanna L.; Fox, Craig R.; Goldstein, Noah J.; Linder, Jeffrey A. (2016-08-05). "Behavioral interventions to reduce inappropriate antibiotic prescribing: a randomized pilot trial". BMC Infectious Diseases. 16 (1): 373. doi:10.1186/s12879-016-1715-8. ISSN 1471-2334. PMC 4975897. PMID 27495917.