The replication crisis (also called the replicability crisis and the reproducibility crisis) is an ongoing methodological crisis in which it has been found that the results of many scientific studies are difficult or impossible to reproduce. Because the reproducibility of empirical results is an essential part of the scientific method, such failures undermine the credibility of theories building on them and potentially of substantial parts of scientific knowledge.
The replication crisis most severely affects the social and medical sciences, where considerable efforts have been undertaken to re-investigate classic results, to determine both their reliability and, if found unreliable, the reasons for the failure. Survey data strongly indicates that all natural sciences are affected as well.
The phrase "replication crisis" was coined in the early 2010s as part of a growing awareness of the problem. Considerations around causes and remedies have given rise to a new scientific discipline called metascience, which uses methods of empirical research to examine empirical research practice.
Since empirical research involves both obtaining and analyzing data, considerations about its reproducibility fall into two categories. The validation of the analysis and interpretation of the data obtained in a study runs under the term reproducibility in the narrow sense and is discussed in depth in the computational sciences. The task of repeating the experiment or observational study to obtain new, independent data with the goal of reaching the same or similar conclusions as an original study is called replication.
A 2016 poll of 1,500 scientists conducted by Nature reported that 70% of them had failed to reproduce at least one other scientist's experiment (including 87% of chemists, 77% of biologists, 69% of physicists and engineers, 67% of medical researchers, 64% of earth and environmental scientists, and 62% of all others), while 50% had failed to reproduce one of their own experiments, and fewer than 20% had ever been contacted by another researcher unable to reproduce their work. Only a minority had ever attempted to publish a replication, and while 24% had been able to publish a successful replication, only 13% had published a failed replication, and several respondents who had published failed replications noted that editors and reviewers demanded that they play down comparisons with the original studies. In 2009, 2% of scientists admitted to falsifying studies at least once and 14% admitted to personally knowing someone who did. Such misconduct was, according to one study, reported more frequently by medical researchers than by others. A 2021 study found that papers in leading journals with findings that cannot be replicated tend to be cited more than reproducible science. Results that are published in an irreproducible form, or without enough transparency to be replicated, are more likely to be wrong and may slow progress. The authors also put forward possible explanations for this state of affairs.
Several factors have combined to put psychology at the center of the controversy. According to a 2018 survey of 200 meta-analyses, "psychological research is, on average, afflicted with low statistical power". Much of the focus has been on the area of social psychology, although other areas of psychology such as clinical psychology, developmental psychology, and educational research have also been implicated.
First, questionable research practices (QRPs) have been identified as common in the field. Such practices, while not intentionally fraudulent, involve capitalizing on the gray area of acceptable scientific practices or exploiting flexibility in data collection, analysis, and reporting, often in an effort to obtain a desired outcome. Examples of QRPs include selective reporting or partial publication of data (reporting only some of the study conditions or collected dependent measures in a publication), optional stopping (choosing when to stop data collection, often based on statistical significance of tests), post-hoc storytelling (framing exploratory analyses as confirmatory analyses), and manipulation of outliers (either removing outliers or leaving outliers in a dataset to cause a statistical test to be significant). A survey of over 2,000 psychologists indicated that a majority of respondents admitted to using at least one QRP. The publication bias (see Section "Causes" below) leads to an elevated number of false positive results. It is augmented by the pressure to publish as well as the author's own confirmation bias and is an inherent hazard in the field, requiring a certain degree of skepticism on the part of readers.
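The effect of optional stopping, one of the QRPs listed above, can be illustrated with a small simulation. The following is a minimal sketch and not taken from any cited study: the sample sizes, the number of looks at the data, the z-test with a known standard deviation, and the function names are all assumptions chosen for illustration. The true effect is zero in every simulated study, so every significant result is a false positive.

```python
import random
from statistics import NormalDist


def p_value_two_sided(sample):
    """Two-sided z-test p-value for mean 0, assuming a known standard deviation of 1."""
    n = len(sample)
    z = (sum(sample) / n) * n ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(peek_every, max_n=100, alpha=0.05, n_sim=2000, seed=1):
    """Share of simulated null studies that reach p < alpha at any look.

    peek_every == max_n means a single test at the planned sample size.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        sample = []
        for i in range(1, max_n + 1):
            sample.append(rng.gauss(0.0, 1.0))  # the true effect is zero
            if i % peek_every == 0 and p_value_two_sided(sample) < alpha:
                hits += 1
                break  # stop collecting data as soon as the test is significant
    return hits / n_sim


fixed = false_positive_rate(peek_every=100)    # one test at n = 100
optional = false_positive_rate(peek_every=10)  # test after every 10 observations
print(f"fixed sample size: {fixed:.3f}")
print(f"optional stopping: {optional:.3f}")
```

Under these assumptions the fixed design stays near the nominal 5% level, while testing repeatedly and stopping at the first significant result produces a clearly higher false positive rate.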
Second, psychology, and social psychology in particular, has found itself at the center of several scandals involving outright fraudulent research, most notably the admitted data fabrication by Diederik Stapel as well as allegations against others. However, most scholars acknowledge that fraud is, perhaps, the lesser contribution to replication crises.
Third, several effects in psychological science have been found to be difficult to replicate even before the current replication crisis. For example, the scientific journal Judgment and Decision Making has published several studies over the years that fail to provide support for the unconscious thought theory. Replications appear particularly difficult when research trials are pre-registered and conducted by research groups not highly invested in the theory under questioning.
These three elements together have resulted in renewed attention for replication supported by psychologist Daniel Kahneman. Scrutiny of many effects has shown that several core beliefs are hard to replicate. A 2014 special edition of the journal Social Psychology focused on replication studies and a number of previously held beliefs were found to be difficult to replicate. A 2012 special edition of the journal Perspectives on Psychological Science also focused on issues ranging from publication bias to null-aversion that contribute to the replication crises in psychology. In 2015, the first open empirical study of reproducibility in psychology was published, called the Reproducibility Project. Researchers from around the world collaborated to replicate 100 empirical studies from three top psychology journals. Fewer than half of the attempted replications were successful at producing statistically significant results in the expected directions, though most of the attempted replications did produce trends in the expected directions.
Many research trials and meta-analyses are compromised by poor quality and conflicts of interest that involve both authors and professional advocacy organizations, resulting in many false positives regarding the effectiveness of certain types of psychotherapy.
Although the British newspaper The Independent wrote that the results of the reproducibility project show that much of the published research is just "psycho-babble", the replication crisis does not necessarily mean that psychology is unscientific. Rather, such scrutiny is part of the scientific process, in which old ideas or those that cannot withstand careful examination are pruned, although this pruning is not always effective. The consequence is that some areas of psychology once considered solid, such as social priming, have come under increased scrutiny due to failed replications.
Nobel laureate and professor emeritus in psychology Daniel Kahneman argued that the original authors should be involved in the replication effort because the published methods are often too vague. Others, such as Andrew Wilson, disagree, arguing that the methods should be written down in detail. An investigation of replication rates in psychology in 2012 indicated higher success rates of replication in replication studies when there was author overlap with the original authors of a study (91.7% successful replication rates in studies with author overlap compared to 64.6% successful replication rates without author overlap).
Focus on the replication crisis has led to other renewed efforts in the discipline to re-test important findings. In response to concerns about publication bias and p-hacking, more than 140 psychology journals have adopted result-blind peer review, in which studies are accepted before they are conducted, on the basis of the methodological rigor of their experimental designs and the theoretical justifications for their statistical analysis techniques, rather than after completion on the basis of their findings. Early analysis of this procedure has estimated that 61 percent of result-blind studies have led to null results, in contrast to an estimated 5 to 20 percent in earlier research. In addition, large-scale collaborations between researchers working in multiple labs in different countries, which regularly make their data openly available for other researchers to assess, have become much more common in the field.
Psychology replication rates
A report by the Open Science Collaboration in August 2015 that was coordinated by Brian Nosek estimated the reproducibility of 100 studies in psychological science from three high-ranking psychology journals. Overall, 36% of the replications yielded significant findings (p value below 0.05) compared to 97% of the original studies that had significant effects. The mean effect size in the replications was approximately half the magnitude of the effects reported in the original studies.
The same paper examined the reproducibility rates and effect sizes by journal (Journal of Personality and Social Psychology [JPSP], Journal of Experimental Psychology: Learning, Memory, and Cognition [JEP:LMC], Psychological Science [PSCI]) and discipline (social psychology, cognitive psychology). Study replication rates were 23% for JPSP, 48% for JEP:LMC, and 38% for PSCI. Studies in the field of cognitive psychology had a higher replication rate (50%) than studies in the field of social psychology (25%).
An analysis of the publication history in the top 100 psychology journals between 1900 and 2012 indicated that approximately 1.6% of all psychology publications were replication attempts. Articles were considered a replication attempt if the term "replication" appeared in the text. A subset of those studies (500 studies) was randomly selected for further examination and yielded a lower replication rate of 1.07% (342 of the 500 studies [68.4%] were actually replications). In the subset of 500 studies, analysis indicated that 78.9% of published replication attempts were successful.
A study published in 2018 in Nature Human Behaviour sought to replicate 21 social and behavioral science papers from Nature and Science, finding that only 13 could be successfully replicated. Similarly, in a study conducted under the auspices of the Center for Open Science, a team of 186 researchers from 60 different laboratories (representing 36 different nationalities from 6 different continents) conducted replications of 28 classic and contemporary findings in psychology. The focus of the study was not only on whether or not the findings from the original papers replicated, but also on the extent to which findings varied as a function of variations in samples and contexts. Overall, 14 of the 28 findings failed to replicate despite massive sample sizes. However, if a finding replicated, it replicated in most samples, while if a finding was not replicated, it failed to replicate with little variation across samples and contexts. This evidence is inconsistent with a popular explanation that failures to replicate in psychology are likely due to changes in the sample between the original and replication study.
- "Independent, direct replications of others' findings can be time-consuming for the replicating researcher"
- "[Replications] are likely to take energy and resources directly away from other projects that reflect one's own original thinking"
- "[Replications] are generally harder to publish (in large part because they are viewed as being unoriginal)"
- "Even if [replications] are published, they are likely to be seen as 'bricklaying' exercises, rather than as major contributions to the field"
- "[Replications] bring less recognition and reward, and even basic career security, to their authors"
For these reasons the authors argued that psychology is facing a disciplinary social dilemma, in which the interests of the discipline are at odds with the interests of the individual researcher.
"Methodological terrorism" controversy
With the replication crisis of psychology earning attention, Princeton University psychologist Susan Fiske drew controversy for calling out critics of psychology. She labeled these unidentified "adversaries" with names such as "methodological terrorist" and "self-appointed data police", and said that criticism of psychology should only be expressed in private or through contacting the journals. Columbia University statistician and political scientist Andrew Gelman responded to Fiske, saying that she had found herself willing to tolerate the "dead paradigm" of faulty statistics and had refused to retract publications even when errors were pointed out. He added that her tenure as editor had been abysmal and that a number of published papers edited by her were found to be based on extremely weak statistics; one of Fiske's own published papers had a major statistical error and "impossible" conclusions.
Out of 49 medical studies from 1990–2003 with more than 1000 citations, 45 claimed that the studied therapy was effective. Out of these studies, 16% were contradicted by subsequent studies, 16% had found stronger effects than did subsequent studies, 44% were replicated, and 24% remained largely unchallenged. The US Food and Drug Administration in 1977–1990 found flaws in 10–20% of medical studies. In a paper published in 2012, C. Glenn Begley, a biotech consultant working at Amgen, and Lee Ellis, at the University of Texas, found that only 11% of 53 pre-clinical cancer studies could be replicated. The irreproducible studies had a number of features in common, including that studies were not performed by investigators blinded to the experimental versus the control arms, there was a failure to repeat experiments, a lack of positive and negative controls, failure to show all the data, inappropriate use of statistical tests and use of reagents that were not appropriately validated.
A survey of cancer researchers found that half of them had been unable to reproduce a published result. A similar survey by Nature of 1,576 researchers who took a brief online questionnaire on reproducibility showed that more than 70% of researchers have tried and failed to reproduce another scientist's experiments, and more than half have failed to reproduce their own experiments. "Although 52% of those surveyed agree there is a significant 'crisis' of reproducibility, less than 31% think failure to reproduce published results means the result is probably wrong, and most say they still trust the published literature."
A 2016 article by John Ioannidis, Professor of Medicine and of Health Research and Policy at Stanford University School of Medicine and a Professor of Statistics at Stanford University School of Humanities and Sciences, elaborated on "Why Most Clinical Research Is Not Useful". In the article Ioannidis laid out some of the problems and called for reform, characterizing certain points for medical research to be useful again; one example he made was the need for medicine to be "patient centered" (e.g. in the form of the Patient-Centered Outcomes Research Institute) instead of the current practice to mainly take care of "the needs of physicians, investigators, or sponsors".
Marketing is another discipline with a "desperate need" for replication. Many famous marketing studies fail to replicate, a notable example being the "too-many-choices" effect, in which a high number of product choices makes a consumer less likely to purchase. In addition to the previously mentioned arguments, replication studies in marketing are needed to examine the applicability of theories and models across countries and cultures, which is especially important because of possible influences of globalization.
A 2016 study in the journal Science found that one-third of 18 experimental studies from two top-tier economics journals (American Economic Review and the Quarterly Journal of Economics) failed to successfully replicate. A 2017 study in the Economic Journal suggested that "the majority of the average effects in the empirical economics literature are exaggerated by a factor of at least 2 and at least one-third are exaggerated by a factor of 4 or more".
In sports science
A 2018 study took the field of exercise and sports science to task for insufficient replication studies, limited reporting of both null and trivial results, and insufficient research transparency. Statisticians have criticized sports science for common use of a controversial statistical method called "magnitude-based inference" which has allowed sports scientists to extract apparently significant results from noisy data where ordinary hypothesis testing would have found none.
In water resource management
A 2019 study in Scientific Data suggested that only a small number of articles in water resources and management journals could be reproduced, while the majority of articles were not replicable due to data unavailability. The study estimated with 95% confidence that "results might be reproduced for only 0.6% to 6.8% of all 1,989 articles".
A major cause of low reproducibility is publication bias, which stems from the fact that statistically insignificant results are rarely published, or are barely discussed in publications on multiple potential effects. Among potential effects that are nonexistent (or tiny), statistical tests show significance (at the usual level) with 5% probability. If a large number of such effects are screened in a chase for significant results, these erroneously significant ones swamp the appropriately found ones, and they lead to (still erroneous) successful replications again with only 5% probability. An increasing proportion of such studies thus progressively lowers the overall replication rate below the rate that would apply to studies of plausibly relevant effects.
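The argument can be made concrete with a short calculation. The following is an illustrative sketch rather than a model from the literature: the share of screened effects that are real, the statistical power, and the function name are assumptions, and only significant results are taken to be published and later repeated.

```python
def expected_replication_rate(share_real, power, alpha=0.05):
    """Expected share of significant findings that are significant again when repeated.

    share_real: fraction of screened effects that truly exist (assumed).
    power: probability that a test detects a real effect (assumed to be the
           same for the original study and the replication).
    alpha: significance level, i.e. the chance that a null effect tests significant.
    """
    true_hits = share_real * power          # real effects found significant
    false_hits = (1 - share_real) * alpha   # null effects found significant by chance
    share_true = true_hits / (true_hits + false_hits)
    # Real effects replicate with probability `power`, null effects with `alpha`.
    return share_true * power + (1 - share_true) * alpha


for share_real in (0.5, 0.1, 0.01):
    rate = expected_replication_rate(share_real, power=0.8)
    print(f"share of real effects {share_real:>4}: replication rate {rate:.2f}")
```

With an assumed power of 0.8, the expected replication rate falls from about 0.76 when half of the screened effects are real to about 0.15 when only one in a hundred is real.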
Erroneously significant results may also come from questionable practices in data analysis known as data dredging or p-hacking, HARKing, and researcher degrees of freedom. They consist of applying different methods of data screening, outlier rejection, subgroup selection, data transformations, models, concomitant variables, and alternative estimation and testing methods, and then reporting only the variant that produces the most significant result.
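How quickly such flexibility inflates the false positive rate can be sketched with a one-line formula. The example below assumes, purely for illustration, that the analysis variants behave like independent tests of an effect that does not exist; in practice variants applied to the same data are correlated, so the inflation is smaller than shown here but does not vanish. The function name is hypothetical.

```python
def chance_of_any_significant(n_analyses, alpha=0.05):
    """Probability that at least one of n independent tests of a null effect has p < alpha."""
    return 1 - (1 - alpha) ** n_analyses


for k in (1, 5, 20):
    print(k, round(chance_of_any_significant(k), 2))
# 1 0.05
# 5 0.23
# 20 0.64
```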
An alarming rate of results that fail replication is therefore triggered by the "generation of new data/publications at an unprecedented rate" that leads to a failure to adhere to good scientific practice and the "desperation to publish or perish."
Historical and philosophical roots
In fact, some predictions of an impending crisis in the quality control mechanism of science can be traced back several decades, especially among scholars in science and technology studies (STS). Derek de Solla Price – considered the father of scientometrics – predicted that science could reach 'senility' as a result of its own exponential growth. Some present day literature seems to vindicate this 'overflow' prophecy, lamenting the decay in both attention and quality.
Philosopher and historian of science Jerome R. Ravetz predicted in his 1971 book Scientific Knowledge and Its Social Problems that science – in its progression from "little" science composed of isolated communities of researchers, to "big" science or "techno-science" – would suffer major problems in its internal system of quality control. Ravetz recognized that the incentive structure for modern scientists could become dysfunctional, now known as the present 'publish or perish' challenge, creating perverse incentives to publish any findings, however dubious. According to Ravetz, quality in science is maintained only when there is a community of scholars linked by a set of shared norms and standards, all of whom are willing and able to hold one another accountable.
Historian Philip Mirowski offered a similar diagnosis in his 2011 book Science Mart. In the title, the word 'Mart' is in reference to the retail giant 'Walmart', used by Mirowski as a metaphor for the commodification of science. In Mirowski's analysis, the quality of science collapses when it becomes a commodity being traded in a market. Mirowski argues his case by tracing the decay of science to the decision of major corporations to close their in-house laboratories. They outsourced their work to universities in an effort to reduce costs and increase profits. The corporations subsequently moved their research away from universities to an even cheaper option – Contract Research Organizations (CRO).
The crisis of science's quality control system is affecting the use of science for policy. This is the thesis of a recent work by a group of STS scholars, who identify in 'evidence based (or informed) policy' a point of present tension. Economist Noah Smith suggests that a factor in the crisis has been the overvaluing of research in academia and undervaluing of teaching ability, especially in fields with few major recent discoveries.
Social systems theory, due to the German sociologist Niklas Luhmann, offers another reading of the crisis. According to this theory, each system, such as 'economy', 'science', 'religion', 'media' and so on, communicates using its own code: true/false for science, profit/loss for the economy, news/no-news for the media. According to some sociologists, science's mediatization, commodification and politicization, as a result of the structural coupling among systems, have led to a confusion of the original system codes. If science's code true/false is replaced by those of the other systems, such as profit/loss or news/no-news, science's operation enters into an internal crisis.
When effects are wrongly stated as relevant in the literature, failure to detect this by replication will lead to the canonization of such false facts.
In the US, science's reproducibility crisis has become a topic of political contention, linked to the attempt to diminish regulations – e.g. of emissions of pollutants, with the argument that these regulations are based on non-reproducible science. Previous attempts with the same aim accused studies used by regulators of being non-transparent.
Public awareness and perceptions
Concerns have been expressed within the scientific community that the general public may consider science less credible due to failed replications. Research supporting this concern is sparse, but a nationally representative survey in Germany showed that more than 75% of Germans have not heard of replication failures in science. The study also found that most Germans have positive perceptions of replication efforts: Only 18% think that non-replicability shows that science cannot be trusted, while 65% think that replication research shows that science applies quality control, and 80% agree that errors and corrections are part of science.
Open data, open source software and open source hardware are all critical to enabling reproducibility in the sense of validation of the data analysis. The use of proprietary software, the lack of publication of analysis software and the lack of open data prevent the replication of studies. Unless software used in research is open source, reproducing results with different software and hardware configurations is impossible. CERN has both Open Data and CERN Analysis Preservation projects for storing data, all relevant information, and all software and tools needed to preserve an analysis at the large experiments of the LHC. Aside from all software and data, preserved analysis assets include metadata that enable understanding of the analysis workflow, related software, systematic uncertainties, statistics procedures and meaningful ways to search for the analysis, as well as references to publications and to backup material. CERN software is open source and available for use outside of particle physics and there is some guidance provided to other fields on the broad approaches and strategies used for open science in contemporary particle physics.
Online repositories where data, protocols, and findings can be stored and evaluated by the public seek to improve the integrity and reproducibility of research. Examples of such repositories include the Open Science Framework, Registry of Research Data Repositories, and Psychfiledrawer.org. Sites like Open Science Framework offer badges for using open science practices in an effort to incentivize scientists. However, there has been concern that those who are most likely to provide their data and code for analyses are the researchers that are likely the most sophisticated. John Ioannidis at Stanford University suggested that "the paradox may arise that the most meticulous and sophisticated and method-savvy and careful researchers may become more susceptible to criticism and reputation attacks by reanalyzers who hunt for errors, no matter how negligible these errors are".
Higher standards for original publications
Metascience is the use of scientific methodology to study science itself. Metascience seeks to increase the quality of scientific research while reducing waste. It is also known as "research on research" and "the science of science", as it uses research methods to study how research is done and where improvements can be made. Metascience concerns itself with all fields of research and has been described as "a bird's eye view of science." In the words of John Ioannidis, "Science is the best thing that has happened to human beings ... but we can do it better."
Meta-research continues to be conducted to identify the roots of the crisis and to address them. Methods of addressing the crisis include pre-registration of scientific studies and clinical trials as well as the founding of organizations such as CONSORT and the EQUATOR Network that issue guidelines for methodology and reporting. There are continuing efforts to reform the system of academic incentives, to improve the peer review process, to reduce the misuse of statistics, to combat bias in scientific literature, and to increase the overall quality and efficiency of the scientific process.
Raise the overall standards of methods presentation
Some authors have argued that the insufficient communication of experimental methods is a major contributor to the reproducibility crisis and that improving the quality of how experimental design and statistical analyses are reported would help improve the situation. These authors tend to plead for both a broad cultural change in the scientific community of how statistics are considered and a more coercive push from scientific journals and funding bodies.
Addressing the misinterpretation of p-values
Although statisticians are unanimous that the use of p < 0.05 provides weaker evidence than is generally appreciated, there is a lack of unanimity about what should be done about it. Some have advocated that Bayesian methods should replace p-values. This has not happened on a wide scale, partly because it is complicated, and partly because many users distrust the specification of prior distributions in the absence of hard data. A simplified version of the Bayesian argument, based on testing a point null hypothesis, was suggested by Colquhoun (2014, 2017). The logical problems of inductive inference were discussed in "The problem with p-values" (2016).
The hazards of reliance on p-values were emphasized by pointing out that even observation of p = 0.001 was not necessarily strong evidence against the null hypothesis. Despite the fact that the likelihood ratio in favour of the alternative hypothesis over the null is close to 100, if the hypothesis was implausible, with a prior probability of a real effect being 0.1, even the observation of p = 0.001 would have a false positive risk of 8 percent. It would not even reach the 5 percent level.
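The arithmetic behind the 8 percent figure can be reproduced with Bayes' rule in odds form. This is a minimal sketch that takes the likelihood ratio of 100 and the prior probability of 0.1 from the paragraph above as given; it does not derive the likelihood ratio from p = 0.001, and the function name is hypothetical.

```python
def false_positive_risk(likelihood_ratio, prior_real):
    """Probability that a 'positive' result is a false positive.

    likelihood_ratio: evidence for a real effect relative to the null.
    prior_real: prior probability that a real effect exists.
    """
    prior_odds = prior_real / (1 - prior_real)
    posterior_odds = likelihood_ratio * prior_odds  # Bayes' rule in odds form
    return 1 / (1 + posterior_odds)


print(round(false_positive_risk(likelihood_ratio=100, prior_real=0.1), 3))  # 0.083
```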
It was recommended that the terms "significant" and "non-significant" should not be used. p-values and confidence intervals should still be specified, but they should be accompanied by an indication of the false positive risk. It was suggested that the best way to do this is to calculate the prior probability that would be necessary to believe in order to achieve a false positive risk of, say, 5%. The calculations can be done with R scripts that are provided, or, more simply, with a web calculator. This so-called reverse Bayesian approach, which was suggested by Matthews (2001), is one way to avoid the problem that the prior probability is rarely known.
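The reverse Bayesian calculation can be sketched by solving the odds form of Bayes' rule for the prior. The code below is an illustration only and is not the R script or web calculator mentioned above; it assumes that the evidence has already been summarized as a likelihood ratio, here the value of 100 used in the earlier example, and the function name is hypothetical.

```python
def required_prior(likelihood_ratio, target_risk=0.05):
    """Prior probability of a real effect needed to reach the target false positive risk."""
    # Posterior odds in favour of a real effect implied by the target risk.
    posterior_odds = (1 - target_risk) / target_risk
    prior_odds = posterior_odds / likelihood_ratio
    return prior_odds / (1 + prior_odds)


print(round(required_prior(likelihood_ratio=100), 3))  # 0.16
```

Under these assumptions, a reader would have to believe beforehand that the effect had at least about a 16% chance of being real in order to accept a false positive risk of 5%.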
Encouraging larger sample sizes
To improve the quality of replications, larger sample sizes than those used in the original study are often needed. Larger sample sizes are needed because estimates of effect sizes in published work are often exaggerated due to publication bias and large sampling variability associated with small sample sizes in an original study. Further, using significance thresholds usually leads to inflated effects, because particularly with small sample sizes, only the largest effects will become significant.
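The inflation of published effect sizes in small samples can be illustrated with a simulation. The following is a sketch under stated assumptions, not a reanalysis of any real data: the true effect of 0.2 standard deviations, the sample sizes, the z-test with a known standard deviation, and the function name are all chosen for illustration. Only studies that reach significance are averaged, mimicking a literature that publishes significant results only.

```python
import random
from statistics import NormalDist


def mean_significant_estimate(n, true_effect=0.2, alpha=0.05, n_sim=2000, seed=7):
    """Average estimated effect among simulated studies that reach p < alpha.

    Each study draws n observations from Normal(true_effect, 1) and runs a
    two-sided z-test against zero with the standard deviation assumed known.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    significant = []
    for _ in range(n_sim):
        estimate = sum(rng.gauss(true_effect, 1.0) for _ in range(n)) / n
        if abs(estimate) * n ** 0.5 > z_crit:
            significant.append(estimate)
    return sum(significant) / len(significant)


small = mean_significant_estimate(n=20)
large = mean_significant_estimate(n=200)
print(f"true effect 0.20, mean significant estimate with n=20:  {small:.2f}")
print(f"true effect 0.20, mean significant estimate with n=200: {large:.2f}")
```

In this setup the significant small-sample studies overestimate the true effect by a wide margin, while the significant large-sample studies stay close to it, which is why a replication powered for the published effect size can be too small.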
Requiring smaller p-value
Many publications require a p-value of p < 0.05 to claim statistical significance. The paper "Redefine statistical significance", signed by a large number of scientists and mathematicians, proposes that in "fields where the threshold for defining statistical significance for new discoveries is p < 0.05, we propose a change to p < 0.005. This simple step would immediately improve the reproducibility of scientific research in many fields."
Their rationale is that "a leading cause of non-reproducibility (is that the) statistical standards of evidence for claiming new discoveries in many fields of science are simply too low. Associating 'statistically significant' findings with p < 0.05 results in a high rate of false positives even in the absence of other experimental, procedural and reporting problems."
This call was subsequently criticised by another large group, who argued that "redefining" the threshold would not fix current problems, would lead to some new ones, and that in the end, all thresholds needed to be justified case-by-case instead of following general conventions.
Tackling publication bias with pre-registration of studies
A recent innovation in scientific publishing to address the replication crisis is the registered report. The registered report format requires authors to submit a description of the study methods and analyses prior to data collection. Once the method and analysis plan is vetted through peer review, publication of the findings is provisionally guaranteed, based on whether the authors follow the proposed protocol. One goal of registered reports is to circumvent the publication bias toward significant findings that can lead to implementation of questionable research practices and to encourage publication of studies with rigorous methods.
The journal Psychological Science has encouraged the preregistration of studies and the reporting of effect sizes and confidence intervals. The editor in chief also noted that the editorial staff will be asking for replication of studies with surprising findings from examinations using small sample sizes before allowing the manuscripts to be published.
Moreover, only a very small proportion of academic journals in psychology and neurosciences explicitly state in their aims and scope or instructions to authors that they welcome submissions of replication studies. This does not encourage the reporting, or even the attempting, of replication studies.
Funding for replication studies
In July 2016 the Netherlands Organisation for Scientific Research made €3 million available for replication studies. The funding is for replication based on reanalysis of existing data and replication by collecting and analysing new data. Funding is available in the areas of social sciences, health research and healthcare innovation.
In 2013 the Laura and John Arnold Foundation funded the launch of the Center for Open Science with a $5.25 million grant, and by 2017 it had provided an additional $10 million in funding. The foundation also funded the launch of the Meta-Research Innovation Center at Stanford, run by John Ioannidis and Steven Goodman at Stanford University to study ways to improve scientific research, and provided funding for the AllTrials initiative led in part by Ben Goldacre.
Emphasizing replication attempts in teaching
Based on coursework in experimental methods at MIT, Stanford, and the University of Washington, it has been suggested that methods courses in psychology and other fields should emphasize replication attempts rather than original studies. Such an approach would help students learn scientific methodology and would provide numerous independent replications that test the replicability of meaningful scientific findings. Some have recommended that graduate students be required to publish a high-quality replication attempt on a topic related to their doctoral research prior to graduation.
Using replication studies for the final year thesis
Undergraduate students are typically required to submit a final-year thesis that consists of an original piece of research. It has been recommended that students not only be taught about open science but also be encouraged to conduct replication studies as their final-year project.
Metadata and digital tools
It has been suggested that "a simple way to check how often studies have been repeated, and whether or not the original findings are confirmed" is needed. Categorizations or ratings of reproducibility at the study or results level, as well as links to and ratings of third-party confirmations, could be added by peer reviewers, the scientific journal, or readers, in combination with novel digital platforms or tools.
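One way such a check could be represented is as a small record that links a study to the replication attempts made on it. The sketch below is purely hypothetical: the class names, fields, and example links are assumptions made for illustration and do not describe any existing platform or tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReplicationAttempt:
    """One third-party attempt to repeat a study (hypothetical schema)."""
    source_link: str   # link to the replication report
    confirmed: bool    # whether the original finding was confirmed


@dataclass
class StudyRecord:
    """A published study plus the replication attempts linked to it."""
    title: str
    attempts: List[ReplicationAttempt] = field(default_factory=list)

    def summary(self) -> str:
        """Report how often the study was repeated and how often it was confirmed."""
        total = len(self.attempts)
        confirmed = sum(1 for attempt in self.attempts if attempt.confirmed)
        return f"{self.title}: repeated {total} time(s), confirmed in {confirmed}"


if __name__ == "__main__":
    # Placeholder title and links, used only to exercise the sketch.
    record = StudyRecord("Example study")
    record.attempts.append(ReplicationAttempt("https://example.org/replication-1", True))
    record.attempts.append(ReplicationAttempt("https://example.org/replication-2", False))
    print(record.summary())
```

A real system would also need to record who supplied each rating (reviewer, journal, or reader) and at which level (study or individual result); those details are left out of this sketch.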
Wider views on the problem
Emphasize triangulation, not just replication
Marcus R. Munafò and George Davey Smith argue, in a piece published by Nature, that research should emphasize triangulation, not just replication. They claim that "replication alone will get us only so far (and) might actually make matters worse ... We believe that an essential protection against flawed ideas is triangulation. This is the strategic use of multiple approaches to address one question. Each approach has its own unrelated assumptions, strengths and weaknesses. Results that agree across different methodologies are less likely to be artefacts. ... Maybe one reason replication has captured so much interest is the often-repeated idea that falsification is at the heart of the scientific enterprise. This idea was popularized by Karl Popper's 1950s maxim that theories can never be proved, only falsified. Yet an overemphasis on repeating experiments could provide an unfounded sense of certainty about findings that rely on a single approach. ... philosophers of science have moved on since Popper. Better descriptions of how scientists actually work include what epistemologist Peter Lipton called in 1991 'inference to the best explanation'."
Shift to a complex systems paradigm
It has been argued that research endeavours working within the conventional linear paradigm necessarily end up in replication difficulties. Problems arise if the causal processes in the system under study are "interaction-dominant" instead of "component-dominant", multiplicative instead of additive, and involve many small non-linear interactions producing macro-level phenomena that are not reducible to their micro-level components. In the context of such complex systems, conventional linear models produce answers that are not reasonable, because it is not in principle possible to decompose the variance as suggested by the General Linear Model (GLM) framework; aiming to reproduce such a result is hence evidently problematic. The same questions are currently being asked in many fields of science, where researchers are starting to question assumptions underlying classical statistical methods.
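The contrast between additive and multiplicative processes can be illustrated with a toy two-factor example. The sketch below is an assumption-laden illustration, not the analysis used in the argument above: it takes two hypothetical factors with levels -1, 0 and 1, computes an outcome on the full grid, and splits the total variation into the two main effects and the interaction term, as in a standard two-way layout.

```python
LEVELS = [-1, 0, 1]  # assumed factor levels, centered on zero


def avg(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def decompose(outcome):
    """Split the variation of outcome(a, b) over the full grid into
    main-effect (A, B) and interaction (AxB) sums of squares."""
    grid = {(a, b): outcome(a, b) for a in LEVELS for b in LEVELS}
    grand = avg(list(grid.values()))
    a_mean = {a: avg([grid[a, b] for b in LEVELS]) for a in LEVELS}
    b_mean = {b: avg([grid[a, b] for a in LEVELS]) for b in LEVELS}
    ss_a = sum(len(LEVELS) * (a_mean[a] - grand) ** 2 for a in LEVELS)
    ss_b = sum(len(LEVELS) * (b_mean[b] - grand) ** 2 for b in LEVELS)
    # Whatever the two main effects cannot account for is the interaction.
    ss_ab = sum((grid[a, b] - a_mean[a] - b_mean[b] + grand) ** 2
                for a in LEVELS for b in LEVELS)
    return {"A": ss_a, "B": ss_b, "AxB": ss_ab}


if __name__ == "__main__":
    print("additive      :", decompose(lambda a, b: a + b))
    print("multiplicative:", decompose(lambda a, b: a * b))
```

For the additive outcome the interaction term is zero and the two main effects account for all of the variation. For the multiplicative outcome the main effects are zero and all of the variation sits in the interaction term, so a summary in terms of separate component contributions would report nothing.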
A creative destruction approach
Replication is fundamental to scientific progress because it confirms original findings. However, replication alone is not sufficient to resolve the replication crisis. Replication efforts should seek not just to support or question the original findings, but also to replace them with revised, stronger theories with greater explanatory power. This approach therefore involves "pruning" existing theories, comparing all the alternative theories, and making replication efforts more generative and engaged in theory-building.
Implications for the pharmaceutical industry
Pharmaceutical companies and venture capitalists maintain research laboratories or contract with private research service providers (e.g. Envigo and Smart Assays Biotechnologies) whose job is to replicate academic studies, in order to test whether they are accurate before investing in or trying to develop a new drug based on that research. The financial stakes are high for the company and investors, so it is cost-effective for them to invest in exact replications. Executing replication studies consumes resources. Further, an expert replication requires not only generic expertise in research methodology, but also specific expertise in the often narrow topic of interest. Sometimes research requires specific technical skills and knowledge, and only researchers dedicated to a narrow area of research might have those skills. Currently, funding agencies are rarely interested in bankrolling replication studies, and most scientific journals are not interested in publishing such results. Amgen Oncology's cancer researchers were only able to replicate 11 percent of 53 innovative studies they selected to pursue over a 10-year period; a 2011 analysis by researchers at the pharmaceutical company Bayer found that the company's in-house findings agreed with the original results at most a quarter of the time. The analysis also revealed that, when Bayer scientists were able to reproduce a result in a direct replication experiment, it tended to translate well into clinical applications, meaning that reproducibility is a useful marker of clinical potential.
- Data dredging
- Invalid science
- Publication bias
- Estimation statistics
- Reproducibility Project
- Research integrity
- List of scientific misconduct incidents
- Ioannidis, John P. A. (August 1, 2005). "Why Most Published Research Findings Are False". PLOS Medicine. 2 (8): e124. doi:10.1371/journal.pmed.0020124. ISSN 1549-1277. PMC 1182327. PMID 16060722.
- Staddon, John (2017). Scientific Method: How Science Works, Fails to Work or Pretends to Work. Taylor and Francis.
- Schooler, J. W. (2014). "Metascience could rescue the 'replication crisis'". Nature. 515 (7525): 9. Bibcode:2014Natur.515....9S. doi:10.1038/515009a. PMID 25373639.
- Lehrer, Jonah (December 13, 2010). "The Truth Wears Off". The New Yorker. Retrieved 2020-01-30.
- Marcus, Gary (May 1, 2013). "The Crisis in Social Psychology That Isn't". The New Yorker. Retrieved 2020-01-30.
- Baker, Monya (25 May 2016). "1,500 scientists lift the lid on reproducibility". Nature. Springer Nature. 533 (7604): 452–454. doi:10.1038/533452a. PMID 27225100. S2CID 4460617.
- Pashler, Harold; Wagenmakers, Eric Jan (2012). "Editors' Introduction to the Special Section on Replicability in Psychological Science: A Crisis of Confidence?". Perspectives on Psychological Science. 7 (6): 528–530. doi:10.1177/1745691612465253. PMID 26168108. S2CID 26361121.
- Fidler, Fiona; Wilcox, John (2018). "Reproducibility of Scientific Results". The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University. Retrieved 19 May 2019.
- Nature Video (28 May 2016). "Is There a Reproducibility Crisis in Science?". Scientific American. Retrieved 15 August 2019.
- Fanelli, Daniele (29 May 2009). "How Many Scientists Fabricate and Falsify Research? A Systematic Review and Meta-Analysis of Survey Data". PLOS ONE. 4 (5): e5738. Bibcode:2009PLoSO...4.5738F. doi:10.1371/journal.pone.0005738. PMC 2685008. PMID 19478950.
- "A new replication crisis: Research that is less likely to be true is cited more". phys.org. Retrieved 14 June 2021.
- Serra-Garcia, Marta; Gneezy, Uri (2021-05-01). "Nonreplicable publications are cited more than replicable ones". Science Advances. 7 (21): eabd1705. Bibcode:2021SciA....7D1705S. doi:10.1126/sciadv.abd1705. ISSN 2375-2548. PMC 8139580. PMID 34020944.
- Achenbach, Joel. "No, science's reproducibility problem is not limited to psychology". The Washington Post. Retrieved 10 September 2015.
- Stanley, T. D.; Carter, Evan C.; Doucouliagos, Hristos (2018). "What meta-analyses reveal about the replicability of psychological research". Psychological Bulletin. 144 (12): 1325–1346. doi:10.1037/bul0000169. ISSN 1939-1455. PMID 30321017. S2CID 51951232.
- Dominus, Susan (2017-10-18). "When the Revolution Came for Amy Cuddy". The New York Times. ISSN 0362-4331. Retrieved 2017-10-19.
- Leichsenring, Falk; Abbass, Allan; Hilsenroth, Mark J.; Leweke, Frank; Luyten, Patrick; Keefe, Jack R.; Midgley, Nick; Rabung, Sven; Salzer, Simone; Steiner, Christiane (April 2017). "Biases in research: risk factors for non-replicability in psychotherapy and pharmacotherapy research". Psychological Medicine. 47 (6): 1000–1011. doi:10.1017/S003329171600324X. PMID 27955715. S2CID 1872762.
- Hengartner, Michael P. (February 28, 2018). "Raising Awareness for the Replication Crisis in Clinical Psychology by Focusing on Inconsistencies in Psychotherapy Research: How Much Can We Rely on Published Findings from Efficacy Trials?". Frontiers in Psychology. Frontiers Media. 9: 256. doi:10.3389/fpsyg.2018.00256. PMC 5835722. PMID 29541051.
- Frank, Michael C.; Bergelson, Elika; Bergmann, Christina; Cristia, Alejandrina; Floccia, Caroline; Gervain, Judit; Hamlin, J. Kiley; Hannon, Erin E.; Kline, Melissa; Levelt, Claartje; Lew-Williams, Casey; Nazzi, Thierry; Panneton, Robin; Rabagliati, Hugh; Soderstrom, Melanie; Sullivan, Jessica; Waxman, Sandra; Yurovsky, Daniel (9 March 2017). "A Collaborative Approach to Infant Research: Promoting Reproducibility, Best Practices, and Theory‐Building". Infancy. 22 (4): 421–435. doi:10.1111/infa.12182. hdl:10026.1/9942. PMC 6879177. PMID 31772509.
- Tyson, Charlie (14 August 2014). "Failure to Replicate". Inside Higher Ed. Retrieved 19 December 2018.
- Makel, Matthew C.; Plucker, Jonathan A. (1 August 2014). "Facts Are More Important Than Novelty: Replication in the Education Sciences". Educational Researcher. 43 (6): 304–316. doi:10.3102/0013189X14545513. S2CID 145571836. Retrieved 19 December 2018.
- John, Leslie K.; Loewenstein, George; Prelec, Drazen (2012-05-01). "Measuring the Prevalence of Questionable Research Practices With Incentives for Truth Telling" (PDF). Psychological Science. 23 (5): 524–532. doi:10.1177/0956797611430953. ISSN 0956-7976. PMID 22508865. S2CID 8400625.
- Neuroskeptic (2012-11-01). "The Nine Circles of Scientific Hell". Perspectives on Psychological Science. 7 (6): 643–644. doi:10.1177/1745691612459519. ISSN 1745-6916. PMID 26168124. S2CID 45328962.
- "Research misconduct - The grey area of Questionable Research Practices". www.vib.be. 30 September 2013. Archived from the original on 2014-10-31. Retrieved 2015-11-13.
- Fiedler, Klaus; Schwarz, Norbert (2015-10-19). "Questionable Research Practices Revisited". Social Psychological and Personality Science. 7: 45–52. doi:10.1177/1948550615612150. ISSN 1948-5506. S2CID 146717227.
- Simmons, Joseph; Nelson, Leif; Simonsohn, Uri (November 2011). "False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant". Psychological Science. 22 (11): 1359–1366. doi:10.1177/0956797611417632. ISSN 0956-7976. PMID 22006061.
- Shea, Christopher (13 November 2011). "Fraud Scandal Fuels Debate Over Practices of Social Psychology". The Chronicle of Higher Education.
- Kahneman, Daniel (2014). "A New Etiquette for Replication". Social Psychology. 45 (4): 310–311. doi:10.1027/1864-9335/a000202.
- "Table of contents". Social Psychology. 45 (3). 2014. ISSN 1864-9335.
- "Table of contents". Perspectives on Psychological Science. 7 (6). 2012. ISSN 1745-6916.
- Open Science Collaboration (2015). "Estimating the reproducibility of Psychological Science" (PDF). Science. 349 (6251): aac4716. doi:10.1126/science.aac4716. hdl:10722/230596. PMID 26315443. S2CID 218065162.
- Coyne, James (April 15, 2014). "Are meta analyses conducted by professional organizations more trustworthy?". Mind the Brain. PLOS Blogs. Archived from the original on 2014-08-14. Retrieved September 13, 2016.
- Connor, Steve (27 August 2015). "Study reveals that a lot of psychology research really is just 'psycho-babble'". The Independent. London.
- Meyer, Michelle N.; Chabris, Christopher (31 July 2014). "Why Psychologists' Food Fight Matters". Slate.
- Aschwanden, Christie (19 August 2015). "Science Isn't Broken". FiveThirtyEight. Retrieved 2020-01-30.
- Aschwanden, Christie (27 August 2015). "Psychology Is Starting To Deal With Its Replication Problem". FiveThirtyEight. Retrieved 2020-01-30.
- Etchells, Pete (28 May 2014). "Psychology's replication drive: it's not about you". The Guardian.
- Wagenmakers, Eric-Jan; Wetzels, Ruud; Borsboom, Denny; Maas, Han L. J. van der; Kievit, Rogier A. (2012-11-01). "An Agenda for Purely Confirmatory Research". Perspectives on Psychological Science. 7 (6): 632–638. doi:10.1177/1745691612463078. ISSN 1745-6916. PMID 26168122. S2CID 5096417.
- Ioannidis, John P. A. (2012-11-01). "Why Science Is Not Necessarily Self-Correcting". Perspectives on Psychological Science. 7 (6): 645–654. doi:10.1177/1745691612464056. ISSN 1745-6916. PMID 26168125. S2CID 11798785.
- Pashler, Harold; Harris, Christine R. (2012-11-01). "Is the Replicability Crisis Overblown? Three Arguments Examined". Perspectives on Psychological Science. 7 (6): 531–536. doi:10.1177/1745691612463401. ISSN 1745-6916. PMID 26168109.
- Bartlett, Tom (30 January 2013). "Power of Suggestion". The Chronicle of Higher Education.
- Chambers, Chris (10 June 2014). "Physics envy: Do 'hard' sciences hold the solution to the replication crisis in psychology?". The Guardian.
- Makel, Matthew C.; Plucker, Jonathan A.; Hegarty, Boyd (2012-11-01). "Replications in Psychology Research How Often Do They Really Occur?". Perspectives on Psychological Science. 7 (6): 537–542. doi:10.1177/1745691612460688. ISSN 1745-6916. PMID 26168110.
- Stroebe, Wolfgang; Strack, Fritz (2014). "The Alleged Crisis and the Illusion of Exact Replication" (PDF). Perspectives on Psychological Science. 9 (1): 59–71. doi:10.1177/1745691613514450. PMID 26173241. S2CID 31938129.
- Aschwanden, Christie (6 December 2018). "Psychology's Replication Crisis Has Made The Field Better". FiveThirtyEight. Retrieved 19 December 2018.
- Allen, Christopher P G.; Mehler, David Marc Anton. "Open Science challenges, benefits and tips in early career and beyond". doi:10.31234/osf.io/3czyt. Cite journal requires
- Chartier, Chris; Kline, Melissa; McCarthy, Randy; Nuijten, Michele; Dunleavy, Daniel J.; Ledgerwood, Alison (December 2018), "The Cooperative Revolution Is Making Psychological Science Better", Observer, 31 (10), retrieved 19 December 2018
- Open Science Collaboration (2015-08-28). "Estimating the reproducibility of psychological science" (PDF). Science. 349 (6251): aac4716. doi:10.1126/science.aac4716. hdl:10722/230596. ISSN 0036-8075. PMID 26315443. S2CID 218065162.
- "Summary of reproducibility rates and effect sizes for original and replication studies overall and by journal/discipline". Retrieved 16 October 2019.
- Roger, Adam (2018-08-27). "The Science Behind Social Science Gets Shaken Up—Again". Wired. Retrieved 2018-08-28.
- Camerer, Colin F.; Dreber, Anna; et al. (27 August 2018). "Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015" (PDF). Nature Human Behaviour. 2 (9): 637–644. doi:10.1038/s41562-018-0399-z. PMID 31346273. S2CID 52098703.
- Klein, R.A. (2018). "Many Labs 2: Investigating Variation in Replicability Across Samples and Settings". Advances in Methods and Practices in Psychological Science. 1 (4): 443–490. doi:10.1177/2515245918810225.
- Witkowski, Tomasz (2019). "Is the glass half empty or half full? Latest results in the replication crisis in Psychology" (PDF). Skeptical Inquirer. 43 (2): 5–6. Archived from the original (PDF) on 2020-01-30.
- Earp, Brian D.; Trafimow, David (2015). "Replication, falsification, and the crisis of confidence in social psychology". Frontiers in Psychology. 6: 621. doi:10.3389/fpsyg.2015.00621. ISSN 1664-1078. PMC 4436798. PMID 26042061.
- Everett, Jim Albert Charlton; Earp, Brian D. (2015-01-01). "A tragedy of the (academic) commons: interpreting the replication crisis in psychology as a social dilemma for early-career researchers". Frontiers in Psychology. 6: 1152. doi:10.3389/fpsyg.2015.01152. PMC 4527093. PMID 26300832.
- Earp, Brian D. "Resolving the replication crisis in social psychology? A new proposal". Society for Personality and Social Psychology. Retrieved 2015-11-18.
- Letzter, Rafi (September 22, 2016). "Scientists are furious after a famous psychologist accused her peers of 'methodological terrorism'". Business Insider. Retrieved 2020-01-30.
- "Draft of Observer Column Sparks Strong Social Media Response". Association for Psychological Science. Retrieved 2017-10-04.
- Fiske, Susan T. (2016-10-31). "A Call to Change Science's Culture of Shaming". APS Observer. 29 (9).
- Singal, Jesse (2016-10-12). "Inside Psychology's 'Methodological Terrorism' Debate". NY Mag. Retrieved 2017-10-04.
- "BREAKING . . . . . . . PNAS updates its slogan! - Statistical Modeling, Causal Inference, and Social Science". Statistical Modeling, Causal Inference, and Social Science. 2017-10-04. Retrieved 2017-10-04.
- Ioannidis JA (13 July 2005). "Contradicted and initially stronger effects in highly cited clinical research". JAMA. 294 (2): 218–228. doi:10.1001/jama.294.2.218. PMID 16014596.
- Glick, J. Leslie (1992). "Scientific data audit—A key management tool". Accountability in Research. 2 (3): 153–168. doi:10.1080/08989629208573811.
- Begley, C. G.; Ellis, L. M. (2012). "Drug Development: Raise Standards for Preclinical Cancer Research". Nature. 483 (7391): 531–533. Bibcode:2012Natur.483..531B. doi:10.1038/483531a. PMID 22460880. S2CID 4326966.
- Begley, C. G. (2013). "Reproducibility: Six red flags for suspect work". Nature. 497 (7450): 433–434. Bibcode:2013Natur.497..433B. doi:10.1038/497433a. PMID 23698428. S2CID 4312732.
- Mobley, A.; Linder, S. K.; Braeuer, R.; Ellis, L. M.; Zwelling, L. (2013). Arakawa, Hirofumi (ed.). "A Survey on Data Reproducibility in Cancer Research Provides Insights into Our Limited Ability to Translate Findings from the Laboratory to the Clinic". PLOS ONE. 8 (5): e63221. Bibcode:2013PLoSO...863221M. doi:10.1371/journal.pone.0063221. PMC 3655010. PMID 23691000.
- Baker, Monya (2016). "1,500 scientists lift the lid on reproducibility". Nature. 533 (7604): 452–454. Bibcode:2016Natur.533..452B. doi:10.1038/533452a. PMID 27225100.
- Ioannidis, JPA (2016). "Why Most Clinical Research Is Not Useful". PLOS Med. 13 (6): e1002049. doi:10.1371/journal.pmed.1002049. PMC 4915619. PMID 27328301.
- Hunter, John E. (2001-06-01). "The desperate need for replications". Journal of Consumer Research. 28 (1): 149–158. doi:10.1086/321953.
- Armstrong, J.; Green, Kesten (2017-01-24). "Guidelines for Science: Evidence and Checklists". Marketing Papers. SSRN 3055874.
- Aichner, Thomas; Coletti, Paolo; Forza, Cipriano; Perkmann, Urban; Trentin, Alessio (2016-03-22). "Effects of Subcultural Differences on Country and Product Evaluations: A Replication Study". Journal of Global Marketing. 29 (3): 115–127. doi:10.1080/08911762.2015.1138012. S2CID 155364746.
- Camerer, Colin F.; Dreber, Anna; Forsell, Eskil; Ho, Teck-Hua; Huber, Jürgen; Johannesson, Magnus; Kirchler, Michael; Almenberg, Johan; Altmejd, Adam (2016-03-25). "Evaluating replicability of laboratory experiments in economics". Science. 351 (6280): 1433–1436. Bibcode:2016Sci...351.1433C. doi:10.1126/science.aaf0918. ISSN 0036-8075. PMID 26940865.
- Bohannon, John (2016-03-03). "About 40% of economics experiments fail replication survey". Science. Retrieved 2017-10-25.
- Ioannidis, John P. A.; Stanley, T. D.; Doucouliagos, Hristos (2017-10-01). "The Power of Bias in Economics Research". The Economic Journal. 127 (605): F236–F265. doi:10.1111/ecoj.12461. ISSN 1468-0297. S2CID 158829482.
- Halperin, Israel; Vigotsky, Andrew D.; Foster, Carl; Pyne, David B. (2018-02-01). "Strengthening the Practice of Exercise and Sport-Science Research". International Journal of Sports Physiology and Performance. 13 (2): 127–134. doi:10.1123/ijspp.2017-0322. hdl:10072/383414. ISSN 1555-0273. PMID 28787228. S2CID 3695727.
- Aschwanden, Christie; Nguyen, Mai (2018-05-16). "How Shoddy Statistics Found A Home In Sports Research". FiveThirtyEight. Retrieved 2018-05-16.
- Stagge, James H.; Rosenberg, David E.; Abdallah, Adel M.; Akbar, Hadia; Attallah, Nour A.; James, Ryan (2019-02-26). "Assessing data availability and research reproducibility in hydrology and water resources". Scientific Data. 6: 190030. Bibcode:2019NatSD...690030S. doi:10.1038/sdata.2019.30. ISSN 2052-4463. PMC 6390703. PMID 30806638.
- Begley, C. Glenn; Ioannidis, John P. A. "Reproducibility in Science: Improving the Standard for Basic and Preclinical Research". Circulation Research. 116: 116–126. doi:10.1161/CIRCRESAHA.114.303819.
- De Solla Price; Derek J. (1963). Little science big science. Columbia University Press. ISBN 9780231085625.
- Siebert, S.; Machesky, L. M. & Insall, R. H. (2015). "Overflow in science and its implications for trust". eLife. 4: e10825. doi:10.7554/eLife.10825. PMC 4563216. PMID 26365552.
- Della Briotta Parolo, P.; Kumar Pan; R. Ghosh; R. Huberman; B.A. Kimmo Kaski; Fortunato, S. (2015). "Attention decay in science". Journal of Informetrics. 9 (4): 734–745. arXiv:1503.01881. Bibcode:2015arXiv150301881D. doi:10.1016/j.joi.2015.07.006. S2CID 10949754.
- Mirowski, P. (2011). Science-Mart: Privatizing American Science. Harvard University Press.
- Saltelli, A.; Funtowicz, S. (2017). "What is science's crisis really about?". Futures. 91: 5–11. doi:10.1016/j.futures.2017.05.010.
- Benessia, A.; Funtowicz, S.; Giampietro, M.; Guimarães Pereira, A.; Ravetz, J.; Saltelli, A.; Strand, R.; van der Sluijs, J. (2016). The Rightful Place of Science: Science on the Verge. Consortium for Science, Policy and Outcomes at Arizona State University.
- Saltelli, Andrea; Ravetz, Jerome R. & Funtowicz, Silvio (25 June 2016). "A new community for science". New Scientist. No. 3079. p. 52.
- Saltelli, Andrea (December 2018). "Why science's crisis should not become a political battling ground". Futures. 104: 85–90. doi:10.1016/j.futures.2018.07.006.
- Smith, Noah (2016-12-14). "Academic signaling and the post-truth world". Noahpinion. Stony Brook University. Retrieved 5 November 2017.
- H. G. Moeller, Luhmann explained. Open Court Publishing Company, 2006.
- N. Luhmann, Social System. Stanford University Press, 1995.
- A. Saltelli and P.-M. Boulanger, "Technoscience, policy and the new media. Nexus or vortex?", Futures, vol. 115, p. 102491, November 2019.
- D. A. Scheufele, "Science communication as political communication", Proceedings of the National Academy of Sciences of the United States of America, vol. 111 Suppl, no. Supplement 4, pp. 13585–13592, September 2014.
- P. Mirowski. Science-Mart: Privatizing American Science. Harvard University Press, 2011.
- R. A. Pielke, Jr. The Honest Broker. Cambridge University Press, 2007.
- Nissen, Silas Boye; Magidson, Tali; Gross, Kevin; Bergstrom, Carl (December 20, 2016). "Research: Publication bias and the canonization of false facts". eLife. 5: e21451. arXiv:1609.00494. doi:10.7554/eLife.21451. PMC 5173326. PMID 27995896.
- Oreskes, N. (2018). "Beware: Transparency rule is a trojan horse". Nature. 557 (7706): 469. Bibcode:2018Natur.557..469O. doi:10.1038/d41586-018-05207-9. PMID 29789751.
- Michaels, D. (2008). Doubt is their product: How industry's assault on science threatens your health. Oxford University Press. ISBN 9780195300673.
- Białek, Michał (2018). "Replications can cause distorted belief in scientific progress". Behavioral and Brain Sciences. 41: e122. doi:10.1017/S0140525X18000584. ISSN 0140-525X. PMID 31064528. S2CID 147705650.
- Mede, Niels G.; Schäfer, Mike S.; Ziegler, Ricarda; Weißkopf, Markus (2020). "The "replication crisis" in the public eye: Germans' awareness and perceptions of the (ir)reproducibility of scientific research". Public Understanding of Science. 30 (1): 91–102. doi:10.1177/0963662520954370. PMID 32924865. S2CID 221723269.
- Ince, Darrel C.; Hatton, Leslie; Graham-Cumming, John (2012-02-22). "The case for open computer programs". Nature. 482 (7386): 485–488. doi:10.1038/nature10836. PMID 22358837.
- Junk, Thomas R.; Lyons, Louis (2020-12-21). "Reproducibility and Replication of Experimental Particle Physics Results". Harvard Data Science Review. 2 (4). arXiv:2009.06864. doi:10.1162/99608f92.250f995b. S2CID 221703733.
- Ioannidis, John P. A. (2016). "Anticipating consequences of sharing raw data and code and of awarding badges for sharing". Journal of Clinical Epidemiology. 70: 258–260. doi:10.1016/j.jclinepi.2015.04.015. PMID 26163123.
- Ioannidis, John P. A.; Fanelli, Daniele; Dunne, Debbie Drake; Goodman, Steven N. (2015-10-02). "Meta-research: Evaluation and Improvement of Research Methods and Practices". PLOS Biology. 13 (10): –1002264. doi:10.1371/journal.pbio.1002264. ISSN 1545-7885. PMC 4592065. PMID 26431313.
- Bach, Author Becky (8 December 2015). "On communicating science and uncertainty: A podcast with John Ioannidis". Scope. Retrieved 20 May 2019.
- Gosselin, Romain D. (2019). "Statistical Analysis Must Improve to Address the Reproducibility Crisis: The ACcess to Transparent Statistics (ACTS) Call to Action". BioEssays. 42 (1): 1900189. doi:10.1002/bies.201900189. PMID 31755115.
- Colquhoun, David (2015). "An investigation of the false discovery rate and the misinterpretation of p-values". Royal Society Open Science. 1 (3): 140216. arXiv:1407.5296. Bibcode:2014RSOS....140216C. doi:10.1098/rsos.140216. PMC 4448847. PMID 26064558.
- Colquhoun, David (2017). "The reproducibility of research and the misinterpretation of p-values". Royal Society Open Science. 4 (12): 171085. doi:10.1098/rsos.171085. PMC 5750014. PMID 29308247.
- Colquhoun, David. "The problem with p-values". Aeon Magazine. Retrieved 11 December 2016.
- Longstaff, Colin; Colquhoun, David. "Calculator for false positive risk (FPR)". UCL.
- Matthews, R. A. J. (2001). "Why should clinicians care about Bayesian methods?". Journal of Statistical Planning and Inference. 94: 43–58. doi:10.1016/S0378-3758(00)00232-9.
- Maxwell, Scott E.; Lau, Michael Y.; Howard, George S. (2015). "Is psychology suffering from a replication crisis? What does "failure to replicate" really mean?". American Psychologist. 70 (6): 487–498. doi:10.1037/a0039400. PMID 26348332.
- IntHout, Joanna; Ioannidis, John P. A.; Borm, George F.; Goeman, Jelle J. (2015). "Small studies are more heterogeneous than large ones: a meta-meta-analysis". Journal of Clinical Epidemiology. 68 (8): 860–869. doi:10.1016/j.jclinepi.2015.03.017. PMID 25959635.
- Button, Katherine S.; Ioannidis, John P. A.; Mokrysz, Claire; Nosek, Brian A.; Flint, Jonathan; Robinson, Emma S. J.; Munafò, Marcus R. (2013-05-01). "Power failure: why small sample size undermines the reliability of neuroscience". Nature Reviews Neuroscience. 14 (5): 365–376. doi:10.1038/nrn3475. ISSN 1471-003X. PMID 23571845.
- Greenwald, Anthony G. (1975). "Consequences of prejudice against the null hypothesis" (PDF). Psychological Bulletin. 82 (1): 1–20. doi:10.1037/h0076157.
- Amrhein, Valentin; Korner-Nievergelt, Fränzi; Roth, Tobias (2017). "The earth is flat (p > 0.05): significance thresholds and the crisis of unreplicable research". PeerJ. 5: e3544. doi:10.7717/peerj.3544. PMC 5502092. PMID 28698825.
- Benjamin, Daniel J.; et al. (2018). "Redefine statistical significance". Nature Human Behaviour. 2 (1): 6–10. doi:10.1038/s41562-017-0189-z. PMID 30980045.
- Lakens, Daniel; et al. (March 2018). "Justify your alpha". Nature Human Behaviour. 2 (3): 168–171. doi:10.1038/s41562-018-0311-x. hdl:21.11116/0000-0004-9413-F. ISSN 2397-3374. S2CID 3692182.
- Moonesinghe, Ramal; Khoury, Muin J.; Janssens, A. Cecile J. W. (2007-02-27). "Most Published Research Findings Are False—But a Little Replication Goes a Long Way". PLOS Med. 4 (2): e28. doi:10.1371/journal.pmed.0040028. PMC 1808082. PMID 17326704.
- Simons, Daniel J. (2014-01-01). "The Value of Direct Replication". Perspectives on Psychological Science. 9 (1): 76–80. doi:10.1177/1745691613514755. ISSN 1745-6916. PMID 26173243. S2CID 1149441.
- "Registered Replication Reports". Association for Psychological Science. Retrieved 2015-11-13.
- Chambers, Chris (2014-05-20). "Psychology's 'registration revolution'". The Guardian. Retrieved 2015-11-13.
- Lindsay, D. Stephen (2015-11-09). "Replication in Psychological Science". Psychological Science. 26 (12): 1827–32. doi:10.1177/0956797615616374. ISSN 0956-7976. PMID 26553013.
- Yeung, Andy W. K. (2017). "Do Neuroscience Journals Accept Replications? A Survey of Literature". Frontiers in Human Neuroscience. 11: 468. doi:10.3389/fnhum.2017.00468. ISSN 1662-5161. PMC 5611708. PMID 28979201.
- Martin, G. N.; Clarke, Richard M. (2017). "Are Psychology Journals Anti-replication? A Snapshot of Editorial Practices". Frontiers in Psychology. 8: 523. doi:10.3389/fpsyg.2017.00523. ISSN 1664-1078. PMC 5387793. PMID 28443044.
- "NWO makes 3 million available for Replication Studies pilot". NWO. Retrieved 2 August 2016.
- Apple, Sam (January 22, 2017). "The Young Billionaire Behind the War on Bad Science". Wired.
- Frank, Michael C.; Saxe, Rebecca (2012-11-01). "Teaching Replication". Perspectives on Psychological Science. 7 (6): 600–604. doi:10.1177/1745691612460686. ISSN 1745-6916. PMID 26168118. S2CID 33661604.
- Grahe, Jon E.; Reifman, Alan; Hermann, Anthony D.; Walker, Marie; Oleson, Kathryn C.; Nario-Redmond, Michelle; Wiebe, Richard P. (2012-11-01). "Harnessing the Undiscovered Resource of Student Research Projects". Perspectives on Psychological Science. 7 (6): 605–607. doi:10.1177/1745691612459057. ISSN 1745-6916. PMID 26168119.
- Marwick, Ben; Wang, Li-Ying; Robinson, Ryan; Loiselle, Hope (22 October 2019). "How to Use Replication Assignments for Teaching Integrity in Empirical Archaeology". Advances in Archaeological Practice. 8: 78–86. doi:10.1017/aap.2019.38.
- Quintana, Daniel S. (September 2021). "Replication studies for undergraduate theses to improve science and education". Nature Human Behaviour. 5 (9): 1117–1118. doi:10.1038/s41562-021-01192-8. ISSN 2397-3374.
- Munafò, Marcus R.; Smith, George Davey (January 23, 2018). "Robust research needs many lines of evidence". Nature. 553 (7689): 399–401. Bibcode:2018Natur.553..399M. doi:10.1038/d41586-018-01023-3. PMID 29368721.
- Wallot, Sebastian; Kelty-Stephen, Damian G. (2018-06-01). "Interaction-Dominant Causation in Mind and Brain, and Its Implication for Questions of Generalization and Replication". Minds and Machines. 28 (2): 353–374. doi:10.1007/s11023-017-9455-0. ISSN 1572-8641.
- Siegenfeld, Alexander F.; Bar-Yam, Yaneer (2020). "An Introduction to Complex Systems Science and Its Applications". Complexity. 2020: 1–16. arXiv:1912.05088. doi:10.1155/2020/6105872.
- Tierney, Warren; Hardy, Jay H.; Ebersole, Charles R.; Leavitt, Keith; Viganola, Domenico; Clemente, Elena Giulia; Gordon, Michael; Dreber, Anna; Johannesson, Magnus; Pfeiffer, Thomas; Uhlmann, Eric Luis (2020-11-01). "Creative destruction in science". Organizational Behavior and Human Decision Processes. 161: 291–309. doi:10.1016/j.obhdp.2020.07.002. ISSN 0749-5978.
- Tierney, Warren; Hardy, Jay; Ebersole, Charles R.; Viganola, Domenico; Clemente, Elena Giulia; Gordon, Michael; Hoogeveen, Suzanne; Haaf, Julia; Dreber, Anna; Johannesson, Magnus; Pfeiffer, Thomas (2021-03-01). "A creative destruction approach to replication: Implicit work and sex morality across cultures". Journal of Experimental Social Psychology. 93: 104060. doi:10.1016/j.jesp.2020.104060. ISSN 0022-1031.
- Wheeling, Kate (May 12, 2016). "Big Pharma Reveals a Biomedical Replication Crisis". Pacific Standard. Retrieved 2020-01-30. Updated on June 14, 2017.
- Prinz, Florian; Schlange, Thomas; Asadullah, Khusru (2011-08-31). "Believe it or not: how much can we rely on published data on potential drug targets?". Nature Reviews Drug Discovery. 10 (9): 712. doi:10.1038/nrd3439-c1. PMID 21892149.
- Bastian, Hilda (5 December 2016). "Reproducibility Crisis Timeline: Milestones in Tackling Research Reliability". Absolutely Maybe. Retrieved 5 June 2019.
- Denworth, Lydia (October 2019). "A Significant Problem: Standard scientific methods are under fire. Will anything change?", Scientific American, vol. 321, no. 4, pp. 62–67. "The use of p values for nearly a century [since 1925] to determine statistical significance of experimental results has contributed to an illusion of certainty and [to] reproducibility crises in many scientific fields. There is growing determination to reform statistical analysis... Some [researchers] suggest changing statistical methods, whereas others would do away with a threshold for defining 'significant' results." (p. 63.)
- Harris, Richard (2017). Rigor Mortis: How Sloppy Science Creates Worthless Cures, Crushes Hope, and Wastes Billions. New York: Basic Books. ISBN 9780465097906.
- Kafkafi, Neri; Agassi, Joseph; Chesler, Elissa J.; Crabbe, John C.; Crusio, Wim E.; Eilam, David; Gerlai, Robert; Golani, Ilan; Gomez-Marin, Alex; Heller, Ruth; Iraqi, Fuad; Jaljuli, Iman; Karp, Natasha A.; Morgan, Hugh; Nicholson, George; Pfaff, Donald W.; Richter, S. Helene; Stark, Philip B.; Stiedl, Oliver; Stodden, Victoria; Tarantino, Lisa M.; Tucci, Valter; Valdar, William; Williams, Robert W.; Würbel, Hanno; Benjamini, Yoav (April 2018). "Reproducibility and replicability of rodent phenotyping in preclinical studies". Neuroscience & Biobehavioral Reviews. 87: 218–232. doi:10.1016/j.neubiorev.2018.01.003. PMC 6071910. PMID 29357292.
- Ritchie, Stuart (July 2020). Science Fictions: How Fraud, Bias, Negligence, and Hype Undermine the Search for Truth. New York: Metropolitan Books. ISBN 9781250222695. Book Review (Nov. 2020, The American Conservative)