The Power of Weak Studies: Why the Synthesis of a Research Paper Matters

Date: 2013
Authors: Blakeley B. McShane, Ulf Böckenholt
Contributor: eb™ Research Team

Multiple-study articles are the norm in behavioral research, and the number of studies per article is increasing [Schimmack, 2012]. Because review practices dictate that all studies demonstrate statistical significance [Simmons et al., 2011], authors may feel obligated to run studies until they obtain a sufficient number of statistically significant results and then report only those; this leads to a form of publication bias known as the file drawer problem [Rosenthal, 1979]. Indeed, given the modest power typical of studies in behavioral research [Cohen, 1962, 1992], papers in which every study that was run passes this “statistical significance filter” [Gelman and Weakliem, 2009] are generally unrealistic. Furthermore, applying this filter guarantees upwardly biased effect size estimates, a bias that is exacerbated when the true effect size is small and when the size α of the test (i.e., the threshold p-value) is low. In effect, then, the standards of the field are self-defeating. While these observations are neither new nor unique to our field [Sterling, 1959], recent work suggests that publication bias is indeed a problem in psychology journals [Simmons et al., 2011; John et al., 2012].
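
The two quantitative claims in this paragraph can be illustrated with a short calculation. The sketch below is not taken from the article; it assumes a deliberately simple model in which each study yields a z-statistic distributed Normal(mu, 1), where mu is the true effect divided by its standard error, the test is two-sided, and studies are independent with the same mu. The function names are hypothetical. Under those assumptions it computes the power of one study, the probability that k studies are all significant, and the ratio of the expected significant estimate to the true effect.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def critical_value(alpha=0.05):
    """Two-sided critical value for a z-test of size alpha."""
    return STD_NORMAL.inv_cdf(1 - alpha / 2)


def power(mu, alpha=0.05):
    """P(|z| > c) when z ~ Normal(mu, 1); assumed model, not the article's."""
    c = critical_value(alpha)
    return (1 - STD_NORMAL.cdf(c - mu)) + STD_NORMAL.cdf(-c - mu)


def prob_all_significant(mu, k, alpha=0.05):
    """Chance that k independent studies with the same mu are all significant."""
    return power(mu, alpha) ** k


def exaggeration_ratio(mu, alpha=0.05):
    """E[z | |z| > c] / mu: how much the filter inflates the estimate.

    Uses the truncated-normal identity
    E[z * 1(|z| > c)] = mu * power + pdf(c - mu) - pdf(c + mu).
    Requires mu > 0.
    """
    c = critical_value(alpha)
    bump = STD_NORMAL.pdf(c - mu) - STD_NORMAL.pdf(c + mu)
    return (mu + bump / power(mu, alpha)) / mu


if __name__ == "__main__":
    for mu in (0.5, 1.96, 2.8):
        print(
            f"mu={mu}: power={power(mu):.3f}, "
            f"P(5 of 5 significant)={prob_all_significant(mu, 5):.4f}, "
            f"exaggeration={exaggeration_ratio(mu):.2f}"
        )
    print(f"mu=0.5, alpha=0.01: exaggeration={exaggeration_ratio(0.5, 0.01):.2f}")
```

In this toy model, a study with roughly 50% power (mu near 1.96) gives five significant results out of five only a few percent of the time, and the exaggeration ratio grows as mu shrinks or as alpha is lowered, which matches the direction of the claims above. Real studies estimate the standard error and may differ in their true effects, so the figures should be read as illustrative only.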