At this point it is hardly a surprise to learn that even scientific journals publish plenty of very low-quality work, not just solid experiments that happened to yield unsuccessful conclusions, but poorly designed studies that had no real chance of succeeding before they were ever run.
Studies that were dead on arrival. We have seen many examples.
In 1996, a psychology study claimed that subtle priming (the introduction of certain seemingly innocuous words into a quiz) could produce consistent behavioral changes.
That paper has been cited by other researchers a few thousand times, but failed replications made it clear, many years later, that this result, and much of the subsequent literature, amounted to little more than researchers hunting for patterns in noise.
My personal favorite, as a political scientist, was a survey finding published in 2012 claiming that women were 20 percentage points more likely to support Barack Obama for president during certain days of their monthly cycle.
In retrospect, this assertion made no sense and was not supported by the data. Even prospectively, the study had no chance of working: given how it was conducted, the noise in measuring any effect (in this case, an average difference in political attitudes at different parts of the cycle) was much greater than any realistically possible signal, the real result.
We see it all the time. Remember the claims that subliminal smiley faces on a computer screen could cause major changes in attitudes toward immigration? That elections are decided by college football games and shark attacks? These studies were published in serious journals or promoted in serious news outlets.
Researchers know that this is a problem. In a paper recently published in the journal Nature Human Behaviour, a team of respected economists and psychologists released the results of 21 replications of high-profile experiments.
Replication matters to researchers because it suggests that a finding may actually be right. In this study, many findings failed to replicate. On average, the replicated effect was only about half the size of the originally published claims.
Here’s where it gets really strange. The failed replications were predicted in advance by a panel of experts using a “prediction market,” in which the experts could bet on which experiments were more or less likely to hold up, that is, to be real.
Similar prediction markets have been used for many years in election forecasting, modeled on sports betting markets. In effect, the results showed that informed researchers could already tell, just from reading the papers, which findings would not hold up.
So yes, there is a problem. There has been resistance to fixing it, some of it coming from prominent researchers at leading universities. But many, if not most, scientists recognize the severity of the replication crisis and fear its corrosive effect on public confidence in science.
The challenge is what to do next. One potential solution is preregistration, in which researchers beginning a study publish their analysis plan before collecting their data.
Preregistration can be seen as a kind of time-reversed replication, and as a firewall against “data dredging,” the temptation to slice the data in many different ways until something that can be presented as statistically significant turns up.
But that will not solve the problem by itself.
The replication crisis in science is often presented as a matter of scientific procedure or integrity. But all the careful procedure and all the honesty in the world will not help if your signal (the pattern you are looking for) is small and the variation (all the confounders, the other things that could explain that pattern) is high.
From this point of view, the crisis in science is more fundamental, and resolving it means moving beyond the existing model of routine scientific practice.
Say you want to study the effect of a drug or an educational innovation on a small number of people. Unless the treatment is very tightly focused on an outcome of interest (for example, a math curriculum aimed at a particular standardized test), your study is likely to be too noisy, with too many uncontrolled variables, to detect real effects. If something random does show up and achieves statistical significance, it is likely to be a large overestimate of any true effect. In an attempted replication, we are likely to see something much closer to zero.
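This overestimation can be demonstrated with a small simulation. The sketch below, under assumed illustrative numbers (a true effect of 0.1 standard deviations and 50 people per group; nothing here comes from the article itself), repeatedly runs a noisy two-group experiment and looks only at the runs that reach statistical significance. Those “significant” runs, on average, overstate the true effect several-fold, which is exactly why replications tend to come in much smaller.

```python
import random
import statistics

random.seed(1)

TRUE_EFFECT = 0.1  # hypothetical small true effect, in standard-deviation units
N = 50             # per-group sample size: a typically underpowered study
SIMS = 5000        # number of simulated experiments

significant_estimates = []
for _ in range(SIMS):
    # Simulate one two-group experiment with unit-variance noise.
    control = [random.gauss(0, 1) for _ in range(N)]
    treated = [random.gauss(TRUE_EFFECT, 1) for _ in range(N)]
    diff = statistics.mean(treated) - statistics.mean(control)
    # Standard error of the difference in means.
    se = (statistics.variance(control) / N + statistics.variance(treated) / N) ** 0.5
    if abs(diff / se) > 1.96:  # "statistically significant" at the usual 5% level
        significant_estimates.append(diff)

power = len(significant_estimates) / SIMS
exaggeration = statistics.mean(map(abs, significant_estimates)) / TRUE_EFFECT
print(f"significant in {power:.0%} of simulations")
print(f"significant estimates overstate the true effect by a factor of {exaggeration:.1f}")
```

Because the noise (standard error) here is about twice the true effect, only estimates that happen to land far from the truth can clear the significance bar, so conditioning on significance guarantees exaggeration. The specific factor depends on the assumed numbers, but the qualitative pattern does not.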
The failed replications came as no surprise to many researchers, myself included, who have ample experience with false starts and dead ends in our own research.
The major problem in science is not cheaters or opportunists, but sincere researchers who have unfortunately been trained to believe that every statistically “significant” result is notable.
When you read about research in the news media (and as a taxpayer, you are indirectly a funder of research), you should ask what exactly was measured, and why.