The 7 deadly sins of research
The most common stumbling blocks.
10 December 2019
While strategies such as p-hacking, double-dipping, and salami slicing may boost the chances of obtaining exciting and novel results, these dodgy practices are driving down the quality of research.
Below are seven common traps that researchers can fall into when pursuing big discoveries.
1. P-hacking
In 2005, Stanford University epidemiologist John Ioannidis made a bold claim: most published research findings are false. The main culprits? Bias, small sample sizes, and p-hacking.
For almost a century, scientists have used the ‘p-value’ to determine whether their results are due to a real effect or simply chance. With the pressure to publish only significant results, some researchers may be tempted to push their p-values to below the cutoff.
P-hacking can be as simple as removing pesky outliers during analysis or as laborious as running several analyses until significant results are achieved.
Scientists may also collect more data to increase their chances of a publishable result, says Robert MacCoun, a social psychologist at Stanford.
“The problem isn't that it is bad to collect more data,” says MacCoun. “The problem is created when we only do that for hypotheses we like and want to support, and when editors only publish the subset of studies that cross that threshold.”
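A quick simulation makes the danger concrete. This is a hypothetical sketch (not drawn from any study mentioned here): the data contain no true effect, yet a researcher who keeps adding observations and re-testing until the result is "significant" crosses the 5% threshold far more often than 5% of the time. The sample sizes and test (a normal approximation to the t-test) are illustrative choices.

```python
import math
import random

def p_value(xs):
    """Two-sided p-value for 'mean differs from zero' (normal approximation)."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs) / (n - 1)
    z = m / math.sqrt(var / n)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

random.seed(0)
TRIALS, ALPHA = 4000, 0.05

fixed_hits = peek_hits = 0
for _ in range(TRIALS):
    xs = [random.gauss(0, 1) for _ in range(20)]  # pure noise: no true effect
    # honest version: one test at the planned sample size
    if p_value(xs) < ALPHA:
        fixed_hits += 1
    # "hacked" version: keep adding data and re-testing until significant
    ys = list(xs)
    hacked = p_value(ys) < ALPHA
    while not hacked and len(ys) < 50:
        ys += [random.gauss(0, 1) for _ in range(10)]
        hacked = p_value(ys) < ALPHA
    peek_hits += hacked

print(f"fixed-n false positive rate:  {fixed_hits / TRIALS:.3f}")  # near alpha
print(f"peeking false positive rate: {peek_hits / TRIALS:.3f}")    # inflated above alpha
```

Each individual test is valid; what inflates the error rate is giving chance four opportunities to produce a fluke and reporting only the flattering one.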
2. HARKing (Hypothesizing After Results Are Known)
Presenting ‘post-results hypotheses’ as if they were created before an experiment and leaving out the original hypotheses when reporting results are examples of HARKing.
A 2017 analysis of six recent surveys on HARKing self-admission rates found that 43% of researchers have HARKed at least once in their career.
Kevin Murphy, a psychologist at the University of Limerick in Ireland, says that using results to develop hypotheses “distorts the scientific process”.
“HARKing creates the real possibility that you will make up a story to ‘explain’ what is essentially a sampling error,” says Murphy, who studies research practices. “This will mislead you and your readers into thinking that you have gained some new understanding of a meaningful phenomenon.”
3. Cherry-picking data
Glenn Begley, an oncologist at BioCurate, spoke about an alarming interaction he had with a respected oncologist whose name he declined to make public.
Begley, who was vice president and global head of hematology and oncology research at the US pharmaceutical company Amgen, had attempted to replicate a high-profile study on tumour growth in animal models.
After several tries, he failed to produce the results that were published in Cancer Cell, one of the world’s most well-regarded cancer journals.
A chance meeting with the lead author resulted in a shocking admission, which Begley recounted to Science in 2011: "He said, 'We did this experiment a dozen times, got this answer once, and that's the one we decided to publish.'"
This is a classic case of cherry-picking data, where researchers only publish the results that best support their hypothesis.
Some of the consequences of cherry-picked data include biased results and making wide-reaching generalizations based on limited samples.
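The arithmetic behind that admission is straightforward: even with no real effect, each run of an experiment has a chance of clearing the conventional 5% significance threshold by luck alone, so a dozen runs give close to even odds of at least one publishable "success". A quick check (the threshold and run count here are illustrative, echoing the anecdote above):

```python
alpha = 0.05   # conventional false-positive rate per experiment
runs = 12      # "we did this experiment a dozen times"

# probability that at least one of the 12 runs is a false positive
p_at_least_one = 1 - (1 - alpha) ** runs
print(f"{p_at_least_one:.2f}")  # → 0.46
```

In other words, getting the desired answer "once in a dozen tries" is roughly what pure chance predicts.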
4. Data fabrication
From fake stem cell lines to duplicated graphs, it’s surprisingly easy for scientists to publish fabricated or “made up” data in prominent journals.
A 2009 systematic review and meta-analysis of 39 global surveys on research misconduct published in PLoS One found that almost 2% of scientists on average admitted to fabricating data at least once, but around 14% of researchers have witnessed colleagues falsifying results.
Last year, a Retraction Watch report on roughly 10,500 papers revealed that half of all retractions over the past two decades were due to fraud.
5. Salami slicing
For some researchers, a large dataset is considered a surefire way to publish multiple papers for the price of one.
Known as salami slicing, the process involves splitting a dataset into smaller chunks and writing separate papers on these ‘sliced’ results. ‘Salami publications’ share the same hypotheses, samples, and methods, which can result in misleading findings.
A 2019 analysis of more than 55,000 health science papers revealed that the number of publications per study increased by more than 20% between 1970 and 2014.
6. Not publishing negative results
Journals are biased towards publishing studies that find effects, and this makes it difficult for researchers to publish non-significant results. This bias can deter researchers from even attempting to publish their ‘failed’ studies, perpetuating the cycle further.
“People only publish when something works, and never when it doesn’t,” says Jean-Jacques Orban de Xivry, a neuroscientist at the Catholic University of Leuven in Belgium. “This means that we only publish half of the studies that we produce in laboratories. That’s troubling to me.”
According to a study of 221 social science experiments, only 21% of null findings make their way into the pages of a journal and two-thirds are never written up. By contrast, more than 95% of the experiments that produced positive findings are included in manuscripts, and more than half end up being published.
7. Circular analysis
Circular analysis, also known as double-dipping, involves using the same data multiple times to achieve a significant result.
As Orban de Xivry explains, double-dipping also includes “analyzing your data based on what you see in the data,” leading to inflated effects, invalid conclusions, and false positive results.
Double-dipping is particularly rife in neuroimaging studies, according to a 2009 paper led by neuroscientist Nikolaus Kriegeskorte of the US National Institute of Mental Health. The analysis found that 42% of papers on functional magnetic resonance imaging (fMRI) experiments published in leading journals were bolstered by circular analysis.
In a 2019 paper published in eLife, Orban de Xivry recommends that researchers develop a clear outline of analysis criteria before diving into the data, to avoid falling into the double-dipping trap.
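The effect is easy to reproduce in miniature. In this hypothetical sketch (the unit counts and sample sizes are arbitrary, not taken from the studies above), every "voxel" is pure noise. Selecting the most responsive units and then measuring the effect on the same data manufactures a large effect; measuring on held-out data, using selection criteria fixed in advance, does not.

```python
import random

random.seed(1)

def mean(xs):
    return sum(xs) / len(xs)

N_UNITS, N_OBS = 200, 20
# pure noise: two conditions with no real difference in any unit
a = [[random.gauss(0, 1) for _ in range(N_OBS)] for _ in range(N_UNITS)]
b = [[random.gauss(0, 1) for _ in range(N_OBS)] for _ in range(N_UNITS)]

# double-dipping: pick the 10 units with the biggest A-B difference,
# then report the effect measured on the SAME data used to pick them
diffs = sorted((mean(a[i]) - mean(b[i]) for i in range(N_UNITS)), reverse=True)
dipped = mean(diffs[:10])

# independent analysis: select on one half of the data, estimate on the other
half = N_OBS // 2
sel = sorted(range(N_UNITS),
             key=lambda i: mean(a[i][:half]) - mean(b[i][:half]),
             reverse=True)[:10]
clean = mean([mean(a[i][half:]) - mean(b[i][half:]) for i in sel])

print(f"effect from double-dipping: {dipped:.2f}")  # large, despite pure noise
print(f"effect from held-out data:  {clean:.2f}")   # near zero
```

Splitting selection from estimation is one simple way to honour the advice above: once the test data play no role in choosing what to test, noise can no longer masquerade as signal.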
A cautionary tale
In 2011, an influential psychology paper appeared to show that listening to certain songs can change a person’s age.
The point of the paper was to show that, in many cases, a researcher is more likely to find evidence that a false effect exists (a false-positive finding) than to correctly find evidence that it does not.
Most often it’s not the result of malicious intent, but is due to the array of decisions researchers must make throughout a study: Should more data be collected? Should some observations be excluded? Which control variables should be considered? Should specific measures be combined or transformed, or both?
“When we as researchers face ambiguous analytic decisions, we will tend to conclude, with convincing self-justification, that the appropriate decisions are those that result in statistical significance,” the team wrote.
With a few tweaks to how they analyzed, interpreted, and reported the data, the researchers, from the University of Pennsylvania and the University of California, Berkeley, achieved a scientifically impossible outcome from their two experiments. But, as they wrote in Psychological Science, “Everything reported here actually happened.”