Signs of ‘citation hacking’ flagged in scientific papers
An algorithm developed to spot abnormal patterns of citations aims to find scientists who have manipulated reference lists.
25 August 2020
Richard Van Noorden
Scientists who get too many references to their own work inserted in others’ papers — whether by prior arrangement or by asking for extra references during peer review — might leave telltale fingerprints in the citation record, say two researchers who have developed a way to detect what they call citation hacking.
“If someone is trying to manipulate their citations, they have to leave this mark,” says bioinformatician Jonathan Wren, at the Oklahoma Medical Research Foundation (OMRF) in Oklahoma City. On 13 August, he and Constantin Georgescu, also at the OMRF, posted an analysis of 20,000 authors’ citation patterns to the bioRxiv preprint server, in which they found around 80 scientists whose citations, they say, indicate “chronic, repeated” reference-list manipulation. The patterns also suggest that around 16% of authors in their sample “may have engaged in reference list manipulation to some degree”, they add.
The study has not yet been peer-reviewed and has many caveats, but its method seems technically sound, say two bibliometricians who were not involved in the work.
Researchers often complain that reviewers ask them to add unnecessary references to papers, a practice termed coercive citation. Surveys suggest that around one-fifth or more of scientists have experienced this. The publisher Elsevier said last year that after examining peer-review records at its journals, it is investigating some scientists whom it suspects of deliberately manipulating the peer-review process to boost their citations. Authors also sometimes arrange to cite each other.
Wren began to study citation patterns after he discovered an outlandish case in which a highly cited US biophysicist repeatedly manipulated the peer-review process to gain extra citations. The researcher, whom Nature identified this year as Kuo-Chen Chou, was subsequently barred as a reviewer from the journal Bioinformatics (where Wren is an assistant editor), from the editorial board of the Journal of Theoretical Biology, and, Nature has now learnt, as a reviewer for the journal Database. Chou told Nature that he had not engaged in “reviewer coercion”.
Wren says that after he uncovered Chou’s behaviour, he began getting e-mails from researchers asking him to check the records of other scholars whom they thought might be involved in citation hacking. But because most peer-review processes are confidential, Wren hoped to spot such cases by examining citation records. Heavy self-citation is easy to measure, but deciding what counts as an unusual pattern involving other authors is much harder.
Wren and Georgescu considered many potential “red flag” indicators, such as when researchers frequently receive blocks of consecutive citations in others’ papers, or get disproportionately many citations from one journal. They found that a key measure that correlates with many of the red flags is the overall skewness, or inequality, in the distribution of citations that scientists receive from others’ work: some researchers are cited an unusually large number of times by just a few papers.
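To make the idea of citation inequality concrete, here is a minimal sketch in Python. It assumes the Gini coefficient as the inequality measure, which is a common choice but is not confirmed here as the one Wren and Georgescu used. The function name and the two citation lists are hypothetical and purely illustrative; they are not data from the study.

```python
from typing import Sequence


def gini(counts: Sequence[int]) -> float:
    """Gini coefficient of non-negative counts.

    0.0 means every citing paper cites the author equally often;
    values near 1.0 mean a few papers supply most of the citations.
    (Assumed stand-in for the study's skewness measure.)
    """
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula on ascending-sorted values x_1..x_n:
    # G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical data: how many times each citing paper references one author.
typical_author = [1, 1, 2, 1, 1, 3, 1, 2, 1, 1]
skewed_author = [1, 1, 1, 1, 1, 1, 1, 1, 25, 30]

print(f"typical: {gini(typical_author):.2f}")
print(f"skewed:  {gini(skewed_author):.2f}")
```

In this toy example the second author receives most citations from just two papers, so the coefficient is much higher. As the article goes on to stress, a high value would only be a marker for closer inspection, not evidence of manipulation.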
The researchers did not have access to a database of all scientists’ work, so they analysed public records in the database PubMed. They also restricted themselves to authors with middle initials on papers, to make misidentification less likely. This limits the study, but gives an idea of the citation patterns for around 20,000 scientists. Around 80 — including Chou — have extremely skewed patterns of citations accrued from others, together with other red-flag indicators. Researchers who have strange patterns of citations tend also to cite themselves heavily, the study found.
Asked for comment, Chou told Nature that the study was “meaningless”, because the “number of citations is not important”.
Wren does not directly name researchers in the paper, because of legal concerns, but he analyses a few individual citation records in detail and says the full figures are available on request. One such scientist, according to supplementary data Wren showed Nature, is Dimitrios Roukos, an oncology researcher at the University of Ioannina in Greece. From 2009 to 2014, Roukos benefited from almost 2,000 citations from 71 papers in one journal, Surgical Endoscopy. Each study referenced his work around 20–30 times, and was written by colleagues or mentors. Roukos did not reply to requests for comment for this story.
The analysis only points to unusual citing patterns, and can’t assess whether a researcher actually did arrange for extra references to their work; there may be innocent explanations for strange distributions, Wren notes. They might be skewed by mega-reviews of a field that heavily cite key researchers, for instance, although he says he tested this by removing such reviews from the analysis. He sees skewed citing patterns as markers for further investigation.
Plotting the global distribution of skewness in records of scientists’ citations by others ought to produce a symmetrical curve, says Wren, but doesn’t. On that basis, he suggests that around 16% of authors overall have engaged in some kind of reference-list manipulation at one time or other, even if it’s not possible to conclude that their individual records are unnatural. But Ludo Waltman, a bibliometrician at Leiden University in the Netherlands, says that he doesn’t feel comfortable with the way the analysis draws a binary distinction between ‘manipulated’ and ‘non-manipulated’ references, when there are many complex reasons for citing others.
Wren says he’d like editors and reviewers to develop a database that makes clear which references were added during peer review. Both Waltman and Vincent Larivière, a bibliometrician at the University of Montreal in Canada, say that making peer-review reports more transparent might help to address the issue.
An underlying problem that incentivizes citation hacking is that scientists are too often rewarded on the basis of simple citation counts, says Waltman. This, he adds, is what ultimately needs to change.