Incentive systems in academic journals
Four hundred years ago, scientific insights were communicated primarily through personal letters or in-person meetings. As the printing industry evolved in the 17th century, journals emerged to distribute hard copies of scientific research to individual readers. Government research funding slowly increased, and so did the number of researchers and institutions.
Demand for funding and academic status grew faster than supply, creating an increasingly competitive academic environment. Metrics based on citations, publications, student numbers, grants and journal prestige were created to differentiate the status of stakeholders in academia. Understandably, academics began optimizing for these metrics.
Unfortunately, when a metric becomes a target, it ceases to be a good metric. As a result, academia has since seen an increase in salami publications (splitting one study into two or more papers), ghost authorship (contributors omitted from author lists), p-hacking (massaging data or analyses until results look significant), metric manipulation, faked research data, faked peer reviews, and plagiarism by peer reviewers.
Papers became the way scientists could prove they were performing valuable research. Funders, PIs, institutions and many investors began to evaluate scientists based on their papers.
A survey of 25,000 academic scientists showed that mean impact factor and total number of publications were the metrics most strongly correlated with academic success. Another survey showed that nearly half of all grant proposal reviewers considered the number of publications and mean impact factor highly important to their decisions.
Eugene Garfield, the first person to propose a citation index, intended it to let scientists track citations between papers and identify whether critiques had been published of papers they planned to cite. He even explicitly warned in some of his writings against using the citation index as an evaluation metric.
Above all, this new incentive structure brought huge benefits to prestigious journals. Publishers quickly realized that a subscription model for academic institutions brought significantly more revenue than one for individuals. More and more journals were created, and many were absorbed by the top five publishers: Elsevier, Taylor & Francis, Wiley-Blackwell, Springer and Sage. In 2013, these five companies published 53% of all scientific research.
Clearly, an oligopoly with flawed incentives had been created.
Academic publishing has several components, and I'm going to go through each one, explaining the status quo and the incentives behind it.
Quality of research and papers
Over 2.5 million papers are published every year, but only half of them are ever read by anyone who isn't an author, reviewer or editor. A system over-optimized for publishing prioritizes quantity over quality, which is far from ideal if what we value is scientific insight and discovery.
It's common now to assume that because a paper was published in Nature, Science or another prestigious journal, it must therefore be true.
Science has become more about citations than replication. Over 70% of papers cannot be reproduced. Some of this can be explained by natural variability in fields like biology, but most of these papers are not reproducible even after allowing a very generous margin of error.
Poorly documented methods are a common issue: methods sections are difficult to follow and routinely omit key steps and measurements.
Reviewers are pushing low-quality research forward. For example, editors and peer reviewers at Springer and IEEE accepted hundreds of computer-generated papers, which were retracted a few years later. A reporter for Science sent a fake paper to 304 open-access journals, and 157 accepted it.
Citation count is the most common metric for determining journal prestige. However, papers that are cited less are often more reproducible than those that are cited more, and highly novel papers take longer to accumulate citations.
Peer review
Peer reviews are done for free and are considered part of a researcher's scientific duties. Reviews are also usually single-blind, meaning the reviewer is anonymous to the author, so the reviewer has little accountability and no clear incentive to provide fair, high-quality reviews.
Editors try to find peer reviewers working in the same niche as the author, which means the reviewer and the author are often competing to solve a similar problem. This is especially prominent in Alzheimer's research: because so many researchers are tackling the same problem, new work is often blocked from publication by reviewers who are effectively competitors.
Since funding within a field is limited, reviewers are often incentivized to reject a paper in order to secure more funding for themselves.
Bias
Reviewers are biased by who a paper's authors are. In one study, 8 out of 9 papers that had previously been accepted for publication were rejected after the authors' names were changed. Papers by big-name scientists are much more likely to be pushed through than others, regardless of quality.
Currently, 20% of scientists perform up to 94% of reviews, meaning a fifth of researchers hold significant power over what ends up being published.
I looked at the existing solutions tackling this problem, including preprints, open-access journals and Sci-Hub. Unfortunately, beyond participation in the "open science" movement, there are no incentives for researchers to opt into these new systems.
So some open questions I have: How do we scale experiment replication? (We need to start releasing both data and code with papers.) How do we backtrace? How do we incentivize people to publish in alternative journals? How do we incentivize researchers to publish failed studies? Is open-access science something that universities and researchers actually want?
A lot of these statistics were gathered from one article, which I can no longer find as these notes were written many months ago. If you’re reading this and find a lot of similarities with another post you’ve read, please reach out so I can credit them!