INTRODUCTION
When many researchers are doing their best to evaluate very complicated papers under extreme time pressure, it is rather pessimistic – even cynical – to describe peer review as an elaborate system for keeping a free and open scientific discourse at bay. Looking at the whole system from a distance, though, one can perhaps be forgiven for perceiving it that way. What, actually, is peer review? As expected, it is different things for different purposes. The peer review that supports decisions about hiring researchers, or funding them, has purposes different from those of deciding whether a scientific article is suitable for the open scientific discourse, and so should be published. It is the latter kind of peer review that I am addressing.
Much is on the line for researchers, so having a love/hate relationship with peer review (though more hate than love seems to be expressed by researchers, if blogs are any guide) is understandable. Peer review has been described, presumably by a “victim”, as follows [1]:
Think of your meanest high school mean girl at her most gleefully, underminingly vicious. Now give her a doctorate in your discipline, and a modicum of power over your future. That's peer review.
I understand the feeling, and I am sure “meanness” occurs from time to time, though I do not think this is generally a fair characterization. That said, Richard Horton, Editor-in-Chief of The Lancet, as quoted by E.E. van der Wall, has stated [2] that:
Peer review to the public is portrayed as a quasi-sacred process that helps to make science our most objective truth teller, but we know that the system of peer review is biased, unjust, unaccountable, incomplete, easily fixed, often insulting, usually ignorant, occasionally foolish, and frequently wrong.
As just about anybody working in science will realize, this rings painfully true. Peer review has been described and defined in different ways, but in the context of current research journal publishing it comes down to “judging publication-‘worthiness’ in the particular journal that commissioned the peer review.”
Does peer review done in this way really benefit science? My response to that question, which is more serious than it may look, is (you would have to imagine the “science” in the question being “pronounced” with a capital S, as it were): “Perhaps – and many other journals, too … who purport to choose on ‘quality’, which really means: does it advance my impact factor and therefore the prestige of my brand?”
Richard Smith – former editor of the British Medical Journal – does not believe it benefits science or the scientific community. He says: “Slay the peer review sacred cow” [3]. I have sympathy for his stance, and a clear benefit to science has not been demonstrated as far as I am aware, but I would not go that far, at least not just yet: I think there are – perhaps intangible – benefits to peer review, and most problems stem from the way we organize peer review and set its goals rather than from the concept of peer review itself. That said, some of its characteristics are not exactly encouraging: slow, inefficient, unreliable, highly variable, ineffective, arbitrary, undermining scientific scepticism, confirmation-biased, putting careerism before science, and expensive, to name a few. I will address these one by one.
SLOW
While rapid peer review occasionally occurs, generally the process takes months, in certain areas even more than a year. The reaction of this researcher speaks volumes [4]:
A couple of weeks ago, an article of mine was rejected by a journal, and that was pretty depressing. But what was shocking about it was that the rejection only took eight weeks. I have never before had anything at all come back to me, positive or negative, in fewer than six months.
Delays of this magnitude are obviously not good for scientific progress, particularly not in the faster-moving areas. The slowness is exacerbated by the lack of efficiency of the process. For individual researchers, this can be a career breaker if, while waiting for the process to be completed, their results, or similar ones, are published first by someone else.
INEFFICIENT
Submitted manuscripts are not often accepted for publication right away, or with only minor revisions. Very often they are rejected by the first journal they are submitted to, owing to the phenomenon of researchers “aiming high”. Estimates vary, but an average of three cycles of submission and rejection before a manuscript is accepted for publication does not seem implausible; perhaps it is more. Every time a manuscript is submitted to another journal, at least two peer reviewers are needed (and as many as 10 may need to be contacted). If the estimate of three cycles is correct, this means that six qualified peer reviewers need to be engaged for every article published. Given the relentless year-on-year increase in the number of papers published, one has to question the scalability of the process.
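To make the arithmetic explicit, here is a minimal back-of-the-envelope sketch. The inputs are the rough estimates from the paragraph above – three submission cycles, at least two reviewers per cycle, up to ten reviewers contacted per cycle – not measured data.

```python
# Back-of-the-envelope arithmetic for the reviewer load implied above.
# All inputs are the rough estimates quoted in the text, not measured data.
submission_cycles = 3       # estimated submissions before a manuscript is accepted
reviewers_per_cycle = 2     # minimum number of completed reviews per submission
contacted_per_cycle = 10    # reviewers that may need to be contacted per submission

reviews_per_published_article = submission_cycles * reviewers_per_cycle
contacts_per_published_article = submission_cycles * contacted_per_cycle

print(reviews_per_published_article)    # 6 completed reviews per published article
print(contacts_per_published_article)   # up to ~30 reviewer invitations per published article
```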
UNRELIABLE
I have already quoted Richard Horton above, but he has more to say about the scientific literature. In a recent comment [5] in The Lancet, the journal of which he is the Editor-in-Chief, he says:
Much of the scientific literature, perhaps half, may simply be untrue. Part of the problem is that no-one is incentivised to be right. Instead, scientists are incentivised to be productive and innovative.
The pressure of “publish or perish” on researchers is enormous. If they care about a career in science, they not only have to publish regularly (read: a lot); the journals in which they manage to publish, particularly their place in the pecking order (read: impact factor), are also deemed crucial. This often leads to trumped-up results and to attempts at presenting research as more original and innovative than it really is. The perceived prestige of having a paper in a journal at the top of the pecking order is a powerful incentive to do everything possible to get in. It is not surprising that exaggerated claims – and subsequent retractions – occur proportionally more often in “prestigious” journals.
HIGHLY VARIABLE
At the end of 2010, Lutz Bornmann, Rüdiger Mutz, and Hans-Dieter Daniel published an article in PLOS ONE, entitled “A Reliability-Generalization Study of Journal Peer Reviews: A Multilevel Meta-Analysis of Inter-Rater Reliability and Its Determinants” [6]. In it they write that it is commonly assumed that:
…the reviewers, being experts, are able to make a more or less objective judgement. In other words, when a reviewer says that a paper's good or bad, they're reporting something about the paper, not just giving their own personal opinion.
“If true”, they continue, “reviewers ought to agree with each other about the merits of each paper.”
However, they find that “The most important weakness of the peer review process is a lack of inter-rater reliability, defined as ‘the extent to which two or more independent reviews of the same scientific document agree.’”
That said, it is not altogether clear that low inter-rater reliability is necessarily as problematic as it might seem, especially if, of two reviewers, one is a specialist and one a generalist, each of whom may look at the manuscript they have been asked to review very differently. But it does show that opinions on the suitability of a paper for publication often differ quite a lot, undermining the value of peer review and giving succour to the opinion of Richard Smith, mentioned above.
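Inter-rater reliability can be quantified in several ways. As a minimal illustration only – Bornmann, Mutz, and Daniel performed a multilevel meta-analysis of published reliability coefficients, not this calculation – Cohen’s kappa for two reviewers’ hypothetical accept/reject verdicts could be computed as follows.

```python
# Minimal sketch: Cohen's kappa for two reviewers' accept/reject verdicts.
# Illustrative only; the verdicts below are hypothetical, and this is not
# the method used in the meta-analysis cited above.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)

    # Observed agreement: fraction of manuscripts with identical verdicts.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected chance agreement, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum((freq_a[c] / n) * (freq_b[c] / n) for c in categories)

    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical verdicts on ten manuscripts (1 = accept, 0 = reject).
reviewer_1 = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
reviewer_2 = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
print(round(cohens_kappa(reviewer_1, reviewer_2), 2))  # 0.23: barely better than chance
```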
INEFFECTIVE
Regarding research misconduct, at any rate. Vera Sharav has a scathing opinion [7] about the Institute of Medicine [8] Report “Integrity in Scientific Research: Creating an Environment That Promotes Responsible Conduct.” The report, as well as editorials in The Lancet and Science, “reveal an astonishing lack of resolve to hold scientists who have been found guilty of scientific misconduct – including fraud – accountable for their actions” and “Peer review has proven ineffective in restraining wrongdoers.” She quotes The Lancet editorial [9] of August 2002:
The IOM report recognises the multiple players involved: individual researchers; institutions; funding agencies; journals; scientific societies; governments; and the environment in which research is conducted, such as public opinion and sociopolitical priorities. The recent adoption by several journals, The Lancet included, of more transparency in conflict of interest disclosure was named as a positive step to influence research integrity, albeit indirectly. All these participants have complex reciprocal relationships with one another. The report rightly categorises the individual scientist as both the most influential and the most unpredictable variable. But it places the responsibility for dealing with research integrity firmly on research institutions.
No responsibility on the journals, it seems, in spite of their “rigorous” peer review.
ARBITRARY
Articles resubmitted to the same journals that had already published them were rejected the second time around. That is what Douglas P. Peters and Stephen J. Ceci report [10]. The study is quite old (1982), but I have no reason to believe the outcome would be very different today. A quote from their article:
With fictitious names and institutions substituted for the original ones, the altered manuscripts were formally resubmitted to the journals that had originally refereed and published them 18 to 32 months earlier. Of the sample of 38 editors and reviewers, only three (8%) detected the resubmissions. This result allowed nine of the 12 articles to continue through the review process to receive an actual evaluation: eight of the nine were rejected. Sixteen of the 18 referees (89%) recommended against publication and the editors concurred. The grounds for rejection were in many cases described as ‘serious methodological flaws’.
It is difficult to believe in the robustness of the peer review carried out by those journals.
UNDERMINING SCIENTIFIC SCEPTICISM
Scepticism is a key feature of the inquisitive scientific mind. Research results are never “final”, even if evidence seems solid. Scepticism is what makes science work. Someone who calls himself the Skeptical Raptor, but who otherwise wishes to remain anonymous, put it like this:
I find it amusing that there are individuals who use peer reviewed articles as the final word on a topic, assuming that the article itself is some central dogmatic word from a higher power.
There is a danger, though, that putting the label “peer-reviewed” on an article gives it more than its due authority, especially in the eyes of a less experienced reader. Peer review does not deliver any guarantee that an article accepted for publication is correct. That is not the task of a peer reviewer, either. Their task is just to judge whether a particular paper is worth adding to the scientific discourse. There are examples of reviewer decisions that clearly support that idea. A relatively recent one is this, paraphrased in an editorial [11] by Alan Singleton:
This is a remarkable result – in fact, I don’t believe it. However, I have examined the paper and can find no fault in the author's methods and results. Thus I believe it should be published so that others may assess it and the conclusions and/or repeat the experiment to see whether the same results are achieved.
This reviewer understood his or her role correctly. Another example is one from almost a century ago: Oliver Lodge, editor of the Philosophical Magazine, described a 1924 submission by Felix Ehrenhaft as “either badly written or badly translated” and almost certainly incorrect from a scientific perspective. But he agreed that it ought to be inserted in the journal [12]. His motivation is likely to have been similar to that of the reviewer in the first example: it may not be correct, but it is worth discussing in the community. Scientists with a healthy dose of scepticism – in other words, any scientist worth his or her salt – can easily deal with something that is uncertain and do not need the (illusory) reassurance of peer review.
CONFIRMATION-BIASED
People have a deep-seated tendency to favour evidence consistent with their expectations. Scientists may collect and interpret data objectively, but if those data do not support their hypothesis, many are reluctant to publish their results. Even when they do submit manuscripts with negative results for publication, journal editors and publishers seem rather reluctant to publish them. The problem is an old one, apparently already commented on three centuries ago by Francis Bacon. More recently, though still 40 years ago, in 1977, Michael Mahoney commented in an article [13] in the journal Cognitive Therapy and Research:
If we selectively ‘find’ or communicate only those data that support a given model […], then our inquiry efforts will hardly be optimally effective.
Things do not seem to be changing in a direction more favourable to the visibility of negative results. Quite the contrary. The Economist reported [14] in 2013 that the share of published papers reporting “negative results” had fallen from 30% in 1990 to around 14%.
“Negative results” may have been a concept with a rather wide definition by The Economist because it is hard to believe that as much as 30% of published papers reported on negative results in 1990. I have no insight into the figures and method of the report of The Economist, and I may well be exposing my own confirmation bias by expressing scepticism about that percentage. But ignoring negative results in the scientific literature is not good for science. At the very least it is likely to cause unnecessary duplication of effort, and scarce research resources may be used to test hypotheses again and again, if the results of earlier experiments are not available.
PUTTING CAREERISM BEFORE SCIENCE
The incentives to advance one's career are naturally strong. But not necessarily all that compatible with advancing science. In the same article in The Economist as cited above, we find:
The obligation to ‘publish or perish’ has come to rule over academic life. Competition for jobs is cut-throat.
Nowadays verification (the replication of other people's results) does little to advance a researcher's career. And without verification, dubious findings live on to mislead.
The hierarchical prestige structure of the journal publishing system, with its strong emphasis on impact factor and other metrics, reinforces these career incentives. Cash bonuses, even substantial ones amounting to tens of thousands of dollars, have been reported for getting a paper into a top journal. These practices apparently occur in China, Korea, Turkey, and perhaps other countries, too. Shao Jufang and Shen Huiyun report from China, in an article [15] in the journal Learned Publishing, that an article in Science or Nature earns a Chinese author a bonus of more than 30,000 dollars.
The tendency of peer reviewers to judge manuscripts on their ability to boost a journal’s prestige – read: its impact factor – ensures that the interests of science play second fiddle to the interests of scientists.
EXPENSIVE
Peer review is an academic exercise, the organization of which is currently outsourced to publishers, who gratefully accept that role, of course. To them it must look as if the money is simply thrown at them, so why would they not pick it up? Peer review is very expensive. Not peer review itself, of course; it is the involvement of publishers as organizers and administrators of peer review that makes it expensive. Up to 99% of the cost of publishing is spent on the notion (delusion?) that articles can be trusted when they are published in a “peer-reviewed” journal. Let me illustrate that with a simple calculation. Recently on the Scholarly Kitchen blog it was reported [16] that technical preparation (XML coding, etc.) and hosting cost an average of 47 dollars per article at PubMed Central. Any other costs are essentially those associated with the organization of peer review. That means the cost of peer review averages 2,953 dollars per article published in a typical hybrid journal with an APC of 3,000 dollars, and 4,953 dollars in a typical subscription journal with revenues of 5,000 dollars per article. If these figures are correct – and I have no reason to believe they are materially wrong – the cost of technical preparation is less than one percent of the average per-article revenue of a typical subscription journal. The only caveat is that the per-article revenue of a journal also covers overheads, profits, and the organization of peer review for all submitted and reviewed manuscripts, including those that are rejected.
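A minimal sketch of this arithmetic, using the figures as quoted above (the 47-dollar technical cost, a 3,000-dollar APC, and 5,000 dollars of per-article subscription revenue), is:

```python
# Sketch of the cost arithmetic above, using the figures quoted in the text.
technical_cost = 47          # dollars per article (Scholarly Kitchen figure for PubMed Central)
hybrid_apc = 3000            # typical hybrid-journal article processing charge, in dollars
subscription_revenue = 5000  # typical subscription-journal revenue per article, in dollars

# Everything not spent on technical preparation is attributed, as in the text,
# to organizing peer review (plus the overheads and profit noted in the caveat).
print(hybrid_apc - technical_cost)                      # 2953 dollars per hybrid article
print(subscription_revenue - technical_cost)            # 4953 dollars per subscription article
print(round(technical_cost / subscription_revenue, 3))  # 0.009, i.e. less than 1%
```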
The current system impedes scientific communication, while its focus is strongly on scientific career advancement at the expense of the scientific discourse. We seem to prefer it that way, and spend a lot of money keeping proper scientific discourse at bay.
QUO VADIS?
In spite of being as sceptical of the value of peer review as Richard Smith is, I would not go so far as to advocate its complete abolition – see my comment at the beginning of this piece. But that does not mean we cannot remove, or at least ameliorate, the negative effects of the current practice. We have seen what peer review is, and what the consequences are of the way we currently do it. But what should it be? How might we go forward? Almost no paper can be guaranteed to be correct, and I would define the purpose of peer review as “helping to avoid unnecessary errors occurring in the scientific literature.” That makes it help offered to authors, much more than to journals. Publishers can remain involved, but primarily to produce technologically superb and robust, easily and openly accessible literature; they should not be in charge of peer review, which they can – and do – generally use as justification for high prices, be they subscription fees or article processing charges for open access. By the way, I am sceptical of the reported technical publishing cost of 47 dollars per paper. The realistic cost of cleaning up (manuscripts are often a formatting mess), proper XML coding, rendering in different formats, hosting, and overheads is more likely in the order of 300 to 400 dollars per paper, in my estimate, especially if one also includes a small profit for the publisher, which is reasonable, in my opinion. Still, that is only in the order of one-tenth of what we pay now.
How could this be achieved, keeping peer review and dramatically reducing the cost of publishing? Peer review itself is an academic community exercise, not a publisher's one. Let's start there. Imagine a system of “peer review by endorsement.”
PEER REVIEW BY ENDORSEMENT
The principle is quite straightforward, really. Authors themselves invite at least two peers to review their paper, according to some rules to avoid nepotism and friend-bias: peer-endorsers must be active researchers, and must not be, nor in the last five years have been, at the same institution as, or a co-author of, any of the authors. When these peers endorse its publication – most likely after some iteration with the authors to clarify questions that arise – they do so openly, fully disclosing their identities, and perhaps also giving a motivation as to why they endorse publication, a motivation which may, of course, be along the lines of what Alan Singleton reported, or of the comments of Oliver Lodge (see above). Many publishers already ask authors to suggest who should be invited to review their papers, so authors issuing the invitations themselves would be a relatively small step. Peer review can only operate on trust anyway. It is the openness of peer review that builds trust, not the party who invites the review.
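As a purely hypothetical illustration of how such eligibility rules could be checked – this is not ScienceOpen’s actual implementation, and all names and data structures here are assumptions – a sketch might look like this:

```python
# Hypothetical sketch of the anti-nepotism rules described above; not
# ScienceOpen's actual implementation. All data structures are assumptions.
from dataclasses import dataclass, field

CONFLICT_WINDOW_YEARS = 5  # "not ... for at least five years", per the text

@dataclass
class Researcher:
    name: str
    active: bool = True                               # currently an active researcher
    affiliations: dict = field(default_factory=dict)  # institution -> last year affiliated
    coauthors: dict = field(default_factory=dict)     # co-author name -> last joint paper year

def eligible_endorser(endorser: Researcher, authors: list, current_year: int) -> bool:
    """True if the would-be endorser passes the (simplified) conflict checks."""
    if not endorser.active:
        return False
    for author in authors:
        # Recent shared institution disqualifies (simplification: compares
        # each party's most recent year at that institution).
        for institution, year in endorser.affiliations.items():
            author_year = author.affiliations.get(institution)
            if author_year is not None and \
               current_year - min(year, author_year) < CONFLICT_WINDOW_YEARS:
                return False
        # Recent co-authorship disqualifies.
        last_joint = endorser.coauthors.get(author.name)
        if last_joint is not None and current_year - last_joint < CONFLICT_WINDOW_YEARS:
            return False
    return True
```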
Such a peer-review-by-endorsement system is likely to be at least as good as, and quite probably better than, the currently widespread “black box” of anonymous peer review. Since reviews/endorsements would be signed and non-anonymous, there is very little danger of sub-standard articles being published: endorsers/reviewers would not want to put their reputations at risk. ScienceOpen has launched Peer Review by Endorsement as an option for authors. Peer Review by Endorsement occurs, just like conventional peer review, before publication, and is entirely open and transparent. Articles published this way will, of course, also be available for Post-Publication Peer Review, as are all OA articles aggregated on the ScienceOpen site. ScienceOpen will take such peer-endorsed manuscripts and turn them into professionally published and complete (including data and metadata) documents, adhering to all the relevant technical, presentational, and unique-identifier standards, in a number of formats, linked and linkable to databases and other relevant information, human- and machine-readable, suitable for widespread usage, for text and data mining, for structured analysis (including semantic analysis) and further knowledge discovery, and, crucially, for long-term preservation in repositories and archives of any kind.
More details about Peer Review by Endorsement can be found on the ScienceOpen site: http://about.scienceopen.com/peer-review-by-endorsement-pre/