Vigilant scientists [UPDATED -- March 5th, 2016]

In an editorial entitled “Vigilante science”, the editor-in-chief of Plant Physiology, Michael Blatt, makes the hyperbolic claim that anonymous post-publication peer review by the PubPeer community represents the most serious threat to the scientific process today.

We obviously disagree. We believe a greater problem, which PubPeer can help to address, is the flood of low-quality, overinterpreted and ultimately unreliable research being experienced in many scientific fields, but especially in life sciences. In a famous paper (1), John Ioannidis explained how a combination of low statistical power and publication bias has resulted in the expectation that a majority of publications are unreproducible. And of course a paper may contain many, many problems in addition to bad statistics. Just one example is the issue of contaminated cancer cell lines.
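For readers who want to see how low power and bias combine, here is a minimal sketch of the positive predictive value (PPV) formula from the Ioannidis paper (1): the probability that a claimed finding is true, given the pre-study odds R that a tested relationship is real, the statistical power (1 − β), the significance threshold α and a bias term u. The function name and the numerical inputs below are our own illustrative assumptions, not values taken from any particular field.

```python
def positive_predictive_value(prior_odds, power, alpha=0.05, bias=0.0):
    """Probability that a 'positive' published finding is true.

    Follows the formula in Ioannidis (2005):
        PPV = ((1 - beta) R + u beta R) / (R + alpha - beta R + u - u alpha + u beta R)
    where R = prior_odds, 1 - beta = power and u = bias.
    With bias = 0 this reduces to (1 - beta) R / ((1 - beta) R + alpha).
    """
    beta = 1.0 - power
    # True relationships reported as positive (genuinely detected, or via bias).
    true_positives = prior_odds * (power + bias * beta)
    # Null relationships reported as positive (chance, or via bias).
    false_positives = alpha + bias * (1.0 - alpha)
    return true_positives / (true_positives + false_positives)


# Illustrative (hypothetical) scenarios, not measurements:
well_powered = positive_predictive_value(prior_odds=1.0, power=0.8)
underpowered = positive_predictive_value(prior_odds=0.1, power=0.2)
underpowered_biased = positive_predictive_value(prior_odds=0.1, power=0.2, bias=0.3)

for label, ppv in [
    ("well powered, even prior odds", well_powered),
    ("low power, long-shot hypotheses", underpowered),
    ("low power, long shots, plus bias", underpowered_biased),
]:
    print(f"{label}: PPV = {ppv:.2f}")
```

Under these assumed inputs, the PPV falls below one half as soon as power and prior odds are low, and bias pushes it lower still, which is the sense in which most published positive findings can be expected to be false.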

These arguments suggest that a large majority of publications are unreproducible. This statement may appear extreme; is it supported by direct evaluations of reproducibility? Although there have been few such studies, those that exist all point to a grave situation. Two surveys of preclinical studies by the pharmaceutical companies Bayer (2) and Amgen (3) reported dismal robustness of “landmark” studies they had hoped to develop. Similarly, the psychology reproducibility project could only reproduce a minority of studies and revealed a generalized reduction in effect sizes (4). We are unaware of any reproducibility studies reporting substantially higher success rates.

Unreliable research, a problem recently acknowledged by Francis Collins (5), the current NIH director, could have an enormous economic cost. Consider that the annual budget of the NIH is $30bn. Extrapolation of the alarming reproducibility rates above would lead to the conclusion that more than half of that money—taxpayers’ money—is funding unreliable research. Today’s research builds on yesterday’s results. But anybody basing their research on unreliable publications is likely to be wasting their time and further resources. In the high-pressure environment of research, such an unwitting error can easily spell the end of a young scientist’s career.

But it gets worse: the negative consequences of unreliable research extend far beyond future research. Many aspects of public policy are based upon the scientific record. Obvious examples are medical guidelines and environmental policy, but the trend is to increase evidence-based policy. Unreliable research can lead to mistaken policy with both economic and human costs.

A dramatic example of the potential human cost is afforded by the Poldermans case. A prominent cardiologist, he was responsible for a series of clinical trials that drove the adoption of guidelines recommending the widespread use of perioperative treatment with beta-blockers to protect against myocardial infarctions. Several of Poldermans’ studies were subsequently shown to have serious integrity problems and he was fired for misconduct. A meta-analysis of the field excluding Poldermans’ discredited research estimated that the beta-blockers increased perioperative mortality by 27% (6). In other words, mistaken guidelines based upon unreliable research (in this case involving misconduct) may have caused preventable deaths. Because the procedures were common and the guidelines widely applied, the number of potential victims almost defies comprehension (as reported by Forbes).

In this context, we believe it is imperative that all possible users of published research be made aware of potential problems as fully and as quickly as possible. Any other course of action will cost money, careers and maybe lives. The central mission of PubPeer is to facilitate this exchange of information. We therefore provide a web platform that can make comments instantly available to every interested reader in the world and aim to remove barriers and discouragements to commenting. As shown in the graph of historical comment traffic, commenting on PubPeer was greatly stimulated after we enabled user-controlled anonymity, which is the only certain defense against legal attack or a breach of site security. The “unregistered” comments, which represent the majority, are not of inferior quality to those of registered users.

 

[Figure: historical comment traffic on PubPeer (commenting.png)]

 

In contrast to our desire to disseminate information, Blatt is mostly concerned about the psychological effect on researchers of public and anonymous discussion of their work. From this point of view, his suggestion to “draw the author aside” for a quiet chat “after a seminar” is certainly a good way to minimize hurt feelings, but it is totally ineffective as a strategy for disseminating information. We believe that making any relevant information rapidly available to readers should trump concerns about the authors’ feelings, especially given that they freely chose to publish in the first place. Frankly, a few ruffled academic feathers pale into insignificance when patients’ lives, taxpayer billions and young researchers’ careers are at stake. We also suspect that the researchers’ employers—those same taxpayers and patients—would share this point of view.

There would be less need for PubPeer and anonymous commenting if self-correction and policing of research worked as they should, but we believe they do not and probably cannot, as detailed in a previous blog post. Although we do not doubt Blatt’s personal probity or question the efforts made by the Rockefeller press, unreliable research seems to be endemic in the current system, while authors, journals and institutions are all unquestionably exposed to conflicts of interest when it comes to correcting problems. PubPeer hosts comments on thousands of papers in which image manipulation is manifest, yet visible action is taken in only a few percent of cases. Where are the research police? They are too inefficient, slow, unreliable and, crucially, opaque to be fit for purpose. The arsenic life paper highlighted by Blatt as an example of acceptable post-publication peer review was, ironically, never retracted from Science. The New England Journal of Medicine has not retracted a key Poldermans study, despite serious doubt being cast upon its integrity. Scientists clearly cannot rely on the traditional avenues for correcting problems in the literature; PubPeer offers a way to bypass this logjam of conflicts of interest.

We now address Blatt’s other complaints about anonymous commenting, although we believe they are secondary to the issues outlined above.

No system is perfect and the possibility for abuse of anonymous commenting on PubPeer does exist. However, as seems often to be the case with critics of anonymity on PubPeer, Blatt doesn’t offer concrete examples of abuse on our site. From our accumulated experience of moderating the 37000+ comments on PubPeer, we consider worries about abuse to be overblown. The factual focus of comments is one very important protection; despite Blatt’s carefully worded insinuations, PubPeer does not allow “hearsay”, “allegations” or invite “innuendo” (and we aim to act on all reports of such comments). We also observe that conflicts of interest are much less of an issue with anonymous commenters because they have no way to abuse any power or authority they may possess; they must convince by strength of argument alone. Scientists are simply not convinced by anonymous assertions without factual support, even if such comments were to find their way past our moderation systems. It is probable that as PubPeer grows from its informal beginnings, some form of dedicated editorial or appeals board will be instituted, but our current experience from moderating comments and reports of abuse does not indicate this to be an urgent need.

The argument that researchers find it difficult to reply to a factual or scientific question without knowing who asked it is barely credible. We are with Paul Brookes on this one: scientists should be able to explain and defend the work they have chosen to publish. And in reality no competent scientist would experience the slightest difficulty in defending their work, if it is defensible. In particular, the overwhelming majority of questions on PubPeer would be resolved instantly by showing the original data that the authors describe in their publication. A growing number of enlightened journals require full data-sharing, so there can be no argument that data should be kept secret from ordinary readers, yet most authors (and editors) still succumb to this reflex.

Bizarrely, given his apocalyptic warnings about PubPeer, Blatt states that the bulk of PubPeer comments “relate to small errors and oversights” in image data that are secondary to the “ideas in themselves” of the papers. We encourage the use of PubPeer for all types of factual discussion, be it positive, negative, major or minor. Comments about details should be treated as such, although attention to detail is often important in science. People who rush to judgment on the sole basis that a comment about some minor detail exists on PubPeer have only themselves to blame. If they are scientists they should definitely know better, and we actively advise readers to form their own opinion of comments. PubPeer should be treated as a source of potentially useful information, not a definitive judgment. Note that PubPeer does not aspire to provide in-depth scientific review of comments.

Blatt appears to include in this class of comments about “small errors” the many that highlight signs of image manipulation—not such small errors after all. We do not agree that a lack of scientific discussion accompanying such comments is problematic: if the data can’t be trusted, that is vitally important information and there is little point in discussing the science. It is of course the comments highlighting obvious manipulations or serious errors that authors find so difficult to counter (and may cause them to reach for their lawyers), not the comments about genuinely small errors and oversights. Affected authors often seek to distract from their predicament by complaining about the anonymity that prevents the deployment of ad hominem defenses.

Blatt also bemoans the negativity of most PubPeer comments. We consider this unsurprising and even inevitable, since most authors have been forced by the system to put the most positive spin possible on their results. In addition, science proceeds by the falsification and refinement of hypotheses. Thus, for most papers, the only way is down.

In conclusion, we choose to allow anonymity on PubPeer as a necessary compromise. The arguments in favor are the overwhelming importance of rapidly informing readers about potential issues in publications and the fact that strong anonymity on PubPeer has greatly encouraged commenting. Conversely, we believe that the argument against anonymity, the risk of malicious or unjustified damage to researchers’ reputations, has been overstated, and this is based upon our direct experience of running the site. A time may come when open criticism is no longer considered a risk and anonymity becomes unnecessary to facilitate commenting, but for now PubPeer users clearly prefer to control their anonymity. We believe the balance of benefits currently strongly favors the continuation of anonymity on PubPeer.

[UPDATE -- March 5th 2016]

Michael Blatt has written a follow-up editorial: http://www.plantphysiol.org/content/170/3/1171.full. We respond to a few of the issues he raises.

Most importantly, Blatt simply ignores the central point in our blog above, which is that rapidly sharing information about possible problems in a publication is more important than the niceties of academic etiquette. We also provided evidence that strong anonymity encourages this information sharing. The argument for allowing anonymous commenting is therefore that it maximizes a beneficial activity. Even if Blatt as an editor believes that authors rather than readers and users of the publications are his most important customers, he provides no explicit argument for this. Instead of addressing our utilitarian argument, Blatt simply lists a series of rather intangible appeals to the scientific process.

In his original editorial, Blatt glaringly conflated potential misconduct and “scientific” errors. He now distinguishes misconduct from other issues and implicitly acknowledges that unprotected discussion is insufficient to deal with this problem. This represents serious backtracking on his part and implicitly validates much of the commentary on PubPeer, which does concern potential misconduct. Vague mention is made of an initiative to develop a more effective whistleblowing system. We look forward to a robust and effective procedure for dealing with allegations of misconduct, but won’t be holding our breath.

Following our challenge to provide examples where anonymous comments on PubPeer have been used to denigrate researchers unjustly, Blatt provides examples concerning his journal that are by his own admission innocuous; we seriously doubt that his reputation was harmed by those comments. Some speculation by Leonid Schneider is also cited. In short, the examples are not convincing. Although we would take even a single example very seriously and admit there may be some, we note that the PubPeer database is rapidly approaching 50000 comments, so it would not be unreasonable to consider this denominator in evaluating the prevalence of any perceived abuse. We continue to believe that the factual basis of comments, our moderation policies and, above all, the diligence of our users mean that the overall accuracy of PubPeer commentary is excellent. In this context it may be worth pointing out that one of Blatt’s own papers has been commented on at PubPeer. Was that an unjustified denigration of his reputation by cowards acting with anonymous impunity? Apparently not, as the authors have issued a mega-correction. Judge for yourselves: https://pubpeer.com/publications/CBE5FF3720F04311141D8254433C9B

Finally, Blatt makes an argument that unreliable research advances science. We believe this to be dangerous and disingenuous relativism. We feel there is a clear distinction between work that authors, referees and editors should have known was unreliable or wrong at the time of publication, on the one hand, and the difficulties of interpreting experiments at the frontiers of knowledge on the other. Moreover, it is precisely the ability to make this distinction that “quality” journals such as Plant Physiology provide as justification for their existence, at least until they need an excuse for having published low-quality work. Similarly, although some papers may stimulate subsequent advances even if their central claims are invalidated, quality journals still aspire to certifying those claims.

Some of the examples provided by Blatt to glorify unreliable science are bizarre. Everybody agrees that the arsenic life paper falls into the category of vastly overblown claims where the authors, referees and editors should have known better. It was a catastrophic failure of quality control by the journal. In contrast, the failure of Cole and Curtis (1938, 1939) to measure the overshoot of the action potential was entirely due to the fact that their equipment did not give them access to the membrane potential; they could only measure membrane impedance. This was nevertheless a huge breakthrough and is a perfect example of an entirely acceptable state-of-the-art interpretation. Nobody at the time could or should have known better. To suggest otherwise is like criticizing Rutherford for not predicting the existence of the Higgs boson. We certainly wouldn’t consider the research of Cole and Curtis to be unreliable.

The example involving the influence of the sodium pump on the membrane potential is even more confused. By maintaining the transmembrane sodium and potassium ion gradients, the sodium pump is indirectly essential for the maintenance of the resting membrane potential, but this had been known for a long time. The pump is also electrogenic, as observed in Gadsby’s work, but in animal cells its direct contribution to the membrane potential, which is dominated by passive potassium conductances, is minor, contrary to Blatt’s suggestion. Moreover, the De Weer and Gadsby (1988) reference that Blatt actually cites describes studies of the voltage-dependence of the sodium pump—the effect of the membrane potential on the pump, not of the pump on the membrane potential. For completeness, we note that in plant cells an electrogenic proton-pump (not a sodium pump) does make a direct and significant contribution to the membrane potential.

We feel these misunderstandings illustrate the confused nature of the arguments in Blatt’s editorials.

 

Comments can be left here: https://pubpeer.com/topics/1/B6CF3DB974A8ECC64B1A0303BBCD6F

1. Ioannidis, J. P. A. Why Most Published Research Findings Are False. PLoS Med 2, e124 (2005).

2. Prinz, F., Schlange, T. & Asadullah, K. Believe it or not: how much can we rely on published data on potential drug targets? Nat Rev Drug Discov 10, 712 (2011).

3. Begley, C. G. & Ellis, L. M. Drug development: Raise standards for preclinical cancer research. Nature 483, 531–533 (2012).

4. Open Science Collaboration. Estimating the reproducibility of psychological science. Science 349, aac4716 (2015).

5. Collins, F. S. & Tabak, L. A. Policy: NIH plans to enhance reproducibility. Nature 505, 612–613 (2014).

6. Bouri, S., Shun-Shin, M. J., Cole, G. D., Mayet, J. & Francis, D. P. Meta-analysis of secure randomised controlled trials of β-blockade to prevent perioperative death in non-cardiac surgery. Heart 100, 456–464 (2014).