A crisis of trust

When we created PubPeer, we expected to facilitate public, on-the-record discussions about the finer points of experimental design and interpretation, similar to the conversations we all have in our journal clubs. As PubPeer developed, and especially once we enabled anonymous posting, we were shocked at the number of comments pointing out much more fundamental problems in papers, involving very questionable research practices and rather obvious misconduct. We link to a few examples of comments raising apparently serious issues, where the articles were subsequently withdrawn or retracted (the reasons were not always given):

https://pubpeer.com/publications/C55070469007978707693AA374BF21
https://pubpeer.com/publications/890E1E22DAFD6926D577FE461A66F6
https://pubpeer.com/publications/058CFA77EAF6D5E019D9902C6B3553
https://pubpeer.com/publications/FF771F6D16ADB90D7F8C11E5361A1F
https://pubpeer.com/publications/0C40189FE3F10DF9B4B0166DE1FA4E
https://pubpeer.com/publications/8B755710BADFE6FB0A848A44B70F7D
https://pubpeer.com/publications/1F3D9CBBB6A8F1953284B66EEA7887

The choice of retracted/withdrawn articles was made for legal reasons, but that is all that makes them special. There are many, many similar comments regarding other papers.

Many critical comments have involved papers by highly successful researchers, and all the very best journals (https://pubpeer.com/journals) and institutions are represented. So it is hard to argue that these problems only represent a few bad apples that nobody knows or cares about. We have come to believe that these comments are symptomatic of a deep malaise: modern science operates in an environment where questionable practices and misconduct can be winning strategies.

Although we were not anticipating so many comments indicative of misconduct on PubPeer, maybe we should not have been so surprised. The incentives to fabricate data are strong: it is so much easier to publish quickly and to obtain high-profile results if you cheat. Given the unceasing pressure to publish (or perish), this can easily represent the difference between success and failure. At the same time, ever fewer researchers can afford the time to read, consider or check published work. There are also intense pressures discouraging researchers from replicating others’ work: replications are difficult to fund or publish well, because they are considered unoriginal and aggressive; replications are often held to much higher standards than the original work; publishing contradictory findings can lead to reprisals when grants, papers, jobs or promotions are considered; failures to replicate are brushed off as the replicator’s poor experimental technique. So the pressures in science today may push researchers to cheat and simultaneously discourage checks that might detect cheating.

As followers of ‘research social media’ like Retraction Watch and the now-shuttered Science Fraud have already realized, the climate of distorted incentives has been exploited by some scientists to build very successful careers upon fabricated data, landing great jobs, publishing apparently high-impact research in top journals and obtaining extensive funding.

This has numerous direct and indirect negative consequences for science. Honest scientists struggle to compete with cheats in terms of publications, employment and funding. Cheats pollute the literature, and work trying to build upon their fraudulent research is wasted. Worse, given the pressure to study clinically relevant subjects, it is only to be expected that clinical trials have been based upon fraudulent data, unethically exposing patients to needless risk. Cheats are also terrible mentors, compromising junior scientists and selecting for sloppy or dishonest researchers. Less tangible but also damaging, cheats spread cynicism and unrealistic expectations.

One reason we find ourselves in this situation is that the organizations supposed to police science have failed. Most misconduct investigations are subject to clear conflicts of interest. Journals are reluctant to commit manpower to criticizing their own publications. Host institutions are naturally inclined to defend their own staff and to suppress information that would create bad publicity. Moreover, both institutional administrators and professional editors often lack scientific expertise. It is little wonder therefore that so many apparently damaging comments on PubPeer seem to elicit no action whatsoever from journals or institutions (although we know from monitoring user-driven email alerts that the journals and institutions are often informed of comments). Adding to the problem of conflicts of interest, most investigations lack transparency, giving no assurance that they have been carried out diligently or expertly. Paul Brookes recounts a sadly typical tale of the frustrations involved in dealing with journals and institutions. How difficult would it have been to show Brookes the original data or, even better, to post it publicly? Why treat it as a dangerous secret?

It is hard to avoid the conclusion that the foxes have been set to guard the henhouse (of course the institutions are important because they have access to the data, a point to which we return below). An external investigator would seem like a good idea. And one exists, at least in the US: the Office of Research Integrity (ORI). However, as Adam Marcus and Ivan Oransky of Retraction Watch explain in a recent New York Times article, the ORI has been rendered toothless by underfunding and an inability to issue administrative subpoenas, so it remains dependent on the conflicted institutions for information. Moreover, other countries may not even have such an organization.

As also detailed by Marcus and Oransky, even on the rare occasions when blatant frauds are established, the typical punishments are no deterrent. Journals often prefer to save face by publishing ‘corrections’ of only the most egregious errors, even when all confidence in the findings has been lost. Funding agencies usually hand down ludicrously lenient punishments, such as a few years of being mentored or not being allowed to sit on a grant committee, even when millions in federal funding have been embezzled. Most researchers ‘convicted’ of fraud seem able to carry on as if nothing much had happened.

What can be done?

We first eliminate a non-solution. We would be very wary about prescribing increased formalized oversight of experiments, data management, analysis and reporting, a suggestion made by the RIKEN investigation into the stem cell affair. The problem is, who would do the oversight? Administrators don’t understand science, while scientists would waste a lot of time doing any overseeing. If you think you do a lot of paperwork now, imagine a world where every step of a project has to be justified in some report. The little remaining enjoyment of science would surely be sucked dry. (This viewpoint should not, however, be taken as absolving senior authors from their clear responsibility to verify what goes into the manuscripts they sign).

A measure often suggested is to extend checking of manuscripts for plagiarism and image manipulation at journals. This is happening, but it has the serious disadvantage of remaining mostly out of sight. If caught, it is easy for an author to publish elsewhere, perhaps having improved their image manipulation if they are less lazy than most cheats. Amusingly, the recent stem cell debacle at Nature provides a perfect illustration of this problem. It has been suggested that one of the image manipulations that ultimately led to the retractions was spotted by a referee at Science, contributing to the paper’s rejection from that journal (see here). Presumably Nature now wish they had known about those concerns when they reviewed the articles. Information about the results of such checking should therefore be centralized and ideally made available to the most important audience: other researchers. We understand that this might be complicated for legal reasons, but all possible avenues for even restricted dissemination, for instance to referees within the same publishing conglomerate, should be explored.

Another suggestion is to introduce more severe punishments in cases of misconduct. These could be administrative (recovery of grants, job loss, funding or publication bans) or even involve criminal prosecution. We believe that science and the law mix poorly and foresee the potential for some incredibly technical, expensive and inconclusive court cases. Indeed, according to Marcus and Oransky, the difficulties of the Baltimore/Imanishi-Kari case contributed to the current weakness of the ORI. We note also that all formal investigations are incredibly time-consuming. Any researchers co-opted into such investigations will waste a lot of time for little credit. Nevertheless, we contend that more severe punishments, even in just a few clear-cut cases, would send a strong message, help convince the weak-willed and strengthen the hand of vulnerable junior researchers pressured into misconduct by unscrupulous lab heads. Certainly, funding agencies should reconsider their ludicrously lax penalties.

Policing research is always likely to be burdensome and haphazard if it is carried out by organizations subject to conflicts of interest or administered by people with little understanding of science. But that is unfortunately exactly the situation today and we think it must be changed. A more effective approach would be to leverage the motivation and expertise of the researchers most interested in the subject. How much better if they were the policemen, rather than uninterested, conflicted and bureaucratic organizations. This could be done if together we invert the burden of proof. It should be your responsibility as a researcher to convince your peers, not theirs to prove you wrong. If you cannot convince your peers, that should be a problem for you, not a problem for them. Simply managing to publish a conclusion with some incomplete data should not be enough. Although this may sound Utopian, we argue next that there are now mechanisms in place that could realistically create this sea change in attitude.

The key trend is towards greater data access. Traditional publication requires readers to trust the authors who write the paper, as well as the institutions and journals that carry out any investigations. As we have argued above, in a growing number of cases that trust is breaking down. Yet the internet and advances in information technology mean that it is no longer necessary to trust; one can also verify. All methods, data, materials, and analysis can and should be made available to the public without precondition. This will automatically make it harder to cheat and easier to do the right thing, because it is a lot more difficult to fabricate a whole data set convincingly than it is to photoshop the odd image of bands on a gel. Moreover, our personal experience suggests that requiring authors to package their data and analysis in reproducible form will introduce unaccustomed and beneficial rigor into lab work flows. Open data is therefore a policy of prevention being better than cure. Moreover, replications and more formal investigations will be greatly facilitated by having all the original data immediately available, eliminating a significant bottleneck in investigations today.

On the issue of data sharing, PLoS is leading the way: following a recent policy change, everything must be easily accessible as a precondition of publication. Moreover, the policy explicitly states that ‘… it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access’, so it should not be necessary to request the data from the authors. We applaud this revolutionary initiative wholeheartedly and we strongly encourage other journals to follow this lead. Nature group have also made progress, but under their policy it is often still necessary to request the data from the authors, who are not all equally responsive (whether through poor organization or bad faith), despite their undertakings to the journal. Furthermore, such data requests to the authors still expose researchers to possible reprisals.

A less dramatic but necessary and complementary step would be for journals and referees to insist on complete descriptions of methods and results. If space is constrained, online supplementary information could be used, although we feel this confuses article structure. We believe the trend of hiding the methods section has been a big mistake. As scientists, we were disheartened to hear people opine during the STAP stem cell affair that it was ‘normal’ for published methods to be inadequate to reproduce the findings. We strongly disagree: all failures to replicate should be treated as serious problems. It is the authors’ and journals’ responsibility to help resolve these failures and to avoid them in the first place. As an aside, could journals PLEASE provide a way to download a single file combining the main article and any supplementary information? This hardly requires an ace web programmer, yet it seems that still only PNAS has managed to get this right. It shows that most publishers never read papers.

The next question is how to make use of data access and detailed methods. This is where we see a role for PubPeer and other forms of post-publication peer review. The main aim of PubPeer is to provide a durable, centralized record of discussion about papers. This makes it a simple matter to check the track record of a specific researcher: searching PubPeer will show any signs of possible misconduct (as well as less serious issues, or even positive commentary), and it will also show how the authors have responded to the issues raised. Currently, most authors keep their heads firmly in the sand, maybe because they have no real answer to the criticisms posted. Nevertheless, a minority of authors do respond convincingly, showing that they can support their published conclusions (see for example). Of course, there are also genuine scientific discussions on PubPeer (e.g. here) as well as a few oddball comments, so it remains important to read the comments and make up your own mind as to their meaning and significance.

By exploiting this centralized information, the high-pressure environment that cheats have navigated so successfully can now become their downfall. Referees and members of committees for recruitment, promotion or funding can now give careful consideration to the scientific community’s opinions about the quality and reliability of applicants’ research. Researchers whose work displays unresolved issues are likely to find that their advancement encounters some well-deserved friction. As we all know, it only takes the slightest friction in a grant committee for your application not to be funded. Similarly, prospective students, post-docs and collaborators now have an additional data source to evaluate before entrusting their future careers to a group. In this way, platforms like PubPeer can help ensure that cheating, once discovered, has lasting consequences, tilting the balance of benefits towards honest, high-quality research. Scientists will also have much stronger incentives to resolve issues in their work.

We are therefore hopeful for the future. The growing use of post-publication peer review, the movement towards full data access and, hopefully, some improvement in the policies of research organizations and publishers, should usher in a new era of quality in science. Scientists can make use of services like PubPeer and leverage the high pressure under which we all work to insist upon high standards and to unmask cheats. Together, let’s retake control of our profession.