A crisis of trust

When we created PubPeer, we expected to facilitate public, on-the-record discussions about the finer points of experimental design and interpretation, similar to the conversations we all have in our journal clubs. As PubPeer developed, and especially once we enabled anonymous posting, we were shocked at the number of comments pointing out much more fundamental problems in papers, involving very questionable research practices and rather obvious misconduct. We link below to a few examples of comments raising apparently serious issues about articles that were subsequently withdrawn or retracted (the reasons were not always given):

https://pubpeer.com/publications/C55070469007978707693AA374BF21
https://pubpeer.com/publications/890E1E22DAFD6926D577FE461A66F6
https://pubpeer.com/publications/058CFA77EAF6D5E019D9902C6B3553
https://pubpeer.com/publications/FF771F6D16ADB90D7F8C11E5361A1F
https://pubpeer.com/publications/0C40189FE3F10DF9B4B0166DE1FA4E
https://pubpeer.com/publications/8B755710BADFE6FB0A848A44B70F7D
https://pubpeer.com/publications/1F3D9CBBB6A8F1953284B66EEA7887

The choice of retracted/withdrawn articles was made for legal reasons, but that is all that makes them special. There are many, many similar comments regarding other papers.

Many critical comments have involved papers by highly successful researchers, and all of the very best journals (https://pubpeer.com/journals) and institutions are represented. So it is hard to argue that these problems only represent a few bad apples that nobody knows or cares about. We have come to believe that these comments are symptomatic of a deep malaise: modern science operates in an environment where questionable practices and misconduct can be winning strategies.

Although we were not anticipating so many comments indicative of misconduct on PubPeer, maybe we should not have been so surprised. The incentives to fabricate data are strong: it is so much easier to publish quickly and to obtain high-profile results if you cheat. Given the unceasing pressure to publish (or perish), this can easily represent the difference between success and failure. At the same time, ever fewer researchers can afford the time to read, consider or check published work. There are also intense pressures discouraging researchers from replicating others’ work: replications are difficult to fund or publish well, because they are considered unoriginal and aggressive; replications are often held to much higher standards than the original work; publishing contradictory findings can lead to reprisals when grants, papers, jobs or promotions are considered; failures to replicate are brushed off as the replicator’s poor experimental technique. So the pressures in science today may push researchers to cheat and simultaneously discourage checks that might detect cheating.

As followers of ‘research social media’ like Retraction Watch and the now-shuttered Science Fraud have already realized, the climate of distorted incentives has been exploited by some scientists to build very successful careers upon fabricated data, landing great jobs, publishing apparently high-impact research in top journals and obtaining extensive funding.

This has numerous direct and indirect negative consequences for science. Honest scientists struggle to compete with cheats in terms of publications, employment and funding. Cheats pollute the literature, and work trying to build upon their fraudulent research is wasted. Worse, given the pressure to study clinically relevant subjects, it is only to be expected that clinical trials have been based upon fraudulent data, unethically exposing patients to needless risk. Cheats are also terrible mentors, compromising junior scientists and selecting for sloppy or dishonest researchers. Less tangible but also damaging, cheats spread cynicism and unrealistic expectations.

One reason we find ourselves in this situation is that the organizations supposed to police science have failed. Most misconduct investigations are subject to clear conflicts of interest. Journals are reluctant to commit manpower to criticizing their own publications. Host institutions are naturally inclined to defend their own staff and to suppress information that would create bad publicity. Moreover, both institutional administrators and professional editors often lack scientific expertise. It is little wonder therefore that so many apparently damaging comments on PubPeer seem to elicit no action whatsoever from journals or institutions (although we know from monitoring user-driven email alerts that the journals and institutions are often informed of comments). Adding to the problem of conflicts of interest, most investigations lack transparency, giving no assurance that they have been carried out diligently or expertly. Paul Brookes recounts a sadly typical tale of the frustrations involved in dealing with journals and institutions. How difficult would it have been to show Brookes the original data or, even better, to post it publicly? Why treat it as a dangerous secret?

It is hard to avoid the conclusion that the foxes have been set to guard the hen house (of course the institutions are important because they have access to the data, a point to which we return below). An external investigator would seem like a good idea. And one exists, at least in the US: the Office of Research Integrity (ORI). However, as Adam Marcus and Ivan Oransky of Retraction Watch explain in a recent New York Times article, the ORI has been rendered toothless by underfunding and an inability to issue administrative subpoenas, so it remains dependent on the conflicted institutions for information. Moreover, other countries may not even have such an organisation.

As also detailed by Marcus and Oransky, even on the rare occasions when blatant frauds are established, the typical punishments are no deterrent. Journals often prefer to save face by publishing ‘corrections’ of only the most egregious errors, even when all confidence in the findings has been lost. Funding agencies usually hand down ludicrously lenient punishments, such as a few years of being mentored or not being allowed to sit on a grant committee, even when millions of federal funding have been embezzled. Most researchers ‘convicted’ of fraud seem able to carry on as if nothing much had happened.

What can be done?

We first eliminate a non-solution. We would be very wary about prescribing increased formalized oversight of experiments, data management, analysis and reporting, a suggestion made by the RIKEN investigation into the stem cell affair. The problem is, who would do the oversight? Administrators don’t understand science, while scientists would waste a lot of time doing any overseeing. If you think you do a lot of paperwork now, imagine a world where every step of a project has to be justified in some report. The little remaining enjoyment of science would surely be sucked dry. (This viewpoint should not, however, be taken as absolving senior authors from their clear responsibility to verify what goes into the manuscripts they sign).

A measure often suggested is to extend the checking of manuscripts for plagiarism and image manipulation at journals. This is happening, but it has the serious disadvantage of remaining mostly out of sight. An author who is caught can simply publish elsewhere, maybe having improved his image manipulation if he is less lazy than most cheats. Amusingly, the recent stem cell debacle at Nature provides a perfect illustration of this problem. It has been suggested that one of the image manipulations that ultimately led to the retractions was spotted by a referee at Science, contributing to the paper’s rejection from that journal (see here). Presumably Nature now wish they had known about those concerns when they reviewed the articles. Information about the results of such checking should therefore be centralized and ideally made available to the most important audience: other researchers. We understand that this might be complicated for legal reasons, but all possible avenues, even for restricted dissemination, for instance to referees within the same publishing conglomerate, should be explored.

Another suggestion is to introduce more severe punishments in cases of misconduct. These could be administrative (recovery of grants, job loss, funding or publication bans) or even involve criminal prosecution. We believe that science and the law mix poorly and foresee the potential for some incredibly technical, expensive and inconclusive court cases. Indeed, according to Marcus and Oransky, the difficulties of the Baltimore/Imanishi-Kari case contributed to the current weakness of the ORI. We note also that all formal investigations are incredibly time-consuming. Any researchers co-opted into such investigations will waste a lot of time for little credit. Nevertheless, we contend that more severe punishments, even in just a few clear-cut cases, would send a strong message, help convince the weak-willed and strengthen the hand of vulnerable junior researchers pressured into misconduct by unscrupulous lab heads. Certainly, funding agencies should reconsider their ludicrously lax penalties.

Policing research is always likely to be burdensome and haphazard if it is carried out by organizations subject to conflicts of interest or administered by people with little understanding of science. But that is unfortunately exactly the situation today and we think it must be changed. A more effective approach would be to leverage the motivation and expertise of the researchers most interested in the subject. How much better if they were the policemen, rather than uninterested, conflicted and bureaucratic organizations. This could be done if together we invert the burden of proof. It should be your responsibility as a researcher to convince your peers, not theirs to prove you wrong. If you cannot convince your peers, that should be a problem for you, not a problem for them. Simply managing to publish a conclusion with some incomplete data should not be enough. Although this may sound Utopian, we argue next that there are now mechanisms in place that could realistically create this sea change in attitude.

The key trend is towards greater data access. Traditional publication requires readers to trust the authors who write the paper, as well as the institutions and journals that carry out any investigations. As we have argued above, in a growing number of cases that trust is breaking down. Yet the internet and advances in information technology mean that it is no longer necessary to trust; one can also verify. All methods, data, materials, and analysis can and should be made available to the public without precondition. This will automatically make it harder to cheat and easier to do the right thing, because it is a lot more difficult to fabricate a whole data set convincingly than it is to photoshop the odd image of bands on a gel. Moreover, our personal experience suggests that requiring authors to package their data and analysis in reproducible form will introduce unaccustomed and beneficial rigor into lab work flows. Open data is therefore a policy of prevention being better than cure. Moreover, replications and more formal investigations will be greatly facilitated by having all the original data immediately available, eliminating a significant bottleneck in investigations today.

On the issue of data sharing, PLoS is leading the way: following a recent policy change, everything must be easily accessible as a precondition of publication. Moreover, the policy explicitly states that “… it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access”, so it should not be necessary to request the data from the authors. We applaud this revolutionary initiative wholeheartedly and we strongly encourage other journals to follow this lead. The Nature group has also made progress, but under their policy it is often still necessary to request the data from the authors, who are not all equally responsive (whether through poor organization or bad faith), despite their undertakings to the journal. Furthermore, such data requests to the authors still expose researchers to possible reprisals.

A less dramatic but necessary and complementary step would be for journals and referees to insist on complete descriptions of methods and results. If space is constrained, online supplementary information could be used, although we feel this confuses article structure. We believe the trend of hiding the methods section has been a big mistake. As scientists, we were disheartened to hear people opine during the STAP stem cell affair that it was “normal” for published methods to be inadequate to reproduce the findings. We strongly disagree: all failures to replicate should be treated as serious problems. It is the authors’ and journals’ responsibility to help resolve these failures and to avoid them in the first place. As an aside, could journals PLEASE provide a way to download a single file combining the main article and any supplementary information? This hardly requires an ace web programmer, yet it seems that still only PNAS has managed to get this right. It shows that most publishers never read papers.
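
To illustrate how little is being asked, here is a minimal sketch of the kind of script a publisher could run to merge the files. The filenames are hypothetical placeholders and the open-source pypdf library is just one possible choice, not anything a journal actually uses:

```python
# Minimal sketch: concatenate an article PDF with its supplementary PDF.
# Filenames are hypothetical placeholders, not any journal's real files.
from pypdf import PdfReader, PdfWriter  # pip install pypdf

parts = ["article.pdf", "supplementary_information.pdf"]

writer = PdfWriter()
for path in parts:
    for page in PdfReader(path).pages:  # copy every page of each part
        writer.add_page(page)

with open("article_with_si.pdf", "wb") as out:
    writer.write(out)  # one combined file for readers to download
```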

The next question is: how to make use of data access and detailed methods? This is where we see a role for PubPeer and other forms of post-publication peer review. The main aim of PubPeer is to centralize all discussion about papers durably. This is now possible, and it has therefore become a simple matter to check on the track record of a specific researcher. Searching PubPeer will show any signs of possible misconduct (as well as less serious issues or even positive commentary). It will also show how the authors have responded to those issues being raised. Currently, most authors keep their heads firmly in the sand, maybe because they have no real answer to the criticisms posted. Nevertheless, a minority of authors do respond convincingly, showing that they can support their published conclusions (see for example). Of course, there are also genuine scientific discussions on PubPeer (e.g. here) as well as a few oddball comments, so it remains important to read the comments and make up your own mind as to their meaning and significance.

By exploiting this centralized information, the high-pressure environment that cheats have navigated so successfully can now become their downfall. Referees and members of committees for recruitment, promotion or funding can now give careful consideration to the scientific community’s opinions about the quality and reliability of applicants’ research. Researchers whose work displays unresolved issues are likely to find their advancement encounters some well deserved friction. As we all know, it only takes the slightest friction in a grant committee for your application not to be funded. Similarly, prospective students, post-docs and collaborators now have an additional data source to evaluate before entrusting their future careers to a group. In this way, platforms like PubPeer can help ensure that cheating, once discovered, has lasting consequences, tilting the balance of benefits towards honest, high-quality research. Scientists will also have much stronger incentives to resolve issues in their work.

We are therefore hopeful for the future. The growing use of post-publication peer review, the movement towards full data access and, hopefully, some improvement in the policies of research organizations and publishers, should usher in a new era of quality in science. Scientists can make use of services like PubPeer and leverage the high pressure under which we all work to insist upon high standards and to unmask cheats. Together, let’s retake control of our profession.

33 thoughts on “A crisis of trust”

  1. That you have had to make the statement
    “This could be done if together we invert the burden of proof. It should be your responsibility as a researcher to convince your peers, not theirs to prove you wrong”
    shows how substantial the problem is. After all, the burden of proof in science rests on the researcher, not the audience. Only if we are engaged in quackery should this statement be necessary.

    • And so science will not advance.

      Read Thomas Kuhn, and read the long list of real innovators whose work will be delayed forever.

      There is no single method; forget it.

      The method you defend is already abused by the establishment to block new knowledge…

      Allow people to spot the frauds, freely…

      Allow critics, and critics of critics, without the editor trying to block debate.

      Peer review is already flawed, especially when it works as you say.

      The failure of peer review in high-impact journals is the scientific fiasco of the past century. Not because it has allowed fraudsters, but because it has blocked revolutions despite the evidence.

      http://lenr-canr.org/acrobat/RothwellJhownaturer.pdf

      Freedom of speech and criticism is the only solution, and the market will decide whether it works.

    • “After all, the burden of proof in science rests on the researcher, not the audience.” Of course we agree, but it really doesn’t seem that way at the moment.

      A couple of points for which there was not room in the post, which was already rather long.

      The viewpoint of the post is from the life sciences. In the physical sciences there may be, at least in theory, an additional constraint arising from the importance of plausible and demonstrated mechanisms. In addition, techniques in fields like chemistry seem quite standardized and replications straightforward, at least from the outside, so the risks involved in fabrication may be correspondingly greater. Conversely, there are entire fields where mechanisms are rudimentary to say the least – psychology for instance. So maybe they are especially at risk?

      We are aware that even full data access is unable to prevent selection of the data. If the authors decide not to show the experiments that did not work, nobody need ever know. But clearly open data would still be a huge step in the right direction.

      • You make a good point, but this can also produce an awful result.

        Real innovations, like the transistor (or the PN junction), start with hard-to-replicate experiments. That is the rule, because the conditions for success are not known, yet can be hit upon by luck.

        If you ask for total replication, you restrict science to known territory, to boring extensions of known technology.

        Any effort to reduce the risk of doing wrong science will necessarily kill good innovative science.
        In the 1930s, technology was advancing quickly even though many scientists were investigating parapsychology. Many discoveries were made, and some were even accepted.

        The transistor, radioactivity, nuclear physics and aeroplanes were developed, more or less in a fringe way.

        What we need is not censorship but freedom to criticize, yet we try to do the opposite.
        On one side, many ideas are blocked up front, and on the other, criticism of some ideas is forbidden… This is a good way to develop the current paradigm, but not to find a new one.

        ONE method is the enemy of science.
        I support the idea of a biodiversity of science, but also of harsh criticism, which can itself be criticized…
        We should value replication, anomalies, outliers and failures, give time to experimentation, and accept ignorance and lack of repeatability, rather than rewarding only theory and success.

        What is clear today is that you cannot do science without a theory, a model. People don’t even understand how terrible that is.
        When theory faces an experimental result, the reaction is to challenge the experiment, not the theory or the model, even if that model is built on hundreds of non-fundamental assumptions specific to a small domain.
        Moreover, in many cases people challenge the laws (which are very solid), while the real error is simply in the assumptions, methods or habits…
        Cowboys say no animal can fly, pointing out that cows are too heavy and that it would easily be detected by falling BS.

        Finally, the modern theory of groupthink, as well described by Roland Benabou, explains that everything people ask for to “regulate” science – peer review, funding committees, publication indices – does not fight errors but enforces the conditions for groupthink.
        Groupthink emerges from rational minds in a group when opposition from the group hurts the dissenter and does not allow him to benefit from his realism. For a scientist in the modern system, publications and grants are the conditions of his success, and preventing dissidence simply ensures perfect groupthink conditions.

        For more detail on the model, read “Groupthink: Collective Delusions in Organizations and Markets”.

        • You are quite right that many important ideas often emerge only slowly and messily; there is not always a single, clear breakthrough like the double helix, relativity or the action potential. You are also right that people may long resist revolutionary ideas for the wrong reasons. But remember the Galileo syndrome – it doesn’t suffice to be persecuted, you also need to be right. And most revolutionary ideas are simply wrong. The solution of course is to present possibly revolutionary aspects of a piece of work in an honest and balanced manner, but not to imply they are proven when they are not. Unfortunately, the style of ‘top’ journals requires grandiose claims but often allows them to remain unproven.

          I really don’t see how you can better enable the emergence of revolutionary ideas than to facilitate discussion directly between interested researchers, as we aim to do on PubPeer. Other non-expert intermediaries like journals, institutions, funding bodies could add little value to a direct discussion with the researchers a revolutionary must one day convince.

          In any case, our post is mostly directed against fraud and sloppy science, where usually the problem is that the authors refuse to show data they claim to have or that they have not performed simple confirmatory experiments.

          • In fact we challenge high-impact journals for opposite reasons, based on a common governance failure.
            You (and I share your concern) challenge their love of tabloid science with big headlines but little evidence…
            I challenge them for their deep conservatism. It seems contradictory, but it is simply that they do “demagogy science”: shallow science that pleases the populace of the media, politicians, decision makers and spin doctors.

            As long as a claim follows the prejudices of the masses, or triggers amusement without any danger to the status quo, they push fringe articles… They can enter political debates just for fun, or conversely behave like white knights (which they are not) defending the widow, the orphan and good science.

            In no case does the depth of the research have any importance, nor does the solidity of the evidence. The only questions are:
            – is it forbidden by the consensus?
            – is it fashionable with the consensus?
            – is it just funny and painless?

            In a way they behave like the popular media:
            they propagate clichés, enforce sacred cows and taboos, reinforce prejudice, and entertain.

            Journals with less impact mostly follow those leaders, enforcing the taboos and the sacred cows, sometimes with resistance, sometimes with complicity, but always with a greater risk of being caught, either by the taboo police or by the error police.

      • I am not convinced that Life and Physical Sciences are so different. There are plenty of examples of dodgy spectra in chemistry and there have been some really quite extraordinary fabrications of data in physical sciences (Schoen, for example!). There are also a fair number of physical science papers appearing in PubPeer.

        • Chemistry and materials science are much more complex than nuclear science, which leads to cultural differences and a different relationship between experiment and theory.
          Chemistry is much nearer to biology, in that the conditions are always hard to control and success may depend on contamination, unconscious mistakes or invisible parameters.
          Nuclear physicists often have difficulty understanding that, except in some specialities (among them materials science).

          Human science is even worse.

          For me, one rule of success is being useful, curing or repairing… However, many things that are real are not useful, and detecting success is not easy. I have heard that statistical competence is not well enough developed in medical/environmental research (and in many sciences), and that errors which are basic to statisticians are made by many scientists (sometimes on purpose, with the peer reviewers not spotting the fraud)…
          Maybe it would be more productive to give more scientists a solid, leading-edge competence in statistics, rather than increasing groupthink through stronger regulation.

          Someone explained to me that the scientific method is not so holy; it is just common sense for any wise man to check his work… When it is made too “sacred”, people forget that it is not a rule but common sense. I have heard so many absurdities from people abusing scientific-method arguments to dismiss evidence on artificial points that common sense would have accepted with the usual care.

          The scientific method is sometimes treated like the Ten Commandments; some interpret it so literally that all substance is removed.

  2. I think this is an excellent post. I’d like to add a bit to your inversion of the burden of proof:

    It’s self-evident that papers with meticulous data archiving and detailed analysis descriptions are better papers, as they can be checked and reproduced. Much higher rates of data archiving etc. would therefore be achieved if journals formally recognised this and included it in their decision criteria.

    After all, almost all journals aim to publish ‘the best’ science, and rejecting a few papers because the authors wouldn’t share the data would get the message across really quickly.

    I wrote a letter to Nature about this idea: http://www.nature.com/nature/journal/v508/n7494/full/508044c.html

    • Our hope is that journals (and funding organizations) will ultimately require full data sharing. That said, the more the research community can do for itself without relying on other bodies, the better things will be. We should be leading not following. The choke points we have access to are papers, grants, jobs, promotions and now also post-publication peer review. Maybe as referees we should start asking to see data and/or asking whether it will be made publicly available. It ought to be feasible to request images of complete gels for instance; the journal would probably be happy that someone is checking.

      • There is a problem with replications.
        Even when officially replicating, most scientists add variations and use their creativity to improve… Real replications are seldom done by scientists, and in a domain I follow, the few cases were done by research engineers who were very careful to follow the instructions.

        Another problem is that a paper is too small to contain everything, as is any written or digital corpus.
        Nothing replaces face-to-face, or at least remote personal, communication.
        The best replications in the domain I follow were done by friends or neighbours of the initiators… or, as I said, by engineers who asked the “inventor” for tips.
        Replications based on the paper alone are nearly hopeless if the problem is really interesting.
        Moreover, in the middle of a controversy, replicators often have an incentive to assassinate the work and won’t battle as required to “make it work or make it say why”.

        This is a human job; adding rules will only make things worse.
        What we need is more open data, more human communication, fewer rules, more respect, and the freedom to dissent and investigate without concluding early.

        Practically, in a serious domain, concluding before 1-2 years of sincere replication effort, with the collaboration of the initiator and long experience of the domain, is a joke.
        It is, however, very common when the media are involved.

        The media are even worse than money at corrupting science.

      • The key group here is journals – at no other point in the scientific process is there 1) a readily definable dataset (the data underlying the paper), and 2) something that the journal can trade (publication) in return for the authors releasing the data. Sharing rates can get very high when journals are committed to enforcing a strong data archiving policy (e.g. http://arxiv.org/abs/1301.3744)

        It’s very helpful if other organizations like funding agencies say that data sharing is important, but there’s no equivalent to the ‘share or get rejected’ moment.

  3. There is an additional problem with journal and institutional investigations that deserves to be highlighted. They nearly always take too long to be of any real use in terms of establishing the reliability of the data. What people often want to know quickly is whether they can trust the conclusions or not. Often that just requires looking at the original data, which shouldn’t be such a sensitive issue. Then, if problems are revealed, by all means take your time over a full investigation to find out whose fault it is (after all, a couple of years of voluntarily being mentored are at stake). But why not get as much useful information out as quickly as possible? Post the data publicly. Moreover, several recent cases have highlighted that internal investigations can miss obvious issues in the data. Public posting would therefore help the investigation.

  4. The NYT piece you cited by Oransky and Marcus, and your words about it, are NOT right. As a chief research fraud investigator for the federal Office of Research Integrity (ORI) from 1989 to 2006, and as one who has worked with ORI, and on ORI cases privately as a consultant, for eight years since then, I know the true facts.

    ORI is NOT “toothless” – while the subpoena authority for materials that Oransky and Marcus recommended be granted by HHS to ORI could be helpful in a few cases, the fact is that the 1989 PHS regulation and the 2005 HHS regulation on research misconduct give the ORI oversight authority over all institutional investigations related to HHS research funding AND require those institutions to provide to ORI on request the research records and any other documentation related to the research misconduct case. The institutions have uniformly done so in almost every case since 1989, complying with numerous ORI requests.

    Second, ORI has NOT continued to suffer from the loss at the HHS Departmental Appeals Board (run by non-scientist HHS staff lawyers) in the 1990s of the Imanishi-Kari MIT misconduct case. Instead, NIH and then HHS strengthened ORI with professionally trained staff and counsels, and they developed fine procedures as well as new regulations that made ORI very strong – and well-recognized in the scientific and federal regulatory community as the best federal research misconduct investigation and oversight office. [See my article on ORI’s difficult but successful history from 1989 to 2005 in Accountability in Research 20 (5-6), 291-319 (2013).]

    ORI HAS continued, since 1992, to publish its formal findings of research misconduct [other federal agencies have not done so] in the Federal Register and in the NIH Guide to Grants and Contracts, as well as the ORI website and ORI Annual Report – naming the person who committed research misconduct, their position at their institution, what they falsified or fabricated or plagiarized, and what administrative actions that ORI imposed on them (typically debarment from receiving any federal funds and prohibition from advising PHS/HHS) for a given period (usually three years, but up to lifetime). ORI has made and published over 275 such findings of research misconduct. ORI’s actions save the U.S. taxpayer money by barring many perpetrators from receiving federal money, or by requiring that they be closely monitored and their work on HHS grants be certified by the institution. Scientists and the Public should be proud of ORI, as I am.

    • Thank you for responding to our post. Obviously you know more about the ORI than us (partly because we see so few results in the cases posted here…)

      We are happy to be corrected regarding the importance of the administrative subpoenas. What kind of issues give rise to the minority of cases where institutions do not cooperate?

      Maybe you can explain to us when the more “severe” punishment of debarment from federal funding is considered appropriate? The reported cases we have followed suggest that it is not at all systematic. For the most recent example, see:

      http://retractionwatch.com/2014/07/28/ori-sanctions-collaborator-of-nobel-winner-buck-for-data-fabrication/#more-21812

      “Zou agreed to a three-year settlement in which he must be supervised while doing any research with Public Health Service money.” (It is also made to sound like an agreement requiring negotiation, rather than a punishment that the ORI can impose unilaterally.) How often are people required to return funding?

      As we mention in the article, we do consider the existence of an investigator external to the conflicted institutions and journals to be a good thing in principle. And, excepting the leniency of the “punishments”, we wouldn’t have much quarrel with the work the agency does produce. However, 275 cases since 1989 amounts to less than 20/year. Do you feel that the ORI can scale to the numbers of cases currently being discovered? I would guess that the current rate on PubPeer alone would be about 100 papers per year (and rising) involving US labs.

      Should we encourage people to contact the ORI with concerns raised here?

      Then there is the issue of speed. All investigations take forever, during which time no information is available. Yet the few years following a publication are often the most critical, in terms of impact on the field and on people’s careers. Do you agree that it might be a good idea, when an investigation is considered justified, to make any original data obtained available to the public immediately? This would allow interested researchers to draw their own scientific conclusions (and maybe contribute observations to the investigation), independently of the determinations of the investigation regarding responsibility for any misconduct.

      • Well, in my 17 years in ORI, I recall only one institution that did not cooperate (at first), instead joining a suit filed in federal district court by the respondent whom I was investigating for ORI [Professor James Abbs, University of Wisconsin] – after the federal appeals court in D.C. reversed the district court’s decision, I resumed the ORI investigation, and the University counsels then sent to ORI the requested documents that Abbs claimed were from the research [ORI used them to prove that Abbs had falsified and fabricated the data for the paper – see a discussion of the Abbs and Univ. Wisconsin vs. Sullivan/HHS court case in my 2013 Accountability in Research paper listed above]. I understand that in recent years there was another university that declined to cooperate for some time because of another pending law suit. That is rare for ORI.

        ORI tries to impose debarment from federal funding on those who commit a substantial amount of serious research misconduct (especially involving several papers and/or grant applications, and those who in their defense may wrongly accuse collaborators, destroy or falsify records from the research, endanger human subjects or patients, and so on). The typical debarment period imposed by ORI (and by other federal agencies) – in 131 ORI cases of debarment since ORI was created in 1992 – is 3 years. Some first-time offenders, especially below professorial level, have received 1 or 2 years of debarment (15 cases). The worst offenders got 4 or 5 years (31 cases), 7 or 8 years (4 cases), 10 years (3 cases), or even permanent lifetime debarment (3 cases: Dr. Eric Poehlman in 2005, Mr. Paul Kornak in 2006, and Dr. Jon Sudbo in 2007). [Note: the ORI website and its online ORI Annual Reports can be searched for names, debarment, etc.]

        As you noted, most of the ORI cases are closed by “settlement” – just as most federal DoJ and other court cases are closed by prosecutors – in order to save the justice lawyers’ time and the taxpayers’ money that would be required to pursue an investigation and a trial over several years. Most ORI respondents agree to, or at least do not contest, the ORI-proposed settlement (findings of misconduct and administrative actions). However, some respondents (particularly those who can afford expensive defense lawyers) delay, argue, try to negotiate, refuse to bend, and threaten to appeal – or do appeal – the ORI findings and actions for a hearing (“trial”) before an HHS administrative law judge (which involves legal filings, discovery of evidence, interviews of witnesses, etc.) – all of which can take 1 to 3 years to complete. Thus, HHS counsels for ORI try to “negotiate” a “settlement”, as you indicated [just as DoJ does]. Some of those negotiations do lead to lesser administrative actions than the ORI scientists proposed. Federal law under the U.S. Constitution does not allow ORI – nor DoJ – to “impose unilaterally” (as you suggested) any such administrative action (sanction).

        As to your question of “how often are people required to return funding” – that is a matter for the federal funding agency (like NIH or others in HHS) or a federal, state, or local prosecutor to pursue. The federal government awards research grants and contracts to institutions – almost never to individuals. The ORI cases I recall in which there was some recovery of funds from the person who committed research misconduct were: Dr. Eric Poehlman ($180,000 DoJ fine, plus 1 year in prison from DoJ in 2005); Paul Kornak ($639,000 VA restitution, plus 6 years in prison from DoJ in 2006); Dr. Vipul Brighu at University of Michigan ($10,000 fines and costs, and 6 months probation from Michigan district court in 2011); and “Dr” [false claim of three degrees] Pat Palmer at University of Iowa ($1,000 fine and $19,000 in restitution in Iowa district court).

        Of course, as you asked, PubPeer bloggers could and should report to ORI their allegations of falsification, fabrication, or plagiarism in research that is proposed for funding to, or supported by funds from, NIH or HHS agencies. But suspicions of misconduct based on forensic image analysis, for example, can be wrong (false positives), honest errors or misunderstandings, or insignificant – not warranting investigation or findings of misconduct (so your comparison of annual PubPeer allegations with annual final ORI cases is not valid). ORI will review such allegations and may forward them to the responsible institution for confidential inquiry or investigation, and ORI will ask the complainant to maintain confidentiality about the case, as required by the HHS and all other federal regulations on research misconduct (in order to protect the reputation of the respondent until there is a formal finding of misconduct by the institution and/or by ORI). Obviously, your PubPeer bloggers are often eager to make public their observations and suspicions – and those leads may be useful to institutions and ORI in their ongoing investigations. However, there is always the danger that the respondent, on seeing the PubPeer observation or allegation, will then take the opportunity to destroy, alter, or create “records” before the institution (under ORI regulations) is informed and can sequester the original records “intact,” and this may well severely damage the investigation.

        I have no idea how to implement your suggestion that “any available original research records” (that are in question during a confidential institutional investigation) should be “made available immediately to the public.” Do you have any precedent for such action? I cannot imagine one from my two decades in the Government [other than the NIH requirement for x-ray crystallographers to deposit their protein diffraction data, after it is published, in a public repository].

        Sorry to be so long, but you asked a lot of questions, which did not have simple answers and required numbers. I hope this is helpful to you and your bloggers [I have many responses on Retraction Watch too.]

        Best, Alan Price, P.R.I.C.E. website

        • Thank you for the comprehensive background information. And we really appreciate the time you have taken to compile the detailed statistics. The reasons for the lenient (in our view) punishments are at least clear.

          The issue that might be worth discussing a bit further concerns the distinction we are trying to construct between bare information that is (or should be) public and more complicated legal issues of guilt, private actions, personal conflicts etc.

          All of the comments on PubPeer are based directly on the published record. We argue that all aspects of published data are absolutely open to discussion, including elements that suggest sloppy practice or, surprisingly often, blatant misconduct. (Authors are completely free to explain and defend their work, which some do quite successfully.)

          The suggestion of “making original research records available” would go something like this. If some anomaly apparent in a publication is sufficient to trigger an investigation, the authors will have to provide their version of the original data to the journal/institution/ORI. We suggest that it be made public at this stage. There isn’t much reason for the data or the fact of an investigation to be secret. The authors were careless enough to leave a big problem in plain sight in their paper, so they bear the responsibility of clearing up the mess. In addition, by publishing they have made a promise to the community that the original data are faithfully represented in the paper. It seems perfectly fair that they should be required to back up that promise if there are sufficient grounds for doubt. This is not in any way unfair to the authors, as it would be their version of the data, which they promised existed. Note that if publishing now in a PLoS journal that information would already be available, while at NPG the authors sign an undertaking to provide it to whoever should ask, so it’s not that revolutionary.

          Making available the authors’ version of the complete data as early as possible (i.e. at the beginning of an investigation, rather than never…) would have several benefits: scientists could make up their own minds whether to base their work on the published results on the basis of more complete information. Furthermore, the record of journals and institutions in these investigations is less than stellar, in that they miss loads of issues (see just a few examples in the “topic” on this question: https://pubpeer.com/topics/1/3CB9BC765DD8A6F7D10AC6D1942E7F#fb12537 ), so a bit of additional scrutiny may not go amiss.

          It is worth reiterating what we see as a key distinction, one that is currently not made in investigations. A problem apparent in published data is not at all the same thing as a private accusation by some lab insider. In the former case the authors have made a public promise regarding the data represented – so fine, back it up in public. There is no need for any confidentiality, as the authors have forgone any right to it by publishing. Thus, if they are dumb enough to publish what appear to be exactly the same rows of bands five times next to each other (https://pubpeer.com/publications/8BF0AE6D785C404F9ED363C590DD95#fb12582 ), for whatever reason, well, they had better be prepared to produce some convincing originals… In contrast, in the case of a whistleblower’s accusation, a confidential process makes much more sense. There has been no publication promise or publicly visible issue, and it is often necessary to weigh conflicting accounts with great care.

          The current process you describe seems to have the effect of throwing a cloak of confidentiality over everything, even including discussion of published data that may suggest misconduct, yet it is public already. This has the effect of depriving scientists of the information they most need at the time when they most need it. If no discussion is allowed (per confidentiality requirements of the complainant), it may be years before the world discovers there was a problem. Whole research programs may already have been wastefully launched, grants and jobs awarded. We feel that grounds for caution should be freely discussed if they arise from published data. Again, if an author publishes something unconvincing or dubious, why should they suddenly be protected from any discussion or criticism for the X years it can take an investigation to conclude (if one is even engaged)?

          Even if, as you suggest, PubPeer posts may warn fraudulent researchers and enable them to alter their “data” convincingly (despite strict rules about data retention), we believe that the balance of benefits is still strongly tilted towards immediate public discussion. Science will be better served by knowing immediately that there are potential problems with a piece of work than by discovering several years later that the researcher has been convicted of fraud. And the two outcomes are by no means exclusive.

          We also suspect that public knowledge of a pattern of problems associated with a particular author/group will, rightly, lead to more serious reputational damage than the current (unavoidably) slow and confidential investigations. Scientists are quite capable of making up their own minds.

          Sorry for the length of my own reply. The central point is that authors should be required to back up the promise they made when they publish. Because of that promise, they forgo any right to confidentiality of their data if their publication is unconvincing, be it through carelessness, poor practice or misconduct.

        • I’m a bit confused about the process. Consider this ORI action:

          http://ori.hhs.gov/chenli

          It sounds like the respondent did not agree to anything. It even says that “The Respondent failed to take responsibility for the fabrication and falsification described in ORI’s findings.” Wasn’t this a case of ORI unilaterally imposing sanctions?

          • Actually this is an example (though not completely explained in the ORI notice you cited) of what ORI has to do to guarantee fairness to the respondent — who (as I described above) has the right to appeal proposed ORI findings and administrative actions. In cases like this one, ORI has to try to contact the respondent at their last known addresses, tell them the proposed settlement with ORI, and give them time to respond. If there is no response, ORI’s scientists and counsels have to write a formal charge letter, describing ORI’s proposal based on the evidence (and/or prior admissions), then get approval from the ASH (above ORI in HHS), and send the proposed PHS findings and actions to the respondent, and finally wait until the deadline passes for a response or appeal. If there is no response, then ORI can proceed with its findings and actions, as in this case. Giving the respondent such a full and fair opportunity to respond and appeal is required by the ORI regulation (updated by HHS in 2005 – see ORI website). This fair process takes many months to be completed.

          • Thank you for responding.

            As I understand the rules, the administrative law judge may deny the request for a hearing, and would do so fairly quickly. I wonder how often a hearing is actually granted, and how high the bar is for obtaining one.

          • To quote from my history of the Office of Research Integrity (Accountability in Research 20, 291-319 (2013), available online at: http://www.tandfonline.com/) – which included citations to ORI cases that went before the HHS’ Departmental Appeals Board – or, after 2005, to its Administrative Law Judges (ALJs):

            “As noted by former PHS Counsel, turned defense attorney, Robert Charrow (2010 book), this appeal system at HHS can be challenging to the appellant: First, as a practical matter, few if any scientists will have the resources to seek full review by the DAB. . . . Second, recent changes in the regulations have made an appeal to the DAB less attractive. . . access to an appeal [hearing] is no longer automatic. To qualify you must now specify those aspects of the ORI finding that are factually incorrect and why they are incorrect. Even if you were to prevail at the DAB, the ALJ decision is no longer a true ruling as in the past, but now ‘constitutes a recommended decision to the Assistant Secretary for Health.’

            Since 1996, no ORI/PHS findings of research misconduct have been overruled by the DAB. Since 2005 (to date in 2013), in response to four such appeals, no formal hearings have been held by the ALJs, who have upheld the ORI/PHS findings and recommended administrative actions.”

            Nonetheless, the process of ORI scientists and OGC attorneys writing a charge letter, waiting for a response from the respondent, which may be an appeal to the DAB / ALJ, preparing a response to the appeal, and waiting for the ALJ decision, can take many months or even years.

  5. Very useful post. A few questions came to my mind, however:
    1. Is it physically feasible to maintain data repositories, given the growth rate of data volumes? And if so, should we focus on finding the right formats and the proper repositories so as not to waste time and money?
    2. If I am correct, you are proposing that the verification of data and the detection of manipulations will ultimately be left to forums such as PubPeer. However, in my experience PubPeer is far from complete – don’t get me wrong, I like your proposal very much, but I feel there can be some limitations in its practical implementation. Can we imagine a complementary approach, where some automatic detection method is run on every single paper, sending signals to the PubPeer community – let’s say a potential plagiarism case, an inconsistency in the distributions of numbers in a table, or even a potential image manipulation (you name other examples)?
    3. Can we imagine that, after a paper has passed a screening of this sort or has been extensively discussed by the community, it will gain a “reliability status”, maybe based on the number of times it has been questioned (and found OK)? Such a reliability index could be an incentive for authors to respond, as well as a deterrent against futile discussions (which sometimes obviously happen and will possibly increase).

    • You raise some very interesting points, Enrico. We try to respond below, using your question numbers.

      1) The data used for research must be stored once. In most cases it ought to be feasible to store it a second time, in the open. Given the tumbling costs of storage, for many types of research the storage costs would probably only be a small fraction of what one would pay to publish an open-access paper. There will probably always be exceptions, but we see no reason why open access of all data should not become the default.
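
      As a purely illustrative back-of-envelope calculation (every number below is an assumption, not a measurement), commodity object storage at roughly $0.02 per GB per month puts a decade of hosting for a 50 GB dataset at around a hundred dollars, well below a typical open-access publication fee:

      ```python
      # Illustrative sketch only: dataset size and storage price are assumed placeholders.
      dataset_gb = 50                 # assumed size of a published dataset
      price_per_gb_month = 0.02       # assumed commodity object-storage price (USD)
      years = 10

      storage_cost = dataset_gb * price_per_gb_month * 12 * years
      print(f"~${storage_cost:.0f} to host {dataset_gb} GB for {years} years")  # ~$120
      ```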

      How to represent data and analyses, and what file formats to use, are obviously huge problems, especially if the aim is to enable data mining. Clearly a huge amount of work on standardisation will be required. However, we feel strongly that the open data initiative cannot wait until the “format problem” is solved. In the near future that means going with whatever formats the authors have used. These will at least allow ad hoc checks and targeted reuse.

      2) Our post and comments above are mostly addressed to what should happen once an anomaly has been identified – ideally it would be made public and interested scientists should be able to examine the underlying data, eliminating the potentially inefficient, ineffective and ultimately unnecessary intermediaries like journals and institutions. Your question relates more to the initial detection of anomalies in papers, for which you suggest systematic screening for duplicated images or text. We agree that such an approach could be (and already is) a very useful source of information.
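
      As a purely illustrative toy example of the kind of automatic signal you describe (this is not how any existing screening tool works, and a deviation proves nothing by itself), one could compare the leading digits of the numbers in a table with the frequencies expected from Benford’s law:

      ```python
      import math
      from collections import Counter

      def first_digit_stat(values):
          """Toy screen: compare leading-digit frequencies of reported values with
          Benford's law. A large statistic only flags the numbers for a human look."""
          digits = [int(f"{abs(v):e}"[0]) for v in values if v != 0]  # leading significant digit
          counts = Counter(digits)
          n = len(digits)
          stat = 0.0
          for d in range(1, 10):
              expected = n * math.log10(1 + 1 / d)      # Benford expected count for digit d
              observed = counts.get(d, 0)
              stat += (observed - expected) ** 2 / expected
          return stat

      # Hypothetical values as they might be read from a table in a paper.
      print(first_digit_stat([123.4, 0.0045, 87.2, 19.9, 310.0, 2.7, 1.4, 56.0]))
      ```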

      3) It might however be dangerous to attempt to create a “quality metric” from the results of such screening, for at least a couple of reasons. Firstly, both false negatives and false positives are bound to exist. Secondly, misconduct and poor practices in many fields of research are not amenable to automatic detection. On a more philosophical level, we tend to agree with David Colquhoun that the only reputation system worth anything is what the experts in your field think of your work (http://www.dcscience.net/?p=6636 ); understanding a paper and its context simply cannot be automated… Nevertheless, we are certainly in favour of any initiatives that improve detection of misconduct and that therefore discourage it. Systematic screening could have a huge influence, although fraudsters will of course adapt. And such screening would, as you imply, supply some rare unbiased information about the extent of the problem.

  6. May I ask your views on the question of research fraud in different fields of science (alluded to in a couple of the responses herein)?

    From my two decades as a chief ORI research misconduct investigator and as a private consultant (see my researchmisconductconsultant.com website), I have never understood why there has always been this huge difference:
    – over 90% of Office of Research Integrity (ORI) cases [in the federal Department of Health and Human Services, dominated by National Institutes of Health (NIH) grants] involve falsification or fabrication of research data – while
    – over 90% of Office of Inspector General (OIG) cases in the federal National Science Foundation (NSF) involve plagiarism or theft of words (or ideas).

    I find it hard to believe that biological and medical scientists predominantly falsify and fabricate, while physical, mathematical, social, and engineering scientists just plagiarize! Of course there have been a few major and very public cases of fabrication in the physical sciences in recent years – but they are a small minority of NSF OIG cases. Anyone on PubPeer have an explanation? Is it a reporting problem?

    • This sounds like a reporting issue. It may also be that the sheer size of the biomedical field leads to more “flavors” of problems being reported. Perhaps it is also “easier” to fabricate a blot (though it doesn’t seem like it from my browsing of this site :)) than to fabricate a chemical reaction or a physics experiment? Plus, there are other types of QRPs which may be more widespread in other fields (they may not be direct fabrication, but cherry-picking results, selectively choosing your baseline methods to show some improvement, etc.).

    • How is fraud distributed between different scientific fields? That is very difficult to know, but here are a few thoughts and even fewer facts.

      The vast majority of comments on PubPeer are in the life sciences, but a number of other fields are represented (notably chemistry and nanoscience). Essentially no physics or mathematics. Most of the life science reports involve image manipulation – a good majority are gels, with a bunch of duplicated specimen images as well.

      We think several aspects contribute to the preponderance of this kind of misconduct on our site. Of course, there is a lot of life science. It is relatively easy to detect image manipulation (when badly done) and there is also something almost uniquely effective and self-contained about a duplication in an image – anybody can understand it. There are certainly other forms of fraud and poor practice in the life sciences, but they can be harder to detect and it may require expertise to understand them, which can limit their impact when they are reported. Various forms of data selection and p-hacking spring to mind. There is of course also the whole range of willful misinterpretations, which are not usually considered fraud (especially if complex) but can nevertheless often be deliberate misrepresentations.

      Few other sciences seem to have misconduct that is as easy to demonstrate as a gel image manipulation; the exceptions that we see sometimes on PubPeer are things like doctored NMR spectra in chemistry (as Dave Fernig mentioned above).

      It is interesting that you mention mathematics. I suspect that it would be nearly impossible to demonstrate fraud in that field. What would it look like? An error in some derivation. To demonstrate that it was fraud, you would have to show that it was introduced knowingly, which must be very difficult. (Is there a requirement for mathematicians to keep “lab books” of their preliminary work?) In any case, I suspect that having an error demonstrated in your work as a mathematician is almost as damaging as an accusation of fraud.

      The same notion can be extended to theoretical and big-data fields. It is probably possible to influence the results through a number of model assumptions, which may be criticized but are generally not characterized as fraud. Again, there is a strong tendency not to consider as misconduct anything that is complex or difficult to understand.

      Mathematicians and physicists are quite used to exchanging ideas via the ArXiv, which may also reduce their need for a platform like PubPeer (in addition, we have no facility for displaying equations).

      In one of our comments above, we opined that the complexity and variability of preparations in life science, and the fact that research in that field is essentially about discoveries rather than developing general theories, might offer more latitude for fabricated results to escape undetected. But maybe the difference is not as extreme as that: research everywhere is dealing with the unknown, so people usually don’t know what to expect.

      Nevertheless, some subjects of research are so complex that there is no realistic hope of establishing a mechanism (a lot of psychology…). The absence of any requirement for a mechanism probably does make it harder to detect fraud and incorrect results, as there are simply fewer firm predictions to test. The opposite is probably true to some extent in the physical sciences.

      • I don’t know about these explanations – maybe.

        As to your mention of the field of “psychology,” there have in fact been quite a number of Office of Research Integrity (ORI) findings against persons in psychology research (see the Annual Reports on the ORI website, ori.hhs.gov):

        1995: Danya Vardi, former Research Associate in Psychology at Harvard Medical School, fabricated emotional recall responses of human subjects in research.

        1997: Christopher Leonhard, graduate student in Psychology at Dartmouth College, fabricated experimental and surgical records for research.

        1998: Katrina Berezniak, Research Assistant in Psychology at University of Missouri–St. Louis, falsified scoring of taped interviews of nine human research subjects.

        1999: Karrie Recknor, Graduate Research Assistant in Psychology at University of Washington, falsified electronic mail responses of a collaborator in confirming interview scoring.

        2001: Karen M. Ruggiero, Ph.D., former Assistant Professor, Department of Psychology at Harvard University, fabricated records for hundreds of interviews of women on discrimination.

        2006: Amy Goldring, graduate student in Psychology at University of California at Los Angeles, falsified or fabricated data and statistical results for nine pilot studies on vulnerability in decision-making.

        2007: Nicholas McMaster, undergraduate student in Psychology at University of Chicago, fabricated data on reflex scoring and cell type recording on rats.

        2008: Roxana Gonzalez, graduate student in Social and Decision Sciences and Psychology, falsified the main dependent variable and data in human subjects research.

        2012: Marc Hauser, Professor of Psychology at Harvard University falsified data on tamarin monkeys behavior in multiple papers.

        2013: Adam Savine, doctoral student in Psychology at Washington University St. Louis, falsified data on memory patterns to improve the statistical significance.

  7. I think the main problem is the lack of job security for scientists. Scientists who have to “publish or perish” have a strong incentive to cheat – it may be a question of survival in science. If one wants independent and honest scientists, one should give them security. Basic job security would not be expensive at all – a small but safe salary would be sufficient.

    This would not only solve a large part of the problem of cheating. Getting rid of “publish or perish” would also increase the overall quality of papers. It would also solve the problem with string theory – that essentially all research in fundamental science is concentrated in a very small number of directions. A scientist without a permanent position simply has no possibility of investing several years of research in his own ideas – his scientific career would be finished before he was able to publish anything. With low-paid but safe jobs, a lot of young scientists would take the risk and invest their own lives in their own ideas. Of course, there would be a high rate of failure – but there is no reason to expect it to be much higher than that of today’s few fashionable directions.

    In other domains where we want honest people doing the job, we also give them job security. Or do you want judges who have to apply every two years for a new job extension, based on, say, the number of…, hm, which number?

    • Yes, and the paradox is that good talents inside the private sector sometimes find safer places, where their bosses give them time to investigate, sometimes for decades, and trust their intuition.
      You find such places at Toyota, Mitsubishi, SRI, IBM (see Senior Fellow), Lockheed Martin…

      Those companies have often understood that it is counterproductive to stick their noses into their talented scientists’ business…

      The problem is that they are interested in domains with applications in the next decade, not in basic science…
      Academic science today doesn’t look for results but for papers… and those papers have to be produced in the very short term.

      The worst is when you mix both systems, with academic research funded by the private sector… when not publishing means not being funded.

  8. Totally agree with the authors. Basically, I don’t believe other people’s results unless I can reproduce them. Unfortunately, many times I can’t. Also, there is little or no price for reporting exaggerated or fake eye-catching results, even in high-profile journals. On the contrary, the reward is tremendous: money, jobs, promotions… Hard work and honesty are much less rewarded than cheating and cooking the books. That’s the reality in today’s academia.
