
Statistical analysis of numerical preclinical radiobiological data




Scientific fraud is an increasingly vexing problem. Many current programs for fraud detection focus on image manipulation, while techniques for detection based on anomalous patterns that may be discoverable in the underlying numerical data get much less attention, even though these techniques are often easy to apply.


We applied statistical techniques in considering and comparing data sets from 10 researchers in one laboratory and three outside investigators to determine whether anomalous patterns in data from a research teaching specialist (RTS) were likely to have occurred by chance. Rightmost digits of values in RTS data sets were not uniformly distributed, as would be expected. Equal pairs of terminal digits occurred at higher than expected frequency (>10%), and an unexpectedly large number of data triples commonly produced in such research included values near their means as an element. We applied standard statistical tests (chi-square goodness of fit, binomial probabilities) to determine the likelihood of the first two anomalous patterns and developed a new statistical model to test the third.
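The first two checks named above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the data are non-negative integer counts, and the function names are hypothetical. The first function returns the chi-square goodness-of-fit statistic for uniform rightmost digits (9 degrees of freedom); the second returns the binomial upper-tail probability of seeing at least the observed number of equal terminal-digit pairs when each pair is equal with probability 0.1.

```python
from collections import Counter
from math import comb


def terminal_digit_chi_square(values):
    """Chi-square statistic for uniformity of rightmost digits (df = 9)."""
    digits = [abs(int(v)) % 10 for v in values]
    counts = Counter(digits)
    expected = len(digits) / 10  # uniform null: each digit equally likely
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))


def equal_pair_tail_probability(values, p=0.1):
    """P(X >= observed) for X ~ Binomial(n, p), where X counts values
    whose last two digits are equal. Only values >= 10 are eligible."""
    eligible = [abs(int(v)) for v in values if abs(int(v)) >= 10]
    n = len(eligible)
    k = sum(1 for v in eligible if v % 10 == (v // 10) % 10)
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical usage: compare the statistic with the standard chi-square
# critical value for df = 9 at the 0.01 level (21.666).
CRITICAL_VALUE_DF9_ALPHA_01 = 21.666
sample = list(range(100, 200))  # made-up counts, perfectly uniform digits
looks_anomalous = terminal_digit_chi_square(sample) > CRITICAL_VALUE_DF9_ALPHA_01
```

Comparing the statistic with a tabulated critical value keeps the sketch free of third-party dependencies; a statistics library would give an exact p-value instead.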


Application of the three tests to various data sets reported by RTS resulted in repeated rejection of the hypotheses (often at p-levels well below 0.001) that anomalous patterns in those data may have occurred by chance. Similar application to data sets from other investigators was entirely consistent with chance occurrence.


This analysis emphasizes the importance of access to raw data that form the bases of publications, reports, and grant applications in order to evaluate the correctness of the conclusions and the importance of applying statistical methods to detect anomalous, especially potentially fabricated, numerical results.


Author and article information

[1]Renaissance Associates, Princeton, NJ, USA
[2]NJ Medical School, Rutgers University, Newark, NJ 07101-1709, USA
Author notes
[*]Corresponding author’s e-mail address:
ScienceOpen Research
22 January 2016
: 0 (ID: 8aa0f248-2bad-44c6-adfd-42816c14c272)
: 0
: 1-22
© 2016 Pitt et al.

This work has been published open access under Creative Commons Attribution License CC BY 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Conditions, terms of use and publishing policy can be found at

Figures: 2, Tables: 3, References: 28, Pages: 22
Original article


2017-01-31 17:25 UTC
2017-01-31 17:18 UTC
2017-01-26 21:17 UTC
2017-01-26 18:14 UTC

Let me begin by commending the authors for a thought-provoking and persuasive paper on an important topic.  Since my background is in political science, not biology, and since Chris Hartgerink’s review aptly discussed a number of technical issues, I will focus my remarks on some of the big picture issues that arise in statistical exercises designed to detect data fraud, responding to the authors’ comment about “routine application” of this type of investigation:

“We believe that routine application of statistical tools to identify potential fabrication could help to avoid the pitfalls of undetected fabricated data just as tools, for example, CrossCheck and TurnItIn, are currently used to detect plagiarism.” (p.1)

Like the authors, I see no reason that statistical methods for detecting fabrication should be “all but ignored…by the larger world.” Clearly, these methods are useful and have the potential to be quite convincing, and the current application is a case in point.

Still, if applied in a “routine” manner, the statistical detection of irregularities raises the question of how the presumption of innocence on the part of the accused should be built into the statistical analysis. The authors say little about this, for understandable reasons. This manuscript presents such overwhelming statistical evidence that almost any priors about innocence would be swept aside by the tsunami of evidence showing that RTS’s data are incompatible with basic probability models, with data generated by other workers, and with data generated by other labs.

However, because the authors ultimately wish to speak to the policy question of whether such data checks should become routine, they should step back and consider the consequences of routine checks performed on a vast scale across a large number of labs. Even in a world where no data fraud occurs, suspicious patterns will occur by chance. For example, if 100,000 innocent data producers were subjected to routine checks, about 1,000 of them would be expected to become objects of suspicion based on hypothesis tests with a 0.01 Type I error rate. Perhaps more poignantly, about 10 of them would be objects of quite intense suspicion based on hypothesis tests with a 0.0001 Type I error rate. The authors should consider the systemic implications of mandating routine checks and how one might develop statistical procedures and investigative guidelines that balance the aim of detecting data fraud and the downside risk of false inculpation.
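The arithmetic behind these figures is simply the number of innocent data producers multiplied by the Type I error rate. A minimal sketch, with a hypothetical function name and assuming one test per data producer:

```python
def expected_false_alarms(n_innocent, alpha):
    """Expected number of innocent data producers flagged when each is
    subjected to a single test with Type I error rate alpha."""
    return n_innocent * alpha


# The two scenarios described in the text.
suspicion = expected_false_alarms(100_000, 0.01)            # about 1,000
intense_suspicion = expected_false_alarms(100_000, 0.0001)  # about 10
```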

More generally, I would recommend that the authors place their analysis (and policy prescriptions) within a Bayesian framework.  Prior to conducting the statistical investigation, the investigator who suspects fraud starts with some priors about whether the lab worker fabricated the data.  These priors might be informed by a number of circumstantial facts – for example, was the lab worker unsupervised when recording the data?  The authors of this paper start with a null hypothesis of no fabrication and try to reject it at some high level of significance, but another approach is to begin with a strong presumption of innocence (e.g., a prior probability of data fabrication of 0.001 or less) as an input into Bayes’ Rule.  Next, the researcher assesses (theoretically or intuitively through experience) the likelihood of observing statistical evidence suggesting fabrication given that fabrication occurred as well as the likelihood of observing statistical evidence suggesting fabrication given that fabrication did not occur.  These likelihoods depend on the characteristics of the statistical tests (such as the Poisson model the authors propose) and on intuitions about the fabrication process.  In this regard, the authors do a nice job of suggesting that the suspicious triplet pattern at the center of their analysis would be consistent with data fabrication because it would be especially convenient for the data fabricator. 

More subtly, these likelihoods also depend on how many such tests were conducted and which ones were presented.  Again, in this particular application, the results that the authors present are overwhelming, and there is no reason to think that tests other than the ones presented were conducted or would be relevant.  In the general case, however, an impartial reader might wonder whether tests other than the ones presented were conducted and, if so, how they would affect the posteriors that would emerge from Bayes’ Rule. Just as the authors call for open access to replication data, they should also call for transparency and comprehensiveness in reporting of investigative analyses.  Forensic exercises such as this one should report the full set of tests that were conducted (including code and data) so that the reader is assured that tests were not presented selectively or inaccurately.  The concluding section of the paper might fruitfully include a checklist that lays out what investigations of this sort should present to readers.
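To illustrate why the number of tests matters, the following sketch computes the chance that an innocent researcher trips at least one of several tests. It assumes the tests are independent, which real forensic tests on the same data generally are not, and the function names are hypothetical. The Bonferroni adjustment shown is one standard remedy that I am introducing for illustration; it is not something the paper uses.

```python
def familywise_false_alarm_rate(alpha, n_tests):
    """P(at least one of n_tests independent tests rejects) when the
    data are innocent and each test has Type I error rate alpha."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


# Hypothetical example: 20 tests run, each at the 0.01 level.
rate_if_unadjusted = familywise_false_alarm_rate(0.01, 20)
per_test_level = bonferroni_threshold(0.01, 20)
```

A reader who knows only the tests that were reported cannot compute either quantity, which is the practical argument for disclosing the full set.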

Suppose that the evidence is presented in a comprehensive and accurate manner.  The final step would be to generate a posterior probability of data fabrication given the evidence using Bayes’ Rule.  For example, one could plug in the inputs at one of the many on-line sites like this one:

In some cases, the results of this exercise may be sensitive to the prior probability of guilt.   If the procedure is being applied to all researchers as a matter of “routine,” then the prior probability may be fairly low (if one believes that data-faking tends to be rare).  Given the severity of the accusation of data fraud, it may make sense as a matter of policy to keep the prior fairly low even when a specific person comes under investigation in the wake of some suspicious behavior.
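The sensitivity to the prior can be made concrete with a direct application of Bayes' Rule. In this sketch the function name is hypothetical and the two likelihoods (0.9 and 0.0001) are made-up inputs chosen only for illustration; they are not estimates from the paper.

```python
def posterior_fabrication(prior, p_evidence_given_fab, p_evidence_given_innocent):
    """Posterior probability of fabrication given inculpatory evidence,
    computed with Bayes' Rule."""
    numerator = prior * p_evidence_given_fab
    denominator = numerator + (1 - prior) * p_evidence_given_innocent
    return numerator / denominator


# Illustrative inputs only: evidence that is very likely under fabrication
# and has a 0.0001 chance of arising from innocent data.
for prior in (0.1, 0.01, 0.001, 0.0001):
    post = posterior_fabrication(prior, 0.9, 0.0001)
    print(f"prior={prior:<7} posterior={post:.4f}")
```

Under these assumed inputs the posterior falls noticeably as the prior shrinks, even though the evidence itself is unchanged.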

The results from Bayes’ Rule may also be sensitive to the specified probability of inculpatory evidence given data fabrication.  Although this paper presented an intuitive theory about why data fraud in this domain would take the form that it did, in other situations we may not have a clear sense of how fabrication would occur, and so it may be hard to pin this quantity down.  I would be curious to hear the authors’ thoughts on how this quantity should be handled as part of routine surveillance.

In conclusion, the authors do a good job of developing statistical tests tailored to the application at hand and presenting overwhelming evidence of guilt. The final section of their paper summarizes other cases of fraud in which statistical irregularities resulted in a fuller investigation that left no doubt about guilt. Less is said about instances in which suspicions were found to be groundless (or ambiguous) upon further statistical investigation. In sum, I would invite the authors to say more about the potential for, and systemic implications of, false alarms.

2016-07-01 10:50 UTC
