by Judith Curry
Times Higher Education (UK) has a lengthy article by Darrel Ince entitled “Systems failure”, with the subtitle “A scandal involving clinical trials based on research that was riddled with errors shows that journals, institutions and individuals must raise their standards.” An interview with Ince can be found here. There is discussion of this on two threads at Bishop Hill (here and here). The article and interview are thought-provoking and highly relevant to the climate debate.
The article chronicles the publication of a 2006 paper on chemotherapy by a group of researchers at Duke University that was touted as a major breakthrough and stimulated several clinical trials. Two biostatisticians, Baggerly and Coombes, investigated the research, found major problems with the statistical analysis, and pointed these out to the Duke researchers, who corrected some of the small errors but remained adamant that the core research was solid. There follows a story of Baggerly and Coombes attempting to publish their concerns, with little success. The issue morphed into an ethical one when the Duke research began to be used on cancer patients in clinical trials. They finally managed to get their critique published in the Annals of Applied Statistics, which brought it to the attention of the National Cancer Institute. Duke responded but continued with the clinical trials. Baggerly and Coombes then resorted to a FOIA request, Duke’s investigation found no problems, and the saga dragged on until there was finally an Institute of Medicine inquiry.
Excerpts from the article about what we should learn from this:
No one comes out of this affair well apart from Baggerly and Coombes, The Cancer Letter and the Annals of Applied Statistics. The medical journals and the Duke researchers and senior managers should reflect on the damage caused. The events have blotted one of the most promising areas in medical research, harmed the reputation of medical researchers in general, blighted the careers of junior staff whose names are attached to the withdrawn papers, diverted other researchers into work that was wasted and harmed the reputation of Duke University.
What lessons should be learned from the scandal? The first concerns the journals. They were not incompetent. Their embarrassing lapses stemmed from two tenets shared by many journals that are now out of date in the age of the internet. The first is that a research paper is the prime indicant of research. That used to be the case when science was comparatively simple, but now masses of data and complex programs are used to establish results. The distinguished geophysicist Jon Claerbout has expressed this succinctly: “An article about computational science in a scientific publication isn’t the scholarship itself, it’s merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions used to generate the figures.”
Baggerly and Coombes spent a long time trying to unravel the Duke research because they had only partial data and code. It should be a condition of publication that these be made publicly available.
The second tenet is that letters and discussions about defects in a published paper announcing new research have low status. Journals must acknowledge that falsifiability lies at the heart of the scientific endeavour. Science philosopher Karl Popper said that a theory has authority only as long as no one has provided evidence that shows it to be deficient. It is not good enough for a journal to reject a paper simply because it believes it to be too negative.
Journals should treat scientists who provide contra-evidence in the same way that they treat those putting forward theories. For an amusing and anger-inducing account of how one researcher attempted to have published a comment about research that contradicted his own work, see “How to Publish a Scientific Comment in 1 2 3 Easy Steps”.
The second lesson is for universities. University investigations into possible research irregularities should be conducted according to quasi-legalistic standards. In his evidence to the Institute of Medicine inquiry, Baggerly stated that he and Coombes had been hindered by the incompleteness of the Duke review – specifically in that the university did not verify the provenance and accuracy of the data that the researchers supplied to the review, did not publish the review report, did not release the data that the external reviewers were given and withheld some of the information Baggerly and Coombes had provided to the review.
The university’s explanation for not passing on the new Baggerly and Coombes material was a “commitment to fairness to the faculty” and a senior member of the research team’s “conviction and arguments, and in recognition of his research stature”. A similar argument in a court of law would not have been allowed.
The third lesson is for scientists. When research involves data and computer software to process that data, it is usually a good idea to have a statistician on the team. At the “expense” of adding an extra name to a publication, statisticians provide a degree of validation not normally available from the most conscientious external referee. Indeed, the statistics used might merit an extra publication in an applied statistics journal. Statisticians are harsh numerical critics – that’s their job – but their involvement gives the researcher huge confidence in the results. Currently the scientific literature, as evidenced by the major research journals, does not boast any great involvement by statisticians.
A fourth lesson from the Duke affair concerns reproducibility. The components of a research article should be packaged and made readily available to other researchers. In the case of the Duke study, this should have included the program code and the data. This did not happen. Instead, Baggerly and Coombes spent about 200 days exploring the partial materials provided to conduct their forensic investigation. In a pre-internet, pre-computer age, packaging-up was less of an issue. However, the past decade has seen major advances in scientific data-gathering technologies for which the only solution is the use of complex computer programs for analysis.
A number of tools are now being developed for packaging up research. Among the best is Sweave, a software system that combines an academic paper, the data described by the paper and the program code used to process the data into an easily extractable form. There are also specific tools for genetic research, such as GenePattern, that have friendlier user interfaces than Sweave.
What is worrying is that more scandals will emerge, often as a result of the pressure on academics, who are increasingly judged solely on the volume of their publications (some systems even give an academic a numerical rating based on paper citations) and their grants, and on how patentable their work may be. Our universities are ill-prepared to prevent scandals happening or to cope with the after-effects when they do happen. There is a clash here between collegiality and the university as a commercial entity that needs to be resolved.
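For readers unfamiliar with Sweave, here is a minimal sketch of the kind of document Ince describes. The file and variable names (analysis.Rnw, measurements.csv, dose, response) are invented for illustration; the point is that the prose, the data-processing code and the figure all live in a single source file, so rebuilding the paper reruns the analysis:

    \documentclass{article}
    \begin{document}

    \section*{A minimal reproducible analysis}

    % The R chunk below reads the raw data distributed with the
    % paper and fits a model. Because the reader rebuilds this
    % document from the same inputs, the reported numbers cannot
    % silently diverge from the data.
    <<fit-model, echo=TRUE>>=
    dat <- read.csv("measurements.csv")   # hypothetical data file
    fit <- lm(response ~ dose, data = dat)
    summary(fit)$coefficients
    @

    The fitted slope is \Sexpr{round(coef(fit)["dose"], 3)} units
    of response per unit dose.

    % fig=TRUE tells Sweave to run the chunk, save the plot, and
    % insert it into the LaTeX output automatically.
    <<dose-response-plot, fig=TRUE, echo=FALSE>>=
    plot(response ~ dose, data = dat)
    abline(fit)
    @

    \end{document}

Running Sweave("analysis.Rnw") in R executes the code chunks and produces a .tex file with the results and figure woven in, which is then compiled with LaTeX in the usual way. Nothing in the finished paper can quietly drift away from the data and code that produced it, which is exactly the packaging-up that Baggerly and Coombes spent 200 days reconstructing by hand.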
About Darrel Ince
Darrel Ince is professor of computing at the Open University. His web site is here. He is writing an account of the Duke University affair with the title The Cracks in Science. Ince also wrote an article on Climategate in the Guardian.
Accompanying editorial
Times Higher Education has an accompanying editorial, also quite good, with the title “To get to the truth, open up” and the subtitle “More transparency from scientists, journals and institutions would go a long way to ensuring that flawed research is quickly detected.” An excerpt:
So what lessons can be learned?
We may struggle to change human nature, but we ought to be able to ensure that journals, as Professor Ince says, “acknowledge that falsifiability lies at the heart of the scientific endeavour” – they must be less quick to dismiss challenges to their published papers and more willing to admit mistakes.
Duke itself has acknowledged that in work involving complex statistical analyses, most scientists could benefit from a little help from the statistics department before publishing.
Professor Ince goes a step further, arguing that all elements of all the work (in the Duke case, the full raw data and relevant computer code) should be made publicly available so that others can replicate or repudiate the findings.
In this age of information and the internet, that can’t be too difficult, can it?
JC comments. At the heart of this are the rather extreme disincentives for researchers to admit and correct mistakes; this needs to change. But the system will be self-correcting with greater transparency (public availability of raw data and computer code). Some of the parallels to recent episodes in climate science are striking. The interesting thing about this example is that it wasn’t just an academic dispute with researchers’ egos at stake: the research affected chemotherapy decisions, with life-and-death consequences. The climate dispute was mostly academic prior to the 1990s, but it is now at the heart of the international energy policy debate. The single most important thing that could be done is institutionalizing the requirement for complete transparency of data, methods, and code.
