by Judith Curry
Continuing the themes of conflict prevention and best practices developed in Part III, I would like to discuss some pages from Edward Tufte’s book Beautiful Evidence, which was introduced here by Steve Mosher (seconded by MrPete). Of particular relevance is a chapter entitled “Corrupt Techniques in Evidence Presentations: Effects Without Cause, Cherry Picking, Punning, Chartjunk.”
From the introduction to the chapter:
Making a presentation is a moral act as well as an intellectual activity. . . Consumers of presentations should insist that presenters be held intellectually and ethically responsible for what they show and tell. Thus consuming a presentation is an intellectual activity and a moral act.
From the subsection entitled “Cherry Picking, Evidence Selection, Culled Data”
The most widespread and serious threat to learning the truth from an evidence-based report is cherry-picking, as presenters pick and choose, select and reveal only the evidence that advances their favored point of view.
Not all presenters are saintly enough to provide their audience with competing explanations, contrary evidence or a description of the larger pool of evidence tapped to construct the presentation.
Credible explanations grow from the combined testimony of 3 more or less independent, mutually reinforcing sources: explanatory theory, empirical evidence, and rejection of competing alternative explanations. Cherry-picking dilutes, confounds, and mixes up these 3 elements into the wishful circular thinking of Ignorance.
Between the initial data collection and the final published report falls the shadow of the evidence-construction and evidence-representation process: data are selected, sorted, edited, summarized, massaged, and translated into the published graphics, tables, diagrams, models, images, numbers, words. In this sequence of representation, the physical world is represented and summarized by the raw data and, in turn, the raw data is represented and summarized by the graphics, tables, images of the published report.
This process of evidence construction and representation, although not a black box but certainly a gray area, consists of all the decisions that cause the published findings of a report. These decisions are made, to varying degrees, both in the spirit of doing analytical detective work to discover what is going on and in the spirit of advancing a favored point of view.
What, then, are consumers of reports and presentations to do?
The integrity and credibility of a report depend on the integrity and credibility of the process of evidence-construction and evidence-representation: thus alert consumers need to see an adequate description of this process.
Ask the following: (1) Are the substantive findings the result of the methods used in the evidence-construction process? (2) Did the favored view compromise the integrity of the analysis? (3) How much does the decision to be made depend on the evidence of the report at hand?
JC comments: I find this essay interesting for several reasons. The essay puts the burden on the consumer of evidence as well, not just the producer. The policy makers (who are the primary consumers) seem less interested in the traceability of evidence construction and representation. The calls for probity are coming from people in industry (I don't particularly mean the oil industry, but industry more broadly), where this kind of accountability is expected. It was the absence of information about evidence construction and representation (not just the data itself) that fueled the skeptical climate blogosphere. And when this information materialized in the context of the emails, well, there was a blogospheric explosion.
As to the question "Did the favored view compromise the integrity of the analysis?": it certainly seems to have. Kevin Trenberth lays it on the line when he says the null hypothesis should be changed to "AGW is real and dangerous"; this seems to be the implicit null hypothesis of the IPCC. Bias can enter into such an analysis explicitly or subliminally; the remedy is full transparency, so people can examine your judgments and your reasoning for doing what you did. And so we are back to basics: transparency and integrity.
So, given that data from natural systems are messy (much messier than data from controlled laboratory experiments), I am interested in examples of good and bad practices of data representation (beyond the hockey stick).
