by Judith Curry
The demand for climate information, with long observational records spanning decades to centuries and the information’s broad application for decision making across many socioeconomic sectors, requires that geophysicists adopt more rigorous processes for the sustained production of climate data records (CDRs). Such processes, methods, and standards are more typically found in the systems engineering community and have not generally been adopted in the climate science community. – John Bates and Jeffrey Privette
The latest issue of EOS has an article entitled "A maturity model for assessing the completeness of climate data records." The paper is behind a paywall; here are some liberal excerpts:
We propose the use of a maturity matrix for climate data records that characterizes the process of moving from a basic research product (e.g., raw data and initial product) to a sustained and routinely generated product (e.g., a quality-controlled homogenized data set).
This model of increasing product and process maturity is similar to NASA’s technical readiness levels for flight hardware and instrumentation and the software industry’s capability maturity model. Over time, engineers who have worked on many projects developed a set of best practices that identified the processes required to optimize cost, schedule, and risk. In the NASA maturity model, they identified steps in technology readiness, denoted as the technology readiness level (TRL). TRL 1 occurs when basic research has taken the first steps toward application. TRL 9 is when a technology has been fully proven to work consistently for the intended purpose and is operational.
Similarly, the computer software industry has widely adopted the Capability Maturity Model Integration (CMMI) to develop software processes to improve performance, efficiency, and reproducibility.
[T]here are numerous iterative steps involved in the creation of climate data records. These steps can be imagined as an expanding spiral, beginning with instrument testing on the ground, expanding to calibration and validation of the instrument and products to archiving and preservation of relevant data and provenance of the data flow, and finally broadening to comparisons and assessments of the products. In addition, the sustained involvement of research experts is required, as history has shown that new problems in producing homogeneous CDRs arise as different instruments are used over time to observe the climate.
The proposed CDR maturity matrix combines best practices from the scientific community, preservation description information from the archive community, and software best practices from the engineering community into six levels of completeness. These maturity levels capture the community best practices that have arisen over the past 2 decades in fielding climate observing systems, particularly satellite observing systems. Each level is defined by thematic areas: software readiness (stability of code), metadata (amount and compliance with international standards), documentation (description of the processing steps and algorithms for scientific and general communities), product validation (quality and amount in time and space), public access (availability of data and code), and utility (uses by broader community).
Maturity levels 1 and 2 are associated with the analysis of data records from new instruments or a new analysis of historic observations or proxy observations. Although products at this stage of development may be used in research, there is insufficient maturity of the product for it to be used in decision making. Initial operational capability (IOC) is achieved in maturity levels 3 and 4. At these levels, the product has achieved sufficient maturity in both the science and applications that it may tentatively be used in decision making. Finally, full operational capability (FOC), levels 5 and 6, is achieved only after the product has demonstrated that all aspects of maturity are complete. This level of maturity ensures that the CDR product can be reliably used for decision making.
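To make the level groupings in the excerpt above concrete, here is a minimal Python sketch. The six thematic areas and the 1-2 / 3-4 / 5-6 groupings come from the excerpt; the names `Capability`, `capability_for_level`, and `overall_level` are hypothetical, and taking the overall level as the lowest score across thematic areas is my own assumption (one reading of "all aspects of maturity are complete"), not something the excerpt states.

```python
from enum import Enum
from typing import Mapping


class Capability(Enum):
    RESEARCH = "research use only"
    IOC = "initial operational capability"
    FOC = "full operational capability"


# The six thematic areas named in the excerpt.
THEMATIC_AREAS = (
    "software_readiness",
    "metadata",
    "documentation",
    "product_validation",
    "public_access",
    "utility",
)


def capability_for_level(level: int) -> Capability:
    """Map a maturity level (1-6) to the grouping described in the excerpt."""
    if not 1 <= level <= 6:
        raise ValueError(f"maturity level must be 1-6, got {level}")
    if level <= 2:
        return Capability.RESEARCH
    if level <= 4:
        return Capability.IOC
    return Capability.FOC


def overall_level(scores: Mapping[str, int]) -> int:
    """Assumption: a product is only as mature as its weakest thematic area."""
    missing = [area for area in THEMATIC_AREAS if area not in scores]
    if missing:
        raise ValueError(f"missing thematic areas: {missing}")
    for area in THEMATIC_AREAS:
        if not 1 <= scores[area] <= 6:
            raise ValueError(f"{area} must be scored 1-6, got {scores[area]}")
    return min(scores[area] for area in THEMATIC_AREAS)
```

Under that weakest-area assumption, a product scoring 5 everywhere except 3 on public access would be treated as level 3, i.e. IOC rather than FOC.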
Quantifiable standards should exist at each maturity level for each thematic area. For example, peer-reviewed publications are required in three separate areas to address product documentation, validation, and utility. The maturity level matrix also pays particular attention to software maturity and access. This includes requiring that the code be managed and reproducible, that metadata have provenance tracking and meet international standards, and that all code be publicly accessible. The product must be assessed by multiple teams, and positive value must be demonstrated; uncertainty must be documented. Each of these steps must be independently verifiable.
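The requirements listed in that excerpt read naturally as a checklist. Here is a rough Python sketch of one; the class `CdrChecklist`, its field names, and the helper `unmet_requirements` are all hypothetical, and the excerpt does not say which maturity level each item belongs to, so this treats them as a single flat list.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class CdrChecklist:
    # Peer-reviewed publications in three separate areas.
    documentation_peer_reviewed: bool
    validation_peer_reviewed: bool
    utility_peer_reviewed: bool
    # Software maturity and access.
    code_managed_and_reproducible: bool
    metadata_provenance_tracked: bool
    metadata_meets_international_standards: bool
    code_publicly_accessible: bool
    # Assessment.
    assessed_by_multiple_teams: bool
    positive_value_demonstrated: bool
    uncertainty_documented: bool


def unmet_requirements(checklist: CdrChecklist) -> List[str]:
    """Return the names of checklist items that are not yet satisfied."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]
```

An empty list from `unmet_requirements` would mean every item in the excerpt is ticked; it says nothing about whether each item has been independently verified, which the excerpt also requires.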
This maturity matrix model may serve in the future as a requirement for use of data sets in international assessments or in other societal and public policy applications, similar to certification programs that engineering professions conduct. The model focuses on process improvement to ensure traceability and transparency of CDRs but includes steps related to standard scientific review and assessment. Adoption of this standard by the climate community would help ensure quality long-term CDRs and facilitate their use in decision making across all natural and social science disciplines.
JC message to NOAA: MUCH more of this, please.
The hodgepodge nature of climate data records was revealed by the Climategate emails. Since then, the situation has been moving in a much better direction, largely through the efforts of John Bates, which I have been aware of since the early planning stages a number of years ago.
Establishment of credible Climate Data Records is essential not only for climate research, but also for ‘climate services.’ NOAA – save your money and axe the rest of ‘climate services;’ the private sector, universities and local governments can handle the rest as needed. Spend your money on the Climate Data Records.