I’ve been reading up on the early days of the whole shebang, starting with the early correspondence between McI and Mann.
Mann told Antonio Regalado of the Wall Street Journal that he would not be “intimidated” into releasing source code for MBH98.
Here is an account of correspondence with Mann and with the U.S. National Science Foundation:
Even before publication of MM03, we politely requested clarification on issues in MBH98. This was a source of controversy in late 2003. Here is a record of correspondence with Mann which we made available some time ago.
After publication of MM03, Mann argued that MM03 contained an incorrect implementation of a stepwise principal components procedure (which was not documented in MBH98). Details of this procedure have continued to drift in, with the first listing of the number of PC series retained in each calculation step/tree ring network combination provided in the July 2004 Corrigendum SI. This listing was inconsistent with prior information.
In August 2004, through Nature, we became aware privately of claims that a variation of Preisendorfer’s Rule N had been used to determine the number of retained PC series. This claim was published in November 2004. (We have not been able to verify actual application of this criterion, as the actual numbers are impossible to replicate. See Was Preisendorfer’s Rule N Used?)
In any event, immediately after we learned of the previously undocumented stepwise procedure, we asked to inspect MBH98 source code so that we could completely reconcile results and avoid this type of dispute. Attached is our correspondence after MM03, which obviously cannot be construed as a form of “intimidation”, but as an entirely proper request.
Subsequently, we located some Fortran code at Mann’s FTP site for the calculation of tree ring principal components. Although this code is only a very small fraction of the total code, it contains a procedure which was materially misrepresented in MBH98 and which additionally is not statistically valid. We reported on this in MM05 (GRL) and MM05(EE).
As I’ve pointed out in various postings on Replication, it is impossible on the present record to replicate important steps in other parts of MBH98.
After our unsuccessful attempts at obtaining source code, we asked the U.S. National Science Foundation for assistance. This was also unsuccessful. The correspondence is here.
We also made attempts with Nature and I’ll get to describing this on another day.
Since the links are to the now-defunct website, I used the Wayback Machine to locate this page, which has a record of McIntyre’s correspondence with Mann etc.
Here’s the first email, dated April 8, 2003:
Dear Dr. Mann,
I have been studying MBH98 and 99. I located datasets for the 13 series used in 99 at ftp://eclogite.geo.umass.edu/pub/mann/ONLINE-PREPRINTS/Millennium/DATA/PROXIES/ (the convenience of the ftp: location being excellent) and was interested in locating similar information on the 112 proxies referred to in MBH98, as well as a listing (the listing at http://www.ngdc.noaa.gov/paleo/ei/data_supp.html is for 390 datasets, and I gather/presume that many of these listed datasets have been condensed into PCs, as mentioned in the paper itself).
Thank you for your attention.
Stephen McIntyre, Toronto, Canada
There is some back and forth between Rutherford and McI that you can see at the link.
Here is another interesting email from McI to Mann, dated September 9, 2003:
Dear Prof. Mann,
I have tried diligently to reconstruct your temperature principal components as described in MBH98, but without success, and would appreciate some assistance.
I downloaded hadcrut2.dat from CRU (July 2003 edition), truncated the data to 1902-1995 and further truncated it to the 1082 cells at gridpoints.loc and arranged as 1082 time-series with 1128 monthly readings. This step was successful as I could match your map of cell locations. I standardized each series to mean 0 and sd 1 for the period 1902-95. In MBH98, you say that you carried out “conventional” PCA, but there is so much missing data that conventional PCA failed when I tried. In particular, 4 cells had no values at all and I don’t see why they were included in your selection. Most PCA algorithms balk at missing data or exclude it. How did you deal with the extensive missing data?
I downloaded the EOFs, PCs and eigenvector loadings from ftp://eclogite.geo.umass.edu/pub/mann/MANNETAL98/EIGENVECTORS/ . I spliced the EOFs into a 16×1082 matrix and the PCs (pc01.out, etc.) into a 92×16 matrix. I made a diagonal of the first 16 values in column 2 of “tpca-eigenvals.out”, which look like eigenvalues, and carried out an expansion. I then deducted the grid-box values generated from this expansion from the Jones data as above; calculated variance for each year across available cells and made a sum, comparing this to the variance similarly calculated in the standardized Jones data. I obtained very low/much lower explained variance from this than you got. I also tried some experiments and it also doesn’t seem to me that the first 16 EOFs maximize explained variance, as they should. I would appreciate any assistance or clarification which you could give.
Regards, Steve McIntyre
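The consistency check McIntyre describes — standardize the grid-cell series, expand the truncated EOF/PC representation back into grid-box space, and compare residual variance to total variance — can be sketched in a few lines of NumPy. This is a toy illustration on synthetic data (the dimensions and values are made up; the real network had 1082 cells and the real PC files are pc01.out etc.), not the MBH98 computation itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the gridded temperature data:
# 94 years (1902-1995) x 50 cells (the real network had 1082 cells).
n_years, n_cells, n_eofs = 94, 50, 16
X = rng.standard_normal((n_years, n_cells))

# Standardize each cell series to mean 0, sd 1 over the calibration period.
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Conventional PCA via SVD of the standardized data matrix.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
pcs = U[:, :n_eofs]       # time series (analogous to the pc01.out files)
eofs = Vt[:n_eofs, :]     # spatial patterns (analogous to the EOF files)
singvals = s[:n_eofs]     # analogous to the tpca-eigenvals.out column

# Expand the truncated representation back into grid-box space,
# as in the splice-and-expand step described in the email.
X_hat = pcs @ np.diag(singvals) @ eofs

# Explained variance: 1 - residual variance / total variance.
resid = X - X_hat
explained = 1.0 - resid.var() / X.var()
print(f"explained variance of first {n_eofs} EOFs: {explained:.3f}")
```

With an SVD-based decomposition, the first 16 EOFs do maximize explained variance by construction, which is the property McIntyre says he could not verify from the archived files.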
He received no reply on this and so sent the following, dated September 25, 2003:
Dear Prof Mann
Here is the pcproxy.txt file sent to me last April by Scott Rutherford at your direction. It contains some missing data after 1971. Your 1998 paper does not describe how missing data in this period is treated and I wanted to verify that it is the correct file. How did you handle missing data in this period? In earlier periods, it looks like you changed the roster of proxies in each of the periods described in the Supplementary Information using only proxies available throughout the entire period. I have obtained quite close replication of the rpc1 in the 20th century by calculating coefficients for the proxies and then calculating the rpc’s using the minimization procedures described in MBH98 and the selection of PCs in the Supplementary Information. The reconstruction is less close in earlier periods. I also don’t understand the reasoning for reducing the roster of eigenvectors in earlier periods. The description in MBH98 was necessarily very terse and is still very terse in the Supplementary Information; is there any more detailed description of the reconstruction methodology to help me resolve this? Thank you for your attention.
The response, on the same day:
Dear Mr. McIntyre,
A few of the series terminate prior to the nominal 1980 termination date of the calibration period (the earliest such instance, as you note, is 1971). In such cases, the data were continued to the 1980 boundary by persistence of the final available value. These details, in fact, were provided in the supplementary information that accompanied the Nature article. That information is available here (see first paragraph):
The results, incidentally, are insensitive to this step; essentially the same reconstruction is achieved if a calibration period terminating in 1970 (prior to the termination of any of the proxy series) was used instead.
Owing to numerous demands on my time, I will not be able to respond to further inquiries.
Other researchers have successfully implemented our methodology based on the information provided in our articles [see e.g. Zorita, E., F. Gonzalez-Rouco, and S. Legutke, Testing the Mann et al. (1998) approach to paleoclimate reconstructions in the context of a 1000-yr control simulation with the ECHO-G Coupled Climate Model, J. Climate, 16, 1378-1390, 2003.]. I trust, therefore, that you will find (as in this case) that all necessary details are provided in the papers we have published or the supplementary information links provided by those papers.
Best of luck with your work.
Michael E. Mann
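For what it’s worth, the “persistence of the final available value” step Mann describes is just a forward-fill of the last observation through the calibration boundary. A minimal sketch, with a made-up annual series ending in 1971 (the values and date grid are purely illustrative):

```python
import numpy as np

# Hypothetical proxy series on an annual 1900-1980 grid;
# np.nan marks the missing closing years after 1971.
years = np.arange(1900, 1981)
proxy = np.where(years <= 1971, np.sin(years / 10.0), np.nan)

# "Persistence of the final available value": forward-fill the last
# observed value through the 1980 calibration boundary.
last = np.flatnonzero(~np.isnan(proxy))[-1]   # index of 1971
filled = proxy.copy()
filled[last + 1:] = proxy[last]
```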
So according to Mann, Zorita et al. successfully reconstructed the MBH methodology.
Here’s the abstract:
Statistical reconstructions of past climate variability based on climate indicators face several uncertainties: for instance, to what extent is the network of available proxy indicators dense enough for a meaningful estimation of past global temperatures?; can statistical models, calibrated with data at interannual timescales be used to estimate the low-frequency variability of the past climate?; and what is the influence of the limited spatial coverage of the instrumental records used to calibrate the statistical models? Possible answers to these questions are searched by applying the statistical method of Mann et al. to a long control climate simulation as a climate surrogate. The role of the proxy indicators is played by the temperature simulated by the model at selected grid points.
It is found that generally a set of a few tens of climate indicators is enough to provide a meaningful estimation (resolved variance of about 30%) of the simulated global annual temperature at annual timescales. The reconstructions based on around 10 indicators are barely able to resolve 10% of the temperature variance. The skill of the regression model increases at lower frequencies, so that at timescales longer than 20 yr the explained variance may reach 65%. However, the reconstructions tend to underestimate some periods of global cooling that are associated with temperatures anomalies off the Antarctic coast and south of Greenland lasting for about 20 yr. Also, it is found that in one 100-yr period, the low-frequency behavior of the global temperature evolution is not well reproduced, the error being probably related to tropical dynamics.
This analysis could be influenced by the lack of a realistic variability of external forcing in the simulation and also by the quality of simulated key variability modes, such as ENSO. Both factors can affect the large-scale coherence of the temperature field and, therefore, the skill of the statistical models.
Perhaps people can help me out here:
McI makes a request to Mann for data so he can do his ‘study’ on MBH98/99. Mann refers McI to his associate, Rutherford, who can’t locate the FTP site with the data and so sends McI an Excel file. As a result, McI cannot replicate the results, so he asks for more assistance in working out the methodology. Mann points him to Zorita et al., who he argues successfully reproduced the methodology in their climate modeling.
It is only after M&M03 that Mann responds that the Excel data was erroneous. Mann’s position, then, was that the M&M03 findings were wrong because they used the wrong data.
Here’s the original paper: Corrections to the Mann et al. (1998) Proxy Database and Northern Hemispheric Average Temperature Series
Here’s the abstract:
The data set of proxies of past climate used in Mann, Bradley and Hughes (1998, “MBH98” hereafter) for the estimation of temperatures from 1400 to 1980 contains collation errors, unjustifiable truncation or extrapolation of source data, obsolete data, geographical location errors, incorrect calculation of principal components and other quality control defects. We detail these errors and defects. We then apply MBH98 methodology to the construction of a Northern Hemisphere average temperature index for the 1400-1980 period, using corrected and updated source data. The major finding is that the values in the early 15th century exceed any values in the 20th century. The particular “hockey stick” shape derived in the MBH98 proxy construction – a temperature index that decreases slightly between the early 15th century and early 20th century and then increases dramatically up to 1980 — is primarily an artefact of poor data handling, obsolete data and incorrect calculation of principal components.
Here’s the second E&E paper: The M&M Critique of the MBH98 Northern Hemisphere Climate Index: Update and Implications:
Here’s the abstract:
The differences between the results of McIntyre and McKitrick and Mann et al. can be reconciled by only two series: the Gaspé cedar ring width series and the first principal component (PC1) from the North American tree ring network. We show that in each case MBH98 methodology differed from what was stated in print and the differences resulted in lower early 15th century index values.
In the case of the North American PC1, MBH98 modified the PC algorithm so that the calculation was no longer centered, but claimed that the calculation was “conventional”. The modification caused the PC1 to be dominated by a subset of bristlecone pine ring width series which are widely doubted to be reliable temperature proxies. In the case of the Gaspé cedars, MBH98 did not use archived data, but made an extrapolation, unique within the corpus of over 350 series, and misrepresented the start date of the series. The recent Corrigendum by Mann et al. denied that these differences between the stated methods and actual methods have any effect, a claim we show is false. We also refute the various arguments by Mann et al. purporting to salvage their reconstruction, including their claims of robustness and statistical skill. Finally, we comment on several policy issues arising from this controversy: the lack of consistent requirements for disclosure of data and methods in paleoclimate journals, and the need to recognize the limitations of journal peer review as a quality control standard when scientific studies are used for public policy.
Them’s fightin words.
Here’s the Corrigendum:
I’ll post some commentary from other observers later.
So, the outstanding issues, according to M&M, include identifying the correct data used for MBH98/99. There is a list of questions here:
Here they are:
QUESTIONS FOR PROFESSORS MANN, BRADLEY AND HUGHES THAT ARISE FROM THIS ANALYSIS.
These questions summarize the results of our audit of the data set. Answers to these questions are required to settle the contradiction between the original and corrected results.
1. Does the database contain truncations of series 10, 11 and 100? (and of the version of series 65 used by MBH98)?
2. Are the 1980 values of series #73 through #80 identical to 7 decimal places? Similarly for the 1980 values of series #81-83? And for the 1980 values of series #84 and #90-92? What is the reason for this?
3. Where are the calculations of principal components for series in the range #73-92 that would show that these have been collated into the correct year? Do you have any working papers that show these, and if so, would you make them FTP or otherwise publicly available?
4. Do the following series contain “fills”: #3, #6, #45, #46, #50-#52, #54-#56, #58, #93-#99?
5. How did you deal with missing closing data in the following series: #11, #102, #103, #104, #106 and #112?
6. What is the source for your data for series #37 (precipitation in grid-box 42.5N, 72.5W)? Did you use the data from Jones-Bradley Paris, France and if so, in which series? More generally, please provide, identifications of the exact Jones-Bradley locations for each of the series #21-42. Where are the original source data?
7. Did you use summer (JJA) data for series #10 and #11 rather than annual data? If so, why?
8. Does your dataset contain obsolete data for the following series: #1, #2, #3, #6, #7, #8, #9, #21, #23, #27, #28, #30, #35, #37, #43, #51, #52, #54, #55, #56, #58, #65, #105 and #112?
9. Do you use the following listed proxies: fran003, ital015, ital015x, spai026 and spai047? If so, where?
10. Did you commence your calculation of principal components after the period in which all dataset members were available for the following series: #69-71, #91-92, #93-95, #96-99?
11. What is the basis for inclusion of some tree ring sites within a region in regional principal component calculations and others as individual dataset components?
12. Did you commence your calculation of principal components before the period in which all dataset members were available for the following series: #72-80, #84-90? If so, please describe your methodology for carrying out these calculations in the presence of missing data and your justification for doing so.
13. What is the explained variance under your principal component calculation for the period of availability of all members of your selected dataset? Would you please make your working papers that show this FTP or otherwise publicly available?
I encourage readers who are in the know to comment or provide links to analysis of these questions. I’ll try to post a response from MBH when I have some time.
From my limited read of the literature surrounding all this, it seems to me that the HS controversy is limited to a few main issues:
1. PC Analysis — short centered vs. conventional PC analysis
2. Data used — which data set was used, and how were missing data filled in.
3. Use of certain proxies, such as bristlecone pines (BCPs) and others.
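To make issue 1 concrete, here is a toy NumPy sketch of the difference between conventional (full-period) centering and MBH98-style short centering on the calibration period. The data are synthetic and the additional normalization steps of the actual MBH98 algorithm (e.g. detrended-variance scaling) are omitted, so this is only an illustration of the centering choice itself:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic tree-ring network: 581 years (1400-1980) x 20 series.
n_years, n_series = 581, 20
X = rng.standard_normal((n_years, n_series))

def pc1(data, center_rows):
    """First principal component after centering each series on its
    mean over the rows selected by `center_rows`."""
    centered = data - data[center_rows].mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[0]

# Conventional PCA: center on the full 1400-1980 period.
pc1_conv = pc1(X, slice(None))

# Short-centered (MBH98-style): center on the 1902-1980 calibration
# period only -- the last 79 rows here.
pc1_short = pc1(X, slice(-79, None))
```

With short centering, any series whose calibration-period mean departs from its long-term mean retains a nonzero offset over the rest of the record, so such series load heavily on PC1 — which is the mechanism behind the bristlecone-pine dominance that M&M describe.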