First, some background: On August 6, 2012, NASA issued a press release titled "Research Links Extreme Summer Heat Events to Global Warming" (I saw it thanks to Anthony Watts). Accompanying the article is an animation that purports to show the density curve of the temperature distribution shifting to the right and developing fatter tails (i.e., temperatures are higher, and extremely high temperatures are more likely) as time passes.
Here is the video:
Now, from an earlier examination of the GHCNv2 data set (incidentally, the data set the authors used), I know that the number of temperature stations in it has been declining over time. Their density curve shows the frequency of various temperatures as a percentage of all temperatures in the data set, but, since the set of temperature stations in the data set is not constant, this is misleading. (To see why, think about the meaning of the statement "50% of the people in the sample had a million dollars" when the sample consists of you and Bill Gates versus when it consists of 1,000 randomly chosen adults in the U.S.)
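To make the percentage-versus-count point concrete, here is a small Python sketch. The wealth figures are made up purely for illustration, and the function names are my own; nothing here comes from the NASA analysis or the GHCN data.

```python
def count_at_least(values, threshold):
    """Number of values that are at least `threshold`."""
    return sum(1 for v in values if v >= threshold)


def share_at_least(values, threshold):
    """Fraction of values that are at least `threshold`."""
    return count_at_least(values, threshold) / len(values)


# Hypothetical samples (illustrative numbers only).
small_sample = [50_000, 1_000_000_000]             # "you and Bill Gates"
large_sample = [50_000] * 990 + [2_000_000] * 10   # 1,000 hypothetical adults

for name, sample in [("small", small_sample), ("large", large_sample)]:
    print(name,
          "count:", count_at_least(sample, 1_000_000),
          "share:", share_at_least(sample, 1_000_000))
```

The small sample reports a far larger share of millionaires even though it contains fewer of them. A share is only comparable across samples when the samples themselves are comparable, which is the concern with a changing set of stations.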
Plus, while I think I understand the purpose of using anomalies rather than temperature levels (to remove within-station variability that is assumed to be constant), I am always very uncomfortable with the arbitrarily chosen sub-periods with respect to which those anomalies are calculated.
In short, I like to see the entire data set first, before putting it on the chopping block.
How can I do that in this case?
Well, at first I thought I would download the most up-to-date GHCNv2 data set and check the distribution of the number of observations in the entire data set for every year and month. For some reason, though, the GHCNv2 data sets are no longer available.
I do have the data set from 2010 somewhere on an external hard drive, but there was really no reason to go hunting for that, so I decided to download and check out the GHCNv3 data set instead. So, I downloaded the quality control adjusted data set
ghcnm.tavg.v18.104.22.16820809.qca.dat. I wrote a quick Perl script to extract the nonmissing observations and put the data in an SQLite database.
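I am not reproducing the Perl script here, but the extraction step can be sketched in Python along the following lines. This is a minimal sketch under assumptions: it assumes the fixed-width layout I recall from the GHCN-M v3 README (an 11-character station ID, a 4-digit year, a 4-character element code, then twelve 8-character groups consisting of a 5-character value in hundredths of a degree Celsius plus three flag characters, with -9999 marking a missing value). The table name obs, its columns, and the function names are hypothetical.

```python
import sqlite3

MISSING = -9999  # assumed sentinel for a missing monthly value


def parse_line(line):
    """Return (station_id, year, month, temp_c) tuples for nonmissing months."""
    station_id = line[0:11]
    year = int(line[11:15])
    rows = []
    for month in range(12):
        start = 19 + 8 * month  # 5-char value followed by 3 flag chars
        value = int(line[start:start + 5])
        if value != MISSING:
            # Values are assumed to be stored in hundredths of a degree C.
            rows.append((station_id, year, month + 1, value / 100.0))
    return rows


def load(lines, conn):
    """Insert all nonmissing observations from `lines` into table obs."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS obs "
        "(station_id TEXT, year INTEGER, month INTEGER, temp_c REAL)"
    )
    for line in lines:
        if line.strip():  # skip blank lines
            conn.executemany(
                "INSERT INTO obs VALUES (?, ?, ?, ?)", parse_line(line)
            )
    conn.commit()
```

With the data loaded, the number of observations for every year and month is a single query: SELECT year, month, COUNT(*) FROM obs GROUP BY year, month.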
First, let's look at the distribution of Northern hemisphere temperatures. Keep in mind that the heights of the bars correspond to the number of observations in that bin for that particular time period (as opposed to percent of observations):
Northern Hemisphere Temperatures 1850 - 2012
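For readers who want to reproduce count-based bars like these, here is a rough Python sketch of the binning. The 1-degree bin width and the function name are assumptions for illustration, not necessarily what was used for the charts.

```python
import math
from collections import Counter


def bin_counts(temps_c, bin_width=1.0):
    """Map each bin's lower edge to the number of observations in that bin."""
    counts = Counter()
    for t in temps_c:
        lower_edge = math.floor(t / bin_width) * bin_width
        counts[lower_edge] += 1
    return dict(sorted(counts.items()))
```

Because the result holds raw counts rather than percentages, a period with fewer reporting stations simply produces shorter bars instead of being rescaled to look like any other period.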
Now, this whole back and forth between winter and summer months makes it somewhat harder to follow what's happening. So let's look at June, July, and August distributions through the years (again, Northern hemisphere only):
Northern Hemisphere June Temperatures 1850 - 2012
Northern Hemisphere July Temperatures 1850 - 2012
Northern Hemisphere August Temperatures 1850 - 2012
I am not able to see any substantial shift. What I do see are changes in the number of observations across time periods. Sure, there is some variability in the location of the bars, but nothing to write home about.