I like to "see" data in a disaggregated fashion. I have been a little perplexed about the adjustments in the GHCN-v2 data set, so I decided to look at the frequency and nature of adjustments for every observation in the v2.mean_adj file by plotting pie charts for each month.
There are six types of adjustments:
- Positive (red): The mean temperature in v2.mean_adj is greater than the mean temperature in v2.mean.
- Negative (blue): The mean temperature in v2.mean_adj is smaller than the mean temperature in v2.mean.
- Zero (white): The mean temperature in v2.mean_adj is the same as the mean temperature in v2.mean.
- Missing value to non-missing (white): The mean temperature is missing (-9999) in v2.mean, but there is a non-missing value in v2.mean_adj.
- Non-missing to missing (white): There is a non-missing value in v2.mean, but the corresponding value in v2.mean_adj is missing.
- Missing to missing (white): An observation is missing in both files.
Finally, there are a bunch of observations in v2.mean for which there is no entry in v2.mean_adj. I excluded those observations when plotting these charts.
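For readers who want to try something similar, here is a minimal Python sketch of the classification described above. It is not the code I used for the charts. The function names, the category labels, and the idea of keying observations by (station_id, year, month) are assumptions made for illustration; only the -9999 missing-value marker comes from the files themselves. The sketch also assumes that an entry present in v2.mean_adj but absent from v2.mean should count as missing in v2.mean.

```python
MISSING = -9999  # missing-value marker used in v2.mean and v2.mean_adj


def classify(raw, adj):
    """Return the adjustment type for one (v2.mean, v2.mean_adj) value pair."""
    if raw == MISSING and adj == MISSING:
        return "missing_to_missing"
    if raw == MISSING:
        return "missing_to_nonmissing"
    if adj == MISSING:
        return "nonmissing_to_missing"
    if adj > raw:
        return "positive"
    if adj < raw:
        return "negative"
    return "zero"


def classify_all(raw_obs, adj_obs):
    """Classify every observation that has an entry in adj_obs.

    Both arguments are dicts mapping a hypothetical key such as
    (station_id, year, month) to an integer temperature value.
    Keys that appear only in raw_obs are skipped, matching the
    exclusion described above. Keys that appear only in adj_obs
    are treated as missing in the raw file (an assumption).
    """
    return {
        key: classify(raw_obs.get(key, MISSING), adj)
        for key, adj in adj_obs.items()
    }
```

Iterating over the adjusted observations, rather than the raw ones, is what drops the v2.mean entries that have no counterpart in v2.mean_adj.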
Here is the result:
Pay particular attention to the 1980s and the 1990s.
As far as I can tell, the complete lack of positive or negative adjustments after March 2006 is not a programming error on my part.
Caveat: These charts say nothing about the magnitude of adjustments. They just display relative frequency by type.
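To make "relative frequency by type" concrete, here is a rough Python sketch of the per-month tally behind each pie chart. Again, this is an illustration and not my actual code: the function name, the (station_id, year, month) key layout, and the category labels are all assumed.

```python
from collections import Counter, defaultdict


def monthly_shares(categories):
    """Compute the share of each adjustment type for every (year, month).

    categories: dict mapping a hypothetical (station_id, year, month)
    key to a category label such as "positive" or "zero".
    Returns a dict mapping (year, month) to {label: fraction}.
    """
    counts = defaultdict(Counter)
    for (_station, year, month), label in categories.items():
        counts[(year, month)][label] += 1

    shares = {}
    for year_month, counter in counts.items():
        total = sum(counter.values())
        shares[year_month] = {label: n / total for label, n in counter.items()}
    return shares
```

Each inner dict holds the slice sizes for one month's pie. Because only counts enter the calculation, a 0.1 degree adjustment and a 2 degree adjustment weigh the same, which is exactly the caveat above.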
Note: Both v2.mean and v2.mean_adj used to produce this animation were dated 2010/10/15.

WOW! Sinan, I also like to look at things visually and I have had a very different way of looking at adjustments that I have not yet put up on the blog. In part this is because I have to be sure of what my analysis has been telling me. Yours is quite different and powerful because you are able to look at the individual months.
@VJones: One of the reasons I am putting these up is I always have that nagging feeling that I might be doing something wrong. I would not really call what I am doing analysis ;-) Just running some queries and seeing what comes out. It's been a few years since I last looked at this stuff.
@VJones: By the way, I would love to see how you are visualizing these adjustments. One of these days, I am going to do an empirical CDF for each month but that is so much more time consuming (both machine and programmer).