Tuesday, November 24, 2009

On "tricks" and science

Are people really claiming that the word "trick", when used by climate scientists to describe a data analysis technique via a mailing list for other climate scientists, indicates some nefarious conspiracy of deception? Why yes, they are, and Acephalous has the rebuttal:

Global warming skeptics are attacking climate scientist Phil Jones for encouraging trickery in an email recently stolen off the webmail server at the University of East Anglia in which he wrote:

I've just completed Mike's Nature trick of adding in the real temps to each series for the last 20 years (ie from 1981 onwards) and from 1961 for Keith's to hide the decline.

Over at RealClimate, the skeptical response to the word "trick" is to treat it as a colloquialism:

“a cunning or deceitful action or device: ‘he played a trick on me’; ‘he pulled a fast one and got away with it’”
“something designed to fool or swindle”
“flim-flam: deceive somebody: ‘We tricked the teacher into thinking that class would be cancelled next week’”

. . .

Schmidt obliges:

. . . It's mostly used in mathematics, for instance in decomposing partial fractions, or deciding whether a number is divisible by 9 etc.etc.etc.
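
The divisibility-by-9 example Schmidt mentions is the familiar digit-sum test. A minimal sketch (the function name is mine):

```python
def divisible_by_9(n: int) -> bool:
    """The digit-sum "trick": n is divisible by 9 iff its digit sum is.

    It works because 10 is congruent to 1 (mod 9), so each decimal
    digit contributes its own value mod 9.
    """
    n = abs(n)
    while n >= 10:
        n = sum(int(d) for d in str(n))
    return n in (0, 9)
```

Nothing cunning or deceitful here; just a shortcut that exploits structure.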

The skeptics' rejoinder:

This is nonsense. Both are examples of teaching or explaining concepts to lay people. The first intentionally places “tricks” in quotation marks to emphasize its non-technical use.

The problem with nonspecialists reading the private correspondence of experts is that their ignorance transforms all the technical points into nefarious inkblots. To continue with the example above, skeptical nonspecialists encounter the word "trick" and ask for clarification. Schmidt provides evidence that the word is innocuous, but because nonspecialists can interpret neither the context of the original nor that of the further examples, they redouble their efforts: now the rhetorical situation in which the word "trick" is uttered matters; now the appearance of quotation marks matters, etc. They are convincing themselves that those black blobs represent what they insist they represent, and when experts inform them that those are not Rorschach blots to be subjectively interpreted—that they are, in fact, statements written in a language that skeptics simply do not understand—the nonspecialists look over them again and declare that it could be a butterfly, or maybe a bat.

For the programmers out there, this is a little like finding a comment in some random piece of program source code

// Hack: just disassemble the whole tree for now

and concluding that the author of the software is attempting to "hack" your password.

The word "trick", like the word "hack", is a term of art with an esoteric meaning different from its lay meaning. And no, "trick" does not denote either a deception or a way of "teaching concepts to lay people". See, for example, the tricks on Terence Tao's blog. Can anybody seriously believe that this amplification trick is "a way of explaining math to laypeople"? Well, I guess they can, if they're shooting their mouths off without having the least fucking clue what they're talking about.

So, OK, what does trick mean? Well, C. Shalizi's review of Tao's book gives a reasonable definition:

Tao’s third theme is tricks: patterns of establishing results that replicate across many situations, but in which any one result is too small to be a theorem in its own right, while the general pattern is too vague. These are an important part of how math actually gets done, but by their nature they tend not to have a recognized place in the curriculum, getting passed down by oral tradition, or by being absorbed by those who are lucky enough not only to run across a paper using the trick, but also to guess that it will generalize. There are numerous tricks throughout the book, and one of the nicest chapters, 1.9, expounds a family of tricks for improving inequalities, which Tao calls amplification.
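
To make the flavor concrete, here is the standard warm-up instance of amplification (my illustration, not Shalizi's or Tao's): deriving Cauchy–Schwarz from a crude bound plus scaling.

```latex
% Start from the crude bound, obtained by expanding 0 \le \|x - y\|^2:
\[
\langle x, y\rangle \;\le\; \tfrac{1}{2}\|x\|^2 + \tfrac{1}{2}\|y\|^2 .
\]
% The left side is linear in y but the right side is not, so
% substitute y \mapsto \lambda y and divide through by \lambda > 0:
\[
\langle x, y\rangle \;\le\; \frac{\|x\|^2}{2\lambda} + \frac{\lambda\|y\|^2}{2} .
\]
% Choosing the optimal \lambda = \|x\|/\|y\| "amplifies" the crude
% bound into the sharp Cauchy--Schwarz inequality:
\[
\langle x, y\rangle \;\le\; \|x\|\,\|y\| .
\]
```

Note what happened: a weak, easy inequality was upgraded into a sharp one by exploiting a symmetry the weak inequality lacked. That pattern, not any deception, is what "trick" means here.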

Although, to be more precise, the climate scientists in question do not seem to be using the word "trick" in its narrowest mathematical sense, but rather in the more general sense of a useful technique — however, again, usually one too small to be publishable on its own [0] [1].

Incidentally, this sort of irresponsible misreading shows how little most climate change "skeptics" on the Internet know about math or science. Why the heck would I lend credence to people who don't know jack about scientific practice, and cannot be bothered to learn, and still feel comfortable dismissing thousands of working scientists as frauds? And especially why would I believe them over scientific professionals who have dedicated their lives to studying the data?

I mean, yesterday I got a comment on my previous post on climate change denialism saying:

Models that can exhibit large errors tend to exhibit them in the direction that the modellers would prefer. That's why real sciences use double-blind testing.

Anyone who doesn't know what results climate "scientists" are looking for isn't paying attention.

"Real" sciences use "double-blind testing"? O RLY? No wai! Blind testing, of either the single- or double- variety, is irrelevant to most experiments in the natural sciences. I would like to know what "double-blind" even means for a microarray assay; are we not informing the microscopic dots of DNA whether we've hybridized them or not? Are we getting them to sign little microscopic release forms and giving them little microscopic placebo pills?

If most researchers in the natural sciences even had to single-blind every experiment they do, they'd never get anything done. Natural science labwork is often incredibly laborious and grad students do not have time to, like, close their eyes and spin a little roulette wheel of test tubes every time they pipette a drop of reagent. The usual way to weed out observer bias is by (1) designing experimental methods which are relatively robust to observer influence; (2) repeating your experiments; and (3) describing a procedure in sufficient detail for other sufficiently trained people to reproduce the result. Blind experimentation is reserved for certain types of experiments where observer or subject bias is especially dangerous or probable.

As for the email the commenter links to, I find it hard to see anything suspicious about it. Here's an excerpt:

The Soon & Baliunas paper couldn't have cleared a 'legitimate' peer review process anywhere. That leaves only one possibility--that the peer-review process at Climate Research has been hijacked by a few skeptics on the editorial board.

. . .

[In quoted reply:] I looked briefly at the paper last night and it is appalling - worst word I can think of today without the mood pepper appearing on the email ! . . . The phrasing of the questions at the start of the paper determine the answer they get. They have no idea what multiproxy averaging does. By their logic, I could argue 1998 wasn't the warmest year globally, because it wasn't the warmest everywhere. With their LIA being 1300-1900 and their MWP 800-1300, there appears (at my quick first reading) no discussion of synchroneity of the cool/warm periods. Even with the instrumental record, the early and late 20th century warming periods are only significant locally at between 10-20% of grid boxes.

I'll freely admit that I don't know what the jargon means, but "they have no idea what X does" is not a phrasing I'd ever have wanted to read in a review of a paper of mine. This is not prima facie evidence of anything but that some journal published a bad paper.
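
For what it's worth, the "1998 wasn't the warmest everywhere" complaint in the email is easy to illustrate. Here is a toy sketch (entirely mine, and in no way the actual multiproxy methodology): twenty synthetic "proxy" series share a common signal that peaks in 1998, plus large local noise. Individually, many series will not peak in 1998; their average does.

```python
import random

random.seed(42)
YEARS = range(1990, 2000)

# Toy "common signal": a mild warming trend with a spike in 1998.
signal = {y: 0.02 * (y - 1990) + (0.3 if y == 1998 else 0.0) for y in YEARS}

# Twenty synthetic "proxies": the common signal plus large local noise.
proxies = [
    {y: signal[y] + random.gauss(0, 0.2) for y in YEARS}
    for _ in range(20)
]

# Individually, a given proxy may well NOT peak in 1998...
warmest_per_proxy = [max(p, key=p.get) for p in proxies]

# ...but averaging across proxies beats the local noise down
# (by roughly a factor of sqrt(20)), recovering the common signal.
average = {y: sum(p[y] for p in proxies) / len(proxies) for y in YEARS}
warmest_overall = max(average, key=average.get)
```

So "it wasn't the warmest everywhere" simply does not contradict "it was the warmest globally", which is precisely the logical error the reviewer is complaining about.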

And subversion of publication venues by cranks is actually an ever-present danger in the sciences. Do climate change denialists really believe that concern over a journal's publishing trash is evidence of a groupthink conspiracy? No doubt they'll be taking up the heroic cause of M. S. El Naschie next.

[0] The notion of "trick" seems loosely related to what the C.S. community calls a "pearl", except that pearls are perhaps more rigidly described, and computer scientists publish them in some venues regardless (e.g. ICFP). Truthfully, the "unpublishability" of tricks and other semi-formal knowledge seems like a flaw in scientific publishing, albeit one that blogs and other Internet-based venues may be correcting.

[1] It's not even unheard-of for laypeople to use the word "trick" in roughly this sense. See: Clemenza's recipe from The Godfather. I look forward to climate skeptics' exegesis of the diabolical agenda behind Clemenza's spaghetti sauce.


  1. There was a comment asking why a sceptic would expect errors in only one direction. I felt a bit guilty about replying to a weeks-old comment on someone else's blog, and so I cut my argument down too far, to the point of inaccuracy. Since you bring it up, I am now replying to a current blog, not a stale comment, so I feel more confident in being long-winded, and, I hope, more accurate.

    The reason for my bringing up double-blind testing, as used in medical research, is that it is an acknowledgement that the expectations of honest and competent researchers can influence the results they find. The ideal is that results are obtained without prior knowledge of what they "should" be. This is difficult to achieve, but it is always beneficial if it can be done.

    The email I linked was not meant to show any wrongdoing, only to show that the researchers in this field do have a clear idea of what results they want to find, and what results they don't. Since I was claiming that that was significant, I also needed to show what it was they wanted. Of course, the same is true of most scientists.

    So far as I have seen, none of the emails show misconduct by the scientists. One or two look a bit fishy, but if you can't find a bit of stuff that looks fishy in ten years of anyone's mail archives, you're not looking properly.

    The thing that does, I think, answer your original question as to the prevalence of climate scepticism among I.T. professionals is the HARRY_README file written by the programmer who produced the HadCRUT temperature history. Again, there is nothing in it that suggests dishonesty or incompetence, and nothing that I find particularly surprising. All the problems they faced - incompatible data sets, inconsistent data sets, code written by departed programmers doing things they don't understand, corruption introduced by format conversions, ad-hoc fixes to cope with missing or corrupted data, mysterious factor-of-ten discrepancies, struggling with inappropriate out-of-date programming languages, success defined as getting data out at the end that "looks right" after nights and weekends of failure - they are all things I have seen and done myself, and I am not ashamed of them. You only have to think about how you would go about creating a global temperature record covering a hundred years of history to realise that all these difficulties would occur. However, it is a far cry from what anyone would imagine whose idea of scientific software came from watching CSI: clean data into clean software with pretty graphical interfaces, and the answer pops up.

    That's why you might expect someone who's seen inside the sausage factory of data analysis to be less keen on the sausages.

  2. I think the word "trick" is really the least of the issues around this controversy, and is not something with which I have a particular problem. In that particular email, the phrase "hide the decline" - even in context - implies that the so-called trick is meant to hide the genuine data trend. Even if an argument can be made that there is no such implication, it is hardly reasonable to expect others not to draw that inference for themselves.

    But even all that is secondary to, and a distraction from, the bigger issues: implications of unethical and illegal activity relating to obstruction of the FoI process, possible destruction of FoI-requested data, collusion to exert control over the editorial process of peer-reviewed literature and exclude contrary views, undermining skeptical scientists on political rather than technical grounds (and in one case cheering the death of an opponent), and outright falsification of data. Even confirmed believers such as George Monbiot can admit the CRU documents raise serious questions, so there must be something there.

  3. Don't worry. MINITRU will be along shortly to make all of this go away.

    I predict the full weight and authority of every national government that could even conceivably claim jurisdiction will be brought to bear here, and an example to endure for generations will be made of whoever hacked the server and brought us the truth.

    Those who candidly discuss falsifying research data and destroying data requested under FOIA will get off scot-free, keep their jobs, and continue producing leftist agitprop and calling it objective scientific facts. The leftist newsmedia will carefully ignore the revelations of falsified data and continue spoonfeeding their agitprop to the public and calling it truth.

  4. AMcGuinn: "Double-blind testing" is not always beneficial if it can be done. It is, as I said, irrelevant to most natural science experiments, as blinding the subject is literally meaningless in the vast majority of cases. How do you "blind" a non-sentient subject?

    Furthermore, the email linked does not show that scientists "have a clear idea what results they want to find"; it shows that when bad science is done, they call it out. Reread the passage I quoted; they're complaining about methodological errors ("They have no idea what multiproxy averaging does") and logical errors ("By their logic, I could argue..."). They are not complaining about the results except insofar as the results reflect shoddy work. The discussion of "skeptics" taking over the board is simply a forensic investigation into how such shoddy work got published.

    As for the rest of this CRU nonsense, I really don't have the time or the energy to rebut every accusation. If you care, see the first subthread on this reddit post and the realclimate.org thread.

    For me, the highest-order bit is simply that CRU has 19 staff, out of thousands of climate scientists in the world, and whatever the outcome of the current media carnival, it says almost nothing about the broader enterprise of climate science. The next-highest order bit is that the climate denialists are, by and large, utterly clueless about science; and between the utterly clueless and the well-informed but humanly flawed, I know where I'm putting my chips.

    Finally, "Anonymous", pro tip: if you want anyone to take your wild predictions seriously, then sign with your handle so that we can laugh at you when they don't come true.