Monday, December 27, 2004

C. Shalizi on crowds and algorithms; also further complaints from yours truly

C. Shalizi has many useful things to say, and to link, about "the wisdom of crowds", partly in response to something I posted a while back.

Shalizi includes a pointer to Rational Herds: Economic Models of Social Learning (ISBN 052153092X). Aside from having really cute penguins on the cover --- reason enough to buy most books --- the book also looks intellectually fascinating, and instantly makes my to-read list, though with my recent binge of book-buying [0] I most likely won't get around to reading it anytime soon.

In related news, I actually read/skimmed large chunks of The Wisdom of Crowds whilst browsing during the aforementioned book-buying binge. I concluded that the book itself (as opposed to the publicity, or the vulgarized versions of Surowiecki's thesis that are making the rounds) is not exactly bad, but rather good, yet frustrating. Surowiecki's tackling an important subject. He writes with the fluency and accessibility you'd expect from a New Yorker writer. The book recounts many fascinating anecdotes, and it even lays out a set of criteria for organizing "wise crowds" that's sensible and convincing (though stated too vaguely for my tastes). But these strengths make the book's failures all the more disappointing. Each chapter contains at least a few things that get my ersatz-scientist hackles up: an overgeneralization from meager data, or an incomplete and vague summary of a more systematic study, or an example cherry-picked to support his point without adequate treatment of counterexamples [1]. The best ideas in Surowiecki's book aren't new, and the intellectual frame he puts around them often adds little [2]. Lastly, and perhaps most importantly, as Shalizi writes in the post linked above, although Surowiecki does give a nod to the difficulties of crowd organization, in general he does not place enough emphasis on it.

My guess, therefore, is that readers genuinely interested in the ideas Surowiecki discusses would be better off reading the primary sources listed in Surowiecki's acknowledgments. I don't have a copy handy, and I regrettably forgot to scribble them down. Oh well. Next time I'm in a bookstore...

Bonus link: Radio National interview with Surowiecki.

[0] At the Cherry Creek Tattered Cover in Denver, last week, while visiting a friend; the bargain shelves should be labeled with warnings for compulsive verbivores.

[1] For example, one form of "crowd wisdom" that Surowiecki returns to several times is the fact that groups of people appear, in aggregate, to be very good at estimating quantities. One of Surowiecki's stories in support of this claim: in 1906, the statistician Francis Galton found that a crowd of people at a livestock fair was collectively able to estimate the weight of a thousand-pound-plus ox to within one pound, better than any individual in the crowd. He has a few more examples in this vein, but almost no discussion of the abundant counterexamples. For example, experiments show that, on average, people consistently overestimate the height of men and underestimate the height of women, even when they're shown photographs of the subjects standing next to common reference points. Surely a trained surveyor would do much better than a crowd in this case. Surowiecki briefly mentions some studies wherein experimenters were able to skew estimation results by using explicit suggestion, but he ignores systematic, consistent, a priori bias --- which gives the reader the impression that estimation bias is something induced only in relatively rare and peculiar circumstances.

[2] Returning to the collective estimation problem in the previous footnote: the success of averaged estimates would lead me to conclude that the human senses can measure accurately, but with a random error that follows a symmetric (Gaussian?) distribution. This is interesting, but it says little about the "wisdom of crowds". Instead, it testifies to the value of repeated measurement, a bog-standard part of scientific orthodoxy. You will get similar results with inanimate scientific instruments (e.g., a thermometer or a light-sensitive CCD) operating near the limits of their precision: measure many times, and you get a better, rounder bell curve than if you measure only a couple of times. Surowiecki's framing seems simply superfluous here.
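The repeated-measurement point is easy to demonstrate numerically. The following is a toy sketch of my own (the noise level is invented; only the ox's weight comes from the Galton anecdote): average n independent, unbiased noisy guesses and the error of the average shrinks roughly as 1/sqrt(n), whether the "instruments" are fairgoers or thermometers.

```python
import random
import statistics

random.seed(42)

TRUE_WEIGHT = 1198.0  # pounds; the ox in Galton's anecdote

def measure(noise_sd=75.0):
    """One noisy but unbiased 'guess' at the true weight."""
    return random.gauss(TRUE_WEIGHT, noise_sd)

def mean_abs_error(n, trials=200):
    """Average absolute error of the mean of n independent guesses."""
    errs = [abs(statistics.mean(measure() for _ in range(n)) - TRUE_WEIGHT)
            for _ in range(trials)]
    return statistics.mean(errs)

# Error of the averaged estimate falls roughly as 1/sqrt(n):
errors = {n: mean_abs_error(n) for n in (1, 25, 625)}
for n, err in errors.items():
    print(f"n={n:4d} guessers -> mean |error| of the average: {err:6.2f} lb")
```

No crowd "wisdom" is involved: the same shrinkage falls out of any unbiased instrument measured repeatedly.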

Friday, December 10, 2004

Eolas v. Microsoft case grinds on

Via Politech, oral arguments at the Eolas v. Microsoft appeal were heard yesterday; but more interestingly, Declan points to Perry Pei-Yuan Wei's note on the Viola browser, which is clearly relevant to the discussion of prior art. I find it pretty astounding that the court decided not to allow the jury to see a demo of Viola, or to know that Wei had told Doyle about Viola.

Yet more evidence that the patent system, in this case, did not produce incentives for innovation, but instead supported an unwarranted "intellectual property" land grab.

Previous posts on Eolas: one, two, three.

Thursday, December 09, 2004

Random tip for LaTeX users

On Unix/Cygwin boxen with the watch utility, open up a console and use the following command to extract a live, automatically updated outline of your paper:

watch --interval=30 "grep -E '^\\\\(sub)*section\{' latexfilename.tex"

where latexfilename.tex is replaced with your paper's filename, of course. Note that the grep command is quoted as a whole: watch hands its arguments to a subshell, which would otherwise strip the backslashes out of the pattern. This will refresh every 30 seconds; use a different value for the --interval argument in order to get a view that's updated more or less frequently.

The above assumes that, like me, you compose using a documentclass that uses the standard \section, \subsection, and \subsubsection commands, and that your section declarations start at the beginning of a line.
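For the curious, here's the matching behavior that pattern is after, sketched with Python's re module (same semantics, minus the shell-quoting headaches): lines that begin with \section{, \subsection{, or \subsubsection{ match; commented-out or mid-line occurrences don't.

```python
import re

# A line starting with \section{, \subsection{, or \subsubsection{.
pattern = re.compile(r'^\\(sub)*section\{')

sample = [
    r'\section{Background}',
    r'% \section{commented out}',
    r'\subsection{Prior work}',
    r'text mentioning \section{inline}',
    r'\subsubsection{Details}',
]
matches = [line for line in sample if pattern.match(line)]
for line in matches:
    print(line)
```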

(Yes, I'm using this tip at this very moment. Sigh. Deadlines.)

Sunday, December 05, 2004

Stephen Prothero writes at least two stupid things

Here are two things about Stephen Prothero's Sunday Times review of James Ault's Spirit and Flesh: Life in a Fundamentalist Baptist Church that are stupid, and betray a fundamental thoughtlessness.

First: In the first paragraph, Prothero drags out the shopworn "irony" that people tend to be intolerant of fundamentalists exactly because fundamentalists are intolerant. One can only assume he means to imply --- as do most people who point out this "irony" --- that those who are intolerant of fundamentalists are hypocrites.

Hey, that reminds me of a pretty similar "irony": if you decide to kidnap somebody and hold them prisoner against their will, then law enforcement will come and "kidnap" you, holding you prisoner against your will. Law enforcement: a bunch of hypocrites! Or, how about this: Nazis armed with guns were rounding up the Jews of Warsaw and killing them like dogs; "ironically", the Jews decided to gather up guns and fight back, shooting at the Nazis in return. Those Jews of Warsaw, they sure were hypocrites!

Prothero, like everyone who drags out this idiotic cliché about failing to tolerate intolerance, chooses to ignore the difference between initiating a wrong, and retaliating against that wrong. Sometimes, when you retaliate against a wrong, you must adopt some tactics that are superficially similar to the tactics of those guilty of the wrong.

Prothero also confuses the real issue, which is not "intolerance" per se. We are, all of us, intolerant of someone: pedophiles, rapists, con men, Al Qaeda members. The point of dispute is not "tolerance", but rather the thing being tolerated, or not tolerated. Fundamentalists hate gays and atheists categorically and irrationally; homosexuality and atheism are things that are essentially worthy of tolerance; therefore, fundamentalists are wrong. No such argument applies to the categorical, irrational hatred itself --- that is not essentially worthy of tolerance.

Second: About midway through the review, Prothero writes:

The stereotype, of course, is that fundamentalists are Manichean moralists. And the strict rules they follow certainly seem to be black and white. In the application of these moral absolutes, however, Ault finds plenty of gray. Shawmut River functions like a close-knit family, he argues, and the brothers and sisters in that kinship network demonstrate a "situation-specific flexibility" in morality that is difficult to distinguish from the situation ethics they so vehemently decry. Divorce, for example, is prohibited, and [Rev.] Valenti tries to talk his parishioners out of it. Yet when they call a marriage quits, he is the first to let bygones be bygones. "While fundamentalists' timeless, God-given absolutes may appear rigid from the outside," Ault writes, "within the organism of a close-knit community where much is known in common about persons and situations, they can be surprisingly supple and flexible."

But this doesn't upset any stereotypes at all. One of the most common criticisms of fundamentalists is that they're hypocrites, content to assail the degenerate lifestyles of others from afar while giving themselves and their kin a pass. Duh. I have little doubt that Valenti has sermonized against the degeneracy of people not in his congregation, without understanding that people the world over have lives just as complex and difficult as the life of any sheep in his personal flock.

A note from a paranoid consumer

My good friends know that I'm prone to a mild, generalized paranoia, which, to me, is just healthy skepticism. I take for granted that most of my public behavior and electronic communications are being surveilled, though perhaps not by any human being.1 A recurring feature of my dreams is a scene wherein the facade of reality gets stripped away to reveal what's beneath (waking up from these dreams is always an interesting experience). For me, The Matrix, The Truman Show, and the stories of Philip K. Dick weren't mind-blowing out-of-this-world fantasies, but rather variations on an old familiar theme. And I've never read a really convincing refutation of Nick Bostrom's Simulation Argument.

Now, I don't believe that anyone cares about me enough to bother constructing some kind of pervasive, systematic deception specifically for me. But I do believe that large, well-organized, purposeful forces are watching us collectively, and interfering with our lives, in ways that we generally ignore, for the sake of preserving the day-to-day fiction that we're living in a world hospitable and comprehensible to ape brains that evolved two million years ago on the savannah.

All this is to preface my saying that if I ever discovered that someone I knew was a "volunteer" or an "agent", I would do my best to eradicate that person from my life.

I can accept that my friends might deceive me, or at least keep secrets, for the sake of their dignity, or their reputation, or to spare my feelings or somebody else's feelings; there are all the ordinary human deceptions that knit together our social fiction. They make us human. However, deceive me in the service of an organization --- a business, a church, a government, whatever --- and you're no longer human, exactly, but rather a cell in some larger organism, trying to feed itself by extracting my money and labor; and let us remember that money and labor are abstractions for the output of our minds and bodies. In a precise sense, these agents lie in order to turn us into food; they are the glowing bulb on the forespine of the anglerfish.

1 I'm talking about computers, not aliens. Between ECHELON and the pervasiveness of surveillance cameras, you can basically assume that you're being recorded whenever you send email or walk down any commercial street. It doesn't have to be in a large city: after the Oklahoma City bombing, investigators used recordings from surveillance cameras all over the city to track down the rental truck.

Wednesday, December 01, 2004

Another name for the "wisdom of crowds"

I just read yet another online essay that, at one point, referenced "the wisdom of crowds". This phrase has been getting a lot of attention lately from the chattering classes. I don't have time to give this idea a fuller treatment, and in fact I have not read James Surowiecki's book (which started all the ruckus), but I will say something that I've wanted to get out there for a long time:

"The wisdom of crowds" is just another name for "the behavior of distributed algorithms".

When you think about it, "Let's exploit the wisdom of crowds!" really means: "Let's set up a whole bunch of independently acting, loosely federated entities, each with an incomplete view of the system, and let's make them do some cognitive task." In other words, if a crowd ends up having any wisdom, it will have arrived at it through a distributed algorithm.

Why does this matter? Two reasons.

First, it de-mystifies the concept. "The wisdom of crowds" is a phrase precisely calibrated to mystify the thing it denotes. Consider the diction: "crowds", suggesting spontaneous, informal, natural gatherings; and "wisdom", suggesting a folksy knowledge born of experience, as opposed to, say, "intelligence", "cleverness", or "expertise". The phrase "wisdom of crowds" carries within it the seeds of the message that gosh darn it, if you just got those elitist social engineers out of the way, and let everybody alone to act on their common sense, everything would be just peachy. In fact, if you read the blurbs from the publisher's page, this is exactly the message that's being pushed --- if not by Surowiecki himself, then by his promoters, with his tacit assent.

By contrast, the phrase "the behavior of distributed algorithms" is a more forbidding thing, one that highlights a crucial fact: all systems for extracting knowledge from "crowds" are, in fact, intricate constructions that achieve their results through precise engineering of the rules governing the crowd.

This leads into my second point. Any computer scientist who has tangled with distributed systems knows that designing a distributed algorithm that actually does what you want it to do is extraordinarily tricky. On the other hand, it is really easy to design distributed algorithms that, for deviously subtle reasons, end up prone to behaviors like wildly unpredictable, bizarrely pathological oscillations, race conditions, deadlock, livelock, network floods, etc., etc., etc. Until you have studied the Paxos algorithm, or at least hacked on a distributed system (and I doubt very much that James Surowiecki has done either), you probably lack the humility and skepticism needed to evaluate distributed algorithms accurately.

Naïvely lauding the alleged "wisdom of crowds" obscures the critical issue, which is the design of the distributed algorithm --- i.e., the social organization of the crowd. What are its mechanisms for passing information? For reaching consensus? Where are the possibilities for feedback loops? What happens in the obscure corner cases that result from the interactions of all its features? Etc., etc.

There's no such thing as a free lunch, and gathering together a large number of independent actors does not magically make problem-solving any easier. In fact, it can make problem-solving incalculably harder. After you gather the crowd, you have to figure out how to make it do something useful, and it is by no means the case that you'll always get acceptable outcomes by letting each individual make decisions that "look sensible" (whatever that means) based on locally available information.
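The design-sensitivity point can be made concrete with a toy simulation (entirely my own construction --- not a real protocol, and not anything from Surowiecki): run the same crowd of noisy estimators under two "algorithms". When guesses are independent, the average homes in on the truth; when each guesser can see the running average of earlier guesses and herds toward it, the crowd anchors on its earliest members and the averaging magic largely evaporates.

```python
import random
import statistics

TRUE_WEIGHT = 1198.0  # the ox again, for continuity

def crowd_estimate(n=500, noise_sd=100.0, herding=0.0, seed=0):
    """Average of n stated guesses at TRUE_WEIGHT.

    Each guesser draws a private, unbiased noisy signal. With herding > 0,
    a guesser's *stated* guess leans that far toward the running average of
    earlier stated guesses -- a crude information-feedback loop.
    """
    rng = random.Random(seed)
    stated = []
    for _ in range(n):
        signal = rng.gauss(TRUE_WEIGHT, noise_sd)
        if stated and herding > 0:
            guess = herding * statistics.mean(stated) + (1 - herding) * signal
        else:
            guess = signal
        stated.append(guess)
    return statistics.mean(stated)

# Compare the two "algorithms" over many independent runs.
ind_err = statistics.mean(
    abs(crowd_estimate(herding=0.0, seed=s) - TRUE_WEIGHT) for s in range(30))
herd_err = statistics.mean(
    abs(crowd_estimate(herding=0.95, seed=s) - TRUE_WEIGHT) for s in range(30))
print(f"independent crowd, mean error: {ind_err:6.1f} lb")
print(f"herding crowd,     mean error: {herd_err:6.1f} lb")
```

Same crowd, same private information; the only thing that changed is the rule for combining guesses. That rule --- the algorithm --- is where all the interesting difficulty lives.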

Now, as I said, I have not read Surowiecki's book. It is entirely possible that I'm being utterly unfair to him based on the yammerings of others. On the other hand, the publisher's excerpt is not encouraging.

UPDATE 2007-09-29: If you're coming from this ycombinator thread, then note that I wrote a followup after reading most of the book, and that my opinion of Surowiecki himself has only marginally improved.

Also, in retrospect, it seems to me that this post is more about the "wisdom of crowds" meme --- how and why it's been successful, what's wrong with it, and the role of Surowiecki's publicist in promoting it --- than about Surowiecki's book itself.