Tuesday, May 30, 2006

Anti-network neutrality astroturfing comment spam

Exhibit A: Read the comments to my post from Sunday, May 21. Notice anything about the two comments by "Net Chick"? Do they strike you as, perhaps, somewhat perfunctory and non sequitur given what I wrote in the extensive main body of my post?

Exhibit B: Observe "Net Chick"'s Blogger profile, which reveals that "her" account was created recently, in May 2006. Note the utterly generic username, and the absence of any personal details or blog; screencap below.

Does anyone find it odd that somebody who just got onto the Internet in May 2006, and has no home page, has such strong opinions about telecommunications regulations?

Exhibit C: Observe the Google searches for '"Net Chick" network neutrality' and "posted by net chick", which reveal a wide array of comments splattered across the web under the same pseudonym. Click through to a few links, and you will see perfunctory talking points repeated in response to every post, with little regard for the post's content. Notice that all blog comments to date by "Net Chick" concern network neutrality legislation; not only does "she" care about network neutrality, it's the only thing "she" cares about.

In case the results of the above linked searches change, here are a few direct links where you can find comments which are probably (or certainly) by this same (ab)user: one, two, three, four, five, six, seven, eight, nine, ten, eleven... I guess that's enough for now. (UPDATE: If you run one of these blogs, please don't delete Net Chick's comments; we need them for evidence.) Below, for posterity, I have also captured screen shots from a few of these links.


Exhibit D, just to beat this dead horse down: post to Ars Technica...

...and corresponding profile, freshly created in May 2006:

Okay, enough evidence. Time to render a verdict.

Ladies and gentlemen, we have been defaced by a comment spammer. Not your garden-variety porn/drugs/gambling comment spammer, who would have made certain to include at least one link to an advertising-supported website somewhere in either the post or the profile page. Nor is this the work of a mere desperate narcissist spammer, who would have made some link to a personal blog available.

No; "Net Chick" is an astroturf comment spammer: an astro-spammer, if you will. Judging by the volume of spam (dozens of sites, rather than hundreds), the degree of comment differentiation, and the variety of comment systems to which the astro-spam was posted, I'm guessing that it's a human being, rather than a computer program. For this quality of A.I., it's probably cheaper to pay a human than to hire a computer scientist to write a program. This appears to be retail astro-spam, not wholesale astro-spam, although it's likely that the same entity's posting a lot more under other aliases.

(In fact, among the commenters from my original thread, the pseudonyms "Luv2Box", "MRT", "watcher", "stevens33", and "Katie2020" all look pretty suspicious to me. Note the absence of links to any substantive personal page, blog or otherwise; yet these people care deeply enough about telecom policy to comment en masse at other blogs? Well, OK, it looks like "Luv2Box" has shilled previously for the Iraq War as well.)

Now, note the following traffic log, timestamped roughly when "Net Chick"'s most recent comment was posted:

This visitor left his/her browser window open for over twenty minutes, which isn't consistent with automated comment spam. Also, notice that the comment originates at an IP in Charleston, WV, which isn't the home of any major PR or telecom companies. At first, this puzzled me: what seasoned spammer would waste that quantity of time on an individual blog? Why is the spam originating from a home DSL provider in podunk West Virginia, rather than a city with major PR firms? Then I made the connection: I also recently received anti-net neutrality email advocacy spam from a PR company that provides integrated, cross-medium marketing services. The people posting the astro-spam aren't tech-savvy viral PR ninjas; they're telemarketing employees being paid by the hour to browse Technorati and comment on the blogs they find. "Net Chick" is probably some suburban housewife moonlighting as a PR shill, or some guy who lives in his parents' basement and can't get a real job.

Notice, incidentally, that most comments by "Net Chick" et al. appear on posts that tilt against network neutrality legislation. This makes me wonder about the online marketing strategy. Do they have different personas for posting to different sorts of blogs? How much latitude do they leave the telemarketers? Etc. I have more evidence that leads me to speculate about these things, but I prefer to keep it quiet for now. If the astro-spam sponsors see too many of the bread crumbs they're leaving behind, they'll change up their strategies.

Anyway, I'm not the first one to notice that the shills have come out: Seeing the Forest, MyDD, and IPDemocracy appear to have run across a different pack of pseudonyms shilling on the blogs they read, although I've documented "Net Chick" somewhat more thoroughly.

This sort of thing will only become more common (and more subtle) in the future. Henceforth, be wary of any blog comment that isn't backed by some persistent and credible web identity, one with a history. Even be suspicious of comments that are backed by a history. The PR-industrial complex is all around us.

I also plan to aggressively delete comments that strike me as shill-ish in the future, leaving previously posted comments for forensic purposes only. As you can imagine, I am pissed (though not surprised, since I've always been pretty paranoid).

UPDATE 31 May: Fixes: (1) Telemarketers get paid somewhat more than minimum wage. (2) Added more links to astro-spammed blog posts.

UPDATE 1 June: Fix: Moved a link that was confusingly placed.

Wednesday, May 24, 2006

In which I acquiesce to the siren song of capitalism

OK, I suppose some disclosure is in order. I recently accepted the offer of a software development position at Google.

Provided I defend as planned this summer (hardly a foregone conclusion, but my advisor seems to believe in me), this fall I'll be ending my career in academia and moving to the Bay Area.

At this point it is worth noting that if I were smarter, more talented, more focused, or simply hungrier, I would probably have done as most other recent Ph.D.'s from my research group have done, and obtained an academic job, or at least joined a research lab. I'm reminded of the time I chewed out Jonah Goldberg for implying that liberals become academics because they can't get jobs in the private sector. In computer science, the opposite's closer to the truth. (Well, I did have one academic job offer, but for various reasons I turned it down. I also turned down a few interviews, mostly because I'd concluded that I wouldn't accept those positions even if they were offered.)

Not that I'm complaining. There are worse things in life than getting a job offer that 99% of the people in my profession would kill to have.

Finally, you may wonder how this affects my blogging. Well, in my previous post, I took a position that tilts against one of Google's most prominent current lobbying efforts. That doesn't really settle the issue, of course --- especially since the mechanics of options dictate that I want GOOG to drop as much as possible before I start work, and rise to stratospheric heights only thereafter --- but anyway I hope that my opinions will remain independent of my paycheck, even after I shuffle off this academic coil.

I won't take it personally, however, if you consider me a wholly owned subsidiary of my putative future employer.

Sunday, May 21, 2006

Notes on network neutrality

So, "network neutrality" has been getting lots of press lately. Briefly, network neutrality is the principle that Internet carriers should provide "dumb pipes" that carry all Internet traffic equally, rather than discriminating based on the type, content, or destination of the traffic that flows across their networks. A wide array of interests --- including MoveOn, Google, and a number of other groups and companies across the political spectrum --- have been lining up behind initiatives to codify this principle in law for US Internet carriers. The principal opposition to the coalition comes, unsurprisingly, from two quarters: the telecommunications companies, and libertarians.

My reaction to this is twofold. My first and predominant reaction is: I am troubled by the idea of getting the FCC involved in regulating how Internet service providers architect their networks. The Internet actually works pretty well these days, so it seems dangerous to get Congress monkeying around with its guts. And the question of what constitutes a "neutral" network is pretty subtle, as the term has no widely agreed-upon technical definition.

In the absence of a technical consensus, the output of the legislative process is likely to be either a mishmash of mistargeted micro-regulations, or a vague and overly broad mandate for the FCC. Who knows what will come out of that process, but in my opinion the most likely outcome is simple regulatory friction that slowly, invisibly eats away at network innovation. Network innovation won't go away, but certain kinds of innovation will become more difficult because of legal complications, and the Internet will suffer. Therefore, I suspect that any legislation written today will hurt the Internet more than it helps. It is with some surprise that I find myself agreeing with the telecom giants and the Cato Institute, and disagreeing with MoveOn, Google, etc. If network neutrality regulation passes now, I think progressive activists and technology companies alike will live to rue the day they begged for it.

My second reaction is that this whole debate strikes me as a kind of bizarre ritual theater in which people are making noises and gesticulating wildly, but nobody talks about the real issue.

Political outfits --- ranging from MoveOn to the Christian Coalition --- are worried that network providers will begin to discriminate based on the political content of messages. This is pretty unlikely. It's not easy for an algorithm to look at a bag of bytes and classify its political content; and network providers probably can't pay for the computational power required to apply such an algorithm to the many terabytes of data that flow across their networks daily.

And even if they could, why would they? There's no percentage there. In fact, I can think of two very strong reasons for them not to start filtering based on political content. First, there would be an enormous consumer backlash. Second, there would be enormous political fallout. The latter would include not only backlash against abuse of quasi-monopoly power, but possibly the imposition of responsibility for the content that flows across the pipes. Once you begin filtering based on political content, lawmakers may poke their heads in and wonder why you aren't filtering out all that kiddie porn and gambling and such too --- and if something gets through your filters, why can't we hold you liable? The network providers don't want to open that can of worms.

So, political censorship isn't the real issue here. Nor, pace Moby et al., is it interconnection with small media providers versus large ones. Verizon's not terribly likely to block access to your music blog. They might, someday, contract with certain service providers for improved performance. For example, they might strike a deal with iTunes to store songs in a local proxy cache, so that Verizon customers would observe slightly improved performance with iTunes, but not your music blog. That doesn't strike me as either disastrous or a betrayal of the Internet's principles. Networking researchers have been proposing schemes like this for years. In fact, Akamai's basically a third-party version of this scheme: people pay them to store content in caches close to where it's demanded, so Akamai-cached websites perform better than non-Akamai websites. Akamai's been operating since 1999, and so far the Internet hasn't been torn asunder.

So what's the real issue?

As Ars Technica noted back in January, Verizon CEO Ivan Seidenberg was making noise about Google's (over-)"use" of Verizon's bandwidth. And last November, SBC CEO Edward Whitacre complained about Google and Microsoft using "my pipes" (meaning, of course, SBC's pipes; Whitacre suffers from your usual case of CEO megalomania):

"So there's going to have to be some mechanism for these people who use these pipes to pay for the portion they're using. Why should they be allowed to use my pipes?"

In a way, you have to give Seidenberg and Whitacre credit. In making these noises, they display a level of stupidity and chutzpah that rivals the dudes from Jackass.

First, the chutzpah: Google does pay its ISP for its Internet connectivity, just as Verizon customers pay Verizon for their Internet connectivity. Yet Seidenberg claims to believe that Google should pay Verizon for the Internet connectivity that Verizon's customers have already paid for. It's as if Ford were to ask Wal-Mart to pay fees to Ford, because Wal-Mart's customers were driving to Wal-Mart in Fords.

Second, the stupidity: five and a half months later, Verizon's lobbyists are working overtime to prevent network neutrality legislation from passing. And guess who's paying for the lobbyists on the other side? In many ways, Seidenberg, Whitacre, and their telecom industry cronies brought this circus on themselves through overreaching arrogance and greed.

So, here's the real story. The telecom giants currently sell you Internet access, which is okay, but dull. Dumb pipes are cheap. But hey --- what if they could sell you lots of bundled services? That really gets the dollar signs flashing in their eyes. These are the companies that want to sell ring tone subscriptions for your cell phone, and bundled cable packages with more channels than you'll ever watch. Their dream is to add a dozen extra bullshit services to your Internet service bill, so that you're paying them eighty dollars per month instead of fifty.

The big problem with that plan is that once you have dumb pipes and smart endpoints --- in other words, the Internet --- the endpoints can build essentially any service on top of the network. Of course, this is fantastic for Internet users. Once you pay for Internet access, you automatically get to use every Internet application that's ever been invented: email, the web, instant messaging, peer-to-peer, and (increasingly) the two V's, voice and video. These last two really drive the telecom giants nuts, because they used to sell you voice and video: they're phone and cable companies.

Therefore, the most likely form of telecommunications discrimination in the foreseeable future is discrimination by application, not by content. Verizon wants to give preference to its voice services, and Comcast wants to give preference to its video services. They're deeply freaked out by Skype, Google Video, and the like. If they can convince customers that competing services are slow and crappy compared to their own offerings, they think they'll have a better chance of getting you to pay for their bullshit services. Failing that, they'd like to convince Google and other Internet companies to pay fees for non-degraded service. Think of it as protection money: "Nice customer base you got there. Sure would be a shame if your packets were dropped 20% more often than your competitor's..."

If network providers got serious about this, the results would be pretty bad. Network providers should not be picking winners and losers in the Internet applications game. Applications should succeed or fail on their own merits. Now, I think network discrimination schemes would fail in the long run, but that's just a hunch and it's really an open empirical question. Meanwhile, in the short run, discrimination schemes could cause major distortions in the Internet applications market; and when you consider the possible network effects in domains like Internet telephony, it's possible that these distortions could cause lingering damage, locking in inferior applications for years to come.

So I'm really glad that people are paying attention to network neutrality. But I'm also alarmed that so few of those people seem to understand what's really going on here, and I'm skeptical that now is the time to make laws about it. So far, the Internet's still neutral. My bottom-line recommendation would be to watch and wait.

(And also to increase competition in local ISP markets, which would give customers a choice when confronted with discriminatory network policies. Note that this wouldn't be a panacea, because in practice most localities would still be served by a few providers, each of which might have an incentive to discriminate, albeit in differing ways. Oligopolies don't necessarily lead to efficient markets.)

p.s. Selected links on network neutrality:

Wednesday, May 17, 2006

SML hacking tip: Turn off polyEqual warnings

Note: Narrowly targeted Google-food. Skip if you do not program in Standard ML.

Recent versions of Standard ML of New Jersey (SML/NJ) print a message "Warning: calling polyEqual" when you write code that uses polymorphic equality. Here's how to turn it off:

sml -Ccontrol.poly-eq-warn=false

This works in CM mode, as in:

sml -Ccontrol.poly-eq-warn=false -m sources.cm

I am posting this here as Google-food because I just spent an hour Googling and grepping around in SML/NJ sources trying to figure this out.

If you're using SML/NJ interactively, then you can also type

Control.polyEqWarn := false

into the read-eval-print loop. However, this doesn't work when you're trying to invoke the SML Compilation Manager in "make" mode (-m), because Control is not present in the default linkage environment. (And the SML/NJ documentation does not specify how to add it; or, at least, I haven't figured it out yet.)

More generally, all control flags (all bool refs in Control) can be toggled at the command line. This is documented in the command-line section of the SML Compilation Manager manual. There, we learn that -C can be used to set control parameters. You can get a listing of all control parameters using -S, as follows:

sml -S

Finally, I just want to remark in passing that unlike, say, "match redundant" or "match nonexhaustive", use of polymorphic equality is purely a performance problem, not a probable logic error. It's therefore highly questionable design to enable a warning message for polymorphic equality by default. The default setting should have been off, but available as a profiling/debugging option.
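For readers who haven't run into the warning, here is a minimal sketch of the sort of code that I'd expect to trigger it. The function names member and memberInt are made up for illustration, and whether a given compiler version actually prints the warning for this exact code is my assumption, not something taken from the SML/NJ documentation.

```sml
(* Hypothetical example. 'member' works for any equality type ''a, so the
   comparison x = y cannot be specialized to a particular type and goes
   through the generic polymorphic equality routine. *)
fun member (_, []) = false
  | member (x, y :: ys) = x = y orelse member (x, ys)

(* Pinning the element type to int makes the comparison monomorphic, so
   plain integer equality can be used instead. *)
fun memberInt (_ : int, []) = false
  | memberInt (x, y :: ys) = x = y orelse memberInt (x, ys)

val found = member (2, [1, 2, 3])
val foundInt = memberInt (2, [1, 2, 3])
```

Both functions compute the same answer on integer lists; the only difference is which equality test gets used, which is why this is a performance matter rather than a correctness one.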

Wednesday, May 10, 2006

"Advance review copy: Not for resale" --- my ass

Rosina Lippi points to an irate discussion at a romance novel fansite called "Smart Bitches, Trashy Books": a couple of authors are mad that reviewers are reselling the free "advance review copies" that authors send them.

My reaction: get real, control freaks. Authors have no legal standing whatsoever for preventing reviewers from reselling advance review copies --- as authors and publishers know perfectly well, otherwise they'd be tracking down these reviewers and suing them. An advance review copy is a gift, i.e. a transfer of property, not a license for use. If providers of review copies were serious about controlling redistribution, they would do what people do when they really want to control how something is used, and make the reviewers sign a contract before mailing them copies. And no, a notice inside the book saying "NOT FOR RESALE" does not qualify as a contract, for any number of reasons that a lawyer could explain to you in great detail.

OK, so authors have no legal standing for preventing resale of advance review copies. What about the reviewer's ethical obligations?

My reaction: What ethical obligations? The review copy showed up one day in the reviewer's mailbox. The reviewer didn't ask for it, and certainly didn't promise to engage in any particular behavior with respect to it. If the reviewer doesn't feel like keeping it, then she has exactly the same ethical obligation towards the book as she does towards a supermarket circular or a credit card offer or any other piece of unsolicited junk mail that appears in her mailbox. It so happens that, unlike most junk mail, some advance review copies have economic value, and can be disposed of by selling on eBay, rather than by tossing into the shredder or writing "RETURN TO SENDER" on the package and dropping it in the mailbox. More power to the reviewer.

Advance review copies are gifts. The recipient of a gift has no particular obligations w.r.t. the giver. That is the nature of a gift. And, in fact, authors want advance review copies to be gifts. They are quite happy to take advantage of the properties of gift-giving, such as the fact that gift-giving does not require prior consent or up-front costs for the reviewer. An author who requires reviewers to pay for copies or to sign contracts isn't going to get very many reviews, so authors choose the regime of gift-giving rather than the regime of market exchange or contractual negotiation. The authors complaining at SB/TB, having chosen gift-giving because of certain advantages it affords, are now upset because other consequences of gift-giving are biting them in the ass. Tough cookies, bitches. There are good reasons that unsolicited gifts don't come with ethical obligations attached.

Now, I wouldn't be posting about this at all, except that it strikes me that there's a common thread running between the notion that authors have a right to control redistribution of advance review copies, and the notion that record companies have a right to tell you what devices you're allowed to play your music on, and the notion that the guardians of Nabokov's estate have the right to prevent the publication of Lo's Diary. In all these cases, somebody holds an intellectual property right in a piece of information, and somehow believes that this entitles him/her to essentially unlimited control over what happens to any embodiment of that information. There's a deeply vainglorious sense of entitlement here that just makes me angry.

Every act of creativity draws from a rich oceanic well of human culture and achievement, built up over thousands of years, that the creator never asked permission to use. Most of the people who shaped and transmitted that culture are dead and anonymous and were never compensated for their role in building that culture: retelling the folk tales that populate our cultural subconscious, or inventing the vivid turns of phrase that give our language its flavor, or keeping the memory of certain arts and crafts alive by practicing them. For most of these people, creativity was an inevitable corollary of being alive, not a professional activity for which they needed to be compensated. And you come along, draw up a bucket from this well, and use that to invent one more story, write one more song, draw one more picture, and all of a sudden you think that you have a moral right to impinge arbitrarily on the property rights of everyone else in the world, now and forever? Step off, son. Redistribution of legally acquired copies, fair use, and production of substantially transformative derivative works are all perfectly justified activities. Your feelings may be hurt, or you may make less money than you would if your rights were unlimited, or you may be aesthetically offended, but none of these give you moral standing to stop other people's exercise of their rights.