Wednesday, January 24, 2007

Notes on [Hahn Litan 06]: Network Neutrality Part 1: Requests For Comments

[Full disclosure: I work for a large technology company that presently lobbies for network neutrality legislation. My personal views on network neutrality predate my employment, and are independent of it. Everything posted on this site reflects strictly my own thinking, and is not endorsed, sponsored, or approved by my employer in any way. Finally, this should go without saying, but nothing I write here reflects any confidential or proprietary information from my employer.]

The generally excellent Tim Lee @ TLF today writes two posts reacting to a recent white paper on network neutrality by R. W. Hahn and R. E. Litan of AEI/Brookings[0] (also to be published in the Milken Institute Review).

As I read it, Hahn and Litan's paper makes the following major claims. First, they claim that the Internet's not neutral, and never has been --- hence the title, "The Myth of Network Neutrality...". Second, they claim that existing proposed legislation to codify network neutrality into law would do more harm than good. Third, they claim that there are economic benefits to tiered pricing for network-layer "quality of service" (QoS).

The major weakness of the paper is that the authors do not understand Internet technology, and they seem to have consulted zero experts who do. As a result, they make many elementary errors of fact, rendering their argument unsound and their conclusions unsupportable. I am rather too tired tonight to go through all the errors at once, so I will delineate only a few in this post. Expect at least one follow-up post sometime in the next N days.

On Requests For Comments

Hahn and Litan cite several historical RFCs in support of the following conclusion:

. . . early writings on the Internet indicate that prioritization has always been considered an important design characteristic for TCP/IP --- a sharp contrast to the romantic ideal of the end-to-end principle.

This post will examine how the authors attempt to support this claim, and how they fail.

As an aside, before I dive in, it is important to recognize that RFCs ("Requests For Comments") are not necessarily authoritative design documents for the Internet. RFCs have no binding force except insofar as many engineers independently decide to follow them --- a sort of community-based moral suasion, given economic force by network effects. Furthermore, RFCs vary widely in purpose: they may be arcane memos warning about one-time events, ideas of untested merit from the dustbin of history, cutting-edge research that may or may not ever be adopted, or even jokes.

Only a few RFCs describe protocols that have been widely implemented and deployed on the Internet, and even those are almost always provisional.

On to the meat. Hahn and Litan cite four RFCs. Tim seems to have read only the short version of Hahn and Litan's paper, which lacks RFC citation numbers, but they're in the footnotes of the long version:

  • RFC 675: Specification of Internet Transmission Control Program
  • RFC 791: Internet Protocol
  • RFC 1633: Integrated Services in the Internet Architecture: an Overview
  • RFC 794: Pre-emption

The first two are (ancestors of) bona fide, widely-adopted standards. The third is a position paper by a group of highly respected networking researchers. So, those three RFCs are not jokes, although the first was superseded by RFC 793 before ARPANET even became the Internet, and the third has never, to date, been deployed on the Internet at all. Then there's the fourth, which does not even describe the Internet, but another network entirely; so it is not exactly a joke, but it's pretty funny to see it cited as evidence of the Internet's principles.

So, here are the mistakes the authors make w.r.t. each of these RFCs in turn. Note that I share Tim's frustration that the authors have not, in most cases, provided either page numbers or quotes, so in some cases I have had to interpolate the exact citation.

RFC 675

This RFC describes an early version of TCP, one of the two fundamental protocols of the Internet. The authors state that Vint Cerf "explained that outgoing packets should be given priority over other packets to prevent congestion on the ingoing and outgoing pipe" [HahnLitan06, p. 4]. I believe the authors are referring to section 4.4.1., as the word "incoming" only appears in a handful of places in this RFC, and only once in any context related to priority:

From the standpoint of controlling buffer congestion, it appears better to TREAT INCOMING PACKETS WITH HIGHER PRIORITY THAN OUTGOING PACKETS.

The all-caps are in the original. Hahn and Litan appear to have the capitalized part exactly backwards, which doesn't speak well of their conscientiousness, or that of the editors at the Milken Institute Review. However, that's not the deep problem. The deep problem is that Hahn and Litan do not understand what TCP is, and what is being described here.

First of all, TCP is an end-to-end protocol. Period. Every single normative sentence[1] in RFC 675 describes an operation that occurs on an end-host, not on a router internal to the network. The above sentence describes how an end-host should prioritize processing of packets in buffers inside its networking stack. It is, in other words, a hint to operating system implementors who want to write TCP/IP stacks. It has nothing whatsoever to do with "the network" prioritizing packets.

If this sounds like an abstruse distinction, imagine "the network" as the US Postal Service, and an end host as your home. The operating system's network buffer is your mailbox. What the above sentence is saying is that before you stuff outgoing mail into your mailbox, you should take your incoming mail out of your mailbox. It is saying nothing about whether the US Postal Service should pick up your mail in one order or another.
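
To make the distinction concrete, here is a toy sketch of that mailbox logic (my own illustration in Python; nothing in it comes from RFC 675, and the packet names are made up). Everything happens on a single end-host, and no router appears anywhere:

    from collections import deque

    # Toy model of one end-host's networking stack. Per the RFC 675 advice,
    # incoming packets are drained before outgoing packets are serviced, so
    # buffers don't clog with unread "mail". This is a host-local scheduling
    # decision: the network itself never sees it.
    incoming = deque(["pkt-A", "pkt-B"])  # packets the network delivered to us
    outgoing = deque(["pkt-X"])           # packets our applications want sent

    def service_buffers():
        while incoming:                   # higher priority: what just arrived
            print("delivered", incoming.popleft(), "to application")
        while outgoing:                   # lower priority: what we want to send
            print("handed", outgoing.popleft(), "to the network interface")

    service_buffers()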

Does RFC 675 present a "sharp contrast to the romantic ideal of the end-to-end principle"?

It does not.

RFC 791

This RFC describes IP, the other fundamental protocol of the Internet. Again, the authors do not give exact quotes or specific citations, but they state:

A 1981 Request for Comments explained that precedence—a measure of importance of the data stream—could be used as a means of differentiating high priority traffic from low priority traffic.

Now, RFC 791 does contain some discussion of precedence. A "packet" is a little bundle of bits that a network shuffles around. Among other things, a network protocol must specify the form of its packets, just as the US Postal Service demands that envelopes be addressed and stamped in a particular manner. IP specifies a packet format with 8 bits reserved for the "Type of Service" field, which can technically be used to indicate the priority of a packet.

The motivation for this is as follows. Back in 1981, before the Internet emerged as the winner in the ecology of network designs, networking researchers were experimenting with different kinds of networks to run IP on. Some of those networks prioritized packets based on how the packets described themselves. It was believed that IP packets should reserve some space so that these networks could stash priority information in them. This reserved space is the "Type of Service" field.
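
If you want to see where that reserved space lives, here is a small sketch of mine, using hand-built sample bytes rather than captured traffic: the Type of Service field is simply the second byte of the IPv4 header.

    import struct

    # Hand-built 20-byte IPv4 header; only the fields relevant here are
    # meaningful, and 192.0.2.x addresses are reserved for documentation.
    header = struct.pack(
        "!BBHHHBBH4s4s",
        (4 << 4) | 5,           # version 4, header length 5 words
        0b10100000,             # Type of Service byte: precedence bits 101
        40, 0, 0, 64, 6, 0,     # length, id, flags/frag, TTL, proto (TCP), checksum
        bytes([192, 0, 2, 1]),  # source address
        bytes([192, 0, 2, 2]),  # destination address
    )

    tos = header[1]             # the TOS field is the second byte
    print("TOS byte = %#04x, precedence = %d" % (tos, tos >> 5))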

RFC 791 does not describe how networks would use the "Type of Service" (TOS) field. That is specified in RFC 795, which describes how TOS is used by the AUTODIN II, ARPANET, PRNET, and SATNET networks.

None of those networks was the Internet. They were networks for military communications in the 1960s and '70s. None of them exists today. Now, as every geek knows, ARPANET was the ancestor of the Internet; but not all the features of ARPANET were carried over to the modern Internet. In particular, modern Internet routers do not use the TOS field, at least not as described in RFCs 791/795. Eliding many gory details, DiffServ (a.k.a. DSCP) supersedes TOS, and it is used for traffic shaping within individual subnets, not on the Internet as a whole.
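
An application can still ask for a marking in that byte today, for whatever little good it does across the public Internet. A minimal sketch, assuming Linux and Python's standard socket module; any router along the path is free to ignore or rewrite the marking:

    import socket

    # Request the DSCP "Expedited Forwarding" code point (46) on outgoing
    # packets. DSCP occupies the upper six bits of the old TOS byte.
    EF_DSCP = 46
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, EF_DSCP << 2)
    sock.sendto(b"hello", ("192.0.2.1", 9999))  # documentation-only address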

In short, the section on precedence in RFC 791 describes a mechanism that is not, and has never been, used to prioritize packets on the Internet.

Does RFC 791 show that "prioritization has always been considered an important design characteristic for TCP/IP"?

It does not.

RFC 1633

RFC 1633 is, as noted above, a position paper by a group of distinguished networking researchers: R. Braden, D. Clark, and S. Shenker. In this RFC, Braden et al. argued (in 1994) that at some point in the future, a QoS mechanism should be adopted into the Internet's fabric.

Considered as a technical question, this is a controversial argument, but not a ludicrous one. I could discuss it at some length (and, if I ever get my act together, perhaps someday I will do so in this space), but for the moment I must focus on this RFC's relevance to Hahn and Litan's white paper. Hahn and Litan cite this as an "early writing on the Internet" that indicates that "prioritization has always been considered an important design characteristic for TCP/IP". There are at least two problems with this reading.

First, a document dated 1994 cannot be an early writing on the Internet. In June 1994, the Internet had not become a commercial mass phenomenon --- that had to wait for the spread of Netscape --- but it had existed for more than a decade. And, indeed, RFC 1633 sketches a speculative protocol extension to the existing Internet that has not, to date, been adopted by anybody.

Second, and more importantly, here are a few direct quotes from RFC 1633, Section 2:

The fundamental service model of the Internet, as embodied in the best-effort delivery service of IP, has been unchanged since the beginning of the Internet research project 20 years ago [CerfKahn74]. We are now proposing to alter that model . . .

. . . Internet architecture was [sic.] been founded on the concept that all flow-related state should be in the end systems [Clark88].

Designing the TCP/IP protocol suite on this concept led to a robustness that is one of the keys to its success.

In short, Braden et al. state exactly the opposite of what Hahn and Litan would have us believe. End-to-end flow control was part of the "fundamental service model of the Internet" and "one of the keys to its success".

Does RFC 1633 show that the Internet presents "a sharp contrast to the romantic ideal of the end-to-end principle"?

It does not.

RFC 794

Best for last. This one's particularly hilarious. Hahn and Litan quote this RFC at length --- one of the few times they do so:

In packet switching systems, there is little or no storage in the transport system so that precedence has little impact on delay for processing a packet. However, when a packet switching system reaches saturation, it rejects offered traffic. Precedence can be used in saturated packet switched systems to sort traffic queued for entry into the system. In general, precedence is a tool for deciding how to allocate resources when systems are saturated. In circuit switched systems, the resource is circuits; in message switched systems the resource is the message switch processor; and in packet switching the resource is the packet switching system itself.

That's a fine excerpt from RFC 794. The problem is that RFC 794 describes AUTODIN, not the Internet. Do you use AUTODIN? Me neither.

Vint Cerf was a networking researcher. He and Bob Kahn tried lots of things. The fact that some of his projects used packet prioritization has almost no relevance to the fact that the one project that succeeded wildly was a neutral network with end-to-end flow control.

Does RFC 794 give us an "early writing on the Internet"? Does it show that "prioritization has always been considered an important design characteristic for TCP/IP"? Does it demonstrate the Internet's "sharp contrast to the romantic ideal of the end-to-end principle"?

It. Does. Not.

Conclusion

The above points become apparent to anybody of moderate technical knowledge who attempts to read the RFCs carefully and understand them. RFCs were frequently written by Ph.D.'s, but they were not written for Ph.D.'s; they were written for hackers.

It is, perhaps, understandable that Hahn and Litan --- two economists --- could not understand these RFCs in detail. However, they have misread the RFCs so completely that it is almost inconceivable to me that they could have consulted someone with the necessary background.

They construe RFC 675 --- a description of an end-to-end transport protocol --- as a blow against the end-to-end principle. They construe (portions of) RFCs 791, 1633, and 794 --- documents which do not describe the Internet --- as documents describing the foundational principles of the Internet. In some cases, as with 1633, they cite these documents in support of a claim that is specifically refuted by plain text in the document.

How could this happen?

I would guess that Hahn and Litan's "research" process went something like this. First and foremost, they knew that they wanted to produce a paper arguing against network neutrality regulation. They had heard somewhere about these "RFC" things, and they knew that Vint Cerf, one of the current big pro-neutrality voices, had written a bunch of them. So, they decided to go search for the words "precedence", "priority", and "quality of service" in the old RFCs. To their great delight, these words appeared in some RFCs by Cerf himself, and by other prominent networking researchers. Alas, these technical documents turned out to be pretty tough to interpret if you've never written a line of networking code in your life. However, never mind meaning or context: knowing their "research" community --- economists predisposed to disliking regulation --- they figured they could get away with fudging the citations anyway, because none of their peers would understand the RFCs either. Most of them wouldn't even bother to try. So they went ahead and wrote their paper, and got it accepted to a little economics review.

Now, I understand that this is a pretty nasty thing to say. Given Hahn and Litan's long and distinguished careers in academia and public service, I would like to believe something else, but I'm having trouble doing it. I mean, look at the evidence above. They have clearly leaned upon the facts, as the proverb goes, as a drunkard leans upon a lamppost: for support, not illumination.

At best, I can understand this behavior as a combination of ignorance and arrogance: maybe the authors believed their vast experience in parsing documents in economics and law made it unnecessary to consult experts in computer science ("Not even a real science --- it has 'science' in the title!"). At worst, though, one could argue that it's a mixture of intellectual dishonesty and irresponsibility.


In Part 2 (if I ever manage to write it): Hahn and Litan's errors regarding VPNs and World of Warcraft.


[0] I normally ignore anything that comes out of AEI, as it tends to be 99% worthless on technology issues, and it's more work than it's worth to sort the wheat from the chaff. Based on Tim's decision to post about this paper, I waived my normal skepticism, and was pretty badly disappointed. Sigh. I have adjusted my priors, as the Bayesians say.

[1] By "normative sentence", I mean one stating a property that a TCP implementation must have in order to be rightfully called a TCP implementation. Now, like most RFCs, 675 is not an ultra-terse mathematical specification, but a document intended to be a useful and readable guide to practical implementors. So, it gives some background about routers and such to provide the reader with context. But as Cerf and Kahn state, TCP makes almost no requirements of the underlying network beyond its ability to carry bits, which is one reason why it works over substrates ranging from circuit-switched telephony (dial-up Internet) to the postal service.

Wednesday, January 17, 2007

War's most significant bit

Read this. Then read this.

One of the reasons I post less frequently than I used to is that lately, my despair at the stupidity of humanity exceeds my fury, whereas the opposite once obtained. Shall I bother to point out the obvious? All right, once more into the breach. This post will be more indirect than it needs to be, but I can only overcome my sense of the inherent futility of it all by creeping up on the subject sideways.

Computers represent everything, including numbers, using bits. Each bit is either a one or a zero. One and zero are not the only numbers we would like to represent: two and three and five hundred million are all nice too. To represent other numbers, computers use several ones and zeros at a time. This is called binary notation:

11010110

represents the number 214. Binary notation works more or less like the decimal notation that everyone's familiar with. In decimal, you interpret a number by multiplying each successive digit from right to left by an increasing power of ten, so 214 = (2 * 10^2) + (1 * 10^1) + (4 * 10^0). In binary, you interpret the bits by multiplying by successive powers of two, so that the rightmost bit is multiplied by 1 (2^0), the next-to-rightmost bit is multiplied by 2 (2^1), and so on, up to the leftmost bit, which is multiplied by 2^(n-1), where n is the number of bits in your string. In the above case, because there are eight bits, the leftmost bit represents 2^7, or 128. Summing up, 128 + 64 + 16 + 4 + 2 = 214.
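
If you'd rather let a machine grind through the arithmetic, a couple of lines of Python confirm it:

    bits = "11010110"

    # Multiply each bit by its power of two, rightmost bit first.
    value = sum(int(b) << (len(bits) - 1 - i) for i, b in enumerate(bits))
    print(value)         # 214
    print(int(bits, 2))  # 214, via the built-in base-2 parser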

Note that the leftmost bit is vastly more important than the rightmost bit. If you twiddle the rightmost bit of 11010110, you get the string

11010111

which corresponds to 215. That's pretty close to 214. If you twiddle the leftmost bit, you get the string

01010110

which corresponds to 86. That's pretty far from 214. Programmers call the leftmost bit the most significant bit; we call the rightmost bit the least significant bit.

As the size of your string grows linearly larger, the difference between the significance of the most significant bit and the least significant bit grows exponentially larger. With a 16-bit string, the difference is about thirty-three thousand to 1. With a 32-bit string, the difference is about 2 billion to 1.
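
A few more lines of Python verify both the flips above and how fast that gap grows:

    x = 0b11010110         # 214
    print(x ^ 0b00000001)  # 215: least significant bit flipped
    print(x ^ 0b10000000)  # 86: most significant bit flipped

    # Ratio between the most and least significant bits of an n-bit string:
    for n in (8, 16, 32):
        print(n, "bits:", 1 << (n - 1), "to 1")  # 128, 32768, 2147483648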

The concept is suggestive, and programmers readily adapt it metaphorically to other subjects. In most decisions in life, it's important to get the most significant bits exactly right, and much less important to get the least significant bits exactly right. Err on the most significant bit in the quantity of your jet's fuel, and you're making a "water landing" in the middle of the Atlantic. Err on the least significant bit and nobody will notice.

The ability to concentrate on the most significant bits is also called "having a sense of proportion".

Here, in what I consider roughly descending order of significance, are several "bits" of truth from the years 2002-2007:

  • America should not have invaded Iraq.
  • Congress should not have given Bush authority to invade Iraq at his sole discretion.
  • Invading Iraq without a UN mandate hinders America's diplomatic efforts, which are crucial to antiterrorism and nuclear non-proliferation policy.
  • Given its failure to establish a stable state in Afghanistan, the Bush administration could not be trusted to handle the aftermath of the invasion of Iraq.
  • Invasion of Iraq and its subsequent destabilization fuel anti-American sentiment and provide a propaganda bonanza for violently radical Islamist movements all over the world.
  • Saddam Hussein's WMD program had not made significant progress towards either nuclear weapons or mass casualty biological weapons.
  • The doctrine of preventative war does not suffice to justify the Iraq War.
  • ...
  • ...
  • ...(several thousand more bits)...
  • ...
  • As a companion to coffee, key lime pie is superior to blueberry pie.
  • Absolute isolationism and absolute pacifism are philosophically unsound.
  • The Beyoncé Knowles song "Deja Vu" was produced by chart-topping super-producer Rodney Jerkins.
  • Theoretically speaking, the doctrine of preventative war might someday suffice to justify some hypothetical war.
  • ...

One can summarize Megan McArdle's point (and Kevin Drum's point here) as follows: "Some leftists were wrong about the least significant bits on the Iraq War; therefore, the arguments of anti-war advocates remain no more credible than those of people who were wrong about the most significant bits."

Of course, stated this way, nobody would dare make such an argument. Obviously, a decision procedure that leads to correct answers in the most significant bits is strictly preferable to one that gets the most significant bits wrong but the least significant ones right. Arguing otherwise is transparently stupid. So former hawks take a circuitous route that's no less stupid, but slightly less transparent. By filling the conversation with angels-dancing-on-the-heads-of-pins arguments about absolute isolationism and absolute pacifism and preventative war, McArdle and Drum hope to divert readers' attention to the least significant bits and induce the illusion that the most significant bits are not, in fact, far more significant.

And I suspect that McArdle and Drum even believe that they're talking about important subjects, despite the fact that virtually nobody actually believes in absolute isolationism or absolute pacifism, or that preventative war can never be justified. If you're arguing that "sometimes war can be justified", you're arguing with Quakers and half-mad hermits who live alone in the woods. To pretend otherwise requires massive cognitive dissonance, or a completely unprincipled and remorseless willingness to erect straw men, or both.

Of course, pundits have ample reason for cognitive dissonance. Pundits blow hot air around for a living. Their sense of self-worth depends on the belief that such vigorous thermoconvection makes them better qualified to judge matters of import than the hoi polloi, who rely on simple rules of thumb, like "War Is Bad". It literally does not compute in their minds that some shaggy dumbass off the street with a picket sign could have better judgment than, say, a professional writer for the Economist or the Washington Monthly.

But "War Is Bad" is a pretty good rule, because in the vast majority of practical cases, nonviolent action leads to better outcomes than war. "War Is Bad" gets the most significant bit right far more often than the punditological prestidigitation of which McArdle et al. are so fond. If all liberal (and "libertarian") hawks had shouted from the rooftops in 2002 that "War Is Bad" instead of the pseudo-nuanced bullshit they actually said, it would have strictly improved objective outcomes for the nation.

But rather than learn from this experience, McArdle's looking around for excuses to ignore the lesson. As an individual, of course, McArdle barely matters at all, but she's representative of a whole equivalence class of formerly hawkish intellectuals who want to emerge from this debacle without troubling themselves to rethink a single assumption.

Hence my despair and fury. The stupidity, the arrogance, the willful blindness; and these people will not be called to account. If anything, they'll be rewarded.