Friday, November 14, 2014

The goTo object pattern in Node.js code

I've been messing around with Node.js lately. Like everyone using Node.js, I've been wrestling with the fact that it forces programmers to use hand-rolled continuation passing style for all I/O. Of course, I could use something like async.waterfall() to eliminate boilerplate and deeply indented nested callbacks. However, since I am crazy, I am using Closure Compiler to statically typecheck my server code, and I don't like the way async.waterfall() defeats the typechecker.

You can stop reading here if you're perfectly happy with using async for everything, or perhaps if you're one of those people convinced that static typing is a boondoggle.

Anyway, I've been using a "goTo object" for my callbacks instead. The essence of the pattern is as follows:

  • Define a local variable named goTo whose members are your callbacks.
  • Asynchronous calls use goTo.functionName as the callback expression (usually the final argument to the asynchronous function).
  • At the end of the function, outside the goTo object, start the callback chain by calling goTo.start().

Here is an example, loosely patterned after some database code I recently wrote against the pg npm module. In raw nested-callback style, you would write the following:

var upsert = function(client, ..., cb) {
  client.connect(function(err, conn) {
    if (err) {
      cb(err);
      return;
    }
    
    conn.insert(..., function(err, result) {
      if (err) {
        if (isKeyConflict(err)) {
          conn.update(..., function(err, result) {
            conn.done();
            if (err) {
              cb(err);
              return;
            }
            cb(null, 'updated');
          });
        }
          
        conn.done();
        cb(err);
        return;
      }

      conn.done();
      cb(null, 'inserted');
    });
  });
};

With a goTo object, you write this:

var upsert = function(client, ..., cb) {
  var conn = null;

  var goTo = {
    start: function() {
      client.connect(..., goTo.onConnect);
    },

    onConnect: function(err, conn_) {
      if (err) {
        goTo.finish(err);
        return;
      }
      conn = conn_;  // Stash for later callbacks.
      conn.insert(..., goTo.onInsert);
    },

    onInsert: function(err, result) {
      if (err) {
        if (isKeyConflict(err)) {
          conn.update(..., goTo.onUpdate);
          return;
        }
        goTo.finish(err);
        return;
      }
      goTo.finish(null, 'inserted');
    },

    onUpdate: function(err, result) {
      if (err) {
        goTo.finish(err);
        return;
      }
      goTo.finish(null, 'updated');
    },

    finish: function(err, result) {
      if (conn) {
        conn.done();
      }
      cb(err, result);
    }
  };
  goTo.start();
};

This pattern is easy to annotate with accurate static types:

var upsert = function(...) {
  /** @type {?db.Connection} */
  var conn = null;

  var goTo = {
    ...

    /**
     * @param {db.Error} err
     * @param {db.Connection} conn_
     */
    onConnect: function(err, conn_) { ... },

    /**
     * @param {db.Error} err
     * @param {db.ResultSet} result 
     */
    onInsert: function(err, result) { ... },

    /**
     * @param {db.Error} err
     * @param {db.ResultSet} result
     */
    onUpdate: function(err, result) { ... },

    /**
     * @param {?db.Error} err
     * @param {string=} result
     */
    finish: function(err, result) { ... }
  };
  goTo.start();
};

Some notes on this pattern:

  • It is slightly more verbose than the naive nested-callback version. However, the indentation level does not grow linearly in the length of the call chain, so it scales better with the complexity of the operation. If you add a couple more levels to the nested-callback version, you have "Tea-Party Code", whereas the goTo object version stays the same nesting depth.
  • As in the nested-callback style, error handlers must be written by hand in each callback, which is still verbose and repetitive: the phrase if (err) { goTo.finish(err); return; } occurs repeatedly. On the other hand, you retain the ability to handle errors differently in one callback, as we do here with key conflicts on insertion.
  • Callbacks in the goTo object can have different types, and they will still be precisely typechecked.
  • The pattern generalizes easily to branches and loops (not surprising: it's just an encoding of old-school goto).
  • Data that is initialized during one callback and then used in later callbacks must be declared at top-level as a nullable variable. The top-level variable list can therefore get cluttered. More annoyingly, if your code base heavily uses non-nullable type annotations (Closure's !T), you will have to insert casts or checks when you use the variable, even if you can reason from the control flow that it will be non-null by the time you use it.
  • Sometimes I omit goTo.start(), and just write its contents at top-level, using the goTo object for callbacks only. This makes the code slightly more compact, but has the downside that the code no longer reads top-down.
  • There is no hidden control flow. The whole thing is just a coding idiom, not a library, and you don't have to reason about any complex combinator application going on behind the scenes. Therefore, for example, exceptions propagate exactly as you'd expect just from reading the code.
  • The camelCase identifier goTo is used because goto is a reserved word in JavaScript (reserved for future use; it currently has no semantics).

For comparison, here is the example rewritten with async.waterfall():

var async = require('async');

var upsert = function(client, ..., finalCb) {
  var conn = null;
  async.waterfall([
    function(cb) {
      client.connect(..., cb);
    },

    function(conn_, cb) {
      conn = conn_;
      client.insert(..., cb);
    }

  ], function(err, result) {
    if (isKeyConflict(err)) {
      client.update(..., function(err, result) {
        conn.done();
        if (err) {
          finalCb(err);
          return;
        }
        finalCb(null, 'updated');
      });
      return;
    }

    conn.done();
    if (err) {
      finalCb(err);
      return;
    }
    finalCb(null, 'inserted');
  });
};

In some ways, this is better and terser than the goTo object. Most importantly, error handling and operation completion are isolated in one location. Also, blocks in the waterfall are anonymous, so you're not cluttering up your code with extra identifiers.

On the other hand, this has some downsides, which are mostly the flip sides of some properties of goTo objects:

  • Closure, like most generic type systems, only supports arrays of homogeneous element type (AFAIK Typescript shares this limitation update: fixed in 1.3; see comments). Therefore, the callbacks in async.waterfall()'s first argument must be typed with their least upper bound, function(...[*]), thus losing any useful static typing for the callbacks and their arguments.
  • Any custom error handling for a particular callback must be performed in the shared "done" callback. Note that for the above example to work, the error object must carry enough information so that isKeyConflict() (whose implementation is not shown) can return true for insertion conflicts only. Otherwise, we have introduced a defect.
  • Only a linear chain of calls is supported. Branches and loops must be hand-rolled, or you have to use additional nested combinators. This doesn't matter for this example, but branches and loops aren't uncommon in interesting application code.

Now, goTo objects are not strictly superior to the alternatives in all situations. The pattern still has some overhead and boilerplate. For one or two levels of callbacks, you should probably just write in the naive nested callback style. If you have a linear callback chain of homogeneous type, or if you just don't care about statically typing the code, async.waterfall() has some advantages.

Plus, popping up a level, if you are writing lots of complex logic in your server, I'm not sure Node.js is even the right technology base. Languages where you don't have to write in continuation-passing style in the first place may be more pleasurable, terse, and straightforward. I mean, look: I've been reduced to programming with goto, the original harmful technology. By writing up this post, I'm trying to make the best of a bad situation, not leaping out of my bathtub crying eureka.

Anyway, caveats aside, I just thought I'd share this pattern in case anyone finds it useful. Yesterday I was chatting about Node.js with a friend and when I mentioned how I was handling callback hell, he seemed mildly surprised. I thought everybody was using some variant of this already, at least wherever they weren't using async. Apparently not.


p.s. The above pattern is, of course, not confined to Node.js. It could be used in any codebase written in CPS, in a language that has letrec or an equivalent construct. It's hard to think of another context where people intentionally write CPS by hand though.

Monday, November 10, 2014

A lottery is a tax on... people who are good at reasoning about risk-adjusted returns?

Rescued from the drafts folder because John Oliver has rendered it timely.

People who consider themselves smart sometimes joke that a lottery is a tax on people who are bad at math.

The root of this reasoning is that the expected return on investment for a lottery ticket is negative. It can't help but be otherwise: the lottery turns a profit, hence the payout multiplied by the probability of winning must be less than the price of the ticket. Therefore, the reasoning goes, the people who buy lottery tickets must be incapable of figuring this out. Ha ha, let us laugh at the rubes and congratulate ourselves on our superiority.

One rejoinder is that the true value of a lottery ticket, to the buyer, is the entertainment value of the fantasy of winning. One obtains the fantasy with probability 1 and thus as long as the entertainment value of this fantasy exceeds the price of the ticket, the buyer comes out ahead.

There is something to this. But I think a stronger claim can be made.

A lottery is the mirror image of catastrophe insurance. Note that buying insurance also has a negative expected return. Provided their actuarial tables are accurate, insurance companies turn a profit. Therefore the probability of being compensated for your loss, multiplied by the compensation, must be less than the cost of the insurance premiums. But nobody says that insurance is a tax on people who are bad at math. Quite the opposite: buying insurance is viewed as a sign of prudence.

The issue, of course, is that the naive expected return calculation fails to adequately consider the nature of risk and catastrophe. At certain thresholds, financial consequences as experienced by human beings become strongly nonlinear, probably due to the declining marginal utility of money. Suffering complete ruin due to, say, a car accident entails such severe consequences that we are willing to accept a modestly negative expected return in exchange for the assurance that it will simply never occur.

A lottery is simply the flip side of this coin. The extremely rare event is a hugely positive one, instead of a hugely negative one, huge enough to produce qualitative rather than merely quantitative changes to your lifestyle. And one accepts a modestly negative expected return, not so that one can avoid the risk, but that one can be exposed to the risk.

If you are a member of the educated, affluent middle class, there is an excellent chance that your instinct rebels, and you're already hunting for the flaw in this reasoning. Surely there's something sad, and not prudent, about those largely working-class souls who buy a lottery ticket every week, rather than rolling that $52 per year into a safe index fund at a post-tax rate of return of roughly 2.5% per annum, at which rate their investment, if it somehow survived unperturbed the ups and downs of a life that is considerably more exposed to financial risk than a middle-class person's, might compound to the princely sum of a couple months' rent by the time they die, or alternatively enough to pay for a slightly nicer coffin.

And maybe there is a flaw in this reasoning. I'm not completely convinced myself and I'm not going to start buying lottery tickets. But I'm honestly having trouble finding the flaw. Any indictment of spending (small amounts of) money on lottery tickets must surely also apply to buying insurance. If it is worth overpaying a little bit to eliminate the possibility of a hugely negative outcome, surely it is worth overpaying a little bit to create the possibility of a hugely positive outcome. The situations are symmetric, and I think one can only break the symmetry by admitting the validity of loss-aversion (usually viewed as irrational by economists) or something similar.

Alternatively, if you have access to the actuarial tables of your insurance company, then perhaps you can argue that a lottery is, quantitatively, simply a worse deal than your insurance typically is... but you probably don't have access to those actuarial tables. And I sincerely doubt that the widespread middle-class snobbery towards lottery players is based on quantitative calculations of this sort. (Actually, I strongly suspect that it is based on a fallacious, gut-instinct "folk probability" feeling that gambling on any extremely remote event, like winning the lottery, is somehow inherently foolish.)

So what's the flaw? I ask this question non-rhetorically; that is, I am genuinely curious about the answer.


p.s. None of the above, of course, is to say that I think the taxation effect of lotteries, which is staggeringly regressive, is a good thing. I would strongly support the replacement of lotteries with progressive tax increases plus transfer payments! Gambling addiction is bad too. But these things seem distinct from the notion that playing the lottery in small amounts is irrational in some game-theoretic sense.

Wednesday, October 01, 2014

On recent events in Hong Kong, and related lessons from American history

The BBC reports:

In China, an editorial in People's Daily warned of "unimaginable consequences" if the protests continued...

Unimaginable consequences! I literally can't imagine, actually. They can't roll out the tanks this time. In the age of ubiquitous digital photography and the Internet, it won't be a few grainy shots of one guy with a shopping bag.

I've been telling people for a while that I think that Chinese leaders are way too fearful of political protests. Look at the United States. Street protests may have affected the course of history a half century ago, but in my lifetime, street protests in America have proven to be ephemeral outbursts of emotion, more a substitute for change than a precursor of it. As I've said before, the system of government by elected representatives has evolved a nearly 100% effective immune system against alteration via protest.

For example, a hundred thousand people took to the streets of New York protesting the Iraq war. A decade later, we appear to be mired in eternal war in the Middle East. At this stage, I have serious doubts that the United States military will leave the Middle East in my lifetime. But those protestors, oh, I bet they felt heroic, like they were doing something world-changing when they were marching down that street in such massive numbers. Even that one inevitable guy holding the "Free Mumia" sign at the anti-war rally.

China's leaders need to learn to relax. Rather than clamping down on protests, they should learn from the West: create free speech zones, let the protestors expend their energy, let them lose focus and direction as the tide of life's million distractions gradually erodes their morale, and meanwhile make all the important decisions in incredibly boring committee meetings and reports, which are freely available (and sometimes even broadcast on television channels that nobody watches) but too numerous and too dull for any normal person to keep up with.

It turns out that if you construct your society just right, the drama and glamour of protests can be used, in a judo move, to undermine their effectiveness. The protest becomes a big identity-affirmation party, an end-in-itself. The shorthand "demo" (for "demonstration"), which is what many activist groups in the U.S. call a street protest, is apropos: the point becomes to show off. After the demo winds down, a bunch of energy has been expended, yet the same people are in charge and are even quite likely to be re-elected.

I don't know quite how we got here, and I'm not saying that I have a better answer, but I will note that if you told aliens that a widely held theory of change among many political activists on Earth was

  1. throw a parade;
  2. the government changes its behavior.

then the aliens might be confused. They might ask: what is the mechanism that you propose for this causal effect, given that the formal mechanisms for altering laws generally make no reference to parades, but rather to activities that take place mostly indoors, such as voting, drafting legislation, bureaucratic rule-making, and court decisions?

Ultimately, the Chinese government's attitude towards protest displays a deep, perhaps even touchingly naive faith in the raw power of people to change the government. It is heartwarming, in an odd way. And, all cynicism aside, I sincerely wish the Chinese people the best with the whole protest thing. Maybe they can keep it working better than we did.

Thursday, September 18, 2014

US ISPs do not deserve much credit for increasing broadband speeds

Disclaimer: I worked for Google from 2006-2013, although not on Fiber.

Towards the end of this Twitter thread sparked by Timothy B. Lee, a commenter writes (by way of defending US ISPs):

Internet speeds have increased 1500% in ten years. 250k Wi-Fi hotspots are now available. That's progress.

When I read this, I just thought, this is such bullshit. Taking it apart with sufficiently satisfying thoroughness requires more than 140 characters, so I'm going to say it here.

US ISPs are riding the wave of technological progress. Giving them credit for the wave is evidence of cluelessness so extreme as to strongly suggest intellectual dishonesty.

First, crediting ISPs for the spread of wireless hotspots is especially egregious: it's a bit like giving them credit for Moore's Law. Even if ISPs had completely failed to increase speeds beyond dial-up, people would still want local networks without the hassle of Ethernet cables. The development and spread of wireless technology was not driven by ISPs. In fact, in some ways the opposite is true, as many ISPs retained official policies against running an open wireless hotspot (or even connecting multiple devices via NAT!) long after broadband became widely available.

Second, as for a 15x improvement in ten years[0]: that might sound impressive to some people, but all I can think is that ten years equals nearly seven doublings of Moore's Law and 27 = 128. Network speed doesn't track transistor density exactly, but computing technology is full of exponential curves like this (storage density, for example, doubles even faster than Moore's Law). To anyone with a clue about computing technology, 15x in 10 years obviously sounds somewhere between mediocre and lousy. In fact, Nielsen's Law predicts compound improvement of 57x over 10 years, or nearly 4x the observed improvement claimed by Dietz.[1] When Dietz calls out 15x improvement as a talking point in ISPs' favor, he is trying to rhetorically exploit an information asymmetry, namely his audience's presumed ignorance of exponential curves in computing technology.

Therefore, the reality is that US ISPs are badly managed technological laggards, just like everyone thinks.

In fact, the pace at which ISPs have taken up technological innovation is so bad that an entrant from another industry was able to enter the market and, without even trying very hard, spank the incumbents so thoroughly that it became a national embarrassment.

Google does so many things so well that you may not even be properly surprised by this fact. Let me try to give you a visceral feel for how anomalous Google Fiber is. Consider what happened when Google entered a market where its major competitor really was pushing innovation to the limit: consider Android. Google had to dedicate hundreds of its best and hardest-working engineers to the project and enlist the support of essentially every other company in the industry, and after eight hard-fought years, the best it can show for its efforts is rough technological parity with Apple.

Or, to take an example where Google was the defender, consider what happened when Microsoft decided to enter the search engine market. Again, this is a market where the strongest competitor really was pushing innovation to the limit. Billions of dollars and millions of engineer-hours later, Bing's search quality is still slightly worse than Google's.

I don't know the head count for Google Fiber (and even if I did, it would be covered by my NDA) but I will venture a strong guess that its engineering head count is far less than Android's, like at least an order of magnitude. In Google's product portfolio, judging by the scale of investment and strategic value, Google Fiber is basically a hobby, one that Google would never even have tried if US ISPs were remotely as good at their jobs as, say, ISPs in South Korea. And yet, technologically, Google Fiber simply outclassed these incumbents, who are supposedly competing and innovating furiously to earn your dollar.[2]

Imagine you had a friend who claimed to be training super-hard at tennis. But whenever you're at the courts, you see them go out to the court, set up the tennis ball machine, and swat lethargically at balls for about ten minutes, after which they sit down to sip an energy drink and diddle around on their phone. Then, one day, your niece, who's in reasonably good shape but has never picked up a tennis racket in her life, drops by, and she walks onto the court and crushes your friend in straight sets without breaking a sweat. You might reasonably question whether your friend was really training as hard as they claim.

In short, lauding the big US ISPs for their piddling achievements is the soft bigotry of low expectations.

But don't worry, ISPs, America isn't giving up on you. We believe that if you're subjected to the right mixture of tough love, including the right kinds of competition and regulation, you're capable of achieving just as much as Google Fiber did. Or, at a minimum, we believe that the ISP market is capable of achieving that. In the meantime, we'll try hard to ignore the risible bullshit of telecom hacks who claim that you're doing well enough.


[0] Engineers generally use multiples rather than percentages to describe improvements of this magnitude. Dietz's use of "increased 1500%" rather than 15x is a classic PR hack's way of making modestly sized numbers seem gigantic.

[1] Nielsen's predictions are fitted to observations, but those observations are for "high-end users". Dietz's number, which presumably describes mean or median users, reveals how poorly ISPs have done at making high-speed internet affordable despite the march of technology. Note that we do not see this failure in other, more competitive areas of computing technology: Intel's mid-range notebook-class processors follow Moore's Law just as well as its high-end server chips.

[2] This paragraph contains a logical honeypot for telecom hacks, who are likely to see an argument that they think they can debunk, but which they can only debunk by resorting to utter bullshit, which they will promptly be called upon. See if you can spot it.

Wednesday, September 17, 2014

An anecdote from the UK, autumn 2013

With polls opening tomorrow for Scottish independence, I feel like sharing a little story. Last fall, A. and I were in Scotland and England for a couple of weeks*. In Scotland, there was more or less incessant discussion of independence: it was discussed constantly on BBC Scotland; there were leaflets, posters, and newspaper headlines; one cab driver gave us a long disquisition, during a short ride to the train station, on the evils of Blair's capitulation in Iraq and how an independent Scotland was nae goan tae be dragged intae wars over Israel and oil any longer by the U.S.

Now, if you travel through Scotland and then England as a tourist, there is a good chance that you will pull some money out of an ATM in Scotland and want to spend it in England. And although the UK shares a single currency, bills printed in Scotland look different from bills printed in England, and the Scottish variants are rare enough in England that paying with a Scottish £10 note in London elicits a moment of surprise. (Apparently it is not even technically legal tender in England, although everyone accepts it anyway.)

So, I was buying a sandwich at a Pret a Manger in London, and as the cashier was eyeballing both sides of the note, I remarked, "Got it in Scotland. Still a part of the UK, for now."

The cashier looked at me and laughed, saying instantly, "Oh, that's not going to change anytime soon."

"Really? It's all they can talk about up there. It's on the radio all the time..."

"Nah, it's not going to happen."

Her tone was casually amused, not so much Scotland shouldn't secede, that would be terrible and more Ha ha, this Yank, thinks that silliness up north is going to amount to something.

Her accent was English, of course; being American, I don't have the ear to pare it down more finely than that, but she was dark-skinned, of African or South Asian descent, and she sounded to me like any other working class London girl. I would venture to guess that this wasn't really your classic upper-class English snobbery about Northerners at work, at least not directly. I think, rather, that the English and Scottish, despite sharing a government, a currency, a language, and a relatively small island, amply interconnected by transit, media, and commerce, had developed completely different collective understandings of the state of the Scottish independence movement. There were, for example, no front-page headlines in London papers (that I can recall) about Scottish independence.

As it turns out, the Scots were right. Even if the current referendum fails, independence has clearly become a real possibility. And it's interesting to me how unseriously the English seemed to take the whole thing, even as late as last fall. Why didn't they see that how real this was? Did most English have no Scottish friends, did they not travel to Scotland? Or maybe the thought of the UK breaking apart was just too massive an event to ponder, sort of like how San Franciscans live in a state of day-to-day collective denial about The Big One?


*Photos (more than any normal person would want to see): Edinburgh, Holyrood Park, Edradour distillery, Quiraing on the Isle of Skye, miscellaneous Scotland; London, York.

Tuesday, September 16, 2014

The world's most popular functional language, and what it teaches us

I realized today, when I read the phrasing of this LtU post, that my last two posts were too pessimistic about functional programming languages. There is, of course, at least one very popular functional programming language, and that is Emacs Lisp. Emacs Lisp is even widely used, at least a little bit, by countless programmers who never use any other functional programming languages at all. But this just confirms my original hypothesis that language popularity is driven almost entirely by platform, not by characteristics of the language itself.

Monday, September 08, 2014

More on programming language adoption, from Meyerovich and Rabkin

A little bit of vindication from Meyerovich and Rabkin; a quote I found particularly interesting (emphasis added):

A given prior language only occasionally correlated with the choice of a specific different language for the next project. Most notably, developers have high propensities to switch between Windows scripting and application languages, such as VBScript and C#. These languages also correlate with Microsoft web-development languages such as ASP. Such correlations are also visible in the results of Karus and Gall [12], who found groupings such as WSDL and XML being used in conjunction with Java.

Notably, we do not see significant exploration within linguistic families. There is a relatively low probability of switching between Scheme and LISP, or between Ruby, Python, and Perl. We conclude that developer movement between languages is driven more by external factors such as the developer’s background or technical ecosystem than by similarity of the underlying languages. This implies that language advocates should focus on a domain and try to convince programmers in that domain, instead of trying to convince programmers who use languages with semantic similarities to the new language.

Note that this clearly weighs against the Chaudhuri/Hicks hypothesis that education or unfamiliarity with functional programming is the "real problem". If developers tended to choose languages based primarily on comfort and familiarity, then we would expect them to switch more frequently among languages within a family than across families. Instead we observe the converse pattern: developers switch quite freely between programming language families whenever they need to do so in order to get work done in their domain.

In fact I think Meyerovich and Rabkin are too tentative in their formulation (maybe appropriate in an academic paper, but here we don't need to be so tentative). I think it is quite unlikely that developer background is a major deterrent to new language adoption. To repeat something I said the other day, developers routinely learn all kinds of weird, complicated, and frequently frustrating technologies in the course of their work. New programming languages are not fundamentally harder than all these other technologies, and programmers will learn them when they need to in order to get work done. The problem most unpopular programming languages face is simply that nobody needs them to get work done.

Overall, people who wish to change the mix of programming languages currently in use should spend less time extolling the virtues of their language (and criticizing competing languages!), and much, much more time developing platforms and libraries to make their language of choice a stellar tool in some concrete domain.

Tuesday, September 02, 2014

Is education to blame for functional programming's minority status?

Rice University's Swarat Chaudhuri asks (and attempts to answer) the perennial question: why isn't functional programming more popular? I have my own long-running theory about why programming languages become popular (or don't), but first let me dispute a couple of specific things in the linked post. Chaudhuri writes:

The same survey also showed that the factor that correlates the most with preferring and using a language is the availability of libraries. This is certainly behind the meteoric rise of, say, Python. However, it seems implausible that this factor is the primary reason why functional programming is unpopular. If it were so, the F# language, which allows functional programmers to utilize the entire .NET infrastructure, would be a mainstream language by now. There’s no indication it will be any time soon.

I think Chaudhuri dismisses the hypothesis far too lightly. Here are three obvious reasons why F# is not a counterexample:

  • I have never programmed in F# (although I've done a little OCaml, and I gather they're almost identical), but my long experience with cross-language interoperability makes me suspect strongly that accessing nontrivial C# libraries from F# is nothing like using libraries written idiomatically in F#. It probably feels much more like calling through a foreign function interface — for example, like accessing Java classes from Jython, except possibly worse, because C# is not only a different language but a different programming paradigm.
  • There are a large variety of inevitable network effects that come from using a single language within a project. If you are going to use mostly C# libraries in the .NET ecosystem, then a sensible project manager is probably going to choose to implement the project itself in C# rather than F#. This is especially true if libraries would force you to write a lot of your code in a semi-OO style within F# anyway.
  • The .NET ecosystem has never had great mindshare in the communities where most of the "hot" industrial software development is happening: open source, backend software running in the datacenter, web development, and mobile development. Spend a little time walking around Silicon Valley and San Francisco, and see how many hackers are using Windows. If, somewhere in the sea of Macbooks, you even glimpse someone using a Thinkpad, there's an excellent chance that it's running Ubuntu. Conversely, if you see someone using Windows, there's an excellent chance they're a business suit from a large corporation (at startups, even the businesspeople use Macbooks).

    In fact, this was almost as true, last I checked (years ago, admittedly), even within the programming language research community. It is startling to me that a programming language researcher would look around, observe almost nobody they know hacking on Windows, and still ask why F# on .NET has not taken off.

    (Mono notwithstanding, my understanding is that Microsoft has never made it a priority to make .NET development a really great experience on non-Windows platforms. C# may have a lot of libraries, but Mono has always been a second-class citizen and there is an excellent chance that large swaths of the C# ecosystem depend on APIs (or, worse, subtle implementation quirks) specific to Microsoft .NET. I suspect any prudent project manager looking at the .NET ecosystem is unlikely to bet the farm on Mono.)

Next, Chaudhuri goes on to argue that the lack of university education in functional programming is to blame. Well, I won't deny that this is a contributing factor, but: few CS schools these days teach Ruby or Perl or Objective-C, yet those languages seem reasonably popular; few CS programs teach more than rudimentary use of version control, but git (i.e. the most complex version control system known to humankind) seems popular; few CS programs teach web frontend development frameworks or MVC or template metaprogramming or MapReduce (at least, not until recently, and certainly not in intro level classes), yet all those things and many more have managed to achieve significant adoption in industry.

In short, professional developers routinely adopt all sorts of complex technologies that are not taught academically. As cool as functional programming is, I just don't believe it's fundamentally that much weirder or harder than all the things modern developers use every day. If I had told you a decade ago that in 2014, a nontrivial number of professional programmers would be writing server applications and developer tools in hand-rolled continuation-passing style, you would have looked at me funny; yet here we are!

So, then, how do I explain the relative unpopularity of functional programming languages?

First, I would observe that most programming languages are not popular, period. People have invented tens of thousands of programming languages, and nearly all of them languish in obscurity. Only a very select few manage to achieve popularity. Given that functional programming languages are a minority of all languages, we should naively expect a minority of popular languages to be functional, just from random selection. The null hypothesis does a lot of work here.

Second, I would observe that nearly all popular programming languages seem to be hybrids. Consider a different programming paradigm: Smalltalk-76 was purely "object-oriented" (everything is an object, every object has a class, every class has a superclass, objects communicate strictly by sending messages), but its most popular descendants seem to be hybrids. For example, C++, Java, and Python are not purely OO.

Therefore, we should expect that a popular functional programming language would also be a hybrid. And indeed when you view things in this light, many popular languages today have adopted bits and pieces that were once viewed as features of functional programming, such as automatic memory management, first-class lexical closures, and parametric polymorphism. Functional programming purists no doubt view this ad hoc borrowing as hopelessly inadequate cargo-cultism that misses the fundamental point of functional programming, but it is nevertheless exactly what we should expect from the gradual popularization of functional programming. In the essay Real Programming in Functional Languages (1982), J. H. Morris Jr. memorably wrote:

Functional languages as a minority doctrine in the field of programming languages bear a certain resemblance to socialism in its relation to conventional, capitalist economic doctrine. Their proponents are often brilliant intellectuals perceived to be radical and rather unrealistic by the mainstream, but little-by-little changes are made in conventional languages and economies to incorporate features of the radical proposals. Of course this never satisfies the radicals, but it represents progress of a sort.

I therefore claim that some small part of FP's "unpopularity" is apparent rather than real.

However, I admit that even the combination of the previous two explanations does not seem sufficient to explain why no primarily functional programming language has become the default way to program in a popular domain. But I don't think education is enough explanation either.

So I have to fall back on my primary theory: I maintain once again that languages reach popularity via platforms.

Thus, for example, Swift will probably be a big deal, independent of almost any qualities it has as a language. Apple is the dictator for the iOs platform. It seems likely that Apple will eventually make Swift the default way to program on iOs. Therefore, Swift will become popular, despite the fact that zero people graduating from university computer science programs in 2014 were taught Swift in school.

If functional programmers want FP to be a bigger deal, then my personal recommendation is:

  • develop an industrial-quality platform for doing something that large numbers of developers really want to do, and
  • evangelize the hell out of it, with a seriousness matching that of professional DevRel teams: videos, tutorials, books, portfolio-quality demo sites in GitHub, reliable turnkey commercial hosting infrastructure if need be, etc.

Web development is one good candidate domain, since (a) web development is a clusterfuck and thus ripe for improvement; (b) web developers are culturally eager to try the new hotness (in fact, arguably a little too eager); (c) you can reach a large audience without requiring any hard technology transitions of users since everyone has a web browser.

Look, for example, at how Rails lifted Ruby from relative obscurity to the default way (at least for a little while) that startups built websites in the Valley. The web framework space is more crowded today, but the field for new ones still seems fairly open, as long as you bring something new to the table. For example, focusing on realtime interaction seems to have bought Meteor a lot of buzz, despite the fact that its backend is currently built on a broken database.

Personally I think there is an opening for a "better PHP" — for all PHP's WTF/lol, if you study Keith Adams's talk "Taking PHP Seriously" (slides) it is clear that the PHP runtime does a few things right that no other platform currently does. Of course, at this point, you're probably laughing at the notion that a bunch of functional programming mandarins is going to successfully devise something for the median PHP programmer to pick up and use. But that is the type of work that might make functional programming the default way to do stuff.


EDIT: For more evidence that Chaudhuri and Hicks are wrong, see Meyerovich and Rabkin's study on language adoption.


p.s. Bizarrely, in a comment on Chaudhuri's post, Bob Harper (whom I have tremendous respect for) claims that Java doesn't have a conditional expression. What? Am I missing something?

  Object x =
    boolExpr1 ? valExpr1 :
    boolExpr2 ? valExpr2 :
    boolExpr3 ? valExpr3 :
    defaultexpr;

Is this not just cond with somewhat uglier syntax?

Friday, August 29, 2014

Uber vs taxi wait times: non-anecdotal evidence

More on the taxis vs. Uber in San Francisco beat.

I repeat my previous assertion that, anecdotally, ">20 min" is an understatement on weekend nights, and that the decision to use only 3 buckets ("<10 min", "10-20 min", ">20 min") conceals the long tail. As in so many other applications, tail latency has a disproportionate negative effect on on user experience, and looking at even the median or mean latency is insufficient: you should measure out to 95th percentile latency at least (n.b. this is standard procedure at Google).

Saturday, August 23, 2014

Making a package.json from exact installed npm package versions

Attention conservation notice: Google-food for nodejs users.

It's easy to get a list of just the packages you directly depend on:

npm list --depth=0

However, there's no built-in way to get npm list to output results in package.json format. Here's a little shell recipe:

npm list --depth=0 \
  |tail +2 \
  |sed '/^\s*$/d' \
  |gawk -F ' ' '{print $2}' \
  |gawk -F '@' '\
     NR > 1 { printf ",\n" } \
     { printf "\"%s\": \"%s\"", $1, $2 } \
     ENDFILE { printf "\n" }'

This isn't a complete package.json, but it's a format that can easily be copied into one, via a second shell recipe or whatever else you like. (If your pipe-fu is strong you can probably figure out how to extend this to do the whole package.json in a one-liner.)

Motivation: When you npm install --save or npm install --save-dev, npm inserts packages with "the default semver operator" by default. It's easy to forget to pass --save-exact; or, if you're just doing exploratory hacking, you might not even want --save-exact. But when you're ready to cut a build for deployment, you need to capture exact package versions, because semver is basically bullshit can't be relied upon. Hence the recipe above, which can be used to generate or update the dependencies section of the package.json in a deploy directory.

Saturday, August 09, 2014

A few problems with Firefox

Disclaimer: I worked at Google for 6.5 years, although not on the Chrome team. I am currently independent. I also worked a little bit with Rob at IBM in 2001.

Rob O'Callahan has a good post explaining why you should use Firefox. I am sympathetic to this argument, but I can't bring myself to switch yet. I try periodically, and every time I end up bouncing off again. Alas, Firefox is currently inferior in specific ways that are cripplingly bad for my needs.

First, as a web developer:

  1. The developer tools are really janky.
  2. The profiles functionality is buried and has no in-browser UI. I need quick, simple profile management and switching when testing my apps. (Don't refer me to the add-on market; add-ons are a cesspool.)

Second, as someone who recommends tech to my family:

  1. Multi-process isolation and sandboxing have still not shipped.
  2. Firefox's updates are still not as timely as Chrome's. On my personal machines, I often find that I'm running an older version of Firefox weeks after a new version is released. In fact, just as I was writing this, I found that the machine I'm typing it on was still running 30.0, when 31.0 was released on July 22.

As a result of (c) and (d), Firefox is plausibly the least secure major web browser shipping today. I can't recommend Firefox to my family until these things are fixed. I won't expose them to a high risk of exploitation, here and now, solely to protect them from a theoretical risk that they'll be harmed by the Chrome team's product choices in the future.

In addition to all of the above, I think Rob overstates the extent to which Google is (1) winning and (2) likely to use that position to harm users in the foreseeable future.

Obviously, I am biased w.r.t. (2), so I don't think it's productive for me to try to convince you of my point of view in this post. At any rate, Rob is better-informed than me about browser politics, not to mention much smarter than me, so I am willing to believe that he has good reasons for believing what he does.

However, w.r.t. (1):

  • On desktop, Chrome doesn't even have majority market share. It is trending upwards, but it's a long, long climb from its present ~40% position to the 80-90% that Microsoft once had in desktop operating systems. And Google's competitors today are a lot more impressive than Microsoft's were in the 1990s. I don't think it's plausible that a large fraction of the web will build strictly for Chrome, or even Blink-based web browsers, anytime in the foreseeable future, unless all of Google's competitors fumble the ball mightily.
  • On mobile, iOs isn't going away any time soon. And I would bet (a small amount of) money that forks of Android, from China and elsewhere, will reach rough market share parity with Google Android in the long run.[0].

Today's giants always look more invulnerable than they really are. Apple looked unstoppable just a few years ago, Microsoft not that long before that. Facebook looked like it would become the identity layer for all human interaction; now it's just a boring and somewhat declassé social media site for middle aged people (plus a server farm for a few flashy acquisitions). Google may seem like a juggernaut, and to be fair I think it is much more competent on average than any of the three companies I just mentioned, but it's vulnerable in ways that aren't even obvious to us today and I'll be very surprised if we look back in 2024 and find that Google is dictating terms to the rest of the technology industry, in the old Microsoft (or new Apple) mold, rather than being merely one influential player among many.


[0] Incidentally, since Google has let the AOSP web browser languish, and restricted Google Chrome(TM) to its increasingly-tightly-constrained partners, making Firefox run amazingly well on non-Google Android — well enough that non-Google Android users almost universally either get Firefox shipped with their device or install it themselves — might be a more plausible path to getting large mobile market share globally than Firefox OS. (I do think Firefox OS is an important and worthwhile project as well.)

Monday, June 23, 2014

More on taxis

I will fully cop to the fact that my previous bitching about taxis concerned a classic San Francisco yuppie firstworldproblem but here are some more stories about how taxis in many cities serve people of all stripes exceptionally poorly.

Relatedly, I took an UberX with a friend the other night and the experience was awesome.

I might have liked the option of tipping the the driver (via the app, after you get out of the car), but UberX doesn't allow that. On the other hand, maybe tipping would establish a social norm for companies like Uber to underpay their drivers with the expectation of tips? I really like traveling abroad in countries that don't have a tipping culture; it seems more rational and arguably in the long run leaves labor in a better position since their compensation is assured by contractual terms rather than manners. So maybe it's great that Uber doesn't have tipping, only reviews.

Wednesday, June 18, 2014

What happens when you let people pay to remove ads

Disclaimer: Obviously I worked at Google for 6.5 years so take this with the appropriate grain of salt.[0]

"I wish Google would just let me pay them money to remove ads from this service. WTF GOOGLE IS SO DUMB AND EVIL WHY WON'T THEY DO THIS."

How many times, in various corners of the Internet, have you read some variation of the preceding sentiment?

Now, it turns out, Google is trying to do this with a subset of music videos on YouTube. But when you do this, you need a contract with the rights holder. This is complicated. The rights holder might not agree to your contract terms. For example, they might want more money. So you end up in negotiations. While those negotiations are ongoing, or if they break down, you can't include that music in your paid service.

The Internet's collective reaction has been: "WTF GOOGLE IS SO DUMB AND EVIL WHY ARE THEY DOING THIS?"

Now, YouTube will still host music videos that are not opted into its monetization program, i.e., if you are a band and want to put music videos on the Internet for free, YouTube will host those videos for you. Let's be clear what this means: if you want to distribute a video to a hundred million strangers on the Internet, YouTube will pay for the software and servers and a petabyte of network bandwidth and a small army of SREs holding pagers who will wake up at 3am if too many people in Kuala Lumpur click on your video and get an HTTP 50x error, and it will do so without you paying YouTube a dime. None of that's changing. But if you opt into YouTube's monetization program, you will have to opt into its full, updated monetization program: ads for non-subscribers, and no ads for subscribers.

Note that there is no sensible way to let users pay to remove ads from music videos while also still showing ads to those same users for some music videos. If YouTube did this, you can bet the Internet would collectively scream, once again, "WTF GOOGLE IS SO DUMB AND EVIL WHY DON'T THEY REMOVE ADS ON MUSIC VIDEOS WHEN I PAID TO HAVE ADS REMOVED? I'm unsubscribing from this bullshit service!" The subscription service would fail and YouTube would have to revert to its old model of monetizing via ads only.

In short, "blocking" — i.e., excluding unlicensed music from the monetization program — is an inevitable consequence of having a paid subscription service.

The press has done an abysmal job covering this. It seems that every year my contempt for (most) journalists finds reasons to grow greater and greater. I'm pretty disappointed that the Financial Times, reputedly pretty reliable, appears to be Ground Zero for this particular blast of misinformation.

p.s. As for claims that YouTube has anything approaching a monopoly on online video sharing, I'm honestly puzzled by the claim. Just to take one example competitor, Vimeo is pretty reliable and seems to have rough feature parity, including embedding in external sites. Vimeo's content is included in Google's video search corpus (example) and therefore shows up as a thumbnail image in Google web search, just like YouTube videos. If you need to make money, Vimeo has a few built-in monetization options; if those are insufficient, there's a lot of innovation occurring in the world of funding, and in my opinion the Patreon/Subbable model seems much more promising for creators with a small-to-medium-sized audience than Spotify-style monetization, which currently amounts to "we'll give you 0.0001 pennies per stream play so that both the company and the artists can lose money hand over fist!" Is it really true that Adele fans will face substantial (or even non-substantial) barriers to watching her music videos if they're on Vimeo? And that's just one competitor. The upshot is that indie labels are probably wrangling with YouTube over licensing terms not because it has anything like a monopoly, but because they think wrangling with YouTube will make them more money than all the other alternatives. Which is perfectly fine — more power to them, negotiating the best deal is essential in business, etc. — but we should not misread the situation.


[0] On the other hand, one benefit of no longer working there is that I can write stuff like this post, just like I used to before I worked there. Arguably, working at Google made me less predisposed to harshly criticizing misinformed critics of Google.

Uber and taxis in San Francisco

Uber is a national story now, but this post from a former colleague reminded me of a little fact about Uber's origins that many observers outside of San Francisco probably don't know.

Before Uber existed, San Francisco taxicabs were extremely unreliable. On busy nights, you would commonly call a cab dispatching company on your cell phone, talk to a dispatcher who promised that a cab would arrive, wait half an hour or more, and still have no cab show up. This only has to happen a couple of times, on a chilly San Francisco Saturday night while your date stands on the street shivering in her dress and heels, before you start saying to yourself, "Fuck all taxicab companies, forever."

Conversely, I speculate that it was not uncommon for cab drivers to receive a call, only to have the passengers disappear, since they'd grabbed another cab that happened by sooner. In fact, once I concluded that taxi drivers were as likely to flake out on me as not, I started doing this myself. What else could you do?

In other words, game-theoretically, taxicabs and passengers were stuck in a low-trust equilibrium. Passengers could not rely on cabs, so they would grab any cab they could, even if they'd called one. Cabs could not rely on passengers, so they would pick up any fare that looked promising, even if they had been dispatched. A taxi company could have disrupted this equilibrium by offering reliable service and building a brand name as such. None of them did. Much like your local cable company, they were collectively content to sit on their government-protected oligopoly and treat customers like garbage.

(Notice that passengers can't disrupt this equilibrium: there's no equivalent "union of passengers" to which a reputation for reliability could be attached.)

The most important thing about Uber, at least initially, was not that you called it with a handy smartphone app, or that Uber cars were fancy black limos, nor even that they always accepted credit cards.[0] It was simpler than that: Uber cars came when you called. Uber achieves this result through a variety of means, but the most obvious is simply holding individual drivers accountable for every ride.

Therefore, at least in San Francisco, the whining of taxi companies exposed to competition from Uber and Lyft is the whining of any industry that serves its customers badly, and then is exposed to superior competition that serves its customers well.

Uber's initial success in San Francisco was a key stepping stone in their rise to national prominence. One wonders whether Uber would exist today if even a single SF cab company had served customers reliably.


[0] p.s. The credit card thing. Every SF taxi rider has had this happen: the driver has the credit card reader in the car, staring you in the face, but the driver says: "Sorry, it's not working." Now, if this happened once in a while, you might believe it. But it happens every single time. Is it really the case that every single credit card reader in the SF taxicab fleet is out of order 100% of the time despite a good-faith effort to keep them operational? Or is it much more likely that the driver is trying to cheat on his taxes, which is made easier if he forces all his customers to tip him in cash? Or else the taxi company is cheating on its obligations to its drivers by keeping tips paid on credit cards for itself? And now this confederacy of cheats wants the government to protect them from competition? Do you feel sympathy for this band of cynical hypocrites?

Tuesday, February 18, 2014

Why should Mozilla continue developing Rust?

The title of this post is not (entirely) rhetorical. I'm puzzled by Mozilla's decision to burn several of its full-time engineers' cycles on the Rust programming language. Maybe someone can enlighten me. I offer the following observations with all due humility.

As a problem of engineering management, programming language development can be analyzed in terms of cost, risk, and return. I do not see how Rust makes sense for Mozilla along any of these dimensions.

Cost: Getting a new programming language off the ground only requires a handful of people, if you have the right people. For giant technology corporations, which have thousands of engineers on staff, that's a tiny relative overhead, well within the margin of a sensible research budget. By contrast, for Mozilla, a lean, low-headcount organization, it's a major opportunity cost. A handful of Mozilla's top full-time developers represents a big bite out of Mozilla's engineering throughput.

Risk: The baseline probability that a new programming language will become popular is low. It is even lower when you cannot use the popularity of a platform to strong-arm external developers into adopting it (the existence of such a platform is the story behind the success of JavaScript, Objective-C, and C#). Mozilla has no platform that it can use to strong-arm anyone into using Rust, nor does it even wish that it did. And when a programming language does not become popular, its few sponsors end up doing all the heavy lifting for it indefinitely. The language also remains a laggard behind comparable languages in libraries and tools. Mozilla isn't well positioned to mitigate this risk or to deal with its downside.

Return: The returns to adopting a better language are roughly proportional to the developer-hours spent writing code with it. As a organization with both low headcount and few projects, Mozilla will have difficulty recouping its investment:

  • Consider Google and Go. Suppose 5 core Go developers work on the language for 2 years; this costs about 5 * 2000 * 2 = 20k developer-hours. This investment can be recouped by saving 1000 Google engineers an average of 2 hours per week for just 10 weeks, or equivalently by providing a quality increase equivalent to 2 more hours per week of design, testing, etc. Viewed this way, Go has a plausibly huge upside for Google. But Mozilla will never have 1000 full-time engineers using Rust; it has a few hundred employees roughly 1k employees total, not all of whom are engineers, and few of whom might be writing Rust full-time in the foreseeable future. Perform a similar calculation using any remotely plausible numbers, and the case for the Rust's return to Mozilla comes out pretty weak. Either Rust will take far too long to pay off, or it must be orders of magnitude better than the next-best alternative (and the latter seems unlikely).
  • It is easier to realize the benefits of a better language when starting a new, greenfield project than when rewriting legacy code. In fact, when rewriting legacy code, developers incur more costs than benefits until a tipping point is reached. Google, Microsoft, or Amazon, to take a few examples, each start dozens of substantial projects and hundreds of little projects per year. At such companies, it is conceivable that, by winning over new projects at their inception, a new language could achieve an annual adoption rate of hundreds of developers per year. By contrast, Mozilla can't start more than few projects per year without losing focus. So the rate at which Mozilla can realize the benefit of a new language is constrained even further than their headcount would suggest. And indeed the core developers of Firefox, Mozilla's biggest project by far, will not benefit from Rust anytime soon.

Now, in spite of all the above, one might disregard risk, cost, and return if one had a truly unique opportunity. If you're the only organization in the world that might plausibly deliver on some project of unique importance, then it might be worth rolling the dice: "We have to save the world from X; nobody else will." Perhaps reasonable people can disagree, but I can't see how Rust qualifies. As a "fast, safe, concurrent, static, systems language", Rust is quite comparable to Go or D; even if one grants, for argument's sake, that it's better than those languages in some ways, it occupies a similar space. So Mozilla's opportunity is simply that it has the money and freedom to pay a few software engineers to work on Rust full-time. The "unique opportunity" rationale for overriding cost/risk/reward does not seem to apply.

Alternatively, one might rebut all of the above calculations by saying that Rust is a research project, not strictly an engineering project. But it is a curious research project. Research justifies its existence by hoping to advance computer science in a fundamental way. Rust conversely aims to minimize novelty, as its FAQ states: "Old, established techniques are better [than novelty]". I cannot think of a fundamental computer science question that is investigated by Rust's development. It is research only in the sense that it is a speculative engineering project without an obvious short-term payoff.

Furthermore, all of the above must be considered in light of Mozilla's current position. Mozilla is fighting to maintain relevance in a world where all the major desktop and mobile operating system vendors are shipping high-quality web browsers. And Mozilla is fighting with far less resources than its competitors. Mozilla punches far above its weight, and for that it should be proud, but I doubt that it has the luxury of diffusing its focus at this time. Mozilla is not like the IBM or Microsoft of old; it cannot plow huge surplus profit margins into basic research without subtracting needed resources from product engineering.

Therefore, in my opinion, Mozilla should spin out the Rust project to community maintenance, and try hard to convince its paid engineers to work on something with a cost/risk/return/opportunity profile better suited to Mozilla's place in the world. Rust developers are experts in programming language implementation, so ES6, asmjs, or Firefox and Firefox OS developer tools all seem like promising alternatives.

Or, if Mozilla really believes that reducing dependence on C++ is a major strategic priority, these developers might investigate rewriting Firefox components in another language — one for which they won't have to do most of the heavy lifting in toolchain and library development. Delegating language development to someone else makes far more sense than doing it in-house. By way of analogy, Mozilla does not try to innovate in datacenter design. To do so would be ludicrous, as Google, Amazon, Facebook, etc. are all far better-positioned to do so. Instead, Mozilla is happy to reap the fruits of robust multi-party competition in datacenters. They should take a similar approach to programming language design and implementation (outside of the web client platform, where they are unavoidably a key contributor).

A third alternative might be to find a deep-pocketed external sponsor to take on the heavy lifting for Rust. Samsung made some noise a while ago about sponsoring Rust; but how many active Rust contributors come out of Samsung's headcount rather than Mozilla's? I believe that the key cost of Rust development for Mozilla is the opportunity cost of expert Mozilla developers' labor. Until Samsung or some comparable organization credibly commits not only financial sponsorship but long-term ownership and headcount to Rust development, its costs will not be truly offloaded from Mozilla.

p.s. I say all of the above as someone who likes what I've seen of Rust, purely as a programming language. I would be happy if some organization took up the Rust banner and carried it forward. But I do not think Mozilla is a logical primary sponsor for it.