Tuesday, May 11, 2010

How to design a popular programming language

This has been kicking around in my brain for at least half a decade, and if you know me well then I've probably spoken it aloud in your presence; so it's high time to get it down in writing. Here is my Grand Unified Theory of Programming Language Adoption. There are three steps:

  1. Find a new platform that will be a huge success in a few years.
  2. Make your language the default way to program on that platform.
  3. Wait.

That is all. Note that none of the above steps has anything to do with the language design itself. In fact, nearly all popular languages are terribly designed. Languages become popular by being the "native" way to program a certain kind of system. All of history's most widely used programming languages fit this model — Fortran (scientific programming), C (Unix), C++ (MS Windows), JavaScript (web pages), Objective-C (Mac OS X), . . .

Or, in fewer words: Languages ride platforms to popularity.

Why is this so? Well, to a first approximation, no piece of software ever gets rewritten in another language; and once a critical mass of software for a platform has been written in one language, nearly all the rest will follow, for two reasons:

  • Nobody has figured out how to make cross-language interoperability work well.
  • The network effects from language adoption are immense. Programming is, despite appearances, a deeply social profession. To write successful software quickly, you must exploit the skills of other programmers — either directly, by hiring them, or indirectly, by using library software they've written. And once a language becomes the most popular in a niche, the supply of both programmers and libraries for that language rapidly accumulates to the point where it becomes economically irrational to use any other language.

In fact, I claim that in all the history of programming languages, no language has ever successfully unseated the dominant language for programming on any platform. Instead, a new platform gets invented and a new language becomes the "founding language" for that platform.

Well, OK, there are exactly two exceptions: Java and Python. It took me a while to figure out what happened in those cases, and the answers I came up with were surprising (to me).

Java is anomalous because although it is widely used in its primary domain (Internet application servers), it is not predominant, the way that e.g. C++ is predominant in writing native Windows GUIs. My explanation is that the web architecture has a uniquely high-quality interoperability protocol in the form of HTTP and HTML(/XML/JSON/...). Hey, stop laughing. HTTP and HTML fail all kinds of subjective measures of elegance, but they succeed in isolating clients and servers so well that it is economically viable to write the server in any language. In other words, as unbelievable as it sounds, HTTP and HTML are the only example in history of cross-language interoperability working really well.
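To make the isolation argument concrete, here is a minimal sketch using only Python's standard library. The handler name, the `/hello` path, and the JSON payload are made up for illustration. The point is that `fetch_greeting` depends on nothing except HTTP and JSON, so the server half could be rewritten in Java, Perl, or anything else without the client noticing.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class GreetingHandler(BaseHTTPRequestHandler):
    """Hypothetical server side; any language that speaks HTTP could replace it."""

    def do_GET(self):
        body = json.dumps({"greeting": "hello", "path": self.path}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging for the demo


def fetch_greeting(base_url):
    """Client side: knows only the URL and the JSON shape, not the server's language."""
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(base_url + "/hello", timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def run_demo():
    # Port 0 asks the OS for any free port on the loopback interface.
    server = HTTPServer(("127.0.0.1", 0), GreetingHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return fetch_greeting(f"http://127.0.0.1:{port}")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

The only contract between the two halves is the wire format, which is exactly what makes swapping the server's implementation language cheap.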

I'll abandon this explanation if I can find, in all the annals of computing, another protocol that connected diverse software components as successfully as HTTP and HTML. The only things I can think of that come close are (a) ASCII text over Unix pipes or (b) ODBC, and neither of these provides nearly the same richness or connects components of similar diversity.

Python is anomalous because rather than riding a new platform to success, it simply seems to be displacing Perl, PHP, etc. in the existing domains of shell scripting, text processing, and light web application servers. My explanation is that Python appears to be the only language in history whose design was so dramatically better than its competitors' that programmers willingly switched, en masse, primarily because of the language design itself. This says something, I think, both about Python and about its competitors.

Incidentally, this theory predicts that all the new(ish) programming languages attracting buzz these days — whether Ruby, or Scala, or Clojure, or Go, or whatever — will fail to attract large numbers of programmers.* (Unless, of course, those languages attach themselves to a popular new platform.)


UPDATE 2010-05-15: Reddit and HN weigh in.


*Which is fine. Very few languages become hugely popular, and in fact nearly all languages die without ever seeing more than a handful of users. Being either influential (so that later languages pick up your ideas), or even merely useful to a significant user population, are fine accomplishments.

28 comments:

  1. I do not think and do not see Python displacing Perl in shell scripting at all.

  2. @anonymous: this is funny, because I used to use Perl as a more secure and more portable replacement to shell*. Now I use Python as a more portable replacement for Perl, because it includes the batteries.

    * Despite careful attempts at Bourne compatibility, and having my scripts running on bash and FreeBSD's /bin/sh, the Sun machines at college in 2001 choked badly on them.

  3. Python seems to be replacing Perl as the glue of Linux. It used to be all the system scripts were Perl, now a modern Linux distro can't function without Python.

    I also see Python replacing FORTRAN in the scientific computing arena.

  4. Does that mean Python is the greatest language ever created?! I knew it!

  5. do you consider the JVM or .net as a next platform?

  6. I see that python is slowly replacing perl, but ruby is also helping against perl.

    Personally, I use ruby.

    Perl came to popularity ages ago with hardly any competitor back then.

    Nowadays you have so many languages hitting on good old perl ...

    The reason I think ruby rocks is because it is beautiful.

    Python is beautiful too, but not as beautiful as ruby. Ruby feels like an art, creativity - and there is more than one way to do it means you can be more creative as well.

  7. Other protocols which have managed to connect diverse software components even more successfully: IPv4, TCP, UDP. IPv4 was so good at this purpose that it has become ubiquitous.

    These days an operating system is not complete unless it includes a TCP/IP stack whereas it is acceptable not to include an HTTP server for the majority of users.

  8. It's worth pointing out that Ruby has already attached itself to a popular new platform, and you're right -- that's the reason for the vast majority of its success at this point.

    Not that I think it'll ever rival C++ for popularity or anything. But I agree, Ruby won't go away as long as Rails doesn't go away, which looks to be for a while yet.

  9. When the Macintosh first arrived, the programming language and API of choice was Pascal as documented by Apple in the original SDK. This later was displaced by C in documentation and practice around the time Lightspeed C and later Metrowerks became available.

  10. If you are correct, then Erlang might have pretty good chances. The current emerging platform is "The Cloud" along with all the "NoSQL scale-at-will" solutions, and if you look at them there are quite a few being written in Erlang (CouchDB, Riak, RabbitMQ, ejabberd...).

  11. This seems like it would be extremely difficult.

  12. Why did Python displace Perl instead of Ruby? Is Python better designed?

    One issue to consider is that the Ruby culture tends to be looser and more anti-corporate, whereas python suffers no such drawback.

  13. I agree. I also think Java is less of an exception than you say: Java runs on the "web platform", which happens to encompass both browsers and servers. But part of Java's success was also the pent-up frustration with C++, which was poorly suited for large-scale business application programming. What about Ruby?

  14. Another cross platform language would be SQL

  15. - TCP IPv4 etc are all part of the HTTP/HTML originally mentioned. Author was keeping it simple

    - Java rode its own cross-computer platform: the JVM, enabling windows, linux, and *nix development across the board.

    - The cloud is still a web platform, so no language will dominate it

    - the new platform that matters is the smartphone OS, so it is currently android and ObjC/iPhone battling that out.

  16. I moved from Perl to Python. Python is much better in a team environment or if you want to write legible code that others can work on. Perl object orientation is nasty to boot.

  17. Java did ride a platform to success. Java replaced *COBOL* and, from there, became the "enterprise" web language.

  18. In response to "constance eustace":

    TCP/IP are not "part" of HTTP/HTML. HTTP relies on a TCP connection. IP and TCP were developed first and became the predominant mode of internet communication. Not just for the web, but for nearly all distributed systems. TCP/IP is even more widespread than HTTP/HTML. If the author wants to lump protocols together, then generalizations should be made about TCP/IP, not HTTP/HTML.

  19. Where do C# and the .NET framework fit into this analysis?

    C++ was indeed the dominant language for Windows development, but it has been largely displaced by C#.

    Surprisingly, C# has also displaced Visual Basic, which was very widely used whenever an 'application' was actually a GUI sitting over components written in C++. C# is technically superior to VB, by a wide margin - indeed a vast gulf when you consider inheritance - but there are very few languages with VB's ease of use for the part-time and intermediate-level programmer.

  20. > Nobody has figured out how to make cross-language interoperability work well.

    Having two languages co-hosted on the JVM means language interop can be fairly trivial. This is why Clojure has gone so far despite being barely two years old--it's effortless to leverage the huge corpus of existing JVM libraries.

  21. The CGI RFC (3875), which defines the interaction between a web server and the scripts it invokes, is a literal extension of the Unix command line.

  22. I would also like to add that a language such as Ruby (considered a scripting language) might not get such a huge base of programmers, but it can be integrated into a program that's written in one of the major languages such as C/C++, for configuration for instance, so that people use Ruby with[out] noticing. My point is that a new language does not have to ride a new platform (as in an OS, if that's what you mean); it can also ride a famous application written in one of the major languages.

  23. @Jeff Satterley: No, Lisp is still the greatest language ever created.

  24. I wonder... if stagnation may have also been a factor in Python eating away at Perl's dominance in its domain. Perl 6 has taken such a long time to materialize, and the 5 line hasn't been overly active until recently. Perhaps this lack of change in the Perl line left an opening for any new or frequently updated language to take hold where it might not otherwise have been able to. On an only semi-related note, I also wonder if fundamental hardware changes could impact a new language's ability to overtake the current dominant language in a particular domain. For example, if Scala or some other language came along and made it vastly easier to program for concurrency - coupled with the rapid increase in the number of processors/cores on the average computer - could this displace Java?

  25. @Phil Hagelberg:
    I think the author means *all* languages potentially interoperating well. HTTP is a good example because even many of the most obscure languages can at least do TCP/IP in order to create an HTTP library, or they have HTTP out-of-the-box. With what you're suggesting, "interoperability" means all languages have to run on something like the JVM (or CLR, etc.). That's not a solution, especially when there's a reason languages like Erlang can't really run on the JVM very well, but run well on its own BEAM virtual machine. The JVM isn't a good fit for every language. Languages that require tail-calls (Scheme, Erlang, etc.) or first-class continuations don't necessarily have an easy time being implemented on the JVM.

    @Anonymous (who switched to Python)

    I agree. Regardless of Python the language, its strategy was the right one:

    1. Cross-platform batteries included. If not included, it's really easy to install 3rd party libs (easy_install, Windows installer packages readily available). I don't make the distinction between built-in and 3rd party Python libs, because they're just as easy either way. No fiddling with flaky alpha-quality custom compiling, etc. I can't say that for 99% of languages.

    3. Make it easy to understand (one way to do most things, focus on reading code over writing code but still make writing easy)

    4. Sane licensing


    I like the LISPs (Common Lisp, Scheme, etc.) but it's impossible to find ODBC support, decent GUI support, cross-platform support, and sane licensing all at the *same time*. Scheme sucks at ODBC support, Common Lisp sucks at GUI support, etc.

    I WANT to learn other languages, and have dozens of them on my computer (Haskell/GHC, Pharo Smalltalk, Clojure, OCaml, Erlang, FreePascal/Lazarus, PLT-Scheme/Racket, Gambit, etc.), and I like and use many of them (at least for fun, with Erlang being one practical exception). But for real programs, it's all the same theme: PROGRAMMERS WANT POLISH AND CONVENIENCE, NOT HOOPS TO JUMP THROUGH TO GET OFF THE GROUND.

  26. How to design a popular programming language ?

    The "Method" was "invented" long ago, the only thing that has changed is the Hardware.


    If I am incorrect about that then here is "Rob's Method":

    1. Design for the Hardware that _many_ people already have as well as what they will likely be getting in the Future.

    - You can do that by "developing" a 'Language' that _many_ people already know and will likely know in the Future.

    -- - You can do that by "re-abstracting" what you "truthfully want to do".

    --- - If what you really want to do is "have your Hardware produce the correct result quickly" then you desire to "'talk' to the Hardware directly" (IE: Use Assembly Language).

    --- - Since that is non-portable and difficult (at the present time AND for the foreseeable Future) we will not "'talk' to the Hardware directly" but instead use "C" and "OpenCL" to implement a new Language, that will be easiest and most portable.


    2. "Don't put the Cart before the Horse".

    - You can avoid that problem by "developing" a 'new' "Operating System" to run the new Language that you will create as _fast_ as possible, this is also _critical_.

    -- - You can do that by "re-abstracting" what you "truthfully want to do".

    --- - If what you really want to do is "have your 'Operating System' produce the correct result quickly" then you desire to "'talk' to the Hardware directly" (IE: Use assembly Language).

    --- - Since that is non-portable and difficult (at the present time AND for the foreseeable Future) we will not "talk to the Hardware directly" but instead use "C" and "OpenCL" to implement a new 'Operating System', that will be easiest and most portable.


    ---- - You can do that by "re-abstracting" what you "truthfully want to do".

    [***** Note: my "Argument" becomes _partially_ "Recursive" *****]

    ----- - If what you really want to do is "have your 'Operating System' produce the correct result quickly" AND you desire a portable solution we will use "C" and "OpenCL" to [***** Note that "Recursion" partially un-winds here. *****] implement a new 'Operating System', that will be easiest and most portable.


    Do you see where the "Argument", or "Solution" goes ?


    You say:

    "There are three steps:

    1. Find a new platform that will be a huge success in a few years.
    2. Make your language the default way to program on that platform.
    3. Wait. "


    I have suggested that for your first Point we use "C" and "OpenCL" to abstract the Specification [Note: That is a "Technical" Term, see: http://en.wikipedia.org/wiki/Formal_specification ] of our Hardware.

    - In order to produce the "C" and "OpenCL" code so we are "certain, without exceptions or restrictions" that it is correct we can use "ACL2s" to produce our Code by abstracting the Platform of "cl-opengl" to provide the "Lisp Machine" for "ACL2s" to run on.


    ACL2s will compile an executable that is "CORRECT" and runs "QUICKLY" (AFTER it is compiled).

    We will let the "'C' Language Optimizer" (GCC) cause it to run as fast as it can (get something else (we already have) to do the "heavy lifting").

    Note: Along with not putting the Cart before the Horse we do not reinvent the Wheel.


    When we know that everything is running COMPLETELY correctly and as quickly as it is able THEN we can write that new Language you were talking about...

    ...

    TBC,
    Rob

  27. Continued.


    References:

    ATI Stream for X86 OR (not and) OpenCL/DirectX11 Graphics Cards:
    http://developer.amd.com/gpu/ATIStreamSDK/pages/Documentation.aspx


    cl-opengl
    http://common-lisp.net/project/cl-opengl/

    ACL2s
    http://acl2s.ccs.neu.edu/acl2s/doc/



    In Point two you mention:

    "Make your language the default way to program on that platform."

    We do that by giving everyone ONE OPINION (for each line of Code) which they may choose to express OR not, the right to revise it, and the ability to Vote for their own Opinion (if they chose to give it) or someone else's "Line of Code".


    Ultimately, or if I am mistaken about that, penultimately (tell me what comes FIRST):

    In point three you say "Wait".

    Here is the outline for the "Operating System / Language to run on it":

    This is how the Operating System will Boot.

    1. Well wait I shall as must we all and then I will ask you "when do you want your answer" and "what do you want me to do" ?

    I imagine you desire your answer to be correct and to receive it as quickly as possible - or am I mistaken ?

    What you will want me to do is to Boot the OS - or am I mistaken ?

    We will abstract how we do this (this is where "Wait" comes in, you get TOTAL control over the duration, how is that for power and expressiveness), we will all decide what is best and that is what we will end up doing.

    In order to "Start" this "OS" as fast as possible in the quickest manner possible so that the "Multi-Core Lisp" can run the ACL2s Interpreter/Compiler (to produce correct "C" Code that can be Optimized to fast code by GCC), to write the Language we will use.

    Sort of like the "2 Penny Nail".


    Here is either the "Start" of all of this (or if I am mistaken about that then someone else will say that their code should execute before mine OR that their code replaces mine and is better) - then we all Vote.



    /* Rob's New OS/Language */

    #include <stdlib.h>

    /* Stub Code: the return value 0 is a placeholder, the original left it unspecified. */
    long long function_of_contribtor_0000000000001__contribtion_0000000000001_(void) {
        return 0LL;
    }

    /* Rob's Contribution for the first line: it calls Contributor ONE's function. */
    void function_of_contribtor_0000000000000__contribtion_0000000000001_(void) {
        function_of_contribtor_0000000000001__contribtion_0000000000001_(); /* Contributor ONE's Contribution */
        return;
    }

    int main(void) {
        function_of_contribtor_0000000000000__contribtion_0000000000001_(); /* Rob's Contribution for the First Line of Code. */

        exit(EXIT_SUCCESS);
    }

    Thanks,
    Rob


    Note: Please add to this program OR fix my code.

  28. Is the next Contributor stuck and in need of a suggestion ?


    We can add a comment at the end of our code to suggest what is NEXT but we presume that no other code will precede, or is better than our own, IF we choose to write any code, otherwise we just vote (once for each line of Code).

    (Note: The "Language" is expanding already.)


    My suggestion is that the _first_ Contributor's Code should check that _MY_ Code is correct (not a Virus or somehow the memory got corrupted).

    IF everything looks OK then _you_ can simply return TRUE (a 1) and allow the _next_ person to use that value to determine IF their Code should RUN _OR_ if their Function should exit with a FAILURE ("Environment Corrupted, no means to Repair") Error.

    OBVIOUSLY the THIRD contributor MUST NOT copy the First or Second Contributor's Code (that would make the OS/Language slower and possibly pointless (if everyone did the same thing)).

    That is my suggestion to the Second Person, the Third person may wish to declare that the OS is at "Time 0" and Operating Correctly by returning a "1" to the Fourth Contributor.


    Eventually we will get to the Person who causes ACL2s to execute
    and we can Compile the Language we Vote to develop.

    Thanks,
    Rob
