Friday, February 26, 2010

Q: Why does string get tangled up in knots?

A: Because it can.

This is not a joke answer. The question is very nearly tautological. A knot is, by definition, a conformation of string which resists untangling. If you perturb a length of string randomly, then at any given time it may either become tangled in a knot or remain free. If it is in a knot, it will resist becoming a non-knot; if it is not a knot, it is free to change. If a knot is possible, one will eventually emerge. The only way that a knot could fail to emerge is if it were completely impossible for an unknotted string to tangle into a knot.

A very similar question would be why a shaken ratchet eventually turns forward. Of course, nobody would ask that question. The lower dimensionality means that the answer is transparent.

There is a similar result in evolutionary population dynamics which states that as time goes to infinity, given a fixed population cap and a randomized chance to reproduce, every species goes extinct. Intuitively, if you roll k dice infinitely many times, then eventually (with probability 1) some roll has them all come up sixes.
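To put numbers on the dice intuition, here's a minimal Python sketch. It assumes fair six-sided dice and reads "they all come up sixes" as all k dice showing six on the same roll; the function names are mine, invented for illustration.

```python
import random


def prob_all_sixes_within(k: int, n: int) -> float:
    """Exact chance that at least one of n rolls of k fair dice is all sixes."""
    p_single = (1 / 6) ** k          # chance that a single roll of k dice is all sixes
    return 1 - (1 - p_single) ** n   # complement of "it never happened in n rolls"


def rolls_until_all_sixes(k: int, rng: random.Random) -> int:
    """Simulate rolling k dice over and over; return how many rolls it took."""
    rolls = 0
    while True:
        rolls += 1
        # One roll of k dice; all() stops at the first die that isn't a six.
        if all(rng.randint(1, 6) == 6 for _ in range(k)):
            return rolls


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, prob_all_sixes_within(2, n))
    print(rolls_until_all_sixes(2, random.Random(0)))
```

The first function climbs toward 1 as n grows, for any fixed k. The second is the ratchet in miniature: the loop can only exit through the all-sixes state, so given enough shaking it always does.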

The many, many applications of this principle are left as an exercise for the reader.

Wednesday, February 24, 2010

Tab completion for meta-bang shell commands (Wednesday Emacs blogging)

You probably use M-! to run a quick shell command now and then, when you don't want to be bothered with a full M-x shell. But if, like me, you're a former XEmacs user, then you probably find FSF Emacs' default lack of tab-completion for files in the minibuffer rather annoying. Well, your pain ends here:

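;; Only needed on FSF Emacs; XEmacs already completes filenames here.
(unless (string-match "XEmacs" emacs-version)
  (defadvice read-from-minibuffer
    (around tab-is-pcomplete-in-minibuffer activate)
    "Bind TAB to pcomplete in minibuffer reads."
    ;; Work on a copy so the global minibuffer-local-map is left untouched.
    (let ((keymap (copy-keymap minibuffer-local-map)))
      (define-key keymap "\t" 'pcomplete)
      ;; Argument 2 (zero-based) of read-from-minibuffer is KEYMAP.
      (ad-set-arg 2 keymap)
      ad-do-it)))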

Ta-da. Now when you M-! mv LongAnnoyingFileName.java LongAnnoyingFileNameFactory.java, you'll be able to tab-complete the filename.

Incidentally, who knew that elisp supported aspect-oriented programming? Apparently it does. Astonishing. I owe this tip to a co-worker, who I'd name except that I doubt he'd want to be associated with the other content on this blog. (p.s. AF, if you ever run across this post and don't mind being credited, I'll happily add your name.)

Wednesday, February 17, 2010

Meta-slash performs dabbrev-expand, and you require this knowledge (Wednesday Emacs blogging)

If you don't know this one already, then go to an emacs window right now, open up any source file (~/.emacs works fine), navigate to any function, and type the first couple of characters of a nearby identifier. Then type M-/. Ta-da!

Details: By default M-/ is bound to dabbrev-expand, which triggers the dynamic abbreviations facility. This dynamically compiles a dictionary from nearby identifiers in the source file, and offers matching identifiers as completions for the current token, preferring identifiers closer to the cursor over more distant ones.

For bonus points, type M-/ multiple times to cycle among recent matches, or use C-M-/ to pop up a list of matching completions in another buffer.
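For the curious, here's a rough Python model of that search order. It's a sketch of the idea, not Emacs's actual implementation: the `expansions` function and the `TOKEN` pattern are made up for illustration, I've assumed the search looks backward from the cursor first and then forward, and real dabbrev has more knobs (case handling, searching other buffers) that I've ignored.

```python
import re

# Hypothetical notion of an "identifier" character; Emacs uses the buffer's syntax table.
TOKEN = re.compile(r"[A-Za-z0-9_-]+")


def expansions(text: str, cursor: int) -> list:
    """Return candidate completions for the token that ends at `cursor`."""
    before, after = text[:cursor], text[cursor:]
    match = re.search(r"[A-Za-z0-9_-]+\Z", before)
    if not match:
        return []  # the cursor is not at the end of a token
    prefix = match.group()

    # Tokens before the prefix, nearest first, then tokens after the cursor in order.
    preceding = [t.group() for t in TOKEN.finditer(before[:match.start()])][::-1]
    following = [t.group() for t in TOKEN.finditer(after)]

    seen, result = set(), []
    for token in preceding + following:
        if token.startswith(prefix) and token != prefix and token not in seen:
            seen.add(token)
            result.append(token)
    return result
```

In this model, each press of M-/ steps one entry further along the returned list, which is why the nearest match comes up first.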

dabbrev-expand isn't as sophisticated as the semantically aware tab-completion available in many IDEs. On the other hand, it works with no modification in almost every buffer type under the sun, so you can use it when editing code in elisp, Java, or the language you invented this morning. It even completes reasonably well when editing English prose (although since token prefixes are far less distinctive in an English document, it's only worthwhile for longer words).

Meanwhile, because dabbrev-expand's algorithm is so simple, it doesn't require a heavyweight background process to scan all your project files and keep an in-memory database up-to-date. That kind of background indexing is, of course, a typical IDE pitfall. You'll never be waiting for Emacs to repopulate the dabbrev-expand database after you refresh all the files in your project checkout.

I'm a little embarrassed to admit that I only learned this keyboard shortcut a couple of months ago. Yes, that's right, I've been typing all my identifiers manually (or using M-w/C-y to copy-and-paste) for my entire freaking career. I estimate that my long-term danger of RSI declined dramatically the day one of my teammates mentioned this feature.

(On the other hand, my incentive to keep names short has been reduced slightly, and I wonder what effect this will have on the code that I write. It seems to me that although names that are too short can be cryptic, it's good to keep code as concise as it can be, consistent with maintaining clarity. Along similar lines, I suspect, for example, that IDEs which make it too easy to extrude large volumes of boilerplate code, or to import functions from many different modules, result in looser, less organized code.)

Tuesday, February 16, 2010

The state of Kindle backups and data portability, February 2010

I recently plugged my Kindle into my workstation's USB port for the first time. In ordinary operation, there's no need whatsoever to do this, but I wanted to try backing up my ebooks. Also, ever since writing my earlier post, I've wanted to confirm my suspicion that the current Amazon DRM scheme is more akin to Apple's FairPlay "speed bump" than a serious playback control technology.

In short, it is.

The Kindle connects as an ordinary USB mass storage device with a simple folder structure, containing four root-level directories:

  • Audible: audio ebooks? (empty in my case)
  • documents: ebooks
  • music (empty in my case)
  • system: Not exactly what it sounds like. It doesn't actually contain the operating system, only auxiliary data files used by system software. I suppose it's sensible enough not to let some clueless user bork their OS by accidentally dragging this folder to the trash. (I suspect that there's a backdoor code that will mount the OS/firmware as well; at least, that's how I'd design this device if I were a developer and wanted to debug it.)

For each ebook, the documents folder contains at least one .azw, .azw1, or .tpz file, and usually a .mbp or .tan file that stores some auxiliary data. Your "clippings" file (containing excerpts that you highlight or note) is stored as a plain .txt file (yay).

Free samples and free public domain ebooks from Amazon are not DRM-restricted. Purchased books, of course, are.

Incidentally, no technology in the Kindle device prevents copying. As noted, the Kindle mounts as an ordinary USB mass storage device, and it is inherent in the filesystem abstraction that you can do simple things like copy the entire contents of the documents folder onto your hard drive. You can do it once or a thousand times, and no technology even tries to stop you. This is an inherent function of the type of device that Amazon has made.

What the files' DRM prevents, in theory, is "playing back" the files' content on some other device after it has been copied. But of course, it doesn't really do that in practice. Without going into details, there are downloadable programs on the Internet, widely available in source and executable forms, which can extract the contents of a restricted AZW file.*

So, in short, it's trivial for you to back up your ebook library. If you're a programmer, it's also pretty easy to write a script that will harvest your entire ebook library, shuck off the obnoxious DRM enclosure, and transcode the contents into some other format. Nontechnical users, unfortunately, don't have easy access to the DRM removal/transcoding step, although this may change as the transcoding software matures and distribution channels route around the legal jurisdictions where this software is banned.
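Since the backup half really is just a file copy, here's a minimal Python sketch of it. The function name `backup_documents` is one I made up, and any mount point you pass in is your own system's business (the one in the note below is purely hypothetical). It copies files byte for byte and does nothing whatsoever about DRM.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_documents(kindle_root: str, backup_root: str) -> Path:
    """Copy <kindle_root>/documents into a new timestamped folder under backup_root."""
    source = Path(kindle_root) / "documents"
    if not source.is_dir():
        raise FileNotFoundError(f"no documents folder at {source}")

    # One folder per run, so earlier backups are never overwritten.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    destination = Path(backup_root) / f"kindle-documents-{stamp}"

    # Plain recursive copy; the ebook files are left exactly as Amazon shipped them.
    shutil.copytree(source, destination)
    return destination
```

On a machine where the Kindle happened to mount at /media/Kindle, you'd call something like backup_documents("/media/Kindle", "/home/me/backups"). Two runs within the same second would collide on the folder name, which seems an acceptable limitation for a sketch.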

Anyway, as I wrote in my earlier post, I'm hoping that the content production cartels will eventually realize that DRM serves Amazon's interests, not theirs, and abandon even the "speed bump" DRM currently in place. In the meantime, I've found that the value of having a dozen unread books in my bag at any given time, and being able to buy and read a book instantly at midnight on a Sunday, is sufficiently huge that I'm willing to make the compromise.**


*Even if this weren't true, people determined to infringe copyright for monetary or other gain will do so. DRM does not prevent the widespread, willful, uncompensated distribution of copyrighted content. The only thing that DRM does is prevent legitimate paying customers from getting the value from their books that they have been promised by electronic booksellers' use of the phrase "buy this book".

**A compromise, incidentally, that I was never willing to make with iTunes DRM. I suppose that the value I get in my life from reading greatly exceeds the value I get from music.

Monday, February 15, 2010

Q: How is Spock like a fortune cookie?

A: His major lines are vastly improved when you add the suffix "in bed". For example, in last year's Star Trek movie:

  • (To Uhura): "I need everyone to continue performing admirably. In bed."
  • (To himself): "Do yourself a favor: put aside logic, and do what feels right. In bed."
  • (To himself): "As my customary farewell would seem oddly self-serving, I will simply say: Good luck. In bed."
  • (To Kirk): "I will not allow you to lecture me about the merits of emotion. In bed."
  • Kirk: "You know, traveling through time, changing history... that's cheating."
    Spock: "A trick I learned from an old friend. In bed."
  • Spock: "Furthermore, you have failed to understand the purpose of the test."
    Kirk: "Enlighten me again."
    Spock: "The purpose is to experience fear, fear in the face of certain death, to accept that fear, and maintain control of oneself and one's crew. This is a quality expected in every Starfleet captain. In bed."
  • Bones: "You know, back home we have a saying: 'If you wanna ride in the Kentucky Derby, you don't leave your prized stallion in the stable.'"
    Spock: "A curious metaphor, doctor, as a stallion must first be broken before it can reach its potential. In bed."

Thursday, February 11, 2010

Electronic goods markets: end-to-end wins?

Hypothesis: As industries of cultural production adapt to digital distribution, content publishers in each industry will follow, with minor variations, the four-phase pattern set by the music industry:

  1. Denial: Publishers pretend that digital distribution does not exist, attempting to salvage business models based on distribution of physical media. In some cases, publishers use the legal system to try to make this fantasy a reality. Regardless of the legal outcomes, this proves unsustainable in the long run. During this phase, publishers may make halfhearted forays into digital publishing, which invariably fail because they are deeply and deliberately user-hostile.
  2. Faustian Bargain: A technology company designs a system which disguises computers' fundamentally general nature with a fig leaf of DRM. The disguise allows this company to strike a deal with major content publishing cartels to distribute content. Because a technology company has taken control of the technology, the system finally works in a way that doesn't make customers want to tear their hair out. The DRM system fails to prevent widespread copyright infringement, but it provides a hook for the technology company to build a vertically integrated stack which is somewhat inconvenient for customers to exit.
  3. Clash of the Titans: Publishers realize that they are in a weakening bargaining position with respect to the technology company, which has acquired considerable monopsony power due to its control of the platform. Publishers butt heads with the technology company over prices and other contractual terms. This, too, proves unsustainable.
  4. End-to-End Wins: Publishers realize that architectures which embed control in the distribution mechanism put more power in the hands of middlemen than endpoints. Conversely, end-to-end architectures, wherein the endpoints negotiate the transaction and any number of interchangeable mechanisms carry data between them on a best-effort basis, place power in the hands of endpoints rather than middlemen. Publishers furthermore realize that publishers and customers are the endpoints; that in the long run both are best served when the customer can purchase a bundle of data which is not bound (even weakly) to the sales channel, the software stack, or the physical device, all of which are intermediaries between the content and the customer. Publishers finally offer their content in a portable format via multiple sales channels.

This is just a hypothesis. I'm not sure I believe it. However, as evidence that expecting the final stage is not laughably utopian, I offer Sony and Warner's deals with eMusic and the introduction of DRM-free tracks on iTunes, which suggest that stage 4 is already happening for music.

Detailed application of the above model to the current hoopla in the e-book market is left as an exercise for the reader. However, I will note that one reason I bought a Kindle is that I thought book publishers were so ornery, retrograde, and technophobic that they'd never progress to stage 4 unless they had an obnoxious would-be monopsonist (viz., Amazon) to frighten them through stage 3.

(A counterpoint to the above argument would be to observe that certain goods, like streaming video and computer games, appear to be evolving in the direction of fairly strong architectures of control. Neither Netflix streaming nor Steam gives you much freedom w.r.t. your "purchase". It's unclear whether this means their respective markets haven't progressed far enough yet, or there's something fundamentally different about these media.)

Monday, February 08, 2010

J. Blow: Games as Instruments for Observing Our Universe

You might be dissuaded from listening to this talk by Jonathan Blow because it's distributed as a PowerPoint presentation and a couple of MP3s, or else because it's nominally about the much-maligned artifacts of human civilization commonly called "video games".

You would be making a mistake.

Jonathan Blow is a minor genius, and this talk is worthy of attention from anyone interested in science or art or really any creative activity. I have previously mocked video game apologists for viewing games as a failed (or at least not-quite-successful-yet) aspirant to "interactive cinema" — the teleological destiny of gaming, by this aesthetic, being the creation of an action movie in which You! Are! The! Hero! — and Blow is perhaps the most articulate proponent of the opposite view.

E. W. Dijkstra famously said that "Computer Science is no more about computers than astronomy is about telescopes." He was suggesting that there are properties of the universe — viz, certain mathematical truths — that can only be inspected by studying algorithms, which humans can only do through the construction of computing devices. Per Dijkstra, the devices are not the point, or at least not the only point.

In practice, most of computer science amounts to cleverly engineering around messes that humans have created; but sometimes you do glimpse something which appears to be a property of the broader universe. This is a point that is mostly unappreciated by non-computer-scientists, who assume that the essence of computer science is fiddling around with gadgets.

Similarly, the word "game" applies, in the broadest sense, to any system of rules with which one or more agents interact. Blow's basic point is that the generative systems of rules that we call games can be profound devices for exploring truth, just like the generative systems of rules we call algorithms. But that's a pretty inadequate summary of the talk. You should really listen to the talk itself.

(The Q&A is longer and somewhat more inside-baseball w.r.t. the Game Industry as it actually exists today, and therefore less interesting overall, although there are some good bits there too.)