Saturday, December 25, 2010

Email, instant messaging, immediacy, and productivity

Krugman's response to this NY Times trend article on the alleged death of email and triumph of instant messaging reminds me of Donald Knuth's explanation of why he doesn't use email:

Email is a wonderful thing for people whose role in life is to be on top of things. But not for me; my role is to be on the bottom of things. What I do takes long hours of studying and uninterruptible concentration. I try to learn certain areas of computer science exhaustively; then I try to digest that knowledge into a form that is accessible to people who don't have time for such study.

As it turns out, even email is insufficiently immediate for many people these days. And nobody can reasonably dispute that when you're out and about, coordinating real-world social activity, it's sometimes better to have instant messaging — "[I'm] down the street, be there in a minute, stay put" is not a useful message to receive a couple of minutes late, nor do email's strengths offer much benefit in this context.

But you choose your communications medium based, to some extent, on the type of person that you want to be. If you want to be like Krugman or Knuth, then you need long periods of uninterrupted concentration, and you must favor asynchronous communication. If you want to be like a teenager gossiping about his/her classmates, then instant messaging is probably good enough. So, the question: do you want to be more like Don Knuth or more like... well, millions of people whom you've never heard of because they never accomplished anything great? Most of us settle for something in between, of course, but this is a question of aspirations, and anyway in practice the true issue is about modifying one's behavior at the margin and not about absolute positioning.

On the other hand, it is possible to take enforced inaccessibility too far. It is worth quoting a bit from Richard Hamming's classic essay:

I noticed the following facts about people who work with the door open or the door closed. I notice that if you have the door to your office closed, you get more work done today and tomorrow, and you are more productive than most. But 10 years later somehow you don't know quite know what problems are worth working on; all the hard work you do is sort of tangential in importance. He who works with the door open gets all kinds of interruptions, but he also occasionally gets clues as to what the world is and what might be important. Now I cannot prove the cause and effect sequence because you might say, ``The closed door is symbolic of a closed mind.'' I don't know. But I can say there is a pretty good correlation between those who work with the doors open and those who ultimately do important things, although people who work with doors closed often work harder. Somehow they seem to work on slightly the wrong thing - not much, but enough that they miss fame.

Saturday, November 20, 2010

eMusic: Pay again for music you've already bought

One of the major advantages of eMusic in the past was that, after paying for a track, they'd remember that you'd paid for it and allow you to download it again if you ever lost the file. No more. So long, eMusic, it's been fun. In your continuing quest to court clueless businessmen at the major record labels, you've been steadily making your service worse and worse for some time now, and now I'm done with you. Amazon mp3 for me from now on.

Monday, August 16, 2010

Two steps to freeing yourself from operating system zealotry

First, repeat the following mantra to yourself until you really, really believe it:

Other people's needs, habits, and experiences with software differ dramatically from mine, and are just as legitimate.

Your platform of choice may seem to satisfy your needs exactly. It may ring every single chime in the halls of your heart. But if you sincerely believe that every user who wants something different is simply mistaken, then you're presuming to a knowledge that you do not possess: namely, the knowledge of how every other user behaves.

Second, pay close attention when you're using your preferred computing device, and make a mental note every time you end up staring, slack-jawed or pissed-off or confused, while your computer does something other than what you just asked it to do. (This includes while you're waiting for some indicator to stop spinning, or when you need to stab the "Cancel" or "Back" button.) If you believe that this never happens to you, then you are not paying close enough attention, and in fact you should be frightened, because you have become so habituated to your platform of choice that you've learned to automatically edit these moments out of your consciousness. The inescapable truth is that all computing platforms suck in different ways, including yours. And until you realize this, you will not have achieved enlightenment.

Once these two conclusions sink in, you will shortly see that all OS zealotry is the futile worship of invented gods — cruel, greedy, capricious, and temporary gods to boot. Arguing with people on the Internet about the superiority of your computing bauble will seem to you like a species of insanity.

But who am I kidding. Most OS zealots are fractally wrong and it's a fool's errand to dissuade them. But maybe you, dear reader, can be saved...

Monday, August 09, 2010

Golden Promise and The Macallan

Attention conservation notice: Several hundred words on whisky.

One of the signature qualities of The Macallan whisky has traditionally been its use of Golden Promise barley. It is sometimes claimed that Macallan uses 100% Golden Promise. However, in Raw Spirit (2003)*, Scottish author Iain Banks (yes, that one) writes:

Macallan uses Golden Promise barley, a variety which is out of favour with farmers these days because it produces much less yield than more recent, more productive but less tasty forms. As a result, Golden Promise has become hard to get hold of over the years and even Macallan has had to resort to other varieties, using only about 30 per cent Golden Promise since 1994. It'll be interesting to see whether the 10-year-old Macallan bottled in 2004 tastes appreciably different compared to the year before.

For all my Google-fu, I am unable to find concrete numbers on the precise barley composition of Macallan's maltings prior to 1994. It's possible that they used 100% Golden Promise before then. Certainly, the implication of the passage above is that a much larger proportion of Golden Promise was once used.

Macallan themselves seem evasive on the point. The current notes on their website avoid mentioning Golden Promise entirely, opting instead to claim that they have always been at war with Eastasia... er, that "The Macallan uses a proportion of Minstrel barley, a variety grown exclusively for The Macallan . . . to ensure the rich, oily character of the Macallan new make spirit". This evasiveness is understandable, given the extent to which whisky drinkers (like gourmands of other types) tend to be irrationally influenced in their judgment by untasteable qualities, such as strict adherence to tradition.

And, after all, whisky gains most of its flavor from subjecting barley to various extraordinarily refined and artificial practices — artificial not in the sense of false or disingenuous but simply in the sense of artifice as calculated construction — including malting, milling, mashing, fermentation, distillation, and barrel-aging. And therefore, it is especially irrational to develop an attachment to an exact source barley per se. In the hands of skilled distillers, one can trust that the new expression will share many characteristics of the old, and may even be indistinguishable when imbibed, even if it's not chemically identical down to the last molecule.

Nevertheless, I offer the following friendly note to anyone out there who likes a nice Speyside. If Macallan last used its original ratio of Golden Promise barley in 1993, then this year's bottling (2010) is the last 17-year Macallan made with that ratio. Likewise, 2011's bottling will be the last 18-year expression made from the same. As older Macallan tends to be an especially finely balanced whisky, and as whiskies older than 18 years tend to be exorbitantly priced**, you may wish to buy a bottle of Macallan 18 sometime in the next year to save for a special occasion. If nothing else, you will be able to impress*** your friends by telling them that this exact Macallan cannot be had any longer at any price.

*By the way, Raw Spirit is almost impossible to read from cover to cover. It may be the most self-indulgent, meandering thing Banks has ever written. On the other hand, it's entertaining to dip randomly into short passages, including some nice bits on Scotland and sundry. In many ways, the writing suffers from being incarnate as a bound codex. A whisky blog written by Banks would be a lot more fun to read. Anyway, by leafing through the book and sticking labeled Post-Its on the pages where each distillery appears, I was able to transform it into a decent introductory reference guide to Scotch, and one that's far less stuffy and pretentious than that sort of thing would normally be.

**While I'm dispensing unsolicited advice, here's some on whisky and age (this will be obvious to any serious whisky drinker, but it may be news to some readers of this blog). Some people assume that older whiskies are strictly better than younger ones. This notion is supported by the exceptionally high price of old whisky, and also pop culture artifacts like that West Wing episode when Leo rhapsodizes on the age of Johnnie Walker Blue. In fact, age changes the taste of whisky, but whether that change is an improvement is entirely a subjective matter. Younger whiskies retain more of the character of the original malting and distillation process, whereas older whiskies take on more of the character of the barrel in which they are aged. As a rule of thumb, the barrel exerts a gentling, sweetening influence. Thus some qualities, like intense peatiness, which are especially beloved of the most insufferable whisky nerds, are more commonly found in younger spirits.

***Or, more likely, annoy.

Thursday, July 22, 2010

Personal digital curation: a software category that does not exist, but ought to

If you're a typical computer user today, you have lots of data that you've created or participated in creating. The data takes several forms, typically including:

  • Personal documents: writing, drawing, photographs, home movies, etc.
  • Purchased media: music, movies, software, etc.
  • Electronic records: receipts, account records, etc.
  • Communication: email, chat logs, & other social media messages.

There are several things about how you keep this data today which ought to bother you.

First, it's probably inadequately backed up. By this, I mean that you don't back it up frequently enough, and when you do, it's probably either (a) on a USB hard drive or (b) on cloud storage. Taken alone, either of these is inadequate. Typical hard drive backups are inadequate because on most file systems, the data's neither pervasively checksummed nor stored with redundant error-correcting codes; as a result, all your files are subject to random corruption. There's also the small matter that you probably don't archive your backups regularly to a remote location that's safe if, e.g., your home burns down. Cloud storage alone is inadequate because (1) any compromise to your account could result in malicious deletion or corruption of your data and (2) Amazon/Google/... may seem like permanent institutions today, but on the scale of decades I am somewhat dubious; DEC and SGI and Sun were once lords of Silicon Valley; ultimately one must consider Phlebas and all that.

Second, the data's stored in a mishmash of formats, some of which will be exceptionally difficult to read in years to come. Microsoft's file formats are particularly egregious, but I also have my doubts that, for example, today's video file formats, or an iPhoto or Picasa metadata database, will be readable by commonly available software in twenty years.

Third, a lot of your data's stored in multiple related forms, and the relationship between those forms is totally ad hoc and not captured by future-proof software. For example, you might have a batch of raw photos, of which you pick a few to clean up, rescale to lower resolution, and upload to the web. So now you've got multiple versions of the photo. If you need to go trawling through this mess some years from now, you're in for a lot of curatorial tedium reorganizing it and figuring out what's redundant and discardable versus what's a pristine original that you must keep.

Conquering any one of these problems, let alone all of them, requires serious geekery today. For example, if you want to have good backups, you need to store data both in cloud storage and on multiple media, and you need software that records and verifies the checksums of all your files. The other two problems are just as gnarly, if not more so.
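To give a sense of the checksum piece, here is a minimal sketch, not a real backup tool. It assumes SHA-256 as the checksum and keeps the manifest in memory; the class name ChecksumManifest and its methods record and verify are made up for illustration. A real tool would also have to persist the manifest somewhere safer than the disk it describes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: records a checksum per file, then re-checks them later.
final class ChecksumManifest {
    /** Hex-encoded SHA-256 digest of one file's contents. */
    static String sha256(Path file) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // every Java platform must provide SHA-256
        }
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), md)) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // reading through the stream is what feeds the digest
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Record: map every regular file under root (by relative path) to its checksum. */
    static Map<String, String> record(Path root) throws IOException {
        List<Path> files;
        try (Stream<Path> paths = Files.walk(root)) {
            files = paths.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        Map<String, String> manifest = new TreeMap<>();
        for (Path p : files) {
            manifest.put(root.relativize(p).toString(), sha256(p));
        }
        return manifest;
    }

    /** Verify: list the recorded paths that are now missing or whose contents changed. */
    static List<String> verify(Path root, Map<String, String> manifest) throws IOException {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> e : manifest.entrySet()) {
            Path p = root.resolve(e.getKey());
            if (!Files.isRegularFile(p) || !sha256(p).equals(e.getValue())) {
                problems.add(e.getKey());
            }
        }
        return problems;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("curation-demo");
        Path photo = root.resolve("photo.raw");
        Files.write(photo, new byte[] {1, 2, 3});
        Map<String, String> manifest = record(root);
        System.out.println("after recording: " + verify(root, manifest));
        Files.write(photo, new byte[] {1, 2, 4}); // simulate silent corruption
        System.out.println("after corruption: " + verify(root, manifest));
    }
}
```

Note that this only detects corruption. Repairing it still requires a second good copy, which is why the multiple-media part matters.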

This may seem like a totally anorakish concern that doesn't matter to most people. Maybe most people are OK with most of their data being ephemeral, except for the rare object that they print out into physical form. Maybe it's just me, because I've been thinking about posterity a lot lately — including photos and video, my raw data generation rate has risen to something like a couple of GB per month. But I suspect I'm merely one of the people on the leading edge of this problem. Someday, everybody will generate a couple of GB per month and they really will want to share the family albums with their grandchildren without inordinate curatorial effort.

So, as far as I can tell, there's a big gaping hole in the market for personal digital curation software — software that would help you not only back up your data (there's plenty of software out there for that) but that would take care of ensuring the posterity of your data. This implies at least (1) backing it up to multiple distributed locations, (2) transcoding it into future-proof forms, and (3) remembering the relationship between different parts of your data.

This software would not be simple to build. It would have to be cross-platform. To offer a credible promise of future-proofness, it would have to be built on well-documented protocols and file formats so that if your organization went bust, someone else could write software, from scratch, that at least recovers the data. It would have to either include software that manages common file types like photos, or to hook into existing software that manages them, or compute relationships between the files after the fact (for example, it would have to either replace Picasa, or hook into it, or be able to figure out by post hoc analysis when two files were really variants of each other). It would have to be performant. It would need a nice UI.

I suppose the difficulty of building such software is one reason it hasn't been done. Much easier to just build a social networking doodad or a little timewasting mobile app or whatever the next Valley flavor of the month is. On the other hand, I think there's actually a reasonable (although perhaps tough to pitch) business case. There are probably at least tens of thousands of digital obsessives in the world who'd pay Photoshop CS-level prices for a credible digital curation package. The need to support new file formats or cloud storage APIs as they come online could provide a steady stream of upgrade revenue. If you built it right, then there's the potential for standard licensing deals where you bundle value-subtracted versions of the software with new computers, digital cameras, and other doodads.

Oh well. Anyway, add it to the list of stuff that I wish existed but does not, and also the list of things I wish I had the time and focus to write but will probably not get to in my lifetime.

UPDATE 2010-08-03: Apparently you can actually learn something by blogging in ignorance and waiting until Reddit sends some commenters your way. There's an IT service category called digital asset management (DAM), and it's a big deal for enterprises (which shouldn't be surprising). (In library science, the analogous problem is called digital curation, which IMO is closer to the problem I care about.) The question, I suppose, is whether DAM can be scaled down, made sufficiently comprehensive, and encapsulated in a mixture of consumer-grade software and services so that individuals can have credible assurance that their data will be preserved on decades-long time scales. I'm somewhat dubious that Expression Media or Bridge can really offer that kind of promise (for example, those packages seem media-focused; do they back up stuff like email and source code?) but of course I haven't looked very deeply. Thanks interwebs!

Tuesday, July 20, 2010

Ubuntu 10.04 (Lucid) on Dell Mini 12

After a recent system update, my Dell Mini 12 went on the fritz: wireless networking stopped working reliably. Obviously, that's completely unacceptable in a device like this. I guess I could have tried messing around with the configuration files and drivers, but Dell's oddball Ubuntu 8.04 (Hardy) lpia distribution has been feeling long in the tooth lately anyway. So, following my mostly satisfactory Lucid workstation experience, I decided to try upgrading my Mini 12 to Lucid.

And, once again, almost everything worked just fine.

I prepared a USB drive with the installer (actually just a memory card reader plus the card from my camera) according to Ubuntu's instructions. Then I rebooted (using F12 to bring up Dell's boot menu, and then selecting USB), and selected installation.

The installer was exceptionally sluggish — for which I blame the Mini 12's underpowered hardware — but otherwise the installation went through without a hitch.

If you try the same with your Dell Mini 12, you'll want to look at these notes before you run the installation. Two post-install tweaks are necessary:

  • To get acceptable graphics performance, you'll need to enable the Poulsbo GMA500 proprietary driver.
  • For the wireless networking you'll have to enable the Broadcom STA wireless driver from the System -> Administration -> Hardware Drivers menu.

Overall I'm mostly satisfied. Lucid both looks and feels much slicker than Hardy, from the fonts to the windows. The desktop distribution's UI works fine on the 12 inch screen. And once you perform the tweaks above, most everything in the hardware works fine, including wireless, bluetooth, trackpad, sound, and the webcam.

The fly in the ointment this time? Suspend and resume are sometimes flaky. In particular, sometimes resume either fails completely or requires that I switch virtual terminals a couple of times (Ctrl-Alt-F2, Ctrl-Alt-F7) to jog it out of its slumber. Given the way I use this device, it's actually less of a big deal than you might expect (basically, when suspend fails, I just hard-reboot and restart my web browsers and emacs), but if this matters to you a lot then you might want to hold off. I guess I could try debugging the problem, but like I said it hasn't been that important to me.

So, OK, I have to admit that owning this computer overall hasn't been a seamless experience. (But then nothing is these days, not even my MacBook Pro from work; I've had many travails with MacPorts and Fink and...) When I bought the Mini 12, my goal was to see whether a sub-$700 computer could keep me satisfied for more than one year, which would make it more cost-effective than a $2k computer which typically lasts me 3 years. In that sense, the experiment succeeded: it's lasted over a year, and I've gotten good mileage out of it. Meanwhile, I have not cursed at my Mini especially more than I've cursed at any other computing device I've ever owned. And the biggest positive qualities (compactness, light weight, near-silent operation) remain salient even today.

Monday, July 19, 2010

Virtual cosmetics, redux

OK, well, I thought they'd do faces before bodies but the basic motivation isn't far off (research paper link).

UPDATE 2011-01-09: Updated YouTube link; the original appears to have had its permissions toggled to private. Also added link to original research.

Wednesday, June 09, 2010

On the verbosity of Java generics and related type systems

So, recently T. B. Lee tweeted:

Java's generics syntax feels really clumsy. Are there other (strongly typed) languages that do it better?

I replied as best I could in 140 characters while riding a crowded Muni bus home after work, but I mangled the explanation, so I think I should rectify this.

First, it is worth comparing Java generics to other languages that have parametric polymorphism. For example, consider ML (and more generally the Hindley-Milner family of languages). In ML, you don't need to write down types most of the time, because of type inference. Whereas in Java you might write:

<K, V> LinkedList<K> keysAsList(HashMap<K, V> aMap) {
  LinkedList<K> result = new LinkedList<K>();
  for (K key : aMap.keySet()) {
    result.add(key);
  }
  return result;
}

in OCaml you can write:

let keysAsList aMap =
  Hashtbl.fold (fun key _ rest -> key :: rest) aMap [];;

Notice that although we had to annotate all the Java variables with types, there's not a single type annotation in the OCaml code (Hashtbl.fold is the name of a function qualified by its module name; it is not a type annotation). But OCaml is statically typed nevertheless.

So, one might ask, what gives? Why can't you just add type inference to Java?

Well, the short answer is that typechecking Java generics is a fundamentally harder problem. ML has only parametric polymorphism; Java has both parametric polymorphism and subtype polymorphism (i.e. the object-oriented kind). It is perhaps not obvious why this makes things hard until you learn that B. C. Pierce proved in 1992 that bounded quantification in F<: (pronounced "F-sub"), a formalization of the most straightforward and general combination of subtyping and parametric polymorphism, is undecidable.
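To make the combination concrete, here is a small sketch of the two places where subtyping and generics meet in everyday Java: a type parameter with a bound, and a wildcard with a bound. The class name BoundedSketch and its two methods are made up for illustration; the point is only that the checker has to reason about "is a subtype of" and "for all types T" at the same time.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example class; nothing here comes from a real library beyond java.util.
final class BoundedSketch {
    // Bounded type parameter: T ranges only over types that are
    // subtypes of Comparable<T>. Returns null for an empty list.
    static <T extends Comparable<T>> T max(List<T> xs) {
        T best = null;
        for (T x : xs) {
            if (best == null || x.compareTo(best) > 0) {
                best = x;
            }
        }
        return best;
    }

    // Bounded wildcard: accepts List<Integer>, List<Double>, and so on,
    // i.e. a list of any one subtype of Number.
    static double sum(List<? extends Number> xs) {
        double total = 0;
        for (Number n : xs) {
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(3, 1, 2);
        // List<Number> nums = ints;   // rejected: List<Integer> is not a subtype of List<Number>
        System.out.println(sum(ints)); // allowed: Integer is a subtype of Number
        System.out.println(max(ints)); // allowed: Integer is a subtype of Comparable<Integer>
    }
}
```

ML never has to answer questions like the commented-out line, because it has no subtype relation between Integer and Number to begin with.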

In other words, in F<: it is possible to write programs for which the type checker would not terminate. This is generally held to be a bad thing (n.b. I disagree with the prevailing opinion, but that's a discussion for another day), so over the next decade or so there followed several papers by Pierce and others attempting to isolate calculi weaker than F<: with type systems that were both usable and decidable. The most practically relevant outcome of this work was Featherweight Java, which provided the formal foundation for (most of) Java generics. C#, Scala, etc. build on this line of work, although in Scala's case fairly indirectly.

What does all this have to do with type inference? Well, nothing directly. But for any given level of type system expressiveness, full type inference is at least as hard as type checking. And the type checking problem for Java with generics already lives close to the undecidability ceiling (in fact, the decidability of Java generics with wildcards is, AFAIK, still an open problem; proving this sort of stuff used to be a hot research subject but I think everyone's gotten bored of object calculi and moved on). Oh, and I should mention that many of the best minds in academic language design have thrown themselves at the parametric polymorphism + subtyping + inference problem at one time or another, and come up empty. Now, none of this is hard proof that much richer type inference for object-oriented languages with generics is impossible, but it all hints strongly in this direction; at a minimum any such system is likely to be exceptionally intricate and difficult to prove sound.

So, basically, I believe that it is unlikely that anyone will come up with a fundamentally more concise type system for a programming style that combines (1) objects, (2) generics, and (3) static typing. At best, people will fiddle around on the margins — using different punctuation, for example, to denote type parameters, or making other parts of the language more terse to compensate.

That said, although general type inference seems hopeless, there are clearly some things that could make Java's generics more syntactically lightweight in certain common cases. For example, it would be trivial to infer the type of a variable at any declaration site with an initialization expression. C# and Scala appear to do some of this.
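As a sketch of what declaration-site inference looks like, here is the earlier keysAsList example again, assuming a compiler for Java 10 or later, where the var keyword and the diamond operator <> exist. The class name InferenceSketch is made up for illustration. Only local variables with initializers get inferred; method signatures still spell out every type.

```java
import java.util.HashMap;
import java.util.LinkedList;

// Hypothetical example class; requires Java 10+ for `var`.
final class InferenceSketch {
    // The signature is fully annotated; inference applies only inside the body.
    static <K, V> LinkedList<K> keysAsList(HashMap<K, V> aMap) {
        var result = new LinkedList<K>();   // inferred as LinkedList<K>
        for (var key : aMap.keySet()) {     // inferred as K
            result.add(key);
        }
        return result;
    }

    public static void main(String[] args) {
        HashMap<String, Integer> counts = new HashMap<>(); // diamond: type arguments inferred
        counts.put("a", 1);
        var keys = keysAsList(counts);      // inferred as LinkedList<String>
        System.out.println(keys);           // prints [a]
    }
}
```

This trims the repetition inside method bodies, but it is exactly the fiddling-on-the-margins kind of improvement: the types at method boundaries are as verbose as before.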

Saturday, May 15, 2010

Ubuntu Lucid update

Following my disappointment with Kubuntu Lucid, I just got around to replacing it with the standard Ubuntu Lucid desktop. It's possible to switch desktop environments using a couple of package manager commands, but I decided to do a from-scratch reinstall.

With a little effort, I was able to make most of the Ubuntu desktop behave OK. Window management is still not up to par with KDE 3.5 + KStep window decorations, but it's enough for now. I'll probably switch my window manager to WindowMaker at some point. (NeXTSTEP-style window decorations with X11 window management gestures are the apex of desktop window management, for reasons that I could go into at length but won't today.) Visually, the new Ubuntu theme looks nice; in fact it looks and feels much better when you're using it than it does in screen shots.

However, there's one fly in the ointment. Sound didn't work. At all. Note that for all its flaws, Kubuntu, which is derived from the same base distribution, had no such problem, so it isn't simply a driver issue. I could bore you with all the details of my debugging adventure, but at the end of the day I blame PulseAudio, and Ubuntu's decision to make PulseAudio central to their desktop sound system. After a couple of hours of unproductive web searching and config file wrangling, removing the PulseAudio packages in Synaptic made sound work, sort of.

Yes, sort of. It still doesn't work quite right. When I open the System > Preferences > Sound menu, I get a dialog box saying "Waiting for sound system to respond" and nothing else. (This behavior occurred before I uninstalled PulseAudio, so that's not the cause.) Apparently a whole lot of people have run into variations of this problem since at least Ubuntu 9.10, and nobody seems to have definitive answers on how to solve it. I'd report it as a bug, but I suspect that it's one of those opaque symptoms with dozens of underlying possible causes and it's probably futile.

I want to emphasize that I haven't had a problem like this with a Linux distribution in years. This is literally a regression in behavior to Linux ca. 2005. Poking around by hand with .conf files in /etc just to get something working on my desktop is something I used to do. It's not something I expect to be doing in the year 2010.

So, anyway, I can't set my sound preferences. I guess I'll just have to cross my fingers and hope Ubuntu didn't assign any really annoying sounds to desktop events.

Tuesday, May 11, 2010

How to design a popular programming language

This has been kicking around in my brain for at least half a decade, and if you know me well then I've probably spoken it aloud in your presence; so it's high time to get it down in writing. Here is my Grand Unified Theory of Programming Language Adoption. There are three steps:

  1. Find a new platform that will be a huge success in a few years.
  2. Make your language the default way to program on that platform.
  3. Wait.

That is all. Note that none of the above steps has anything to do with the language design itself. In fact, nearly all popular languages are terribly designed. Languages become popular by being the "native" way to program a certain kind of system. All of history's most widely used programming languages fit this model — Fortran (scientific programming), C (Unix), C++ (MS Windows), JavaScript (web pages), Objective-C (Mac OS X), . . .

Or, in fewer words: Languages ride platforms to popularity.

Why is this so? Well, to a first approximation, no piece of software ever gets rewritten in another language; and once a critical mass of software for a platform has been written in one language, nearly all the rest will follow, for two reasons:

  • Nobody has figured out how to make cross-language interoperability work well.
  • The network effects from language adoption are immense. Programming is, despite appearances, a deeply social profession. To write successful software quickly, you must exploit the skills of other programmers — either directly, by hiring them, or indirectly, by using library software they've written. And once a language becomes the most popular in a niche, the supply of both programmers and libraries for that language rapidly accumulates to the point where it becomes economically irrational to use any other language.

In fact, I claim that in all the history of programming languages, no language has ever successfully unseated the dominant language for programming on any platform. Instead, a new platform gets invented and a new language becomes the "founding language" for that platform.

Well, OK, there are exactly two exceptions: Java and Python. It took me a while to figure out what happened in those cases, and the answers I came up with were surprising (to me).

Java is anomalous because although it is widely used in its primary domain (Internet application servers), it is not predominant, the way that e.g. C++ is predominant in writing native Windows GUIs. My explanation is that the web architecture has a uniquely high-quality interoperability protocol in the form of HTTP and HTML(/XML/JSON/...). Hey, stop laughing. HTTP and HTML fail all kinds of subjective measures of elegance, but they succeed in isolating clients and servers so well that it is economically viable to write the server in any language. In other words, as unbelievable as it sounds, HTTP and HTML are the only example in history of cross-language interoperability working really well.

I'll abandon this explanation if I can find, in all the annals of computing, another protocol that connected diverse software components as successfully as HTTP and HTML. The only things I can think of that come close are (a) ASCII text over Unix pipes or (b) ODBC, and neither of these provides nearly the same richness or connects components of similar diversity.

Python is anomalous because rather than riding a new platform to success, it simply seems to be displacing Perl, PHP, etc. in the existing domains of shell scripting, text processing, and light web application servers. My explanation is that Python appears to be the only language in history whose design was so dramatically better than its competitors' that programmers willingly switched, en masse, primarily because of the language design itself. This says something, I think, both about Python and about its competitors.

Incidentally, this theory predicts that all the new(ish) programming languages attracting buzz these days — whether Ruby, or Scala, or Clojure, or Go, or whatever — will fail to attract large numbers of programmers.* (Unless, of course, those languages attach themselves to a popular new platform.)

UPDATE 2010-05-15: Reddit and HN weigh in.

*Which is fine. Very few languages become hugely popular, and in fact nearly all languages die without ever seeing more than a handful of users. Being either influential (so that later languages pick up your ideas), or even merely useful to a significant user population, are fine accomplishments.

Thursday, May 06, 2010

On internationalized TLDs (a contrarian opinion)

The Timberites rejoice. I'm obviously revealing my North America-centric roots but I think that this is a huge amount of cost for insufficient benefit.

Great civilizations leave their stamp on the conventions of world culture. The Romans gave us their calendar; the Indians gave us the modern numbering system; the Italians gave us terms and symbols used throughout the Western world in musical notation (piano, fortissimo, crescendo, ...). The global recognizability of these signifiers is part of what makes them useful.

In the modern era, American culture predominates in computing. In practically every programming language, English words like begin, define and integer (or abbreviations thereof) have special meanings understood by every programmer in the world.

With respect to TLDs, there are two alternatives before us. Alternative one is to make everyone in the world simply learn to use ASCII TLDs. Alternative two is to make everyone in the world learn to use, or at least recognize, TLDs in every Unicode script. Alternative one is actually the simpler alternative, even for non-English speakers.

Imagine if numbers were subject to the politics of modern i18n. We would have the modern positional decimal numeric system, but also the Roman numeral system, and the Babylonian numeral system, and so on, and nobody would ever have asked anyone to standardize on any of them. After all, we have to be sensitive to the local numeric culture of the Romans!

It's not like I'm saying people should communicate in English all the time. I'm only saying that people should learn to type and to recognize ASCII TLDs. This is a relatively limited set of special-purpose identifiers. There are only about an order of magnitude more ccTLDs than months in the year or decimal digits. And I would claim that it's useful for everyone in the world to recognize that, say, .uk and .com look like the end of a domain name, whereas .foobar123 does not. Pop quiz: which one of the following is a new Arabic ccTLD, مص or مصام? The reason you can't recognize it is not just that you're an English speaker; people who only speak Mandarin or Spanish or Russian are in exactly the same boat as you. And when ICANN unveils the Chinese Simplified or Cyrillic or Bengali TLD scripts, Arabic speakers in turn won't be able to make heads or tails out of those.

But, whatever, my opinion's on the losing side of history, so it's almost pointless to express it. I just thought I'd get it out there that there was a real benefit to, and precedents for, the status quo where a convention originating in one culture diffuses and becomes universal.

Saturday, May 01, 2010

Kubuntu Lucid and KDE 4 reactions

Speaking of how software makes you dependent on other people, the newest Ubuntu Long Term Support (LTS) release just came out. This means that in a year, support for the previous LTS release will wind down; which in turn means that Ubuntu users must upgrade sooner or later, unless they want to sacrifice security updates and compatibility with new releases of third-party software.

So, I took the plunge: yesterday I downloaded and installed Kubuntu Lucid.

This is the first Ubuntu LTS release that runs KDE 4, the latest major revision of KDE. I've been using KDE for about 11 years, ever since version 1.1. My immediate reaction was simply that KDE 4 is a mess. And after playing around for a few hours, tweaking settings, and trying to settle in, I still think KDE 4 is a mess. As I use it more, I'm not settling into it; I'm simply accumulating more irritations.

Without exhaustively listing all the details, my complaints basically break down into three categories.

First, there are pervasive performance problems. In every corner of the UI, "shiny" effects have been prioritized over responsive, performant interactivity. To take just one example, under KDE 3.5, the Amarok media player used to be super snappy and responsive; it left iTunes or Windows Media Player in the dust. In KDE 4, Amarok takes a couple of seconds to expand one album or to queue up songs, and resizing UI panels is painfully slow and janky. (My workstation has a 2.13 GHz Core 2 Duo and a good graphics card. This should not be happening.) Similar problems can be observed in the desktop panels, file manager, etc.

Second, in general, the UI changes seem designed to push KDE's new technology into your attention space, rather than getting out of the way so you can accomplish tasks. Again, here's just one example: in the upper right corner of the desktop, there's a little unremovable widget that opens the "activities" menu:

The upper right corner of the desktop is a hugely valuable piece of screen real estate. By placing this widget in the upper right corner, the developers are signaling that this menu contains operations which will be frequently accessed. Do they really think users will add new panels to the desktop frequently? (For non-KDE users, a "panel" is KDE's equivalent of the Mac OS X dock or the Windows taskbar.) So far, almost every time I've clicked this widget has been by accident while trying to close or resize a window.

If you're a desktop developer who wants to show off your technology, this design may sound good: you put this menu there to make sure users discover your desktop widget and "activities" technology*. However, if you're a user, then this menu mostly gets in your way, and you wish it were tucked away somewhere more discreet.

Third, the KDE 4 version of every application has fewer features and more bugs than the KDE 3 version. The "Desktop" activity no longer has a way to "clean up" icons without repositioning all of them in the upper-left-hand corner. The Konsole terminal application's tab bar no longer has a button from which you can launch different session types. The list goes on.

Anyway, of course, I don't pay for KDE, and so in some sense this is all bitching about free beer. However, suppose I did pay for KDE. Would I have any more input into the process? Windows users pay for Windows; if you don't like the direction Vista and Windows 7 are taking the UI, do you think you personally have any chance of influencing Microsoft's behavior? Mac users pay for Mac OS X; if you disagree with Steve Jobs, do you have any chance of influencing Apple's behavior? In fact, you do not, and both user populations have experienced this reality multiple times in the past decade. Mac users loved the Mac OS 9 UI but they had to give it up when Apple stopped supporting it on new Macs. Microsoft users who are attached to the Windows XP UI will likewise be forced to give it up eventually, when Microsoft stops sending security patches.

The KDE 3 to KDE 4 transition is simply KDE's version of the OS 9 to OS X transition, or the XP to Vista/7 transition. Except that those seem to have worked out OK in the end, whereas KDE 4, which was released over two years ago, seems to have lost its way permanently.

I'm writing this post not just to point out KDE 4's defects — I mean, it feels good to vent, but who really cares — but also to marshal further evidence in support of my contention that owning software doesn't mean much anymore.

Even the fact that KDE is Free Software means little in this case. I mean, what am I supposed to do now? I can't stay with the previous Ubuntu LTS release forever, unless I want to expose myself to security risks, and also be unable to run or to compile new software, both of which are deadly for a software developer. Conversely, I can't singlehandedly maintain a fork of the KDE 3 environment forever; this guy's trying but without a large and active community behind the project, it's doubtful that it will remain current for long. And frankly, I'm getting older, and I don't have enough time to invest in both hacking around with my desktop environment and also accomplishing the other things I want to accomplish in my life.

So, I can either (1) suck it up and live with KDE 4, or (2) abandon the desktop environment I've grown to love over the past 11 years, and jump ship to GNOME or something. (Right now I'm leaning towards (2).) Adopting software means making a calculated bet on the behavior of other people. And sometimes you lose.

*BTW "activities" are 80% redundant with virtual desktops and therefore hugely problematic and confusing as UI design, but I won't get into that.

Sunday, April 18, 2010

In which an Icelandic volcano prompts the funniest paragraph on the Internet today

Yglesias writes:

Ever since the eruption, I know I can’t be the only person who’s been wondering how to say “Eyjafjallajökull.” In principle, the Internet and its multimedia cornucopia ought to shed a lot of light on this issue. In practice, no matter how many times I click over here and hear it pronounced, I can’t come any closer to saying it myself.

I thought oh come on how bad can it be. And then I clicked through to Wikimedia and I burst out laughing.

Cynicism and libertarian ends

So when I wrote that cynicism about government does not help the libertarian cause, a libertarian might regard the suggestion with suspicion, given that I'm just another big government liberal. Well, at least one libertarian agrees with me. And his NYTimes column actually contains a lot of stuff that I find pretty risible.

Sunday, April 11, 2010

Computer science and the iPhone developer agreement

Full disclosure: I work for Google. However, this blog reflects my personal opinions only.

Programming and computer science are not synonymous, but obviously the two are deeply intertwined. The fundamental activity of programming is the construction of abstractions. Programming language design and implementation is one of the fundamental forms of abstraction building. It is central to the field, and has been so nearly since its inception. One of the oldest and most important research conferences in computer science is named Programming Language Design and Implementation.

This suggests a particular understanding of what Section 3.3.1 means. Section 3.3.1 says: "Thou shalt not build abstractions other than those we prescribe." It bans one of the fundamental activities of programming.

This would be a mere curiosity, except for Apple's unusually influential position in the computing industry. All trends point towards mobile devices* becoming much more pervasive than all other general-purpose computing devices. Indeed, the combination of mobile and cloud computing may someday replace all other user-visible hardware except what's needed to support input and output (screens, cameras, etc.). And Apple has the credible goal of becoming the preeminent mobile device provider, setting standards for the industry and defining the entire computing experience for a huge swath of future computer users.

Section 3.3.1 therefore constitutes a direct attack on computer science, delivered by a powerful and well-funded organization that aims to transform laypeople's interface to the field. As long as 3.3.1 stands, for a computer scientist to purchase an iPhone or iPad is akin to a biologist purchasing a textbook that advocates against teaching evolution. Full stop. Go ahead and do it if you can't resist the shiny, but understand the moral weight of the decision you're making.

I can already hear people ready to trot out the standard roster of excuses. Hit the comment box if you want, but realize that I've anticipated the common objections and the only thing stopping me from preemptively rebutting them all is the fact that I'm moving soon and I have a huge number of boxes to pack. To pick just three examples:

Q: "The iPad isn't for people like you. Why do you care?"
A: "This post isn't for people who don't care. Why are you reading it?"

Q: "Apple has a right to do whatever it wants with its platform. If you don't like it, you shouldn't use it."
A: "Thank you for agreeing with me."

Q: "You can program whatever you want in HTML5 and access it through Safari."
A: "Yes, the web is an open platform, which Apple fortunately does not control.** I'm talking about Apple's rules for programming on the platform that it does control."

*A.k.a. "phones". Incidentally, I think the British slang "mobile" is more elegant and generalizes far better than any of the {cell,smart,super,...}phone terms that are used on this side of the pond.

**Although I will remark that it's naive to imagine that a platform can be preeminent for very long without influencing the market of content and applications to which non-participants in the platform regime have access. There's a reason Hulu used to work on Flash only. But that's a post for another day.

Tuesday, March 09, 2010

Button order in web browsers and phones

Disclaimer: I work for Google; however I do not work on the Android team and this post reflects my personal opinions only.

Do you have an Android phone?  Look at your Chrome toolbar.  Now look at your Firefox toolbar.  Heck, look at your Internet Explorer or Safari or Opera toolbars, if you have those handy.  Now look at your phone's face buttons.  Notice anything?

The Nexus One gets it right:

Likewise the Droid:

A smartphone is an Internet access device, and as such its interface should strive for consistency with the most widely-used Internet access interface: the web browser.  Maybe the Menu button could go to the right of the Home or Search buttons, but it seems obvious to me that on a smartphone with Back, Home, and Search buttons, those buttons should appear from left to right in that order.  And therefore, for example, the Back button should not be on the right side of the phone.

Alas, via the gadget blogs in my RSS reader, I find that many Android smartphones permute button order in violation of this principle.  One might speculate that button order is a manufacturer-specific quirk, except that the layout changes even within a single manufacturer's devices.

What's going on here?  Does this parallel not occur to the designers of these phones?  Do they disregard consistency for the sake of trivial product differentiation?  Or is it just that I'm the only person bothered by this?

Friday, February 26, 2010

Q: Why does string get tangled up in knots?

A: Because it can.

This is not a joke answer. The question is very nearly tautological. A knot is, by definition, a conformation of string which resists untangling. If you perturb a length of string randomly, then at any given time it may either become tangled in a knot or remain free. If it is in a knot, it will resist becoming a non-knot; if it is not a knot, it is free to change. If a knot is possible, one will eventually emerge. The only way that a knot could fail to emerge is if it were completely impossible for an unknotted string to tangle into a knot.

A very similar question would be why a shaken ratchet eventually turns forward. Of course, nobody would ask that question. The lower dimensionality means that the answer is transparent.

There is a similar result in evolutionary population dynamics which states that as time goes to infinity, given a fixed population cap and a randomized chance to reproduce, every species goes extinct. Extinction is a state that, once entered, is never left, so any nonzero chance of reaching it eventually wins out. Intuitively, if you keep rolling k dice forever, then eventually one roll comes up all sixes.
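The "absorbing state" idea can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: a neutral resampling model (every slot in the next generation copies a uniformly random individual from the current one), with made-up parameters `num_species`, `cap`, and `seed`. In this simplified model the population itself never shrinks, so it shows the weaker version of the claim: a species that dies out can never return, and so the number of surviving species only ratchets downward.

```python
import random


def species_counts_over_time(num_species=5, cap=50, seed=0, max_generations=100_000):
    """Return the number of distinct species present in each generation.

    Hypothetical neutral model: the population size is fixed at `cap`, and
    each member of the next generation is a copy of a uniformly random
    member of the current one. Stops once a single species remains.
    """
    rng = random.Random(seed)
    # Start with the species spread evenly across the population.
    population = [i % num_species for i in range(cap)]
    history = [len(set(population))]
    while history[-1] > 1 and len(history) <= max_generations:
        # Resample the whole population; a species with zero members
        # can never be drawn again, so extinction is permanent.
        population = [rng.choice(population) for _ in range(cap)]
        history.append(len(set(population)))
    return history


if __name__ == "__main__":
    history = species_counts_over_time()
    print("generations until one species remains:", len(history) - 1)
```

The species count in `history` never goes up, for the same reason a knot rarely unties itself: the process can wander into the state but has no way back out.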

The many, many applications of this principle are left as an exercise for the reader.

Wednesday, February 24, 2010

Tab completion for meta-bang shell commands (Wednesday Emacs blogging)

You probably use M-! to run a quick shell command now and then, when you don't want to be bothered with a full M-x shell. But if, like me, you're a former XEmacs user, then you probably find FSF Emacs' default lack of tab-completion for files in the minibuffer rather annoying. Well, your pain ends here:

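;; FSF Emacs only; XEmacs already completes filenames in the minibuffer.
(if (not (string-match "XEmacs" emacs-version))
    (defadvice read-from-minibuffer
      (around tab-is-pcomplete-in-minibuffer activate)
      "Bind TAB to pcomplete in minibuffer reads."
      ;; Note: this binds TAB in `minibuffer-local-map' itself, not a copy.
      (let ((keymap minibuffer-local-map))
        (define-key keymap "\t" 'pcomplete)
        ;; Argument 2 (zero-based) of `read-from-minibuffer' is KEYMAP.
        (ad-set-arg 2 keymap)
        ;; Run the original function with the modified argument.
        ad-do-it)))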

Ta-da. Now when you M-! mv, you'll be able to tab-complete the filename.

Incidentally, who knew that elisp supported aspect-oriented programming? Apparently it does. Astonishing. I owe this tip to a co-worker, who I'd name except that I doubt he'd want to be associated with the other content on this blog. (p.s. AF, if you ever run across this post and don't mind being credited, I'll happily add your name.)

Wednesday, February 17, 2010

Meta-slash performs dabbrev-expand, and you require this knowledge (Wednesday Emacs blogging)

If you don't know this one already, then go to an emacs window right now, open up any source file (~/.emacs works fine), navigate to any function, and type the first couple of characters of a nearby identifier. Then type M-/. Ta-da!

Details: By default M-/ is bound to dabbrev-expand, which triggers the dynamic abbreviations facility. This dynamically compiles a dictionary from nearby identifiers in the source file, and offers matching identifiers as completions for the current token, preferring identifiers closer to the cursor over more distant ones.

For bonus points, type M-/ multiple times to cycle among recent matches, or use C-M-/ to pop up a list of matching completions in another buffer.
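To make the idea concrete, here is a rough Python sketch of the "nearby identifiers first" behavior. It is not how Emacs implements dabbrev-expand; the function name `dabbrev_candidates`, the identifier regex, and the ranking by plain character distance are all assumptions of mine for illustration.

```python
import re

# Assumed notion of an "identifier": letters, digits, underscores.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def dabbrev_candidates(text, cursor, prefix):
    """Return identifiers in `text` that start with `prefix`, nearest first.

    `cursor` is a character offset into `text`. Distance is measured from
    the cursor to the nearer end of each match; ties sort alphabetically.
    """
    ranked = []
    for match in TOKEN.finditer(text):
        word = match.group()
        # Skip the bare prefix itself; it is not a useful completion.
        if word.startswith(prefix) and word != prefix:
            distance = min(abs(match.start() - cursor), abs(match.end() - cursor))
            ranked.append((distance, word))
    # Keep only the closest occurrence of each distinct word.
    seen, result = set(), []
    for _, word in sorted(ranked):
        if word not in seen:
            seen.add(word)
            result.append(word)
    return result


if __name__ == "__main__":
    source = "def compute_total(items):\n    count = 0\n    co"
    print(dabbrev_candidates(source, len(source), "co"))
```

Pressing M-/ repeatedly corresponds, roughly, to walking down the returned list one entry at a time.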

dabbrev-expand isn't as sophisticated as the semantically aware tab-completion available in many IDEs. Conversely, however, it works with no modification in almost every buffer type under the sun, so you can use it when editing code in elisp, Java, or the language you invented this morning. It even completes reasonably well when editing English prose (although since token prefixes are much less unique within an English document, it's only worthwhile for longer words).

Meanwhile, because dabbrev-expand's algorithm is so simple, it doesn't require a heavyweight background process to scan all your project files and keep an in-memory database up-to-date. This is, of course, a typical IDE pitfall. You'll never be waiting for emacs to repopulate the dabbrev-expand database after you refresh all the files in your project checkout.

I'm a little embarrassed to admit that I only learned this keyboard shortcut a couple of months ago. Yes, that's right, I've been typing all my identifiers manually (or using M-w/C-y to copy-and-paste) for my entire freaking career. I estimate that my long-term danger of RSI declined dramatically the day one of my teammates mentioned this feature.

(On the other hand, my incentive to keep names short has been reduced slightly, and I wonder what effect this will have on the code that I write. It seems to me that although names that are too short can be cryptic, it's good to keep code as concise as it can be, consistent with maintaining clarity. Along similar lines, I suspect, for example, that IDEs which make it too easy to extrude large volumes of boilerplate code, or to import functions from many different modules, result in looser, less organized code.)

Tuesday, February 16, 2010

The state of Kindle backups and data portability, February 2010

I recently plugged my Kindle into my workstation's USB port for the first time. In ordinary operation, there's no need whatsoever to do this, but I wanted to try backing up my ebooks. Also, since writing this I wanted to confirm my suspicion that the current Amazon DRM scheme is more akin to Apple's FairPlay "speed bump" than a serious playback control technology.

In short, it is.

The Kindle connects as an ordinary USB mass storage device with a simple folder structure, containing four root-level directories:

  • Audible: audio ebooks? (empty in my case)
  • documents: ebooks
  • music (empty in my case)
  • system: Not exactly what it sounds like — it doesn't actually contain the operating system, only auxiliary data files used by system software. I suppose it's sensible enough not to let some clueless user bork their OS by accidentally dragging this file to the trash. (I suspect that there's a backdoor code that will mount the OS/firmware as well; at least, that's how I'd design this device if I were a developer and wanted to debug it.)

For each ebook, the documents folder contains at least one .azw, .azw1, or .tpz file, and usually a .mbp or .tan file that stores some auxiliary data. Your "clippings" file (containing excerpts that you highlight or note) is stored as a plain .txt file (yay).

Free samples and free public domain ebooks from Amazon are not DRM-restricted. Purchased books, of course, are.

Incidentally, no technology in the Kindle device prevents copying. As noted, the Kindle mounts as an ordinary USB mass storage device, and it is inherent in the filesystem abstraction that you can do simple things like copy the entire contents of the documents folder onto your hard drive. You can do it once or a thousand times, and no technology even tries to stop you. This is an inherent function of the type of device that Amazon has made.
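As a minimal sketch of such a backup, assuming the Kindle is already mounted somewhere on your filesystem: the function name `backup_documents`, the mount point, and the backup location are all hypothetical, and the extension list simply mirrors the file types mentioned above. It copies files as-is and does nothing about DRM.

```python
import shutil
from pathlib import Path

# File types seen in the Kindle's documents folder, per the listing above.
EBOOK_SUFFIXES = {".azw", ".azw1", ".tpz", ".mbp", ".tan", ".txt"}


def backup_documents(kindle_root, backup_dir):
    """Copy ebook-related files from <kindle_root>/documents to backup_dir.

    Only looks at the top level of the documents folder (an assumption).
    Returns the sorted list of file names that were copied.
    """
    source = Path(kindle_root) / "documents"
    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source.iterdir()):
        if path.is_file() and path.suffix.lower() in EBOOK_SUFFIXES:
            # copy2 preserves timestamps along with the file contents.
            shutil.copy2(path, target / path.name)
            copied.append(path.name)
    return copied
```

A call might look like `backup_documents("/media/Kindle", Path.home() / "kindle-backup")`, where both paths are placeholders for wherever your system mounts the device and wherever you keep backups.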

What the files' DRM prevents, in theory, is "playing back" the files' content on some other device after it has been copied. But of course, it doesn't really do that in practice. Without going into details, there are downloadable programs on the Internet, widely available in source and executable forms, which can extract the contents of a restricted AZW file.*

So, in short, it's trivial for you to back up your ebook library. If you're a programmer, it's also pretty easy to write a script that will harvest your entire ebook library, shuck off the obnoxious DRM enclosure, and transcode the contents into some other format. Nontechnical users, unfortunately, don't have easy access to the DRM removal/transcoding step, although this may change as the transcoding software matures and distribution channels route around the legal jurisdictions where this software is banned.

Anyway, as I wrote in my earlier post, I'm hoping that the content production cartels will eventually realize that DRM serves Amazon's interests, not theirs, and abandon even the "speed bump" DRM currently in place. In the meantime, I've found that the value of having a dozen unread books in my bag at any given time, and being able to buy and read a book instantly at midnight on a Sunday, is sufficiently huge that I'm willing to make the compromise.**

*Even if this weren't true, people determined to infringe copyright for monetary or other gain will do so. DRM does not prevent the widespread, willful, uncompensated distribution of copyrighted content. The only thing that DRM does is prevent legitimate paying customers from getting the value from their books that they have been promised by electronic booksellers' use of the phrase "buy this book".

**A compromise, incidentally, that I was never willing to make with iTunes DRM. I suppose that the value I get in my life from reading greatly exceeds the value I get from music.

Monday, February 15, 2010

Q: How is Spock like a fortune cookie?

A: His major lines are vastly improved when you add the suffix "in bed". For example, in last year's Star Trek movie:

  • (To Uhura): "I need everyone to continue performing admirably. In bed."
  • (To himself): "Do yourself a favor: put aside logic, and do what feels right. In bed."
  • (To himself): "As my customary farewell would seem oddly self-serving, I will simply say: Good luck. In bed."
  • (To Kirk): "I will not allow you to lecture me about the merits of emotion. In bed."
  • Kirk: "You know, traveling through time, changing history... that's cheating."
    Spock: "A trick I learned from an old friend. In bed."
  • Spock: "Furthermore, you have failed to understand the purpose of the test."
    Kirk: "Enlighten me again."
    Spock: "The purpose is to experience fear, fear in the face of certain death, to accept that fear, and maintain control of oneself and one's crew. This is a quality expected in every Starfleet captain. In bed."
  • Bones: "You know, back home we have a saying: 'If you wanna ride in the Kentucky Derby, you don't leave your prized stallion in the stable.'"
    Spock: "A curious metaphor, doctor, as a stallion must first be broken before it can reach its potential. In bed."

Thursday, February 11, 2010

Electronic goods markets: end-to-end wins?

Hypothesis: As industries of cultural production adapt to digital distribution, content publishers in each industry will follow, with minor variations, the four-phase pattern set by the music industry:

  1. Denial: Publishers pretend that digital distribution does not exist, attempting to salvage business models based on distribution of physical media. In some cases, publishers use the legal system to try to make this fantasy a reality. Regardless of the legal outcomes, this proves unsustainable in the long run. During this phase, publishers may make halfhearted forays into digital publishing, which invariably fail because they are deeply and deliberately user-hostile.
  2. Faustian Bargain: A technology company designs a system which disguises computers' fundamentally general nature with a fig leaf of DRM. The disguise allows this company to strike a deal with major content publishing cartels to distribute content. Because a technology company has taken control of the technology, the system finally works in a way that doesn't make customers want to tear their hair out. The DRM system fails to prevent widespread copyright infringement, but it provides a hook for the technology company to build a vertically integrated stack which is somewhat inconvenient for customers to exit.
  3. Clash of the Titans: Publishers realize that they are in a weakening bargaining position with respect to the technology company, which has acquired considerable monopsony power due to its control of the platform. Publishers butt heads with the technology company over prices and other contractual terms. This, too, proves unsustainable.
  4. End-to-End Wins: Publishers realize that architectures which embed control in the distribution mechanism put more power in the hands of middlemen than endpoints. Conversely, end-to-end architectures, wherein the endpoints negotiate the transaction and any number of interchangeable mechanisms carry data between them on a best-effort basis, place power in the hands of endpoints rather than middlemen. Publishers furthermore realize that publishers and customers are the endpoints; that in the long run both are best served when the customer can purchase a bundle of data which is not bound (even weakly) to the sales channel, the software stack, or the physical device, all of which are intermediaries between the content and the customer. Publishers finally offer their content in a portable format via multiple sales channels.

This is just a hypothesis. I'm not sure I believe it. However, to show that expecting the final stage is not laughably utopian, I offer Sony and Warner's deals with eMusic and the introduction of DRM-free tracks on iTunes as evidence that stage 4 is already happening for music.

Detailed application of the above model to the current hoopla in the e-book market is left as an exercise for the reader. However, I will note that one reason I bought a Kindle is that I thought book publishers were so ornery, retrograde, and technophobic that they'd never progress to stage 4 unless they had an obnoxious would-be monopsonist (viz., Amazon) to frighten them through stage 3.

(A counterpoint to the above argument would be to observe that certain goods, like streaming video and computer games, appear to be evolving in the direction of fairly strong architectures of control. Neither Netflix streaming nor Steam gives you much freedom w.r.t. your "purchase". It's unclear whether this means their respective markets haven't progressed far enough yet, or there's something fundamentally different about these media.)

Monday, February 08, 2010

J. Blow: Games as Instruments for Observing Our Universe

You might be dissuaded from listening to this talk by Jonathan Blow because it's distributed as a PowerPoint presentation and a couple of MP3s, or else because it's nominally about the much-maligned artifacts of human civilization commonly called "video games".

You would be making a mistake.

Jonathan Blow is a minor genius, and this talk is worthy of attention from anyone interested in science or art or really any creative activity. I have previously mocked video game apologists for viewing games as a failed (or at least not-quite-successful-yet) aspirant to "interactive cinema" — the teleological destiny of gaming, by this aesthetic, being the creation of an action movie in which You! Are! The! Hero! — and Blow is perhaps the most articulate proponent of the opposite view.

E. W. Dijkstra famously said that "Computer Science is no more about computers than astronomy is about telescopes." He was suggesting that there are properties of the universe (viz., certain mathematical truths) that can only be inspected by studying algorithms, which humans can only do through the construction of computing devices. Per Dijkstra, the devices are not the point, or at least not the only point.

In practice, most of computer science amounts to cleverly engineering around messes that humans have created; but sometimes you do glimpse something which appears to be a property of the broader universe. This is a point that is mostly unappreciated by non-computer-scientists, who assume that the essence of computer science is fiddling around with gadgets.

Similarly, the word "game" applies, in the broadest sense, to any system of rules with which one or more agents interact. Blow's basic point is that the generative systems of rules that we call games can be profound devices for exploring truth, just like the generative systems of rules we call algorithms. But that's a pretty inadequate summary of the talk. You should really listen to the talk itself.

(The Q&A is longer and somewhat more inside-baseball w.r.t. the Game Industry as it actually exists today, and therefore less interesting overall, although there are some good bits there too.)