The Surface of the Metaverse
Jamais Cascio
2007-05-31

Looking for all the world like one of those old Ms. Pac-Man video game tables found in older bars and pizza joints, the Surface device combines a high-power Windows computer with a 30" display, set horizontally. Surface is controlled by touching this screen with one or more fingers, manipulating images in a reasonably intuitive manner.

The system bears a remarkable resemblance to the multi-touch display Jeff Han demonstrated at TED in 2006, but it's unclear just how much (if anything) he had to do with the Microsoft product. Surface does include some nifty features that Han's vertically mounted screens couldn't offer, such as recognizing when a digital device has been put onto the table and reacting accordingly -- downloading pictures from cameras, opening up a jukebox app for an MP3 player, etc. I was impressed by the gestural controls for these features (such as "tossing" a file towards a device to upload it); a key aspect of a usable kinesthetic interface has to be a subtle sense of physics, so that "objects" (virtual though they may be) have a perceived mass and momentum.
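That perceived mass and momentum is, at bottom, a simple inertia model: a "tossed" object keeps gliding after the finger lifts and slows under friction rather than stopping dead. Here's a minimal TypeScript sketch of the idea -- all names and the friction constant are my own invention for illustration, not anything from Surface itself:

```typescript
// Illustrative inertia model: a flicked virtual object carries momentum
// and decays under friction, so it "feels" like it has mass.

interface Vec2 { x: number; y: number; }

class TossableObject {
  position: Vec2 = { x: 0, y: 0 };
  velocity: Vec2 = { x: 0, y: 0 };

  // Friction coefficient per second; tuned by feel, not real physics.
  private readonly friction = 2.5;

  // Called when the finger lifts: seed velocity from the last drag motion.
  release(flickVelocity: Vec2): void {
    this.velocity = { ...flickVelocity };
  }

  // Advance the simulation by dt seconds; the object glides and slows.
  step(dt: number): void {
    const decay = Math.exp(-this.friction * dt);
    this.velocity.x *= decay;
    this.velocity.y *= decay;
    this.position.x += this.velocity.x * dt;
    this.position.y += this.velocity.y * dt;
  }
}
```

Run `step()` once per animation frame and the object coasts towards its target (a photo pile, a docked camera) with exactly the kind of momentum that makes the gesture legible.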

Okay, nifty tech, undoubtedly terrifically expensive for the foreseeable future, but if it's at all functional -- and my guess is that it will be -- it's probably a progenitor of a device we'll have in our homes by the middle of the next decade, and will find in cereal boxes not too long after that.

What struck me while watching the demos and reading the breathless write-up in Popular Mechanics (of all places) was that the multi-touch display system is probably the apotheosis of the two-dimensional interface model. It comes the closest to treating virtual objects as occupying 3D space and having weight, without compromising the utility of more traditional flat documents and menus. Users aren't limited by a single point of contact with the display (e.g., a mouse pointer), breaking an ironclad law dating from the earliest days of computers. In the end, a mouse pointer and a text-insertion cursor are making the same claim: here is the sole point of interaction with the machine. Multi-touch interfaces (whether Microsoft's Surface, Apple's iPhone, or whatever) toss aside that fundamental rule.
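To make the single-point versus multi-point distinction concrete, here's a rough TypeScript sketch (types and names invented for illustration) of what an interaction model has to track once the sole-cursor assumption goes away: not one position, but an arbitrary set of simultaneous contacts, each with its own identity.

```typescript
// A single-pointer model stores one cursor position; a multi-touch model
// must track many concurrent contacts, each with a stable identity.

interface Contact {
  id: number;   // stable while the finger stays on the surface
  x: number;
  y: number;
}

class MultiTouchTracker {
  private contacts = new Map<number, Contact>();

  down(c: Contact): void { this.contacts.set(c.id, c); }
  move(c: Contact): void { this.contacts.set(c.id, c); }
  up(id: number): void { this.contacts.delete(id); }

  // Gestures fall out of relationships among contacts, e.g. the distance
  // between two fingers drives a pinch-zoom scale factor.
  pinchScale(a: number, b: number, initialDistance: number): number | null {
    const ca = this.contacts.get(a);
    const cb = this.contacts.get(b);
    if (!ca || !cb) return null;
    const d = Math.hypot(cb.x - ca.x, cb.y - ca.y);
    return d / initialDistance;
  }
}
```

The interesting gestures -- pinches, rotations, two-handed sorting -- live entirely in the relationships between contacts, which is precisely what a one-cursor model can never express.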

The appeal of Surface (etc.) for computing tasks, however, will be limited in many commonplace arenas. Multi-touch isn't going to make spreadsheets, blogging or surfing the web any simpler or more powerful. It will have some utility in photo and video editing, although here the question of whether greasy fingers will prove a regular problem rears its head. No, the real market for multi-touch is in the world of the Metaverse, especially in the Augmented Reality and Mirror Worlds versions.

(The final version of the Metaverse Roadmap Overview will be out in the next couple of weeks, if not sooner, btw.)

The core logic of both Mirror Worlds and Augmented Reality is the intertwining of physical reality and virtual space, in large measure to take advantage of an information substrate tied to spatial relationships. This substrate relies heavily upon abundant sensors, mobile devices and a willingness of citizens to tag/annotate/identify their environments. The Augmented Reality form emphasizes the in situ availability of the information substrate, while the Mirror Worlds form emphasizes its analytic and topsight power. In each case, the result is a flow of information about places, people, objects and context, one which relies on both history and dynamic interconnections. Multi-touch may well be the breakthrough technology that makes it possible to control those information flows.
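As a rough illustration of what one grain of that substrate might look like, here's a hypothetical TypeScript sketch -- the field names and the naive planar distance query are mine, not anything from the Roadmap:

```typescript
// One record in a hypothetical information substrate: an annotation
// pinned to a place, carrying history and links to other records.

interface GeoAnnotation {
  id: string;
  lat: number;            // where the annotation is anchored
  lon: number;
  tags: string[];         // citizen-supplied labels
  body: string;           // the annotation itself
  author: string;
  createdAt: Date;        // history: when it entered the substrate
  linkedIds: string[];    // dynamic interconnections to other records
}

// An AR client asks "what is known about the space around me?"; a Mirror
// Worlds client aggregates the same records for top-down analysis.
// Planar degree distance is a crude stand-in for real geo queries.
function nearby(all: GeoAnnotation[], lat: number, lon: number,
                radiusDeg: number): GeoAnnotation[] {
  return all.filter(a =>
    Math.hypot(a.lat - lat, a.lon - lon) <= radiusDeg);
}
```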

Both of these manifestations of the Metaverse could readily take advantage of an interface system that allowed complex kinetic and gestural controls, with Mirror Worlds working best with a massive table/wall screen, and Augmented Reality working best with a hand-held device -- or maybe just the hand. One of Jeff Han's insights while developing his multi-touch system was that human kinesthetic senses need something to push against to work right. "Tapping" something virtual in mid-air may look cool in the movies, but runs against how our bodies have evolved. Our muscles and minds expect something to be there, offering physical resistance, when we touch something. Rather than digital buttons floating in mid-air (or a total reliance on a so-called "conversational interface"), mobile systems will almost certainly have either a portable tablet or (in my view the eventual winner) a way of using one hand to draw on the other, mimicking a stylus and tablet. The parallel here is to the touchpad found on most laptops: imagine using similar gestures and motions, but on your other hand instead of on a piece of plastic.
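The touchpad parallel reduces to relative-motion mapping: the palm reports deltas, not absolute positions, and the system scales them onto a virtual display. A hypothetical TypeScript sketch (sensitivity and screen dimensions are arbitrary placeholders):

```typescript
// Touchpad-style relative pointing: a finger drawing on the palm emits
// motion deltas; absolute position on the "pad" is irrelevant.

class RelativePointer {
  x = 0;
  y = 0;

  constructor(private readonly sensitivity = 2.0,
              private readonly width = 1920,
              private readonly height = 1080) {}

  // Each sensed stroke segment nudges the pointer by a scaled delta,
  // clamped to the display bounds.
  applyDelta(dx: number, dy: number): void {
    this.x = Math.min(this.width, Math.max(0, this.x + dx * this.sensitivity));
    this.y = Math.min(this.height, Math.max(0, this.y + dy * this.sensitivity));
  }
}
```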

There are some obvious drawbacks to this interaction model -- from the aforementioned greasy fingers to the ergonomics of head and arm positions in extended use -- but my guess is that the number of innovative applications of the interface (most of which haven't even been imagined) will outweigh any initial physical clumsiness.