And yet the desktop GUI persists - and very few pundits acknowledge the elephant in the room and try to address why it persists, in the face of many attempts to replace it. (Steven F does at least touch on the subject, though he attributes it mostly to inertia rather than looking for a deeper cause.) I think it's instructive to look back at some of the failed attempts to replace it, and the reasons why they failed. For discussion's sake, I'll divide them into two main groups here: the 'cartoon map' camp and the 'complex GUI/desktop++' camp.
The "cartoon map"
This camp developed from the premise that the desktop GUI is still too complex and too abstract, and that the GUI metaphor needs to be even simpler and more direct. Hence the 'cartoon map': the entire interface is drawn as detailed pictures of actual rooms, desks, filing cabinets, telephones, clocks, etc. The effect is rather like a graphical adventure game. Here are some of the examples I can remember:
- Microsoft BOB: If the cartoon map had a poster boy, this would be it, as one of the most notorious and best-remembered 'maps' out there. Introduced in 1995, BOB was supposed to be a replacement interface for Windows 3.1 and Windows 95. It flopped.
- Magic Cap: An early PDA operating system from the mid-90s, Magic Cap powered devices from Sony (MagicLink), Motorola (Envoy) and Magic Cap's creator General Magic (DataRover). It also flopped.
- eWorld: Another mid-90s entrant (are we seeing a pattern here?), eWorld was Apple's attempt to build their own online service, based partly on America Online software and structure but with graphical navigation maps and a higher emphasis on style. Another flop, though some have argued this was more because of poor promotion and lack of support than because of flaws in the interface.
- Andrew Tobias' Managing Your Money 5.0 (Macintosh): An alternate opening menu screen displayed a graphical office view to represent program functions, complete with a little 'mouse-hole' for exiting. (Sadly, I couldn't find any screenshots.) The screen was gone from later versions of the program.
The Lesson: Treat users as adults, not children, if you want to succeed.
The "complex GUI/desktop++"
This camp comes at things from the opposite perspective. The desktop metaphor is too limiting, too restrictive, too chained to real-world metaphors; UI design needs to break away from it and build a richer and more complex interface that gives users more power and flexibility. There isn't the kind of consistency here that we saw with the cartoon map era, but there are some concepts that show up frequently:
- Compound Documents: Instead of focusing on applications that then create specific kinds of documents (e.g. Excel creating spreadsheets, Adobe Illustrator creating graphics, etc.), we should consider the document as the starting point, and use various modules to create different kinds of content within that framework.
- No More Files: On the flip side, why have documents at all? The very idea of documents filed away in a hierarchical structure of folders is too limiting; do away with the folder structure and just let files float around. Let the computer itself keep track of them for you; after all, isn't that something it's supposed to be good at? Or go a step further, and do away with files altogether, and have all your information float around in a data 'soup'.
- 3D Visualization: This group agrees that graphical/visual/spatial methods of display and information organization have merit, but they view the fixed 2D frame of the traditional desktop metaphor as too limiting; they seek to add additional content and/or organizational abilities by incorporating a third dimension into the interface in some fashion.
While some next-gen UI proponents want to start with a clean slate and build a new system from the ground up, most of them view the desktop paradigm as too embedded to get rid of completely, and build their innovations on top of existing GUIs; hence 'desktop++'.
Here are some of the UI experiments I'm familiar with - or at least have heard of, can remember off the top of my head, and can come up with some kind of reference for - incorporating one or more of the elements above. (There are several more where I frankly couldn't remember names well enough to track down references.)
- HotSauce: Based on the Meta Content Framework (MCF) created at Apple in the mid-90s, HotSauce was billed as an alternate way of navigating information on websites in 3D-space. Users saw a website as a set of floating information tags, which they could 'fly through' like a virtual reality scene out of a movie, clicking on a tag to view associated information. HotSauce faded away within a couple of years, MCF metamorphosed into RDF, and hardly anyone remembers it these days. I played with it for a few days and then dumped it; as cool as flying through 3D-space may look in the movies, in real life it was cluttered, and things were very hard to find, as this screenshot demonstrates.
- OpenDoc: While not a UI per se, OpenDoc was one of the most 'complete' implementations of the compound document idea. Born at Apple in the early 90s and publicly released in System 7.5, OpenDoc was officially killed when Steve Jobs returned in 1997 - but it was effectively dead before then, with an initial burst of enthusiasm followed by lots of 'how do we make this work?' and stagnation. Others have had their own takes on why it failed; mine is that, like AOCE, OpenDoc was a victim of its own complexity. You needed too many software components to assemble something useful, putting together a compound document was more work than just doing a 'simple' document, and performance was poor on systems of the day.
- The Humane Interface: Originally named The Humane Environment, after the book The Humane Interface, and later renamed Archy, this was UI expert (and former Macintosh team member) Jef Raskin's last project before his death. Archy is one of the 'clean-slate' projects: it completely dumps the windowing interface and any concept of a filesystem, and goes with a 'Zooming User Interface' (ZUI) instead. All 'documents' are seen as items on an infinitely large 2D plane; items can be found by zooming out, scrolling, and zooming back in, or by instant, always-available text search. It is also a compound-document system, intended to eliminate applications through the concept of 'commands' that can be typed anywhere in any document, and can be installed either individually or in groups of related functionality. (For example, sending email would involve typing the body of the mail, typing the address, selecting both and typing the Send Mail command.) So far, the system doesn't appear to have gained any traction outside of a small group of fans. The main problems I have with it are scalability (how well will a ZUI document system as described work when dealing with 5 or 10 years of document accumulation, all browsed on a single plane?) and typed commands (which require either lots of rote memorization or reference to documentation, and are not discoverable). Also, the system seems to be focused heavily on creating and editing text documents; I haven't seen anything in the references I've found on how the system is expected to handle graphical documents, nor anything about how to handle stand-alone applications that don't create documents, like games.
- The Newton: Steven F covers this at some length in his post, particularly the clean-slate nature of the design and the 'data soup' system where any application could use bits of data from any other application. The Newton's marketplace failure has been discussed extensively elsewhere, but I'd like to comment on the 'data soup' idea. In practice, I found it pretty problematic, primarily because it didn't handle removable storage well. It was often difficult to tell if a bit of data was on internal storage or removable, with the result that data would often suddenly go missing when you pulled a storage card. Not good.
- 'Piles'/Stacks: Long-rumored after an Apple patent in 1994, 'Piles' were supposed to be a major rethinking of document organization. Documents could be grouped together and moved as a unit, or 'pile'; the pile would be a 3D graphical representation of the documents in it, and could be fanned through, searched/sorted, organized into sub-piles, and manipulated in other ways. When it finally debuted as 'Stacks' in OS X 10.5, it was considerably less ambitious: merely an alternate way of viewing folders in the Dock, replacing the existing folder icon with a 'composite' icon built from the icons of all the files in the folder, and 'fanning' the folder contents when clicked on. Stacks drew a great deal of criticism at 10.5's release: the composite icon was poorly done and not truly representative of the folder contents, the curve of the 'fan' made icons somewhat harder to target, and the new behavior replaced the former behavior of popping up a list of folder contents when clicked on. Later updates to 10.5 allowed users to (mostly) restore the prior folder behavior.
Verdict: Although many people have called for the replacement of the traditional desktop metaphor/GUI over the years, to date every replacement for it has failed to gain significant adoption.
The Lesson: While proponents of new UI paradigms will often acknowledge that they are more complex than current GUIs, they contend that increasing familiarity with computers on the part of the general public renders this moot; users are more sophisticated now than they were when GUIs first received broad adoption, and will take the added complexity in stride. And they contend that the additional power brought by going beyond the desktop metaphor makes the higher learning curve worth it. I think the record demonstrates otherwise.
A third option - "Just Right?"
The 'cartoon maps' were too simple, too condescending. The attempts to create a richer, more complex environment were too complex. So what's left? Are we stuck with the desktop metaphor forever? I don't think so, but I think any attempt to change it will need to walk a middle ground, the Goldilocks option - not so simple that it's useless, not so complex that it's hard, but balancing usability with power. And in some ways I think the iPhone suggests a path to move forward.
Steven F points to the iPhone as the first truly successful new UI paradigm since the desktop metaphor, and while it's still pretty early to be judging that, I think he's got a point. Pen-based computers, as he points out, have generally been unsuccessful. PalmOS was successful; but while it didn't use windows, in most other ways the interface was the standard GUI writ small, with scroll bars, a (hidden) menu bar, and so forth, with the stylus taking the place of a mouse. The various PDA flavors of Windows took this to a ludicrous degree, aping desktop Windows features like the Start menu, Task Bar, and Windows-style scroll bars on a screen generally far too small to comfortably accommodate them. By contrast, the iPhone OS replaced many of the traditional GUI 'widgets' with direct manipulation of the interface; instead of using a scroll bar, for example, users scrolled by drawing a finger directly across the scrolling area. The key here, I think, is that instead of adding increasing levels of abstraction and complexity, iPhone OS reduced them - there were fewer intermediaries between the user and the interface operation.
While the iPhone OS has managed to establish itself as an alternative UI paradigm, it has only done so on a handheld device; unfortunately, the experience does not translate well to a traditional computer. Touchscreens, as many have pointed out, become tedious and even painful to operate over an extended period when used on a vertical screen; laying them flat removes the stress from holding the arm out at length, but replaces it with stress from the neck constantly bending over to observe the display. As a direct model to copy, therefore, the iPhone isn't much help. But I think it does suggest a useful principle: empower the user but maintain simplicity, by replacing complex abstractions with more directly manipulated ones.
Note here that I do not consider a command-line interface to be removing abstractions, as many CLI fans do. While they may argue - and correctly - that any GUI involves much more abstraction from the actual operation of the machine, this is only true from the standpoint of the computer. From the standpoint of the user, a CLI is a much greater cognitive abstraction; it requires the user to hold the command set in their mind or look it up in an external reference, and requires them to memorize when, where, and how to use it. (This is the killer flaw in Archy, in my opinion, as well as in the Oberon system described by Mathis.) A good GUI, by contrast, is discoverable; you can try interacting with the system and see what happens, because there are things displayed on-screen that you can manipulate without having to study beforehand.