By Frank Spillers

It's great to see excitement building for the Apple iPhone (launch expected late June '07). Korea's LG Prada, an iPhone competitor, launched a few months ago in the UK.

Just this week a new touch-screen device, Taiwan's HTC Touch, was announced for immediate release in the UK and for release later this year in the US.

The hot attraction with the iPhone, the Prada and the Touch is, of course, that they offer touch and gesture-only user interaction. Clarification: only the Apple iPhone offers gesture-like pinching interaction (e.g. open-close; zoom in-out).

What's great about the excitement around these new phones is that they set a new precedent in communication device design for 2008-2010. They are, if you will, a generation up from the slide cover keyboard phones. These devices are hip, keyboard-free, MP3 and mobile Web-friendly.

Watch these demos if you are not familiar with how these new mobile devices work:

Bruce "Tog" Tognazzini offered an early usability review of iPhone features that is worth a read. But as one of our consultants pointed out: "How can you do a usability review of a device using only a demo from Apple's website? Is that a real usability evaluation?"

What is multi-modal interaction?

Multi-modal interaction is an area of Human-Computer Interaction (HCI) that has a long history of usability research and empirical study. The virtual reality and game design research communities have been studying it commercially since the early 1980s, with pioneering researchers such as Brenda Laurel at Atari.

Today researchers like Jeff Han (see images left) are meshing with industrial efforts to bring multi-modal interaction to life. Last week, Microsoft announced Surface, an interactive coffee-table surface that responds to touch and gesture. Users can explore information linked to objects placed on the desktop (T-Mobile will use it in stores to support handset purchase decisions).

The scenario videos on the Microsoft Surface website are worth a look. Very well presented.

Also, if you haven't seen Jeff Han's TED conference presentation, it's an amazing, must-see demonstration of his Surface-type workspace.

Why multi-modal interaction design?

Multi-modal design brings the spirit of HCI to life by harnessing the rich sensory input afforded by the human body-mind (touch, gesture, sight, sound, voice... and smell; well, not yet, but it's in the works).

As interface designers, we have really had to compromise with the flat and lifeless limitations of desktop PCs (windows, icons, menus) compared to the original vision of how humans should use computers, advocated in the late 1960s:

Computer graphics and interface pioneer Ivan Sutherland told us our computers should not be mere 2D screens that provide information, but instead they should be 'windows upon which we look into a virtual world...where we can see, hear and feel' multi-sensory information.

I began studying multi-modal interaction ten years ago as part of my early virtual reality research. It's an area of interface design that is truly fascinating for its potential. It's also challenging, because the right design depends on the context in which the user is interacting. As a designer, the questions become:

"Which sensory pathway does the user have available to complete this goal in that context?"

"Which sensory system is the lead, which is the secondary?"

"How much sensory over-lap is available, tangible, appropriate?"

"How do users back out of or recover from a screen event in a dynamically changing physical environment?"

Does gesture and touch interaction work in cars?

For the past five years I have closely followed the emerging 'Internet in your car' trends in the automotive industry (aka telematics usability). A practical example, which I've written about and spoken about at automotive telematics conferences, is the case of GM's OnStar. Several years ago, GM provided me with a fully loaded Cadillac CTS for a week, which I used to evaluate OnStar, a speech system for help, navigation, communication and additional information.

OnStar weighted its user interface toward "voice" or speech interaction over a multi-modal interface. The result was a clunky system with a history of poor user adoption and satisfaction. In 2001, 60% of OnStar systems were switched off in owners' vehicles. BMW, on the other hand, weighted its iDrive telematics solution toward a knob-like control (tactile interaction), with 700 features in menus at the turn of a dial.

The result: eroded brand loyalty and confused, frustrated customers (including usability guru and BMW customer Jakob Nielsen). Jakob Nielsen's wife said at the time that she would never buy a BMW again...

The pattern in the design flaws for telematics human factors engineers?

Don't put all your eggs in one basket by relying on a single modality.

It appears that neither GM nor BMW provided adequate multi-modal support, opting for a "lead" sensory system (speech, touch) over a mixed system.

I believe multi-modal interaction is generally better than a single modality (as an interaction design technique). But you must be careful when designing multi-modal interfaces, as Oregon Graduate Institute Professor Sharon Oviatt reminds us in her Ten Myths of Multi-modal Interaction (PDF).

Best Wishes,
Frank Spillers, MS