Buxton, W. (1990). A Three-State Model of Graphical Input. In D. Diaper et al. (Eds), Human-Computer Interaction - INTERACT '90. Amsterdam: Elsevier Science Publishers B.V. (North-Holland), 449-456.
William A.S. BUXTON
Computer Systems Research Institute, University of Toronto, Toronto, Ontario, Canada M5S 1A4
A model to help characterize graphical input is presented. It is a refinement of a model first introduced by Buxton, Hill and Rowley (1985). The importance of the model is that it can characterize both many of the demands of interactive transactions, and many of the capabilities of input transducers. Hence, it provides a simple and usable means to aid finding a match between the two.
After an introduction, an overview
of approaches to categorizing input is presented. The model is then described
and discussed in terms of a number of different input technologies and
techniques.
If language is a tool of thought, then a start to addressing this shortcoming is to develop a vocabulary that is common to both devices and techniques, and that augments our ability to recognize and explore relationships between the two.
In the remainder of this paper, a
model which contributes to such a vocabulary is presented. It is a simple
state-transition model which elucidates a number of properties of both
devices and techniques.
Several studies have attempted to evaluate input technologies from the perspective of human performance. Many of these are summarized in Greenstein and Arnaut (1988) and Milner (1988). A common problem with such studies, however, is that they are often overly device-specific. While they may say something about a particular device in a particular task, many do not contribute significantly to the development of a general model of human performance. (There are exceptions, of course, such as Card, English and Burr, 1978.)
With the objective of isolating more fundamental issues, some researchers have attempted to categorize input technologies and/or techniques along dimensions more meaningful than simply "joystick" or "trackball." The underlying assumption in such efforts is that better abstractions can lead us from phenomenological descriptions to more general models, and hence better analogies.
An early development in this regard was the concept of logical devices (GSPC, 1977, 1979). This occurred in the effort to evolve device-independent graphics. The object was to provide standardized subroutines that sat between physical input devices and applications. By imposing this intermediate layer, applications could be written independent of the idiosyncrasies of specific devices. The result was more standardized, general, maintainable, and portable code.
As seen by the application, logical devices were defined in terms of the type of data that they provided. There was one logical device for what the designers felt was each generic class of input. Consequently, we had the beginnings of a taxonomy based on use. The devices included a pick (for selecting objects), a locator (for indicating position), and text (for entering strings).
Foley, Wallace, and Chan (1984) took the notion of logical devices, and cast them more in the human perspective than that of the application software. They identified six generic transactions (which were more-or-less the counterparts of the GSPC logical devices) that reflected the user's intentions:
• select an object;
• position an object in 1, 2, 3 or more dimensions;
• orient an object in 1, 2, 3 or more dimensions;
• ink, i.e., draw a line;
• text, i.e., enter text;
• value, i.e., specify a scalar value.
Buxton (1983) introduced a taxonomy of input devices that was more rooted in the human motor/sensory system. The concern in this case was the ability of various transducers to capture the human gesture appropriate for articulating particular intentions. Consequently, input devices were categorized by things such as the property sensed (position, motion, or pressure), the number of dimensions sensed, and the muscle groups required to use them.
Recent research at Xerox PARC has built on this work (Card, Mackinlay and Robertson, 1990; Mackinlay, Card and Robertson, 1990). Their taxonomy captures a broad part of the design space of input devices. The model captures both the discrete and continuous properties of devices (unlike that of Buxton, which could deal only with the latter).
Together, this collected research begins to lay a foundation to support design. However, none of the models are complete in themselves, and there are still significant gaps. One is the lack of a vocabulary that is capable of capturing salient features of interactive techniques and technologies in such a way as to afford finding better matches between the two.
In what follows, a model (first suggested
in Buxton, Hill and Rowley, 1985) is developed which provides the start
to such a vocabulary. It takes the form of a simple state-transition model
and builds on the work mentioned above. In particular, it refines the notion
of device state introduced in the PARC model of Card, Mackinlay and Robertson
(1990) and Mackinlay, Card and Robertson (1990).
Figure 1. Simple 2-State Transaction
In this case we also get two states, but only one is common to the previous example. This is illustrated in Fig. 2. The first state (State 0) is what we will call out of range (OOR). In this state, any movement of the finger has no effect on the system. It is only when the finger comes in contact with the tablet that we enter State 1, the tracking state seen in the previous example.
Figure 2. State 0-1 Transaction
Figure 3. State 0-1-2 Transaction
The three states introduced in the above examples are the basic elements of the model. There can be some variations. For example, with a multi-button mouse (or the use of multiple clicks), State 2 becomes a set of states, indexed by button number, as illustrated in Fig. 4.
Figure 4. State 2 Set
Note that in what follows, states
will be referred to by number (State 1, for example) rather than by description.
The consequences of the action performed in a particular state can vary.
For example, State 2 could just as easily have been "inking" or "rubber banding"
as "dragging." The ordinal nomenclature is more neutral, and will be used.
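The three states and their transitions can be sketched as a small state machine. This is an illustrative sketch, not from the paper: the event names (enter_range, tip_down, etc.) are invented, and the transitions shown are those of a States 0-1-2 device such as a stylus on a sensing tablet.

```python
# Sketch of the three-state model as a state machine for a stylus/tablet
# (States 0-1-2). Event names are illustrative assumptions, not from the paper.

OOR, TRACKING, DRAGGING = 0, 1, 2  # State 0, State 1, State 2

# Legal transitions for a stylus-on-tablet style device.
TRANSITIONS = {
    (OOR, "enter_range"): TRACKING,    # stylus comes into proximity
    (TRACKING, "leave_range"): OOR,    # stylus lifted out of range
    (TRACKING, "tip_down"): DRAGGING,  # tip switch closed
    (DRAGGING, "tip_up"): TRACKING,    # tip switch released
}

def step(state, event):
    """Return the next state; events illegal in `state` are ignored."""
    return TRANSITIONS.get((state, event), state)

state = OOR
for event in ["enter_range", "tip_down", "tip_up", "leave_range"]:
    state = step(state, event)
```

A multi-button mouse (the State 2 set of Fig. 4) would be modeled the same way, with one DRAGGING-like state per button.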
Transaction        State 0  State 1  State 2  Notes
Pursuit Track               x
Point/Select                x        x
Drag                        x        x        State 2 motion
Rubber Banding              x        x        State 2 motion
Sweep Region                x        x        State 2 motion
Pop/Pull Menu               x        x        State 2 motion
Ink                         x        x        State 2 motion
Char Recognition   x        x

Table 1: State Characteristics of Several Classes of Transaction
A number of representative types of transactions are listed, showing their state and state-transition requirements. The table helps verify whether a particular transducer is well suited to a given class of transaction.
Device             State 0  State 1  State 2  Notes
Joystick & Button           x        x 3
Trackball                   x        4
Mouse              x        x        x
Tablet & Stylus    x        x        x
Tablet & Puck      1        x        x
Touch Tablet       x        x        4, 5
Touch Screen       x        x        x 2
Light Pen          x        x        x        6

Table 2: State Characteristics of Several Input Devices (numbered entries refer to table notes)
If this cue is to be used in this or other types of transitions, it is important to note that input devices vary in how well they signal the transition. In particular, the majority of tablets (touch and otherwise) give no explicit signal at all. Rather, the onus is on the application to sense the absence of State 1 tracking information, rather than on the transducer to send an explicit signal that the pointing device has gone out of range.
Not only does this put an additional
burden on the software implementation and execution, it imposes an inherent
and unacceptable delay in responding to the user's action. Consequently,
designers relying heavily on this signal should carefully evaluate the
technologies under consideration if optimal performance and efficiency
are desired.
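Where the transducer sends no explicit out-of-range signal, the application can only infer State 0 from silence, for example by a timeout on tracking reports. The following is a hypothetical sketch of that inference; the class, its API, and the 50 ms threshold are all invented for illustration.

```python
import time

# Sketch: inferring the out-of-range (State 0) transition from the absence
# of State 1 tracking reports. The 50 ms threshold is an invented value;
# note the inherent delay the paper criticizes: we cannot detect State 0
# any sooner than the timeout itself.

class RangeDetector:
    def __init__(self, timeout=0.050, clock=time.monotonic):
        self.timeout = timeout      # seconds of silence before assuming State 0
        self.clock = clock
        self.last_report = None

    def on_tracking_report(self):
        """Call whenever a State 1 coordinate report arrives."""
        self.last_report = self.clock()

    def out_of_range(self):
        """True once no report has arrived for `timeout` seconds."""
        if self.last_report is None:
            return True
        return self.clock() - self.last_report > self.timeout
```

The injectable `clock` makes the inherent latency easy to demonstrate: the detector only reports State 0 after the full timeout has elapsed.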
As commonly implemented, the pointing task is undertaken in State 1, and the selection is articulated by a State 1-2-1 transition, with no motion in State 2. This can be easily supported with any of the devices in Table 2 that have plain check marks (x) in the State 1 and State 2 columns.
Some transducers, including trackballs, many joysticks, and touch tablets, do not generally support State 2. For the most part this is due to their not having buttons tightly integrated into their design. Therefore, they warrant special mention.
One approach to dealing with this is to use a supplementary button. With joysticks and trackballs, these are often added to the base. With trackballs, such buttons can often be operated with the same hand as the trackball. With joysticks this is not the case, and another limb (hand, foot, etc.) must be employed.² As two-handed input becomes increasingly important, using two hands to do the work of one may be a waste. The second hand being used to push the joystick or touch tablet button could be used more profitably elsewhere.
An alternative method for generating the selection signal is by a State 1-0 transition (assuming a device that supports both of these states). An example would be a binary touch tablet, where lifting your finger off the tablet while pointing at an object could imply that the object is selected. Note, however, that this technique does not extend to support transactions that require motion in State 2 (see below). An alternative approach, suitable for the touch tablet, is to use a pressure threshold crossing to signal the state change (Buxton, Hill, Rowley, 1985). This, however, requires a pressure sensing transducer.
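A pressure-threshold state change of the kind described by Buxton, Hill and Rowley can be sketched as a simple mapping from a continuous pressure signal to the model's states. Both threshold values below are invented for illustration.

```python
# Sketch: mapping normalized pressure [0, 1] from a pressure-sensing touch
# tablet onto States 0/1/2. Both thresholds are invented example values.

TOUCH_THRESHOLD = 0.05   # below this: out of range (State 0)
SELECT_THRESHOLD = 0.60  # above this: firm press (State 2)

def pressure_to_state(pressure):
    """Map a normalized pressure reading to the model's state number."""
    if pressure < TOUCH_THRESHOLD:
        return 0  # finger off the tablet
    if pressure < SELECT_THRESHOLD:
        return 1  # light touch: tracking
    return 2      # firm press: selecting/dragging
```

Unlike the State 1-0 lift-off technique, this mapping supports motion in State 2, since the finger stays on the surface while pressing.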
The selection signal can also be generated via a timeout cue. That is, if I point at something and remain in that position for some interval Δt, then that object is deemed selected. The problem with this technique, however, is that the speed of interaction is limited by the requisite Δt interval.
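The timeout cue can be sketched as a dwell detector. This is a hypothetical illustration: the class, the 1-second Δt, and the motion tolerance are all invented values.

```python
# Sketch: dwell-based selection. An object is deemed selected once the
# pointer stays within `tolerance` pixels of a position for `dwell` seconds.
# Both parameter values are illustrative assumptions.

class DwellSelector:
    def __init__(self, dwell=1.0, tolerance=3.0):
        self.dwell = dwell          # the requisite interval Δt, in seconds
        self.tolerance = tolerance  # maximum movement, in pixels
        self.anchor = None          # (x, y, t) where dwelling began

    def update(self, x, y, t):
        """Feed one tracking sample; return True when a selection fires."""
        if self.anchor is None:
            self.anchor = (x, y, t)
            return False
        ax, ay, at = self.anchor
        if abs(x - ax) > self.tolerance or abs(y - ay) > self.tolerance:
            self.anchor = (x, y, t)  # moved too far: restart the clock
            return False
        if t - at >= self.dwell:
            self.anchor = (x, y, t)  # fire, then re-arm
            return True
        return False
```

The sketch makes the paper's objection concrete: no selection can ever fire in less than Δt, so Δt bounds the interaction rate.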
• rubber-banding: as with lines, windows, or sweeping out regions on the screen;
• pull-down menus;
• inking: as with painting or drawing;
• character recognition: which may or may not leave an ink trail.
• Access to State 2;
• Ease of motion while maintaining State 2.
It is this more obscure second point which presents the biggest potential impediment to performance. For example, this paper is being written on a Macintosh Portable, which uses a trackball. While pointing and selecting work reasonably well, this class of transaction does not. Even though both requisite states are accessible, maintaining continuous motion in State 2 requires holding down a space-bar-like button with the thumb while operating the trackball with the fingers of the same hand. Consequently, the hand is under tension, and the acuity of motion is seriously affected compared to State 1, where the hand is in a relaxed state.
In terms of the model under discussion, these devices have an important property: in some cases (especially with touch screens), the pointing device itself (stylus or finger) is the tracking "symbol." What this means is that they "track" when out of range. In this usage, we would describe these devices as making transitions directly between State 0 and State 2, as illustrated in Fig. 5.
Figure 5. State 0-2 Transitions
Second, there are examples where these same direct devices are used with an explicit State 1. For example, light pens generally employ an explicit tracking symbol. Touch screens can also be used in this way, as has been shown by Potter, Shneiderman, and Weldon (1988) and Sears and Shneiderman (1989), among others. In these touch screen examples, the purpose was to improve pointing accuracy. Without going into the effectiveness of the technique, what is important is that this type of usage converts the direct technology into a State 0-1 device.
Consider the case of the touch screen for a moment. Choosing this approach means that the price paid for the increased accuracy is the loss of direct access to State-2-dependent transactions (such as selection and dragging). Anything beyond pointing (however accurate) requires special new procedures (as discussed above in Sections 4.1 and 4.2).
The model goes beyond that previously introduced by Buxton (1983) in that it deals with the continuous and discrete components of transducers in an integrated manner. However, it has some weaknesses. In particular, in its current form it does not cope well with representing transducers capable of pressure sensing on their surface or their buttons (for example, a stylus with a pressure sensitive tip switch used to control line thickness in a drawing program).
Despite these limitations, the model
provides a useful conceptualization of some of the basic properties of
input devices and interactive techniques. Further research is required
to expand and refine it.
2. I distinguish between joysticks with buttons integrated on the stick and those without. With the former, the stick and button can be operated with one hand. With the latter, two-handed operation is required. Note, however, that in the former case operation of the button may affect performance of operation of the stick.
Buxton, W., Hill, R. & Rowley, P. (1985). Issues and Techniques in Touch-Sensitive Tablet Input. Computer Graphics, 19(3), Proceedings of SIGGRAPH '85, 215-224.
Card, S., English, W. & Burr, B. (1978). Evaluation of Mouse, Rate-Controlled Isometric Joystick, Step Keys and Text Keys for Text Selection on a CRT. Ergonomics, 21(8), 601-613.
Card, S., Mackinlay, J. D. & Robertson, G. G. (1990). The design space of input devices. Proceedings of CHI '90, ACM Conference on Human Factors in Computing Systems, in press.
Foley, J. & Van Dam, A. (1982). Fundamentals of Interactive Computer Graphics, Reading, MA: Addison-Wesley.
Foley, J.D., Wallace, V.L. & Chan, P. (1984). The Human Factors of Computer Graphics Interaction Techniques. IEEE Computer Graphics and Applications, 4 (11), 13-48.
Greenstein, Joel S. & Arnaut, Lynn Y. (1988). Input Devices. In Helander, M. (Ed.), Handbook of Human-Computer Interaction. Amsterdam: North-Holland, 495-519.
GSPC (1977). Status Report of the Graphics Standards Planning Committee, Computer Graphics, 11(3).
GSPC (1979). Status Report of the Graphics Standards Planning Committee, Computer Graphics, 13(3).
Mackinlay, J. D., Card, S. & Robertson, G. G. (1990). A semantic analysis of the design space of input devices. Human-Computer Interaction, in press.
Milner, N. (1988). A review of human performance and preferences with different input devices to computer systems. In D. Jones & R. Winder (Eds.), People and Computers IV, Proceedings of the Fourth Conference of the British Computer Society Human-Computer Interaction Specialist Group. Cambridge: Cambridge University Press, 341-362.
Potter, R., Shneiderman, B. & Weldon, L. (1988). Improving the accuracy of touch screens: an experimental evaluation of three strategies. Proceedings of CHI'88, 27-32.
Sears, A. & Shneiderman, B. (1989). High precision touchscreens: design strategies and comparisons with a mouse, CAR-TR-450. College Park, Maryland: Center for Automation Research, University of Maryland.
Sherr, S. (Ed.)(1988). Input Devices. Boston: Academic Press.
Zimmerman, T.G., Lanier, J., Blanchard, C., Bryson, S. & Harvill, Y. (1987). A Hand Gesture Interface Device. Proceedings of CHI+GI '87, 189-192.