WE MEASURE USABILITY
by one stringent test and three loose tests.
The stringent test is called muscle event counting. One way
to discover which of two software approaches or indicator designs is easier to use is
to count the muscle group events each one needs for a
common set of operations. An operation requiring one
eye scan movement, a forearm movement, a finger movement and two voice
phrases (for a speech interface) would count as five muscle events. Why is this significant?
Each change of muscle group takes time, approximately one tenth
of a second, and requires a shift of attention from the task at hand.
The fewer muscle events there are, the less attention shifting
the user must do and thus the greater the perceived ease.
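As a rough illustration, the count can be tabulated per operation and compared across two designs. The sketch below is hypothetical: the two designs, their operations and event counts, and the helper function names are invented for illustration. Only the counting rule and the approximate one tenth of a second per event come from the description above.

```python
# Approximate time cost of one change of muscle group, per the text (seconds).
SECONDS_PER_EVENT = 0.1


def count_muscle_events(operation):
    """Return the number of muscle events in one operation.

    `operation` maps a muscle group action to how many times it occurs.
    """
    return sum(operation.values())


def total_muscle_events(design):
    """Sum muscle events over every operation in a design."""
    return sum(count_muscle_events(op) for op in design.values())


# The worked example from the text: 1 + 1 + 1 + 2 = 5 events.
speech_operation = {"eye scan": 1, "forearm": 1, "finger": 1, "voice phrase": 2}

# Two hypothetical designs measured on the same set of operations.
design_a = {
    "open file": {"eye scan": 1, "forearm": 1, "finger": 1},
    "save file": {"eye scan": 2, "forearm": 1, "finger": 2},
}
design_b = {
    "open file": {"eye scan": 1, "finger": 1},
    "save file": {"finger": 1},
}

for name, design in (("A", design_a), ("B", design_b)):
    events = total_muscle_events(design)
    print(f"Design {name}: {events} muscle events, "
          f"roughly {events * SECONDS_PER_EVENT:.1f} s of switching time")
```

The design with the lower total for the same set of operations is the one this test favors. The time figure is only an estimate of switching overhead, not of total task time.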
The first loose test is to throw away the instruction manual
and see how far one can get without it. Every
situation in which the user does not have adequate information
to carry out the task from whatever is otherwise presented by
the product is considered, by definition, an instance of poor
usability. This test normally brings howls of protest from vendor
developer teams until we remind them that users do not bring manuals
when traveling with their laptops. Note that needing to reference online help instead of the manual counts as additional muscle events.
The second loose test is a series of criteria. They are:
- visibility of system status - especially with regard to what
the system (speech recognizer in the case of voice interface) is currently doing
- degree of match between user concepts and product conventions - e.g.
no computerese language
- presence and depth of emergency exits - e.g. undo, redo, don't
do, please do because I need it very badly at this particular
crucial moment even if the product doesn't think so
- the ease with which various kinds of consecutive system misrecognition
errors can be diagnosed and corrected. The previously mentioned "undo"
criterion concerns user-generated mistakes. In this criterion we concern ourselves with recovering from erroneous diagnostic messages generated by the software. These errors need characteristically different
tools than user keystroke mistakes.
- consistency - e.g. window shapes, colors, positioning, wording,
button location, grouping of similar concepts
- minimal need for user recall between one part of a dialog and another
- provision for experienced users as well as beginners - e.g.
shortcut key definition capability, diagnostic options, panic reset buttons
- messages which indicate problems precisely in plain talk and
recommend an implementable (as opposed to perfunctory) solution. If industry or trader terms must be present, they should be displayed with hyperlinks explaining the jargon, to accommodate the new user seeing this message for the first time.
- elegance of design - a small number of objects should do a large
number of tasks (which is not the same as a small number of
objects overloaded with optional, unrelated or temporally dependent functions)
- error preventive design - e.g. each instance in which an "undo"
might serve well is carefully engineered out so that the need
to undo is reduced
- "hint count" of manuals - e.g. hints, tips and sidebars
in an instruction manual or help screen are red flags for either poor manual
writing or poor product design. A product shouldn't need tips
to work better; it should just work better all by itself if
you follow normal instructions. Well-designed instructions and
product functions don't need user cautions, hints, caveats, secondary
explanations and tips. Presence of hints often indicates failure
to attend to the "error preventive design" criterion just above.
- time duration of operations - e.g. "easy" or frequently
needed operations should take less time than "hard" or rarely needed ones
- minimization of advertising, eye candy or other distractions
which compete for user attention or screen real estate during
product tasks, especially product installation
- on-line help which really helps
The third loose test is to examine whether a given
product behavior is or is not an "eyebrow raiser", i.e.
it did or did not cause a tester to jump back slightly in the
chair and wonder, with eyebrows raised, what on earth the designer
could have been thinking in order to make that particular design
choice. Often such a product behavior turns out to be an outright
bug on some level. Sometimes development teams justify this sort of behavior
by pointing out that it simplifies internal product programming. We have zero sympathy
for that perspective. Sometimes it is justified by saying "the
user will get used to that behavior." We have tolerance for
that only if the weird product behavior speeds up learning the
product or actual productivity and no alternative is available.
We then examine these usability weaknesses according to whether
they are frequent or one-time occurrences and whether they
require Herculean user efforts to overcome. That leads to finally
rating each as one of three possibilities: "usability catastrophe",
"annoyance" or merely "cosmetic issue".
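The rating step can be pictured as a small decision over the two questions named above. The text does not spell out how the answers combine, so the mapping in this sketch is an assumption made purely for illustration, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    CATASTROPHE = "usability catastrophe"
    ANNOYANCE = "annoyance"
    COSMETIC = "cosmetic issue"


@dataclass(frozen=True)
class Weakness:
    description: str
    frequent: bool          # recurs often, as opposed to a one-time occurrence
    herculean_effort: bool  # takes Herculean user effort to overcome


def rate(weakness):
    """Rate one weakness.

    ASSUMED mapping (not stated in the text): both factors present is a
    catastrophe, exactly one is an annoyance, neither is cosmetic.
    """
    if weakness.frequent and weakness.herculean_effort:
        return Severity.CATASTROPHE
    if weakness.frequent or weakness.herculean_effort:
        return Severity.ANNOYANCE
    return Severity.COSMETIC


# Hypothetical findings from a review session.
findings = [
    Weakness("every save requires re-entering the file path", True, True),
    Weakness("installer shows one unclear prompt", False, False),
]
for finding in findings:
    print(f"{rate(finding).value}: {finding.description}")
```

A real review would weigh these factors with judgment rather than a fixed rule. The point of the sketch is only that frequency and effort are assessed separately before a rating is assigned.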
If you have applications under development and wish to
have an analysis performed along these lines please contact
Usability Heuristics by Jakob Nielsen