The aim of this lecture is to understand how to perform
evaluation through usability testing.
What is Usability Testing?
While there can be wide variations in where and how you conduct
a usability test,
every usability test shares these five characteristics:
1. The primary goal is to improve the usability of a product.
For each test, you also
have more specific goals and concerns that you articulate when
planning the test.
2. The participants represent real users.
3. The participants do real tasks.
4. You observe and record what participants do and say.
5. You analyze the data, diagnose the real problems, and
recommend changes to fix those problems.
The Goal is to Improve the Usability of a Product
The primary goal of a usability test is to improve the usability
of the product that is
being tested. Another goal, as we will discuss in detail later,
is to improve the process
by which products are designed and developed, so that you avoid
having the same
problems again in other products.
This characteristic distinguishes a usability test from a
research study, in which the
goal is to investigate the existence of some phenomenon.
Although the same facility
might be used for both, they have different purposes. This
characteristic also distinguishes a usability test from a quality assurance or
function test, which has a
goal of assessing whether the product works according to its specifications.
Within the general goal of improving the product, you will have
more specific goals
and concerns that differ from one test to another.
You might be particularly concerned about how easy it is for
users to navigate
through the menus. You could test that concern before coding the
product, by creating
an interactive prototype of the menus, or by giving users paper
versions of each menu.
You might be particularly concerned about whether the interface
that you have
developed for novice users will also be easy for and acceptable
to experienced users.
For one test, you might be concerned about how easily the people
who do installations will be able to install the product. For
another test, you might be
concerned about how easily the client's nontechnical staff will
be able to operate and
maintain the product.
These more specific goals and concerns help determine which
users are appropriate
participants for each test and which tasks are appropriate to
have them do during the test.
The Participants Represent Real Users
The people who come to test the product must be members of the
group of people
who now use or who will use the product. A test that uses
programmers when the
product is intended for legal secretaries is not a usability test.
The quality assurance people who conduct function tests may also find usability
problems, and the problems they find should not be ignored, but
they are not
conducting a usability test. They are not real users, unless it
is a product about
function testing. They are acting more like expert reviewers.
If the participants are more experienced than actual users, you
may miss problems that
will cause the product to fail in the marketplace. If the
participants are less
experienced than actual users, you may be led to make changes
that are not improvements for the real users.
The Participants Do Real Tasks
The tasks that you have users do in the test must be ones that
they will do with the
product on their jobs or in their homes. This means that you
have to understand users'
jobs and the tasks for which this product is relevant.
In many usability tests, particularly of functionally rich and
complex products, you can only test some of the many tasks that users
will be able to do with
the product. In addition to being realistic and relevant for
users, the tasks that you
include in a test should relate to your goals and concerns and
have a high probability
of uncovering a usability problem.
Observe and Record What the Participants Do and Say
In a usability test, you usually have several people come, one
at a time, to work with
the product. You observe the participant, recording both
performance and comments.
You also ask the participant for opinions about the product. A
usability test includes
both times when participants are doing tasks with the product
and times when they are
filling out questionnaires about the product.
Observing and recording individual participants' behaviors
distinguishes a usability
test from focus groups, surveys, and beta testing.
A typical focus group is a discussion among 8 to 10 real users,
led by a professional
moderator. Focus groups provide information about users'
preferences, and their self-report about their performance, but
focus groups do not
usually let you see how users actually behave with the product.
Surveys, by telephone or mail, let you collect information about
attitudes, preferences, and their self-report of behavior, but
you cannot use a survey to
observe and record what users actually do with a product.
A typical beta test (field test, clinical trial, user acceptance
test) is an early release of a
product to a few users. A beta test has ecological validity,
that is, real people are using
the product in real environments to do real tasks. However, beta
testing seldom yields
any useful information about usability. Most companies have
found beta testing to be
too little, too unsystematic, and much too late to be the
primary test of usability.
Analyze the Data, Diagnose the Real Problems, and Recommend
Changes to Fix Those Problems
Collecting the data is necessary, but not sufficient, for a
usability test. After the test
itself, you still need to analyze the data. You consider the
quantitative and qualitative
data from the participants together with your own observations
and users' comments.
You use all of that to diagnose and document the product's
usability problems and to
recommend solutions to those problems.
The Results Are Used to Change the Product - and the Process
We would also add another point. It may not be part of the
definition of the usability
test itself, as the previous five points were, but it is important.
A usability test is not successful if it is used only to mark
off a milestone on the
development schedule. A usability test is successful only if it
helps to improve the
product that was tested and the process by which it was developed.
What Is Not Required for a Usability Test?
Our definition leaves out some features you may have been expecting
to see, such as:
. a laboratory with
. a formal test report
Each of these is useful, but not necessary, for a successful
usability test. For example,
a memorandum of findings and recommendations or a meeting about
the test results,
rather than a formal test report, may be appropriate in your situation.
Each of these features has advantages in usability testing that
we discuss in detail
later, but none is an absolute requirement. Throughout the book,
we discuss methods
that you can use when you have only a shoestring budget, limited
staff, and limited time.
When is a Usability Test Appropriate?
Nothing in our definition of a usability test limits it to a
single, summative test at the
end of a project. The five points in our definition are relevant
no matter where you are
in the design and development process. They apply to both
informal and formal
testing. When testing a prototype, you may have fewer
participants and fewer tasks,
take fewer measures, and have a less formal reporting procedure
than in a later test,
but the critical factors we outline here and the general process
we describe in this
book still apply. Usability testing is appropriate iteratively
from predesign (test a
similar product or earlier version), through early design (test prototypes), and
throughout development (test different aspects, retest changes).
Questions that Remain in Defining Usability Testing
We recognize that our definition of usability testing still has
some fuzzy edges.
. Would a test with
only one participant be called a usability test? Probably not.
You probably need at least two or three people representing a subgroup of
users to feel comfortable that you are not seeing idiosyncratic behavior.
. Would a test in
which there were no quantitative measures qualify as a
usability test? Probably not. To substantiate the problems that
you report, we
assume that you will take at least some basic measures, such as number of
participants who had the problem, or number of wrong choices, or time to
complete a task. The actual measures will depend on your specific concerns
and the stage of design or development at which you are testing. The measures
could come from observations, from recording with a data-logging program,
or from a review of the videotape after the test. The issue is not which
measures or how you collect them, but whether you need to have some
quantitative data to have a usability test.
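As an illustration of the basic measures mentioned above, here is a minimal Python sketch that tallies how many participants had a problem, how many wrong choices were made, and the mean time on a task. The record layout (TaskRecord) and the function name (summarize) are hypothetical, chosen only for this example, and the sample data is invented for illustration; it is not taken from any real test.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskRecord:
    """One participant's attempt at one task (hypothetical record layout)."""
    participant: str
    task: str
    seconds: float        # time to complete the task
    wrong_choices: int    # e.g. wrong menu selections
    had_problem: bool     # observer judged that the participant hit the problem


def summarize(records, task):
    """Return basic quantitative measures for one task."""
    rows = [r for r in records if r.task == task]
    if not rows:
        raise ValueError(f"no records for task {task!r}")
    return {
        "participants": len(rows),
        "participants_with_problem": sum(r.had_problem for r in rows),
        "total_wrong_choices": sum(r.wrong_choices for r in rows),
        "mean_seconds": mean(r.seconds for r in rows),
    }


# Illustrative data only; not results from any real test.
records = [
    TaskRecord("P1", "install", 310.0, 2, True),
    TaskRecord("P2", "install", 250.0, 0, False),
    TaskRecord("P3", "install", 400.0, 3, True),
]
summary = summarize(records, "install")
print(summary)
```

Whether the records come from live observation, a logging program, or a review of the tape does not matter to a tally like this; only the fields you decide to capture do.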
Usability testing is still a relatively new development; its
definition is still emerging.
You may have other questions about what counts as a usability
test. Our discussion of
usability testing and of other usability engineering methods, in
this chapter and the
next three chapters, may help clarify your own thinking about
how to define usability testing.
Testing Applies to All Types of Products
If you read the literature on usability testing, you might think
that it is
only about testing software for personal computers. Not so.
Usability testing works
for all types of products. In the last several years, we've been
involved in usability
testing of all these products:
Bedside terminal
Anesthesiologist's workstation
Patient monitor
Blood gas analyzer
Integrated communication system for wards
Nurse's workstation for intensive care units
Network protocol analyzer (for maintaining computer networks)
Application software for microcomputers, minicomputers,
Electronic mail
Database management software
Spreadsheets
Time management software
Compilers and debuggers for programming languages
Operating systems
Voice response systems (menus on the telephone)
Automobile navigation systems (in-car information about how to
get where you want to go)
The procedures for the test may vary somewhat depending on what
you are testing
and the questions you are asking. We give you hints and tips,
where appropriate, on
special concerns when you are focusing the testing on hardware
but, in general, we don't find that you need to change the
approach much at all.
Most of the examples in this book are about testing some type of
software and the documentation that goes with it. In some cases,
the hardware used to
be just a machine and is now a special-purpose computer.
However, the product doesn't even have to involve any hardware
or software. You can
use the techniques in this book to develop usable
. application or reporting forms
. instructions for noncomputer products, like bicycles
. nonautomated procedures
Testing All Types of Interfaces
Any product that people have to use, whether it is
computer-based or not, has a user
interface. Norman, in his marvelous book, The Design of Everyday Things,
points out problems with doors, showers, light switches, coffee
pots, and many other
objects that we come into contact with in our daily lives. With
creativity, you can plan
a test of any type of interface.
Consider an elevator. The buttons in the elevator are an
interface: the way that you,
the user, talk to the computer that now drives the machine. Have
you ever been
frustrated by the way the buttons in an elevator are arranged?
Do you search for the
one you want? Do you press the wrong one by mistake?
You might ask: How could you test the interface to an elevator
in a usability
laboratory? How could the developers find the problems with an
elevator interface before building the elevator, at which point it would be too
expensive to change?
In fact, an elevator interface could be tested before it is
built. You could create a
simulation of the proposed control panel on a touchscreen
computer (a prototype).
You could even program the computer to make the alarm sound and
to make the
doors seem to open and close, based on which buttons users
touch. Then you could
bring in users one at a time, give them realistic situations,
and have them use the
touchscreen as they would the panel in the elevator.
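To make the idea of a simulated panel concrete, here is a minimal Python sketch of the logic such a prototype might contain. Everything in it is an assumption made for illustration: the class name ElevatorPanelPrototype, the button names, and the feedback strings are hypothetical, and a real prototype would draw the panel on a touchscreen rather than print text. What the sketch shows is the essential behavior: each touch produces simulated feedback (doors, alarm, movement) and is logged for later review.

```python
class ElevatorPanelPrototype:
    """Simulated control panel; records every button a participant touches."""

    def __init__(self, floors):
        self.floors = list(floors)
        self.current_floor = self.floors[0]
        self.doors_open = True
        self.alarm_on = False
        self.log = []  # (button, feedback) pairs for later review

    def press(self, button):
        """Apply one button press and return the feedback shown to the user."""
        if button == "open":
            self.doors_open = True
            feedback = "doors open"
        elif button == "close":
            self.doors_open = False
            feedback = "doors close"
        elif button == "alarm":
            self.alarm_on = True
            feedback = "alarm sounds"
        elif button in self.floors:
            self.doors_open = False
            self.current_floor = button
            feedback = f"moves to floor {button}"
        else:
            # Unknown button: the prototype does nothing, but still logs it.
            feedback = "no response"
        self.log.append((button, feedback))
        return feedback


# One hypothetical participant session.
panel = ElevatorPanelPrototype(floors=["1", "2", "3"])
for button in ["close", "3", "open", "alarm"]:
    print(button, "->", panel.press(button))
```

The log of presses is the part that matters for the test: it lets you see afterward which buttons a participant touched by mistake while searching for the one they wanted.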
Testing All Parts of the Product
Depending on where in the development process you are and what
you are particularly concerned about, you may want to focus the
usability test on a specific
part of the product, such as
. installing hardware
. operating hardware
. cleaning and maintaining hardware
. understanding messages about the hardware
. installing software
. navigating through menus
. filling out fields
. recovering from errors
. learning from online or printed tutorials
. finding and following instructions in a user's guide
. finding instructions in the online help
Testing Different Aspects of the Documentation
When you include documentation in the test, you have to decide
if you are more
interested in whether users go to the documentation or in how
well the documentation
works for them when they do go to it. It is difficult to get
answers to both of those
concerns at the same time.
If you want to find out how much people learn from a tutorial
when they use it, you
can set up a test in which you ask people to go through the
tutorial. Your test
participants will do as you ask, and you will get useful
information about the design,
content, organization, and language of the tutorial.
You will, however, not have any indication of whether anyone
will actually open the
tutorial when they get the product. To test that, you have to
set up your test differently.
Instead of instructing people to use the tutorial, you have to
give them tasks and let
them know the tutorial is available. In this second type of
test, you will find out which
types of users are likely to try the tutorial, but if few
participants use it, you won't get
much useful information for revising the tutorial.
Giving people instructions that encourage them to use the manual
or tutorial may be
unrealistic in terms of what happens in the world outside the
test laboratory, but it is
necessary if your concern is the usability of the documentation.
At some point in the
process of developing the product, you should be testing the
usability of the various
types of documentation that users will get with the product.
At other points, however, you should be testing the usability of
the product in the
situation in which most people will receive it. Here's an example.
A major company was planning to put a new software product on
its internal network.
The product has online help and a printed manual, but, in
reality, few users will get a
copy of the manual.
The company planned to maintain a help desk, and a major concern
for the usability
test was that if people don't get the manual, they would have to
use the online help,
call the help desk, or ask a co-worker. The company wanted to
keep calls to the help
desk to a minimum, and the testers knew that when one worker
asks another for help,
two people are being unproductive for the company.
When they tested the product, therefore, this test team did not
include the manual.
Participants were told that the product includes online help,
and they were given the
phone number of the help desk to call if they were really stuck.
The test team focused
on where people got stuck, how helpful the online help was, and
at what points people
called the help desk.
This test gave the product team a lot of information to improve
the interface and the
online help to satisfy the concern that drove the test. However,
this test yielded no
information to improve the printed manual. That would require a separate test.
Testing with Different Techniques
In most usability tests, you have one participant at a time
working with the product.
You usually leave that person alone and observe from a corner of
the room or from
behind a one-way mirror. You intervene only when the person
"calls the help desk,"
which you record as a need for assistance.
You do it this way because you want to simulate what will happen when
users get the product in their offices or homes. They'll be
working on their own, and
you won't be right there in their rooms to help them.
Sometimes, however, you may want to change these techniques. Two
ideas that many
teams have found useful are:
. co-discovery, having two participants work together
. active intervention, taking a more active role in the test
Co-discovery is a technique in which you have two participants
work together to
perform the tasks (Kennedy, 1989). You encourage the
participants to talk to each
other as they work.
Talking to another person is more natural than thinking out loud
alone. Thus, co-discovery
tests often yield more information about what the users are thinking and
what strategies they are using to solve their problems than you
get by asking
individual participants to think out loud.
Hackman and Biers (1992) have investigated this technique. They
confirmed that co-discovery
participants make useful comments that provide insight into the design.
They also found that having two people work together does not
distort other results.
Participants who worked together did not differ in their
performance or preferences
from participants who worked alone.
Co-discovery is more expensive than single participant testing,
because you have to
pay two people for each session. In addition, it may be more
difficult to watch two
people working with each other and the product than to watch
just one person at a
time. Co-discovery may be used anytime you conduct a usability
test, but it is
especially useful early in design because of the insights that
the participants provide
as they talk with each other.
Active intervention is a technique in which a member of the test
team sits in the room
with the participant and actively probes the participant's
understanding of whatever is
being tested. For example, you might ask participants to explain
what they would do
next and why as they work through a task. When they choose a menu
option, you might ask them to describe their understanding of
the menu structure at
that moment. By asking probing questions throughout the test,
rather than in one
interview at the end, you can get insights into participants'
evolving mental model of the product.
You can get a better understanding of problems that participants
are having than by
just watching them and hoping they'll think out loud.
Active intervention is particularly useful early in design. It is an
excellent technique to use with prototypes, because it provides
a wealth of diagnostic
information. It is not the technique to use, however, if your
primary concern is to
measure time to complete tasks or to find out how often users
will call the help desk.
To do a useful active intervention test, you have to define your
goals and concerns, plan the questions you will use as probes,
and be careful not to
bias participants by asking leading questions.
Additional Benefits of Usability Testing
Usability testing contributes to all the benefits of focusing on
usability that we gave in
Chapter 1. In addition, the process of usability testing has two
specific benefits that
may not be as strong or obvious from other usability techniques. Usability testing helps to:
. change people's attitudes about users
. change the design and development process
Changing People's Attitudes About Users
Watching users is both inspiring and humbling. Even after
watching hundreds of
people participate in usability tests, we are still amazed at
the insights they give us
about the assumptions we make.
When designers, developers, writers, and managers attend a
usability test or watch
videotapes from a usability test for the first time, there is
often a dramatic
transformation in the way that they view users and usability
issues. Watching just a
few people struggle with a product has a much greater impact on
attitudes than many
hours of discussion about the importance of usability.
After an initial refusal to believe that the users in the test
really do represent the
people for whom the product is meant, many observers become
instant converts to
usability. They become interested not only in changing this
product, but in improving
all future products, and in bringing this and other products
back for more testing.
Changing the Design and Development Process
In addition to helping to improve a specific product, usability
testing can help
improve the process that an organization uses to design and
develop products (Dumas,
1989). The specific instances that you see in a usability test
are most often symptoms
of broader and deeper global problems with both the product and the process.
Comparing Usability Testing to Beta Testing
Despite the surge in interest in usability testing, many
companies still do not think
about usability until the product is almost ready to be
released. Their usability approach is to give some customers an early (almost
ready) version of the product and wait for feedback. Depending
on the industry and
situation, these early-release
trials may be called beta testing, field testing,
clinical trials, or user acceptance testing.
In beta testing, real users do real tasks in their real
environments. However, many
companies find that they get very little feedback from beta
testers, and beta testing
seldom yields useful information about usability problems for several reasons:
. The beta test site does not even have to use the product.
. The feedback is unsystematic. Users may report, after the
fact, what they remember
and choose to report. They may get so busy that they forget to
report even when
things go wrong.
. In most cases, no one observes the beta test users and records their behavior.
Because users are focused on doing their work, not on testing
the product, they may
not be able to recall the actions they took that resulted in the
problems. In a usability
test, you get to see the actions, hear the users talk as they do
the actions, and record
the actions on videotape so that you can go back later and
review them, if you aren't
sure what the user did.
. In a beta test, you do not choose the tasks. The tasks that
get tested are whatever
users happen to do in the time they are working with the
product. A situation that you
are concerned about may not arise. Even if it does arise, you
may not hear about it. In
a usability test, you choose the tasks that participants do with
the product. That way,
you can be sure that you get information about aspects of the
product that relate to
your goals and concerns. That way, you also get comparable data.
If beta testers do try the product and have major problems that
keep them from
completing their work, they may report those problems. The
unwanted by-product of
that situation, however, may be embarrassment at having released
a product with
major problems, even to beta testers.
Even though beta testers know that they are working with an
unfinished and possibly
buggy product, they may be using it to do real work where
problems may have serious
consequences. They want to do their work easily and effectively.
Your company's reputation and sales may suffer if beta testers find the product
frustrating to use. A
bad experience when beta testing your product may make the beta
testers less willing
to buy the product and less willing to consider other products
from your company.
You can improve the chances of getting useful information from
beta test sites. Some
companies include observations and interviews with beta testing,
going out to visit
beta test sites after people have been working with the product
for a while. Another
idea would be to give tape recorders to selected people at beta
test sites and ask them
to talk on tape while they use the product or to record
observations and problems as they occur.
Even these techniques, however, won't overcome the most
significant disadvantage of
beta testing: it comes too late in the process. Beta testing
typically takes place
only very close to the end of development, with a fully coded product.
Functional bugs may get fixed after beta testing, but time and
money constraints generally mean
that usability problems can't be addressed.
Usability testing, unlike beta testing, can be done throughout
the design and
development process. You can observe and record users as they
work with prototypes
and partially developed products. People are more tolerant of
the fact that the product
is still under development when they come to a usability test
than when they beta test
it. If you follow the usability engineering approach, you can do
usability testing early
enough to change the product and retest the changes.