By Laura Palmer | Member
Carol Barnum | Fellow
When we conduct usability tests, we want to know how users feel about their experience. But if we ask them directly, we might not get an accurate reflection of how they really feel. Their tendency to respond to questions more positively than our observations in the testing session would suggest is called the “acquiescence bias,” and it is a common phenomenon in usability testing.
So, how can we get at their real feelings about their experience? We have found that our use of Microsoft’s product reaction cards gives users control of their story and lets them tell it as they experienced it.
The Origin of the Product Reaction Cards
The product reaction cards are the result of a brainstorming effort by two usability engineers at Microsoft Corporation, Joey Benedek and Trish Miner. Benedek and Miner were seeking ways to get at the elusive quality of desirability in product design, having experienced the limitations of surveys and interviews in their usability work. They convened a brainstorming session with eight usability engineers to come up with the best strategies to try. Two tools emerged from that session, forming what they called the original desirability toolkit:
- A faces questionnaire, in which participants were asked to look at a photograph of a face and rate how closely the facial expression matched their experience with performing a task.
- Product reaction cards, a collection of word cards with 60% positive and 40% negative or neutral words, from which participants chose the words that reflected their feelings about the experience. The words in the deck were drawn from prior usability and market research. The 60/40 ratio of positive to negative words was set to counter the acquiescence bias, recognizing that users tend to be more positive in their responses than observation might support.
The faces questionnaire confused some participants and didn’t produce consistent results, so it was abandoned. The product reaction cards, however, played well, so they were refined into a final deck of 118 words. Table 1 lists the 118 entries in the product reaction card deck.
Two big ideas drove the development of the cards: simplicity of delivery and ease of analysis, and the cards met the challenge on both counts. Another major benefit was setting users free from the constraints of surveys and questionnaires, so that they could choose the cards they felt matched their experience and use them to tell the story of that experience. The cards proved perfect for this hoped-for response, and the most surprising outcome was that participants volunteered a great deal of information that had not surfaced earlier in the session.
At the time that Benedek and Miner presented these findings in 2002 at a Usability Professionals’ Association (UPA) conference, they said that the results from using this technique could not be generalized.
Given our extensive use of the cards across numerous platforms and products and over a number of years, we believe that the results can be generalized. We want to share why and how we can make this statement.
Going Beyond the Usual Techniques
Post-task and post-test questionnaires are an integral part of usability testing, and we use them in almost every study. Most commonly, we use 5-point Likert-scale questions to see how our participants rate their experience with a task. For example, we’ll ask them to rate the task they just completed on a scale from very hard to very easy. Another scale-based question we use relates to satisfaction: participants rate their experience with the task from very satisfied to very dissatisfied. These scale-based questions give us basic results that can be quantified and illustrated in graphs, an excellent way to show trends across the participant pool.
However, these data collectors share a fundamental limitation: post-task and post-test questionnaires are closed-ended, asking participants only to agree or disagree with a statement or to rate a response to a particular statement or question.
Semi-structured interviews and open-ended questions are another way to gather feedback from a usability study’s participants. However, when it comes to feedback on design, Michael Hawley makes the case for why a simple question can be so difficult to analyze. In his work testing Web page designs, Hawley found that when participants were asked to explain their design preference, they often cited superficial reasons, such as “It’s my favorite color.” In other cases, they couldn’t articulate a preference at all. Without a framework or scaffold through which to express themselves, many participants were lost. None of the standard ways of getting at emotional response worked until Hawley tried the product reaction cards.
Accessible | Compelling | Disconnected | Exciting | Impressive | Not Valuable | Reliable | Too Technical |
Advanced | Complex | Disruptive | Expected | Incomprehensible | Novel | Responsive | Trustworthy |
Annoying | Comprehensive | Distracting | Familiar | Inconsistent | Old | Rigid | Unapproachable |
Appealing | Confident | Dull | Fast | Ineffective | Optimistic | Satisfying | Unattractive |
Approachable | Confusing | Easy to Use | Flexible | Innovative | Ordinary | Secure | Uncontrollable |
Attractive | Connected | Effective | Fragile | Inspiring | Organized | Simplistic | Unconventional |
Boring | Consistent | Efficient | Fresh | Integrated | Overbearing | Slow | Understandable |
Business-like | Controllable | Effortless | Friendly | Intimidating | Overwhelming | Sophisticated | Undesirable |
Busy | Convenient | Empowering | Frustrating | Intuitive | Patronizing | Stable | Unpredictable |
Calm | Creative | Energetic | Fun | Inviting | Personal | Sterile | Unrefined |
Clean | Customizable | Engaging | Gets in the Way | Irrelevant | Poor Quality | Stimulating | Usable |
Clear | Cutting Edge | Entertaining | Hard to Use | Low Maintenance | Powerful | Straightforward | Useful |
Collaborative | Dated | Enthusiastic | Helpful | Meaningful | Predictable | Stressful | Valuable |
Comfortable | Desirable | Essential | High Quality | Motivating | Professional | Time-Consuming | |
Compatible | Difficult | Exceptional | Impersonal | Not Secure | Relevant | Time-Saving | |
Table 1. The 118 entries in the product reaction card deck
Pick a Card, Any Card
Here’s how the product reaction cards work. In Microsoft’s approach, participants picked any number of cards they wanted, then narrowed their selection to a top five and presented them. Hawley and others have reported putting the words into an online survey or spreadsheet instead.
We have found that the cards work best in the following way for our lab studies, with a variation for remote studies we’ll explain later.
We put all our cards on the table, literally.
In most of our usability studies, we use the cards as a post-test measurement. We’ve also used them post-task in comparative evaluation studies so that we can compare the card choices for each product or version in the study. Coming at the end of the session, the cards give the participant a chance to reflect on the whole experience and give us a holistic assessment of it.
At the end of a participant’s test session, we direct the individual to the table with the cards, which are spread randomly across it. The participant is asked to pick three, four, or five cards (or more, if they want) to describe their experience. In just a few minutes we’ve got their choices, and they can now use the cards to tell their story. As their story unfolds, we learn not only the meaning of the words on the cards, but also what motivated each choice and what matters most to the participant.
In remote studies, we show the list of words in a Word table on the screen at the end of the study, give the remote participant mouse control, and ask them to highlight the words they select as they explain each one.
We log the cards the participants choose and what they say. We use this explanation as part of our report and video highlights tape.
For each participant, we shuffle the deck and repeat the process. On completion of the study, we compile each participant’s card choices into one large data set and begin our analysis.
Where the Magic Happens
There’s real magic in the cards. The results always tell a powerful story about how desirable our participants found the process or product undergoing testing. However, the way in which the cards reflect the user’s story continues to amaze us.
Our first level of analysis is very simple: we tally the number of positive and negative cards selected in the study. This breakdown is our first indicator of participants’ general feelings about their experience in the usability test. For example, in a 10-person study, let’s say 57 total cards were selected. If 10 (18%) were positive and 47 (82%) were negative, we’d make an early assertion that this product isn’t a contender for the congeniality award.
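If you want to automate this first pass, the tally takes only a few lines of code. Here is a minimal sketch in Python, with illustrative data mirroring the 10-person example above (a real script would list every positive word from Table 1 in POSITIVE):

```python
# Minimal sketch of the first-level tally. The card data here is
# illustrative only, chosen to mirror the 10/47 split described above.
POSITIVE = {"Intuitive", "Fast", "Efficient", "Time-Saving"}  # ...plus the rest of Table 1

# All cards chosen across a hypothetical 10-person study (57 total).
selections = (["Intuitive"] * 6 + ["Fast"] * 4 +
              ["Frustrating"] * 20 + ["Confusing"] * 15 + ["Slow"] * 12)

positive = sum(1 for card in selections if card in POSITIVE)
total = len(selections)
print(f"{positive}/{total} positive ({positive / total:.0%}), "
      f"{total - positive}/{total} negative ({(total - positive) / total:.0%})")
# -> 10/57 positive (18%), 47/57 negative (82%)
```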
Our second level of analysis—a detailed look at the individual selections—takes the cards from magic to near mystical in their powers.
In study after study, we see the phenomenon of repeated card selections. What do we mean? This time, let’s consider a study with eight participants. By repeated card selections, we mean a particular card—we’ll use Time-Saving—is independently selected by multiple participants. As an example, it’s not uncommon for us to see a card like Time-Saving selected six times. We might also see the card Fast selected five times and Efficient selected four times. Now, take a moment to consider what this means.


We are seeing desirability, without a doubt. Participants are telling us the product was Time-Saving, Fast, and Efficient—these are cards that reflect a positive emotional response about the product. We’re also seeing a theme in this grouping around the notion of speed. Participants are telling us that they could perform the tasks quickly. Finally, we’re gaining an understanding of something even more intangible than desire: that “something” is shared experience. Multiple participants in the study are coming to the same conclusion. Right there, our findings get a real boost from participant responses.
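Surfacing these repeats from the pooled data set is just as easy to automate. A minimal sketch, again with illustrative data matching the eight-person example above:

```python
from collections import Counter

# Card choices pooled across all eight participants (illustrative data).
all_cards = (["Time-Saving"] * 6 + ["Fast"] * 5 + ["Efficient"] * 4 +
             ["Flexible", "Dated", "Busy"])

# Cards chosen independently by more than one participant signal shared
# experience; list them from most to least frequently repeated.
for card, count in Counter(all_cards).most_common():
    if count > 1:
        print(f"{card}: chosen by {count} participants")
# -> Time-Saving: chosen by 6 participants, and so on
```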
Here’s an added benefit from the cards: they give us a way to triangulate our findings with other feedback mechanisms, such as Likert-scale surveys, used in a study. If most of our participants rated the tasks as very easy to complete in a survey, and we see a significant number of positive card choices related to speed and ease of use, we know we’re getting consistent results.
Showing the Results
One of the most engaging and creative enterprises with the cards is creating a graphical display of the results. Over the last several years, we’ve tried visuals ranging from a basic table or bar graph to pictograms and word clouds. While there are no hard and fast rules about how to display the results of the cards, our best piece of advice is the following: Choose a display that has the most meaningful impact for your targeted audience.
Every study is different. In some cases, we may have high numbers of repeated cards but no thematic connection among them. In others, repeat counts may be low but the cards form strong themes. Still, in most cases we can predict, and are constantly amazed by, the overlap in both the cards selected and the clusters of similar themes revealed in analysis. This result is as often present in studies with only five or six participants as it is in larger studies. The card choices will differ from study to study, but the consistency of the choices within each study is the amazing outcome.
Here’s an example of the results from one of our recent studies, which centered on an interactive voice response system. Fifteen participants were asked to listen to various telephone system prompts and choose the number on the telephone keypad that most closely matched their goal for each task.
In this study, the 15 participants selected 84 cards in total. Of those 84, 59 (70%) were positive selections. Once again, we saw high numbers of repeated cards in the study:
- Efficient = 6
- Easy to Use = 5
- Effective = 4
- Time-Saving = 4
- Business-like = 3
- Convenient = 3
- Straightforward = 3
Participants were happy with this telephone-based customer-service system and had a strong shared experience. In terms of significant themes, Ease of Use and Expediency were predominant.
Negative choices accounted for the remaining 25 cards (30%). There were far fewer repeated cards among them, with Time-Consuming chosen four times and Inconsistent three times.
Because the results were to be presented in a meeting and via a slide presentation to management, we felt the word cloud would have the best impact for the intended audience (see Figures 1 and 2).
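For readers who want to build a similar display, a word cloud can be generated from the card counts in a few lines. Here is a minimal sketch using the open-source Python wordcloud package (our assumption for illustration; not necessarily the tool behind Figures 1 and 2):

```python
from wordcloud import WordCloud  # pip install wordcloud

# Positive repeated-card counts from the study above; each word is
# sized in proportion to how many participants selected it.
counts = {"Efficient": 6, "Easy to Use": 5, "Effective": 4, "Time-Saving": 4,
          "Business-like": 3, "Convenient": 3, "Straightforward": 3}

cloud = WordCloud(width=800, height=400, background_color="white")
cloud.generate_from_frequencies(counts)
cloud.to_file("positive_cards.png")  # ready to drop into a slide deck
```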
For examples of how we have presented the results from other studies, see our work in the Suggested Readings section.
Conclusion
We’ve found the cards unlock information regarding the user’s sense of satisfaction in a more user-centered way than any other tool or technique we have tried. In our experience, the reason for the success of product reaction cards is simple: this tool provides a way for users to tell the story of their experience, choosing the words that have meaning to them as triggers to express their feelings—negative or positive—about their experience.
We hope you’ll try Microsoft’s Product Reaction Cards in your work—the results are truly magical. And the good thing is that Microsoft has given permission for free use of the cards with the following attribution: “Developed by and ©Microsoft. All rights reserved.”
Suggested Readings
Barnum, Carol M. Usability Testing Essentials: Ready, Set …Test! Boston: Morgan Kaufmann, 2011.
Barnum, Carol M., and Laura A. Palmer. “More Than a Feeling: Understanding the Desirability Factor in User Experience.” Proceedings of the 28th International Conference Extended Abstracts on Human Factors in Computing Systems, Atlanta, GA, 10–15 April 2010.
Benedek, Joey, and Trish Miner. “Measuring Desirability: New Methods for Evaluating Desirability in a Usability Lab Setting.” Proceedings of the UPA 2002 Conference, Orlando, FL, 8–12 July 2002.
Hawley, Michael. “Rapid Desirability Testing: A Case Study.” UX Matters. www.uxmatters.com/mt/archives/2010/02/rapid-desirability-testing-a-case-study.php.
Martin, Bella, and Bruce Hanington. Universal Methods of Design. Beverly, MA: Rockport Publishers, 2012.
Laura Palmer is an assistant professor in the Information Design & Communication program at Southern Polytechnic State University. She is also the senior associate at the Usability Center at Southern Polytechnic.
Carol Barnum is director of graduate programs in information design and director and cofounder of the Usability Center at Southern Polytechnic State University. Barnum is an STC Fellow and recipient of STC’s Gould Award for Excellence in Teaching Technical Communication and the Rainey Award for Excellence in Research. Her latest book is Usability Testing Essentials: Ready, Set …Test!