58.3, August 2011

Using Sorting Data to Evaluate Text Structure: An Evidence-based Proposal for Restructuring Patient Information Leaflets

Henk Pander Maat and Leo Lentz

Abstract

Purpose: This paper assesses the text structure imposed on patient information leaflets in the European Union (EU). It proposes an alternative structure based on reader-oriented research.

Method: Two card-sorting studies were used to identify reader expectations. In a closed card-sorting study, participants were provided with scenario questions on medication use and were asked under which of the template headings they expected to find information on each question. In an open sorting task, the schemata of patient information leaflet readers were explored. In this study, participants sorted a large set of sentences that can be found in actual patient information leaflets.

Results: The closed card-sorting study reveals that users provided with the EU template structure do not always look at the correct section when searching for information about patient situations. The results of the open card-sorting study indicate that readers prefer the following structure: goal of the medicine – directions for use – potential problems – packaging and storage.

Conclusion: Card-sorting data help to evaluate and design text structures for genres such as patient information leaflets. The European template does not match users’ expectations concerning the leaflet’s structure. There is a mismatch between the wording of headings and reader interpretations. A second mismatch has to do with classifying and grouping information. Patient information presented in the alternative format may be expected to improve reading performance.

Keywords: patient information, reader expectations, structure, evaluation, card sorting

Practitioner’s Takeaway

  • Regulating authorities should be careful when imposing obligatory text structures or headings, since these may be at odds with reader expectations and thus decrease the findability of information.
  • Sorting studies, both closed and open ones, are most useful for investigating reader expectations concerning text structures.

Introduction

Text genres come with corresponding genre schemata or move structures, specifying what will be discussed and in what order. Genre conventions serve readers by providing a collectively shared shorthand for interpreting information (Kostelnick & Hasset, 2003). They help readers to scan a page and identify relevant information on the basis of structural expectations. For instance, an experienced reader of scientific articles in the experimental tradition is thoroughly acquainted with their structure (Swales, 1990), and the same goes for book reviews (Toledo, 2005) and application letters (e.g., Henry & Roseberry, 2001; Upton & Connor, 2001), to mention just a few well-established text genres. For most genres, these conventions have evolved in an interactive process within a certain discourse community of writers and readers. But some schemata are being “enforced” upon writers (and readers) by gatekeeping institutions. For instance, many scientific journals instruct authors to follow a certain structure, not only in their articles as a whole but even in particular sections such as the abstract. Imposing genre conventions may benefit readers, but it can also hinder writers because it does not allow them to further develop and improve their texts in order to better meet changing reader expectations. As Kostelnick and Hasset (2003) state, in such instances the social contract imprisons users, rather than fostering cooperation between designers and readers.

A particularly strong case of genre schema enforcement is provided by the European regulatory efforts concerning patient information leaflets (PILs) handed out with medicines. Not only the contents but also the structure of the PIL is constrained to such a degree that Askehave and Zethsen (2003) speak of a mandatory genre. Regarding their structure, these leaflets should comply with a so-called template published by the Quality Review of Documents (QRD) group of the European Medicines Agency (EMEA). This template is currently available in 25 European languages; the English version is annotated with instructions (for leaflet writers) about what information has to be placed under the various headings. The template can be downloaded from the EMEA Web site (EMEA, 2006).

The crucial question is, to what extent does the template benefit readers of patient information and to what extent does it hinder them? In an earlier study (Pander Maat & Lentz, 2009), we concluded that readers experienced serious problems in finding information in three patient information leaflets, of which two were designed according to the template. This study did not reveal a template benefit for readers, because the two documents that did comply with the template did not demonstrate better findability of information than the document that was produced before the template was mandated.

This raises the question of what expectations about the structure of the document readers will activate when confronted with patient information leaflets, and how these expectations compare to the text structure actually encountered. More generally, this study addresses the question of how to use empirical data in evaluating and redesigning text structures. This issue cannot be dealt with by performing a simple experiment with two or three design options, since the number of possible structures is far too large for that. What is needed here is a methodology to select promising text structures that may be subsequently tested experimentally.

We suggest that sorting tasks may provide this kind of information. Hence this paper sets out to investigate PIL structure expectations by means of sorting tasks. The first study, which can be characterized as a closed sorting task, provides a direct test of the present template. In this study, participants were provided with scenario questions on medication use and were asked under which of the template headings they expected to find information on each question. This first study provides information on items that do not fit naturally in the current headings, and on heading formulations that lead to unintended interpretations. However, it does not address the question of how users themselves might structure patient information. This was the purpose of the second study, in which we used an open sorting task to explore the schemata of PIL readers. In this study, participants sorted a large set of sentences that can be found in actual patient information leaflets. We were specifically interested in differences between the sorting results and the present European template.

The Structure of Patient Information Leaflets

European Template

In 1998 the European Community issued a directive that requires pharmaceutical companies to base their leaflets on a template. This so-called QRD template regulates four aspects of package leaflets: (1) the content elements that must be present; (2) the order in which these topics should be discussed; (3) the headings to be used for paragraphs and subparagraphs; and (4) the wording of a number of specific passages. In its present state, the structure of the document and the headings are as follows:

  1. What is X and what is it used for?
  2. Before you take X
    • Do not take X
    • Take special care with X
    • Taking other medicines
    • Taking X with food and drink
    • Pregnancy and breast-feeding
    • Driving and using machines
    • Important information about some of the ingredients of X
  3. How to take X
    • If you take more X than you should
    • If you forget to take X
    • If you stop taking X
  4. Possible side effects
  5. How to store X
  6. Further information
    • What X contains
    • What X looks like and contents of the pack
    • Marketing authorization holder and manufacturer
    • This leaflet was last approved in (date)

Clearly, a template like this takes many decisions out of the hands of medical writers. But will it also help readers?

Earlier Studies of Leaflet Comprehension

A fixed document structure might be a good thing, if it helps readers to scan the document and find relevant information. It might also help readers to “learn” the genre, by building a mental representation of its structure. Morrow, Leirer, Andrassy, and Tanke (1996) and Morrow, Carver, Leirer, and Tanke (2000) have repeatedly shown that medication instructions that follow the users’ medication schema, as they call it, may help users recall instructions. In Morrow et al. (1996), this medication schema was first constructed by having participants of different ages sort 10 short instructions regularly appearing in medication leaflets. For three fictional medicines, sentences with information on these topics were printed on 10 cards, each preceded by a topic label. Participants generally sorted these cards into two categories: “the medication and how to take it” and “potential problems associated with taking the medication and what to do if they occur.” The participants were also asked to provide their preferred order of appearance in an actual medication instruction. Generally, a “medication” section (name – purpose) precedes a “how to take it” section (dose – schedule – duration), followed by a “problems” group (warnings – mild side effects – severe side effects – emergency).

In an experiment, three versions of the instructions were designed. In the compatible instructions, the items were presented in the preferred grouping and in the preferred order of appearance. In the category version, the grouping was preserved but not the order. In the scrambled version, all items were in nonpreferred positions. The scrambled version yielded significantly poorer recall scores than either the compatible or the category version. Interestingly, even when presented with a scrambled version, participants tended to recall items in the preferred order.

Although these medication instructions are extremely short compared to the European PILs, these results are certainly relevant for the European PIL template. They suggest that readers actually have structural expectations for medication instructions and that following these expectations may improve the usability of these instructions.

Does the current template reflect the readers’ expectations? To our knowledge, its design is not based on such research. An earlier usability study on three Dutch PILs (Pander Maat & Lentz, 2009) revealed that finding information is the main problem of PIL users. The structure imposed by the template even seems to cause a number of findability problems:

  • Participants have problems finding information about ingredients that could produce allergic reactions. According to the template, the ingredients must be presented in the final section, under the obligatory heading “Further Information.” However, participants expected to find this information in the earlier paragraph headed “Before you use X.” Clearly, the heading “Further Information” does not help users to locate information.
  • Participants could not always find information on how to administer the medicine. This information has no separate heading in the template. Neither are there headings for information concerning the dose, the time to take, and how long to take the medicine. Hence, the directions for use often take the form of a long stretch of instructions without any visible structure.

These results suggest that the European template does not completely fit the expectations of participants searching for information. However, they had not been confronted with the template as such, but with real-life patient information leaflets in which much more information was presented than the headings of the template. Hence, we designed a study with the specific purpose of testing the template itself.

Closed Card-Sorting Study

Materials, Participants, and Procedure

Card sorting is a well-known method in psychological research into knowledge organization (Rugg & McGeorge, 2005) and text comprehension (McNamara, Kintsch, Songer, & Kintsch, 1996); it is also used for investigating Web site designs (Spencer & Warfel, 2004; Stalker-Firth, 2007). Apart from studies by Morrow, Leirer, Altieri, and Tanke (1991) and Morrow et al. (1996, 2000), we know of no earlier application of the method to the study of written text organization. In closed card-sorting exercises, participants receive a set of cards as well as the names of the categories into which to sort them. This method can be used to test to what extent a predefined classification is usable for sorting a set of information items.

Our participants were invited to locate the answer to scenario questions under the current template headings. In order to test the bare template, the participants were confronted with a manipulated leaflet that had the look and feel of a real patient information leaflet, in which only the headings were readable. The document was a version of an existing leaflet for Rosuvastatine (brand name Crestor), in which the entire body text—except the headings—had been changed into bogus Latin while retaining the layout and typography (the way editors sometimes do for dummy documents). The headings were preserved, as these were the items to be tested. We could have used a more traditional card-sorting method by just presenting cards (with medication questions) and headings as sorting categories, but in our view ecological validity would be higher if the materials had the look and feel of real patient information leaflets. Thus, the headings were not presented as boxes into which cards could be placed, but as structuring devices of a real document. Participants were asked to point at the heading they thought was relevant for each medication question.

A content analysis of Dutch PILs yielded a list of 34 topics regularly present in leaflets. In principle all these topics could be assigned a scenario question, but seven of them were disregarded for the closed sorting study because they seemed less important in terms of medicine usage (e.g., manufacturer, name of registration), because the information was too general to be searched for (e.g., the general introduction of the side effects section stating that all medicines have side effects and the advice to consult your doctor in case of unknown side effects), or because they were difficult to relate to a single heading (e.g., a change in your physical condition). Hence the questionnaire contained 27 questions. They can be found in Appendix 1 to this paper on the Technical Communication Web site. Additionally, participants filled out a short questionnaire on demographic variables and medication use.

Participants were recruited in the networks of our students. There were 46 participants, 21 women and 25 men. Their mean age was 37.5 (SD = 17.1). Only 6% of them had received more than three medicine prescriptions in the past year. Thirty-nine percent of them reported always reading the patient information leaflet. PIL reading experience correlated significantly with the participants’ total success score (.30, p < .05) and with their total localization time (-.35, p < .01).

First, the participants were introduced to the dummy leaflet with Latin text and Dutch headings. They were allowed to scan the document for one minute. Then the interviewer introduced the task as follows: “I will ask you some questions you may have when coming home after receiving this medicine. Could you tell me where you would expect to find the answer to each question?” The interviewer read the question and presented a card with this question to the participant. After the participant had mentioned the heading under which the answer was expected to be found, the next question was asked. The mean test duration was 21 minutes.

Results

The mean proportion of correctly located questions was .77, which comes close to the results of Pander Maat and Lentz (2009), who reported a localization success score of 75%. They concluded that readers had problems finding information about ingredients and directions for use. The two questions on ingredients in the present study were located successfully by 65% to 70% of the participants. For instance, a scenario question about lactose allergy taught us that the heading “Important information about some of the ingredients of X” is not understood as referring to potentially allergenic ingredients (success score 65%). Neither is this kind of information regularly expected under the later heading “Further Information” and its subheading “What X contains,” because the earlier and quite general heading “Before you take X” is thought to be more relevant for the topic of allergenic ingredients.
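The success scores reported here are proportions of participants who pointed at the correct heading. A minimal sketch of that tally, assuming each participant's answers are stored as a mapping from question to chosen heading (the function name, question keys, and sample answers are hypothetical and not taken from the study data):

```python
from collections import Counter


def score_closed_sort(responses, answer_key):
    """Return per-question success scores and counts of incorrectly chosen headings.

    responses: list of dicts, one per participant, mapping question id -> chosen heading.
    answer_key: dict mapping question id -> correct template heading.
    """
    success = {}
    wrong_choices = {}
    for question, correct in answer_key.items():
        chosen = [r[question] for r in responses if question in r]
        hits = sum(1 for heading in chosen if heading == correct)
        success[question] = hits / len(chosen) if chosen else 0.0
        wrong_choices[question] = Counter(h for h in chosen if h != correct)
    return success, wrong_choices


# Hypothetical data: three participants, two scenario questions.
answer_key = {
    "alcohol": "Taking X with food and drink",
    "disposal": "How to store X",
}
responses = [
    {"alcohol": "Before you take X", "disposal": "How to store X"},
    {"alcohol": "Taking X with food and drink", "disposal": "Further information"},
    {"alcohol": "Before you take X", "disposal": "How to store X"},
]

success, wrong_choices = score_closed_sort(responses, answer_key)
mean_success = sum(success.values()) / len(success)
```

Counting the wrong answers per question alongside the success score shows which headings attract misplaced searches, which is the kind of information a table of incorrectly chosen headings summarizes.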

The lowest success scores were found in three questions on user directions, as Table 1 shows.

Table 1. Finding scores for three questions on user directions in the closed sorting tasks.

| Question | Success score | Correct heading | Incorrectly chosen headings |
|---|---|---|---|
| 1. You are going to a party tonight. Can you use alcohol with this medicine? | .35 | Taking X with food and drink | What is X and what is it used for?; Before you take X |
| 2. You take your pills twice a day. How much time must pass between doses? | .41 | How to take X | Taking X with food and drink |
| 3. Can you throw this medicine into the dustbin when the expiry date has passed? | .46 | How to store X | Further information; If you stop taking X |

Reflecting on the incorrectly chosen headings in the last column, most of these are phrased very generally: “what is X?,” “before you take X,” “further information.” It seems that given the absence of specifically relevant headings, readers often turn to all-encompassing headings such as these.

“Taking X with food and drink” is incorrectly assumed to refer to the time between doses; apparently, this category is read as referring to a special kind of “How to take X” information. Possibly, readers search in vain for a heading about when to take the medicine, which is lacking in the present structure. Moreover, “Taking X with food and drink” is not strongly associated with information on alcohol use.

A more specific heading like “How to store X” is not uniformly associated with disposal of the medicine. It can be argued that storing the medicine is the opposite of disposing of it. Hence the two concepts belong together, but one of these cannot function as the heading for both topics. It seems that the heading needs to be more general.

Conclusions

Some of the mismatches found in this study may be due to the phrasing of particular headings. On the other hand, we need to realize that the heading set as a whole is a classification that might not fit the lay classification of leaflet users. Information classification deals with the quite fundamental question of what subtypes of information are to be grouped together and what subtypes are better kept apart. This brings us to the second study.

Open Card-Sorting Study

Materials

In open card-sorting studies, participants are presented with a set of cards and asked to sort these into different groups, according to their own perspective on the subject of the cards.

The general idea of this study was to have participants “recreate” their own information leaflet by providing them with a collection of sentences from actual patient information leaflets. In the literature, the maximum number of cards to be sorted is around 100, but usually a much smaller number is used (around 30). We aimed for a number of cards in between these boundaries.

A content analysis of Dutch PILs yielded a list of 34 topics regularly present in leaflets. We decided to assign one or more cards to every topic, depending on the variability of the statements within a topic. Ten of the topics are typically dealt with in single statements, such as manufacturer and expiry date. We decided to represent these topics by only one card. The remaining 24 topics were represented by two or three cards; we reasoned that this kind of information on topics like indications, instructions, and warnings would help the participants to see some structure and help them start sorting.

This added up to a total of 75 cards, to be found in Appendix 2 at the Technical Communication Web site. The cards contained simple sentences, but not so simple that they could not appear in actual leaflets. For instance, the following sentences were used to represent the topics ingredients and dosage, respectively:

Ingredients:

  • The active substance of this medicine is Risperidon.
  • This medicine contains liquid paraffin, microcrystalline, and monohydrate.
  • Other ingredients of this medicine are maize flour and talc.

Dosage:

  • The starting dose is usually 15 or 30 mg daily.
  • Your doctor decides how much of this medicine you should take.
  • When you are over 65 years of age, you should take only one tablet.

We tried to vary the wording of the cards within topics in order to discourage “shallow” sorting strategies based on surface characteristics. Furthermore, we kept the language and the syntax as simple as possible. No medicine name was used in order to prevent the use of prior knowledge on particular medications.

Procedure and Participants

The substantial number of cards and the fact that they contained sentences, not words or phrases, made it impossible to use digital applications currently available for card sorting, because the screen could not accommodate our stimuli. Hence we used paper cards and an empty table.

The participants were told that they would receive 75 cards with sentences from “medication instructions.” The size of the cards was about 3.5 inches by 2 inches and the typeface was Arial 14 point. The cards contained no more than three lines of text. They were shuffled for each new participant and handed over in a shoebox. The instruction for the first sorting phase was as follows:

Your task is to form groups of cards that belong together. You can make as many or few groups as you want. You can also change the groups during the task, by splitting them or joining them. You can also move a single card from one group to another.

The participants were told not to join the “singleton” cards, the cards not belonging to any group. When the participants were satisfied with their sorting, they were asked to label their groups with names, to be written on yellow stickers. Participants were requested to explain any unclear group names, but this was rarely necessary.

In the third phase, participants ordered their groups according to what they thought “should come first in a medication instruction and what should come later.” Finally, participants filled out a short questionnaire on demographic variables and medication use.

The interviews were done by 13 students of our university, who were previously trained in card-sorting methodology and interviewing as part of an undergraduate course in communication studies. Seventy-eight participants performed the sorting task, 46 women and 32 men. Most participants were recruited in the networks of the interviewers. The mean age was 39 years (SD = 17.9). Thirty-nine percent of the participants had completed higher education, which is slightly more than the corresponding proportion in the Dutch population as a whole (35%). The large majority of the participants (97%) had Dutch as their mother tongue. All of them spoke Dutch fluently.

Of all participants, 26% had received no prescription medication in the past year, 51% had received one to three prescription medications, and 23% more than three. A majority (67%) said they read their patient information leaflets “usually” or “always,” and 21% read it “sometimes”; only 12% said they never read it. The mean sorting task duration was 30 minutes (SD = 21.4).

Analysis of the Sorting Data

Every group made by a participant was entered as a separate case. All cards were treated as two-valued variables (“belongs to group X” or “does not belong to group X”). The participants created a total of 693 groups and laid aside 112 singleton groups (i.e., unsorted cards). When ignoring singletons, the mean group size is 8.2 (SD = 6.4).
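The coding scheme described above can be sketched as follows: each group becomes one row, and each card becomes a 0/1 column. This is only an illustration. The function name, the data layout, and the choice to leave singleton cards out of the rows are assumptions, and the sample sorts are invented.

```python
def encode_groups(sorts, n_cards, skip_singletons=True):
    """Turn each participant's groups into binary rows, one row (case) per group.

    sorts: dict mapping participant id -> list of (group label, list of card numbers).
    Cards are numbered 1..n_cards. Returns a list of (participant, label, row).
    """
    rows = []
    for participant, groups in sorts.items():
        for label, cards in groups:
            # Assumption: a card laid aside on its own is not entered as a group.
            if skip_singletons and len(cards) < 2:
                continue
            members = set(cards)
            row = [1 if card in members else 0 for card in range(1, n_cards + 1)]
            rows.append((participant, label, row))
    return rows


# Invented example with 6 cards and 2 participants.
sorts = {
    "P1": [("Use", [1, 2, 3]), ("Problems", [4, 5]), ("", [6])],
    "P2": [("Taking", [1, 2]), ("Other", [3, 4, 5, 6])],
}
rows = encode_groups(sorts, n_cards=6)

# Mean group size, ignoring singletons.
mean_group_size = sum(sum(row) for _, _, row in rows) / len(rows)
```

A matrix in this shape, with groups as cases and cards as two-valued variables, is what a factor analysis or a cluster analysis can then take as input.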

We first examined whether different groups of participants sorted the cards in different ways. The only reader factor affecting sorting behavior was reading experience. More experienced leaflet readers created more (and hence smaller) groups (Spearman’s rank-order correlation between reading experience and number of groups was .29, p < .05). This correlation is primarily due to the 12% of our readers who reported they never read leaflets: They formed a smaller number of (bigger) groups. Since the only reader characteristic influencing sorting behavior was reading experience and only the 12% nonreaders really differed from the other participants, the majority of our readers can be said to be a homogeneous group. We decided to retain all participants in subsequent analyses.

Sorting data have been conventionally analyzed by means of a cluster analysis, which produces a visually attractive tree diagram where the cards that are viewed as most similar by the participants are placed on branches that are close together. Coherence between medication sentences is thus expressed in terms of distances in the diagram. In this way both local associations and higher level groupings become visible.
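To make the idea of distances between cards concrete, the sketch below derives a distance from how often two cards were placed in the same group and then merges the closest clusters step by step. It is an illustration under assumptions: the distance measure (1 minus the share of participants who grouped the two cards together) and the average-linkage rule are common choices, not necessarily the ones used for the tree diagram in this study, and the sample sorts are invented.

```python
from itertools import combinations


def cooccurrence_distance(sorts, n_cards):
    """Distance between two cards: 1 - share of participants who grouped them together.

    sorts: list with one entry per participant, each a list of groups (sets of
    0-based card indexes).
    """
    together = [[0] * n_cards for _ in range(n_cards)]
    for groups in sorts:
        for group in groups:
            for i, j in combinations(sorted(group), 2):
                together[i][j] += 1
                together[j][i] += 1
    n = len(sorts)
    return [
        [0.0 if i == j else 1 - together[i][j] / n for j in range(n_cards)]
        for i in range(n_cards)
    ]


def average_linkage(dist, n_items):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    clusters = [(i,) for i in range(n_items)]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Average pairwise distance between the members of the two clusters.
            d = sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [tuple(sorted(a + b))]
        merges.append((a, b, d))
    return merges


# Invented example: 4 cards sorted by 3 participants.
sorts = [
    [{0, 1}, {2, 3}],
    [{0, 1, 2}, {3}],
    [{0, 1}, {2, 3}],
]
dist = cooccurrence_distance(sorts, n_cards=4)
merges = average_linkage(dist, n_items=4)
```

The order of the merges corresponds to the branching of a tree diagram: pairs that join early (at small distances) are the cards most participants kept together, and later merges show the higher level groupings.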

An alternative procedure is advocated by Capra (2005). This is a factor analysis that enables us to estimate how much of the variability in the data can be explained by common components. These components can be interpreted as the main categories in the card-sorting set distinguished by the participants. A factor analysis on sorting data offers three important advantages. First, it allows a straightforward estimate of the variance explained in the data by the adopted factor solution and of the homogeneity of the different factors. Second, it allows cards to load on different factors at the same time, thus revealing potential multidimensionality (i.e., the fact that cards show different kinds of similarities with other cards, so that a single card may fit into more than one category). Third, the factors can be used to identify specific groups produced by individual participants, closely resembling a hypothesized factor. The names the participants give to these groups may help to indicate their common denominator.

Global Results: Eleven Groups

Because of the extra information yielded by the factor analysis, we will start by reporting this analysis and then report the results of the cluster analysis. We performed a Principal Components analysis with Varimax Rotation. The first analysis yielded 16 factors with eigenvalues of over 1. As the eigenvalues dropped steeply after the 12th factor, a second analysis was run with 12 factors. In this analysis, the 12th factor was considerably weaker than the 11th, and contained only cards with higher loadings on other factors. Hence we settled for an 11-factor solution. We took a minimum loading of .4 as a requirement for assigning cards to factors.
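The assignment rule can be illustrated with a small sketch: a card is assigned to every factor on which its loading reaches the minimum, so it may end up in more than one factor or in none. The function name, card labels, and loadings below are hypothetical, and the sketch assumes the rotated loadings are already available from a statistics package.

```python
def assign_cards(loadings, threshold=0.4):
    """Assign each card to every factor on which its loading reaches the threshold.

    loadings: dict mapping card id -> list of loadings, one per factor.
    Returns (factors, unassigned): factor index -> list of cards, plus the cards
    that reach the threshold on no factor.
    """
    n_factors = len(next(iter(loadings.values())))
    factors = {f: [] for f in range(n_factors)}
    unassigned = []
    for card, row in loadings.items():
        hits = [f for f, value in enumerate(row) if value >= threshold]
        for f in hits:
            factors[f].append(card)
        if not hits:
            unassigned.append(card)
    return factors, unassigned


# Hypothetical loadings on two factors for three made-up cards.
loadings = {
    "card_a": [0.88, 0.05],   # loads clearly on the first factor only
    "card_b": [0.53, 0.41],   # reaches the threshold on both factors
    "card_c": [0.315, 0.10],  # reaches the threshold on neither factor
}
factors, unassigned = assign_cards(loadings)
```

Because a card such as `card_b` is counted under both factors, the per-factor card counts can add up to more than the number of cards in the pack.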

Table 2 summarizes the results of this analysis. Note that the total number of cards exceeds 75 because some cards load on more than one factor. More details can be found in Appendix 3 at the Technical Communication Web site. In the 11-factor solution, the smallest group has 4 cards and the largest group has 13 cards. The fact that 66.7% of the variability of the data is explained by the 11 factors indicates the degree of structure found in our sorting data.

Table 2. Eigenvalues and variances explained by the 11 factors in the rotated 11-factor solution (Varimax Rotation)

| Factor | Provisional factor name | Nr. of cards | Eigenvalue | % of variance | Cumulative % |
|---|---|---|---|---|---|
| 1 | Directions for use | 13 | 7.966 | 10.6 | 10.6 |
| 2 | Do not use or take special care | 12 | 7.062 | 9.4 | 20.0 |
| 3 | Contact your doctor | 11 | 5.638 | 7.5 | 27.6 |
| 4 | Side effects | 8 | 4.671 | 6.2 | 33.8 |
| 5 | What the medicine is used for | 5 | 4.358 | 5.8 | 39.6 |
| 6 | Ingredients and medicine group | 6 | 4.130 | 5.5 | 45.1 |
| 7 | Storage | 5 | 3.561 | 4.7 | 49.8 |
| 8 | Packaging and appearance | 4 | 3.537 | 4.7 | 54.6 |
| 9 | Driving and using machines | 6 | 3.165 | 4.2 | 58.8 |
| 10 | Registration data | 4 | 3.131 | 4.2 | 63.0 |
| 11 | Pregnancy and breast feeding | 6 | 2.772 | 3.7 | 66.7 |
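The percentages in Table 2 follow from the eigenvalues. In a principal components analysis of standardized variables, the total variance equals the number of variables, so each factor's share is its eigenvalue divided by the number of cards (75). The sketch below reproduces that arithmetic from the eigenvalues in the table; the function name is ours, and the standardization is an assumption, although it is consistent with the reported figures.

```python
def variance_explained(eigenvalues, n_variables):
    """Return per-factor and cumulative percentages of variance explained."""
    pct = [100 * ev / n_variables for ev in eigenvalues]
    cumulative = []
    total = 0.0
    for p in pct:
        total += p
        cumulative.append(total)
    return pct, cumulative


# Eigenvalues of the 11 rotated factors, copied from Table 2.
eigenvalues = [7.966, 7.062, 5.638, 4.671, 4.358, 4.130,
               3.561, 3.537, 3.165, 3.131, 2.772]
N_CARDS = 75  # each card is one variable

pct, cumulative = variance_explained(eigenvalues, N_CARDS)
first_factor_pct = round(pct[0], 1)        # share of the "Directions for use" factor
total_pct = round(cumulative[-1], 1)       # share of all 11 factors together
```

With these inputs the first factor comes out at about 10.6% and the eleven factors together at about 66.7%, in line with the first and last rows of the table.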

How did we arrive at the names for our 11 groups? Why did we, for example, choose “Directions for use” as the best name for the first group? We used a three-step analytical procedure recommended by Capra (2005), who presented the Jaccard score as a useful measure in card-sorting studies.

In the first step, similarity indexes are computed between each actual group created by a participant on the one hand and the 11 factors on the other hand, with the Jaccard score as index. This score divides the number of cards present in both groups (a) by the sum of this number (a), the number of cards present in the group made by the participant but absent in the factor (b), and the number of cards present in the factor but absent in the group (c): a / (a + b + c). When a specific group is identical to a factor, the corresponding Jaccard score is 1, since b and c are zero for such a group.
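The Jaccard score as defined above is straightforward to compute with sets. The sketch below uses the thirteen card numbers of the first factor as listed in Table 4; the participant groups are invented to show how extra or missing cards lower the score, and the function name is ours.

```python
def jaccard(group, factor):
    """Jaccard score a / (a + b + c) between a participant's group and a factor."""
    group, factor = set(group), set(factor)
    a = len(group & factor)   # cards present in both
    b = len(group - factor)   # cards in the participant's group but not in the factor
    c = len(factor - group)   # cards in the factor but not in the participant's group
    return a / (a + b + c) if (a + b + c) else 0.0


# Cards assigned to the first factor (card numbers from Table 4).
factor_1 = {30, 46, 48, 49, 50, 51, 52, 53, 54, 55, 60, 61, 64}

identical = jaccard(factor_1, factor_1)                # b = c = 0
one_extra = jaccard(factor_1 | {56}, factor_1)         # invented group with one extra card
two_missing = jaccard(factor_1 - {30, 64}, factor_1)   # invented group missing two cards
```

An identical group scores 1, a group with one extra card scores 13/14 (about .929), and a group missing two cards scores 11/13 (about .846). Scores of this size in a results table therefore indicate groups that differ from the factor by only one or two cards, although the score alone does not reveal which cards differ.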

In the second step, we selected the groups made by participants that most closely matched a given factor. For each factor, we chose the 20 highest scoring groups, with the restriction that the Jaccard score should be above .5. In the third step, we listed the names given to these groups by the participants and identified the best common denominator for them.
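The second and third steps can be sketched in the same way: rank the participants' groups by their Jaccard score with a factor, keep at most 20 that score above .5, and tally their names. The tally only supports the final judgment about the best common denominator and does not replace it. The function names and sample data are hypothetical.

```python
from collections import Counter


def jaccard(group, factor):
    """Jaccard score a / (a + b + c) between two sets of cards."""
    group, factor = set(group), set(factor)
    union = len(group | factor)
    return len(group & factor) / union if union else 0.0


def best_matching_groups(groups, factor, top_n=20, min_score=0.5):
    """Return up to top_n (score, name) pairs with a score above min_score, best first.

    groups: list of (name given by the participant, set of card numbers).
    """
    scored = [(jaccard(cards, factor), name) for name, cards in groups]
    kept = [pair for pair in scored if pair[0] > min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_n]


# Invented example: a four-card factor and four participant groups.
factor = {1, 2, 3, 4}
groups = [
    ("Usage", {1, 2, 3, 4}),
    ("Dosage", {1, 2, 3}),
    ("Side effects", {4, 5, 6, 7}),
    ("Usage", {1, 2, 3, 4, 5}),
]
best = best_matching_groups(groups, factor)
name_counts = Counter(name for _, name in best)
```

In this example the "Side effects" group falls below the cutoff, and "Usage" is the most frequent name among the remaining groups.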

As a demonstration, Table 3 lists the names for the 20 best matching groups for the first factor. Every line in this table gives the Jaccard score of one group made by a specific participant, plus the name the participant gave to this group of cards. The first line indicates that this group, with a maximum score of 1, perfectly reflects the factor that was the result of the statistical analysis. In the second line, a group is presented that probably has one or two cards more or that misses one or two cards of the factor. On the basis of these names, we selected the Dutch gebruiksaanwijzing (literally “usage instructions”) as the best label for this factor, which seems to translate best into “Directions for use.” This procedure was followed for all eleven factors, which resulted in the names presented in Table 2.

Table 3. Jaccard scores and names for the 20 groups that best match the “directions for use” factor

| Jaccard score | Label (Dutch) | Label (literal English translation) |
|---|---|---|
| 1.000 | Inname en gebruik | Taking and using (the medicine) |
| .929 | Gebruiksaanwijzing | Usage instructions |
| .867 | Gebruik | Usage |
| .846 | Wijze van innemen | How to take |
| .846 | Gebruiksaanwijzing | Usage instructions |
| .786 | Dosering | Dosage |
| .786 | Gebruik | Using (the medicine) |
| .769 | Gebruik | Using (the medicine) |
| .769 | Dosering | Dosage |
| .769 | Hoe moet je het middel gebruiken | How to use the medicine |
| .765 | Gebruiksaanwijzing | Usage instructions |
| .750 | Hoe gebruik je het | How do you use it |
| .750 | Inname | Taking (the medicine) |
| .706 | Gebruik/dosering | Using (the medicine) |
| .692 | Hoe in te nemen | How to take it |
| .692 | Regelmaat van innemen | Regularity of taking it |
| .688 | Dosering | Dosage |
| .684 | Gebruiksvoorschriften | Usage instructions |
| .667 | Gebruik | Using (the medicine) |
| .667 | Dosering en inname | Dosage and taking (the medicine) |

Exploring the Internal Structure of a Group

Combining the factor analysis with the cluster analysis allows us to analyze the lower-order, within-group structure. By way of illustration, we will further explore the makeup of the “Directions for use” group. Table 4 presents the cards that constitute the factor, together with their factor loadings. It shows that four cards have a factor loading of .83 or higher. Card 64 (“do not stop suddenly”), which has the lowest factor loading, can indeed be interpreted as less strongly connected to the daily use of the medicine, but on the other hand the decision to stop can be seen as an aspect of usage.

Table 4. The cards assigned to the first factor (factor loadings are in parentheses)

| Factor | Name of group | Card (factor loading) | Short version of card statement |
|---|---|---|---|
| 1 | Directions for use | 30 (.75) | Wait after grapefruit juice |
| | | 46 (.68) | Starting dose |
| | | 48 (.67) | Take one tablet when over 65 |
| | | 49 (.88) | Take any time of day |
| | | 50 (.87) | Six hours between doses |
| | | 51 (.85) | Take before/after meal |
| | | 52 (.73) | Injected |
| | | 53 (.71) | Massaged |
| | | 54 (.83) | Swallow whole |
| | | 55 (.75) | Use for 3 months |
| | | 60 (.71) | Missed dose |
| | | 61 (.85) | Forgotten dose |
| | | 64 (.53) | Do not stop suddenly |

As mentioned, card-sorting data can also be interpreted using a hierarchical cluster analysis. The entire cluster analysis tree is in Appendix 4 on the Technical Communication Web site. Figure 1 presents a selection of the results of this analysis, relevant for the first factor of the analysis discussed above. The key to interpreting this diagram is looking at the points where branches join.

Two of the cards with the highest factor loadings in Table 4 (49 and 51) turn out to be the “best” pair in this diagram. Both refer to the time the medicine should be taken: “Take any time of day” and “Take before/after meal.” The majority of participants put these cards in the same group. What the factor analysis did not show is the lower-order structure within a factor. For instance, the “best pair” forms a subgroup with another fairly good pair of cards (50 and 61) with information about doses. One level higher, these two pairs form a group with the cards about administering the drug (52 inject, 53 massage, 54 swallow); recall that the participants did not see these numbers while sorting. This group of seven cards seems to contain the “core” directions, all of them having loadings above .70 in the factor analysis. Cards with lower loadings are on the periphery of the sub-tree in Figure 1, especially card 64.
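The notion of a “best pair” can be illustrated by counting how often two cards end up in the same group. The sketch below assumes that pair similarity is the proportion of participants who grouped the two cards together; the distance measure and linkage method behind Figure 1 are not specified here, so this is an assumption, and the three sorts are invented.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(sorts):
    """Proportion of participants who put each pair of cards in the same group.

    sorts: one entry per participant, each a list of card groups (sets).
    """
    counts = Counter()
    for groups in sorts:
        for group in groups:
            for pair in combinations(sorted(group), 2):
                counts[pair] += 1
    n_participants = len(sorts)
    return {pair: c / n_participants for pair, c in counts.items()}


# Hypothetical sorts by three participants (card numbers as in Table 4).
sorts = [
    [{49, 50, 51}, {61, 64}],
    [{49, 51}, {50, 61}, {64}],
    [{49, 51, 64}, {50, 61}],
]

proportions = co_occurrence(sorts)
best_pair = max(proportions, key=proportions.get)
print(best_pair, proportions[best_pair])
```

A hierarchical cluster analysis would typically start from a matrix of such proportions (or from one minus the proportion as a distance) and join the closest pairs first, which is why the most frequently co-sorted cards meet lowest in the tree.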

Figure 1. Cluster analysis for the first group, “Directions for use”

Card 56 (“works after three days”) is the most eccentric one in the entire pack. The factor analysis does not place it in any factor, because it does not have any loadings of over .4. Its highest loading (.315) does place it in the vicinity of the directions group, however, as is shown by the sub-tree in Figure 1.

Grouping Complexities

Different Abstraction Levels

A comparison between the factor analysis and the cluster analysis reveals that groupings may occur at different levels of abstraction. This can be demonstrated with the loadings of some cards in the “Driving and using machines” group. Two of these cards also load on the factor “Side effects”: “Do not drive when sleepy” (.732 and .359) and “Do not use machines when dizzy” (.665 and .316). This is something we also experienced in reader testing: Leaflet readers sometimes look for information on driving in the side effects section. In their cognitive structure, “Driving and using machines” might be considered as a lower level unit within “Side effects.” The same goes for information about effects on breast feeding, which might be ordered within “Pregnancy” as well as within “Side effects.” This is demonstrated in the cluster analysis, as can be seen in Figure 2.

Figure 2 shows that the heading “Side effects” might be interpreted as giving information on different levels of specificity:

  • Side effects 1 (SE1): The whole group of cards presented in Figure 2, including effects on pregnancy and breast feeding, effects on driving and using machines, interactions, and side effects.
  • SE2: SE1 minus the pregnancy information.
  • SE3: SE2 minus the driving and machines information; this gives us the side effects in a more strict sense (cards 65, 66, 67, 69) plus the interactions (27 and 28), plus 63.
  • SE4: SE3 minus card 63 (“Stopping gives higher cholesterol”) which seems to be a singleton within this group. It can be considered as a side effect of stopping the medicine.
  • SE5: SE4 minus the interactions gives us side effects in the strictest sense.

Figure 2. Cluster analysis of side effects (SE) cards

Medical and pharmaceutical experts will interpret the term side effects only in its strictest sense. They consider interactions to be conceptually different from side effects, which can only be effects of the medicine itself. The dendrogram clearly shows that lay people have difficulties making this distinction; the interaction cards have loadings on the side effects group similar to the side effect of fatigue. And as we see in these data, effects on driving and breast feeding may also be considered to be side effects of the medicine.

User Situations or User Actions as Grouping Criteria

In the group “Directions for use,” all cards clearly belong to this group and do not load on other factors. But 12 other cards load on more than one factor, which suggests that they might be presented under different headings in the patient information leaflet. A good example is the group with cards about “Pregnancy and breast feeding.” Some of these cards also have high loadings on the group “Do not use or take special care” (31 – Do not use when pregnant (.643); 34 – Do not use when breast feeding (.617)). And card 32 (Tell doctor about wish to have children) also loaded on the factor “Contact your doctor” (.634). A closer look at such findings reveals a difference in the framing of information by the respondents. A card can be ordered in terms of user situations (e.g., “breast feeding”). But it may also be ordered in terms of the action that might be required (e.g., “stop using the medicine” or “talk to your doctor”).

The same distinction can be found in cards belonging to the group “Driving and using machines” and the group “Do not use or take special care.” For example, the card “You should not drive because this medicine affects your reaction speed” (39) has been grouped in terms of situations (loading .624 on “Driving”) as well as in terms of actions (loading .442 on “Do not use”). The fact that we did not see such multidimensionality in the group “Directions for use” can probably be explained by the fact that this group does not distinguish between user situations.
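The difference between cards that sit in one factor, in several, or in none can be sketched as a simple threshold check on the loadings. The cut-off of .4 is the one mentioned for card 56; treating it as the criterion for multiple membership is an assumption, since the text also discusses secondary loadings below that value. Only loadings quoted in the text are used.

```python
THRESHOLD = 0.4  # cut-off mentioned in the text for card 56; assumed to apply generally

# Loadings quoted in the text; all other loadings are omitted here.
loadings = {
    39: {"Driving and using machines": 0.624,
         "Do not use or take special care": 0.442},
    56: {"Directions for use": 0.315},
    64: {"Directions for use": 0.53},
}


def factors_for(card_loadings, threshold=THRESHOLD):
    """Factors on which a card loads above the threshold, strongest first."""
    hits = [(name, value) for name, value in card_loadings.items() if value > threshold]
    hits.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in hits]


def classify(card_loadings, threshold=THRESHOLD):
    """Label a card as unassigned, single factor, or multiple factors."""
    n = len(factors_for(card_loadings, threshold))
    if n == 0:
        return "unassigned"
    return "single factor" if n == 1 else "multiple factors"


for card, card_loadings in loadings.items():
    print(card, classify(card_loadings), factors_for(card_loadings))
```

With these values, card 39 is flagged as belonging to both a situation group and an action group, card 64 to one group only, and card 56 to none.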

The most striking example of an action framing that overrides situational differences is the factor “Contact your doctor.” The cards in this group mention various topics about which the doctor could be consulted:

  • conditions requiring special care
  • using other medicines
  • pregnancy
  • using machines
  • dosage
  • duration of treatment
  • overdose
  • stopping treatment
  • unlisted side effects
  • changes in health

Both the situation-action multidimensionality and the abstraction-level dilemma affect the findability of information in patient information leaflets. Readers may look for information on breast feeding in a section called “Side effects,” but will not find it there according to the European template. This reflects abstraction differences between experts and patients.

Readers looking for information on pregnancy under the heading “Do not use or take special care” prefer the action framing over the situational framing of this topic. The template creates a special paragraph for pregnancy information, probably because it is considered as an important section that needs special attention. This raises the question of whether readers will keep on searching when they do not find the information under the preferred heading.

Ordering Task Results

There are many ways to analyze the ordering data. Although not every participant grouped the cards into the 11 factors discussed above, we decided to analyze the ordering of groups on the level of these factors, given that the factors are the best approximation of grouping agreement we have. Our analytical procedure was as follows.

First, we calculated a so-called relative position for every group a participant made, excluding singleton cards. Consider a participant with five groups as an example. For this participant, there is a 5-point position scale. The relative position of a group refers to the proportion of groups (out of the five) preceding this group. Hence the first group always has position 0. The second group has position .20, the third .40, the fourth .60, and the last .80, because 80% of the participant’s groups precede this final group. For a participant with 10 groups, the second group would have .10 and the last .90.

This gave us a relative position for each card that was grouped. Since every card belongs to a factor, we can now take the mean of the relative positions for the cards belonging to each factor, and perform a one-way ANOVA in order to identify the relative position of every group. The results are presented in Table 5.
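The relative-position step can be sketched as follows. This is a minimal illustration, not the authors' procedure in full: it assumes that singleton groups are dropped before the position scale is built and that card observations are pooled over participants before averaging. The card-to-factor assignments follow the text; the two sorts are invented.

```python
def relative_positions(n_groups):
    """Proportion of groups preceding each group: 0, 1/n, ..., (n-1)/n."""
    return [i / n_groups for i in range(n_groups)]


def factor_means(participants, card_to_factor):
    """Mean relative position per factor, pooled over all card observations.

    participants: one entry per participant, each a list of groups in the
    order the participant chose. Singleton groups are skipped (assumption).
    """
    observations = {}
    for ordered_groups in participants:
        groups = [g for g in ordered_groups if len(g) > 1]
        for position, group in zip(relative_positions(len(groups)), groups):
            for card in group:
                factor = card_to_factor.get(card)
                if factor is not None:
                    observations.setdefault(factor, []).append(position)
    return {f: sum(v) / len(v) for f, v in observations.items()}


card_to_factor = {
    49: "Directions for use", 51: "Directions for use",
    65: "Side effects", 66: "Side effects",
    1: "Ingredients and medicine group", 3: "Ingredients and medicine group",
    5: "Ingredients and medicine group",
}

# Hypothetical orderings by two participants.
participants = [
    [[49, 51], [65, 66], [5]],        # the singleton [5] is ignored
    [[49, 51], [1, 3], [65, 66]],
]

print(relative_positions(5))
print(factor_means(participants, card_to_factor))
```

The per-card positions collected in `observations` are the kind of values on which a one-way ANOVA with factor as the grouping variable could then be run.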

Table 5. Mean relative positions for groups of cards belonging to the factors

| Factor no. | No. of cards | No. of observations | Factor name | Mean relative position (SD) | Differs significantly from (Bonferroni test) |
|---|---|---|---|---|---|
| 5 | 5 | 383 | What the medicine is used for | .16 (.20) | All others |
| 1 | 13 | 983 | Directions for use | .31 (.20) | All others |
| 6 | 6 | 443 | Ingredients and medicine group | .37 (.31) | All others except 2 |
| 2 | 12 | 679 | Do not use or take special care | .42 (.22) | All others except 6, 4, 3, and 8 |
| 4 | 8 | 517 | Side effects | .43 (.22) | All others except 2, 3, and 8 |
| 3 | 11 | 690 | Contact your doctor | .45 (.24) | All others except 4, 2, and 8 |
| 8 | 4 | 300 | Packaging and appearance | .45 (.32) | All others except 2, 4, 3, and 9 |
| 9 | 6 | 452 | Driving and using machines | .50 (.23) | All others except 8, 7, and 11 |
| 7 | 5 | 360 | Storage | .52 (.30) | All others except 9 and 11 |
| 11 | 6 | 450 | Pregnancy and breast feeding | .53 (.23) | All others except 9 and 7 |
| 10 | 4 | 271 | Registration data | .62 (.36) | All others |

Table 5 is clear on the positions of the factors 5, 1, and 10; they differ significantly from all other factors. Hence readers expect patient information to start with information about the goal of the medicine. Then they expect to find directions for use. At the end of the document they will look for registration data.

There is also a clear “midfield” that is composed of two groups within which the factors do not differ reliably. The first midfield group is 2 (do not use or take special care), 4 (side effects), 3 (contact your doctor), and 8 (packaging and appearance). The second midfield group is 9 (driving and using machines), 7 (storage), and 11 (pregnancy and breast feeding). Most cards in these two groups differ from cards in the other group, the only exception being that 8 and 9 do not differ in mean position.

Somewhat unclear is the position of factor 6 (ingredients and medicine group). The relatively high standard deviation for the orderings of this factor indicates disagreement on how to place it. Hence we examined the individual orderings, focusing on how participants think about the first and last groups of information in patient information leaflets. For each factor and each participant, we calculated the mean relative position for its cards; then we counted the number of times a factor occupies either one of the first three or one of the last three places (see Table 6). Of course, this procedure only approximates the individual orderings, since not every participant produced the same groupings.

Table 6. The factors’ mean ranks within participants for the first and the last three places

| Factor no. | Factor name | Total in places 1–3 | Total in places 9–11 |
|---|---|---|---|
| 5 | What the medicine is used for | 58 | 0 |
| 1 | Directions for use | 36 | 0 |
| 6 | Ingredients and medicine group | 33 | 26 |
| 2 | Do not use or take special care | 20 | 14 |
| 4 | Side effects | 12 | 9 |
| 3 | Contact your doctor | 13 | 15 |
| 8 | Packaging and appearance | 31 | 35 |
| 9 | Driving and using machines | 5 | 26 |
| 7 | Storage | 14 | 32 |
| 11 | Pregnancy and breast feeding | 4 | 31 |
| 10 | Registration data | 27 | 47 |

In Table 6, the last two columns present the number of participants positioning every factor at the beginning or at the end of the document. For instance, for 33 of our participants factor 6 was among the first three factors, and for 26 participants it was among the last three. For the remaining participants, it was somewhere in between (not represented in the table).
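The counting behind Table 6 can be sketched as below. The snippet assumes that each participant's factors are ranked by their mean relative position and that the first and last three ranks are then tallied; how ties and participants with fewer factors were handled is not stated, so those details are assumptions. The positions for the two participants are invented.

```python
def first_last_counts(per_participant_means, n_places=3):
    """Tally how often each factor is among the first or last n_places.

    per_participant_means: one dict per participant, mapping factor number
    to that participant's mean relative position for the factor's cards.
    Returns {factor: [count_first, count_last]}.
    """
    counts = {}
    for means in per_participant_means:
        ranked = sorted(means, key=means.get)   # ties keep insertion order
        first, last = ranked[:n_places], ranked[-n_places:]
        for factor in ranked:
            tally = counts.setdefault(factor, [0, 0])
            if factor in first:
                tally[0] += 1
            if factor in last:
                tally[1] += 1
    return counts


# Hypothetical mean relative positions for two participants (factor numbers
# as in Table 5; each participant is shown with seven factors only).
per_participant_means = [
    {5: 0.0, 1: 0.1, 6: 0.2, 2: 0.4, 4: 0.5, 7: 0.8, 10: 0.9},
    {5: 0.0, 1: 0.2, 2: 0.3, 4: 0.5, 7: 0.6, 10: 0.7, 6: 0.9},
]

counts = first_last_counts(per_participant_means)
for factor, (n_first, n_last) in sorted(counts.items()):
    print(factor, n_first, n_last)
```

In this toy example factor 6 appears once among the first three and once among the last three, the same split pattern that Table 6 shows for the ingredients group. Note that for a participant with fewer than six factors the first and last slices would overlap, so a real analysis would need an explicit rule for that case.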

We can see that two factors were never positioned at the end of the document and clearly seem to have a prominent position at the start: the goal of the medicine and the directions for use. We can also see that the final position of the registration data is less clear than Table 5 suggests. Some readers place this group of cards at the beginning of the document. The same goes for information about ingredients, packaging, and storage.

Can we derive a preferred order from these data? We think we can, provided that we separate medicine information from usage information. Tables 7 and 8 again present the mean relative position analysis from Table 5, but separately for medicine and usage information.

Table 7. Mean relative positions for medicine information groups

| Factor no. | No. of cards | No. of observations | Factor name | Mean relative position (SD) | Differs significantly from (Bonferroni test) |
|---|---|---|---|---|---|
| 5 | 5 | 383 | What the medicine is used for | .16 (.20) | All others |
| 6 | 6 | 443 | Ingredients and medicine group | .37 (.31) | All others |
| 8 | 4 | 300 | Packaging and appearance | .45 (.32) | All others |
| 7 | 5 | 360 | Storage | .52 (.30) | All others |
| 10 | 4 | 271 | Registration data | .62 (.36) | All others |

Table 8. Mean relative positions for usage information groups

| Factor no. | No. of cards | No. of observations | Factor name | Mean relative position (SD) | Differs significantly from (Bonferroni test) |
|---|---|---|---|---|---|
| 1 | 13 | 983 | Directions for use | .31 (.20) | All others |
| 2 | 12 | 679 | Do not use or take special care | .42 (.22) | All others except 4 and 3 |
| 4 | 8 | 517 | Side effects | .43 (.22) | All others except 2 and 3 |
| 3 | 11 | 690 | Contact your doctor | .45 (.24) | All others except 2 and 4 |
| 9 | 6 | 452 | Driving and using machines | .50 (.23) | All others except 11 |
| 11 | 6 | 450 | Pregnancy and breast feeding | .53 (.23) | All others except 9 |

Information about the drug has a clearly preferred order, because all factor positions differ from each other. Drug information should start with the goal of the medicine, followed by ingredients, packaging and storage, and finally registration data.

For the usage information, three groups must be distinguished. Clearly it starts with directions for use. Then a group of more or less general problems follows, with information about special user situations, side effects, and situations in which patients should contact their doctor. Within this group, presentation order is unclear. Finally we find a group with more specific user situations concerning driving, using machines, pregnancy, and breast feeding, also without clear ordering preferences.

In the next section we will further discuss the template design implications of our card-sorting study.

Proposal for Grouping and Ordering Leaflet Information

What are the implications of this card-sorting study for the design of a template for patient information leaflets? Let us first list the results from the study. First, the factor analysis strongly suggests 11 groups of information; the cluster analysis suggests they can be divided into two subgroups: medicine and usage information:

| Medicine information | Usage information |
|---|---|
| What the medicine is used for | Directions for use |
| Ingredients and medicine group | Do not use or take special care |
| Packaging and appearance | Side effects |
| Storage | Contact your doctor |
| Registration data | Driving and using machines |
| | Pregnancy and breast feeding |

How to join these two types of information in a single structure? The picture emerging from the ordering analysis is the following. Some subtopics of medicine information seem to be preferred at the end (storage and registration data), while the goal of the medicine clearly is preferred at the beginning. The usage information is mostly placed somewhere in the midfield and can be ordered in three clusters: directions, precautions and side effects, driving and pregnancy (Table 8). All this gives a structure as presented in Table 9.

Table 9. The proposed leaflet structure

We note that this structure is strikingly similar to the one arrived at by Morrow et al. (1996, 2000). In their results, a “medication and how to take it” group preceded a “problems” group. Within the first group, the order was name – purpose – dose – schedule – duration; within the second group, it was warnings – mild side effects – severe side effects – emergency.

This proposal largely follows from the results presented in Tables 7 and 8, but two issues need elaboration.

  • Why has the medicine information been split at the point between factor 6 (Ingredients) and factor 8 (Packaging)?
  • Why do we not mention a section “Contact your doctor”?

The division of the section on medicine information is based on the results of the cluster analysis. Figure 3 presents the relevant part of the cluster analysis tree.

The first part of the dendrogram shows three topics: goal of the medicine, ingredients, and medication group. These three topics are related to each other. Then three other topics are presented in the lower part of the dendrogram: manufacturer and registration, packaging, and storage. The main split between these groups is between cards 9 and 16. Thus, if part of the information about the medicine has to be presented as an introduction, it makes sense to tell the readers about the goal of the medicine, the medication group, and the ingredients.

Information on storage of the medicine was clearly positioned at the end of the document, according to the results presented in Table 7. And the dendrogram shows this topic to be related to packaging and registration. Thus, it makes sense to reserve the final position for packaging, storage, and registration.

Concerning the position of the group “Contact your doctor,” there are several options.

  • We might choose to ignore this factor because it introduces multidimensionality in the leaflet’s structure. However, multidimensionality and redundancy are not by themselves problematic in instructive text.
  • When accepting a separate section “Contact your doctor,” several positions could be considered. First, we could use the information as a kind of summary of the problem section, indicating the more serious problems that necessitate action. This summary could be placed either at the beginning or at the end of the “usage problems” section. Second, we could place the information group at the start of the leaflet, as a headline section for the entire leaflet (Dolk, Lentz, Knapp, Pander Maat, & Raynor, 2011).
  • We doubt that patients will ever search for a section describing all situations with a need for doctor contact. Generally, a patient will search for information with a specific situation in mind. If contact with a doctor is needed in that situation, this advice should be presented clearly in the section on that topic. Perhaps it should also be highlighted in some way, for instance by adding a telephone pictogram. According to this line of reasoning, a special “Contact your doctor” section would not be necessary.

Figure 3. Cluster analysis for the section on medicine information

Since the issue of the “Contact your doctor” section is still unresolved, we will not include such a section in our proposal.

How does our proposal so far relate to the current EU template? Table 10 presents a comparison of our empirically based proposal with the present EU template.

Table 10. Our proposal compared with the EU template

In proposing an alternative structure for patient information leaflets, we should make a distinction between three aspects of document structure: the grouping of related topics in sections, the order of presentation of sections, and the wording of headings for every section.

What are the main differences between the current EU template and our proposal in terms of grouping related topics in sections? There are two major and some minor differences. Major differences are the following:

  • The “Before you take X” section is now part of a larger “Potential problems” section, which also includes side effects.
  • Side effects in a broad sense also include interactions with other medicines. This is not to say that this group should be labeled “Side effects.” But the side effects in a strict sense and the interactions should at least be placed next to each other in the new structure.

There are also some smaller regrouping modifications.

  • The old “Taking X with food and drink” category covers two kinds of information, which are differently placed in the sorting data. The topic “Medicine combined with alcohol” is in the “Before you take X” category (now called “Do not use or take special care”). The topic “Medicine combined with grapefruit juice” is grouped in “How to take X” information (now called “Directions for use”), which makes sense because it tells the reader to wait half an hour before taking the medicine.
  • All information about ingredients (cards 1, 3, and 5) has been put together in the card-sorting data, so our participants did not make the distinction between the active ingredient (to be given in section 1) and the other ingredients (section 6). All ingredient cards are coupled with the medicine group cards. And since the medicine group information is placed at the beginning, in principle the ingredients are better placed in the beginning too.

With regard to the order of presentation, there are four differences between our proposal and the current template:

  • Since the medicine group information has been placed at the beginning, the closely related ingredients section has moved from the end to the start of the document.
  • The section with precautions “Do not use or take special care” has moved to a further position in the document after the directions for use.
  • Information about pregnancy and using machines has also been positioned further on, after the side effects.
  • Information about storage is now placed after the information on packaging and appearance.

Finally, our closed sorting study has yielded interesting information regarding the wording of our headings. One conclusion was that general headings like “Before you take X” and “Further information” should be avoided because they do not indicate any topic. A second conclusion was that some specific concepts in headings, such as “Side effects” and “Machines,” need careful consideration because pharmacists and lay readers may differ in their interpretations of them: Medicine effects on breast feeding may be considered as side effects, and everyday utensils may not be regarded as machines that should be avoided while taking medication. However, since wording decisions can only be made once grouping decisions have been made, we will not provide wording proposals here. These should be tested in another closed sorting study.

Conclusions

Our closed card-sorting method showed that readers had some problems in locating the correct answers to a set of 27 questions. These problems were the result of two kinds of mismatches between reader expectations and the current EU template. The first mismatch is between the wording of headings and reader interpretations. The second mismatch has to do with classifying and grouping information. These mismatches confirm the results of other studies in which readers experience difficulties in finding information in PILs.

In the open card-sorting study, we identified 11 sections that readers may expect when confronted with patient information about medicines. We found some important discrepancies between the current template and the results of the empirical data analysis. To our surprise, the participants grouped different cards in one factor named “Contact your doctor,” which may indicate that readers would like a (sub)section discussing all situations in which they should consult a physician. These situations might be combined in a so-called headline section, as has been proposed by the UK Medicines and Healthcare products Regulatory Agency (MHRA, 2005). Another option is a “Contact your doctor” summary within a section on potential problems of medicine use. A final option is to illustrate all “Contact your doctor” sentences with a pictogram instead of clustering this advice in one section. Further research is needed to investigate the effect of a separate “Contact your doctor” section.

We identified two main topic groups: information about the medicine and information about its use. Medicine information was preferred by some at the start and by others at the end of the document: Information about the purpose of the medicine clearly should be presented in a first paragraph, while registration data clearly belong at the end of the document. Usage information (instructions and warnings) was located in the center of the leaflet, starting with the directions for use. In terms of presentation order, this would mean a swap between the current warnings in “Before you take X” and the “How to take X” section. As in the first study, we found that participants do not differentiate between active ingredients and other ingredients. Findability of patient information would improve if all ingredients were presented in one section, immediately after the first section on the goal of the medicine.

Further investigation is needed on the relation among three sections in the current template: “Side effects,” “Driving and machines,” and “Pregnancy and breast feeding.” These sections clearly belong together, but they can be related in several ways. Should they be presented next to each other, or should “Driving and machines” and “Pregnancy and breast feeding” be subsections of the “Side effects” section?

From a methodological perspective, this study clearly shows that card sorting helps to investigate readers’ mental representations and to formulate hypotheses about readers’ genre expectations when confronted with PILs. The proposal we presented in Table 9 can be seen as a set of hypotheses about the locations where readers will start looking for specific information. Likewise, we predict that PILs presented in this alternative format will result in shorter localization times for patients who are asked to answer specific questions about the medicine and its use.

Both the dendrogram and the factor analysis help us to understand how different topics of this genre relate to each other and to see the multidimensionality of specific (sub)topics. The procedure advocated by Capra (2005) proved to be extremely helpful in analyzing a complex dataset. Note that this dataset consisted of 693 individual groups, produced by 78 participants sorting 75 different cards. With this three-step analytical procedure, we were able to identify sets of groups that most closely matched the results of the factor analysis and relate these groups with group names given by participants.

Further research is needed to show whether a design according to our proposal actually improves the findability of patient information in PILs. Such a test should concentrate not only on the grouping of topics and the presentation order, but also on the wording of headings. The first study has demonstrated that this can be done by using a manipulated leaflet in which only the headings are readable, which enhances ecological validity.

Decisions on a new version of the EU template should not be made legally binding without careful research into the effectiveness of such a proposal. The most important conclusion that can be drawn from both studies presented in this paper is that it is certainly possible to base a new template design on usability data. This study shows how sorting data may be used (and in fact seem to be indispensable) to select promising text structures from the infinite number of conceivable structures. The next step would be an experimental test of the most promising candidates.

References

Askehave, I., & Zethsen, K. K. (2003). Communication barriers in public discourse. The patient package insert. Document Design, 4(1), 22–41.

Capra, M. G. (2005). Factor analysis of sorting data: An alternative to hierarchical cluster analysis. Proceedings of the Human Factors and Ergonomics Society, 49th Annual Meeting (pp. 691–695). Retrieved from http://www.thecapras.org/mcapra/work/Capra.CardSort.HFES2005.pdf

Dolk, S., Lentz, L., Knapp, P., Pander Maat, H., & Raynor, T. (2011). Headline section in patient information leaflets. Information Design Journal, 19(1), 46–57.

EMEA (2006). QRD-template version 7.2, 2006. Retrieved from http://www.emea.europa.eu/htms/human/qrd/docs/Hannotatedtemplate.pdf

Henry, A., & Roseberry, R. L. (2001). A narrow-angled corpus analysis of moves and strategies of the genre: ‘Letter of Application’. English for Specific Purposes, 20, 153–167.

Kostelnick, C., & Hasset, A. (2003). Shaping information. The rhetoric of visual conventions. Carbondale, IL: Southern Illinois University Press.

McNamara, D. S., Kintsch, E., Songer, N. B., & Kintsch, W. (1996). Are good texts always better? Interactions of text coherence, background knowledge, and levels of understanding in learning from text. Cognition and Instruction, 14(1), 1–43.

MHRA. (2005). Always read the leaflet. Getting the best information with every medicine. Report of the committee on safety of medicines Working Group on Patient Information. London, UK: TSO.

Morrow, D. G., Leirer, V., Altieri, P., & Tanke, E. (1991). Elders’ schema for taking medication: Implications for instruction design. Journal of Gerontology: Psychological Sciences, 46, 378–385.

Morrow, D. G., Leirer, V. O., Andrassy, J. M., & Tanke, E. D. (1996). Medication instruction design: Younger and older adult schemas for taking medication. Human Factors, 38, 556–573.

Morrow, D., Carver, L. M., Leirer, V. O., & Tanke, E. D. (2000). Medication schemas and memory for automated telephone messages. Human Factors, 42, 523–540.

Pander Maat, H., & Lentz, L. (2009). Improving the usability of patient information leaflets. Patient Education and Counselling, 80, 113–119.

Rugg, G., & McGeorge, P. (2005). The sorting techniques: A tutorial paper on card sorts, picture sorts and item sorts. Expert Systems, 22(3), 94–107.

Spencer, D., & Warfel, T. (2004). Card sorting: A definitive guide. Retrieved June 8, 2009, from http://www.boxesandarrows.com/view/card_sorting_a_definitive_guide

Stalker-Firth, R. (2007). Anyone for a game of cards? Got something to say? Retrieved June 8, 2009, from http://www.digital-web.com/articles/game_of_cards/

Swales, J. M. (1990). Genre analysis: English in academic and research settings. Cambridge, UK: Cambridge University Press.

Toledo, P. F. (2005). Genre analysis and reading of English as a foreign language: Genre schemata beyond text typologies. Journal of Pragmatics, 37, 1059–1079.

Upton, T. A., & Connor, U. (2001). Using computerized corpus analysis to investigate the textlinguistic discourse moves of a genre. English for Specific Purposes, 20, 313–329.

About the Authors

Henk Pander Maat is an associate professor at the Utrecht Institute of Linguistics OTS at Utrecht University. His main research interests are readability and usability research, document design, and genre analysis. Contact: h.l.w.pandermaat@uu.nl, or phone +31 30 2538167.

Leo Lentz is a professor in Text Design and Communication at Utrecht University. Web usability and text evaluation are the main focus of his research. Contact: l.r.lentz@uu.nl, or phone +31 30 2538115.