By Yoel Strimling | STC Senior Member
As technical communicators, we put a lot of time and effort into creating the highest-quality documentation that we can. We write because we want to help our readers do the tasks they need to do or understand the concepts they need to know.
Because we are professionals, we take pride in our work and want it to be the best it can be. But how do we know if what we are writing is what our readers want? How do we know that the information we are sharing with our audience is helping them do or know what they need? We might be writing documentation with one standard in mind and be satisfied with it, yet our readers might look at the same documentation and be very unsatisfied. A disconnect between what we are producing and what our readers actually want makes it very difficult to justify writing documentation at all—why write information that nobody wants?
The best way to align ourselves with our audience’s needs is to get direct, meaningful, and actionable feedback from them, but this is not always possible. Instead, we often end up relying on our “gut instincts” and assume that readers define high-quality documentation in the same way that we do. In lieu of feedback, what we need is a proven model of how readers actually define documentation quality (DQ), which we can then use to ensure that what we produce is useful to our audience.
Defining Documentation Quality (DQ)
A proper definition of DQ must meet the following criteria:
- The definition must be from the reader’s point of view. Because it is our readers who determine whether the document we give them is high quality, any definition of DQ must come from the readers’ perspective. Writers can come up with any number of quality attributes that they think are important, but at the end of the day, nothing is as important as what the readers think.
- The definition must be clear and unequivocal. Both readers and writers have to be on the same page when it comes to defining what makes a document high quality. Misunderstandings of what readers actually want from the documentation are a recipe for unhappy readers.
- The definition must cover all possible aspects of quality. “Quality” is a multidimensional concept, and we must be sure that any attempt to define it is as comprehensive as possible. A definition that emphasizes one dimension over another, or leaves one out altogether, cannot be considered to be a usable definition.
- The definition must have solid empirical backing. To be considered a valid definition of DQ, serious research must be done to give it the proper theoretical underpinnings. Years of experience or anecdotal evidence can act as a starting point, but if we are serious about our professionalism and our documentation, we need more.
Building a Comprehensive Definition of DQ
To meet all of these DQ criteria, I turned to a fascinating study done in 1996 by Drs. Richard Wang (Co-Director of the MIT Total Data Quality Management Program) and Diane Strong (Director of the Management Information Systems Program at the Worcester Polytechnic Institute).
They developed a “comprehensive, hierarchical framework of data quality attributes” that were important to what they called “data consumers.” Their underlying assumption was that, to improve data quality, they needed to understand what “data quality” meant to data consumers—data quality cannot be approached intuitively or theoretically, because these approaches do not truly capture the voice of the data consumer. Their framework was made up of 15 quality dimensions, grouped into four quality categories: Intrinsic, Representational, Contextual, and Accessibility (see Applying Wang & Strong’s Model to Documentation for a description of their model).
Based on these categories, Wang and Strong concluded that high-quality data must be:
- Intrinsically good;
- Clearly represented;
- Contextually appropriate for the task; and
- Accessible to the consumer.
Wang and Strong claimed that their proposed data quality framework could be used as a basis for further studies that measure perceived data quality in specific work contexts. They stated that the framework is methodologically sound, complete from the data consumers’ perspective, and useful for measuring, analyzing, and improving data quality. They cited “strong and convincing” anecdotal evidence that the framework has been used effectively in both industry and government and has helped information managers better understand their customers’ needs by “identifying potential data deficiencies, operationalizing the measurement of these data deficiencies, and improving data quality along these measures.”
Applying the Wang and Strong Data Quality Framework to DQ
Can we apply Wang and Strong’s data quality framework to documentation quality as well?
On the surface, it seems that Wang and Strong’s data quality framework is a good fit for our purposes. As with data quality, to understand DQ we cannot rely on an intuitive or theoretical approach; we must go directly to the documentation consumers—that is, our readers. As with data quality, to improve DQ we must understand what DQ really means to our readers. And, like data quality, high-quality documentation must be:
- Intrinsically good;
- Clearly represented;
- Contextually appropriate for the task; and
- Accessible to the reader.
But Wang and Strong’s framework focuses on data quality. Are the terms “data” and “documentation” synonymous and interchangeable in this comparison of data quality and DQ?
Documentation Is Not Data
“Data” is abstract, raw, and meaningless without context. However, when data is organized in a logical way and given context that can be understood by someone or something, it becomes “information.” In other words, information is data in a meaningful form.
While Wang and Strong’s stated subject was data quality, it would be more accurate to call it information quality (which is what Wang and others have called it in subsequent research). When data consumers are asked to rate characteristics of the data they use, that data can no longer really be called “data,” because the consumers are now giving it context.
Documentation, then, is not data, but rather information that is intended to be used by readers in a particular context and for a particular reason.
Wang and Strong’s assumptions about the need for an empirical approach to determine what information consumers want, and what high-quality information must be, are parallel to those we are making about DQ. Because their framework is really a framework for measuring information quality, and the documentation we send to our readers is used as information, there is a strong basis for attempting to use this framework to create a model for accurately measuring what our readers consider to be high-quality documentation.
For this study, I developed two questionnaires:
- The questionnaire for writers asked them to rate the 15 dimensions from a reader’s assumed point of view, called the Writers’ Assumptions of Readers’ Ratings (WARR) group.
- The questionnaire for readers asked them to rate the dimensions from their own point of view, called the Readers’ Ratings (RR) group.
I posted the link to the writer questionnaire on numerous online technical communication forums and social media pages, and I sent the link to the reader questionnaire to several technical communicators and customer service personnel from various companies to send to their readers. I did this because I wanted to ensure that a broad, worldwide range of writers and readers from different fields answered the questionnaires, and that the people answering the questions were the people who actually created, read, and used the documentation.
Rating and Data Analysis
As in the Wang and Strong study, participants provided their responses on a nine-point Likert scale, with 1 being “extremely important” and 9 being “not important at all.” Although this range is cumbersome, I used it to get a finer gradation between the weights: a scale with fewer points tends to compress responses into a narrow band of weights, while a scale with more points can reveal differences better, especially in a study like this one with a small sample size.
I calculated the mean weight of the responses for each dimension; the lower the weight, the more important the dimension. To decide which dimensions were the most important, I used a cutoff of < 2.00. This value was based on the ratings of the most important information quality attributes found by Wang and Strong in their study (in their case, the attributes Accurate and Correct).
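As a minimal sketch of this scoring step (the ratings below are made up for illustration, not data from the study), the mean-weight calculation and cutoff look like this:

```python
# Minimal sketch with made-up ratings (1 = extremely important,
# 9 = not important at all); not the study's actual data.
from statistics import mean

ratings = {
    "Accurate": [1, 2, 1, 2, 2],
    "Easy to Understand": [2, 1, 2, 2, 2],
    "Secure": [5, 6, 4, 7, 5],
}

CUTOFF = 2.00  # dimensions with a mean weight below this count as "most important"
mean_weights = {dim: mean(vals) for dim, vals in ratings.items()}
most_important = sorted(
    (dim for dim, w in mean_weights.items() if w < CUTOFF),
    key=lambda d: mean_weights[d],  # lower mean weight = more important
)
print(most_important)
```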
Finally, I compared the mean weights and standard deviations per dimension of the WARR and RR groups using a one-way ANOVA (run at http://statpages.org/anova1sm.html) to determine whether the differences in mean weights between the selected groups were significant.
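For readers who want to reproduce this kind of comparison without the web calculator, here is a sketch of a one-way ANOVA F statistic computed from group summary statistics (size, mean, standard deviation)—the same inputs the statpages calculator takes. The group values below are illustrative, not the study’s raw data.

```python
# One-way ANOVA F statistic from summary statistics (n, mean, SD).
def anova_from_summary(groups):
    """groups: list of (n, mean, sd) tuples, one per group."""
    total_n = sum(n for n, _, _ in groups)
    k = len(groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    # Between-group and within-group sums of squares
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    # F = mean square between / mean square within
    return (ss_between / (k - 1)) / (ss_within / (total_n - k))

# Illustrative comparison of one dimension's RR vs. WARR ratings
f_stat = anova_from_summary([(80, 1.80, 1.1), (66, 2.30, 1.2)])
print(round(f_stat, 2))
```

Turning F into a p-value additionally requires an F-distribution survival function (for example, `scipy.stats.f.sf` with the between- and within-group degrees of freedom); that step is omitted here to keep the sketch dependency-free.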
RR Mean Weight Rating Results
A total of 81 readers responded to the questionnaire, but only 80 of them rated all of the dimensions. Using a mean weight cutoff of < 2.00, the following dimensions were determined to be the most important for readers:
- Accurate (1.80), from the Intrinsic quality category
- Easy to Understand (1.91), from the Representational quality category
- Relevant (1.96), from the Contextual quality category
WARR Mean Weight Rating Results
A total of 66 writers responded to the questionnaire, and all of them rated all of the dimensions. Using a mean weight cutoff of < 2.00, the following dimensions were assumed by writers to be the most important for readers:
- Relevant (1.65), from the Contextual quality category
- Accurate (1.77), from the Intrinsic quality category
Comparing the RR/WARR Groups
Comparing the mean weights for each dimension between the two groups enables us to determine whether the differences between them are statistically significant. If the mean weights of a dimension differ significantly (in this study, p < 0.05) between two groups, then we can state with some certainty that the two groups weigh the importance of that particular dimension differently.
Analyzing the comparison statistically, I found significant differences between how writers assumed readers would rate several of the information quality (IQ) dimensions and how readers actually rated them:
- Writers think that the Secure IQ dimension is significantly less important to readers than it really is (F = 19.9577, p < 0.0001).
- Writers think that the Objective IQ dimension is significantly less important to readers than it really is (F = 9.5802, p = 0.0024).
- Writers think that the Consistent IQ dimension is significantly less important to readers than it really is (F = 6.8994, p = 0.0095).
- Writers think that the Valuable IQ dimension is significantly less important to readers than it really is (F = 6.2277, p = 0.0137).
- Writers think that the Timely IQ dimension is significantly less important to readers than it really is (F = 4.9567, p = 0.0275).
Figure 1 shows a comparison of the mean weights between the RR and WARR groups; statistically significant differences are marked with a box.
How Do Readers Define DQ?
The results of the RR group show that, above all, readers expect the documentation they get to be accurate, easy to understand, and relevant. Each of these dimensions represents one of the quality categories (Intrinsic, Representational, and Contextual, respectively). While this result might seem self-evident, it provides a strong empirical underpinning for the claim that DQ can be defined using a small, yet comprehensive, set of clear and unambiguous information quality dimensions.
How Do Writers Assume Readers Define DQ?
The results of the WARR group show that writers think readers define DQ using only the Relevant and Accurate dimensions (from the Contextual and Intrinsic categories, respectively). The Easy to Understand dimension (from the Representational quality category) just misses the < 2.00 cutoff, and it seems likely that it would have made the cutoff had the sample size been larger.
This order is interesting, and might reflect writers’ beliefs that readers do not consider the grammar, style, and clarity of the documentation to be that important. In truth, though, readers do understand the importance of these qualities, rating the Easy to Understand dimension second (after Accurate). However, the differences between the groups for these three dimensions were not statistically significant.
Where Writers Get It Wrong
Of the five dimensions that had significant differences between their perceived importance by writers and their actual importance to readers (Secure, Objective, Consistent, Valuable, and Timely), only the Valuable dimension was rated highly by readers. Indeed, this dimension might have even made the < 2.00 cutoff had the sample size been larger.
It is important for us to look more carefully at this result. What are readers telling us when they say that they want the documentation to be “valuable”? Why are we significantly underestimating the importance of this to readers? And what can we do to address it?
Making Our Documentation Valuable
The wording of the Valuable dimension’s definition gives us a clue: “The information in the documentation is beneficial and provides advantages from its use.” Compare this to the definition of the Relevant dimension (note that both of these dimensions are in the Contextual quality category): “The information in the documentation is applicable and helpful for the task at hand.”
Readers want the documentation we send them to be more than simply “applicable” and “helpful”—they want it to be “beneficial” and “provide advantages.” They want to look at the information and understand that if they use it, it will improve their situation in some way.
If the information in the document is “applicable and helpful for the task at hand,” then it helped readers do what they needed to do or know what they needed to know—no more, and no less. For example, the documented procedure helped them set up a complicated system, told them how to manage network clusters, or explained the hardware architecture.
On the other hand, if the information in the document is “beneficial and provides advantages from its use,” then it was more than just relevant; it gave the readers something extra. Using the previous examples, the documented procedure helped them set up a complicated system in the most efficient way, told them how to manage network clusters more effectively, or explained the advantages of the hardware architecture.
A document can be “helpful” but not “beneficial” or “advantageous.” Sure, the reader set up the system, but it took three hours when it could have taken two; the reader can manage the network clusters, but it’s more complicated than it needs to be; the reader understands the hardware architecture, but doesn’t understand why it is this way.
Readers look at documentation, and ask “What’s in it for me? Why should I care about this? What value will this information add to my work? How will this make my life easier?” They feel that there must be an additional, emotional level to the documentation. Readers are busy people and are often under a great deal of pressure to get their tasks done—reading documentation is not usually high on their list of priorities. If they feel that they are not wasting their time with the documentation, and that the writer truly wants to make life easier for them, then they will consider the document to be high quality. Accuracy, clarity, and relevance are critical—but for readers, there also needs to be an extra dimension of value.
It is no surprise that the Contextual quality category is represented twice (Valuable and Relevant) in the readers’ ratings of DQ. Documentation is never read in a vacuum and is only used in context. Writers, who are not the intended audience of the documentation, can easily lose sight of this and create content devoid of all connection to the context in which the documentation is to be used, but readers cannot use this kind of content.
How do we as writers add value to our documentation? We must understand who our audience is, what they want from the documentation, and in what context they will be using the information we give them. This can be done via user stories, use cases, personas, journey maps, and similar tools that put the writer in the reader’s place. We must also ask ourselves, “If I were the reader, would this information help me do my job better?” We must understand that readers look to documentation not only for conceptual and procedural information, but also for ways to make their workload easier.
Good News and Bad News
The good news is that both readers and writers define high-quality documentation using a small-yet-comprehensive set of clear and unambiguous information quality dimensions: Accurate, Easy to Understand, and Relevant. Each of these dimensions (together with the Accessible dimension) represents one of the Intrinsic, Representational, Contextual, and Accessibility information quality categories.
We can use these dimensions to classify and sort existing internal or external feedback—which can then be presented to management as clear indicators about the quality of the documentation—and to determine where more emphasis might need to be invested. We can also use them as unambiguous terminology for discussing and analyzing documentation needs with subject matter experts (SMEs) and other writers. This will ensure that everyone involved understands what readers want and how to get there, which should be the goal of all people involved in creating documentation.
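As a sketch of this classification idea (the keyword lists here are hypothetical, not part of the study), incoming feedback comments could be bucketed by quality dimension and tallied for reporting:

```python
# Sketch: sort raw reader feedback into quality dimensions by keyword.
# The keyword lists are hypothetical examples, not from the study.
from collections import Counter

DIMENSION_KEYWORDS = {
    "Accurate": ["wrong", "incorrect", "outdated"],
    "Easy to Understand": ["confusing", "unclear", "jargon"],
    "Relevant": ["irrelevant", "missing", "not applicable"],
}

def classify(comment):
    """Return the dimensions whose keywords appear in the comment."""
    text = comment.lower()
    return [dim for dim, words in DIMENSION_KEYWORDS.items()
            if any(w in text for w in words)]

feedback = [
    "The setup steps are outdated and step 3 is wrong.",
    "Too much jargon; the overview is confusing.",
    "This section is irrelevant to my configuration.",
]

tally = Counter(dim for c in feedback for dim in classify(c))
print(tally.most_common())
```

A real implementation would need a richer taxonomy (and probably human review of ambiguous comments), but even a rough tally like this turns anecdotal feedback into dimension-level counts that can be shown to management.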
The bad news is that we significantly underestimate how important documentation value is to our readers. We can address this by adopting strategies that put us more directly in the readers’ shoes:
- Thinking about how readers use documentation to make their lives easier;
- Realizing that there is an underlying emotional component to using documentation; and
- Carefully considering the context in which the documentation is used.
This “feel-good, make it worth my while” factor in documentation cannot be ignored. It is not enough for us to give our readers accurate, clear, and relevant information. We must also ensure that the information we give them enables them to feel that it was worth it for them to read what we have written.
References
Kumar, Manisha. “Difference Between Data and Information.” DifferenceBetween.net, http://www.differencebetween.net/language/difference-between-data-and-information/. Retrieved 7 November 2017.
Strimling, Yoel. “Beyond Accuracy: What Documentation Quality Means to Readers—Creating an Actionable Documentation Feedback Model Based on Meaningful Information Quality Dimensions.” Technical Communication (accepted for publication in 2019).
Wang, Richard, and Diane Strong. “Beyond Accuracy: What Data Quality Means to Data Consumers.” Journal of Management Information Systems 12.4 (1996).
YOEL STRIMLING (firstname.lastname@example.org) has been an editor for 20 years and currently works as the Senior Technical Editor/Documentation Quality SME for CEVA Inc. in Herzelia Pituach, Israel. Over the course of his career, he has successfully improved the content, writing style, and look and feel of his employers’ most important and most used customer-facing documentation by researching and applying the principles of documentation quality and survey design. Yoel is a member of tekom Israel, a Senior Member of STC, and the editor of Corrigo, the official publication of the STC Technical Editing SIG. You can reach him on Twitter at @reb_yoel.