Features

Assessing the Overall Quality of a Document Based on Editorial Comments

By Kumar Dhanagopal | Senior Member

Technical writers are often responsible for creating and maintaining multiple documents. In organizations where a formal editorial review is integral to the documentation process, technical writers who own multiple documents might need to address a huge volume of editorial input, often received late in the documentation cycle. What do all of those editorial comments, when taken as a whole, really mean in terms of the overall quality of the document? Lots of red ink might mean either that the document is in bad shape or that the editor loves to explain every comment, however minor, in great detail. On the other hand, a short comment buried on page 63 might turn out to be the single most important editorial value-add for the entire document!

The problem

Is there a way to summarize pages and pages of editorial review comments into a single meaningful quality metric? Such a metric would enable technical writers who own multiple documents to decide how much time they should set aside to revise each document that comes back from editorial review. The metric would also enable managers of technical publications departments to assess the overall quality of all the technical writers' first drafts in the department. Technical writers and their managers can collect the metric over time and use it to identify trends in writing quality and, where required, take corrective measures to improve the writing. In addition, the metric, when collected over a period of time and analyzed, could reveal anomalies and patterns in editorial behavior, which can be used to tune and strengthen the editorial process.

This article describes a process for objectively assessing the overall severity of editorial feedback for a document. The process can be used to distill hundreds of editorial review comments about a document into a single, quantitative metric indicating the overall quality of the document.

Why the usual methods do not work

First let us examine how technical writers typically try to gauge the overall quality of a document based on editorial comments. There are three methods:

  • Method 1: Count the total number of editorial comments.

  • Method 2: Request an overall quality rating from the editor.

  • Method 3: Judge the quality by reading the editorial comments.

Method 1 (count the total number of editorial comments) is easy to implement, but it does not help us assess the true quality of a document. It overlooks differences in severity between individual review comments: typos and grammatical errors, for example, are not as critical as missing or inconsistent content. It also ignores error intensity, the ratio of editorial comments to the size of the document in words or pages. A 500-word draft that received 50 editorial comments (0.10 comments per word) is perhaps of higher quality than a 250-word draft with 40 editorial comments (0.16 comments per word). Table 1 illustrates the pitfalls of method 1.

Comment Type                                                  Number of Comments
                                                              Document 1      Document 2
                                                              (100 pages)     (200 pages)

Typo                                                               10              16
Violation of style guideline or grammatical error                   5               5
Error in logic, missing content, or inconsistent content            6               2
Total number of comments                                           21              23

Table 1. Assessing the Relative Quality of Documents Based on the Number of Editorial Comments—An Example

A simple count of the review comments would suggest that document 2 (with 23 comments) is of lower quality than document 1 (with 21), but a closer look reveals that document 1 contains more of the critical errors, such as errors in logic and missing content. In addition, although document 2 received more editorial comments overall, those comments were spread across twice as many pages as document 1. So the absolute number of editorial comments cannot be used in isolation to infer the relative quality of a document.

Method 2 (request an overall quality rating from the editor) merely shifts the onus of assessing the overall quality of each document from the technical writer to the editor. Editors typically edit multiple documents pertaining to diverse projects and operate under tight schedules. It would be unfair to expect an editor to remember enough about each edited draft to be able to come up with an objective assessment of each document.

Method 3 (judge the quality by reading the editorial comments) is subjective, difficult, and time-consuming, and therefore impractical.

The proposed solution

We need a method that combines the ease of method 1 with the qualitative merits of methods 2 and 3. The ideal method should, at a minimum, fulfill the following requirements:

  • The method should yield an objective overall assessment of document quality based on editorial comments. Objectivity can be achieved only when the method takes into account the following parameters when assessing the quality of a document:

    • The relative severity of each individual editorial comment.

    • The error intensity: the number of errors per page or per 100 words. According to Donald S. Le Vie, this metric has a direct bearing on the quality of a document, primarily as measured internally.

  • The method should be reasonably easy to set up and use.

The severity of an individual editorial comment can be assessed based on the category of the underlying issue: typo, grammatical error, violation of a style guideline, error in logic, missing content, inconsistent content, and so on. As an editor marks up a draft, it should be a relatively simple task to annotate each comment to indicate its category. The annotation could be in the form of a special color or code for each category.

After the comments have been suitably annotated with the appropriate categories, the annotations should somehow be amalgamated into a single rating, preferably numerical, that indicates the overall quality of the text. For this, each category must first be associated with a specific numerical weight based on the relative severity of the category. Note that this is a one-time effort. Table 2 shows a sample scheme of weights:


Comment Category                      Weight

Spelling error                           1
Grammatical error                        2
Violation of a style guideline           3
Missing content                          4
Inconsistent content                     5
Error in logic                           5
Wrong content                            5

Table 2. A Sample Set of Weights Assigned to Comment Categories

The sample weights that I have assigned in Table 2 are based on my personal judgment of the relative importance of various categories of editorial issues. I have provided them here merely to illustrate how the proposed method for calculating an overall quality rating works. According to this scheme, missing content is twice as severe as a grammatical error and four times as severe as a spelling error. Similarly, wrong content is five times as critical as a spelling error. The weights that I have assigned to the comment categories are admittedly arbitrary, but they are essential to convert the qualitative nature of each category description into a quantitative measure that indicates its relative significance.
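
For readers who want to experiment with the scheme programmatically, a weight table like the one in Table 2 can be captured in a simple lookup structure. The following Python sketch is purely illustrative: the category names and weights merely mirror Table 2, and you would substitute the categories and weights that suit your own editorial process.

    # Illustrative mapping of comment categories to severity weights, mirroring Table 2.
    # The categories and values are examples only; adjust them to your own process.
    CATEGORY_WEIGHTS = {
        "spelling error": 1,
        "grammatical error": 2,
        "violation of a style guideline": 3,
        "missing content": 4,
        "inconsistent content": 5,
        "error in logic": 5,
        "wrong content": 5,
    }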

The next step is to multiply the number of comments in each category by the corresponding weight, sum the resulting values, and calculate the error intensity (weighted comments per page). Table 3 illustrates this calculation.

As we can see from Table 3, although document 1 has fewer editorial comments (21 in total) than document 2 (23), the weighted number of comments for document 1 is significantly higher at 69, indicating that it received a relatively high number of severe comments. In addition, in terms of error intensity, document 1 scores 0.69 weighted comments per page while document 2 scores only 0.28. When choosing which document to take up first for post-edit revision, the technical writer can objectively select document 1 because it has the higher error intensity.

Comment Type                        Weight    Document 1 (100 pages)      Document 2 (200 pages)
                                      (A)     Number of    Weighted       Number of    Weighted
                                              Comments     Number         Comments     Number
                                                 (B)        (A × B)          (C)        (A × C)

Spelling error                         1          5            5             10            10
Grammatical error                      2          3            6              4             8
Violation of a style guideline         3          2            6              3             9
Missing content                        4          3           12              2             8
Inconsistent content                   5          3           15              2            10
Error in logic                         5          3           15              1             5
Wrong content                          5          2           10              1             5
Total number of comments                         21           69             23            55
Error intensity                                       69/100 = 0.69               55/200 = 0.28

Table 3. Calculating the Relative Weighted Value of Editorial Comments—An Example
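
To make the arithmetic behind Table 3 concrete, here is a minimal Python sketch that computes the weighted number of comments and the error intensity for a document, given the number of comments in each category and the document's page count. The function and variable names are my own and do not belong to any existing tool; the weights repeat the illustrative values from Table 2.

    # A minimal sketch of the Table 3 calculation: weighted comment count and error intensity.
    CATEGORY_WEIGHTS = {
        "spelling error": 1,
        "grammatical error": 2,
        "violation of a style guideline": 3,
        "missing content": 4,
        "inconsistent content": 5,
        "error in logic": 5,
        "wrong content": 5,
    }

    def weighted_comment_count(comment_counts):
        # Multiply the number of comments in each category by its weight and sum the results.
        return sum(CATEGORY_WEIGHTS[category] * count
                   for category, count in comment_counts.items())

    def error_intensity(comment_counts, page_count):
        # Weighted comments per page.
        return weighted_comment_count(comment_counts) / page_count

    # Document 1 from Table 3: 21 comments, weighted value 69, intensity 0.69.
    document_1 = {
        "spelling error": 5,
        "grammatical error": 3,
        "violation of a style guideline": 2,
        "missing content": 3,
        "inconsistent content": 3,
        "error in logic": 3,
        "wrong content": 2,
    }
    print(weighted_comment_count(document_1))   # 69
    print(error_intensity(document_1, 100))     # 0.69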

The method proposed in this article will work only in an electronic review workflow that supports the following activities:

  1. Technical writers submit documents for review electronically, preferably in a Web-based environment.

  2. Editors post context-specific comments, also electronically.

  3. While posting comments, editors assign a severity level number to each editorial comment. Each severity level maps, internally, to a specific comment category.

  4. The system calculates, dynamically, a weighted number of editorial comments based on the number of editorial comments at each severity level.

Some companies have already developed proprietary solutions that support steps 1, 2, and 3 of the above workflow. Oracle, for example, uses an internally developed Web-based tool that allows technical writers to submit drafts for parallel review by multiple reviewers. The reviewers can post comments online, respond to comments by other reviewers, and assign a priority level to each comment. Converting the priority assignments to an overall rating of the document (step 4 in the above process) is the logical next step. Once that step is automated, technical writers who have to deal with and prioritize numerous editorial comments on multiple documents can quickly assess which specific pieces of text (sections, chapters, or documents) they should devote their attention to.
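
Step 4 of such a workflow could be automated with very little code. The sketch below assumes a hypothetical export from the review tool in which each comment carries a severity level; it maps each level to a weight and then computes the overall rating. The data format, field names, and severity-to-weight mapping are my assumptions, not features of Oracle's tool or any other product.

    # A sketch of step 4: turning individually annotated comments into an overall rating.
    # Severity levels and their weights are illustrative; a real tool would define its own.
    SEVERITY_WEIGHTS = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}   # severity level -> weight

    def document_rating(comments, page_count):
        # Sum the weights of all comments, then divide by page count to get the error intensity.
        total = sum(SEVERITY_WEIGHTS[comment["severity"]] for comment in comments)
        return total, total / page_count

    # Hypothetical comments exported from a review tool.
    comments = [
        {"page": 3, "severity": 1, "text": "Typo: 'recieve'"},
        {"page": 12, "severity": 4, "text": "Prerequisites section is missing"},
        {"page": 63, "severity": 5, "text": "Step 7 contradicts the overview"},
    ]
    weighted_total, intensity = document_rating(comments, page_count=100)
    print(weighted_total, intensity)   # 10 0.1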

A few words of caution

Any metric of the kind described in this article ("weighted number of review comments" and "error intensity") is, at best, a numerical approximation of an abstract value. Such metrics can never replace the original abstract values because the transformation process involves assumptions and, to a certain extent, personal bias.

For example, while assigning weights to the comment categories—1 for typos, 4 for missing content, and so on (Table 2)—I injected my personal judgment that missing content is four times as severe as a spelling error and twice as severe as a grammatical error, such as a missing preposition. In some situations, a misspelling or an omitted word (the word "not," for example) changes the meaning of the text completely. In addition, from a localization perspective, a spelling error might be more critical than missing content. Geoffrey Hart notes that each company and context has unique characteristics that change how you look at the review process.

The work of setting up a process for assessing documentation quality based on editorial review comments does not end with defining comment categories, assigning weights, and designing an electronic workflow for capturing review comments and calculating results. The data generated by the process over a period of time across several documentation projects should be stored, and the stored data should be analyzed. For example, the data might show that one editor consistently posts a relatively high proportion of comments on style violations. Further analysis might reveal that this editor focuses the editorial review on only style-related issues and does not actively look for errors, such as inconsistent content. In this situation, something has to change: if the seemingly anomalous behavior of the editor is what the company expects as the standard, then the weight for style-related comments should be increased; if, on the other hand, the particular editor's behavior is indeed an anomaly, the editor should be asked to alter focus suitably.
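
As a simple example of the kind of analysis that the stored data makes possible, the following sketch computes, for each editor, the proportion of comments that relate to style guidelines; a consistently high proportion for one editor might prompt the follow-up described above. The record format is an assumption made for illustration.

    # A sketch of one possible analysis of accumulated review data:
    # the share of style-related comments posted by each editor.
    from collections import defaultdict

    def style_comment_share(records):
        # records: an iterable of dicts with "editor" and "category" keys (an assumed format).
        totals = defaultdict(int)
        style = defaultdict(int)
        for record in records:
            totals[record["editor"]] += 1
            if record["category"] == "violation of a style guideline":
                style[record["editor"]] += 1
        return {editor: style[editor] / totals[editor] for editor in totals}

    records = [
        {"editor": "Editor A", "category": "violation of a style guideline"},
        {"editor": "Editor A", "category": "violation of a style guideline"},
        {"editor": "Editor A", "category": "spelling error"},
        {"editor": "Editor B", "category": "inconsistent content"},
    ]
    print(style_comment_share(records))   # approximately {'Editor A': 0.67, 'Editor B': 0.0}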

Kumar Dhanagopal (kumards_99@yahoo.com) is a technical writer working for Oracle. He is currently enrolled in the technical communication master's program at Utah State University and is a senior member of the STC India Chapter.

Suggested Reading

Hart, Geoffrey. “Designing an Effective Review Process.” Intercom (July/August 2006): 18–21.

Le Vie, Donald S., Jr. “Documentation Metrics: What Do You Really Want to Measure?” Intercom (December 2000): 6–9.

Rude, Carolyn D. Technical Editing. Fourth Edition. New York: Pearson Longman, 2006.

Wilde, Elizabeth, et al. “Defining a Quality System: Nine Characteristics of Quality and the Editing for Quality Process.” Technical Communication 53.4 (December 2006): 439–46.