When Is a Sentence a Sentence and a Frog a Frog?

By Arle Lommel | Guest Columnist

Standard Deviation is a column all about standards—a subject that affects most of our lives, but that we seldom think about. As the title implies, I want to keep the conversation lively and engaging. I’m always looking for guest columnists, and we welcome feedback with comments or requests for standards-related topics to cover. Email Columnist Ray Gallon at

As a technical author, you may not normally think about readers in other countries. You may not even know where your readers will live or what languages they will speak. Nevertheless, writers need to consider how people outside their home country will use their texts. Common Sense Advisory estimates that far less than 1% of all content is translated, but, as technical writers, you are producing particularly valuable content that often ends up used around the world. Seemingly minor decisions can have a big impact when your content is translated into 10, 20, or even 150 languages.

Translation, like most fields, has many standards and some of them—let’s be honest—make drying paint seem exciting by comparison. So why should you learn about them and even get involved in creating them? The short answer is that knowing about them can make you a better writer, improve the value of what you write, and make life easier for the poor translators who have to spin gold from what is too often straw. But the people who create these standards often lack meaningful input from authors, and so may make decisions that make your life difficult.

Well-meaning authors can end up creating nightmares for translators. To understand why, we need to consider three standards that translators commonly work with:

  1. XML Localization Interchange File Format (XLIFF) is a standard format for representing content that needs to be translated. Chances are that any text you write will end up in XLIFF at some point.
  2. Translation Memory eXchange (TMX) represents text that has already been translated so it can be automatically retrieved and reused.
  3. TermBase eXchange (TBX) represents information about how terms should be used and translated.

These standards all interact with what you write in various ways. First off, when your content management system sends text off to translators in XLIFF format, the system slices and dices the text into sentences. If you happen to have used hard returns to force line breaks, a single sentence may end up split in two, as follows:

  • Sentence 1: “Technical writing is terribly”
  • Sentence 2: “difficult to teach effectively.”

Translators who encounter these fragments may not see them in context. Making their best guess, they might translate Sentence 1 as “Technical writing is terrible” and Sentence 2 as “Effective teaching is difficult.” To address this problem, avoid manual “fixes” based on appearance and instead use margins or paragraph formatting to achieve the look you want.
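To make the mechanism concrete, here is a minimal Python sketch of a naive segmenter. It is not how any particular content management system or XLIFF tool works; the function name `segment` and its rules are assumptions made for illustration. It simply treats every hard return as a boundary and then splits on sentence-final punctuation.

```python
import re

def segment(text: str) -> list[str]:
    """Hypothetical naive segmenter: hard returns and sentence-final
    punctuation both end a segment."""
    segments = []
    for line in text.split("\n"):
        # Within a line, break after ., ! or ? when whitespace follows.
        for part in re.split(r"(?<=[.!?])\s+", line.strip()):
            if part:
                segments.append(part)
    return segments

# The same sentence, once with a hard return and once without.
hard_wrapped = "Technical writing is terribly\ndifficult to teach effectively."
soft_wrapped = "Technical writing is terribly difficult to teach effectively."

print(segment(hard_wrapped))  # two fragments
print(segment(soft_wrapped))  # one complete sentence
```

With the hard return, the translator receives two fragments; without it, the sentence arrives whole.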

Another problem is that writers working in formats like DITA or S1000D may not even know how text will be rendered or where it will appear. Because these formats encourage reuse of text in multiple environments, chunks may lose context in ways that cause real problems for translators. For example, in many languages the way you translate “it” and other pronouns depends on what they refer to. If a chunk starts with “It is an important task,” and it is not clear what “it” refers to, the translator may not be able to provide a correct translation or may provide one that works in one context but not another.

The problem is compounded when translators use TMX to reuse a previous translation that made sense in a different context. You can solve this problem by writing chunks and sentences to be complete in themselves (e.g., “Technical writing is an important task”). Taking this approach not only makes translation easier but also helps readers who speak English as a second language or who have difficulty reading.
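One rough way to catch such chunks before they reach a translator is to flag any chunk that opens with a pronoun. The Python sketch below is a simple heuristic, not a feature of DITA, S1000D, or TMX. The word list and the function name `needs_context` are assumptions, and words such as “this” will sometimes be flagged when they are harmless, so treat the result as a prompt for human review.

```python
# Hypothetical, deliberately short list of openers that usually need an antecedent.
AMBIGUOUS_OPENERS = {"it", "this", "that", "they", "these", "those"}

def needs_context(chunk: str) -> bool:
    """Return True if a reusable chunk starts with a word from the list above."""
    words = chunk.strip().split()
    if not words:
        return False
    # Drop leading quotation marks or brackets before comparing.
    first = words[0].strip("\"'“”‘’(").lower()
    return first in AMBIGUOUS_OPENERS

print(needs_context("It is an important task."))                 # flagged
print(needs_context("Technical writing is an important task."))  # not flagged
```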

Finally, when translating a term like frog, translators need to know what it refers to. For example, they would generally translate frog as Frosch in German, but, in a text about railroads, frog is a term for the place where two railroad tracks cross and should be translated as Herzstück (literally “heart piece”). Using Frosch in this context would leave German readers totally puzzled. A good terminology management system coupled with a standard such as TBX helps you know what terms to use consistently to prevent confusion.

As you educate yourself about localization standards and processes, you will discover ways to improve how you write for international audiences, ones that often also make you a better writer in general. But you can also provide input into the standardization process, by getting involved either personally or through an organization like STC. Doing so will allow you to make a difference for writers and users around the world.

ARLE LOMMEL is a senior analyst with Common Sense Advisory. He is a recognized expert in quality processes and interoperability standards. Arle’s research focuses on technology, quality assessment, and interoperability. He has been actively involved in standardization efforts for translation since 1998.

Download the July/August 2017 PDF
