By Adam Jones | Senior Member
New technology and strategies, from machine translation to tag-based formats, enable you to dramatically reduce translation costs.
At the 2017 STC Summit, we discussed “Avoiding the $36 Comma: Clever Editing Strategies Can Reduce Translation Costs.” You learned that revising your English documentation was causing your company’s translation costs to skyrocket as every little change needed to be replicated in dozens of target languages. Since then, STC members have done a fantastic job cutting down on needless editing, collectively saving an estimated $282,316,411!
Five years later, there are better ways to reduce localization costs even more dramatically.
Machine translation technology has improved dramatically in recent years. Neural machine translation systems abound and produce decent quality in many languages and subject areas.
Although the initial results are very good, they still require post-editing by human linguists. During this process, professional translators edit the text to remove errors, correct terminology, and refine style. This work requires proficiency in the source and target languages as well as knowledge of the subject matter. According to ISO 18587, the standard for post-editing of machine translation output, full post-editing should aim “to produce an output which is indistinguishable from human translation output.”
While full post-editing is ideal and more frequently used, light post-editing is an option for translations that need not be perfect and where style and readability are less critical. Content that will not be published, such as information for internal use only, can undergo this lower level of post-editing to ensure rough accuracy and comprehensibility with less effort and expense.
Well-written source content with the following characteristics results in higher initial machine translation quality and requires less post-editing:
- Free from spelling mistakes or typos
- Consistent in terminology and syntax
- In the active voice
- Composed of simple, concise sentences
As with translation and review, costs for post-editing vary by language, primarily due to the cost of living in the countries where translators work (Table 1). For example, a post-editor for Simplified Chinese may live in China and be paid a third of what a post-editor who focuses on Swedish and lives in Sweden earns.
The suitability of each language for machine translation affects the post-editing effort required and the resulting cost. Romance languages, such as Spanish, Italian, French, and Portuguese, are best suited to machine translation. The translated text will require some post-editing, but the software can do almost all the work. On the other hand, Chinese, Japanese, and Korean require much more human revision effort. These languages feature complex structures and have many vocabulary nuances that machine translation tools cannot capture. Languages with the same translation costs may also have different post-editing rates. For example, although translation and review costs are comparable for French and German, post-editing in German is more expensive because more work is required.
Interactive Translation Memory
Differing from machine translation, translation memory tools record the work of human translators. They store each text segment in a database, linking the source to the translation. Each target-language translation may be reused when the same source text appears in an update or another component.
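The store-and-reuse behavior described above can be illustrated with a minimal sketch. It is not how any particular commercial tool works: the `TranslationMemory` class, its method names, and the sample sentences are hypothetical, and it handles only exact matches, whereas real tools also score partial ("fuzzy") matches.

```python
class TranslationMemory:
    """Toy translation memory: exact-match lookup of previously translated segments."""

    def __init__(self):
        # Maps (target_language, source_segment) -> stored translation.
        self._segments = {}

    def store(self, source, target_lang, translation):
        """Record a human translation for one source segment."""
        self._segments[(target_lang, source.strip())] = translation

    def lookup(self, source, target_lang):
        """Return the stored translation, or None if the segment is new."""
        return self._segments.get((target_lang, source.strip()))

    def leverage(self, segments, target_lang):
        """Split segments into reusable matches and text that still needs translating."""
        reused, to_translate = {}, []
        for segment in segments:
            match = self.lookup(segment, target_lang)
            if match is None:
                to_translate.append(segment)
            else:
                reused[segment] = match
        return reused, to_translate


# Hypothetical sample data: one segment was translated in an earlier release.
tm = TranslationMemory()
tm.store("Press the power button.", "fr", "Appuyez sur le bouton d'alimentation.")

updated_manual = ["Press the power button.", "Wait for the light to turn green."]
reused, to_translate = tm.leverage(updated_manual, "fr")
```

Only the second sentence lands in `to_translate`; the first is served from memory, which is where the cost and consistency benefits come from.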
Translation memory offers the advantages of reducing costs, accelerating schedules, and improving consistency. Therefore, creating and using translation memory should be a requirement of the translation process regardless of the approach or toolset selected.
These tools are more advanced in 2022. Most work with translators interactively, allowing a team of linguists to benefit from their colleagues’ completed work. Unlike previous systems, which required leveraging text before the translation process began, current software takes advantage of the cloud to update the translation memory database immediately as translators complete each segment. These continuous updates allow you to benefit from memory matches as soon as they are available, within the same document or another block of content.
Translation memory provides discounts such as those detailed in Table 2.
Scrutinize the leveraging logs to ensure that your translators use translation memory effectively.
Subjective penalties for formatting differences and different strategies for calculating internal matches can increase costs. With interactive translation memory, you should see savings on every batch of content that contains text matching previously translated segments. At the outset of each project, a close examination of the translation memory logs will help you understand the savings and make ongoing improvements.
Tag-Based Content Formats
Traditional word processing and page layout applications require formatting each target language at a significant expense. Text expansion in many target languages increases the effort necessary, causing issues with pagination, line lengths, and formatting styles. Even if most of the text is leveraged from translation memory, all pages must be reformatted for every language.
Choosing a tagged file format (such as HTML or an XML-based structure like DITA) eliminates the need for manual formatting. The tags are maintained throughout the translation process, which isolates the text content. After translation, you will be able to generate the target-language output automatically to website pages, PDFs, or multiple formats.
This eliminates the formatting charge of $5–$12 per page, reducing both initial publication and update costs.
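To show how tags can isolate text during translation, here is a rough sketch using Python's standard `html.parser` module. The `TagPreservingTranslator` class, the `translate_markup` helper, and the sample markup are hypothetical, and the sketch ignores comments, declarations, and sentences split by inline tags, all of which production translation tools handle.

```python
from html import escape
from html.parser import HTMLParser


class TagPreservingTranslator(HTMLParser):
    """Copies tags through unchanged and swaps in translations for text nodes."""

    def __init__(self, translations):
        super().__init__(convert_charrefs=True)
        self.translations = translations  # source text -> target text
        self.output = []

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the tag exactly as written in the source.
        self.output.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.output.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.output.append(f"</{tag}>")

    def handle_data(self, data):
        text = data.strip()
        if not text:
            self.output.append(data)  # whitespace between tags
            return
        # Keep surrounding whitespace; fall back to the source text if untranslated.
        lead = data[: len(data) - len(data.lstrip())]
        trail = data[len(data.rstrip()):]
        translated = self.translations.get(text, text)
        self.output.append(lead + escape(translated, quote=False) + trail)


def translate_markup(markup, translations):
    parser = TagPreservingTranslator(translations)
    parser.feed(markup)
    parser.close()
    return "".join(parser.output)


# Hypothetical sample content and translations.
source = '<h1>Safety</h1><p class="note">Read the manual.</p>'
french = {"Safety": "Sécurité", "Read the manual.": "Lisez le manuel."}
result = translate_markup(source, french)
```

The element names and the `class` attribute pass through untouched, so the same stylesheet or publishing pipeline can render the French output without manual reformatting.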
With most documentation now distributed online, rich analytics can show which content end users read. You can use this data to select text for translation and determine the level of quality required.
- Frequently used content: Professional human translation and review
- Moderately used content: Machine translation and full post-editing
- Infrequently used content: Raw machine translation or machine translation and light post-editing
- Rarely used content: No translation
As information gains traction and more views, you can refine translations further. Disaggregating analytics by market enables you to select content for editing by target language. For example, a process referenced often in Japan could be translated or post-edited into Japanese even if it is seldom used in France and requires no French translation.
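The usage tiers and per-market selection described above could be expressed along these lines. This is only a sketch: the view thresholds, the `choose_approach` and `plan_by_market` names, and the sample analytics are assumptions for illustration, not recommended values.

```python
from enum import Enum


class Approach(Enum):
    HUMAN = "Professional human translation and review"
    MT_FULL = "Machine translation and full post-editing"
    MT_LIGHT = "Raw machine translation or machine translation and light post-editing"
    NONE = "No translation"


# Hypothetical monthly page-view thresholds; tune them to your own analytics.
THRESHOLDS = [
    (1000, Approach.HUMAN),
    (100, Approach.MT_FULL),
    (10, Approach.MT_LIGHT),
]


def choose_approach(monthly_views):
    """Map a view count to the highest tier whose threshold it meets."""
    for minimum, approach in THRESHOLDS:
        if monthly_views >= minimum:
            return approach
    return Approach.NONE


def plan_by_market(views_by_market):
    """Convert {topic: {language: views}} into {topic: {language: Approach}}."""
    return {
        topic: {lang: choose_approach(views) for lang, views in markets.items()}
        for topic, markets in views_by_market.items()
    }


# Hypothetical analytics: a topic read often in Japan but rarely in France.
views = {"Resetting the device": {"ja": 2400, "fr": 3}}
plan = plan_by_market(views)
```

With these sample numbers, the topic is routed to professional translation for Japanese and to no translation for French, mirroring the example in the text.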
Simultaneous translation of all or subsets of documentation into dozens of target languages used to be the norm. Now you can carefully curate which information is relevant for and appeals to each market.
Unnecessary Language Service Provider Fees
Some greedy language service providers attempt to charge unnecessary fees for routine work. You can typically negotiate these expenses.
Fees for applying and reviewing translation memory are one area of variability that can drive up expenses without adding value. In-context and exact matches from translation memory do not typically require review but can incur per-word fees of 10%–33% of the base translation cost. There is usually no need to review the reused translations for routine updates where most of the text has not changed. This review is only necessary in cases where over 50% of the documentation set has been rewritten, restructured, or expanded.
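To get a feel for what these match fees can add up to, consider the following back-of-the-envelope sketch. Only the 10% and 33% shares come from the text; the word count, the base rate, and the `match_review_fee` function are hypothetical, so substitute figures from your own quotes.

```python
from decimal import Decimal


def match_review_fee(matched_words, base_rate_per_word, fee_share):
    """Fee charged for reviewing words reused from translation memory."""
    return matched_words * base_rate_per_word * fee_share


MATCHED_WORDS = 40_000       # hypothetical: exact and in-context matches in an update
BASE_RATE = Decimal("0.20")  # hypothetical: USD per word for new translation

# Fee range per target language, using the 10% and 33% shares cited above.
low = match_review_fee(MATCHED_WORDS, BASE_RATE, Decimal("0.10"))
high = match_review_fee(MATCHED_WORDS, BASE_RATE, Decimal("0.33"))
```

Under these assumed inputs, reviewing unchanged text would cost between $800 and $2,640 per target language, which is the amount at stake when negotiating this fee.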
When reviewing translations, many organizations find areas where they prefer terminology or style updates, even in cases where no errors exist. Language service providers should incorporate one round of revisions at no charge, even if they are preferential or stylistic. It can be helpful to clarify this expectation at the outset of each project to avoid paying for the incorporation of review comments.
You should own your translation memory and be able to take it to a new language service provider or use it among several translation partners simultaneously. The translation memory is a valuable asset that you should insist on receiving after every large project or on a routine basis (such as annually) at no additional charge. You do not want to pay to retranslate content because you do not have a current and complete translation memory.
Many language service providers have robust systems for machine translation, translation memory, and translation management. Before investing in cloud or internal language technology for your organization, consider using your translation partner’s portal, often available at a much lower or even no cost. Relying on your vendor’s technology will also avoid implementation and integration hassles. By setting expectations up front, you should still be able to benefit from the accessibility, flexibility, and control you would find in an independent tool.
A Concluding Example
Combining machine translation and a tag-based content format will enable you to reduce initial translation costs dramatically. For example, Table 3 compares the cost estimates of translating a 250-page manual into six languages using old and new approaches.
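As a partial illustration, the formatting portion of the savings can be estimated from figures already given in this article. This sketch does not reproduce Table 3. It assumes the $5–$12 formatting charge applies to every page in every language and that each translated manual keeps the same 250-page length, which ignores text expansion.

```python
PAGES = 250                    # manual length from the example above
LANGUAGES = 6                  # target languages from the example above
FORMATTING_PER_PAGE = (5, 12)  # USD range quoted earlier in the article

# Assumption: the charge applies to each page in each target language.
low, high = (PAGES * LANGUAGES * rate for rate in FORMATTING_PER_PAGE)
```

On those assumptions, moving to a tag-based format would avoid roughly $7,500 to $18,000 in formatting charges for the initial release alone.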
Building and applying translation memory, using analytics to select meaningful content for translation, and avoiding unnecessary language service provider charges will allow you to benefit from additional ongoing savings.
Adam Jones (firstname.lastname@example.org) is the president and COO of SimulTrans, where he has spent 29 years helping companies take their products and content to the world with translations in over 100 languages. Adam graduated from Stanford University, where he studied public policy with an emphasis on education. He has been a proud and grateful member of the STC for decades.