Want Personalization? Start with Content Engineering

By Lisa Trager

There’s a lot of talk about personalization these days. What most of these conversations skip are the foundational content engineering elements that will enable the content to be personalized.

After 20 years of working with enterprise content, within large agencies, as a solo consultant, and deep in the trenches of Fortune 50 companies, I am convinced that the only viable path to personalization involves building a strong practice of content engineering. Here’s how my thinking has evolved.

Discovering the Magic of XML for Personalization

I’ll never forget the first time I was exposed to XML. In the late 1990s, I worked as an Information Architect and Content Analyst for an agency that specialized in intranets for Fortune 500 companies. One project was related to creating a good customer experience for employees who used the intranet for HR-related information. With the magic of XML, we could provide personalized information about health benefits for over 50,000 employees based on their role, department, and location. It’s memorable, because at the time, it was so new, and the potential was mind-blowing.

While working at top New York agencies, I worked as a Content Strategist, a role that was mostly limited to thinking about marketing content for large corporate clients. The team of content strategists I managed and I spent much of our time in spreadsheets auditing content and creating content matrices. I remember trying to introduce the idea of creating a content model, tagging for personalization, and structuring the content for reuse. No way. Applying content engineering best practices in an agency environment was unwelcome and, frankly, not understood. We were tasked to only “do content strategy.” Doing anything besides marketing and copywriting was out of scope and certainly not included in the client’s budget. Even when my suggestions would have ultimately saved the client time, resources, and budget, there wasn’t even an opportunity to pitch them. I finally decided to accept a full-time opportunity on the client side: a chance, I thought, to make a bigger difference.

Working Inside a Giant Content-Driven Enterprise

Fast forward 20 years after my first exposure to the personalization offered by XML, this time working on staff at a complex, 100,000-person company. I was recruited to be the “change-maker” who could work out how to manage over 40,000 pages of support content. Front and center: the challenge of fixing a myriad of content problems. And, wow, the problems were huge.

Challenges included:

  • Content issues: Poor formatting, repetitiveness, limited personalization, an ever-growing amount of content.
  • Technical issues: Two separate, incompatible content management systems; lack of structure; lack of tagging; content not optimized for search; inability to publish relevant content across platforms (omnichannel publishing) to the Web, app, and chatbot.
  • Organizational issues: Separate departments that “own” the content, territorialism, lack of will or interest in seeking viable solutions that go beyond “business as usual,” fear of change that might bring risk to their annual performance review and associated bonus.

A viable solution that would have taken the organization from crawling to running within two years was within reach, and it included everything from creating a taxonomy to considering a Content-as-a-Service (CaaS) platform. For this company and other large organizations, the investment needed to put a working content engineering plan with content reuse in place would have been negligible compared to the long-term costs of creating and maintaining content in a manually driven system.

The Value of Content Reuse

In addition to the hard and soft savings in time and resources, implementing foundational elements, new processes, and new approaches improves the customer experience, increases Net Promoter Scores (NPS), and prevents churn, all of which are worth gold in today’s customer-centric business world. According to a 2018 Forrester report:

The average Fortune 500 company is spending millions of dollars in waste on content operations: Teams are weighed down with inefficient workflows, manual workarounds are still failing to deliver a next generation customer experience, and things continue to get worse as content volume expands (Forrester Consulting 2018).

In DITA Metrics 101 (Lewis 2012), Mark Lewis lays out cost models that compare scenarios ranging from simple topics to more complex projects that use filtering, structured authoring, content reuse, and multiple translations. By defining the cost basis for each unit and then comparing the costs of traditional publishing with those of DITA or content reuse, the models demonstrate the significant savings possible.

Because so many processes are tied to other processes, the savings can have a compound effect. For example, content reuse during the content development process results in savings that cascade in the publishing and translation processes (Lewis 2012).
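
The compounding effect is easier to see with a back-of-the-envelope calculation. The sketch below is not one of Lewis’s models, and every figure in it (unit costs, topic counts, the 30 percent duplication rate) is a hypothetical placeholder. It only illustrates the mechanism: a topic that is written once and reused is also translated once per language.

```python
from dataclasses import dataclass


@dataclass
class CostAssumptions:
    """Hypothetical unit costs; replace with your own figures."""
    write_per_topic: float = 400.0      # author and review one topic
    translate_per_topic: float = 150.0  # translate one topic into one language


def total_cost(unique_topics: int, languages: int, c: CostAssumptions) -> float:
    """Cost to write each unique topic once and translate it into every language."""
    writing = unique_topics * c.write_per_topic
    translation = unique_topics * languages * c.translate_per_topic
    return writing + translation


costs = CostAssumptions()

# Without reuse: 1,000 published topics, each written and translated separately.
without_reuse = total_cost(unique_topics=1000, languages=5, c=costs)

# With reuse: assume 30% of those topics are duplicates that collapse into shared topics.
with_reuse = total_cost(unique_topics=700, languages=5, c=costs)

savings = without_reuse - with_reuse
print(f"Without reuse: ${without_reuse:,.0f}")
print(f"With reuse:    ${with_reuse:,.0f}")
print(f"Savings:       ${savings:,.0f}")
```

Under these made-up assumptions, the 300 topics that no longer need to be written save $120,000, and not translating them into five languages saves a further $225,000. The downstream saving is larger than the one at the authoring step, which is the cascade described above.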

Building Blocks

Based on my experience working in an assortment of environments, the following are the ingredients that will help any organization build a strong foundation for managing complex and ever-growing content challenges.

Shared Taxonomy


Having an official taxonomy helps everyone from writers to IT understand the hierarchy of topics. The nodes would not only cover sales information about products and accessories; they can also inform how related support and service content is organized.

A shared taxonomy is the only way we can share the relevant tasks, product hierarchy, features, time, place in the user journey, and all the other facets that align content with specific user intents.


You build a taxonomy to support personalization use cases: it defines how topics relate to users and where content might be most useful in the customer journey. Using a “mad-libs” approach for a child nutrition example, the taxonomy might break down the audience (moms), the age of the infant, the product type, and the product features: “As a new mom, I’m looking for the right formula to feed my newborn infant.”
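
One way to picture the mad-libs approach is as a set of facets, each with a controlled list of terms, plus a sentence template that the facets fill in. The following is a minimal sketch; the facet names and terms are invented for illustration, and a real taxonomy would be far larger and managed in dedicated software.

```python
# Hypothetical facets and terms for the child-nutrition example.
TAXONOMY = {
    "audience": {"new mom", "experienced mom"},
    "infant_age": {"newborn", "6-month-old", "12-month-old"},
    "product_type": {"formula", "cereal"},
    "product_feature": {"organic", "sensitive", "iron-fortified"},
}

TEMPLATE = (
    "As a {audience}, I'm looking for the right {product_type} "
    "to feed my {infant_age} infant."
)


def validate(intent: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means every term is in the taxonomy."""
    problems = []
    for facet, term in intent.items():
        if facet not in TAXONOMY:
            problems.append(f"unknown facet: {facet}")
        elif term not in TAXONOMY[facet]:
            problems.append(f"unknown term for {facet}: {term}")
    return problems


def to_statement(intent: dict[str, str]) -> str:
    """Fill the mad-libs template from an intent."""
    return TEMPLATE.format(**intent)


intent = {"audience": "new mom", "product_type": "formula", "infant_age": "newborn"}
print(validate(intent))
print(to_statement(intent))
```

Writing intents this way makes gaps visible early: if stakeholders propose a user statement that cannot be expressed with the existing facets, that is a signal the taxonomy needs another term or facet.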

The best way to complete a complicated taxonomy for a brand with lots of products and use cases is to hire a taxonomist. An expert with the experience to put into place a taxonomy that will have the biggest bang for the buck will not only save time but help avoid politics.

Building a shared taxonomy requires lots of input from various stakeholders and subject-matter experts. Don’t try to boil the ocean. Think about what will have the most impact and return on investment (ROI). For example, in my last endeavor, I recommended that the taxonomy not include products, but relate more to the topics and terminology for service and support content.

I’m also a big fan of using dedicated taxonomy management software. For large sites, there is no way to govern or maintain a spreadsheet with hundreds of thousands of terms and ontological relationships.

Semantic Tagging


Tagging content based on meaning enables bots to better find and serve the right content to the right person at the right time. Semantic tagging enables the dynamic presentation of content that is defined by rules and logic.


Semantic tagging hinges on the taxonomy you create, which defines the:

  • Audience
  • Topic
  • Subtopic
  • Intended use
  • More
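
To make “rules and logic” concrete, here is a small sketch of tag-based selection. The class, tag names, and sample titles are all hypothetical, and the matching rule is deliberately the simplest one possible: an item is served when every tag it carries agrees with what is known about the user.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    title: str
    tags: dict[str, str] = field(default_factory=dict)


def matches(item: ContentItem, context: dict[str, str]) -> bool:
    """An item matches when every tag it carries agrees with the user's context.

    Facets the item does not tag are treated as "applies to everyone".
    """
    return all(context.get(facet) == value for facet, value in item.tags.items())


def select(items: list[ContentItem], context: dict[str, str]) -> list[ContentItem]:
    """Return matching items, most specific (most tags) first."""
    hits = [item for item in items if matches(item, context)]
    return sorted(hits, key=lambda item: len(item.tags), reverse=True)


library = [
    ContentItem("Choosing a formula", {"topic": "feeding"}),
    ContentItem("Formula for newborns", {"topic": "feeding", "audience": "new mom"}),
    ContentItem("Toddler snack ideas", {"topic": "snacks", "audience": "experienced mom"}),
]

context = {"audience": "new mom", "topic": "feeding", "intended_use": "purchase"}
for item in select(library, context):
    print(item.title)
```

Nothing in the selection logic knows about formula or moms. The personalization comes entirely from the tags, which is why the quality of the taxonomy behind them matters so much.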

In addition, many people use schema.org for structured data available to search engines. Developed by Google, Microsoft, Yahoo, and Yandex, schema.org uses an open process based on community input to keep microtagging vocabularies relevant and to meet the needs of the wider developer community.
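
As an illustration of schema.org markup, the sketch below serializes a short procedure as a HowTo in JSON-LD. HowTo and HowToStep are schema.org types; the function name and the sample steps are made up for this example.

```python
import json


def howto_jsonld(name: str, steps: list[str]) -> str:
    """Serialize a procedure as schema.org HowTo markup in JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "step": [{"@type": "HowToStep", "text": text} for text in steps],
    }
    return json.dumps(data, indent=2)


markup = howto_jsonld(
    "Connect to Wi-Fi",
    ["Open Settings.", "Tap Wi-Fi.", "Choose your network and enter the password."],
)
print(markup)
```

On a web page, output like this is typically embedded in a script element of type application/ld+json so that search engines can read it alongside the visible content.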

Content Reuse


As mentioned previously, the financial savings from content reuse include the production costs of creating and maintaining the content across platforms and with each translation. The other benefit is the increased trust that comes with consistent messaging and information, regardless of where your customers are.

Having a content reuse strategy in place is particularly relevant when it comes to support or technical content. When you think of support content for a telecom, how many ways can you give instructions to find Wi-Fi, add a contact to an address book, or even activate a device? With content reuse, rather than having hundreds of files for one topic, one can have one file with exceptions that address unique requirements, thereby making it easier to update and publish.
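
The “one file with exceptions” idea can be sketched in a few lines. The device names, variable names, and steps below are hypothetical; the point is that the shared procedure lives in one place and each device contributes only what is different about it.

```python
# One shared procedure; placeholders mark the parts that vary.
BASE_TOPIC = {
    "title": "Connect to Wi-Fi",
    "steps": [
        "Open {settings_app}.",
        "Tap Wi-Fi.",
        "Choose your network and enter the password.",
    ],
}

# Hypothetical per-device exceptions: variable values and optional extra steps.
DEVICE_EXCEPTIONS = {
    "phone-a": {"vars": {"settings_app": "Settings"}},
    "phone-b": {
        "vars": {"settings_app": "Quick Settings"},
        "extra_steps": ["Turn off Airplane mode if it is on."],
    },
}


def render_steps(device: str) -> list[str]:
    """Build the steps for one device from the shared topic plus its exceptions."""
    exception = DEVICE_EXCEPTIONS[device]
    steps = [step.format(**exception["vars"]) for step in BASE_TOPIC["steps"]]
    return exception.get("extra_steps", []) + steps


for line in render_steps("phone-b"):
    print(line)
```

If the wording of a shared step changes, it changes once in the base topic and every device picks it up on the next publish.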


Darwin Information Typing Architecture (DITA) is an XML data model designed for efficiency in writing and, most importantly, ease of reading instructional information. The major categories used to map the content include:

  • Tasks: Used to describe how to perform a procedure
  • Concepts: Descriptive information to help the user understand the background and context of a subject
  • References: Topics that provide detailed facts
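
To show what a task topic looks like, the sketch below embeds a simplified DITA-style task and filters its steps by DITA’s platform attribute. The task content and the platform values are invented, and the filtering function is only an illustration: real DITA toolchains usually apply conditions through DITAVAL filter files at publish time rather than through custom code.

```python
import xml.etree.ElementTree as ET

# A simplified DITA-style task. The platform values are made up for this example.
TASK_XML = """
<task id="add-contact">
  <title>Add a contact</title>
  <taskbody>
    <steps>
      <step><cmd>Open the Contacts app.</cmd></step>
      <step platform="phone-a"><cmd>Tap the plus icon.</cmd></step>
      <step platform="phone-b"><cmd>Tap Menu, then New contact.</cmd></step>
      <step><cmd>Enter the name and number, then tap Save.</cmd></step>
    </steps>
  </taskbody>
</task>
"""


def steps_for(platform: str, xml_text: str = TASK_XML) -> list[str]:
    """Return the commands that apply to one platform.

    Steps with no platform attribute apply everywhere; others must match.
    """
    root = ET.fromstring(xml_text.strip())
    commands = []
    for step in root.iter("step"):
        target = step.get("platform")
        if target is None or target == platform:
            commands.append(step.findtext("cmd"))
    return commands


print(steps_for("phone-a"))
```

The same source file yields a correct procedure for each device, which is the kind of reuse with exceptions that support content benefits from.
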

Structured Content


For omnichannel to work, you need both the technology and the strategy to implement content reuse, which not only saves time and resources but helps ensure consistency. Clear structure provides the scaffolding and gives writers direction, along with all of the advantages of form-based writing. Structure helps authors focus and cuts out anything not immediately relevant to the topic.

Structured content is future-proof content. As the market continues to rapidly create new unknown devices, platforms, and channels, the only way to enable content to remain accessible is by separating the message from the format. When embedded in HTML, content is locked into format.


At the heart of structured content are components that can be reused across multiple platforms and/or channels. Through the use of a component content management system (CCMS), content is separate from format, enabling the content to look right regardless of the output. Created as smaller chunks, these modules are also free from being tied to one long document, so the modules can be used individually or combined as needed to fit different use cases.
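
Separating the message from the format can be sketched as one format-free component with several renderers. The field names and sample text are illustrative assumptions, not a prescribed content model.

```python
import html
import json
from dataclasses import dataclass, asdict


@dataclass
class Component:
    """A format-free content chunk (field names are illustrative)."""
    title: str
    summary: str
    steps: list[str]


def to_html(c: Component) -> str:
    """Render for the Web."""
    items = "".join(f"<li>{html.escape(s)}</li>" for s in c.steps)
    return f"<h2>{html.escape(c.title)}</h2><p>{html.escape(c.summary)}</p><ol>{items}</ol>"


def to_chat(c: Component) -> str:
    """Render as plain text for a chatbot."""
    lines = [c.title, c.summary]
    lines += [f"{n}. {s}" for n, s in enumerate(c.steps, start=1)]
    return "\n".join(lines)


def to_json(c: Component) -> str:
    """Render as JSON for an app or API consumer."""
    return json.dumps(asdict(c))


activate = Component(
    title="Activate your device",
    summary="Activation takes a few minutes.",
    steps=["Insert the SIM card.", "Turn on the device.", "Follow the on-screen prompts."],
)
print(to_chat(activate))
```

Supporting a new channel means adding a renderer, not rewriting the content. Content authored directly as HTML offers no such option.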

Content-as-a-Service (CaaS)


Sometimes it just doesn’t make sense to force a working group to change the system they are using for content creation. For example, in one case, AEM (Adobe Experience Manager) was the primary CMS for public-facing content. Other departments, however, were using another CMS specific to the needs of internal agents and reps, which was not compatible with AEM. Each system had benefits, but neither system had content models or a consistent way of structuring the content itself. In today’s omnichannel world, having one content model (e.g., the [A] Master Content Model) applied across disparate content makes it possible for systems to work together and gives multiple teams access to work on the content, either together or independently. Making that unified content available via API enables us to personalize the content and to provide an enhanced customer experience.


Although we hear a lot about “one source of truth,” in reality that is usually not viable in large enterprises. That said, what is not only doable, but critical, is creating content models that can be shared across systems. Any headless platform, including AEM, that uses APIs can participate in a shared CaaS environment. The payoff is seamless integration of content, which improves workflow and content creation and, best of all, makes it possible to publish across platforms, channels, and print.
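
A shared content model across two systems might be sketched as follows. Everything here is hypothetical: the model’s fields, the two source record shapes, and the response format are invented for illustration and do not reflect AEM’s or any other product’s actual data structures or APIs.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SharedContent:
    """Hypothetical shared content model that both systems map into."""
    id: str
    title: str
    body: str
    audience: str
    source_system: str


def from_public_cms(record: dict) -> SharedContent:
    """Map a record from the public-facing CMS (field names are invented)."""
    return SharedContent(
        id=record["path"],
        title=record["pageTitle"],
        body=record["text"],
        audience="customer",
        source_system="public-cms",
    )


def from_agent_cms(record: dict) -> SharedContent:
    """Map a record from the internal agent CMS (field names are invented)."""
    return SharedContent(
        id=str(record["article_id"]),
        title=record["headline"],
        body=record["content"],
        audience="agent",
        source_system="agent-cms",
    )


def api_response(items: list[SharedContent], audience: str) -> str:
    """What a delivery endpoint might return for one audience."""
    selected = [asdict(i) for i in items if i.audience == audience]
    return json.dumps({"items": selected})


items = [
    from_public_cms({"path": "/support/wifi", "pageTitle": "Connect to Wi-Fi", "text": "Open Settings."}),
    from_agent_cms({"article_id": 42, "headline": "Wi-Fi troubleshooting script", "content": "Ask for the device model."}),
]
print(api_response(items, audience="agent"))
```

Neither team has to abandon its authoring system. Each system keeps its own storage, and only the mapping into the shared model has to be agreed on and maintained.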

Content Operations and Governance


Content operations is an emerging function in corporate organizations and is still rarely understood. Its purpose is to bring together staff with subject matter expertise in everything related to the production of content, such as taxonomists, information architects, content strategists, and content engineers. As Rahel Bailie explains, the working definition of Content Ops is:

A set of principles that results in methodologies intended to optimize production of content, and allow organizations to scale their operations, while ensuring high quality in a continuous delivery pipeline, to allow for the leveraging of content as business assets to meet intended goals (2018).

Another benefit of having a content ops group is that it can be the perfect vehicle for providing content governance, especially in large organizations, where it’s often more like the wild west, and rules or guidelines are either unknown or not followed outside the siloed unit that wrote them.

Having a centralized group that can apply manual labor, automation, and software to ensure observance of guidelines has multiple benefits:

  • Faster content creation and time to market
  • Ability to create content at scale
  • Improved quality
  • Greater consistency in tone, voice, terminology, and information
  • Reduced legal risks
  • Greater adherence to branding and regulations
  • Improved customer experience due to personalization strategy
  • Automation and a move toward AI


Develop a team with expertise in taxonomy, tagging, content reuse, DITA, and information architecture. Consider software like Acrolinx, which works like a virtual editor, applying the governance and rules defined by content ops to any writer in the company who has the software installed on their computer.
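
To give a feel for what automated governance means at its simplest, here is a toy terminology check. It is not how Acrolinx works and does not use its API; the rules and the function name are invented purely to illustrate applying centrally defined guidelines to any writer’s draft.

```python
import re

# Hypothetical terminology rules: discouraged term -> preferred term.
TERMINOLOGY = {
    "wifi": "Wi-Fi",
    "log-in": "log in",
    "cell phone": "mobile phone",
}


def check_terms(text: str) -> list[str]:
    """Return one message per discouraged term found in the text."""
    findings = []
    for bad, preferred in TERMINOLOGY.items():
        pattern = rf"\b{re.escape(bad)}\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            findings.append(f'Use "{preferred}" instead of "{bad}".')
    return findings


draft = "To use wifi on your cell phone, open Settings."
for message in check_terms(draft):
    print(message)
```

The value of a content ops group is less in the check itself than in owning the rule list, so that every department is held to the same terminology.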

Content governance includes addressing everything from adding efficiencies to content workflows, to creating content strategy and editorial guidelines, taxonomies, and archiving standards.

Lessons Learned

The change plan is still very much a work in progress. Big changes in enterprise content lifecycles take big effort over long periods of time, and change is hard. Given the realities of corporate culture, what I experienced isn’t unique. Until we can address both the rational and emotional reasons behind the resistance to change, the benefits that content engineering promises will remain an ideal discussed at conferences or written about in books read by content engineering and content strategy insiders, and ignored by corporate decision-makers.

Companies working on transformation of content processes and systems need to address silos, territorialism, fear, lack of knowledge, constant reorganizations, and lack of accountability.

Without a commitment to making the necessary changes in process, roles, and technology, progress toward addressing ever-mounting content challenges will at best remain an uncertainty and at worst become a growing burden. Without a real change in the mindset of executives around prioritizing foundational changes and the need for content engineering best practices, creating ideal, personalized customer experiences will continue to be a white whale.


“A Guide to Mastering the Master Content Model.” [A]. October 2018. https://simplea.com/Publications/Whitepapers/master-content-model.

Bailie, Rahel. “Principles, Not Prescriptions: The Coming of Age of ContentOps.” October 2018. https://gathercontent.com/events/principles-not-prescriptions-the-coming-of-age-of-contentops.

Lewis, Mark. DITA Metrics 101: The Business Case for XML and Intelligent Content. The Rockley Group: Schomberg, ON, 2012.

Morse, Stephen. “When to use the Topic, Concept and Task types in DITA.” EasyDita by Jorsek. 31 July 2014. https://easydita.com/when-to-use-the-topic-concept-and-task-types-in-dita/.

“Today’s Content Supply Chains Prevent Continuous Customer Journeys.” SDL. 11 November 2018. https://www.sdl.com/resources/downloads/whitepapers/customer-journeys/whitepaper.html.

LISA TRAGER helps organizations develop and execute digital strategies for an omnichannel world. As a consultant and the Principal of Content in Motion, she led digital enterprise initiatives on the agency and client side in the telecommunications, healthcare, pharmaceutical, financial, consumer goods, and government sectors. Currently, she works at Verizon in Digital Operations where she applies her extensive knowledge and expertise in creating customer-centric experiences related to service and support.