71.3 August 2024

SEO as Audience Analysis: Accounting for Algorithms in Content Strategy

By Daniel L. Hocutt

doi.org/10.55177/tc549684

Abstract

Purpose: This project contributes a rhetorical approach to search engine optimization (SEO) as algorithmic audience analysis. It positions SEO as an activity that requires strategists to compose website content that is optimized to both human search engine users and the algorithmic audience (Gallagher, 2017) of a search engine’s indexed content.

Method: Actor-Network Theory (Latour, 2005), with its focus on the agency of non-human entities combined with human agency in social activity, provides the theoretical framework for this approach. The project combines usability testing with web development methods to trace rhetorical agency during online search activities (Hocutt, 2019). Doing so demonstrates the role search algorithms play as receptive audiences of SEO strategies.

Results: Approaches to teaching SEO within the framework of technical and professional communication (TPC) rhetorical foundations require understanding the algorithmic audiences of SEO practices. By matching timestamp data from video-recorded usability tests and HTTP archive (HAR) files produced during usability testing sessions, content strategists can overlay the chronological recordings with their SEO strategies to better understand how successfully SEO met human and algorithmic audience expectations. When SEO practice identifies human audience expectations effectively and develops content signals attractive to its technological audiences, both audiences succeed in an assembled meaning-making exercise. By applying existing methods of audience analysis to search algorithms, content strategists can improve SEO and help surface relevant content for their human users.

Conclusion: The results of this project provide a framework for practicing SEO as rhetorical activity built upon audience analysis of both human and non-human users.

Keywords: Search Engine Optimization (SEO), Audience Analysis, Content Strategy, Algorithmic Audience, Web Content

Practitioner’s Takeaway

  • Adds SEO as a necessary consideration of content strategy.
  • Refines SEO practices to address both human and algorithmic audience requirements.
  • Introduces HTTP archive (HAR) files and file reader tools for tracing rhetorical agency across algorithmic audiences.
  • Introduces a modified usability study that incorporates web development methods as a novel approach to measuring the effectiveness of content strategy.
  • Provides a framework for content strategists to effectively signal search algorithms while meeting human audience expectations.

Introduction

The purpose of this project is to position search engine optimization (SEO), a process in which web developers and content strategists prepare websites for indexing by search engines like Google, as a kind of audience analysis. To accomplish this, I’ll describe audience analysis within the realm of technical communication and describe SEO within the realm of content strategy.

Audience Analysis in Technical and Professional Communication (TPC)

The concept of audience analysis, rooted in the rhetorical situation of audience, rhetor, and purpose, is foundational in TPC theory and practice. Understanding audiences and their goals is the key to content strategy and user experience, ensuring that users are able to solve their problems and achieve their goals. Regardless of the way the rhetorical situation (Biesecker, 1989; Bitzer, 1968; Vatz, 1973) or rhetorical ecology (Edbauer, 2005) is described, tailoring the communication message to an audience remains a fundamental aspect of TPC theory and practice. In TPC research, audience analysis is the subject of numerous studies and theories, most recently focused on the rise of online information in the field. Albers (2003) provides a multidimensional framework for audience analysis when dealing with dynamic information. Miles (2009) addresses audience analysis in an immersive virtual reality (IVR) environment, recognizing the way invention and audience analysis are deeply intertwined when using IVR. Ross (2013) proposes an audience analysis approach to the complex genre ecology of environmental communication, one of several scholars who have recently approached audience analysis for a specific audience. Cardinal (2022) focuses on audience analysis of migrant multilingual audiences using the lens of superdiversity, while Gallagher et al. (2020) introduce big data audience analysis (BDAA) to better understand audience motivations and needs within a massive corpus of online comments. van Velsen et al. (2010) propose user-centered design methods to address increasing personalization in electronic communication; audience analysis for personalization to enable targeted digital microcontent is now commonplace, especially in chatbots and smartphones (Hocutt et al., 2022). Much earlier, Breuch et al. (2001) identified audience analysis as “perhaps the strongest link between usability and technical communication” (p. 227) when encouraging technical communication programs to incorporate usability studies into curricula. Since then, connections among usability, user experience, and technical communication theories and practices centered on audience analysis are regularly reinforced, including in mobile design (Melonçon, 2017), in intercultural and international communication and translation (Jarvis Kwadzo Bokor, 2011; Starke-Meyerring et al., 2007; Yu, 2012), and in curriculum development (St.Amant & Melonçon, 2017) among many other areas of overlap. These approaches are intended to be representative rather than exhaustive; they demonstrate the wide range of approaches to audience analysis available in technical communication research and practice.

Defining what “audience analysis” means is no less challenging than providing a short review of literature on the concept. The Society for Technical Communication (STC) Body of Knowledge (TCBOK, n.d.) glossary refers “audience analysis” to “user analysis” and defines “user analysis” as follows: “Identification of user requirements for a product. Also called audience analysis.” While accurate, the definition feels incomplete given the vast amount of research and theory focused on audience analysis. TPC textbooks offer more robust approaches to audience analysis as a practice. Technical Communication Across the Professions (Herald, 2022) recommends that audience analysis determine type of audience; identify background, needs and interests, and other demographic characteristics of the audience; and complicate audience understanding by recognizing more than one audience and wide variability in an audience (Chapter 1.2). In the Writing Commons, Hickman (n.d.) describes audience analysis practice as an iterative process throughout composing, focusing on who the audience is and what the audience needs may be. Open Technical Communication (Tijerina et al., n.d.) covers many of the same points as Herald (2022) but adds this statement to better illustrate the extent to which analysis should continue: “You’ve analyzed your audience until you know them better than yourself.” For the purposes of this project, audience analysis refers to the process used to adapt content to user needs, focusing on the simpler, but useful, definition provided by the TCBOK.

SEO in Web Development and Marketing

Focusing on search engine optimization (SEO) as a content strategy shifts us squarely into the realm of web development and content writing, often considered more directly aligned to marketing communication than technical communication. As user experience and content strategy become embedded in technical communication courses, theory, and practice (see among others Flanagan et al., 2022; Getto et al., 2020; Getto & Flanagan, 2023; Lauer & Brumberger, 2016; Rose & Schreiber, 2021), SEO becomes an important topic to cover in TPC courses and practice.

According to Semrush, a company focused on online visibility, SEO “is a set of processes aimed at improving a website’s visibility in search engines, like Google, with the goal of getting more organic traffic” (Pavlik, 2022). SEO matters, according to industry standard Search Engine Land, because “[t]he better visibility your pages have in search results, the more likely you are to be found and clicked on” (Goodwin, n.d.). It’s important to note that SEO isn’t restricted to “traditional” search engines like Google and Bing. Search engines can be found in many product and service websites, including Amazon and other online retailers, YouTube and other streaming media providers, and all social media platforms. According to recent research reported in Insider Intelligence (2022), shoppers were more likely to start searching for a product using Amazon (63%) than a search engine (49%) based on a survey conducted in September 2022. As a result, content strategists might find themselves developing content and online interfaces for documentation, retail, governmental, or social media platforms. And when they develop content strategies, SEO is likely to be an important aspect of their work.

More specifically, SEO is a process that helps ensure that web content appears at or near the top of search results in a search engine results page (SERP). Analysis of click through rates (CTR) from top search results on SERP to their linked landing pages conducted by Backlinko in May 2023 indicated that ranking number one on a SERP yielded nearly 40% CTR compared to only 18.7% CTR for the second ranking link and 10.2% for the third ranking link (Dean, 2023). The conclusions drawn from this analysis are clearly stated by Pavlik (2022): “The correlation is very simple—the higher you rank, the more people will visit your page” (emphasis original). A corollary to this finding is also clear: For content to be accessed, it has to be discoverable through organic search in a search engine. Goodwin (n.d.) reiterates the value of SEO, indicating that the majority of visits to a website, 53%, originate through organic search.

While plenty of scholarship about SEO exists (see for example Confetto & Covucci, 2021; Ibhadode & Opesade, 2022; Schultheiß & Lewandowski, 2021), most studies are necessarily constrained by SEO practices and specific use cases spatially and temporally defined. I’ll use Search Engine Land’s “Guide to SEO” (Goodwin, n.d.) as an overall primer on SEO practices that apply to this study. In this case, the guide is undated because it’s continually updated based on updates to SEO strategies. Goodwin defines three types of SEO: technical, on-site, and off-site. This study focuses on technical and on-site SEO because these aspects can be fully controlled by content strategists. Technical SEO focuses on architecture, URL structure, navigation, linking, user experience, structured data, and the hosting and content management platforms in use. On-site SEO focuses on content that is easily read and accessed by people and by search engines. Goodwin offers the following useful distinctions between optimizing content for people and for search engines:

When optimizing content for people, you should make sure it:

  • Covers relevant topics with which you have experience or expertise.
  • Includes keywords people would use to find the content.
  • Is unique or original.
  • Is well-written and free of grammatical and spelling errors.
  • Is up-to-date, containing accurate information.
  • Includes multimedia (e.g., images, videos).
  • Is better than your SERP competitors.
  • Is readable—structured to make it easy for people to understand the information you’re sharing (think: subheadings, paragraph length, use bolding/italics, ordered/unordered lists, reading level, etc.).

For search engines, some key content elements to optimize for are:

  • Title tags
  • Meta description
  • Header tags (H1–H6)
  • Image alt text
  • Open graph and Twitter Cards metadata

In these lists from Goodwin (n.d.), the connections between audience analysis and SEO begin to emerge. Content strategists should develop content based on audience analysis that is relevant to users, that includes keywords that users would include in a search, and that’s readable to users. Content strategists should also develop content that attracts search engines by being “crawlable” (i.e., available to search engine web crawlers that index web content) and easily indexed using structured data and accessible content.
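As a rough illustration of what "easily indexed" content looks like from the algorithmic side, the elements in Goodwin's second list can be read out of a page programmatically. The sketch below uses Python's standard-library HTML parser; the class name `SignalExtractor`, the function `extract_signals`, and the sample page are hypothetical and chosen only for this example. It does not reproduce how any search engine's crawler actually works.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class SignalExtractor(HTMLParser):
    """Collects title, meta description, headings, alt text, and Open Graph tags."""

    def __init__(self):
        super().__init__()
        self.signals = {"title": "", "meta_description": "", "headings": [],
                        "image_alts": [], "open_graph": {}}
        self._current = None  # tag whose text is currently being captured
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title" or tag in HEADING_TAGS:
            self._current, self._buffer = tag, []
        elif tag == "meta":
            prop = attrs.get("property") or ""
            if attrs.get("name") == "description":
                self.signals["meta_description"] = attrs.get("content") or ""
            elif prop.startswith("og:"):
                self.signals["open_graph"][prop] = attrs.get("content") or ""
        elif tag == "img":
            self.signals["image_alts"].append(attrs.get("alt") or "")

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            text = "".join(self._buffer).strip()
            if tag == "title":
                self.signals["title"] = text
            else:
                self.signals["headings"].append((tag, text))
            self._current = None


def extract_signals(html):
    parser = SignalExtractor()
    parser.feed(html)
    parser.close()
    return parser.signals


# Hypothetical page used only to exercise the sketch.
SAMPLE_PAGE = """
<html lang="en"><head>
<title>Audience Analysis for SEO</title>
<meta name="description" content="How content strategists address two audiences.">
<meta property="og:title" content="Audience Analysis for SEO">
</head><body>
<h1>Audience Analysis for SEO</h1>
<h2>Human users</h2>
<img src="chart.png" alt="Click-through rate by rank">
</body></html>
"""

signals = extract_signals(SAMPLE_PAGE)
```

Here `signals["headings"]` holds the page's H1 and H2 text in document order, which is the kind of structured summary a strategist can compare against the keywords identified through human audience analysis.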

How Online Search Works

A brief discussion of organic online search will help clarify the relationship between content strategy, SEO, and audience analysis. This section describes the online search process by breaking the process into two main visible activities: (1) Entering search terms into a search interface and (2) Receiving the search results. The process by which results emerge from entered search terms is summarized in order to demonstrate the major role that content strategy plays when algorithms match search queries to indexed content toward providing relevant search results.

Preparing for search

Search engines are prepared for organic searches by indexing the content of web pages. Collecting and indexing web content is an automated process completed by web crawlers, or spiderbots, that crawl the web to identify new or updated pages; collect information from websites based on content, metadata (title, keywords, descriptions), incoming and outgoing links, and information architecture; and index that information in highly engineered, customized data structures that can be accessed quickly during search. Not every website gets indexed, and not all pages are crawled on a website. Indexing involves processing website content into data categories and values based on the structured content of the site. Put another way, algorithmic bots visit crawlable web content to build an index of that content. The bots are programmed to identify certain signals in web content to add to the index. What those signals may be is a trade secret of the indexing search engine corporation (Alphabet, Microsoft, Amazon, Meta, and more). However, each company has its own proprietary bots, seeking out signals to index for search.
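Because the actual index structures are proprietary, they can only be illustrated by analogy. A minimal sketch of the general idea is an inverted index, a textbook structure that maps each term to the pages containing it. The function names `tokenize` and `build_index` and the example URLs are assumptions made for this illustration, not a description of any search engine's implementation.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each term to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term].add(url)
    return index


# Hypothetical crawled pages: URL -> extracted text.
PAGES = {
    "https://example.com/seo": "SEO as audience analysis",
    "https://example.com/ux": "Usability and audience research",
}

index = build_index(PAGES)
```

Looking up `index["audience"]` returns both example URLs, while `index["seo"]` returns only the first. Lookups of this kind are fast because the work of reading each page was done ahead of time, during crawling.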

The search itself

Online search can be broken into a two-pronged, user-initiated process with the following steps:1

  1. Develop a language query that algorithms can process and match to indexed keywords.
  2. Receive relevant search results that seek to meet the needs of the user.

Developing language query. Users initiate online search sessions by typing or vocalizing search terms. Those terms may be single words, phrases, sentences, or questions. Once search terms are entered into a search interface, the algorithmic activity that matches search queries to keywords and identifies matches is largely obscured from view. However, this activity can be broken down into the following processes:

  1. Algorithmic collecting and indexing of web content (described above).
  2. Transmitting search query from users’ devices to search engine servers.
  3. Server-based Natural Language Processing (NLP) of search queries.
  4. Matching search query with indexed web content.
  5. Providing results of the search in SERP.

Viewing search results. The results of algorithmic responses to search are available to researchers in SERP, but the activity of the algorithm itself—the automated, iterative processes by which a search algorithm indexes web content, collects and analyzes search queries, matches queries to indexed keywords, and returns relevance-sorted results to the researchers—is obscured and unavailable for scrutiny and analysis. Most often, algorithmic activities are unavailable because they are proprietary secrets at the heart of a brand.

Where SEO Comes In

This section breaks down specific activities of algorithms involved in online search to reveal how search queries get matched to web results. This description oversimplifies the process, but it seeks to demonstrate the role SEO plays in successful search results.

A simplified approach to search engine optimization (SEO) is composing structured content that matches the signals that bots are seeking. For example, one widely known signal is nested heading tags (e.g., H1, H2, H3, etc.). Bots are seeking out these content structures to provide context to indexed content. When a page has a single H1 tag that clearly and succinctly identifies the main idea of the content, that tag’s content (which appears between the <h1> and </h1> tags in the HTML code) gets indexed as the page’s heading. Multiple subheadings, like H2 and H3 headings, are treated in a similar way, except that repeated nested tags under the H1 tag are allowed. If a page has more than one H1 tag, on the other hand, the signal isn’t as clear and the bot, programmed to find a single H1 tag on the page, fails to identify the content’s main idea.
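The single-H1 rule described above is simple enough to check automatically before a page is published. The following sketch counts heading tags with Python's standard-library parser; `HeadingCounter` and `heading_report` are hypothetical names, and the check mirrors only the simplified rule in this section, not the behavior of any particular crawler.

```python
from html.parser import HTMLParser


class HeadingCounter(HTMLParser):
    """Records the level (1-6) of every heading tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # Heading tags are exactly "h" followed by a digit from 1 to 6.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_report(html):
    parser = HeadingCounter()
    parser.feed(html)
    parser.close()
    h1_count = parser.levels.count(1)
    return {"h1_count": h1_count, "single_h1": h1_count == 1,
            "levels": parser.levels}


good = heading_report(
    "<h1>Main idea</h1><h2>Part one</h2><h3>Detail</h3><h2>Part two</h2>")
bad = heading_report("<h1>First</h1><h1>Second</h1>")
```

In this example `good["single_h1"]` is `True` and `bad["single_h1"]` is `False`, which flags the second page for revision under the rule described above.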

People and algorithms as audiences

In a nutshell, the difference between a single H1 tag and multiple H1 tags on a web page represents the difference between successful and unsuccessful SEO practice. SEO as a practice is far more complex than focusing on a single signal, but at its heart, the activity of SEO relies on structuring content to match the signals that crawling bots expect to discover. Successful acquisition of signals results in successful indexing, and successful, structured indexing of content results in web content appearing higher in search engine results pages. Phrased in terms of audience, if technical communicators want web content’s intended users to find that content using a search engine, then technical communicators must compose and structure content for algorithmic bots and for human users. Content written for users without careful attention to the structured signals that search engines expect will result in useful content that never gets listed among the top results of a search query. For this reason, a theoretical framework that recognizes the agency of online search’s human users and algorithmic processes is needed to help outline SEO as a practice that engages human and technological audiences.

Actor-Network Theory as Assembled Rhetorical Agency

One such theoretical framework is Bruno Latour’s (2005) Actor-Network Theory (ANT). The ANT framework helps describe the rhetorical activity of search engine optimization (SEO) as the combined agency of content strategists, technologies, human actors, and algorithmic processes. This project focuses on human audiences and non-human algorithms as audiences for SEO practices. By describing these actors as part of an actor-network, content strategy can respond to audience analysis using SEO among its analytical tools.

While Latour’s work has regularly been applied to rhetorical studies (see Walsh et al., 2017, for descriptions of Latour’s influence), ANT represents a methodology for redefining sociology, not a methodology for tracing rhetorical agency. Latour (2005) describes his project in Reassembling the Social as “redefining sociology not as the ‘science of the social’ but as the tracing of associations” and describing the term social as “not a thing among other things…, but a type of connection between things that are themselves social” (p. 5, emphasis original). Latour is not presenting a methodological approach to studying the rhetorical activity of humans and technologies in networks. However, ANT provides an approach for identifying actors, defined by Latour as “any thing that does modify a state of affairs by making a difference” (p. 71, emphasis original) and tracing their activity, or agency, in relation to other actors in a network. In the case of search, those actors might include web crawling bots, human users, data collections, algorithms, search engines, and content strategists functioning in an actor-network.

Latour’s (2005) work seeks to isolate and flatten the activity of network actors like those listed above toward understanding the relations among nodes in networks. The work of isolating actors and flattening networking activity enables tracing social relations among actors, which Latour agrees can be human or nonhuman entities, in order to reveal the social as action and study its emergence. In rhetorical terms, Latour focuses on the agency, or agentive activities, of individual actors toward the emergence of the social in order to demonstrate that social activity represents actors working in differential relation to each other. In writing that “an actor-network is traced whenever, in the course of a study, the decision is made to replace actors of whatever size by local and connected sites instead of ranking them into micro and macro” (p. 179, emphasis original), Latour recognizes that both actor and network are essential to the study.

An actor-network represents a combined entity of actor and network that interacts with other actors and networks whose interaction can be traced and studied toward uncovering the sociology of the social. However, although actor-network represents tracing the activity of “local and connected sites” rather than individual actors, it doesn’t represent the assemblage of agencies that this project seeks to identify and trace. Assemblage agency represents an ecological dependence among constituent entities for activity to emerge. In online research activity, agency is theorized to emerge in collaborative ecological interactivity consisting of human and nonhuman actors, not to emerge through actor-networks centered around human and nonhuman actors. More directly, actor-networks consist of networked connectivities around actors; assemblage agency consists of actors in collaboration whose activity cannot be isolated to individual actor-networks or actors.

SEO in Action

To this point, this project has made the case that content strategists practice audience analysis, and that audience analysis can be related to SEO because content focuses on users, audience analysis focuses on meeting users’ needs, and SEO enables users to find relevant content through organic search. It has described the data-driven process by which search engine algorithms match user-generated keywords with indexed web content and has described search engine results pages (SERP) as the location that sorts the resulting web content matches in relevance order. It has made the case that rhetorical agency emerges in the interaction of human and nonhuman actors, and that those actors serve as audiences for which content strategists compose.

We can now take a deeper dive into SEO as an audience-focused process, because the closer the matches between search query (a proxy for user needs) and search results (a proxy for meeting those needs), the more successful SEO strategies are. As Goodwin (n.d.) notes above, good SEO requires technical and content strategies that are optimized to people and search engines. Throughout this section, I’ll use “human user” to represent Goodwin’s “people,” and I’ll use “search engine algorithm” or “algorithms” to represent the search engine itself. I focus on “algorithm” because the search engine itself comprises a massive data ecology; the algorithms are the focus of my attention because they serve as the nodal agents that connect user queries with search results through an established procedure.

Search Engine Algorithms as Audience

Both technical SEO and content SEO focus on algorithms as an audience. This section will outline ways that SEO targets algorithmic audiences. Gallagher (2017) introduced the term “algorithmic audience” to the field of technical communication “to capture the tension between human and nonhuman factors when writing and producing content for the Web” (p. 26). This project is built around this tension, and SEO is offered as a method to address this tension through audience analysis.

While Google isn’t the only search engine, it’s the one most commonly used in the U.S. (Statista, 2024a) and worldwide (Statista, 2024b). As a result, understanding how Google describes the action of its algorithms is instructive to understanding the role algorithmic audience analysis may play in SEO. After crawling and indexing content, algorithms seek to provide relevant results to search queries. Google’s “How Search Works” online guide describes what algorithmic processes seek to accomplish:

Google’s ranking systems are designed to … sort through hundreds of billions of webpages and other content in our Search index to present the most relevant, useful results in a fraction of a second. […] To give you the most useful information, Search algorithms look at many factors and signals, including the words of your query, relevance and usability of pages, expertise of sources, and your location and settings. (“How Search Works”, n.d.)

The “factors and signals” are among the items that content strategists can seek to better understand in order to develop content optimized for the algorithms. While these “factors and signals” aren’t public knowledge and differ as proprietary trade secrets among search engines, Google provides a broad outline to help content strategists develop content using both technical and on-site SEO:

  • Meaning of query
  • Relevance of content
  • Quality of content
  • Usability of webpages
  • Context and settings

While these factors are presented in terms of human users, behind each lies technical and on-site SEO strategies targeting an algorithmic audience.

Meaning of query

For content strategists, human user audience analysis helps identify words and phrases that might be used to describe content. On-site SEO encourages strategists to include keywords and phrases within the content of webpages to ensure that crawling algorithms capture them. While natural language processing (NLP) governs the algorithmic process by which search queries are attributed meaning by search algorithms, keyword and key phrase selection and inclusion in web content ensure that anticipated search terms match content. Just as importantly, technical SEO encourages strategists to include keywords and key phrases in URL structure, in file names, and in hyperlinks; both technical and on-site SEO focus on including potential query terms in specific areas of structured content, like headings and subheadings, and in page metadata like the <title> element. For this signal, predictions about what human users might enter as search queries become the content that strategists can implement using SEO to meet the expectations of the search engine’s algorithmic audience.
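One way to make this placement advice concrete is to check whether an anticipated query phrase appears in each of the locations named above. The sketch below is a simple word-containment check, assuming the URL, title, and headings have already been collected; `keyword_placement` and the example values are hypothetical, and real query matching involves NLP that this check ignores.

```python
import re
from urllib.parse import urlparse


def words(text):
    """Return the set of lowercase alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_placement(keyword, url, title, headings):
    """Report where every word of the keyword phrase appears."""
    target = words(keyword)
    return {
        "url": target <= words(urlparse(url).path),
        "title": target <= words(title),
        "headings": any(target <= words(h) for h in headings),
    }


# Hypothetical page details for illustration.
placement = keyword_placement(
    "audience analysis",
    "https://example.com/guides/audience-analysis-seo.html",
    "SEO as Audience Analysis",
    ["Introduction", "How Online Search Works"],
)
```

For this example the phrase is found in the URL path and the title but in none of the headings, which suggests one place where a strategist might revise the structured content.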

Relevance of content

Relevance is a term that relates to both the human user and the algorithm. Relevance of content is determined, at least in part, by the quality of the match between query and indexed content. Again, careful understanding of human users is critical to successfully addressing the expectations of the algorithmic audience. The “How Search Works” guide notes that the “most basic signal that information is relevant is when content contains the same keywords as your search query” (Google, n.d.). Algorithms seek to match search queries with indexed content, and the closer those matches, the more relevant the content is considered. Careful understanding of the keywords that a human user might use to describe content requires human application of human audience analysis along with the SEO strategy of including those keywords in web content.
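The "most basic signal" quoted above, shared keywords between query and content, can be sketched as a simple overlap score: the fraction of query terms that also appear in a page. This is an illustrative simplification under stated assumptions, not Google's ranking method; `keyword_overlap`, `rank`, and the example pages are hypothetical.

```python
import re


def terms(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_overlap(query, content):
    """Fraction of distinct query terms that appear in the content."""
    query_terms = set(terms(query))
    if not query_terms:
        return 0.0
    return len(query_terms & set(terms(content))) / len(query_terms)


def rank(query, pages):
    """Return URLs with any overlap, highest score first."""
    scored = [(keyword_overlap(query, text), url) for url, text in pages.items()]
    scored.sort(key=lambda pair: -pair[0])  # stable sort keeps input order on ties
    return [url for score, url in scored if score > 0]


# Hypothetical indexed pages: URL -> text.
PAGES = {
    "https://example.com/seo": "Search engine optimization as audience analysis",
    "https://example.com/tpc": "Audience analysis for technical communicators",
    "https://example.com/food": "Cooking with cast iron",
}

results = rank("search audience analysis", PAGES)
```

The first page contains all three query terms and ranks first, the second contains two, and the third contains none and is dropped. The example also shows why keyword choice matters: a page about "search engine optimization" would score no overlap for the term "seo" unless that abbreviation also appeared in its content.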

Quality of content

Content strategists seek to write quality content that is error free and accurate. These are aspects of quality that search engine algorithms seek when indexing content. However, quality is also a factor of trustworthiness of that content. Technical SEO strategies can help boost trustworthiness, such as using a domain that matches the information presented (e.g., .gov for government sites; .edu for education sites; .org for organization sites; .com for commerce sites), using a quality domain hosting provider with strong up-time statistics, and using a recognized content management system (CMS) like Joomla, Drupal, or WordPress with a strong open-source community. On-site SEO techniques to boost quality signals might include consistent site design, accurate breadcrumbs that reflect the navigation structure, and internal links, especially among related subdomains, that connect related content by keywords. In this case, off-site SEO strategies, like a large number of trusted incoming links, become important indicators of quality content. However, external links to trusted sites can also boost the content quality signal. Again, SEO strategies can help boost quality content signals so that pages are indexed and served on SERP as high-quality content. SEO is the strategy by which content strategists can meet the “needs” of algorithmic indexing and matching processes.

Usability of webpages

When addressing this factor in search results, content strategists with UX and technical communication experience excel. For usability, paying careful attention to human user experience is vital to successful SEO. But specific technical and on-site SEO interventions help ensure that usability is recognized by algorithmic audiences. For example, building mobile-friendly, mobile-optimized, and mobile-first website designs is vital to boosting usability. Similarly, page load times and load order are signals that contribute to usability. Here, too, structured content plays a significant role, as navigation, headings, bulleted lists, and chunking make content easier to skim and understand, especially on mobile devices. Implementing plain language principles throughout the site design is an important on-site intervention that can boost the usability of a page. While usability is often focused on human users in technical communication, usability, and user experience design, many of the same design and content principles used for humans also boost usability signals for the algorithmic audience. SEO is a strategy that can help address the human and algorithmic needs for content that is easy to navigate, simple to use, and effective in solving problems.

Context and settings

Content strategists have less control over factors related to context and setting. In this factor, the algorithm focuses on previous search history and geolocation to help generate relevant results. These factors aren’t controlled by content managers, but some technical and on-site SEO strategies can help algorithms identify content that matches context. For example, use of plain language renders content more easily translated to other languages by automated translation programs. Including language metadata clearly identifying content’s primary language ensures that search engines can understand and select the correct language context when attempting to translate content into another language. This is especially important if a user has set a preferred language in search settings. Providing clear alternative text for images, structuring tables consistently using accessible design, and consistently structuring content can help algorithms determine if content can be easily accessed by users with disabilities, so that assistive tools like screen readers, screen magnifiers, and text extractors can be effectively used on the content. Content strategists have less control through SEO over this factor, but select SEO strategies remain useful in ensuring algorithms recognize the content’s context and relation to user settings.
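Two of the signals mentioned here, language metadata and image alternative text, are straightforward to audit. The sketch below reads the `lang` attribute on the `<html>` element and lists images with no usable `alt` text; `ContextSignals`, `context_report`, and the sample markup are hypothetical names and data for illustration only.

```python
from html.parser import HTMLParser


class ContextSignals(HTMLParser):
    """Records the declared page language and images lacking alt text."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.images_missing_alt = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html":
            self.lang = attrs.get("lang")
        elif tag == "img" and not (attrs.get("alt") or "").strip():
            # Note: an empty alt can be intentional for decorative images,
            # so flagged images need human review rather than automatic fixes.
            self.images_missing_alt.append(attrs.get("src") or "")


def context_report(html):
    parser = ContextSignals()
    parser.feed(html)
    parser.close()
    return {"lang": parser.lang,
            "images_missing_alt": parser.images_missing_alt}


# Hypothetical markup for illustration.
PAGE = ('<html lang="en"><body>'
        '<img src="chart.png" alt="CTR by rank">'
        '<img src="logo.png">'
        '</body></html>')

report = context_report(PAGE)
```

For this page the report shows a declared language of `en` and flags `logo.png` as missing alternative text.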

Humans as Audiences

Human audiences are most often intended when addressing audience analysis, so no further explanation of audience analysis as it relates to human users is needed. However, there are some aspects of search engine algorithms related to humans that are important to reiterate.

Search engines and their algorithms are the portal through which humans access information in the 21st century. Human users rely on search engine algorithms to match their needs with content and products. As a result, in order for content to be found through organic search, it must appear on some kind of SERP. SEO is the set of practices that content strategists use to connect human users to content, but doing so requires algorithmic intervention. SEO addresses both human users and algorithmic audiences.

Not only must content be listed on an SERP, it must appear among the top-ranked results. About 27% of users who conduct a search using the same query and receive the same SERP will click on the top link on the page. That click-through rate (CTR) is significantly higher than the CTR for any link on a subsequent page of results: “only .63% of Google searchers clicked on something from the second page” (Dean, 2023). Searchers typically don’t go beyond the top ten search results when seeking a response to a query. If search engines are the portals through which human users access information, and if only the top results on an SERP receive high percentages of clicks, then the importance of content being linked as the most relevant response to a search engine inquiry can’t be overstated. SEO is the method content strategists use to increase SERP rankings, a method that’s the result of human and algorithmic audience analysis. Almukhtar et al. (2021) echo this conclusion: “Effective SEO means a web page is more likely to appear higher on the results page of a search engine (SERP)… SEO is the process of helping to raise the rank of your website on Google and other search engines, thereby having your website in front of more [human] users.” SEO implementation ensures that matches between content strategist keywords and indexed content result in click-throughs from SERP to web content by human users.

Tracing Rhetorical Agency

SEO is offered as a method that requires both human user and algorithmic audience analysis. An SERP presents the results of algorithmic processes that seek to match user inquiries to indexed content, and SEO is used to impact the placement of content links on an SERP. Content strategists can implement SEO to ensure that human users are able to access content, and in doing so, respond to the algorithmic audience that powers search activity and results. The activity of implementing SEO can be described in terms of an assemblage, in which human, technological, and algorithmic agents combine activity to generate meaningful content through algorithm-centered search processes that meet the needs of human users.

The next step to understanding how SEO can be considered audience analysis, addressing both human and algorithmic audiences, is to be able to trace rhetorical activity in real time as it emerges from human and algorithmic activity. Tracing rhetorical activity in assembled agency through an actor-network offers a number of challenges, but its results are meaningful for (at least) the following reasons:

  • Rhetorical agency lies with the rhetor. When the rhetor is assembled in network activity, it’s important to understand where human activity ends and technological or algorithmic activity begins.
  • Power lies with the rhetorical agent. In the rhetorical situation, rhetorical agency wields power over audiences. Recognizing the sources of rhetorical agency helps trace power as it converges in sociopolitical action. (See Bennett, 2010; Cooper, 2011; Miller, 2007; Walton et al., 2019 for more on rhetorical agency and power structures.)
  • Algorithmic audiences are black-boxed. We just don’t know many particulars about how search algorithms identify signals and order results by relevance on SERP. Knowing how to trace network activity to and from search engines via algorithmic processes helps us better understand the value of SEO as algorithmic audience analysis.

The Approach

Tracing rhetorical activity during online search helps us better understand how algorithms match user-generated keywords to algorithm-indexed content. By combining the results of traditional usability testing of an online search session with a technical record of network activity of that same session, researchers can isolate the give and take of rhetorical agency during the search session (Hocutt, 2019). More specifically, content strategists can identify moments when algorithms match keywords to indexed content and recognize how those matches produce SERPs. This provides insight into SEO practices that can help strengthen those matches.

Tracing human activities

To trace human activities during a search session scenario, a researcher can employ usability testing software to record cursor, keyboard, and mouse activity, along with a video record of research activities, and ask the user to practice talk-aloud protocol to collect their own narrative of research activity. The scenario of the usability test can start with a prompt to conduct a search (e.g., “think of a topic you want to know more about and conduct an online search using a search engine to find a page that addresses your search query”) and end when the user clicks through the SERP to a meaningful result. This recording provides a timestamped trace of participant activity that can be transcribed and related to network activity happening throughout the search session.

To supplement this collection of data, an ethnographic observer can collect descriptive field notes during the search session, focusing on actions taken by participants in relation to their devices (mobile, desktop, laptop) and their browser technology and search habits. The following are recommended areas to observe during the search session.

  • Network speed during session (using a tool like SpeedTest by Ookla [https://speedtest.net])
  • Environmental conditions during session (temperature, lighting, cleanliness, orderliness, seating area, comfort)
  • Participant appearance and unrecorded actions during session (arrival timeliness, comfort with search, questions asked, willingness to participate, apparent search literacy)
  • Technology used during session (device make and model, other technologies running in the background, ad blockers and other mediating apps in use and/or deactivated for the session)

For additional context, a post-search-session survey can ask the user to record any additional detail about their intentions, state of mind, and approach to the search session. Questions on the survey might include the following:

  • Have you used the search tool used in the search scenario before this activity? If so, characterize your level of experience with this search tool (novice, intermediate, expert).
  • Describe the environment in which you are conducting this activity. Be as descriptive as possible; complete sentences are not required.
  • Were you logged in to a search or social media account(s) while using your browser to complete this activity?
  • Summarize the research assignment or project you used to complete this usability test. Provide as much detail as possible.

When combined with the transcribed, timestamped record of the search session (including talk-aloud protocol) and the descriptive notes taken during the session, a portrait of the human user’s meaning-making rhetorical activities is available for examination and correlation with the network activities collected in the next step.

Tracing network activities

To trace the activity of networks, including algorithmic processes occurring during the search session, a web browser’s developer tools can be used. During the recorded usability test, developer tools can be opened in the browser (all modern web browsers include developer tools) and network activity can be recorded. As the user completes the usability testing scenario, network activity is recorded, including network assets downloaded, cookies written to the browser, data collected from servers, data sent via tracking pixels to servers, search queries submitted, and search results returned. Once the search scenario and the usability test recording are complete, the network activity captured using developer tools can be downloaded as HTTP archive (HAR) files, which are multidimensional JSON objects that can be visualized using a HAR visualizer. Using a HAR visualizer, the network data can be traced chronologically using timestamps.
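
Because HAR files are JSON, the chronological tracing described above can also be approximated outside a visualizer. The following Python sketch is one possible approach, not the procedure used in this project. It relies on the `log.entries`, `startedDateTime`, `request.url`, `response.status`, and `time` fields defined by the HAR format; the calendar date and the library search URL in the sample data are hypothetical.

```python
import json
from datetime import datetime

# Tiny, hypothetical HAR fragment; a real export contains many more fields.
sample_har = json.loads("""
{"log": {"entries": [
  {"startedDateTime": "2019-03-01T17:46:27.068Z", "time": 112.4,
   "request": {"method": "GET",
               "url": "https://connect.facebook.net/en_US/all.js"},
   "response": {"status": 200}},
  {"startedDateTime": "2019-03-01T17:46:26.410Z", "time": 80.0,
   "request": {"method": "GET",
               "url": "https://library.example.edu/search"},
   "response": {"status": 200}}
]}}
""")


def parse_har_time(value):
    """Parse a HAR ISO 8601 timestamp into a timezone-aware datetime."""
    # Replace a trailing "Z" so older Python versions can parse the value.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def chronological_entries(har):
    """Return (timestamp, status, url) tuples sorted by request start time."""
    rows = [
        (
            parse_har_time(entry["startedDateTime"]),
            entry["response"]["status"],
            entry["request"]["url"],
        )
        for entry in har["log"]["entries"]
    ]
    return sorted(rows, key=lambda row: row[0])


for started, status, url in chronological_entries(sample_har):
    # Trim microseconds to milliseconds for display.
    print(started.strftime("%H:%M:%S.%f")[:-3], status, url)
```

With a real export, the inline sample would be replaced by loading the downloaded file with `json.load`.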

A sample visualization of a HAR file of a search session (see Fig. 1) contains a record of all network traffic and its content loaded when a search interface page loads. For each network asset loaded, the following details are recorded: its URL, its load status, its load timing relative to other assets on the page, and its timestamp. Additionally, for each asset, the information requested, the response and its content received, any cookies sent or received, and the amount of time the asset took to be sent and received are also collected.

In Figure 1, a Facebook JavaScript asset row (“https://connect.facebook.net/en_US/all.js”) is highlighted at timestamp 17:46:27.068 (hh:mm:ss.mss UTC), showing the timestamp and the waterfall asset load time. In the detail to the right of the figure are tabs containing the information sent and received by the selected asset in the server request (contents visible in Fig. 1), the server response, the response content, any cookies sent and/or received with the request, and a labeled asset load timeline broken into segments. A detail of the timeline appears in Figure 2, measured in milliseconds (ms).


Figure 1: Portion of a HAR file visualization report for a library search page.


Figure 2: Timing tab detail in the HAR file visualization shown in Fig. 1.
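
The segmented timeline shown in a timing tab corresponds to the `timings` object that the HAR format stores for each entry. As a hedged sketch, the Python below breaks one entry's load time into its applicable phases. The millisecond values are hypothetical and are not taken from Figure 2. In the HAR format, a value of -1 marks a phase that did not apply to the request, and `ssl` time is already counted within `connect`, so it is left out of the sum.

```python
# Hypothetical timings object for one HAR entry, in milliseconds.
sample_timings = {
    "blocked": 2.5, "dns": -1, "connect": -1, "ssl": -1,
    "send": 0.3, "wait": 95.1, "receive": 14.5,
}

# Phases that add up to the entry's total time (ssl is part of connect).
PHASES = ("blocked", "dns", "connect", "send", "wait", "receive")


def timing_segments(timings):
    """Return only the phases that applied to this request (skip -1)."""
    return {phase: timings[phase] for phase in PHASES
            if timings.get(phase, -1) >= 0}


segments = timing_segments(sample_timings)
total_ms = sum(segments.values())

for phase, duration in segments.items():
    print(f"{phase:>8}: {duration:7.1f} ms")
print(f"   total: {total_ms:7.1f} ms")
```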

The timestamped network activity contained in the HAR files can be transcribed and associated with the timestamped transcription of the recorded search session. The resulting spreadsheet provides a complete, timestamped report of user and network activities, accurate to the millisecond, as human and algorithmic actors interact to create meaning in the form of the SERP and a search result selected by the participant. A sample CSV file showing the format of such a combined timeline is available for download at danielhocutt.com/posthumanagency. Its headings are as follows (see Table 1):

  • Usability Test Elapsed Time
  • HAR Timestamp
  • Activity
  • Comment

Assembled rhetorical agency

Matching timestamps from the usability test recording and the HAR files produces a clear, chronological picture of the rhetorical agency that emerges during search activity. Human actions (entering search queries, reviewing SERP, and clicking through to relevant search results) can be placed in chronological order, often to the millisecond, with network activities: downloaded webpage assets; uploaded data from queries and from algorithmic activities like tracking pixels; downloaded SERP components; and uploaded and downloaded network calls that result from clicking through to a relevant result. Preparing data for analysis in this way is meticulous and time consuming, but the reward is the ability to point directly to specific SEO actions that result in relevant search results. Below I briefly illustrate how SEO techniques are recorded in tracing rhetorical agency using these methods.

  • The usability test transcript can specify the goal of entering a specific search query, helping the researcher understand keyword selection. These can then be matched to the keywords used by the content strategist in building web content.
  • The terms entered get captured and transmitted to the search engine’s server, which can be viewed in the network activity. Additional data transmitted to the server includes browser settings, browsing history, tracking pixel data, and cookie data, all used to provide context to the algorithm as it connects search queries to indexed data.
  • The SERP components including relevant results get downloaded via HTML and written to the browser, revealing the way query keywords get matched to indexed content and returned in response to the search query.
  • The SERP can be viewed in the usability test recording to reveal all the results the search engine generated in response to the search query. Viewing the full list of returned results in the SERP can provide insight into relevance sorting, a response to successful SEO practices.
  • The usability recording and post-session survey can provide the user’s rationale for either revising the search in response to irrelevant results or for selecting a relevant result. In both cases, insights into the way users determine relevance among search results can help content strategists hone their content for SEO.
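
The timestamp matching that underlies these illustrations can be sketched in a few lines of Python. The sketch below is illustrative only. It assumes the wall-clock time at which the usability recording started is known, so that elapsed times can be converted to the same clock as the HAR timestamps. The events, the start time, and the use of seconds for elapsed time are all hypothetical; only the four column headings come from the combined timeline described earlier.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

# Assumption: the moment the usability recording started is known.
RECORDING_START = datetime(2019, 3, 1, 17, 46, 20, tzinfo=timezone.utc)

# Hypothetical transcript rows: (elapsed seconds, activity, comment).
user_events = [
    (5.0, "User types query", "Talk-aloud: states search goal"),
    (9.5, "User scans SERP", ""),
]

# Hypothetical HAR rows: (timestamp, activity).
network_events = [
    (datetime(2019, 3, 1, 17, 46, 26, 410000, tzinfo=timezone.utc),
     "GET search page"),
    (datetime(2019, 3, 1, 17, 46, 27, 68000, tzinfo=timezone.utc),
     "GET all.js"),
]


def build_timeline(user_rows, network_rows, start):
    """Merge both traces into rows ordered by absolute time."""
    merged = []
    for elapsed, activity, comment in user_rows:
        moment = start + timedelta(seconds=elapsed)
        merged.append((moment, f"{elapsed:.3f}", "", activity, comment))
    for moment, activity in network_rows:
        stamp = moment.strftime("%H:%M:%S.%f")[:-3]
        merged.append((moment, "", stamp, activity, ""))
    merged.sort(key=lambda row: row[0])
    # Drop the sort key so each row matches the four CSV headings.
    return [row[1:] for row in merged]


timeline = build_timeline(user_events, network_events, RECORDING_START)

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Usability Test Elapsed Time", "HAR Timestamp",
                 "Activity", "Comment"])
writer.writerows(timeline)
print(buffer.getvalue())
```

Automating the merge does not remove the need for careful transcription, but it can reduce the manual effort of interleaving the two records.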

A truncated example of the CSV datafile referenced above is shown in Table 1. It identifies user activity and browser and network activity while searching.

Although severely truncated (the entire dataset consists of 653 rows and spans an elapsed usability testing time of 13 minutes, 30 seconds), the data in Table 1 approximates the interplay of rhetorical agency among the search user and the browser. Rows containing only UT Elapsed Time values represent user agency, rows containing only HAR Timestamp values represent technological agency, and rows containing both UT Elapsed Time and HAR Timestamp values represent assembled rhetorical agency among search user and browser. I’ve intentionally used the term “among” rather than “between” with the pair user and browser. User represents an entire ecology of literacies, experiences, knowledge, and environmental factors surrounding them, while the browser represents an entire ecology of data sources, algorithmic processes, machine learning, artificial intelligence, network actions, and technical protocols supporting it. This is the actor-network that assembles around search and its results.
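
The row-reading rule described above can be expressed as a short classification function. The Python sketch below applies it to three hypothetical rows; the function name `classify_agency` and the sample values are assumptions for illustration and do not reproduce the project dataset.

```python
from collections import Counter


def classify_agency(ut_elapsed, har_timestamp):
    """Label a timeline row by which timestamp columns are filled in."""
    has_user = bool(ut_elapsed.strip())
    has_network = bool(har_timestamp.strip())
    if has_user and has_network:
        return "assembled"
    if has_user:
        return "user"
    if has_network:
        return "technological"
    return "unclassified"


# Hypothetical rows: (UT Elapsed Time, HAR Timestamp, Activity).
sample_rows = [
    ("0:00:05.000", "", "User types query"),
    ("", "17:46:26.410", "GET search page"),
    ("0:00:06.500", "17:46:26.500", "Query submitted"),
]

counts = Counter(classify_agency(ut, har) for ut, har, _ in sample_rows)
print(dict(counts))
```

Counting rows by label offers a rough view of how agency is distributed across a session, though the interpretation of any single row still depends on the transcript and field notes.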

Table 1: Abridged data table showing user and browser activity during observed search session.

Connecting Agency Tracing to SEO

When provided with this tracing of rhetorical agency, content strategists can overlay the chronological recordings with their SEO strategies to better understand how successfully SEO met human and algorithmic audience expectations. For example, a user who repeatedly revises search queries to generate a new SERP with fresh results may be using queries with keywords that SEO didn’t predict and therefore didn’t include in crawled content. Or the relevant content may exist on a website, but its infrastructural elements may lack technical SEO execution, leaving the content un-crawled and un-indexed despite its relevance.
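
One way to surface query terms that SEO didn’t predict is to pull the submitted queries out of the request URLs recorded in the HAR file and compare them with the strategist’s keyword list. The Python sketch below is a minimal illustration under stated assumptions: the search tool is assumed to pass the query in a URL parameter named `q`, and the URLs and keyword list are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical request URLs as they might appear in a HAR export.
request_urls = [
    "https://search.example.com/search?q=civil+war+causes",
    "https://search.example.com/static/logo.png",
    "https://search.example.com/search?q=why+did+the+civil+war+start",
]

# Hypothetical keywords the strategist built into the content.
content_keywords = {"civil", "war", "causes", "secession"}


def submitted_queries(urls, param="q"):
    """Return the query strings found in the given URL parameter."""
    queries = []
    for url in urls:
        values = parse_qs(urlsplit(url).query).get(param)
        if values:
            queries.append(values[0])
    return queries


def unanticipated_terms(queries, keywords):
    """Return query terms that do not appear in the keyword list."""
    seen = set()
    for query in queries:
        seen.update(query.lower().split())
    return sorted(seen - keywords)


queries = submitted_queries(request_urls)
print("Queries in order:", queries)
print("Terms not in keyword list:", unanticipated_terms(queries, content_keywords))
```

In practice, common function words would be filtered out before comparison, and the remaining terms would be weighed against the talk-aloud transcript to understand why the user chose them.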

In the method described above, the user ecology captured while conducting a recorded usability test provides insight into the user’s approach to the search task. Data collected during the test may identify relevant details about the user-as-audience and their background, algorithmic literacy, and knowledge of the topic. For example, a user researching a historical event without adequate knowledge of the event’s causes and background may enter keywords in their queries that are broader than the keywords embedded in the content through SEO processes. While SEO as audience analysis can’t anticipate every human audience’s background and literacy, the method outlined above can help practitioners expand or adapt their SEO practices to meet those human audience needs.

Similarly, the technological ecology captured using browser developer tools provides insight into the technological audience’s approach to search. Data collected during the test may identify relevant details about the network calls and responses, the data passing between the browser and server, and the timeframe in which responses are generated. For example, the developer tools may identify network calls that are unanswered and generate time-out errors, or the tools may identify specific networks whose calls and responses lag behind other networks. Technological audiences benefit from strong SEO practices like ensuring load times are within mobile browsing parameters, and the methods outlined above can help practitioners expand or adapt their SEO practices to meet those technological audience needs.
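
Unanswered and lagging network calls of the kind described above can be flagged directly from HAR entries. The Python sketch below is one possible approach. The one-second threshold, the sample entries, and the treatment of a status of 0 as a call that received no response are assumptions for illustration rather than established SEO parameters.

```python
SLOW_THRESHOLD_MS = 1000  # hypothetical cutoff, not an official limit

# Hypothetical HAR entries reduced to the fields used here.
sample_entries = [
    {"request": {"url": "https://www.example.com/"},
     "response": {"status": 200}, "time": 180.0},
    {"request": {"url": "https://tracker.example.net/pixel.gif"},
     "response": {"status": 0}, "time": 30000.0},
    {"request": {"url": "https://cdn.example.org/hero.jpg"},
     "response": {"status": 200}, "time": 2400.0},
]


def flag_entries(entries, slow_ms=SLOW_THRESHOLD_MS):
    """Return (url, reason) pairs for failed or slow network calls."""
    flagged = []
    for entry in entries:
        status = entry["response"]["status"]
        # Assumption: status 0 indicates that no response was received.
        if status == 0 or status >= 400:
            reason = "failed"
        elif entry["time"] > slow_ms:
            reason = "slow"
        else:
            continue
        flagged.append((entry["request"]["url"], reason))
    return flagged


for url, reason in flag_entries(sample_entries):
    print(f"{reason:>6}: {url}")
```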

And finally, the assembled agency surrounding the combined activity of human user and technological user that is captured in the merged timestamped file in this method offers unique insight into the role SEO plays in the overall search experience. Without the interplay among users entering queries and algorithms responding to those queries with relevant content, content strategists miss seeing the results of their SEO practices. When SEO practice identifies human audience expectations effectively and develops content signals attractive to its technological audiences, both audiences succeed in an assembled meaning-making exercise. Success may be revealed in brief search sessions that result in relevant content provided quickly by means of technological agency matching human agency, made possible through SEO as a technique for assembled audience analysis.

These methods, a hybrid combination of rhetorical usability studies and web development methods, emphasize the importance of understanding how web content is indexed and matched to user-entered queries. Understanding this matching process requires audience analysis of the indexing bots and algorithms. As a result, approaches to teaching SEO within the framework of TPC’s rhetorical foundations require understanding the algorithmic audiences of SEO practices. By applying existing methods of audience analysis to search algorithms, content strategists can improve SEO and help surface relevant content for their human users.

Framework for Using SEO as Audience Analysis

The goal of this project has been to reveal to content strategists, along with UX and TPC researchers and teachers, the value of SEO as an audience analysis technique for both human and algorithmic audiences. While SEO isn’t directly related to audiences, its success is the difference between algorithmic and human users accessing or missing relevant search results that meet their expectations. Human audiences expect information that addresses their queries, while algorithmic audiences expect signals for indexing and matching content.

A relatively simple framework for SEO as audience analysis might start by focusing on each audience separately. Search engines use algorithms to crawl and index (and iteratively re-crawl and re-index) web content in search of specific signals. These signals, as described above, are boosted by SEO techniques, effectively “attracting” algorithms as assembled technological audiences to crawl and index well-structured content composed using meaningful keywords. Search engines respond to human user queries by attempting to match query keywords to indexed keywords, then structure content on SERP for usability and provide relevant results for click through. The relevance of search results is boosted by SEO techniques, effectively “attracting” human users as audiences to search for information and select relevant results. The common element in both algorithm-focused and human-focused processes is SEO, which works best when both audiences are carefully analyzed and addressed. While there are hundreds, perhaps thousands, of technical and on-site SEO techniques, one of the core strategies behind effective SEO is keyword selection and inclusion.

In response, I offer the following as a simplified approach to understanding SEO as audience analysis. While steps 1–3 are commonly used in TPC and UX praxis, step 4 offers an additional step to determine whether intended audiences are assembling around queries and SERP.

  1. Identify keywords that a human audience might use to describe content.
    1. Conduct a careful human audience analysis, using the strategies outlined throughout the UX, TPC, and content strategy fields.
    2. Consider the complexities of human audiences, including their level of expertise, their knowledge of the subject, and the likelihood that multiple audiences will encounter this content for varying purposes.
    3. Develop an exhaustive list of keywords that predicts the variety of approaches that audiences will use to describe the content.
  2. Use technical and on-site SEO practices to incorporate keywords into content.
    1. Confirm that content is fully crawlable and meets relevance, quality, and usability factors.
    2. Include keywords in site infrastructure, including folder names, file names, and navigation text throughout the site.
    3. Include keywords in structured content, confirming their appropriate use in headings, subheadings, bulleted lists, and chunked content.
  3. Determine the effectiveness of SEO as audience analysis.
    1. Give search engine algorithms adequate time to crawl or re-crawl content.
    2. Conduct searches using multiple user profiles and keywords identified in step 1.
    3. Analyze SERP to determine if the content you’ve built appears in top search results.
  4. Take a sample query and trace rhetorical agency through the search process using the method described above.
    1. Visualize the effectiveness (or ineffectiveness) of SEO in matching human and algorithmic audience expectations to content.
    2. Focus on whether multiple searches need to be conducted to yield expected results.
    3. Troubleshoot problem areas if they exist.
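
Step 2 of the framework, which places keywords in site infrastructure and structured content, lends itself to a simple automated check. The Python sketch below reports whether each keyword appears in a page’s URL path and in its headings. The page URL, markup, and keywords are hypothetical, and plain substring matching is a crude stand-in for how search algorithms weigh keyword signals.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

HEADING_TAGS = {"h1", "h2", "h3"}


class HeadingCollector(HTMLParser):
    """Collects the lowercase text of h1-h3 headings."""

    def __init__(self):
        super().__init__()
        self._in_heading = False
        self.heading_text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._in_heading = True

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False

    def handle_data(self, data):
        if self._in_heading:
            self.heading_text.append(data.lower())


def keyword_coverage(keywords, page_url, html):
    """Report whether each keyword appears in the URL path and headings."""
    collector = HeadingCollector()
    collector.feed(html)
    headings = " ".join(collector.heading_text)
    # Treat hyphens and slashes in folder and file names as word breaks.
    path = urlsplit(page_url).path.lower().replace("-", " ").replace("/", " ")
    return {
        keyword: {
            "in_path": keyword.lower() in path,
            "in_headings": keyword.lower() in headings,
        }
        for keyword in keywords
    }


# Hypothetical page and keyword list used only for illustration.
page_url = "https://www.example.edu/programs/paralegal-studies/"
page_html = (
    "<h1>Paralegal Studies Certificate</h1>"
    "<h2>Admission requirements</h2>"
    "<p>Tuition details</p>"
)
coverage = keyword_coverage(["paralegal studies", "tuition"], page_url, page_html)
for keyword, found in coverage.items():
    print(keyword, found)
```

A keyword that appears only in body text, like "tuition" in this sample, would be a candidate for inclusion in a heading or navigation label if the audience analysis in step 1 suggests users search for it.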

Conclusion

This project aimed to present search engine optimization (SEO) as a method for audience analysis that can be applied to both human audiences and algorithmic audiences through the activity of online search using a search engine. To do so, it presented SEO as a rhetorical activity that addressed human and algorithmic audiences. It continued by describing the rhetorical agency of SEO as an actor-network consisting of assembled human, technological, and algorithmic actors combining efforts to deliver relevant content in response to online search. The project then focused on SEO techniques, demonstrating how these techniques meet the needs of both human and algorithmic audiences. It offered a method for tracing rhetorical agency through online search, and of connecting specific sections of that tracing activity to SEO techniques. It concluded with a basic framework for content strategists to use SEO as audience analysis through keyword development that persists during web content development and deployment.

Online search is built upon content. Successful online search, measured by the delivery of the most relevant responses as quickly as possible, depends on SEO. Successful SEO requires understanding human audience needs and search algorithm expectations. The combined, assembled rhetorical agency that emerges from an online search session requires understanding the rhetorical situation as an actor-network, combining the activities of human, technological, and algorithmic agents to deliver relevant results. SEO, along with methods of tracing rhetorical agency, can provide content strategists with important analysis techniques that apply across human and algorithmic audiences.

References

Albers, M. J. (2003). Multidimensional audience analysis for dynamic information. Journal of Technical Writing and Communication, 33(3), 263–279. https://doi.org/10.2190/6KJN-95QV-JMD3-E5EE

Almukhtar, F., Mahmood, N., & Kareem, S. (2021). Search engine optimization: A review. Applied Computer Science, 17(1), 70–80. https://doi.org/10.23743/acs-2021-07

Bennett, J. (2010). Vibrant matter: A political ecology of things. Duke University Press.

Biesecker, B. A. (1989). Rethinking the rhetorical situation from within the thematic of “différance.” Philosophy & Rhetoric, 22(2), 110–130.

Bitzer, L. F. (1968). The rhetorical situation. Philosophy & Rhetoric, 1(1), 1–14.

Breuch, L.-A. M. K., Zachry, M., & Spinuzzi, C. (2001). Usability instruction in technical communication programs: New directions in curriculum development. Journal of Business and Technical Communication, 15(2), 223–240.

Cardinal, A. (2022). Superdiversity: An audience analysis praxis for enacting social justice in technical communication. Technical Communication Quarterly, 31(4), 343–355. https://doi.org/10.1080/10572252.2022.2056637

Confetto, M. G. & Covucci, C. (2021). “Sustainability-contents SEO”: A semantic algorithm to improve the quality rating of sustainability web contents. TQM Journal, 33(7), 295–317. https://doi.org/10.1108/TQM-05-2021-0125

Cooper, M. M. (2011). Rhetorical agency as emergent and enacted. College Composition and Communication, 62(3), 420–449.

Dean, B. (2023, May 28). We analyzed 4 million Google search results. Here’s what we learned about organic CTR. Backlinko. https://backlinko.com/google-ctr-stats

Edbauer, J. (2005). Unframing models of public distribution: From rhetorical situation to rhetorical ecologies. Rhetoric Society Quarterly, 35(4), 5–24.

Flanagan, S., Getto, G., & Ruszkiewicz, S. (2022). What content strategists do and earn: Findings from an exploratory survey of content strategy professionals. Proceedings of the 40th ACM International Conference on Design of Communication, 15–23. https://doi.org/10.1145/3513130.3558973

Gallagher, J. R. (2017). Writing for algorithmic audiences. Computers and Composition, 45, 25–35. https://doi.org/10.1016/j.compcom.2017.06.002

Gallagher, J. R., Chen, Y., Wagner, K., Wang, X., Zeng, J., & Kong, A. L. (2020). Peering into the internet abyss: Using big data audience analysis to understand online comments. Technical Communication Quarterly, 29(2), 155–173. https://doi.org/10.1080/10572252.2019.1634766

Getto, G., & Flanagan, S. (2023, June 27). Content strategy in TPC: How do we teach and train content strategists? https://www.guiseppegetto.com/2023/06/26/content-strategy-in-tpc-how-do-we-teach-and-train-content-strategists/

Getto, G., Labriola, J., & Ruszkiewicz, S. (Eds.). (2020). Content strategy in technical communication. Routledge.

Goodwin, D. (n.d.). What Is SEO—Search Engine Optimization? Search Engine Land. https://searchengineland.com/guide/what-is-seo

Herald, C. B. (2022). Technical communication across the professions. https://oer.pressbooks.pub/techwritingacrosstheprofessions/

Hickman, D. W. (2014, October 3). Audience analysis for technical documents. Writing Commons. https://writingcommons.org/article/audience-analysis-primary-secondary-and-hidden-audiences/

Hocutt, D. L. (2019). Rhetorical agency in algorithm-centered digital activity: Methods for tracing agency in online research (Order No. 22619899). ProQuest Dissertations & Theses Global. (2307147076). https://digitalcommons.odu.edu/english_etds/92

Hocutt, D. L., Ranade, N., & Verhulsdonck, G. (2022). Localizing content: The roles of technical & professional communicators and machine learning in personalized chatbot responses. Technical Communication, 69(4), 114–131. https://doi.org/10.55177/tc148396

How search works—How Google search works. (n.d.). Google. https://www.google.com/search/howsearchworks/how-search-works/

Ibhadode, O., & Opesade, A. (2022). An assessment of website quality at Nigerian polytechnics and colleges of education. The African Journal of Information and Communication (AJIC), 30, Article 30. https://doi.org/10.23962/ajic.i30.13987

Jarvis Kwadzo Bokor, M. (2011). Connecting with the “Other” in technical communication: World Englishes and ethos transformation of U.S. native English-speaking students. Technical Communication Quarterly, 20(2), 208–237. https://doi.org/10.1080/10572252.2011.551503

Latour, B. (2005). Reassembling the social: An introduction to actor-network-theory. Oxford University Press.

Lauer, C., & Brumberger, E. (2016). Technical communication as user experience in a broadening industry landscape. Technical Communication, 63(3), 248–264.

Melonçon, L. (2017). Embodied personas for a mobile world. Technical Communication, 64(1), 50–65.

Miles, K. S. (2009). Reconceptualizing analysis and invention in a post-techne classroom: A comparative study of technical communication students. Technical Communication Quarterly, 19(1), 47–68. https://doi.org/10.1080/10572250903373056

Pavlik, V. (2022, December 4). What Is SEO? Meaning, examples & how to optimize your site. Semrush Blog. https://www.semrush.com/blog/what-is-seo/

Rose, E. J., & Schreiber, J. (2021). User experience and technical communication: Beyond intertwining. Journal of Technical Writing and Communication, 51(4), 343–349. https://doi.org/10.1177/00472816211044497

Ross, D. G. (2013). Deep audience analysis: A proposed method for analyzing audiences for environment-related communication. Technical Communication, 60(2), 94–107.

Schultheiß, S., & Lewandowski, D. (2021). “Outside the industry, nobody knows what we do”: SEO as seen by search engine optimizers and content providers. Journal of Documentation, 77(2), 542–557. https://doi.org/10.1108/JD-07-2020-0127

Society for Technical Communication. (n.d.). Glossary–U—Technical Communication Body of Knowledge (TCBOK). https://www.tcbok.org/tools/glossary/glossary-u/

St.Amant, K., & Melonçon, L. (2016). Reflections on research: Examining practitioner perspectives on the state of research in technical communication. Technical Communication, 63(4), 346–364.

Starke-Meyerring, D., Duin, A. H., & Palvetzian, T. (2007). Global partnerships: Positioning technical communication programs in the context of globalization. Technical Communication Quarterly, 16(2), 139–174.

Statista. (2024a). Global search engine desktop market share 2023 [dataset]. https://www.statista.com/statistics/216573/worldwide-market-share-of-search-engines/

Statista. (2024b). U.S. search engine market share queries handled 2023 [dataset]. https://www.statista.com/statistics/267161/market-share-of-search-engines-in-the-united-states/

Tijerina, T., Powell, T., Arnett, J., Logan, M., & Race, C. (n.d.). Open technical communication (4th ed.). Kennesaw State University. https://alg.manifoldapp.org/projects/open-technical-communication

van Velsen, L., van der Geest, T., & Steehouder, M. (2010). The contribution of technical communicators to the user-centered design process of personalized systems. Technical Communication, 57(2), 182–196.

Vatz, R. E. (1973). The myth of the rhetorical situation. Philosophy & Rhetoric, 6(3), 154–161.

Walsh, L., Rivers, N. A., Rice, J., Gries, L. E., Bay, J. L., Rickert, T., & Miller, C. R. (2017). Forum: Bruno Latour on rhetoric. Rhetoric Society Quarterly, 47(5), 403–462. https://doi.org/10.1080/02773945.2017.1369822

Walton, R., Moore, K. R., & Jones, N. N. (2019). Technical communication after the social justice turn: Building coalitions for action. Routledge.

Where US adults start their search when shopping online, Aug 2022 (% of respondents) [dataset]. (2022, September 13). Insider Intelligence. https://www.insiderintelligence.com/chart/260194/where-us-adults-start-their-search-shopping-online-aug-2022-of-respondents

Yu, H. (2012). Intercultural competence in technical communication: A working definition and review of assessment methods. Technical Communication Quarterly, 21(2), 168–186. https://doi.org/10.1080/10572252.2012.643443

About the author

Dr. Daniel L. Hocutt serves as web manager on the marketing and engagement team at the University of Richmond School of Professional and Continuing Studies in Richmond, Virginia. He teaches as an adjunct professor of liberal arts at the same institution. His research interests include data analytics and AI in technical and professional communication, literacies for digital life, and posthuman rhetorical agency. He’s a research member of the Building Digital Literacy research cluster in the Digital Life Institute, publishing research with collaborators in Technical Communication, Communication Design Quarterly, and Computers & Composition. He’s also published in Present Tense and the Journal of User Experience along with several edited collections. He’s worked as a secondary English teacher, a desktop publisher, a residential gifted education program administrator, and a free-lance web developer and small office technologist in addition to his current work and research.


1 For a thorough review of these processes, see Hocutt (2019), pp. 14–49.