Features September 2020

Developing a Knowledge Base on the Cheap: A Case Study

By John Campbell | STC Member

In 2011, I was hired by Viasat, a satellite-based broadband internet provider, as the sole technical writer to develop a new system for handling network operations center (NOC) troubleshooting documentation. The NOC managers were concerned that their current system was untenable with an upcoming satellite and network launch (meaning a huge increase in documentation). Viasat’s existing documentation setup consisted of a couple hundred Word documents on a shared server. Version control was little more than a manually maintained table at the end of each Word document.

The NOC wanted to upgrade from this relatively chaotic system to a wiki-type site that would allow their techs to add and edit procedures directly in a web browser. They wanted it to be searchable, and they wanted the ability to define user groups with group-specific permissions. They also wanted approval workflows so that a group of editors could modify documents and a different group of approvers could review edits before they went live to ensure the accuracy of the troubleshooting processes. The managers wanted automated version control and the ability to roll back to previous versions. They also wanted to be able to customize the system and add in-house features, like automations created by internal developers. Additionally, the company wanted to have reporting on the new system so that they could learn things like which procedures were the most commonly used or how many procedures specific users edited over the course of a month.

As for the budget for this effort, Viasat hoped to leverage existing systems or develop a new system that wouldn’t cost much. A turnkey solution would be costly, since rotating crews in the NOC would likely require securing a license for each tech. I was encouraged to do some research on potential solutions—the cheaper, the better. I was given this task on day one (after I finished the requisite corporate trainings), so I went home and had a little panic attack . . . then started researching.

Open-Source Solution

After looking into several open-source options, I provided the managers with an overview of my findings and recommended a content management system (CMS) called concrete5. The cost was negligible, as the application required only space on a server with Apache (web server software), PHP (a scripting language), and MySQL (an open-source SQL database). Plus, there was good documentation and a well-populated user community available to answer questions and troubleshoot issues. I demonstrated an installation of concrete5 on my laptop to show the managers how the CMS met most of their requirements and received approval. It was time to build this thing for real.

The initial concrete5 installation was performed on an existing server that handled our monitoring system, requiring that I work with the monitoring team to get the CMS configured properly. I then began to develop the knowledge base using the concrete5 user interface, creating page templates, user groups, approval workflows, and so on.

However, Viasat wanted some custom features that weren’t part of the concrete5 core application, and this required some PHP research and support from the concrete5 community. For example, the managers wanted automated emails to be sent when a page was edited or published (functionality that didn’t exist in that early version of concrete5). I found a related discussion in the concrete5 user forums, where several users provided not only sample code but also the logic behind it. That explanation let me customize the forum code so that our site triggered emails when a page was added, updated, or deleted.
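
For readers who haven't worked with event hooks, the idea can be sketched in a few lines. This is not the forum code and not concrete5's API (concrete5 is written in PHP and has its own events system); it is a minimal Python illustration of the pattern, and the event names, addresses, and helper functions are all hypothetical.

```python
from email.message import EmailMessage

# Hypothetical registry mapping event names to handler functions.
_listeners = {}


def on(event_name):
    """Decorator that registers a handler for a named page event."""
    def register(handler):
        _listeners.setdefault(event_name, []).append(handler)
        return handler
    return register


def fire(event_name, **details):
    """Call every handler registered for the event and collect the results."""
    return [handler(**details) for handler in _listeners.get(event_name, [])]


def build_notice(action, page, editor):
    """Build (but do not send) a notification email; addresses are placeholders."""
    msg = EmailMessage()
    msg["From"] = "kb@example.com"
    msg["To"] = "approvers@example.com"
    msg["Subject"] = f"Knowledge base page {action}: {page}"
    msg.set_content(f"{editor} {action} the page {page}.")
    return msg


@on("page_add")
def notify_add(page, editor):
    return build_notice("added", page, editor)


@on("page_update")
def notify_update(page, editor):
    return build_notice("updated", page, editor)


@on("page_delete")
def notify_delete(page, editor):
    return build_notice("deleted", page, editor)
```

Calling `fire("page_update", page="/noc/modem-reset", editor="jdoe")` returns a one-item list holding the message to send. To actually deliver it, you would hand each message to a mail server, for example with `smtplib.SMTP(...).send_message(msg)`.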

Once the site framework and features were in place, I created training documentation to instruct users how to add and edit pages in the new system, and I ran some classroom training sessions to walk people through the process. I then converted all the existing troubleshooting procedures into pages in the knowledge base, and we were up and running. Although the site’s search functionality wasn’t great, and I still had to generate reports manually for the managers, it was a pretty significant improvement over the previous setup.

Growth

A couple of years later, in 2014, Viasat’s Technical Operations organization (which oversaw the NOC) began working toward certifications that required a more robust documentation process. While the knowledge base began as a troubleshooting repository for network operations (i.e., maintenance of the satellite and network), as a result of the certification work, we now needed to review our overall documentation situation and consider expanding the scope. Most of our documentation was siloed, meaning that crucial information was difficult to find or wasn’t making it to the right users.

We again had a list of requirements, and I was tasked with reviewing our options. This time, we also looked at the corporate wiki and a knowledge base associated with our ticketing system. Again, we went with concrete5. This CMS gave us more flexibility and oversight than the wiki, and it was (obviously) significantly cheaper than paying extra for the knowledge base element of our ticketing system. However, we needed to scale up in a big way.

Scaling up from the existing site to a new knowledge base required more resources. To aid in the day-to-day documentation tasks as well as the development of the new site, we hired a second technical writer. Finally, we had a documentation team! We then worked with the IT department to secure two virtual servers (development and production) so we could manage our own application instead of relying on a separate team, and we were given the responsibility of installing the latest version of concrete5. The version running on the previous site was several releases (and a couple of years) out of date; we had not upgraded because it was difficult to engage the monitoring team to make changes on their server.

Our increased autonomy and responsibility required gaining new skills to deliver the new site. For example, we needed to become proficient working from the command line in Linux to manage the installation of concrete5 and to configure Apache and PHP to ensure the site would run smoothly. We needed to become more familiar with the architecture of the concrete5 application so that we could customize it to meet our users’ needs. We had to immerse ourselves in HTML5 and CSS so that we could give a fresh, clean look to the new site. Finally, we needed to study the MySQL database structure to understand the data relationships in concrete5 and how the system operated. I knew next to nothing about these areas when I started at Viasat, so I had to learn quickly, spending a lot of time visiting Codecademy, Stack Overflow, and W3Schools. With these new skills in place, we were able to deliver a new site with more features to a wider user community.

Over the past few years, the site has continued to grow as more teams outside of Technical Operations have learned about our ability to create custom pages and environments, our regular management of the content, and our intermittent delivery of new features. Today, the site has become an overall knowledge repository for multiple teams throughout the company. Where the initial site served about 25 users, we now have over 200 regular users and contributors and over a thousand registered accounts. We’ve gone from sharing a server with the monitoring team to our own dedicated virtual machines, and we’re looking into moving to the cloud in the future. We now have thousands of pages of process and reference documentation and about 20 custom page templates for different teams.

Data Analytics and Reporting

While our knowledge base was evolving, however, our reporting process had stagnated. Although concrete5 has some native reports in the dashboard—primarily error logs and the like—these weren’t useful for our stakeholders. As a result, we continued reporting on user edits, approvals, and page changes to the NOC management, compiling data into an Excel spreadsheet through a manual process that took several hours a week. As the knowledge base grew and more teams relied on it for their documentation needs, however, teams began to ask for more information about the site, such as most visited pages, how many users we had, and the number of edits by page type. We needed some data analysis capability that wouldn’t involve us staring at a spreadsheet for hours each week.

Enter Splunk, a data analysis platform that Viasat was already using to monitor devices in our network. We piggybacked on the platform and connected it to our MySQL database and production server to pull Apache access logs, which we had to learn how to read. The research we had done on the relationships in our database guided us in writing the queries to create the reports we wanted—we knew which columns and tables we needed to pull information from. These connections allowed us to use Splunk to mine data about how our site was being used in ways that weren’t available to us through the concrete5 user interface.
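
To give a sense of what "reading" an Apache access log involves, here is a minimal Python sketch that parses lines in Apache's combined log format and counts the most visited pages. We did this work in Splunk, not Python; the sample log lines, paths, and function name below are made up for illustration, and the sketch assumes your server writes the combined format.

```python
import re
from collections import Counter

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def most_visited(lines, top=3):
    """Count successful GET requests per path and return the most common."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match:
            continue  # skip lines that don't fit the expected format
        if match["method"] == "GET" and match["status"] == "200":
            # Drop any query string so /page?x=1 and /page count together.
            counts[match["path"].split("?", 1)[0]] += 1
    return counts.most_common(top)


# Invented sample lines, not real log data.
SAMPLE_LOG = [
    '10.0.0.5 - jdoe [12/Mar/2019:08:15:02 -0600] "GET /noc/modem-reset HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '10.0.0.7 - asmith [12/Mar/2019:08:16:40 -0600] "GET /noc/modem-reset?ref=search HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '10.0.0.5 - jdoe [12/Mar/2019:08:20:11 -0600] "GET /noc/beam-outage HTTP/1.1" 200 7340 "-" "Mozilla/5.0"',
    '10.0.0.9 - - [12/Mar/2019:08:21:00 -0600] "POST /login HTTP/1.1" 302 0 "-" "Mozilla/5.0"',
    '10.0.0.9 - - [12/Mar/2019:08:22:30 -0600] "GET /noc/missing HTTP/1.1" 404 312 "-" "Mozilla/5.0"',
    'not a log line',
]

print(most_visited(SAMPLE_LOG))
# prints [('/noc/modem-reset', 2), ('/noc/beam-outage', 1)]
```

Each line records who requested what, when, and with what result, which is the raw material for a "most visited pages" report.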

With our new Splunk knowledge, we developed many automated reports, dashboards, and alerts. A few examples include the following:

  • A report on pages edited that we’d been compiling manually for years
  • A dashboard displaying the number of pages added per month over the past year
  • Automated emailed reminders that a page hadn’t been updated in a year and needed to be reviewed for accuracy
  • Most visited pages, most common search terms, and so on
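
The logic behind two of these reports is simple enough to sketch. The Python below is an illustration only: our reports are Splunk searches against the MySQL database, and the rows, paths, and function names here are hypothetical stand-ins for that data.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical rows: (page path, date added, date last updated).
PAGES = [
    ("/noc/modem-reset", date(2018, 3, 1), date(2020, 6, 10)),
    ("/noc/beam-outage", date(2019, 1, 15), date(2019, 1, 15)),
    ("/noc/gateway-failover", date(2019, 1, 28), date(2020, 2, 3)),
    ("/noc/weather-fade", date(2020, 5, 4), date(2020, 5, 4)),
]


def stale_pages(pages, today, max_age_days=365):
    """Return paths of pages whose last update is older than the cutoff."""
    cutoff = today - timedelta(days=max_age_days)
    return [path for path, _added, updated in pages if updated < cutoff]


def pages_added_per_month(pages):
    """Count pages by the year and month in which they were added."""
    return Counter(added.strftime("%Y-%m") for _path, added, _updated in pages)


print(stale_pages(PAGES, today=date(2020, 9, 1)))
# prints ['/noc/beam-outage']
print(pages_added_per_month(PAGES)["2019-01"])
# prints 2
```

The stale-page list is what would feed a reminder email, and the monthly counts are what a pages-added dashboard would chart.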

Automating the report that we’d once manually compiled saved us at least a couple of hours each week. If we were manually creating all of the reports we currently generate with Splunk, the effort would represent hundreds of hours of work per year, so this is a really powerful and useful tool for us.

Where We Are Now

The knowledge base has become one of the company’s integral systems, well beyond the original scope of the initial troubleshooting document repository. We continue to add new features from the concrete5 community to enhance the site. For example, we contracted with one of the members of the community (whom we met at a PHP conference in Portland, Oregon, in September 2018) to develop a customized search feature that has greatly improved the quality of the search results on the site. We also purchased an LDAP authentication package that allows users to log in with their corporate credentials instead of remembering yet another username and password. All in all, the total cost of this system is less than $1,500, and over the course of developing and growing this system, we learned how to write PHP code, create MySQL queries, and manage a Linux server—skills that make us more flexible in our work and better able to contribute in other areas of the company.

Looking back at our experience developing our knowledge base, there have been many benefits (and a few challenges) over the years. Obviously, the biggest benefit is the cost. Being able to create a customized knowledge base for next to nothing has been great for our users and managers. We gained many new skills that have been useful in other situations at Viasat; these skills definitely weren’t part of our original job descriptions, but they increase our value. And the autonomy and control we have over the management of our site are wonderful. We can jump in with a fix if we encounter an issue, and we can roll out a new feature without having to wait on some software company’s release schedule.

There are some challenges, however. Quite simply, this isn’t a turnkey solution, and it’s not as ubiquitous as Microsoft Word or SharePoint, so some teams have been slow (or have outright refused) to adopt the site as their documentation solution because it’s “not what they’re used to.” It’s also a bit more work to integrate with other systems, and we’ve had to manage expectations about what the knowledge base is intended to do. It doesn’t allow you to edit spreadsheets in the browser like SharePoint, for example, and it doesn’t have the same robust search ability that Google search does, so we try to be straightforward with our users about what the site can do and what it can’t. Finally, because we developed the site, some users have been hesitant to take ownership of the documentation; instead of adding or editing pages in the site, they view that as our role, which defeats some of the purpose of a site where everyone can contribute. Still, the benefits have far outweighed the challenges.

Do It Yourself

Note that my case study involves concrete5 and Splunk, but there are many open-source content management systems and data analysis platforms out there. Feel free to explore!

To build a knowledge base and reporting solution like the one I’ve described above, follow these steps:

  1. Set up the environment. This is the machine on which your site will run (laptop, desktop, virtual machine or server). The machine should have a 2+ GHz CPU, 12 GB of memory, and a 64-bit OS.
  2. Ensure that the machine has up-to-date versions of PHP, MySQL, and Apache. There are free packages that will install all three applications on a machine. If you’re on a MacBook, macOS has Apache and PHP built in, so you’ll need to install only MySQL separately.
  3. Create a MySQL database for use with concrete5.
  4. Download and install concrete5.
  5. Download and install Splunk.
  6. Install the DB Connect add-on for Splunk, which also requires installation of the Java Development Kit (JDK) and the appropriate Java Database Connectivity (JDBC) driver.
  7. Set up a connection to your MySQL database from the DB Connect interface in Splunk.
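
Before starting on steps 2 through 6, it can help to check which prerequisites are already on the machine. The Python sketch below looks for the relevant programs on the system PATH. The executable names are assumptions (they vary by operating system and distribution), and finding a program says nothing about whether its version is current.

```python
import shutil

# Candidate executable names are assumptions; adjust them for your system.
PREREQUISITES = {
    "PHP": ["php"],
    "MySQL client": ["mysql"],
    "Apache": ["apachectl", "apache2ctl", "httpd", "apache2"],
    "Java (for Splunk DB Connect)": ["java"],
}


def find_prerequisites(prereqs=PREREQUISITES, which=shutil.which):
    """Return {label: path or None} using the first executable found on PATH."""
    found = {}
    for label, names in prereqs.items():
        # Try each candidate name in order and keep the first that resolves.
        found[label] = next((path for path in map(which, names) if path), None)
    return found
```

To print a quick report, loop over the result: `for label, path in find_prerequisites().items(): print(label, path or "not found on PATH")`.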

While I benefited from Viasat’s enterprise license, Splunk does have a free option. Splunk Free provides 500 MB of daily indexing and a single user to generate reports, dashboards, and so on. This would be useful for smaller organizations in which you might be the primary user generating reports on your knowledge base for management. Splunk also has relatively inexpensive enterprise licensing options that are dependent on data index volumes. In our concrete5/Splunk model, Splunk connects to our database and doesn’t index data at all, so the volumes are very low—likely within the 1 GB per day minimum. This type of license would cost $150 per month, billed annually, for a cost of $1,800 per year. This option allows you to provide metrics dashboards to multiple users who visit your Splunk site, as well as generate automatic alerts for triggered scenarios, such as sending automatic emails to users when their pages reach an out-of-date threshold.

 

John Campbell (john@campbell.fyi) is a senior technical writer for Viasat and currently serves as the president of the Rocky Mountain Chapter of STC. He has 16 years of experience in technical writing, including writing software user manuals, editing medical college courseware, and managing a corporate knowledge base. As a new technical writer, John’s main interest was in document design and styling. Now, however, he spends time writing code, developing database queries, managing workflows, and authoring the occasional procedure. In his spare time, John likes hanging out with his family, playing disc golf, going to heavy metal shows, and checking out the craft beer scene in Denver.

See John's extensive list of links to resources for building your own knowledge base in the web version of this article at www.stc.org/intercom.