Captioning Videos on YouTube

By Karen Mardahl | Senior Member

With video growing in popularity as a form of documentation and training, technical communicators must include captions or transcripts on their video checklist.

Why?

Ask people who are hard-of-hearing or Deaf, who work in noisy environments, or who cannot use the sound on their computer for various reasons. Captions make the video perceivable and understandable to those people. Captions make your video inclusive.

This article explains how to caption videos using YouTube's auto-caption feature. (Note: You need a free YouTube account to use this feature.)

Uploading Your Video

Figure 1. Uploading a video to YouTube.

Upload a video to YouTube. (The four-minute, 18 MB video processed for this article took less than two minutes to upload.) While you wait, provide a clear title, a description, the appropriate tags, and a category to help people find your video. You might want to set the privacy settings to Private while you edit your video.

During the processing, YouTube extracts the images that are used for displaying the video on its Web pages.

When YouTube is finished processing your video, you can add a caption track as shown in Figure 2.

Figure 2. When the processing is completed, you can add a transcript.

YouTube's machine transcription (automatic speech recognition) is part of the processing, but results are generally poor and will always require a heavy edit. In my video, the machine transcription failed, likely due to the speaker's British accent.

Luckily, I had a transcript prepared for this video, so I clicked “Add New Captions or Transcript.”

When uploading your transcript file (as .txt format), choose the type “Transcript.”

Processing Your Transcript

Figure 3. Uploading the transcript file.

Currently, you can upload only an English-language transcript. (I discuss adding other languages later in this article.)

YouTube processes a transcript file very quickly. My four-minute video took two or three minutes.

Figure 4. YouTube creates a caption file out of your transcript file.

Viewing the Caption File

Figures 5 and 6 show what magic YouTube performed on your transcript file.

Figure 5. You can turn on captions now that the transcript file is transformed into a caption track.

Click the arrow at the bottom right of the video frame to get to the closed captioning (CC) option to turn on captions. The transcript I submitted was matched to the video with amazing accuracy. I bow to the algorithms that made this possible.

Of course, editing is needed, but it was easy to catch errors in my original transcript when I listened to the audio and read the file at the same time.

Captioning is not just the spoken word. External sounds, such as doorbells or background music, need to be conveyed to the viewer. A speaker might use a silly voice or break into song, which cannot be seen in the images. This type of information should be included in square brackets so those watching but not listening can receive more of the details. In my example, Bruce says “Ding” as a kind of verbal indication of his movement while he holds up a book. I wanted to connect his expression to his action. Figure 6 shows my attempt to give the text an extra layer and capture Bruce's personality in his talk.

Figure 6. Captions are turned on for the reviewing and editing process.

Editing the Caption File

To edit the caption file, download a copy of the transcript that you just uploaded. (The download button is shown in Figure 6.) You need to do this because the original transcript has been processed and now contains time codes. It has also been renamed captions.sbv.

Edit this file in an HTML editor or another plain-text editor. (Mac users can also use TextEdit in Plain Text mode, but Windows users should avoid Notepad because it doesn't retain the text layout.) Make sure you keep the .sbv extension. The time codes that were added to your transcript file are the great value of using YouTube for captioning: YouTube saves you the tedious chore of adding them manually.

Figure 7. The captions.sbv file showing YouTube time codes.

The format for time codes is 0:00:00.000: hours, minutes, seconds, and milliseconds. Each block starts with a pair of time codes (start and end) followed by two lines of text; these two lines are what will be displayed in that time span.

The first block in Figure 7 shows the time codes 0:00:00.240, 0:00:07.240. The audio starts 240 milliseconds into the video file, and the two lines are displayed for a period of 7 seconds. The next block begins at 0:00:07.649, or 409 milliseconds after the previous block ends. You can modify these time codes as needed, but you should not need to make major adjustments.
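The arithmetic on these time codes can be checked with a short script. The following is a minimal Python sketch, not part of YouTube's tooling: the function names are hypothetical, and it assumes the layout described in this article, with a start code and an end code separated by a comma.

```python
import re
from datetime import timedelta

# hours:minutes:seconds.milliseconds, for example 0:00:07.240
TIME_CODE = re.compile(r"(\d+):(\d{2}):(\d{2})\.(\d{3})")

def parse_time_code(text):
    """Turn one time code into a timedelta (hypothetical helper)."""
    match = TIME_CODE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a time code: {text!r}")
    hours, minutes, seconds, millis = (int(part) for part in match.groups())
    return timedelta(hours=hours, minutes=minutes,
                     seconds=seconds, milliseconds=millis)

def parse_time_line(line):
    """Split a 'start,end' line into two timedeltas."""
    start, end = line.split(",")
    return parse_time_code(start), parse_time_code(end)

# Values taken from the first two blocks described in the article.
first_start, first_end = parse_time_line("0:00:00.240,0:00:07.240")
next_start = parse_time_code("0:00:07.649")

display_time = first_end - first_start  # how long the first block is shown
gap = next_start - first_end            # pause before the next block appears

print(display_time.total_seconds())     # 7.0
print(gap / timedelta(milliseconds=1))  # 409.0
```

A time code that does not follow the pattern, such as one with a colon instead of a period before the milliseconds, raises a ValueError in this sketch, which makes typing slips easy to spot while editing.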

Review the video with this caption file opened for editing. Not only can you correct misunderstandings and grammar or spelling errors, but you can also evaluate the flow of the text and the images, for example, when one phrase appears too briefly or another remains on the screen too long.
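A script can point out candidates for this kind of timing review before you watch the video. The sketch below is only an illustration in Python: the one-second and seven-second limits are arbitrary assumptions, not a YouTube rule or a captioning standard, the function names are hypothetical, and the sample caption text is made up (only the first time codes come from Figure 7).

```python
import re

# One time line: start and end time codes separated by a comma.
TIME_LINE = re.compile(
    r"(\d+):(\d{2}):(\d{2})\.(\d{3})\s*,\s*(\d+):(\d{2}):(\d{2})\.(\d{3})"
)

def to_millis(hours, minutes, seconds, millis):
    """Convert the four time-code fields to whole milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)

def read_blocks(sbv_text):
    """Return a list of (start_ms, end_ms, text_lines), one per caption block."""
    blocks = []
    # Blocks are assumed to be separated by blank lines.
    normalized = sbv_text.replace("\r\n", "\n").strip()
    for chunk in re.split(r"\n\s*\n", normalized):
        lines = chunk.split("\n")
        match = TIME_LINE.fullmatch(lines[0].strip())
        if match is None:
            raise ValueError(f"expected a time code line, got {lines[0]!r}")
        fields = match.groups()
        blocks.append((to_millis(*fields[:4]), to_millis(*fields[4:]), lines[1:]))
    return blocks

def flag_durations(blocks, min_ms=1000, max_ms=7000):
    """List blocks shown for less than min_ms or more than max_ms (assumed limits)."""
    flagged = []
    for start_ms, end_ms, _text in blocks:
        duration_ms = end_ms - start_ms
        if duration_ms < min_ms:
            flagged.append((start_ms, "too short", duration_ms))
        elif duration_ms > max_ms:
            flagged.append((start_ms, "too long", duration_ms))
    return flagged

# Made-up sample in the captions.sbv layout.
SAMPLE = """0:00:00.240,0:00:07.240
This is the first line of a caption
and this is the second line.

0:00:07.649,0:00:08.100
A very brief caption.

0:00:08.500,0:00:18.000
This caption stays on the screen
for quite a long time.
"""

for start_ms, problem, duration_ms in flag_durations(read_blocks(SAMPLE)):
    print(f"block starting at {start_ms} ms is {problem}: {duration_ms} ms")
```

The script only suggests where to look. Whether a caption really is too brief or lingers too long is still a judgment you make while watching the video.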

Uploading the Edited Captions File

Figure 8. The page for editing captions and subtitles is found in the overview of your videos and playlists.

Upload the edited captions.sbv file on the “Captions and Subtitles” page. (Delete the existing caption file first. Both files are called captions.sbv, and YouTube doesn't ask about overwriting a file!)

Review the video again to be sure that everything is to your liking. Tweak again, if necessary. If you set your video to Private while working on it, make it public now.

The Bonus!

The bonus to using YouTube is that the captions file becomes an interactive transcript! The icon for the interactive transcript is next to the video description (see Figure 9).

Figure 9. An Interactive Transcript option is located under captioned videos.

The interactive transcript is extremely useful. If you are searching for a bit of information in a training video, you don't have time to listen to the entire video. With the interactive transcript, you can skim the video as shown in Figure 10.

Figure 10. The interactive transcript is great for skimming a video.

Another bonus is the ability to translate the transcript into other languages. Remember how your transcript file had to be in English? Now that you have a transcript with time codes, you can make a copy of the file, translate it, and upload it to YouTube. Viewers then have the option to choose the languages you provide with the video.

When you upload your translated files, you can upload them as captions, not as transcripts, because the time codes are already in place.

When naming the extra language files, I suggest using language codes; for example, a German language file might be captions-de.sbv.
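The copy-and-translate step can also be sketched in code. The Python example below is a rough illustration under stated assumptions: the function names are hypothetical, the caption text and the second block's time codes are made up, and the file is assumed to use one blank line between blocks. It keeps each block's time codes, swaps in the translated lines, and builds a file name with a language code.

```python
from pathlib import Path

def language_file_name(original, language_code):
    """Insert a language code before the extension, e.g. captions-de.sbv."""
    path = Path(original)
    return f"{path.stem}-{language_code}{path.suffix}"

def swap_caption_text(sbv_text, translated_blocks):
    """Keep each block's time codes and replace its text lines.

    translated_blocks holds one list of translated lines per caption block,
    in the same order as the blocks in sbv_text.
    """
    blocks = [chunk.split("\n") for chunk in sbv_text.strip().split("\n\n")]
    if len(blocks) != len(translated_blocks):
        raise ValueError("need exactly one translated block per caption block")
    rebuilt = []
    for original, translated in zip(blocks, translated_blocks):
        time_code_line = original[0]  # first line of a block holds the time codes
        rebuilt.append("\n".join([time_code_line, *translated]))
    return "\n\n".join(rebuilt) + "\n"

# Made-up English captions and a made-up German translation.
ENGLISH = (
    "0:00:00.240,0:00:07.240\n"
    "Good morning.\n"
    "Welcome to the talk.\n"
    "\n"
    "0:00:07.649,0:00:10.000\n"
    "Thank you.\n"
)
GERMAN = [["Guten Morgen.", "Willkommen zum Vortrag."], ["Danke."]]

print(language_file_name("captions.sbv", "de"))  # captions-de.sbv
print(swap_caption_text(ENGLISH, GERMAN))
```

Because the time codes are carried over unchanged, the translated file can be uploaded as captions rather than as a transcript, as described above.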

Some Statistics

For the four-minute video I used as an example in this article, I spent 30 minutes uploading and transcribing the video, then 10 minutes uploading the transcript file and waiting for YouTube to process it, for a total of 40 minutes. I also resolved some slang issues in the video via email, all part of the good editor's job!

In another example, with a transcript copied from a prepared manuscript (a storyboard) for a two-minute video, the whole process took less than 10 minutes. That included uploading the video, uploading the transcript, editing the transcript, and declaring the job done.

Remember, technical communicators preparing videos are working with storyboards and manuscripts. This makes generating a transcript much easier: the process of building the storyboard creates the manuscript. When you are ready to caption a video, you can usually copy and paste from your storyboard text.

Videos of interviews rarely have a manuscript, which means you will have to take the time to prepare a transcript before you can caption the video.

Conclusion

When captioning is this easy, there is no excuse for professional organizations and businesses to leave captions out of their videos. The time spent is minimal compared to the huge benefit of making your Deaf and hard-of-hearing audience happy.

Resources and Acknowledgements

For statistics on the number of Deaf or hard-of-hearing people in your country, visit the website of your local association for the Deaf or contact them directly. Some resources are:

The video used in this article was made by Bruce Lawson (www.brucelawson.co.uk) and is used with his kind permission and support. The captioned video is available at www.youtube.com/watch?v=SvEncpCDEBI (accessed 5 November 2010).

Karen Mardahl is a technical writer in Denmark and manager of the STC AccessAbility SIG. Find her through http://flavors.me/kmdk.