Creating & Using Transcripts

Written versions of audio & video

What is a transcript?

A transcript is a plain-text version of the speech and non-speech audio in a recording, such as a video, podcast, or interview. Some transcripts are interactive and highlight passages as the audio or video plays so you can read along.

Podcasters might use transcripts to improve their search rankings, since search engines aren’t able to index an audio file for keywords. Journalists and scholars might transcribe live interviews so they can quickly locate quotes and sound bites later. Educational videos often come with transcripts that act as a study guide. Transcripts let you skim quickly for the information you need rather than listening to the entire piece at the speed of regular speech. Transcripts are especially important for accessibility because they allow the information to be processed by screen readers.

There are two general types of transcripts: “verbatim” (or “descriptive”) transcripts and “standard” (or “clean read”) transcripts.

Standard transcripts are lightly edited for better readability. Filler utterances like “um” or “ah” are often cut out unless they affect the overall meaning and intention of the audio.

Verbatim transcripts are word-for-word transcriptions of the audio. They include any stutters or false starts in speech as well as sound effects like laughter or applause. Sometimes verbatim or “descriptive” transcripts also include explanations of visual moments in a video that would otherwise be missed. In a way, descriptive and verbatim transcripts are a written hybrid of captions and audio description.


Transcripts vs captions

Transcription is the process of converting speech and audio into a written document. As this is the basis of captioning, it’s understandable why there’s often confusion between the terms and uses. 

Transcripts are usually broken into paragraphs that follow logical sense when reading. Captions, on the other hand, are broken up into time-coded chunks based on speech patterns or audio speed. 

Transcripts are suitable for making audio-only content accessible. Some websites feature interactive transcripts, which highlight each word or sentence as it is spoken, helping listeners and viewers read along. Providing this option makes content more accessible to those who are visual learners rather than auditory learners. Additionally, just as captions and subtitles improve comprehension when watching a video in a noisy environment, transcripts can play the same role for podcasts.
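For readers curious how an interactive transcript might decide what to highlight, here is a minimal Python sketch. It assumes each sentence has a known start time in seconds; the function name active_segment_index and the sample times are hypothetical and are not taken from any particular player or website.

```python
from bisect import bisect_right


def active_segment_index(start_times, playback_seconds):
    """Return the index of the segment being spoken at playback_seconds.

    start_times must be sorted in ascending order (seconds).
    Returns None if playback has not yet reached the first segment.
    """
    # bisect_right counts how many segments have started at or before this time.
    index = bisect_right(start_times, playback_seconds) - 1
    return index if index >= 0 else None


# Hypothetical start times for three sentences of a transcript.
starts = [0.5, 2.5, 6.0]
print(active_segment_index(starts, 3.0))
```

A real player would call something like this each time the playback position changes and highlight the matching sentence.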

Captions, however, are often legally required to make videos accessible, as they are synced with the visuals to enhance and contextualize the content.

Both transcripts and captions can be used to boost SEO, as search engines aren’t able to index video or audio content alone. If a video features open captions (that is, captions that are permanently burned into the video and not uploaded as a caption file), it’s often useful to include a transcript as well, since open captions cannot be indexed.


Best practices for formatting and using transcripts 

If you don’t already have a caption file, simply start typing what you hear, as you hear it, to generate a transcript. You can also use speech-to-text software to speed up the process and then edit for accuracy. Try the links below:
Automatic Subtitle Generator

Audio to Text

Video to Text

Just like with captions, the goal for your transcript is to provide the auditory information to those who cannot hear the audio. This includes speaker identifications, sound effects, and notes on whether a speaker is off-screen. If you cannot understand what has been said, label that section as [inaudible] or [unintelligible].

If you’re starting with subtitles or captions, you can download the file as a plain text (.txt) document and reformat the transcript from there. The bulk of this work will be combining several caption lines into logical paragraphs. It might also be helpful to add section headings, depending on the type of content. If you’re converting captions from a video into a transcript, be sure to go back and include any pertinent on-screen text or visuals that you wouldn't have initially captioned.
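If your captions are in the common SubRip (.srt) format, the combining step can be partly automated. The Python sketch below is one possible approach, not a definitive tool: it assumes well-formed SRT text, and the function name captions_to_paragraphs and the fixed sentences-per-paragraph grouping are illustrative choices. You would still want to review the paragraph breaks by hand and add headings, on-screen text, and visuals yourself.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def captions_to_paragraphs(srt_text, sentences_per_paragraph=3):
    """Strip cue numbers and timings from SRT text and regroup it into paragraphs."""
    caption_lines = []
    # Cues in an SRT file are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        for i, line in enumerate(lines):
            if TIMING.search(line):
                # Everything after the timing line is caption text.
                caption_lines.extend(lines[i + 1:])
                break

    # Join the short caption lines, then split on sentence-ending punctuation.
    text = " ".join(caption_lines)
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]

    # Group a fixed number of sentences into each paragraph (an arbitrary rule).
    return [
        " ".join(sentences[i:i + sentences_per_paragraph])
        for i in range(0, len(sentences), sentences_per_paragraph)
    ]
```

Calling captions_to_paragraphs(open("captions.srt", encoding="utf-8").read()) would return a list of paragraph strings; the file name here is only a placeholder.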

Generally speaking, transcripts do not need timestamps. Some people find it useful to include a time notation (once a page or so) when transcribing an interview, for example, so they can more easily return to that point in the audio. If you do include a timestamp, there is no need to note the end time of each unit of speech (as you would with captions).
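As a rough illustration of occasional time notations, the Python sketch below marks the first segment and then adds a start time only after a chosen interval has passed. The input shape (a list of start-time and text pairs), the function names, and the five-minute default are all assumptions made for this example.

```python
def format_timestamp(seconds):
    """Format a number of seconds as [HH:MM:SS]."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"[{hours:02d}:{minutes:02d}:{secs:02d}]"


def add_time_notations(segments, interval_seconds=300):
    """Prefix a start-time marker on the first segment and on any segment that
    begins at least interval_seconds after the previously marked one.

    segments is a list of (start_seconds, text) pairs in playback order.
    Only start times are used; end times are not needed for a transcript.
    """
    lines = []
    last_marked = None
    for start, text in segments:
        if last_marked is None or start - last_marked >= interval_seconds:
            lines.append(f"{format_timestamp(start)} {text}")
            last_marked = start
        else:
            lines.append(text)
    return lines
```

Raising or lowering interval_seconds controls how often a marker appears, which is a loose stand-in for "once a page or so."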


Where to put transcripts 

There’s no right or wrong answer to this – include the transcript itself or a link to the transcript wherever it will be most easily accessed! Often for videos, a link will be posted in the video description or one of the first comments. For a podcast, the transcript is often embedded right under the audio player. Feel free to play around with placement and links to see how each affects your reach and engagement!