Captioning converts the audio content within a video into text, then synchronizes the transcribed text to the video. When the recording is played, that text will be displayed in segments that are timed to align with specific words as they are spoken. Captioning is required to make video content accessible to viewers who are deaf or hard of hearing. 

Subtitles show the translation of words spoken in a different language. The words shown on the screen in a foreign film in another language, for example, are considered subtitles.

Transcription is the process of producing a text document from the words spoken in a video. Transcribed text does not have a time value associated with it and can't be used immediately for captions or subtitles - further editing will be required. In terms of accessibility, transcription works well for audio-only media, but falls short when it comes to audio with moving content on a screen, such as voice-over-PowerPoint slides or video.
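
To illustrate the distinction, here is a minimal Python sketch (not part of any University tooling) that renders the same spoken words two ways: as timed caption cues in the WebVTT file format, and as a plain transcript with the timing discarded. The `Cue` class and the function names are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One caption segment: a piece of text and when it is shown."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


def format_timestamp(seconds: float) -> str:
    """Format a number of seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues: list[Cue]) -> str:
    """Render timed cues as the text of a WebVTT caption file."""
    blocks = ["WEBVTT"]
    for cue in cues:
        timing = f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}"
        blocks.append(f"{timing}\n{cue.text}")
    return "\n\n".join(blocks) + "\n"


def to_transcript(cues: list[Cue]) -> str:
    """Discard the timing and keep only the words (a plain transcript)."""
    return " ".join(cue.text for cue in cues)
```

Calling `to_webvtt` on a list of cues gives text that a video player can synchronise with the recording, whereas `to_transcript` gives only the words. Going the other way, from a transcript back to captions, needs the timing to be added, which is the further editing referred to above.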


University Captioning Policy (July 2020)


From 23 September 2020, websites of public sector bodies will need to satisfy new accessibility requirements.

These regulations were brought into law in 2018 as part of the disability-focused Public Sector Bodies (Websites and Mobile Applications) (No. 2) Accessibility Regulations 2018.

Websites that don’t meet the new requirements after the 23 September 2020 deadline could be found to be breaking the law. This includes intranet services (or internal company websites) as well as websites for external audiences.

Websites will also need to include an Accessibility Statement - sample statements can be found here



Q: What is the difference between open and closed captioning?
There are a few differences between open captioning and closed captioning in videos. Most notably, open captions are always on and in view (burnt in), whereas closed captions can be turned off by the viewer. Open captions are part of the video itself, and closed captions are delivered by the video player or television (via a decoder). And unlike closed captions, open captions may lose quality when a video is encoded and compressed.
Q: What’s the difference between automatic speech recognition (ASR) captioning and human-generated captions?

There are two important differences between ASR captioning (also referred to as machine-generated captions) and human-generated captions: the quality and the time required to generate captions.

Machine-generated captioning produces captions very quickly. Typically, captions can be created in about one-quarter of the total video length. For example, an hour-long video could be captioned using ASR in approximately 15 minutes.
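
As a rough worked example of that rule of thumb, the following Python sketch estimates ASR processing time from the length of a recording. The one-quarter ratio is the approximate figure quoted above, not a guarantee from any particular service, and the function name is hypothetical.

```python
from datetime import timedelta

# Assumption: ASR takes about one-quarter of the video length, as quoted above.
ASR_TIME_RATIO = 0.25


def estimate_asr_time(video_length: timedelta, ratio: float = ASR_TIME_RATIO) -> timedelta:
    """Return an approximate ASR processing time for a recording."""
    return video_length * ratio
```

For an hour-long recording, `estimate_asr_time(timedelta(hours=1))` returns a `timedelta` of 15 minutes, matching the example in the text. Actual times will vary with the service used and how busy it is.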

ASR captions are typically 70-75% accurate depending on the audio quality in the recording. As a result, machine-generated captions are primarily intended to enable inside-video search, and by default, they aren’t added to the video as closed captions. Instead, the text is stored in the video platform’s database for use with the video search engine.

Of course, ASR also provides a starting point from which people can manually create 100% accurate captions. In video platforms like Panopto, text generated by ASR can be added to the video as closed captions, which people can then edit.

Human-generated captions take substantially longer to produce but provide results that are at least 99% accurate. In some cases, human-generated captions can be turned around in 24 hours, but typically, you can expect a 2-5 day turnaround.

Q: Why is captioning important?

In addition to making video content more accessible to viewers with impaired hearing, captioning can actually improve the effectiveness of video:

  • Captions improve comprehension by native and foreign language speakers. A 2006 Ofcom study showed 80% of people who use video captions don’t even have a hearing disability.
  • Captions help compensate for poor audio quality or background noise within a video.
  • Captions make video useful when a person is watching with the sound off or viewing in a noisy environment that obscures the sound.
  • Captions provide viewers with one way to search inside of videos. The link gives an example using the University Replay lecture capture.
Q: How long does it take to generate captions?

Machine-generated captions can be generated in a quarter of the time it takes to play an individual video. Depending on the service used to generate captions and the length of material, it can be well within 24 hours.

Human-generated captions are typically generated within two to five days, depending on the requested turnaround time and service options.

Q: Can captions be generated 'live'?

Yes. Captions can be added to video content while a live event is taking place, but this typically involves a third party supplier who provides captions generated either by machine or by a human captioner.

The latest version of Zoom will generate captions while the event is taking place and individual users can choose whether captions are displayed or not, and what size to make the captions.

The captions are then displayed either superimposed over the video content (typically at the bottom of the screen) or as a separate web browser page so that more than one line of text is visible at a time. The latter is useful for viewers who may need to see captions in a larger font size or need longer to read them if someone is speaking quickly.

It is also possible to provide subtitling (simultaneous translation) in this manner; however, cost is an important factor in live captioning and translation.

Q: What are Web Content Accessibility Guidelines (WCAG) and how do they relate to captioning?

The Web Content Accessibility Guidelines, also known as the WCAG standard, are the most detailed and widely adopted guide for creating accessible web content.

WCAG 2.1 AA generally asks that online content meet four principles that improve accessibility for people with disabilities and also adhere to a certain level of compliance. Both are summarized below:

WCAG Design Principles:

  • Perceivable: All relevant information in your content must be presented in ways the user can perceive.
  • Operable: Users must be able to operate interface components and navigation successfully.
  • Understandable: Users must be able to understand both the information in your content and how to operate the user interface.
  • Robust: Content must be robust enough that it can be interpreted by users, including those using assistive technologies (such as screen readers).

WCAG Compliance Levels for Online Video:

  • Level A: Captions are provided for all prerecorded audio content in synchronized media, except when the media is a media alternative for text and is clearly labeled as such.
  • Level AA: In addition to Level A compliance, captions are provided for all live audio content in synchronized media.
  • Level AAA: In addition to Levels A and AA compliance, sign language interpretation is provided for all prerecorded audio content in synchronized media.

To learn more about WCAG 2.1 guidelines visit

Automatic Captions using YouTube

YouTube provides a facility for automatic captioning of videos on its channels.

Be aware that this only applies to videos which have been recorded and uploaded to the channel - it does not cover live streams. Providing captions (or subtitles) for live streams will involve a third party supplier who will make a charge for their services.

Captions using Panopto

As of August 2020, the University will automatically caption all material recorded by Panopto - the PDLT team has produced a one-page guide to assist with embedding and editing captions.

In addition to that, there are links below which provide further information:

Downloading captions

Uploading captions

Setting up 3rd party integration - the example uses 3PlayMedia as a provider for caption services.

Adding translated captions

Third party suppliers

AI Media - currently used by the University's Equality & Diversity team for events and INCLUDE meetings

3PlayMedia - used by Audio Visual & Digital Marketing for one-off captioning requests

Third party suppliers will always make a charge for their services and there may also be costs incurred for cancellation at short notice. 

Further Information

There is some further information here from Digital Marketing & Communications relating to YouTube captions

Departmental Information when employing casual staff to provide captions

Sourced from 'Update to Academic Departments Newsletter #8 2 November 2020'

Additional support to departments

Executive Summary version: Additional Funding is available to pay students to support captioning work outside of GTA budgets.

We recognise that the need to provide captions for our recorded online lectures in order to meet legal requirements is burdensome.  The University is exploring ways to support departments better and expects to have a medium term solution in place from the start of next calendar year.  This will be in place while we look to find a better longer term fix.  In the short term the following measures have been agreed with the aim of easing some of the current problems related to casual staff and carrying out the training and supervision needed to support them in this important work.

Short term approach with immediate effect until the end of the 2020 calendar year

Departments are asked to continue seeking creative solutions to do this work from existing resources.  However where additional spend on casual staffing is required there will be a streamlined approval process to authorise this spend as business critical.  Please note that no additional salaried staff spend is permitted for this work. Departments are asked not to make retrospective bookings using this new approach.

For the remainder of this year Departments should:

  • Book casual staff in the normal way;

  • Charge spend to a faculty work-order, details of which have been shared with Departmental administrators, which will allow spend to be tracked for the remainder of the term; 

  • Note that the spend will not form part of GTA budgets.

  • Departments are asked to use the Casual Office Assistant rate of £8.72 per hour so that we are consistent. However, where the work requires specialist subject knowledge, departments may use the ‘Captioning (Specialist Knowledge)’ rate of £11.31 per hour. There should not be any additional preparation time.

  • These changes will be effective immediately, but do not need to be applied retrospectively to any existing bookings.

  • Non-specialist captioning work should be offered to all students, including undergraduates. We expect the specialist captioning to be done by postgraduate students with a cognate academic background.
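
For departments budgeting against the work-order, the two hourly rates above can be turned into a simple estimate. The Python sketch below is illustrative only: it multiplies hours by the quoted rate and ignores any on-costs or other charges, and the function name and the hours used are hypothetical.

```python
from decimal import Decimal

# Hourly rates quoted in the newsletter extract above (pounds sterling).
STANDARD_RATE = Decimal("8.72")     # Casual Office Assistant
SPECIALIST_RATE = Decimal("11.31")  # Captioning (Specialist Knowledge)


def captioning_cost(hours, specialist=False):
    """Return hours multiplied by the relevant hourly rate, to the nearest penny."""
    rate = SPECIALIST_RATE if specialist else STANDARD_RATE
    # Convert via str so that float inputs such as 7.5 are handled exactly.
    return (Decimal(str(hours)) * rate).quantize(Decimal("0.01"))
```

For example, ten hours of non-specialist captioning comes to £87.20, and ten hours at the specialist rate comes to £113.10.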