Making your media accessible
Captioning your media content helps to ensure that it is accessible to as many people as possible. This page explains how to add automated captions and offers good practice guidance for creating accurate captions.
Captioning requires an investment of time, particularly if the media contains strong accents, scientific or technical content, or inconsistent sound quality.
What is captioning?
Captioning is where a text-based version of the spoken part of media content is visible alongside the media. Sometimes captioning is also referred to as subtitling.
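For illustration, captions are usually stored as a timed-text file alongside the video. A minimal example in the widely used WebVTT format might look like this (the timings and wording here are invented):

```
WEBVTT

00:00:01.000 --> 00:00:04.000
Welcome to today's lecture on cell biology.

00:00:04.500 --> 00:00:08.000
We'll start with a quick recap of last week's session.
```

Each cue pairs a start and end time with the text displayed during that interval; captioning tools such as those in Media Hopper Create generate and edit files like this for you.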
Why do I need to add captions to my media?
Captions benefit many people, including those who are deaf or hard of hearing, viewers who have English as an additional language, and those in noisy environments, enabling them to understand the media without relying on the audio. Many colleagues have found that captioning takes a long time; our support guides can help you manage the process more effectively.
The University is working towards a culture where subtitling our media is standard practice at the point of creation, not only because of changing legislation but because it promotes engagement with our media for the benefit of our whole audience, whilst building digital literacy and digital skills. To support this, ISG runs a series of training workshops for staff, facilitated by the Digital Skills Team, to raise awareness of good practice in subtitling and to improve practical subtitling skills using the software available through Media Hopper Create and other platforms.
What is the difference between automated and human corrected captioning?
Some systems provide automated captioning tools which convert spoken words into text. This automated captioning can be done quickly and is available for free in most of our University media tools: Media Hopper Create, Media Hopper Replay, Teams and Zoom all include automated captions.
Automated captioning is done by computer software and will always require editing to improve accuracy. The software has been trained on a corpus of spoken language which is largely generic and not specific to academia, so many of the words, phrases and names you say may not be recognised. Words may also be misrecognised if your accent differs from the speech the software was trained on. It is worth spending some time thinking about how the software handles the speech you are converting and correcting. Background noise, speaker volume and clarity, accents, subject-specific terminology and pace of speech all affect the quality of the audio, and therefore the accuracy of the automated captions.
Our systems also provide tools to edit the automated captions and improve their accuracy, and understanding how these tools work makes correcting captions easier. We have carried out testing in ISG to compare various tools so that we can offer you the best support possible; information about these tests is available on request via the IS Helpline. We found that, when tested with the same content, the differences in accuracy between tools such as Verbit and Otter.ai were only a few percentage points, and all of the automated captions required some correction. Some small changes to how you create your media may improve the accuracy of the automated captions and make them easier to correct.
What can I do to improve the accuracy of automated captions for my media?
There are things you can do when recording your media which may improve the accuracy of the automated captions.
1. Don’t underestimate the importance of good quality audio
It is essential that the audio in your media is good quality so that the speech recognition technology can recognise the words being said.
Here are a few things you can do to improve the quality of your recording and therefore make the automated captions better:
Try to record in a quiet room so that background noise is kept to a minimum.
The fan on your computer can add a lot of noise to recordings, so close any applications that you're not using to keep the fan, and the noise it makes, to a minimum.
Use an external microphone and place it correctly so that your voice is recorded clearly. Consider the distance of the microphone from your mouth and the direction it's pointing in; avoid rustle from fabric if it's a lapel mic, and minimise tapping from your keyboard if you are using a desktop mic. A headset microphone will usually be better than the microphone built into your computer. The exact placement will differ between people and types of microphone, so do two or three short test recordings to try it out. Even if you're using your device's built-in microphone, try to position your device so that your voice is recorded well.
A range of good quality microphones to suit your needs, recording location and environment can be borrowed from the Learning Spaces Technology team and you can collect equipment from the Main Library.
Listen back to your tests on some headphones or earphones. What sounds best? You’ll soon arrive at the optimal position.
2. Consider your voice
Speak directly into the microphone, don't speak too fast, and try to speak clearly and naturally; this is all good practice when speaking to groups of people, whether face-to-face or remotely. Again, testing in advance is helpful to see what gives the best results.
3. Find out more by attending one of our training sessions
The Digital Skills Team offers regular training covering how to request and edit captions on Media Hopper Create, along with guidance on good practice and on capturing the clearest audio for more accurate automated captions. To date, we've taught over 120 people to caption on these courses. You can find more information and book a place via People & Money Learning.
4. Check your internet speed
As the quality of any recording made on a video conferencing platform also relies on your internet connection, making sure it is as robust as possible, for example by using a wired connection, can help. If you are recording without a live audience, consider recording into something that doesn't rely on an internet connection, such as Media Hopper Create's desktop recorder. You can upload the media after you've recorded it.
5. More advanced options
a. Video conferencing applications like Zoom and Teams have in-built features which aim to minimise background noise or echo when you're running and recording meetings. While this helps in many meeting situations, it can also mean the recorded speech is poorer than it might have been when you and the other participants are in a quiet environment and using good microphones (as may be the case, for example, in some research interviews). Adjusting these settings can make the recording more faithful to the actual speech, improving the chances of accurate automatic captions when you upload your video to Media Hopper Create.
If you are recording via Zoom you can enable the ‘Original Sound’ option.
In Teams there’s an option to switch off the in-built noise suppression.
b. If you have a recording which still has noticeable background noise, you can use a free program like Audacity to reduce it. You'll need to split the audio from the video using a program such as VLC Media Player, clean it up in Audacity, and then rejoin it to your video in VLC Media Player.
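If you're comfortable with the command line, the free tool ffmpeg can also handle the split and rejoin steps described above; this is an alternative sketch, not the supported University workflow, and the filenames here are just examples:

```
# 1. Extract the audio track to a WAV file you can open in Audacity
#    (-vn drops the video; pcm_s16le is uncompressed audio Audacity handles well):
ffmpeg -i lecture.mp4 -vn -acodec pcm_s16le audio.wav

# 2. Clean up audio.wav in Audacity (e.g. Effect > Noise Reduction),
#    then export the result as cleaned.wav.

# 3. Replace the original audio with the cleaned version, copying the
#    video stream unchanged so there is no loss of picture quality:
ffmpeg -i lecture.mp4 -i cleaned.wav -map 0:v -map 1:a -c:v copy lecture_clean.mp4
```

Because the video stream is copied rather than re-encoded, the final step is quick and the picture quality is untouched; only the audio is replaced.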
If you want to learn more about Audacity, book onto our Audacity training.
How do I create and correct captions for media?
Different services have different mechanisms for creating and editing captions; information about these is linked below.
Zoom – there is a section on the Zoom Help & Support page which covers captioning.
Teams – there is a section on captions and transcription on the Teams Meetings page.
Collaborate – Collaborate doesn't currently have any automated speech recognition available. We recommend that you download the recording, upload it to Media Hopper Create, and use the captioning features there to caption your recordings.
If you are correcting the captions for media where you are not the speaker, access to a script or speaker notes can help a lot when correcting errors in the automatic captions, especially with less familiar technical terms. Asking the speaker or creator to share these with you can save a lot of time.
Can I employ students to help correct captions for my content?
If you prefer to have a dedicated captioning resource, you may wish to consider employing students as captioners, as some Schools and Colleges have already successfully done. You can employ students via People and Money, advertising the role as a guaranteed-hours post on CareerHub. The Digital Skills Team will provide training to develop captioning expertise.
The University also has access to the Unitemps student recruitment service for hiring students for temporary work. Some students are already trained in, and have experience of, editing captions. You can find out more about Unitemps at the University of Edinburgh here: Unitemps - Contact Us.
This kind of work has real value for students in learning digital skills and subject knowledge. We interviewed students to find out what they enjoyed about their jobs as captioners.
You can find out more about the work we have done to support subtitling on the Subtitling Media Pilot Project page on the University of Edinburgh website.
The pilot service explored a new way of working, blending automation with human intervention. Automated subtitling services are notoriously inaccurate and require checking, so in the pilot subtitles were automatically generated and a student team acted as human mediators, checking and correcting the subtitles and drawing on their own knowledge and experience of the HE sector in the process. The pilot identified a number of key findings relating to student employment, the art of subtitling and digital skills development:
The project was able to recruit motivated and competent students, and the work pattern complemented study commitments
Students valued the work, which they found to be meaningful and purposeful, and were able to produce high quality subtitles for varied media
Students gained valuable experience in a positive and dynamic working environment
Subtitling requires an investment of time, particularly if the media contains strong accents, scientific or technical content, or inconsistent sound quality
The quality of automated subtitles continues to improve with advances in speech-to-text technology enabled by the availability of large data sets
Staff welcomed the opportunity to attend training to improve their own subtitling capability
You can read the full Project report as a PDF.
In addition to adding captions and subtitles, what else can I do to make my content accessible?
Here are some ideas:
- Include complicated or unfamiliar terms in the text of your slides so that students can read them there even if the captions don’t capture them correctly.
- Provide transcripts of your lecture or your own notes so that there is a written version of your content.
- Record your media in shorter chunks, each covering a single main topic. This helps the viewer focus more easily on the topic at hand.
- If you are recording your lectures, please see our 'Making lectures accessible' guidance for more suggestions.
Will the automated captioning robots improve?
Research and development
The accuracy of machine-generated subtitles is influenced by many factors, such as sound quality, speech characteristics (accent, hesitation and volume), content (abbreviations, scientific or technical language) and poor synchronisation of the subtitle track with the audio. Speech recognition technology is a fast-developing field, already influencing our engagement with technology at home and at work through devices that respond to spoken commands and virtual assistants. As speech-to-text technologies develop, the need for human correction of subtitles should reduce over time as the accuracy of automated outputs improves.
To ensure the University has sight of technology trends in this area, and to understand how they might influence service development over time, ISG ran an event for staff ("I'm sorry, could you repeat that?") with guest speakers from the University's Institute for Language, Computation, and Cognition (ILCC) and Quorate Technology. The Project Team held a number of meetings with Professor Steve Renals (Chair of Speech Technology, ILCC) and with Quorate Technology to begin to understand technology developments and how we might take advantage of opportunities for funding or partnerships in this area.
Students with adjustments for subtitling
The Student Disability Service runs an annual survey for all students with adjustments. A number of questions relating to subtitling were added to gather feedback directly from students with adjustments for subtitling about their experience.