Caption Formats

Status: This is an incomplete, unapproved draft. The current draft is at wai-media-guide.netlify.com/

Nearly all modern browsers and media players support the display of closed captions. However, they do not all support the same caption-file formats. The formats most commonly used for online media are WebVTT, TTML and SRT.

Standalone players typically support WebVTT and/or TTML. Streaming media services typically use TTML to convey captions to users.

WebVTT and TTML provide a full range of markup for styling, timing and placement. SRT is a bare-bones format that displays unstyled text only, although some user agents may support basic styling commands (such as bold or italic text) if they are present in the caption file.

Web browsers support various caption formats, as shown in the table below.

Browser	OS	Supported caption format(s)
Firefox	Windows, OS X, Android, iOS	WebVTT
IE 10, 11; Edge	Windows	TTML, WebVTT
Safari	OS X; iOS	WebVTT
Chrome	Windows, OS X, Chrome OS, Android, iOS	WebVTT

SRT is not supported natively by any browser, but it is supported by most other types of media players, including those provided by popular video-hosting services and some social-media platforms, as well as custom players.

WebVTT, TTML and SRT are "sidecar" files, which is to say they are transmitted separately from their corresponding video files (riding alongside the video data in the delivery stream, rather than being embedded directly into the video file), and are synchronized and displayed by the user agent at the time of playback.

Distributing captions

Captions are distributed to viewers using HTML5's track element, which was created specifically for carrying text tracks, such as captions, subtitles and text-based audio descriptions. track is used as a child element of the video element:

Code snippet:

<video controls>
    <source src="myvideo.mp4" type="video/mp4" />
    <track kind="captions" src="myvideo_captions.vtt" srclang="en" label="Captions" default />
</video>

In the example above, the kind attribute is set to "captions" to identify the type of text track. The label attribute is set to "Captions," which is the visible text (or label) that the user agent will display to identify the track to the user. Learn more about attributes for the track element.
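
The same pattern extends to more than one text track. The following is a minimal sketch, not a definitive implementation: it assumes a second, Spanish-language file exists, and the file names myvideo_captions_en.vtt and myvideo_subtitles_es.vtt are hypothetical. It shows how kind, srclang, label and default might be combined when captions and translated subtitles are offered together.

```html
<video controls>
    <source src="myvideo.mp4" type="video/mp4" />
    <!-- English captions; "default" asks the user agent to enable this track
         unless the user's preferences indicate another choice -->
    <track kind="captions" src="myvideo_captions_en.vtt" srclang="en" label="English captions" default />
    <!-- Spanish subtitles (hypothetical file); srclang is required when kind is "subtitles" -->
    <track kind="subtitles" src="myvideo_subtitles_es.vtt" srclang="es" label="Español" />
</video>
```

Only one track of a given kind should carry the default attribute. Giving each track a distinct label helps users tell the options apart in the player's caption menu.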

These tutorials provide best-practice guidance on implementing accessibility in different situations. This page combines the following WCAG 2.0 success criteria and techniques from different conformance levels:

Success Criteria:

1.2.2 Captions (Prerecorded): Captions are provided for all prerecorded audio content in synchronized media, except when the media is a media alternative for text and is clearly labeled as such. (Level A)

Techniques:

Distributing captions

Related WCAG 2.0 resources