The basic metronome for reacting to playing media over time is the timeupdate event of HTML's media elements.
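The timeupdate event is coarse-grained, so applications that need finer updates often sample currentTime from a rendering loop instead. A minimal sketch of both patterns, assuming a video element obtained elsewhere (the function names are illustrative):

```javascript
// Coarse updates: `timeupdate` fires only a few times per second,
// at a browser-dependent rate.
function watchPlayback(video, onTime) {
  video.addEventListener('timeupdate', () => onTime(video.currentTime));
}

// Finer-grained sampling tied to the rendering loop, via
// requestAnimationFrame.
function watchPlaybackFine(video, onTime) {
  function tick() {
    if (!video.paused && !video.ended) onTime(video.currentTime);
    requestAnimationFrame(tick);
  }
  requestAnimationFrame(tick);
}
```

Neither approach gives hard accuracy guarantees, which is why tighter synchronization needs call for the APIs discussed below.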
The Web Audio API defines a full-fledged audio processing API that exposes the precise time at which audio will be played, including the latency introduced by the output pipeline. This allows very tight synchronization between different audio processing events happening in the local audio context.
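A sketch of how the exposed latency can be combined with the audio clock to estimate when a scheduled sound actually becomes audible. The helper and the example values are illustrative; outputLatency support varies across browsers:

```javascript
// Estimate the time, on the AudioContext clock, at which sound scheduled
// for `contextTime` should reach the speakers. All values are in seconds.
function estimateAudibleTime(contextTime, baseLatency, outputLatency) {
  return contextTime + baseLatency + outputLatency;
}

// Browser-side usage (not executed here): schedule a short beep and
// compute when it will be heard.
function scheduleBeep(ctx) {
  const osc = ctx.createOscillator();
  osc.connect(ctx.destination);
  const startAt = ctx.currentTime + 0.1; // schedule 100 ms ahead
  osc.start(startAt);
  osc.stop(startAt + 0.2);
  return estimateAudibleTime(startAt, ctx.baseLatency, ctx.outputLatency || 0);
}
```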
To associate an external text track (e.g. closed captions) with a video, a WebVTT file can be plugged into a video element through HTML's track element.
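A minimal sketch of that wiring, assuming a hypothetical caption file URL, together with an illustrative helper that converts WebVTT cue timestamps ("HH:MM:SS.mmm" or "MM:SS.mmm") to seconds:

```javascript
// Attach an external WebVTT caption file to a video element.
// The file URL, language, and label are illustrative.
function addCaptions(video) {
  const track = document.createElement('track');
  track.kind = 'captions';
  track.src = 'captions.en.vtt'; // hypothetical URL
  track.srclang = 'en';
  track.default = true;
  video.appendChild(track);
}

// Hypothetical helper: convert a WebVTT timestamp to seconds.
function vttTimeToSeconds(ts) {
  const parts = ts.split(':').map(Number);
  while (parts.length < 3) parts.unshift(0); // pad missing hours
  const [h, m, s] = parts;
  return h * 3600 + m * 60 + s;
}
```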
Timestamp accuracy in captured content
The Media Capture and Streams API, which allows Web applications to control microphones and cameras, exposes the target latency of the capture configuration, which can be used to relate timestamps in captured content to events happening in the real world.
Synchronization to a media stream
The Media Timed Events document collects use cases and requirements for improved support for timed events related to audio or video media on the Web, such as subtitles, captions, or other web content that needs to be synchronized to a playing audio or video stream. It makes recommendations for new or changed Web APIs to realize these requirements.
Specifications in progress
Beyond text tracks, Web pages may contain many other time-based animations with which synchronization can be useful; the Web Animations API offers the tools needed to set up these synchronization points. However, it does not allow synchronizing animations to playback of audio or video media.
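As a sketch of the tools the Web Animations API provides, the snippet below starts a scripted animation and returns the Animation object through which its timing can later be inspected or adjusted. The element, keyframes, and helper are illustrative:

```javascript
// Start a time-based animation with the Web Animations API
// (browser-side, not executed here).
function startSlide(element) {
  const anim = element.animate(
    [{ transform: 'translateX(0)' }, { transform: 'translateX(200px)' }],
    { duration: 4000, iterations: 1 }
  );
  anim.playbackRate = 1.0; // can be changed to speed up or slow down
  return anim;
}

// Hypothetical helper: given a shared clock value (ms), compute the
// `currentTime` an animation started at `startedAt` should have.
function animationTimeFor(clockNow, startedAt, playbackRate) {
  return (clockNow - startedAt) * playbackRate;
}
```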
Synchronization to a media stream
Incubation of the DataCue proposal, which builds on top of the DataCue interface formerly defined in HTML, together with TextTrackCue enhancements, could pave the way for a generic solution in that field.
When media resources embed their own text tracks, exposing this in-band information to the Web application enables richer interactions; the Sourcing In-band Media Resource Tracks from Media Containers into HTML document offers guidance on how that in-band information should be exposed in browsers.
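Once exposed, in-band tracks appear in the media element's textTracks list, with metadata tracks carrying application-level cues. A sketch, with illustrative function names:

```javascript
// React to cues from in-band metadata tracks (browser-side, not
// executed here). `onCue` receives each cue as it becomes active.
function watchMetadataTracks(video, onCue) {
  for (const track of metadataTracks(video.textTracks)) {
    track.mode = 'hidden'; // cues keep firing without being rendered
    track.addEventListener('cuechange', () => {
      for (const cue of track.activeCues) onCue(cue);
    });
  }
}

// Helper: pick the tracks of kind "metadata" from a TextTrackList
// (any iterable of track-like objects works).
function metadataTracks(trackList) {
  return Array.from(trackList).filter((t) => t.kind === 'metadata');
}
```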
There are a number of cases that require synchronizing several tracks in the same page, for instance to synchronize the sign-language transcript of an audio track with its associated video. The MediaController interface, initially defined in HTML5, was dropped from the HTML specification due to very limited implementation support. The Timing Object specification proposes a mechanism to bring shared online clocks to browsers and ease synchronization of heterogeneous content within a single device and across devices. The proposed timing model is generic: it could be applied to any timed media, such as audio or video playback, or presentation or animation of web content in general.
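The Timing Object specification describes playback state as a (position, velocity, acceleration, timestamp) vector from which the current position can be computed deterministically on any device sharing the clock. A sketch of that motion model; the class is an illustrative stand-in, not the proposed API:

```javascript
// Deterministic motion model in the spirit of the Timing Object:
// position evolves from a (position, velocity, acceleration, timestamp)
// state vector sampled on a shared clock.
class TimingMotion {
  constructor(position, velocity, acceleration, timestamp) {
    this.p = position;
    this.v = velocity;
    this.a = acceleration;
    this.ts = timestamp; // seconds on the shared clock
  }

  // Position at shared-clock time `t`.
  positionAt(t) {
    const d = t - this.ts;
    return this.p + this.v * d + 0.5 * this.a * d * d;
  }
}
```

Because the state vector is small and the evaluation is deterministic, two devices holding the same vector compute the same position without continuous communication.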
| Feature | Specification / Group | Implementation intents |
| --- | --- | --- |
| Synchronization to a media stream | DataCue (Web Platform Incubator Community Group) | |
| Closed Captioning | Sourcing In-band Media Resource Tracks from Media Containers into HTML (Media Resource In-band Tracks Community Group) | |
| Multi-device synchronization | Timing Object (Multi-Device Timing Community Group) | |
Discontinued features
- MediaController
- The MediaController interface had been introduced in HTML5 to ease the playback synchronization of different media elements within a single page. It was dropped from the HTML 5.1 specification due to very limited implementation support and concerns about the performance and technical viability of such solutions on constrained devices.
Features not covered by ongoing work
- Media-synchronized Web animations
- The Web Animations API allows Web applications to synchronize CSS Transitions, CSS Animations, and SVG animations with a global clock. There is no built-in mechanism to synchronize animations to the playback of audio or video media, which could be useful to enable tighter synchronization in libraries such as Mozilla's Popcorn.js. Possible solutions include an extension to Web Animations to associate animations with a media element, or integration with the more generic Timing Object solution.
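In the absence of a built-in mechanism, one common workaround is to slave an Animation to a media element by copying the media clock into the animation on every timeupdate. A sketch, with illustrative names (and subject to the timeupdate accuracy limits noted earlier):

```javascript
// Manually drive a Web Animation from a media element's clock
// (browser-side, not executed here).
function syncAnimationToMedia(video, anim) {
  anim.pause(); // the media element becomes the clock source
  video.addEventListener('timeupdate', () => {
    anim.currentTime = mediaTimeToAnimationTime(video.currentTime);
  });
}

// Media time is in seconds; Animation.currentTime is in milliseconds.
function mediaTimeToAnimationTime(mediaTimeSeconds) {
  return mediaTimeSeconds * 1000;
}
```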