Well-deployed technologies

All the basic interactions one expects from a media player (play, pause, playback rate, etc.) are available in JavaScript via the HTMLMediaElement interface defined in HTML.
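As a minimal sketch of these interactions, the helpers below drive any object that has the `play()`, `pause()`, `paused`, `currentTime` and `playbackRate` members of `HTMLMediaElement`. The `PlayerLike` interface, the helper names and the list of rate steps are assumptions made for illustration; they are not part of HTML.

```typescript
// Structural subset of HTMLMediaElement used by this sketch (hypothetical name).
interface PlayerLike {
  paused: boolean;
  currentTime: number;  // position in seconds
  playbackRate: number; // 1 is normal speed
  play(): Promise<void>;
  pause(): void;
}

// Switch between the playing and paused states.
function togglePlayback(player: PlayerLike): void {
  if (player.paused) {
    // play() returns a promise that can reject, for example when the
    // browser refuses to start playback; the player then stays paused.
    player.play().catch(() => { /* playback was refused */ });
  } else {
    player.pause();
  }
}

// Seek relative to the current position, never before the start.
function seekBy(player: PlayerLike, deltaSeconds: number): void {
  player.currentTime = Math.max(0, player.currentTime + deltaSeconds);
}

// Step through an arbitrary list of playback rates, wrapping around.
const RATE_STEPS = [0.5, 1, 1.5, 2];

function cyclePlaybackRate(player: PlayerLike): number {
  const index = RATE_STEPS.indexOf(player.playbackRate);
  player.playbackRate = RATE_STEPS[(index + 1) % RATE_STEPS.length] ?? 1;
  return player.playbackRate;
}
```

In a page, a `<video>` or `<audio>` element can be passed directly, since it provides all the members listed in `PlayerLike`.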

The Web Audio API defines a full-fledged audio processing API that gives precise control over the playback of audio content.

Users often want to control the position at which a video or audio feed starts when they share media content with friends on social networks. The Media Fragments URI specification defines a syntax for constructing media fragment URIs and explains how Web browsers can use this information to jump to the right position in the media timeline.
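For illustration, the sketch below builds and reads the temporal dimension of a media fragment URI, such as `#t=10,20` for the range from 10 to 20 seconds. It only handles plain seconds in Normal Play Time; the `hh:mm:ss` form and the other time schemes in the specification are left out. The function names and the example URLs are hypothetical.

```typescript
interface TimeRange {
  start: number;      // seconds
  end: number | null; // null means "until the end of the media"
}

// Replace any existing fragment with a temporal fragment.
function withTemporalFragment(mediaUrl: string, startSeconds: number, endSeconds?: number): string {
  if (startSeconds < 0 || (endSeconds !== undefined && endSeconds <= startSeconds)) {
    throw new RangeError("expected 0 <= start < end");
  }
  const hashIndex = mediaUrl.indexOf("#");
  const base = hashIndex < 0 ? mediaUrl : mediaUrl.slice(0, hashIndex);
  const range = endSeconds === undefined ? String(startSeconds) : startSeconds + "," + endSeconds;
  return base + "#t=" + range;
}

// Read "t=start,end", "t=start" or "t=,end" from the fragment, if present.
function parseTemporalFragment(uri: string): TimeRange | null {
  const hashIndex = uri.indexOf("#");
  if (hashIndex < 0) return null;
  for (const pair of uri.slice(hashIndex + 1).split("&")) {
    const m = /^t=(?:npt:)?(\d+(?:\.\d+)?)?(?:,(\d+(?:\.\d+)?))?$/.exec(pair);
    if (m && (m[1] !== undefined || m[2] !== undefined)) {
      return {
        start: m[1] !== undefined ? Number(m[1]) : 0, // a missing start means 0
        end: m[2] !== undefined ? Number(m[2]) : null,
      };
    }
  }
  return null;
}
```

A sharing dialog could call `withTemporalFragment("https://example.org/video.mp4", 10, 20)` to produce the link, and a custom player could call `parseTemporalFragment` on the page URL to decide where to seek.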

| Feature | Specification | Group | Maturity |
| --- | --- | --- | --- |
| Control of one local media element | HTMLMediaElement interface in HTML Standard | | Living Standard |
| Audio control | Web Audio API | Audio Working Group | Candidate Recommendation |
| Control position in media timeline when sharing | Media Fragments URI 1.0 (basic) | Media Fragments Working Group | |

Specifications in progress

Many media-capable devices have dedicated hardware keys and buttons for controlling media playback; the media key values defined in the UI Events KeyboardEvent key Values specification (formerly part of DOM Level 3 Events) expose these interactions to Web-based players.
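As a small sketch, a Web-based player could translate the `key` value of a keyboard event into a player action. The key names below (`MediaPlayPause`, `MediaTrackNext`, and so on) are key values defined for media keys; the `PlayerAction` type and the `actionForKey` function are hypothetical names chosen for this example.

```typescript
type PlayerAction = "toggle" | "play" | "pause" | "stop" | "next" | "previous";

// Map a KeyboardEvent `key` value to a player action, or null if the key
// is not one of the media keys handled here.
function actionForKey(key: string): PlayerAction | null {
  switch (key) {
    case "MediaPlayPause": return "toggle";
    case "MediaPlay": return "play";
    case "MediaPause": return "pause";
    case "MediaStop": return "stop";
    case "MediaTrackNext": return "next";
    case "MediaTrackPrevious": return "previous";
    default: return null;
  }
}

// In a page, this would typically be called from a keydown listener:
//   document.addEventListener("keydown", (event) => {
//     const action = actionForKey(event.key);
//     if (action !== null) { /* forward the action to the player */ }
//   });
```

Whether a given key press actually reaches the page depends on the device and the browser, so a player should not rely on these keys as its only controls.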

For immersive media experiences enabled by Virtual Reality headsets, media content needs to be adjusted based on the viewer’s head position, orientation and velocity. The work on the WebXR Device API (formerly WebVR) specification addresses this need. The specification also exposes user input mechanisms used in such scenarios.

In any given Web application, media playback can be in competition with media playback from other applications, and the underlying operating system is in charge of determining which of these applications should have the media focus. The Media Session specification exposes these changes of focus to Web applications.
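One part of the Media Session API that applications can use today is `setActionHandler()`, which lets a page react to platform-level media actions such as `play`, `pause`, `nexttrack` and `previoustrack`. The sketch below wires those actions to a player. `SessionLike`, `TransportLike` and `bindSessionActions` are hypothetical names; in a browser the session object would be `navigator.mediaSession`.

```typescript
// Structural subset of the MediaSession interface used by this sketch.
interface SessionLike {
  setActionHandler(action: string, handler: (() => void) | null): void;
}

// The application's own player controls (hypothetical).
interface TransportLike {
  play(): void;
  pause(): void;
  next(): void;
  previous(): void;
}

// Route platform media actions (lock screen, headset buttons, and so on)
// to the application's player.
function bindSessionActions(session: SessionLike, transport: TransportLike): void {
  session.setActionHandler("play", () => transport.play());
  session.setActionHandler("pause", () => transport.pause());
  session.setActionHandler("nexttrack", () => transport.next());
  session.setActionHandler("previoustrack", () => transport.previous());
}

// In a browser that supports the API:
//   if ("mediaSession" in navigator) {
//     bindSessionActions(navigator.mediaSession, myPlayer);
//   }
```

Passing `null` as the handler removes a previously registered action, which is useful when a playlist has no next or previous track.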

Remote controllers and gamepads are commonly used to navigate ten-foot user interfaces on televisions. They usually feature 4-way keys, which make them particularly suitable for 2D navigation, where focus moves between focusable elements based on their position on the screen. This behavior can be achieved through JavaScript to some extent, but cannot cater for all cases and negatively impacts performance due to the need to access the DOM repeatedly. The Spatial navigation specification defines extensions to CSS and an API for this navigation paradigm to be supported across browsers.
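To give an idea of what the JavaScript approach involves, here is a simplified sketch that picks the next element to focus from a list of bounding boxes. It is not the algorithm defined by the Spatial navigation specification: the scoring rule and its weight are arbitrary assumptions, and the `Box` type and `nextFocus` function are hypothetical. In a real page the boxes would come from repeated calls such as `getBoundingClientRect()`, which is the DOM access cost mentioned above.

```typescript
interface Box { id: string; x: number; y: number; width: number; height: number }
type Direction = "up" | "down" | "left" | "right";

function centerOf(box: Box): { cx: number; cy: number } {
  return { cx: box.x + box.width / 2, cy: box.y + box.height / 2 };
}

// Signed distance travelled in the requested direction (screen coordinates,
// y grows downwards). Positive means the candidate lies in that direction.
function distanceAlong(dx: number, dy: number, direction: Direction): number {
  switch (direction) {
    case "right": return dx;
    case "left": return -dx;
    case "down": return dy;
    case "up": return -dy;
  }
}

function nextFocus(current: Box, candidates: Box[], direction: Direction): Box | null {
  const from = centerOf(current);
  let best: Box | null = null;
  let bestScore = Infinity;
  for (const candidate of candidates) {
    if (candidate.id === current.id) continue;
    const to = centerOf(candidate);
    const dx = to.cx - from.cx;
    const dy = to.cy - from.cy;
    const along = distanceAlong(dx, dy, direction);
    if (along <= 0) continue; // not in the requested direction
    const horizontal = direction === "left" || direction === "right";
    const across = horizontal ? Math.abs(dy) : Math.abs(dx);
    const score = along + 2 * across; // arbitrary penalty for misalignment
    if (score < bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return best;
}
```

Even this small version ignores scrolling, overlapping elements and focus containers, which illustrates why script-based solutions cannot cater for all cases.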

| Feature | Specification | Group | Maturity |
| --- | --- | --- | --- |
| Key-based control | Media keys in UI Events KeyboardEvent key Values | Web Platform Working Group | Candidate Recommendation |
| Virtual reality control | WebXR Device API | Immersive Web Working Group | Working Draft |
| OS-wide media control | Media Session | Media Working Group | Editor's Draft |
| User Interface | CSS Spatial Navigation Level 1 | CSS Working Group | Working Draft |

Exploratory work

Devices that can access broadcast content need to offer ways to select which broadcast signal to play; the TV Control API is an attempt to provide such an interface for broadcast TV and radio. The API also exposes the Electronic Program Guide when it is streamed with the media content, and explores ways to expose applications embedded in media streams, as is common in interactive TV platforms (ATSC, HbbTV, Hybridcast). This work has been discontinued for lack of interest among potential implementers in the API as specified. Discussions on scope, use cases and requirements for these features have now moved to the Media and Entertainment Interest Group.

The Multi-Device Timing Community Group is exploring media control across devices: its Timing Object specification enables coordination of the playback of multiple video, audio and other data streams in close synchrony, within a single device and across devices, independently of the network topology. This proposal would replace and extend the MediaController interface, which was dropped from HTML for lack of implementations.
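The idea can be sketched with a shared state vector, assuming the model of position, velocity, acceleration and timestamp described in the Timing Object draft: every device that knows the vector and shares a clock can compute the expected position independently. The type and function names below are hypothetical and do not reproduce the draft's interfaces.

```typescript
// Snapshot of the shared timeline at a given clock time.
interface TimingVector {
  position: number;     // media position in seconds at `timestamp`
  velocity: number;     // 1 = playing at normal speed, 0 = paused
  acceleration: number; // usually 0
  timestamp: number;    // shared clock time in seconds when the vector was set
}

// Expected position at `nowSeconds`, using uniformly accelerated motion.
function positionAt(vector: TimingVector, nowSeconds: number): number {
  const dt = nowSeconds - vector.timestamp;
  return vector.position + vector.velocity * dt + 0.5 * vector.acceleration * dt * dt;
}

// How far a local player is from the shared timeline; positive means ahead.
function driftSeconds(localPosition: number, vector: TimingVector, nowSeconds: number): number {
  return localPosition - positionAt(vector, nowSeconds);
}
```

A player on each device could periodically compare its `currentTime` with `positionAt(...)` and seek or adjust its playback rate when the drift exceeds a tolerance of its choosing.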

To preserve bandwidth, memory and battery on mobile, and prevent possibly unwanted media playback, browsers have put autoplay policies into place and may deny automated playback of media content. The Autoplay Policy Detection specification is an early proposal to let applications know whether autoplay will succeed for a given media element.
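As a sketch of how an application might use such a signal, the function below turns a detected policy into a playback plan. It assumes the three policy values from the current draft (`allowed`, `allowed-muted`, `disallowed`), which may change as the proposal evolves; `DetectedPolicy`, `PlaybackPlan` and `planPlayback` are hypothetical names.

```typescript
// Policy values assumed from the Autoplay Policy Detection draft.
type DetectedPolicy = "allowed" | "allowed-muted" | "disallowed";

interface PlaybackPlan {
  autoStart: boolean; // start playback without a user gesture
  muted: boolean;     // start with the sound off
}

function planPlayback(policy: DetectedPolicy): PlaybackPlan {
  switch (policy) {
    case "allowed":
      return { autoStart: true, muted: false };
    case "allowed-muted":
      return { autoStart: true, muted: true };
    case "disallowed":
    default:
      // Wait for the user to press play; unknown values are treated the same way.
      return { autoStart: false, muted: false };
  }
}

// In a browser that implements the draft, the policy would be read with
// something like: navigator.getAutoplayPolicy("mediaelement")
```

Where the detection API is not available, applications can still call `play()` and handle a rejected promise as the sign that autoplay was denied.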

| Feature | Specification | Group |
| --- | --- | --- |
| Tuner control | TV Control API | TV Control Working Group |
| Multi-device media control | Timing Object | Multi-Device Timing Community Group |
| Autoplay | Autoplay Policy Detection | Media Working Group |

Features not covered by ongoing work

360° video controls
To render 360° videos, applications can either rely on native support for 360° video codecs from web browsers and stick to integrated controls, or they can provide their own controls but then need to handle the underlying projection logic and adaptive streaming programmatically. Exposing controls for 360° videos on the HTMLMediaElement interface would allow applications to customize the user experience and still take advantage of native 360° video playback support when available.