Well-deployed technologies

To ensure that the user experience of streamed media remains resilient in the face of changing network conditions, media providers use adaptive streaming: content is split into chunks at different quality levels, and the client requests chunks at the appropriate quality level based on network conditions during playback. Several technologies implement this approach (such as MPEG-DASH), but Web browsers do not necessarily support them natively. The Media Source Extensions (MSE) specification enables Web application developers to create libraries that consume these different adaptive streaming formats and protocols.
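For instance, such a library constructs the media stream itself by appending fetched segments to a SourceBuffer. The sketch below assumes hypothetical segment URLs and a single pre-selected quality level; a real adaptive streaming library would measure throughput and pick each segment's quality level accordingly.

```ts
// Minimal MSE sketch: attach a MediaSource to a <video> element and feed
// it media segments fetched over HTTP. Segment URLs are hypothetical.
const video = document.querySelector('video')!;
const mimeType = 'video/mp4; codecs="avc1.42E01E"';

if ('MediaSource' in window && MediaSource.isTypeSupported(mimeType)) {
  const mediaSource = new MediaSource();
  video.src = URL.createObjectURL(mediaSource);

  mediaSource.addEventListener('sourceopen', async () => {
    const sourceBuffer = mediaSource.addSourceBuffer(mimeType);
    // An adaptive streaming library would choose each segment's quality
    // level from measured bandwidth; here segments are fetched in order.
    for (const url of ['/init.mp4', '/segment-1.m4s', '/segment-2.m4s']) {
      const data = await (await fetch(url)).arrayBuffer();
      sourceBuffer.appendBuffer(data);
      // appendBuffer is asynchronous: wait before appending the next one.
      await new Promise((resolve) =>
        sourceBuffer.addEventListener('updateend', resolve, { once: true }));
    }
    mediaSource.endOfStream();
  });
}
```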

Users often want to share a pointer to a specific position within the timeline of a video (or audio) feed with friends on social networks, and expect media players to jump to the requested position right away. The Media Fragments URI specification defines a syntax for constructing URIs that reference fragments of the media content, and explains how Web browsers can use this information to render the media fragment, allowing only the relevant part of the media content to be streamed to the client device.
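As a sketch of the temporal dimension of that syntax, assuming a hypothetical video URL:

```ts
// Media Fragments URI sketch: the #t=start,end fragment asks the player
// to start at (and stop after) the requested time range. The URL below
// is hypothetical.
const video = document.createElement('video');
video.controls = true;
// Render only the portion between 30s and 60s of the media timeline.
video.src = 'https://example.com/videos/keynote.mp4#t=30,60';
document.body.appendChild(video);

// Other forms and dimensions defined by the specification include:
//   #t=npt:00:01:30        temporal offset expressed as normal play time
//   #xywh=160,120,320,240  spatial rectangle (x, y, width, height)
//   #track=audio           select a track by name
```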

Feature            | Specification / Group                                            | Maturity
Adaptive streaming | Media Source Extensions™ (HTML Media Extensions Working Group)  | Recommendation
Fragment streaming | Media Fragments URI 1.0 (basic) (Media Fragments Working Group) | Recommendation

Specifications in progress

To improve the user experience and take advantage of advanced device capabilities when they are available, media providers need to know the decoding (and encoding) capabilities of the user's device. Can the device decode a particular codec at a given resolution, bitrate and framerate? Will the playback be smooth and power efficient? Can the display render high dynamic range (HDR) and wide color gamut content? The Media Capabilities specification defines an API to expose that information, with a view to replacing the more basic and vague canPlayType() and isTypeSupported() functions defined in HTML and MSE.
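A hedged sketch of querying decodingInfo(); the 4K VP9 configuration values are illustrative, and the decision an application takes on the answers is its own choice:

```ts
// Media Capabilities sketch: ask whether a decode configuration would be
// supported, smooth and power-efficient, rather than relying on the
// "maybe"/"probably" answers of canPlayType(). Values are illustrative.
const config: MediaDecodingConfiguration = {
  type: 'media-source',    // 'file' for plain playback
  video: {
    contentType: 'video/webm; codecs="vp9"',
    width: 3840,
    height: 2160,
    bitrate: 10_000_000,   // bits per second
    framerate: 30,
  },
};

const result = await navigator.mediaCapabilities.decodingInfo(config);
if (result.supported && result.smooth && result.powerEfficient) {
  // Safe to offer the 4K rendition of the content.
}
```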

Media providers also need some mechanism to assess the user's perceived playback quality to alter the quality of content transmitted using adaptive streaming. The Media Playback Quality specification, initially part of MSE, exposes metrics on the number of frames that were displayed or dropped.
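For instance, an application could periodically sample these metrics and lower the quality level when too many frames are dropped. In the sketch below, the sampling interval, the 10% threshold and the switchDown() callback are hypothetical application choices:

```ts
// Media Playback Quality sketch: sample dropped vs. total frames to
// detect playback that is not smooth.
const video = document.querySelector('video')!;

function switchDown() {
  // App-specific: request lower-bitrate segments from now on.
}

setInterval(() => {
  const quality = video.getVideoPlaybackQuality();
  const dropped = quality.droppedVideoFrames;
  const total = quality.totalVideoFrames;
  if (total > 0 && dropped / total > 0.1) {
    switchDown();
  }
}, 5000);
```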

When broadcasting popular live events over an IP network, content distribution should introduce as little latency as possible. WebRTC technologies may be used to reach sub-second latency. These technologies may also be used to create a peer-to-peer content delivery network that spreads the load of video distribution among viewers.
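As a minimal sketch of the receiving side, assuming an application-specific signaling channel (not shown), a client could render an incoming real-time stream as follows:

```ts
// WebRTC sketch (receiving side): render an incoming real-time stream.
// Exchanging the offer/answer and ICE candidates happens over an
// app-specific signaling channel, omitted here.
const pc = new RTCPeerConnection();

pc.ontrack = (event) => {
  const video = document.querySelector('video')!;
  video.srcObject = event.streams[0];
  video.play();
};

// Called when the remote offer arrives over the signaling channel.
async function handleOffer(offer: RTCSessionDescriptionInit) {
  await pc.setRemoteDescription(offer);
  const answer = await pc.createAnswer();
  await pc.setLocalDescription(answer);
  // Send `answer` back over the signaling channel.
}
```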

Feature                  | Specification / Group                                                        | Maturity
Capabilities and quality | Media Capabilities (Media Working Group)                                     | Editor's Draft
                         | Media Playback Quality (Media Working Group)                                 | Editor's Draft
Realtime distribution    | WebRTC 1.0: Real-time Communication Between Browsers (WebRTC Working Group)  | Candidate Recommendation

Exploratory work

A media stream may be composed of different programs that use different encodings, or may need to embed ads that have different characteristics. Applications cannot easily and smoothly merge these heterogeneous sub-streams back into one stream using Media Source Extensions (MSE), which does not support this use case. Content providers need to work around this limitation, e.g. by transcoding and re-packaging the ad content to match the program content before distribution, where possible. Work on a codec switching feature for MSE has started to address this issue.
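The proposed mechanism is a changeType() method on SourceBuffer. A hedged sketch, assuming a browser that implements the proposal and hypothetical ad segment data:

```ts
// MSE codec switching sketch, using the proposed SourceBuffer.changeType():
// splice VP9-encoded ad content into an H.264 program without tearing
// down the MediaSource. The adSegment data is hypothetical.
function spliceAd(sourceBuffer: SourceBuffer, adSegment: ArrayBuffer) {
  if ('changeType' in sourceBuffer) {
    // Subsequent appends are parsed with the new byte stream format/codec.
    sourceBuffer.changeType('video/webm; codecs="vp9"');
    sourceBuffer.appendBuffer(adSegment);
  } else {
    // Without the proposal, the ad must be transcoded and re-packaged to
    // match the program content before distribution.
    throw new Error('codec switching not supported');
  }
}
```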

Some devices (e.g. TV sets) provide access to non-IP broadcast media; the Tuner API in the TV Control specification brings these non-IP media streams to Web browsers. The API also explores ways to expose and launch broadcast-related applications that are sometimes transmitted within the transport stream. Work on the API itself has been discontinued while scope, use cases and requirements are refined in the Media and Entertainment Interest Group, so the scope of the API may change in the future.

Different media containers, used to transport media over the network, embed different kinds of in-band tracks, which must be mapped onto video, audio and text tracks in HTML5 so that Web applications can access them interoperably across Web browsers. The Sourcing In-band Media Resource Tracks from Media Containers into HTML specification provides such mapping guidelines.
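Once that mapping is in place, an application can enumerate in-band tracks through the regular HTML track APIs, whatever the container format. A sketch (browser support for the audioTracks/videoTracks lists varies):

```ts
// In-band tracks sketch: enumerate tracks that the user agent mapped
// from the media container onto HTML track objects.
const video = document.querySelector('video')!;

video.addEventListener('loadedmetadata', () => {
  // In-band captions/subtitles surface as TextTrack objects.
  for (const track of Array.from(video.textTracks)) {
    console.log(`text track: kind=${track.kind} language=${track.language}`);
    if (track.kind === 'captions') {
      track.mode = 'showing';
    }
  }
  // Alternate in-band renditions surface on audioTracks/videoTracks,
  // where the browser implements those lists.
  const audioTracks = (video as any).audioTracks;
  if (audioTracks) {
    console.log(`${audioTracks.length} audio track(s)`);
  }
});
```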

Devices such as HDMI dongles and lightweight set-top boxes may not have the power needed to run a Web browser locally, and updating their firmware may be difficult. One possible solution is to move the processing and rendering of the browser to the cloud and to stream the result to the device. The Cloud Browser Architecture specification describes the concepts and architecture of such a Cloud Browser, to provide building blocks for potential further standardization.

Features not covered by ongoing work

Streaming HTTP media on HTTPS pages
While it is possible to play audio/video content served over HTTP in <audio> and <video> elements on HTTPS pages (despite the usual mixed content restrictions), this does not extend to streaming enabled by Media Source Extensions. There have been early discussions on how this could be solved; the consensus was not to include possible solutions in the current version of the specification.
Multicast distribution
In typical unicast distribution schemes, the cost of the infrastructure grows with the number of viewers, making it impractical to stream large live events, such as the finals of world championships, that attract millions of viewers at once. The ability to distribute content using multicast would address these network scaling issues, allowing media stream resources to be pushed to many clients at once. The Media & Entertainment Interest Group discussed possible extensions to fetch that could enable server push and multicast distribution.