Copyright © 2020 World Wide Web Consortium. W3C® liability, trademark and permissive document license rules apply.
This document collects use cases and requirements for improved support for building web applications that allow end-users to manipulate professional media assets, including audio-visual masters for television and motion pictures, and perform media production steps such as quality checking, versioning, or timed text authoring.
This document is merely a W3C-internal document. It has no official standing of any kind and does not represent consensus of the W3C Membership.
Professional media assets, including audio-visual masters for television and motion pictures, are increasingly being stored in the cloud.
There is a corresponding growing interest in building web applications that allow end-users to manipulate these assets, e.g., quality checking, versioning, timed text authoring, etc. While the web platform has evolved to support consumer media applications, professional applications require additional capabilities, including precise timing, wider color gamut and high-dynamic range, high-fidelity timed text, etc.
This document analyses gaps in web platform technologies for media production through use cases and requirements.
Concrete use cases are welcome through GitHub issues or pull requests.
This list of gaps is driven by use cases and will be re-evaluated as the list of use cases is completed.
Web applications measure time values with respect to a monotonic clock [HR-TIME]. [HTML] does not expose any precise mechanism to assess the time value, with respect to that clock, at which a particular media frame is going to be rendered. A web application can only look at the media element's currentTime property to infer the frame being rendered and the time at which the user will see the next frame.
This has several limitations:
currentTime is represented as a double value, which does not allow it to identify individual frames due to rounding errors. This is a known issue.
currentTime is updated at a user-agent-defined rate (typically the rate at which the time marches on algorithm runs), and is kept stable while scripts are running. When a web application reads currentTime, it cannot tell when this property was last updated, and thus cannot reliably assess whether this property still represents the frame currently being rendered.
In addition, currentTime only accepts a time value; there is no ability, nor an alternative API, to set its value to a frame number or SMPTE time code [SMPTE12-1].
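As a non-normative sketch, the inference and conversion that applications must implement themselves today could look as follows. Both helper names and the frameRate parameter are assumptions, since the platform exposes neither a frame rate nor a frame-accurate time:

```javascript
// Sketch of what applications must hand-roll today (these helpers are
// illustrative, not platform APIs). frameRate must be known out-of-band,
// e.g. from production metadata, because HTML does not expose it.

// Infer a frame number from a double-valued currentTime. Adding half a
// frame duration before truncating guards against time values that land
// just below an exact frame boundary, but cannot make the result
// frame-accurate in general.
function inferFrameNumber(currentTime, frameRate) {
  return Math.floor(currentTime * frameRate + 0.5);
}

// Convert a frame number to a non-drop-frame SMPTE 12-1 style timecode
// string, assuming an integer frame rate (drop-frame rates such as
// 30000/1001 require the more involved drop-frame algorithm).
function frameToTimecode(frame, frameRate) {
  const pad = (n) => String(n).padStart(2, "0");
  const ff = frame % frameRate;
  const totalSeconds = Math.floor(frame / frameRate);
  const hh = Math.floor(totalSeconds / 3600);
  const mm = Math.floor(totalSeconds / 60) % 60;
  const ss = totalSeconds % 60;
  return `${pad(hh)}:${pad(mm)}:${pad(ss)}:${pad(ff)}`;
}

console.log(frameToTimecode(inferFrameNumber(60, 25), 25)); // "00:01:00:00"
```

Note that nothing here is authoritative: if the assumed frame rate is wrong, or currentTime drifts or rounds, the computed frame number and timecode are silently wrong.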
The media element does not provide a mechanism to seek by individual frames. This can be worked around by using the media's frame rate and the media element's currentTime property to seek by a frame duration from the reported currentTime value, but the currentTime property does not guarantee frame-level precision. In addition, the frame rate of the media may vary over time, may be rounded internally by the browser, and is not exposed.
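The workaround amounts to the following sketch, where frameRate is an application-supplied assumption rather than a value the browser reports, so the resulting seek is not guaranteed to land on the intended frame:

```javascript
// Illustrative workaround only: step a media element by a whole number
// of frames (negative values step backwards). frameRate is an assumption
// the application must supply; the browser neither exposes nor
// guarantees it, and currentTime itself is not frame-accurate.
function seekByFrames(mediaElement, frames, frameRate) {
  mediaElement.currentTime = mediaElement.currentTime + frames / frameRate;
}

// Usage (in a browser):
// seekByFrames(document.querySelector("video"), 1, 25);
```

If the actual frame rate varies over time, or differs from the assumed value after internal rounding, repeated stepping accumulates error and drifts off frame boundaries.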
When appending segments using [MSE], the timestampOffset property does not provide enough precision to identify frame boundaries. It suffers from the same limitation as currentTime: the value is represented as a double, which does not allow it to identify individual frames due to rounding errors.
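The rounding issue can be made concrete without any actual MSE objects. In this minimal sketch, an offset accumulated from double-valued segment durations does not land exactly on the intended boundary:

```javascript
// Minimal illustration of the double-precision issue: accumulating a
// timestampOffset-style value from three 0.1 s segment durations does
// not produce exactly 0.3, so exact comparisons against an intended
// frame or segment boundary fail.
let offset = 0;
for (const segmentDuration of [0.1, 0.1, 0.1]) {
  offset += segmentDuration;
}
console.log(offset === 0.3); // false
console.log(offset);         // 0.30000000000000004
```

The error is tiny, but it is enough to break exact equality with a computed frame boundary, which is why a double-valued offset cannot reliably identify individual frames.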
For playback of remote content (e.g., via the Remote Playback API [REMOTE-PLAYBACK] or the Picture-in-Picture API [PICTURE-IN-PICTURE]), whether the content can be synchronized remains an open question. This has been reported by the Second Screen Community Group.