Initial Author of this Specification was Ian Hickson, Google Inc., with the following copyright statement:
© Copyright 2004-2011 Apple Computer, Inc., Mozilla Foundation, and Opera Software ASA. You are granted a license to use, reproduce and create derivative works of this document.
All subsequent changes since 26 July 2011 done by the W3C WebRTC Working Group and the Device APIs Working Group are under the following Copyright:
© 2011-2015 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved. Document use rules apply.
For the entire publication on the W3C site the liability and trademark rules apply.
This document defines a set of JavaScript APIs that allow local media, including audio and video, to be requested from a platform.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document is not complete. It is subject to major changes and, while early experimentation is encouraged, it is not yet intended for implementation. The API is based on preliminary work done in the WHATWG.
This document was published by the Web Real-Time Communication Working Group and Device APIs Working Group as an Editor's Draft. If you wish to make comments regarding this document, please send them to public-media-capture@w3.org (subscribe, archives). All comments are welcome.
Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures (Web Real-Time Communication Working Group, Device APIs Working Group) made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 14 October 2005 W3C Process Document.
This section is non-normative.
This document defines APIs for requesting access to local multimedia devices, such as microphones or video cameras.
This document also defines the MediaStream API, which provides the means to control where multimedia stream data is consumed, and provides some control over the devices that produce the media. It also exposes information about devices able to capture and render media.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY, MUST, MUST NOT, NOT REQUIRED, and SHOULD are to be interpreted as described in [RFC2119].
This specification defines conformance criteria that apply to a single product: the User Agent that implements the interfaces that it contains.
Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)
Implementations that use ECMAScript [ECMA-262] to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [WEBIDL], as this specification uses that specification and terminology.
The EventHandler interface represents a callback used for event handlers as defined in [HTML5].
The concepts queue a task and fires a simple event are defined in [HTML5].
The terms event handlers and event handler event types are defined in [HTML5].
A source is the "thing" providing the source of a media stream track. The source is the broadcaster of the media itself. A source can be a physical webcam, microphone, local video or audio file from the user's hard drive, network resource, or static image. Note that this document describes the use of microphone and camera type sources only; the use of other source types is described in other documents.
An application that has no prior authorization regarding sources is only given the number of available sources, their type and any relationship to other devices. Additional information about sources can become available when applications are authorized to use a source (see 9.2.3 Access control model).
Sources do not have constraints — tracks have constraints. When a source is connected to a track, it must produce media that conforms to the constraints present on that track. Multiple tracks can be attached to the same source. User Agent processing, such as downsampling, MAY be used to ensure that all tracks have appropriate media.
Sources are detached from a track when the track is ended for any reason.
Sources have constrainable properties which have capabilities and settings. The constrainable properties are "owned" by the source and are common to any (multiple) tracks that happen to be using the same source (e.g., if two different track objects bound to the same source ask for the same capability or setting information, they will get back the same answer).
A setting refers to the immediate, current value of the source's constrainable properties. Settings are always read-only.
A source's settings can change dynamically over time due to environmental conditions, sink configurations, or constraint changes. A source's settings must always conform to the current set of mandatory constraints on all attached tracks. A source that cannot conform to mandatory constraints causes affected tracks to become overconstrained and therefore muted. A user agent attempts to ensure that sources adhere to optional constraints as closely as possible, see 11. Constrainable Pattern.
Although settings are a property of the source, they are only exposed to the application through the tracks attached to the source. This is done via the ConstrainablePattern interface.

For each constrainable property, there is a capability that describes whether it is supported by the source and, if so, the range of supported values. As with settings, capabilities are exposed to the application via the ConstrainablePattern interface.
The values of the supported capabilities must be normalized to the ranges and enumerated types defined in this specification.
A getCapabilities() call on a track returns the same underlying per-source capabilities for all tracks connected to the source.
Source capabilities are effectively constant. Applications should be able to depend on a specific source having the same capabilities for any session.
This API is intentionally simplified. Capabilities are not capable of describing interactions between different values. For instance, it is not possible to accurately describe the capabilities of a camera that can produce a high resolution video stream at a low frame rate and lower resolutions at a higher frame rate. Capabilities describe the complete range of each value. Interactions between constraints are exposed by attempting to apply constraints.
Constraints provide a general control surface that allows applications to both select an appropriate source for a track and, once selected, to influence how a source operates.
Constraints limit the range of operating modes that a source can use when providing media for a track. Without provided track constraints, implementations are free to select a source's settings from the full ranges of its supported capabilities. Implementations may also adjust source settings at any time within the bounds imposed by all applied constraints.
getUserMedia() uses constraints to help select an appropriate source for a track and configure it. Additionally, the ConstrainablePattern interface on tracks includes an API for dynamically changing the track's constraints at any later time.

A track will not be connected to a source using getUserMedia() if its initial constraints cannot be satisfied. However, the ability to meet the constraints on a track can change over time, and constraints can be changed. If circumstances change such that constraints cannot be met, the ConstrainablePattern interface defines an appropriate error to inform the application. Section 5 (The model: sources, sinks, constraints, and settings) explains how constraints interact in more detail.
In general, user agents will have more flexibility to optimize the media streaming experience the fewer constraints are applied, so application authors are strongly encouraged to use mandatory constraints sparingly.
For each constrainable property, a constraint exists whose name corresponds with the relevant source setting name and capability name.
RTCPeerConnection is defined in [WEBRTC10].

The two main components in the MediaStream API are the MediaStreamTrack and MediaStream interfaces. The MediaStreamTrack object represents media of a single type that originates from one media source in the User Agent, e.g. video produced by a web camera. A MediaStream is used to group several MediaStreamTrack objects into one unit that can be recorded or rendered in a media element.
Each MediaStream can contain zero or more MediaStreamTrack objects. All tracks in a MediaStream are intended to be synchronized when rendered. This is not a hard requirement, since it might not be possible to synchronize tracks from sources that have different clocks. Different MediaStream objects do not need to be synchronized.
While the intent is to synchronize tracks, it could be better in some circumstances to permit tracks to lose synchronization. In particular, when tracks are remotely sourced and real-time [WEBRTC10], it can be better to allow loss of synchronization than to accumulate delays or risk glitches and other artifacts. Implementations are expected to understand the implications of choices regarding synchronization of playback and the effect that these have on user perception.
A single MediaStreamTrack can represent multi-channel content, such as stereo or 5.1 audio or stereoscopic video, where the channels have a well defined relationship to each other. Information about channels might be exposed through other APIs, such as [WEBAUDIO], but this specification provides no direct access to channels.
A MediaStream object has an input and an output that represent the combined input and output of all the object's tracks. The output of the MediaStream controls how the object is rendered, e.g., what is saved if the object is recorded to a file or what is displayed if the object is used in a video element. A single MediaStream object can be attached to multiple different outputs at the same time.
A new MediaStream object can be created from existing media streams or tracks using the MediaStream() constructor. The constructor argument can either be an existing MediaStream object, in which case all the tracks of the given stream are added to the new MediaStream object, or an array of MediaStreamTrack objects. The latter form makes it possible to compose a stream from tracks of different streams.
Both MediaStream and MediaStreamTrack objects can be cloned. A cloned MediaStream contains clones of all member tracks from the original stream. A cloned MediaStreamTrack has a set of constraints that is independent of the instance it is cloned from, which allows media from the same source to have different constraints applied for different consumers.

The MediaStream object is also used in contexts outside getUserMedia, such as [WEBRTC10].
The MediaStream() constructor composes a new stream out of existing tracks. It takes an optional argument of type MediaStream or an array of MediaStreamTrack objects. When the constructor is invoked, the User Agent must run the following steps:

1. Let stream be a newly constructed MediaStream object.
2. Initialize stream's id attribute to a newly generated value.
3. If the constructor's argument is present, construct a set of tracks, tracks, based on the type of the argument:
   - A MediaStream object: Let tracks be a set containing all the MediaStreamTrack objects in the given MediaStream's track set.
   - A sequence of MediaStreamTrack objects: Let tracks be a set containing all the MediaStreamTrack objects in the provided sequence.
4. Run the steps for addTrack() on stream for each MediaStreamTrack in tracks.
5. If stream's track set is empty or only contains ended tracks, set stream's active attribute to false; otherwise set it to true.
6. Return stream.
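The active computation in the constructor steps above can be sketched as a small stand-alone function (a simplified model, not the real browser class; track objects here are assumed to expose a readyState of "live" or "ended"):

```javascript
// Simplified model of the constructor's final check: a stream's
// initial "active" attribute is false when the track set is empty
// or contains only ended tracks, and true otherwise.
function initialActive(tracks) {
  return tracks.some(t => t.readyState === "live");
}
```

Since a track has only the two states live and ended, "at least one track that has not ended" is the same test as "at least one live track".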
The tracks of a MediaStream are stored in a track set. The track set MUST contain the MediaStreamTrack objects that correspond to the tracks of the stream. The relative order of the tracks in the set is User Agent defined and the API will never put any requirements on the order. The proper way to find a specific MediaStreamTrack object in the set is to look it up by its id.
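Such a lookup can be sketched as follows (a simplified model over a plain array standing in for the track set; the helper name is illustrative, not part of the API):

```javascript
// Find a track in a track set by its id, returning null when no
// track matches -- the behavior getTrackById() specifies.
function findTrackById(trackSet, trackId) {
  return trackSet.find(t => t.id === trackId) ?? null;
}
```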
An object that reads data from the output of a MediaStream is referred to as a MediaStream consumer. The list of MediaStream consumers currently includes media elements (such as <video> and <audio>) [HTML5], Web Real-Time Communications (WebRTC; RTCPeerConnection) [WEBRTC10], media recording (MediaRecorder) [mediastream-recording], image capture (ImageCapture) [image-capture], and web audio (MediaStreamAudioSourceNode) [WEBAUDIO]. MediaStream consumers must be able to handle tracks being added and removed. This behavior is specified per consumer.
A MediaStream object is said to be active when it has at least one MediaStreamTrack that has not ended. A MediaStream that does not have any tracks or only has tracks that are ended is inactive.
When a MediaStream goes from being active to inactive, the User Agent MUST queue a task that sets the object's active attribute to false and fire a simple event named inactive at the object. When a MediaStream goes from being inactive to active, the User Agent MUST queue a task that sets the object's active attribute to true and fire a simple event named active at the object.
If the stream's activity status changed due to a user request, the task source [HTML5] for this task is the user interaction task source [HTML5]. Otherwise the task source for this task is the networking task source [HTML5].
[Constructor,
 Constructor(MediaStream stream),
 Constructor(sequence<MediaStreamTrack> tracks)]
interface MediaStream : EventTarget {
    readonly attribute DOMString id;
    sequence<MediaStreamTrack> getAudioTracks();
    sequence<MediaStreamTrack> getVideoTracks();
    sequence<MediaStreamTrack> getTracks();
    MediaStreamTrack? getTrackById(DOMString trackId);
    void addTrack(MediaStreamTrack track);
    void removeTrack(MediaStreamTrack track);
    MediaStream clone();
    readonly attribute boolean active;
    attribute EventHandler onactive;
    attribute EventHandler oninactive;
    attribute EventHandler onaddtrack;
    attribute EventHandler onremovetrack;
};
MediaStream()
See the MediaStream constructor algorithm.

MediaStream(stream)
See the MediaStream constructor algorithm.

Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
stream | MediaStream | ✘ | ✘ | |

MediaStream(tracks)
See the MediaStream constructor algorithm.

Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
tracks | sequence<MediaStreamTrack> | ✘ | ✘ | |
active of type boolean, readonly
This attribute is true if the MediaStream is active and false otherwise.

id of type DOMString, readonly
When a MediaStream object is created, the User Agent MUST generate an identifier string, and MUST initialize the object's id attribute to that string. A good practice is to use a UUID [rfc4122], which is 36 characters long in its canonical form. The id attribute MUST return the value to which it was initialized when the object was created.
onactive of type EventHandler
This event handler, of type active, MUST be supported by all objects implementing the MediaStream interface.

onaddtrack of type EventHandler
This event handler, of type addtrack, MUST be supported by all objects implementing the MediaStream interface.

oninactive of type EventHandler
This event handler, of type inactive, MUST be supported by all objects implementing the MediaStream interface.

onremovetrack of type EventHandler
This event handler, of type removetrack, MUST be supported by all objects implementing the MediaStream interface.
addTrack
Adds the given MediaStreamTrack to this MediaStream.

When the addTrack() method is invoked, the User Agent MUST run the following steps:

1. Let track be the MediaStreamTrack argument and stream this MediaStream object.
2. If track is already in stream's track set, then abort these steps.
3. Add track to stream's track set.

Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
track | MediaStreamTrack | ✘ | ✘ | |

Return type: void
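The addTrack() steps above can be sketched over a plain array standing in for the track set (a simplified model; duplicates are rejected exactly as the abort step requires):

```javascript
// Simplified model of addTrack(): adding an already-present track
// is a no-op; otherwise the track joins the set.
function addTrack(trackSet, track) {
  if (trackSet.includes(track)) return; // already in the set: abort
  trackSet.push(track);
}
```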
clone
Clones the given MediaStream and all its tracks.

When the MediaStream.clone() method is invoked, the User Agent MUST run the following steps:

1. Let streamClone be a newly constructed MediaStream object.
2. Initialize streamClone's id attribute to a newly generated value.
3. Let clonedTracks be a list that contains the result of running MediaStreamTrack.clone() on all the tracks in the stream on which this method was called.
4. Set streamClone's track set to clonedTracks.
5. Return streamClone.

Return type: MediaStream
getAudioTracks
Returns a sequence of MediaStreamTrack objects representing the audio tracks in this stream.

The getAudioTracks() method MUST return a sequence that represents a snapshot of all the MediaStreamTrack objects in this stream's track set whose kind is equal to "audio". The conversion from the track set to the sequence is User Agent defined and the order does not have to be stable between calls.

Return type: sequence<MediaStreamTrack>
getTrackById
The getTrackById() method MUST return either a MediaStreamTrack object from this stream's track set whose id is equal to trackId, or null, if no such track exists.

Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
trackId | DOMString | ✘ | ✘ | |

Return type: MediaStreamTrack, nullable

getTracks
Returns a sequence of MediaStreamTrack objects representing all the tracks in this stream.

The getTracks() method MUST return a sequence that represents a snapshot of all the MediaStreamTrack objects in this stream's track set, regardless of kind. The conversion from the track set to the sequence is User Agent defined and the order does not have to be stable between calls.

Return type: sequence<MediaStreamTrack>
getVideoTracks
Returns a sequence of MediaStreamTrack objects representing the video tracks in this stream.

The getVideoTracks() method MUST return a sequence that represents a snapshot of all the MediaStreamTrack objects in this stream's track set whose kind is equal to "video". The conversion from the track set to the sequence is User Agent defined and the order does not have to be stable between calls.

Return type: sequence<MediaStreamTrack>
removeTrack
Removes the given MediaStreamTrack object from this MediaStream.

When the removeTrack() method is invoked, the User Agent MUST remove the MediaStreamTrack object, indicated by the method's argument, from the stream's track set, if present.

Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
track | MediaStreamTrack | ✘ | ✘ | |

Return type: void
A MediaStreamTrack object represents a media source in the User Agent. Several MediaStreamTrack objects can represent the same media source, e.g., when the user chooses the same camera in the UI shown by two consecutive calls to getUserMedia().
The data from a MediaStreamTrack object does not necessarily have a canonical binary form; for example, it could just be "the video currently coming from the user's video camera". This allows User Agents to manipulate media in whatever fashion is most suitable on the user's platform.
A script can indicate that a track no longer needs its source with the MediaStreamTrack.stop() method. When all tracks using a source have been stopped, the given permission for that source is revoked and the source is stopped. If the data is being generated from a live source (e.g., a microphone or camera), then the User Agent SHOULD remove any active "on-air" indicator for that source. An implementation may use a per-source reference count to keep track of source usage, but the specifics are out of scope for this specification.
If there is no stored permission to use that source, the User Agent SHOULD also remove the "permission granted" indicator for the source.
A MediaStreamTrack has two states in its life-cycle: live and ended. A newly created MediaStreamTrack can be in either state depending on how it was created. For example, cloning an ended track results in a new ended track. The current state is reflected by the object's readyState attribute.
In the live state, the track is active and media is available for use by consumers (but may be replaced by zero-information-content if the MediaStreamTrack is muted or disabled, see below).
A muted or disabled MediaStreamTrack renders either silence (audio), black frames (video), or a zero-information-content equivalent. For example, a video element sourced by a muted or disabled MediaStreamTrack (contained within a MediaStream), is playing but the rendered content is the muted output. When all tracks connected to a source are muted or disabled, the "on-air" or "recording" indicator for that source can be turned off; when the track is no longer muted or disabled, it MUST be turned back on.
The muted/unmuted state of a track reflects whether the source provides any media at this moment. The enabled/disabled state is under application control and determines whether the track outputs media (to its consumers). Hence, media from the source only flows when a MediaStreamTrack object is both unmuted and enabled.
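This flow rule reduces to a one-line predicate (a simplified model using the muted and enabled attributes defined below):

```javascript
// Media reaches consumers only when the track is both unmuted
// (source is providing data) and enabled (application allows output).
function outputsMedia(track) {
  return !track.muted && track.enabled;
}
```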
A MediaStreamTrack is muted when the source is temporarily unable to provide the track with data. A track can be muted by a user; this action is often outside the control of the application, for example as a result of the user hitting a hardware switch or toggling a control in the operating system or browser chrome. A track can also be muted by the User Agent.
Applications are able to enable or disable a MediaStreamTrack to prevent it from rendering media from the source. A muted track will, however, regardless of the enabled state, render silence and blackness. A disabled track is logically equivalent to a muted track, from a consumer point of view.
For a newly created MediaStreamTrack object, the following applies: the track is always enabled unless stated otherwise (for example when cloned) and the muted state reflects the state of the source at the time the track is created.
A MediaStreamTrack object is said to end when the source of the track is disconnected or exhausted.
A MediaStreamTrack can be detached from its source, meaning that the track is no longer dependent on the source for media data. If no other MediaStreamTrack is using the same source, the source will be stopped. MediaStreamTrack attributes such as kind and label MUST NOT change values when the source is detached.
When a MediaStreamTrack object ends for any reason (e.g., because the user rescinds the permission for the page to use the local camera, because the application invoked the stop() method on the MediaStreamTrack object, or because the User Agent has instructed the track to end for any reason), it is said to be ended.
When a MediaStreamTrack track ends for any reason other than the stop() method being invoked, the User Agent MUST queue a task that runs the following steps:

1. If the track's readyState attribute has the value ended already, then abort these steps.
2. Set track's readyState attribute to ended.
3. Detach track's source.
4. Fire a simple event named ended at the object.
If the end of the stream was reached due to a user request, the event source for this event is the user interaction event source.
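The queued task above can be sketched as a plain function over a simplified track object (an illustrative model only; the real User Agent detaches the actual source and dispatches a DOM event):

```javascript
// Simplified model of the task queued when a track ends for any
// reason other than stop(): idempotent, returns whether the
// "ended" event would be fired.
function handleTrackEnded(track) {
  if (track.readyState === "ended") return false; // already ended: abort
  track.readyState = "ended";
  track.source = null;  // detach the track's source (simplified)
  return true;          // a simple event named "ended" is fired
}
```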
There are two concepts related to the media flow for a live MediaStreamTrack: muted/not muted, and enabled/disabled.

Muted refers to the input to the MediaStreamTrack. If live samples are not made available to the MediaStreamTrack, it is muted.
Muted is outside the control of the application, but can be observed by the application by reading the muted attribute and listening to the associated events mute and unmute. There can be several reasons for a MediaStreamTrack to be muted: the user pushing a physical mute button on the microphone, the user toggling a control in the operating system, the user clicking a mute button in the browser chrome, the User Agent (on behalf of the user) muting the track, etc.
Enabled/disabled, on the other hand, is available to the application to control (and observe) via the enabled attribute.
The result for the consumer is the same, in the sense that whenever a MediaStreamTrack is muted or disabled (or both) the consumer gets zero-information-content, which means silence for audio and black frames for video. In other words, media from the source only flows when a MediaStreamTrack object is both unmuted and enabled. For example, a video element sourced by a muted or disabled MediaStreamTrack (contained in a MediaStream), is playing but rendering blackness.
For a newly created MediaStreamTrack object, the following applies: the track is always enabled unless stated otherwise (for example when cloned) and the muted state reflects the state of the source at the time the track is created.
Constraints are set on tracks and may affect sources.

Whether Constraints were provided at track initialization time or need to be established later at runtime, the APIs defined in the ConstrainablePattern Interface allow the retrieval and manipulation of the constraints currently established on a track.
Each track maintains an internal version of the Constraints structure, namely a mandatory set of constraints (no duplicates) and an optional ordered list of individual constraint objects (which may contain duplicates). The internal stored constraint structure is exposed to the application by the constraints attribute, and may be modified by the applyConstraints() method.
When applyConstraints() is called, a User Agent MUST queue a task to evaluate those changes when the task queue is next serviced. Similarly, if the sourceType changes, then the User Agent MUST perform the same actions to re-evaluate the constraints of each track affected by that source change.
If the MediaStreamError event named overconstrained is fired, the track MUST be muted until either new satisfiable constraints are applied or the existing constraints become satisfiable.
interface MediaStreamTrack : EventTarget {
    readonly attribute DOMString kind;
    readonly attribute DOMString id;
    readonly attribute DOMString label;
    attribute boolean enabled;
    readonly attribute boolean muted;
    attribute EventHandler onmute;
    attribute EventHandler onunmute;
    readonly attribute boolean _readonly;
    readonly attribute boolean remote;
    readonly attribute MediaStreamTrackState readyState;
    attribute EventHandler onended;
    MediaStreamTrack clone();
    void stop();
    Capabilities getCapabilities();
    MediaTrackConstraints getConstraints();
    Settings getSettings();
    Promise<void> applyConstraints(MediaTrackConstraints constraints);
    attribute EventHandler onoverconstrained;
};
enabled of type boolean
The MediaStreamTrack.enabled attribute controls the enabled state for the object. On getting, the attribute MUST return the value to which it was last set. On setting, it MUST be set to the new value, regardless of whether the MediaStreamTrack object has been detached from its source or not.

Thus, after a MediaStreamTrack is detached from its source, its enabled attribute still changes value when set; it just doesn't do anything with that new value.
id of type DOMString, readonly
Unless a MediaStreamTrack object is created as a part of a special purpose algorithm that specifies how the track id must be initialized, the User Agent MUST generate an identifier string and initialize the object's id attribute to that string. See MediaStream.id for guidelines on how to generate such an identifier.

An example of an algorithm that specifies how the track id must be initialized is the algorithm to represent an incoming network component with a MediaStreamTrack object. [WEBRTC10]

The MediaStreamTrack.id attribute MUST return the value to which it was initialized when the object was created.
kind of type DOMString, readonly
The MediaStreamTrack.kind attribute MUST return the string "audio" if the object represents an audio track, or "video" if the object represents a video track.
label of type DOMString, readonly
User Agents MAY label audio and video sources (e.g., "Internal microphone" or "External USB Webcam"). The MediaStreamTrack.label attribute MUST return the label of the object's corresponding source, if any. If the corresponding source has or had no label, the attribute MUST instead return the empty string.
muted of type boolean, readonly
The MediaStreamTrack.muted attribute MUST return true if the track is muted, and false otherwise.
onended of type EventHandler
This event handler, of type ended, MUST be supported by all objects implementing the MediaStreamTrack interface.

onmute of type EventHandler
This event handler, of type mute, MUST be supported by all objects implementing the MediaStreamTrack interface.

onoverconstrained of type EventHandler
See ConstrainablePattern Interface for the definition of this event handler.

onunmute of type EventHandler
This event handler, of type unmute, MUST be supported by all objects implementing the MediaStreamTrack interface.
readonly of type boolean, readonly
If the track (audio or video) source is a local microphone or camera that is shared so that constraints applied to the track cannot modify the source's settings, the readonly attribute MUST return the value true. Otherwise, it MUST return the value false.
readyState of type MediaStreamTrackState, readonly
The readyState attribute represents the state of the track. It MUST return the value as most recently set by the User Agent.
remote of type boolean, readonly
If the track is sourced by a non-local source, the remote attribute MUST return the value true. Otherwise, it MUST return the value false.
applyConstraints

Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
constraints | MediaTrackConstraints | ✘ | ✘ | A new constraint structure to apply to this object. |

Return type: Promise<void>
clone
Clones the given MediaStreamTrack.

When the MediaStreamTrack.clone() method is invoked, the User Agent MUST run the following steps:

1. Let trackClone be a newly constructed MediaStreamTrack object.
2. Initialize trackClone's id attribute to a newly generated value.
3. Let trackClone inherit this track's underlying source, kind, label, readyState, and enabled attributes, as well as its currently active constraints.
4. Return trackClone.

Return type: MediaStreamTrack
getCapabilities
See ConstrainablePattern Interface for the definition of this method.
Return type: Capabilities

getConstraints
See ConstrainablePattern Interface for the definition of this method.
Return type: MediaTrackConstraints

getSettings
See ConstrainablePattern Interface for the definition of this method.
Return type: Settings
stop
When a MediaStreamTrack object's stop() method is invoked, the User Agent MUST run the following steps:

1. Let track be the current MediaStreamTrack object.
2. If track is sourced by a non-local source, then abort these steps.
3. Set track's readyState attribute to ended.
4. Detach track's source.

The task source for the tasks queued for the stop() method is the DOM manipulation task source.

Return type: void
enum MediaStreamTrackState {
    "live",
    "ended"
};

Enumeration description |
---|---
live | The track is active (the track's underlying media source is making a best-effort attempt to provide data in real time). The output of a track in the live state can be switched on and off with the enabled attribute.
ended | The track has ended (the track's underlying media source is no longer providing data, and will never provide more data for this track). Once a track enters this state, it never exits it. For example, a video track in a MediaStream ends when the user disconnects the device that acts as the track's media source.
enum SourceTypeEnum {
"camera",
"microphone"
};
Enumeration description | |
---|---|
camera | A valid source type only for video |
microphone | A valid source type only for audio |
dictionary MediaTrackConstraints : MediaTrackConstraintSet {
    sequence<MediaTrackConstraintSet> advanced;
};
MediaTrackConstraints Members
advanced of type sequence<MediaTrackConstraintSet>
See Constraints and ConstraintSet for the definition of this element.
dictionary MediaTrackConstraintSet {
    ConstrainLong      width;
    ConstrainLong      height;
    ConstrainDouble    aspectRatio;
    ConstrainDouble    frameRate;
    ConstrainDOMString facingMode;
    ConstrainDouble    volume;
    ConstrainLong      sampleRate;
    ConstrainLong      sampleSize;
    ConstrainBoolean   echoCancellation;
    ConstrainDOMString deviceId;
    ConstrainDOMString groupId;
};
MediaTrackConstraintSet Members
aspectRatio of type ConstrainDouble
deviceId of type ConstrainDOMString
echoCancellation of type ConstrainBoolean
facingMode of type ConstrainDOMString
frameRate of type ConstrainDouble
groupId of type ConstrainDOMString
height of type ConstrainLong
sampleRate of type ConstrainLong
sampleSize of type ConstrainLong
volume of type ConstrainDouble
width of type ConstrainLong
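As a usage sketch, a MediaTrackConstraints value combines basic members from MediaTrackConstraintSet with an ordered list of advanced ConstraintSets; the specific values below are illustrative only:

```javascript
// Illustrative MediaTrackConstraints value for a video track: basic members
// come from MediaTrackConstraintSet, plus an ordered "advanced" list that the
// User Agent tries to satisfy in order.
const videoConstraints = {
  width: { min: 640, ideal: 1280 },
  height: { min: 480, ideal: 720 },
  frameRate: { max: 60 },
  facingMode: "user",
  advanced: [
    { width: 1920, height: 1200 }, // prefer this exact resolution...
    { aspectRatio: 1.333 },        // ...otherwise fall back to 4:3
  ],
};
```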
The addtrack and removetrack events use the MediaStreamTrackEvent interface.
Firing a track event named e with a track means that an event with the name e, which does not bubble (except where otherwise stated) and is not cancelable (except where otherwise stated), and which uses the MediaStreamTrackEvent interface with the track attribute set to track, MUST be created and dispatched at the given target.
dictionary MediaStreamTrackEventInit : EventInit {
    MediaStreamTrack track = null;
};
[Constructor(DOMString type, MediaStreamTrackEventInit eventInitDict)]
interface MediaStreamTrackEvent : Event {
    readonly attribute MediaStreamTrack track;
};
MediaStreamTrackEvent
Constructs a new MediaStreamTrackEvent.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
type | DOMString | ✘ | ✘ | |
eventInitDict | MediaStreamTrackEventInit | ✘ | ✘ | |
track of type MediaStreamTrack, readonly
The track attribute represents the MediaStreamTrack object associated with the event.
MediaStreamTrackEventInit Members
track of type MediaStreamTrack, defaulting to null
This section is non-normative.
Browsers provide a media pipeline from sources to sinks. In a browser, sinks are the <img>, <video>, and <audio> tags. Traditional sources include streamed content, files, and web resources. The media produced by these sources typically does not change over time - these sources can be considered to be static.
The sinks that display these sources to the user (the actual tags themselves) have a variety of controls for manipulating the source content. For example, an <img> tag scales down a huge source image of 1600x1200 pixels to fit in a rectangle defined with width="400" and height="300".
The getUserMedia API adds dynamic sources such as microphones and cameras - the characteristics of these sources can change in response to application needs. These sources can be considered to be dynamic in nature. A <video> element that displays media from a dynamic source can either perform scaling or it can feed back information along the media pipeline and have the source produce content more suitable for display.
Note: This sort of feedback loop is obviously just enabling an "optimization", but it's a non-trivial gain. This optimization can save battery, allow for less network congestion, etc...
Note that MediaStream sinks (such as <video>, <audio>, and even RTCPeerConnection) will continue to have mechanisms to further transform the source stream beyond that which the Settings, Capabilities, and Constraints described in this specification offer. (The sink transformation options, including those of RTCPeerConnection, are outside the scope of this specification.)
The act of changing or applying a track constraint may affect the
settings
of all tracks sharing that source and consequently all
down-level sinks that are using that source. Many sinks may be able to
take these changes in stride, such as the <video>
element or RTCPeerConnection
. Others like the Recorder API
may fail as a result of a source setting change.
The RTCPeerConnection
is an interesting object because
it acts simultaneously as both a sink and a source for
over-the-network streams. As a sink, it has source transformational
capabilities (e.g., lowering bit-rates, scaling-up / down resolutions,
and adjusting frame-rates), and as a source it could have its own settings
changed by a track source (though in this specification sources with the
remote
attribute set to true do not consider the current constraints
applied to a track).
To illustrate how changes to a given source impact various sinks, consider the following example. This example only uses width and height, but the same principles apply to all of the Settings exposed in this specification. In the first figure a home client has obtained a video source from its local video camera. The source's width and height settings are 800 pixels and 600 pixels, respectively. Three MediaStream objects on the home client contain tracks that use this same deviceId. The three media streams are connected to three different sinks: a <video> element (A), another <video> element (B), and a peer connection (C). The peer connection is streaming the source video to a remote client. On the remote client there are two media streams with tracks that use the peer connection as a source. These two media streams are connected to two <video> element sinks (Y and Z).
Note that at this moment, all of the sinks on the home client must apply a transformation to the original source's provided dimension settings. B is scaling the video down, A is scaling the video up (resulting in loss of quality), and C is also scaling the video up slightly for sending over the network. On the remote client, sink Y is scaling the video way down, while sink Z is not applying any scaling.
Using the ConstrainablePattern
interface, one of the tracks
requests a higher resolution (1920 by 1200 pixels) from the home
client's video source.
Note that the source change immediately affects all of the tracks and sinks on the home client, but does not impact any of the sinks (or sources) on the remote client. With the increase in the home client source video's dimensions, sink A no longer has to perform any scaling, while sink B must scale down even further than before. Sink C (the peer connection) must now scale down the video in order to keep the transmission constant to the remote client.
While not shown, an equally valid settings change request could be made on the remote client's side. In addition to impacting sinks Y and Z in the same manner as A, B and C were impacted earlier, it could lead to re-negotiation with the peer connection on the home client in order to alter the transformation that it is applying to the home client's video source. Such a change is NOT REQUIRED to change anything related to sink A or B or the home client's video source.
Note that this specification does not define a mechanism by which a change to the remote client's video source could automatically trigger a change to the home client's video source. Implementations may choose to make such source-to-sink optimizations as long as they only do so within the constraints established by the application, as the next example demonstrates.
It is fairly obvious that changes to a given source will impact sink consumers. However, in some situations changes to a given sink may also cause implementations to adjust a source's settings. This is illustrated in the following figures. In the first figure below, the home client's video source is sending a video stream sized at 1920 by 1200 pixels. The video source is also unconstrained, such that the exact source dimensions are flexible as far as the application is concerned. Two MediaStream objects contain tracks with the same deviceId, and those MediaStreams are connected to two different <video> element sinks A and B. Sink A has been sized to width="1920" and height="1200" and is displaying the source's video content without any transformations. Sink B has been sized smaller and, as a result, is scaling the video down to fit its rectangle of 320 pixels across by 200 pixels down.
When the application changes sink A to a smaller dimension (from 1920 to 1024 pixels wide and from 1200 to 768 pixels tall), the browser's media pipeline may recognize that none of its sinks require the higher source resolution, and needless work is being done both on the part of the source and sink A. In such a case and without any other constraints forcing the source to continue producing the higher resolution video, the media pipeline MAY change the source resolution:
In the above figure, the home client's video source resolution was changed to the greater of that from sink A and B in order to optimize playback. While not shown above, the same behavior could apply to peer connections and other sinks.
It is possible that constraints can be applied to a track
which a source is unable to satisfy, either because the source itself
cannot satisfy the constraint or because the source is already
satisfying a conflicting constraint. When this happens, the promise
returned from
applyConstraints()
will be rejected, without applying any of the new
constraints. Since no change in constraints occurs in this case, there
is also no required change to the source itself as a result of this
condition. Here is an example of this behavior.
In this example, two media streams each have a video track that share the same source. The first track initially has no constraints applied. It is connected to sink N. Sink N has a resolution of 800 by 600 pixels and is scaling down the source's resolution of 1024 by 768 to fit. The other track has a mandatory constraint forcing off the source's fill light; it is connected to sink P. Sink P has a width and height equal to that of the source.
Now, the first track adds a mandatory constraint that the fill light should be forced on. At this point, both mandatory constraints cannot be satisfied by the source (the fill light cannot be simultaneously on and off at the same time). Since this state was caused by the first track's attempt to apply a conflicting constraint, the constraint application fails and there is no change in the source's settings nor to the constraints on either track.
Let's look at a slightly different situation starting from the same point. In this case, instead of the first track attempting to apply a conflicting constraint, the user physically locks the camera into a mode where the fill light is on. At this point the source can no longer satisfy the second track's mandatory constraint that the fill light be off. The second track is transitioned into the muted state and receives an overconstrained event. At the same time, the source notes that its remaining active sink only requires a resolution of 800 by 600 and so it adjusts its resolution down to match (this is an optional optimization that the User Agent is allowed to make given the situation).
At this point, it is the responsibility of the application to address the problem that led to the overconstrained situation, perhaps by removing the fill light mandatory constraint on the second track or by closing the second track altogether and informing the user.
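The all-or-nothing behavior described above — a conflicting constraint causes rejection without any change to the source or to either track — can be sketched as follows; the single-property source model, the fillLight property name, and the track/source shapes are illustrative assumptions:

```javascript
// Sketch: applying a constraint fails atomically when it conflicts with a
// constraint already applied by another track sharing the same source.
// On rejection, nothing is mutated.
function applyConstraintSketch(source, track, constraints) {
  for (const other of source.tracks) {
    if (other === track) continue;
    for (const [name, value] of Object.entries(constraints)) {
      if (name in other.constraints && other.constraints[name] !== value) {
        // Reject without applying any of the new constraints.
        return Promise.reject({
          name: "ConstraintNotSatisfiedError",
          constraintName: name,
        });
      }
    }
  }
  Object.assign(track.constraints, constraints); // all constraints applied
  return Promise.resolve();
}

const source = { tracks: [] };
const t1 = { constraints: {} };                    // sink N's track
const t2 = { constraints: { fillLight: "off" } };  // sink P's track
source.tracks.push(t1, t2);
```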
A MediaStream may be assigned to media elements as defined in HTML5 [HTML5]. A MediaStream is not preloadable or seekable and represents a simple, potentially infinite, linear media timeline. The timeline starts at 0 and increments linearly in real time as long as the MediaStream is playing. The timeline does not increment when the MediaStream is paused.
User Agents that support this specification MUST support the following partial interface, which allows a MediaStream to be assigned directly to a media element.
partial interface HTMLMediaElement {
    attribute MediaStream? srcObject;
};
srcObject of type MediaStream, nullable
Holds the MediaStream that provides media for this element.
This attribute overrides both the src attribute and any <source> elements. Specifically, if srcObject is specified, the User Agent MUST use it as the source of media, even if the src attribute is also set or <source> children are present. If the value of srcObject is replaced or set to null the User Agent MUST re-run the media element load algorithm.
We may want to allow direct assignment of other types as well
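The precedence rule above — srcObject wins over both the src attribute and <source> children — can be sketched as a selection function over a mock element; the element and stream shapes are assumptions, not the DOM:

```javascript
// Sketch of the media-source precedence rule: srcObject, when set, is used
// even if src or <source> children are present. Mock element, not a real DOM.
function selectMediaSource(el) {
  if (el.srcObject != null) return { kind: "object", value: el.srcObject };
  if (el.src) return { kind: "uri", value: el.src };
  if (el.sourceChildren && el.sourceChildren.length > 0)
    return { kind: "uri", value: el.sourceChildren[0] };
  return null;
}

const stream = { id: "stream-1" }; // stand-in for a MediaStream
const el = { srcObject: stream, src: "movie.webm", sourceChildren: ["a.webm"] };
```

Setting srcObject back to null is what triggers re-running the load algorithm, which would then fall back to src or <source> children.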
The User Agent runs the
media element load algorithm to obtain media for the media element
to display. As defined in the [HTML5] specification, this algorithm
has two basic phases: the resource selection algorithm chooses the resource to play and resolves its URI. Then the resource fetch phase loads the resource. Both these phases are
potentially simplified when using a MediaStream. First of all,
srcObject
takes priority over other means of specifying
the resource, and it provides the object itself rather than a URI.
Therefore, there is no need to run the resource selection algorithm.
Secondly, when the User Agent reaches the resource fetch algorithm with a
MediaStream, the MediaStream is a local object so there's nothing to
fetch. Therefore, the following modifications/restrictions to the
media element load algorithm apply:
Whenever the User Agent runs the
media element load algorithm, if srcObject
is
specified, the User Agent must immediately go to the
resource fetch phase of the algorithm.
Whenever the User Agent runs the
media element load algorithm, and reaches the
resource fetch phase of this algorithm, if it determines that
the media resource in question is a MediaStream, it MUST
immediately abort the
resource selection algorithm, and set the
media.readyState
to HAVE_NOTHING if the MediaStream
is inactive,
or HAVE_ENOUGH_DATA if it is active.
For each MediaStreamTrack in the MediaStream, including those that are added after the User Agent enters the media element load algorithm, the User Agent MUST create a corresponding AudioTrack or VideoTrack as defined in [HTML5]. Since the order in the MediaStream's track set is undefined, no requirements are put on how the AudioTrackList and VideoTrackList are ordered.
The properties of the AudioTrack and VideoTrack objects MUST be initialized as follows:
Let AudioTrack.id and VideoTrack.id have the value of the corresponding MediaStreamTrack.id attribute.
Let AudioTrack.kind and VideoTrack.kind be "main".
Let AudioTrack.label and VideoTrack.label have the value of the corresponding MediaStreamTrack.label attribute.
Let AudioTrack.language and VideoTrack.language be the empty string.
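The initialization rules above can be sketched as a mapping function; the plain-object shapes stand in for the platform's MediaStreamTrack and AudioTrack/VideoTrack objects:

```javascript
// Sketch of initializing an HTML AudioTrack/VideoTrack from a
// MediaStreamTrack per the rules above. Plain objects, not platform objects.
function toHtmlTrack(msTrack) {
  return {
    id: msTrack.id,       // same id as the MediaStreamTrack
    kind: "main",         // kind is always "main"
    label: msTrack.label, // same label as the MediaStreamTrack
    language: "",         // language is the empty string
  };
}

const htmlTrack = toHtmlTrack({
  id: "t42", kind: "audio", label: "Built-in Mic",
});
```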
Let the media resource, represented by the MediaStream object, indicate to the media element load algorithm that all audio tracks and all live video tracks (represented by a MediaStreamTrack with the readyState attribute set to live) should be enabled. This allows the media element load algorithm to set AudioTrack.enabled, VideoTrack.selected and VideoTrackList.selectedIndex accordingly.
(Note that since the MediaStream is potentially endless, the User Agent does not exit the media element load algorithm until the MediaStream moves from the active to the inactive state.)
If a MediaStreamTrack is removed from a MediaStream that is played by a media element, the corresponding AudioTrack or VideoTrack MUST be removed as well.
The User Agent MUST NOT buffer data from a MediaStream. When playing, the User Agent MUST always play the current data from the stream.
When the MediaStream state moves from the active to the inactive state, the User Agent MUST raise an
ended event on the media element and set its ended
attribute to true
. Note that once ended
equals true
the media element will not play media
even if new Tracks are added to the MediaStream (causing it to
return to the active state) unless autoplay
is
true
or the JavaScript restarts the element, e.g., by
calling play().
The nature of the MediaStream
places certain
restrictions on the behavior and attribute values of the associated
media element and on the operations that can be performed on it, as
shown below:
Attribute Name | Attribute Type | Valid Values When Using a MediaStream | Additional considerations |
---|---|---|---|
currentSrc | DOMString | the empty string | When srcObject is specified the User Agent MUST set this to the empty string. |
preload | DOMString | none | A MediaStream cannot be preloaded. |
buffered | TimeRanges | buffered.length MUST return 0. | A MediaStream cannot be preloaded. Therefore, the amount buffered is always an empty TimeRange. |
networkState | unsigned short | NETWORK_IDLE | The media element does not fetch the MediaStream so there is no network traffic. |
readyState | unsigned short | HAVE_NOTHING, HAVE_ENOUGH_DATA | A MediaStream may be created before there is any data available, for example when a stream is received from a remote peer. The value of the readyState of the media element MUST be HAVE_NOTHING before the first media arrives and HAVE_ENOUGH_DATA once the first media has arrived. |
currentTime | double | Any non-negative value. The initial value is 0 and the value increments linearly in real time whenever the stream is playing. | The value is the current stream position, in seconds. On any attempt to set this attribute, the User Agent MUST throw an InvalidStateError exception. |
duration | unrestricted double | Infinity | A MediaStream does not have a pre-defined duration. |
seeking | boolean | false | A MediaStream is not seekable. Therefore, this attribute MUST always have the value false. |
defaultPlaybackRate | double | 1.0 | A MediaStream is not seekable. Therefore, this attribute MUST always have the value 1.0 and any attempt to alter it MUST fail. |
playbackRate | double | 1.0 | A MediaStream is not seekable. Therefore, this attribute MUST always have the value 1.0 and any attempt to alter it MUST fail. |
played | TimeRanges | played.length MUST return 1. played.start(0) MUST return 0. played.end(0) MUST return the last known currentTime. | A MediaStream's timeline always consists of a single range, starting at 0 and extending up to the currentTime. |
seekable | TimeRanges | seekable.length MUST return 0. | A MediaStream is not seekable. |
loop | boolean | true, false | Setting the loop attribute has no effect since a MediaStream has no defined end and therefore cannot be looped. |
All promises in this specification, when they are rejected, are rejected with an object that implements the MediaStreamError interface.
All errors defined in this specification implement the following interface:
[NoInterfaceObject]
interface MediaStreamError {
readonly attribute DOMString name;
readonly attribute DOMString? message;
readonly attribute DOMString? constraintName;
};
constraintName of type DOMString, readonly, nullable
This attribute is only used for some types of errors. For a MediaStreamError with a name of ConstraintNotSatisfiedError or OverconstrainedError, this attribute MUST be set to the name of the constraint that caused the error.
message of type DOMString, readonly, nullable
A User Agent-dependent string offering extra human-readable information about the error.
name of type DOMString, readonly
The name of the error.
The following interface is defined for cases when a MediaStreamError is raised as an event:
dictionary MediaStreamErrorEventInit : EventInit {
    MediaStreamError? error = null;
};
[Constructor(DOMString type, MediaStreamErrorEventInit eventInitDict)]
interface MediaStreamErrorEvent : Event {
    readonly attribute MediaStreamError? error;
};
MediaStreamErrorEvent
Constructs a new MediaStreamErrorEvent.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
type | DOMString | ✘ | ✘ | |
eventInitDict | MediaStreamErrorEventInit | ✘ | ✘ | |
error of type MediaStreamError, readonly, nullable
The MediaStreamError describing the error that triggered the event (if any).
MediaStreamErrorEventInit Members
error of type MediaStreamError, nullable, defaulting to null
The MediaStreamError describing the error associated with the event (if any).
The table below lists the error names defined in this specification.
Name | Description | Note |
---|---|---|
NotSupportedError | The operation is not supported. | Same as defined in [DOM4] |
PermissionDeniedError | The user did not grant permission for the operation. | |
ConstraintNotSatisfiedError | One of the mandatory Constraints could not be satisfied. | The constraintName attribute gets set to the name of the constraint that caused the error |
OverconstrainedError | Due to changes in the environment, one or more mandatory constraints can no longer be satisfied. | The constraintName attribute gets set to the name of the constraint that caused the error |
NotFoundError | The object can not be found here. | Same as defined in [DOM4] |
AbortError | The operation was aborted. | Same as defined in [DOM4] |
SourceUnavailableError | The source of the MediaStream could not be accessed due to a hardware error (e.g. lock from another process). | |
This section is non-normative.
The following events fire on MediaStream objects:
Event name | Interface | Fired when... |
---|---|---|
active | Event | The MediaStream became active (see inactive). |
inactive | Event | The MediaStream became inactive. |
addtrack | MediaStreamTrackEvent | A new MediaStreamTrack has been added to this stream. Note that this event is not fired when the script directly modifies the tracks of a MediaStream. |
removetrack | MediaStreamTrackEvent | A MediaStreamTrack has been removed from this stream. Note that this event is not fired when the script directly modifies the tracks of a MediaStream. |
The following events fire on MediaStreamTrack objects:
Event name | Interface | Fired when... |
---|---|---|
mute | Event | The MediaStreamTrack object's source is temporarily unable to provide data. |
unmute | Event | The MediaStreamTrack object's source is live again after having been temporarily unable to provide data. |
overconstrained | MediaStreamErrorEvent | This error event fires for each affected track (when multiple tracks share the same source) after the User Agent has evaluated the current constraints against a given source and found that they cannot be satisfied. Due to being over-constrained, the User Agent must mute each affected track. The affected track(s) will remain muted until the application adjusts the constraints to accommodate the source's capabilities. |
ended | Event | The MediaStreamTrack object's source will no longer provide any data. When the end of the MediaStreamTrack is the result of an error, the corresponding error event is fired before this ended event. |
The following event fires on MediaDevices objects:
Event name | Interface | Fired when... |
---|---|---|
devicechange | Event | The set of media devices available to the User Agent has changed. The current list of devices can be retrieved with the enumerateDevices() method. |
This section describes an API that the script can use to query the User Agent about connected media input and output devices (for example a web camera or a headset).
The MediaDevices object is the entry point to the API used to examine and get access to media devices available to the User Agent.
When a new media input or output device is made available, the User Agent MUST queue a task that fires a simple event named devicechange at the MediaDevices object.
interface MediaDevices : EventTarget {
    attribute EventHandler ondevicechange;
    Promise<sequence<MediaDeviceInfo>> enumerateDevices ();
};
ondevicechange of type EventHandler
This event handler, for the devicechange event, MUST be supported by all objects implementing the MediaDevices interface.
enumerateDevices
Collects information about the User Agent's available media input and output devices.
Returns a promise. The promise will be fulfilled with a sequence of MediaDeviceInfo dictionaries representing the User Agent's available media input and output devices if enumeration is successful.
Camera and microphone sources should be enumerable. Specifications that add additional types of source will provide recommendations about whether the source type should be enumerable.
When the enumerateDevices() method is called, the User Agent MUST run the following steps:
Let p be a new promise.
Run the following steps asynchronously:
Let resultList be an empty list.
If this method has been called previously within this application session, let oldList be the list of MediaDeviceInfo objects (resultList) that was produced at that call; otherwise, let oldList be an empty list.
Probe the User Agent for available media devices, and run the following sub steps for each discovered device, device:
If device is represented by a MediaDeviceInfo object in oldList, append that object to resultList, abort these sub steps and continue with the next device (if any).
Let deviceInfo be a new MediaDeviceInfo object to represent device.
If device belongs to the same physical device as a device already represented in oldList or resultList, initialize deviceInfo's groupId member to the groupId value of the existing MediaDeviceInfo object. Otherwise, let deviceInfo's groupId member be a newly generated unique identifier.
Append deviceInfo to resultList.
If none of the local devices are attached to an active MediaStreamTrack, let filteredList be a copy of resultList in which the label member of every element is the empty string.
If filteredList is a non-empty list, then resolve p with filteredList. Otherwise, resolve p with resultList.
Return p.
Promise<sequence<MediaDeviceInfo>>
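The enumeration steps above can be sketched as a pure function over a probed device list; the device shape, the physicalId stand-in for "same physical device", and the anyActiveTrack flag are assumptions for illustration:

```javascript
// Sketch of the enumerateDevices() steps: reuse oldList entries, assign
// groupIds by physical device, and blank labels when no local device feeds
// an active MediaStreamTrack. The "group-" prefix is illustrative, not the
// un-guessable identifier a real User Agent must generate.
function enumerateSketch(probed, oldList, anyActiveTrack) {
  const resultList = [];
  for (const device of probed) {
    const prior = oldList.find((d) => d.deviceId === device.deviceId);
    if (prior) { resultList.push(prior); continue; }
    const existing = [...oldList, ...resultList]
      .find((d) => d.physicalId === device.physicalId);
    resultList.push({
      deviceId: device.deviceId,
      kind: device.kind,
      label: device.label,
      physicalId: device.physicalId,
      groupId: existing ? existing.groupId : "group-" + device.physicalId,
    });
  }
  // Without an active track, labels are withheld.
  if (!anyActiveTrack) return resultList.map((d) => ({ ...d, label: "" }));
  return resultList;
}

const probed = [
  { deviceId: "cam1", kind: "videoinput", label: "Webcam", physicalId: "p1" },
  { deviceId: "mic1", kind: "audioinput", label: "Monitor Mic", physicalId: "p1" },
];
const visible = enumerateSketch(probed, [], true);
const hidden = enumerateSketch(probed, [], false);
```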
The algorithm described above means that access to media device information depends on whether or not permission has been granted to the page's origin to use media devices.
If no such access has been granted, the MediaDeviceInfo dictionary will contain the deviceId, kind, and groupId.
If access has been granted for a media device, the MediaDeviceInfo dictionary will contain the deviceId, kind, label, and groupId.
interface MediaDeviceInfo {
    readonly attribute DOMString deviceId;
    readonly attribute MediaDeviceKind kind;
    readonly attribute DOMString? label;
    readonly attribute DOMString? groupId;
};
deviceId of type DOMString, readonly
A unique identifier for the represented device.
All enumerable devices have an identifier that MUST be unique to the application and persistent across sessions. Unique and stable identifiers let the application save, identify the availability of, and directly request specific sources.
This identifier MUST be un-guessable by other applications to prevent the identifier being used to correlate the same user across different applications.
Since deviceId persists across sessions, and to reduce its potential as a fingerprinting mechanism, deviceId is to be treated like other persistent storage mechanisms such as cookies [COOKIES]. User Agents should reset per-application device identifiers when other persistent storage is cleared.
groupId of type DOMString, readonly, nullable
Returns the group identifier of the represented device. Two devices have the same group identifier if they belong to the same physical device; for example a monitor with a built-in camera and microphone.
kind of type MediaDeviceKind, readonly
Describes the kind of the represented device.
label of type DOMString, readonly, nullable
A label describing this device (for example "External USB Webcam"). If the device has no associated label, then this attribute MUST return the empty string.
enum MediaDeviceKind {
"audioinput",
"audiooutput",
"videoinput"
};
Enumeration description | |
---|---|
audioinput |
Represents an audio input device; for example a microphone. |
audiooutput |
Represents an audio output device; for example a pair of headphones. |
videoinput |
Represents a video input device; for example a webcam. |
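As a usage sketch, an application might bucket enumerated devices by their MediaDeviceKind when building a device picker; the device list below is illustrative, not real enumerateDevices() output:

```javascript
// Sketch: bucket MediaDeviceInfo-like entries by kind, e.g. for a picker UI.
function byKind(devices) {
  const buckets = { audioinput: [], audiooutput: [], videoinput: [] };
  for (const d of devices) buckets[d.kind].push(d);
  return buckets;
}

const devices = [
  { deviceId: "a", kind: "audioinput", label: "Mic" },
  { deviceId: "b", kind: "videoinput", label: "Cam" },
  { deviceId: "c", kind: "audiooutput", label: "Headphones" },
];
const buckets = byKind(devices);
```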
This section extends MediaDevices and NavigatorUserMedia with APIs to request permission to access media input devices available to the User Agent.
When on an insecure origin [mixed-content], user agents are
encouraged to warn about usage
of MediaDevices.getUserMedia
, navigator.getUserMedia
,
and any prefixed variants in their developer tools, error
logs, etc. It is explicitly permitted for user agents to
remove these APIs entirely when on an insecure origin, as long
as they remove all of them at once (e.g., they should not
leave just the prefixed version available on insecure origins.)
First, the official definition for the getUserMedia() method, and the one which developers are encouraged to use, is now the one defined here under MediaDevices. This decision reflected consensus as long as the original API remained available at NavigatorUserMedia.getUserMedia() under the Navigator object for backwards compatibility reasons, since the working group acknowledges that early users of these APIs have been encouraged to define getUserMedia as "var getUserMedia = navigator.getUserMedia || navigator.webkitGetUserMedia || navigator.mozGetUserMedia;" in order for their code to be functional both before and after official implementations of getUserMedia() in popular browsers. To ensure functional equivalence, the getUserMedia() method under NavigatorUserMedia is defined in terms of the method here.
Second, the method defined here is Promises-based, while the one defined under NavigatorUserMedia is currently still callback-based. Developers expecting to find getUserMedia() defined under NavigatorUserMedia are strongly encouraged to read the detailed Note given there.
The getSupportedConstraints
method is provided to
allow the application to determine which constraints the User Agent
recognizes.
partial interface MediaDevices {
    static Dictionary getSupportedConstraints (DOMString kind);
    Promise<MediaStream> getUserMedia (MediaStreamConstraints constraints);
};
getSupportedConstraints, static
Returns a dictionary whose members are the constrainable properties known to the User Agent for the kind given as argument. A supported constrainable property MUST be represented by a member whose name is the constraint name and whose value is true. Any constrainable properties not supported by the User Agent MUST NOT be present in the returned dictionary.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
kind | DOMString | ✘ | ✘ | |
Dictionary
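A caller might use the returned dictionary to filter a constraint set down to recognized properties before applying it; the `supported` dictionary below is a mock standing in for a real getSupportedConstraints() return value:

```javascript
// Sketch: keep only the constraints the User Agent reports as supported.
// `supported` mocks the dictionary returned by getSupportedConstraints().
function filterToSupported(constraints, supported) {
  const out = {};
  for (const [name, value] of Object.entries(constraints)) {
    if (supported[name] === true) out[name] = value;
  }
  return out;
}

const supported = { width: true, height: true, frameRate: true }; // mock
const filtered = filterToSupported(
  { width: 1280, echoCancellation: true }, // echoCancellation not supported
  supported
);
```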
getUserMedia
Prompts the user for permission to use their Web cam or other video or audio input.
(Remove when other issues are removed. This is only here to keep the issues from being renumbered)
The constraints argument is a dictionary of type MediaStreamConstraints.
Returns a promise. The promise will be fulfilled with a suitable MediaStream object if the user accepts valid tracks as described below.
The promise will be rejected if there is a failure in finding valid tracks or if the user denies permission, as described below.
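The first steps of the algorithm below derive the set of requested media types from the constraints argument; that derivation can be sketched as a pure function (the constraints values shown are illustrative):

```javascript
// Sketch of deriving requestedMediaTypes from a MediaStreamConstraints value:
// a media type is requested when its member is true or a dictionary.
function requestedMediaTypes(constraints) {
  const types = [];
  for (const kind of ["audio", "video"]) {
    const v = constraints[kind];
    if (v === true || (typeof v === "object" && v !== null)) types.push(kind);
  }
  return types;
}

const types = requestedMediaTypes({ audio: true, video: { width: 1280 } });
const none = requestedMediaTypes({ audio: false });
```

An empty result corresponds to the NotSupportedError rejection in the algorithm's second step.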
When the getUserMedia() method is called, the User Agent MUST run the following steps:
Let p be a new promise.
Let constraints be the method's first argument.
Run the following steps asynchronously:
Let requestedMediaTypes be the set of media types in constraints with either a dictionary value or a value of true.
If requestedMediaTypes is the empty set, let error be a new MediaStreamError object whose name attribute has the value NotSupportedError and jump to the step labeled Error Task below.
Let finalSet be an (initially) empty set.
For each media type T in requestedMediaTypes,
For each possible source for that media type, construct an unconstrained MediaStreamTrack with that source as its source.
Call this set of tracks the candidateSet.
Run the SelectSettings() algorithm on each track in candidateSet. If the algorithm does not return undefined, add the track to finalSet.
This eliminates devices unable to satisfy the constraints, by verifying that at least one settings dictionary exists that satisfies the constraints.
If finalSet is the empty set, let error be a new MediaStreamError object whose name attribute has the value NotFoundError and jump to the step labeled Error Task below.
Optionally, e.g., based on a previously-established user preference, for security reasons, or due to platform limitations, jump to the step labeled Permission Failure below.
Prompt the user in a User Agent specific manner for permission to provide the entry script's origin with a MediaStream object representing a media stream.
The provided media MUST include precisely one track of each
media type in requestedMediaTypes from the
finalSet. The decision of which tracks to choose
from the finalSet is completely up to the user
agent and may be determined by asking the user. Once selected, the source of a MediaStreamTrack MUST NOT change.
The user agent MAY use the value of the computed "fitness distance" from the SelectSettings() algorithm, or any other internally-available information about the devices, as an input to the selection algorithm.
User Agents are encouraged to default to using the user's primary or system default camera and/or microphone (when possible) to generate the media stream. User Agents MAY allow users to use any media source, including pre-recorded media files.
If the user grants permission to use local recording devices, User Agents are encouraged to include a prominent indicator that the devices are "hot" (i.e. an "on-air" or "recording" indicator), as well as a "device accessible" indicator indicating that the page has been granted access to the source.
If the user denies permission, jump to the step labeled Permission Failure below. If the user never responds, this algorithm stalls on this step.
If the user grants permission but a hardware error such as an OS/program/webpage lock prevents access, jump to the step labeled Unavailable Failure below.
If the user grants permission but device access fails for any reason other than those listed above, jump to the step labeled General Failure below.
Let stream be the MediaStream object for which the user granted permission.
Run the ApplyConstraints()
algorithm on
all tracks in stream with the appropriate
constraints.
Resolve p with stream.
Abort these steps.
Permission Failure: Let error be a new MediaStreamError object whose name attribute has the value PermissionDeniedError and jump to the step labeled Error Task below.
Constraint Failure: Let error be a new MediaStreamError object whose name attribute has the value ConstraintNotSatisfiedError and whose constraintName attribute is set to the name of the constraint that caused the error.
Unavailable Failure: Let error be a new MediaStreamError object whose name attribute has the value SourceUnavailableError and jump to the step labeled Error Task below.
General Failure: Let error be a new MediaStreamError object whose name attribute has the value AbortError and jump to the step labeled Error Task below.
Error Task: Reject p with error.
Return p.
Parameter | Type | Nullable | Optional | Description
---|---|---|---|---
constraints | MediaStreamConstraints | ✘ | ✘ | |

Return type: Promise<MediaStream>
In the algorithm above, constraints are checked twice: once at device selection, and once after access approval. Time may have passed between those checks, so it is conceivable that the selected device is no longer suitable. In this case, a SourceUnavailableError will result.
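A page calling getUserMedia() can dispatch on the rejection names defined by the algorithm above. The following sketch maps each name to a user-facing message; the helper name and message strings are illustrative, not part of this specification.

```javascript
// Sketch: dispatching on the rejection names defined by the algorithm above.
// The helper name and message strings are illustrative only.
function describeGetUserMediaError(error) {
  switch (error.name) {
    case "PermissionDeniedError":
      return "The user denied permission to use a media device.";
    case "NotSupportedError":
      return "No media types were requested.";
    case "NotFoundError":
      return "No device satisfied the given constraints.";
    case "SourceUnavailableError":
      return "A device was selected but could not be acquired.";
    case "ConstraintNotSatisfiedError":
      return "Required constraint could not be satisfied: " + error.constraintName;
    case "AbortError":
      return "Device access failed for an unspecified reason.";
    default:
      return "Unexpected error: " + error.name;
  }
}

// Browser usage (not runnable outside a User Agent):
// navigator.mediaDevices.getUserMedia({ video: true })
//   .then(function (stream) { /* attach stream */ })
//   .catch(function (e) { console.log(describeGetUserMediaError(e)); });
```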
The MediaStreamConstraints dictionary is used to instruct the User Agent what sort of MediaStreamTracks to include in the MediaStream returned by getUserMedia().
dictionary MediaStreamConstraints {
    (boolean or MediaTrackConstraints) video = false;
    (boolean or MediaTrackConstraints) audio = false;
};
MediaStreamConstraints Members
audio of type (boolean or MediaTrackConstraints), defaulting to false
If true, it requests that the returned MediaStream contain an audio track. If a Constraints structure is provided, it further specifies the nature and settings of the audio Track. If false, the MediaStream MUST NOT contain an audio Track.
video of type (boolean or MediaTrackConstraints), defaulting to false
If true, it requests that the returned MediaStream contain a video track. If a Constraints structure is provided, it further specifies the nature and settings of the video Track. If false, the MediaStream MUST NOT contain a video Track.
The User Agent is encouraged to reserve resources when it has determined that a given call to getUserMedia() will be successful. It is preferable to reserve the resource prior to resolving the returned promise. Subsequent calls to getUserMedia() (in this page or any other) should treat the resource that was previously allocated, as well as resources held by other applications, as busy. Resources marked as busy should not be provided as sources to the current web page, unless specified by the user. Optionally, the user agent may choose to provide a stream sourced from a busy source but only to a page whose origin matches the owner of the original stream that is keeping the source busy.
This document recommends that in the permission grant dialog or device selection interface (if one is present), the user be allowed to select any available hardware as a source for the stream requested by the page (provided the resource is able to fulfill any specified mandatory constraints). Although not specifically recommended as best practice, note that some User Agents may support the ability to substitute a video or audio source with local files and other media. A file picker may be used to provide this functionality to the user.
This document also recommends that the user be shown all resources that are currently busy as a result of prior calls to getUserMedia() (in this page or any other page that is still alive) and be allowed to terminate that stream and utilize the resource for the current page instead. If possible in the current operating environment, it is also suggested that resources currently held by other applications be presented and treated in the same manner. If the user chooses this option, the track corresponding to the resource that was provided to the page whose stream was affected must be removed.
When permission is requested for a device, the User Agent may choose to store that permission, if granted, for later use by the same origin, so that the user does not need to grant permission again at a later time. Such storing MUST only be done when the page is secure (served over HTTPS and having no mixed content). It is a User Agent choice whether it offers functionality to store permission to each device separately, all devices of a given class, or all devices; the choice needs to be apparent to the user.
When permission is not stored, permission should last only until such time as all MediaStreamTracks sourced from that device have been stopped.
A MediaStream
may contain more than
one video and audio track. This makes it possible to include video
from two or more webcams in a single stream object, for example.
However, the current API does not allow a page to express a need for
multiple video streams from independent sources.
It is recommended that multiple calls to getUserMedia() from the same page be allowed, as a way for pages to request multiple discrete video and/or audio streams.
A single call to getUserMedia() will always return a stream with either zero or one audio tracks, and either zero or one video tracks. If a script calls getUserMedia() multiple times before reaching a stable state, this document advises the UI designer that the permission dialogs should be merged, so that the user can give permission for the use of multiple cameras and/or media sources in one dialog interaction. The constraints on each getUserMedia call can be used to decide which stream gets which media sources.
The Constrainable pattern allows applications to inspect and adjust
the properties of objects implementing it. It is broken out as a
separate set of definitions so that it can be referred to by other
specifications. The core concept is the Capability, which consists
of a constrainable property of an object and the set of its possible
values, which may be specified either as a range or as an enumeration.
For example, a camera might be capable of framerates (a property)
between 20 and 50 frames per second (a range) and may be able to be
positioned (a property) facing towards the user, away from the user, or
to the left or right of the user (an enumerated set). The application
can examine a constrainable property's supported Capabilities via the
getCapabilities()
accessor.
The application can select the (range of) values it wants for an
object's Capabilities by means of basic and/or advanced ConstraintSets
and the applyConstraints()
method. A ConstraintSet consists
of the names of one or more properties of the object plus the desired
value (or a range of desired values) for each property. Each of those
property/value pairs can be considered to be an individual constraint.
For example, the application may set a ConstraintSet containing two
constraints, the first stating that the framerate of a camera be between
30 and 40 frames per second (a range) and the second that the camera
should be facing the user (a specific value). How the individual
constraints interact depends on whether and how they are given in the
basic Constraint structure, which is a ConstraintSet with an additional
'advanced' property, or whether they are in a ConstraintSet in the
advanced list. The behavior is as follows: all 'min', 'max', and 'exact'
constraints in the basic Constraint structure are together treated as
the 'required' set, and if it is not possible to satisfy simultaneously
all of those individual constraints for the indicated property names,
the User Agent MUST reject the returned promise. Otherwise, it must
apply the required constraints. Next, it will consider any
ConstraintSets given in the 'advanced' list, in the order in which they
are specified, and will try to satisfy/apply each complete ConstraintSet
(i.e., all constraints in the ConstraintSet together), but will skip a
ConstraintSet if and only if it cannot satisfy/apply it in its entirety.
Next, the User Agent MUST attempt to apply, individually, any 'ideal' constraints
or a constraint given as a bare value for the property. Of these
properties, it MUST satisfy the largest number that it can, in any
order. Finally, the User Agent MUST resolve the returned promise.
Applications should first verify, using the getSupportedConstraints() method, that all the named properties that are used are supported by the browser. The reason for this is that WebIDL drops any unsupported names from the dictionary holding the constraints, so the browser does not see them, and the unsupported names end up being silently ignored. This causes confusing programming errors: the JavaScript code sets constraints, but the browser ignores them. Browsers that support (recognize) the name of a required constraint but cannot satisfy it will generate an error, while browsers that do not support the constrainable property will not generate an error.
The following examples may help to understand how constraints work. The first shows a basic Constraint structure. Three constraints are given, each of which the User Agent will attempt to satisfy individually. Depending upon the resolutions available for this camera, it is possible that not all three constraints can be satisfied at the same time. If so, the User Agent will satisfy two if it can, or only one if not even two constraints can be satisfied together. Note that if not all three can be satisfied simultaneously, it is possible that there is more than one combination of two constraints that could be satisfied. If so, the user agent will choose.
var supports = navigator.mediaDevices.getSupportedConstraints("video");
if (!supports["aspectRatio"]) {
  // Treat like an error.
}
var constraints = {
  width: 1280,
  height: 720,
  aspectRatio: 1.5
};
This next example adds a small bit of complexity. The ideal values are still given for width and height, but this time with minimum requirements on each as well that must be satisfied. If it cannot satisfy either the width or height minimum it will reject the promise. Otherwise, it will try to satisfy the width, height, and aspectRatio target values as well and then resolve the promise.
var supports = navigator.mediaDevices.getSupportedConstraints("video");
if (!supports["aspectRatio"]) {
  // Treat like an error.
}
var constraints = {
  width: {min: 640, ideal: 1280},
  height: {min: 480, ideal: 720},
  aspectRatio: 1.5
};
This example illustrates the full control possible with the Constraints structure by adding the 'advanced' property. In this case, the User Agent behaves the same way with respect to the required constraints, but before attempting to satisfy the ideal values it will process the 'advanced' list. In this example the 'advanced' list contains two ConstraintSets. The first specifies width and height constraints, and the second specifies an aspectRatio constraint. Note that in the advanced list, these bare values are treated as 'exact' values. This example represents the following: "I need my video to be at least 640 pixels wide and at least 480 pixels high. My preference is for precisely 1920x1280, but if you can't give me that, give me an aspectRatio of 4x3 if at all possible. If even that is not possible, give me a resolution as close to 1280x720 as possible."
var supports = navigator.mediaDevices.getSupportedConstraints("video");
if (!supports["width"] || !supports["height"]) {
  // Treat like an error.
}
var constraints = {
  width: {min: 640, ideal: 1280},
  height: {min: 480, ideal: 720},
  advanced: [
    {width: 1920, height: 1280},
    {aspectRatio: 1.3333333333}
  ]
};
The ordering of advanced ConstraintSets is significant. In the preceding example it is impossible to satisfy both the 1920x1280 ConstraintSet and the 4x3 aspect ratio ConstraintSet at the same time. Since the 1920x1280 occurs first in the list, the User Agent will attempt to satisfy it first. Application authors can therefore implement a backoff strategy by specifying multiple optional ConstraintSets for the same property. For example, an application might specify three optional ConstraintSets, the first asking for a framerate greater than 500, the second asking for a framerate greater than 400, and the third asking for one greater than 300. If the User Agent is capable of setting a framerate greater than 500, it will (and the subsequent two ConstraintSets will be trivially satisfied). However, if the User Agent cannot set the framerate above 500, it will skip that ConstraintSet and attempt to set the framerate above 400. If that fails, it will then try to set it above 300. If the User Agent cannot satisfy any of the three ConstraintSets, it will set the framerate to any value it can get. If the developer wanted to insist on 300 as a lower bound, they could provide that as a 'min' value in the basic ConstraintSet. In that case, the User Agent would fail altogether if it couldn't get a value over 300, but would choose a value over 500 if possible, then try for a value over 400.
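The backoff behavior just described can be sketched as follows, assuming a device whose framerate capability tops out at 450. The function and variable names are illustrative, and satisfiability is simplified here to a single 'min' check; a real User Agent works over settings dictionaries.

```javascript
// Simplified model of advanced-list processing: apply each ConstraintSet
// that can be satisfied in its entirety, skip the rest. Names illustrative.
function applyAdvanced(advanced, capability) {
  var applied = [];
  advanced.forEach(function (set) {
    // A {frameRate: {min: n}} set is satisfiable iff the device reaches n.
    if (set.frameRate.min <= capability.max) applied.push(set);
  });
  return applied;
}

var capability = { min: 1, max: 450 };
var advanced = [
  { frameRate: { min: 500 } },  // skipped: device tops out at 450
  { frameRate: { min: 400 } },  // applied
  { frameRate: { min: 300 } }   // applied (already implied by the previous set)
];
applyAdvanced(advanced, capability);
// → [{ frameRate: { min: 400 } }, { frameRate: { min: 300 } }]
```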
Note that, unlike basic constraints, the constraints within a
ConstraintSet in the advanced list must be satisfied together or skipped
together. Thus, {width: 1920, height: 1280} is a request for that
specific resolution, not a request for that width or that height. One
can think of the basic constraints as requesting an or (non-exclusive)
of the individual constraints, while each advanced ConstraintSet is
requesting an and of the individual constraints in the ConstraintSet. An
application may inspect the full set of Constraints currently in effect
via the getConstraints()
accessor.
The specific value that the User Agent chooses for a Capability is referred
to as a Setting. For example, if the application applies a ConstraintSet
specifying that the framerate must be at least 30 frames per second, and
no greater than 40, the Setting can be any intermediate value, e.g., 32,
35, or 37 frames per second. The application can query the current
settings of the object's Capabilities via the
getSettings()
accessor.
Due to the limitations of the interface definition language used in this specification, it is not possible for other interfaces to inherit or implement ConstrainablePattern. Therefore the WebIDL definitions given are only templates to be copied. Each interface that wishes to make use of the functionality defined here will have to provide its own copy of the WebIDL for the functions and interfaces given here. However it can refer to the semantics defined here, which will not change. See MediaStreamTrack Interface Definition for an example of this.
[NoInterfaceObject]
interface ConstrainablePattern {
    Capabilities  getCapabilities ();
    Constraints   getConstraints ();
    Settings      getSettings ();
    Promise<void> applyConstraints (Constraints constraints);
    attribute EventHandler onoverconstrained;
};
onoverconstrained of type EventHandler
The event type of this event handler is overconstrained. It MUST be supported by all objects implementing the ConstrainablePattern pattern. The User Agent MUST raise a MediaStreamErrorEvent named "overconstrained" if changing circumstances at runtime result in it no longer being able to satisfy the requiredConstraints from the currently valid Constraints. This MediaStreamErrorEvent MUST contain a MediaStreamError whose name is OverconstrainedError and whose constraintName attribute is set to one of the requiredConstraints that can no longer be satisfied. The message attribute of the MediaStreamError SHOULD contain a string that is useful for debugging. The conditions under which this error might occur are platform- and application-specific. For example, the user might physically manipulate a camera in a way that makes it impossible to provide a resolution that satisfies the constraints. The User Agent MAY take other actions as a result of the overconstrained situation.
applyConstraints
The applyConstraints() algorithm for applying constraints is stated below. Here are some preliminary definitions that are used in the statement of the algorithm:
We use the term settings dictionary for a set of values that might be applied as settings to the object.
We define the fitness distance between a settings dictionary and a constraint set CS as the sum, for each constraint provided for a constraint name in CS, ignoring "advanced", of the following values:
If the constraint is not supported by the browser, it does not contribute to the fitness distance. Use 0.
If the constraint is required ('min', 'max', or 'exact'), and the settings dictionary's value for the constraint does not satisfy the constraint, use positive infinity.
If the constraint is given as an 'ideal' or bare value and the value is numeric, use (actual == ideal) ? 0 : |actual - ideal|/max(|actual|,|ideal|).
If the constraint is given as an 'ideal' or bare value and the value is not numeric, use (actual == ideal) ? 0 : 1.
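For a single numeric property, the per-constraint rules above can be sketched as a function. The function name is illustrative and only numeric values are handled.

```javascript
// Sketch of the per-constraint fitness distance for one numeric property.
// 'supported' says whether the browser recognizes the property.
function constraintDistance(constraint, actual, supported) {
  if (!supported) return 0; // unsupported constraints contribute nothing
  // Required keywords: an unsatisfied 'min'/'max'/'exact' costs +Infinity.
  if (constraint.min !== undefined && actual < constraint.min) return Infinity;
  if (constraint.max !== undefined && actual > constraint.max) return Infinity;
  if (constraint.exact !== undefined && actual !== constraint.exact) return Infinity;
  // An 'ideal' value costs the normalized distance from it.
  if (constraint.ideal !== undefined) {
    return actual === constraint.ideal ? 0
      : Math.abs(actual - constraint.ideal) /
        Math.max(Math.abs(actual), Math.abs(constraint.ideal));
  }
  return 0; // satisfied required constraints (and absent ideals) cost nothing
}

constraintDistance({ min: 640 }, 320, true);    // → Infinity
constraintDistance({ ideal: 1280 }, 640, true); // → 0.5
constraintDistance({ min: 9999 }, 0, false);    // → 0 (unsupported property)
```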
We define the function SelectSettings() as follows:
Let copy be a copy of the given constraints, with its advanced member, if present, removed.
Note that unknown properties are discarded by WebIDL, which means that unknown/unsupported required constraints will silently disappear. To avoid this being a surprise, application authors are expected to first use the getSupportedConstraints() method as shown in the Examples below.
For every possible settings dictionary, compute its fitness distance from copy. Discard the settings dictionaries for which the result is infinity, and add the remaining settings dictionaries to the set of possible settings dictionaries, candidates.
If candidates is empty (i.e., the fitness distance of the best-fitting settings dictionary is infinity), return "undefined" as the result of the function.
Otherwise, iterate over the ConstraintSets in the advanced list, in the order in which they were specified. For each ConstraintSet, compute the fitness distance between it and each settings dictionary in candidates. If the fitness distance is finite for one or more settings dictionaries, add this ConstraintSet to requiredConstraints, and keep these settings dictionaries in the list of possible settings, discarding the others. If the fitness distance is infinite for all settings dictionaries, ignore this ConstraintSet.
Select one settings dictionary from the list of possible settings, and return this as the result of SelectSettings(). The User Agent SHOULD use the one with the smallest fitness distance.
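The SelectSettings() steps above can be sketched over an explicit list of candidate settings dictionaries. The fitnessDistance argument is assumed to implement the distance defined earlier (finite when satisfiable, Infinity otherwise); all names are illustrative.

```javascript
// Sketch of SelectSettings(): filter by the basic constraints, apply each
// advanced ConstraintSet in order (skipping unsatisfiable ones), and prefer
// the smallest fitness distance. Names are illustrative.
function selectSettings(dictionaries, constraints, fitnessDistance) {
  var basic = Object.assign({}, constraints);
  delete basic.advanced;
  // Keep only dictionaries with a finite distance from the basic constraints.
  var candidates = dictionaries.filter(function (d) {
    return fitnessDistance(d, basic) !== Infinity;
  });
  if (candidates.length === 0) return undefined;
  // Apply each advanced ConstraintSet in order, skipping unsatisfiable ones.
  (constraints.advanced || []).forEach(function (set) {
    var kept = candidates.filter(function (d) {
      return fitnessDistance(d, set) !== Infinity;
    });
    if (kept.length > 0) candidates = kept; // else skip this ConstraintSet
  });
  // Prefer the candidate with the smallest distance from the basic constraints.
  return candidates.reduce(function (best, d) {
    return fitnessDistance(d, basic) < fitnessDistance(best, basic) ? d : best;
  });
}
```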
When applyConstraints is called, the User Agent MUST run the following steps:
Let p be a new promise.
Let newConstraints be the argument to this function.
Run the following steps asynchronously:
Let successfulSettings be the result of running SelectSettings(newConstraints).
If successfulSettings is undefined, reject p with a new MediaStreamError whose name is ConstraintNotSatisfiedError and whose constraintName is set to any of the required constraints that could not be satisfied, and abort these steps. The existing constraints remain in effect in this case.
Otherwise, apply successfulSettings as the settings of the object, store newConstraints as the object's current constraints, and resolve p.
Return p.
Any implementation that has the same result as the algorithm above is an allowed implementation. For instance, the implementation may choose to keep track of the maximum and minimum values for a setting that are OK under the constraints considered, rather than keeping track of all possible values for the setting.
When picking a settings dictionary, the UA can use any information available to it. Examples of such information may be whether the selection is done as part of device selection in getUserMedia, whether the energy usage of the camera varies between the settings dictionaries, or whether using a settings dictionary will cause the device driver to apply resampling.
The User Agent MAY choose new settings for the Capabilities of the object at any time. When it does so it MUST attempt to satisfy the current Constraints, in the manner described in the algorithm above.
Parameter | Type | Nullable | Optional | Description
---|---|---|---|---
constraints | Constraints | ✘ | ✘ | A new constraint structure to apply to this object.

Return type: Promise<void>
getCapabilities
The getCapabilities() method returns the dictionary of the names of the capabilities that the object supports.
It is possible that the underlying hardware may not exactly
map to the range defined in the registry entry. Where this is
possible, the entry SHOULD define how to translate and scale the
hardware's setting onto the values defined in the entry. For
example, suppose that a registry entry defines a hypothetical
fluxCapacitance capability that is defined to be the range from
-10 (min) to 10 (max), but there are common hardware devices
that support only values of "off" "medium" and "full". The
registry entry might specify that for such hardware, the user
agent should map the range value of -10 to "off", 10 to "full",
and 0 to "medium". It might also indicate that given a
ConstraintSet imposing a strict value of 3, the User Agent should attempt to set the value of "medium" on the hardware, and that getSettings() should return a fluxCapacitance of 0, since that is the value defined as corresponding to "medium".
Capabilities
getConstraints
The getConstraints() method returns the Constraints that were the argument to the most recent successful call of applyConstraints(), maintaining the order in which they were specified. Note that some of the optional ConstraintSets returned may not be currently satisfied. To check which ConstraintSets are currently in effect, the application should use getSettings().
Constraints
getSettings
The getSettings() method returns the current settings
of all the constrainable properties of the object, whether they are platform
defaults or have been set by applyConstraints()
. Note
that the actual setting of a property MUST be a single value.
Settings
An example of Constraints that could be passed into
applyConstraints()
or returned as a value of
constraints
is below. It uses the properties defined in the Track property registry.
var supports = navigator.mediaDevices.getSupportedConstraints("video");
if (!supports["facingMode"]) {
  // Treat like an error.
}
var constraints = {
  width: { min: 640 },
  height: { min: 480 },
  advanced: [
    { width: 650 },
    { width: { min: 650 } },
    { frameRate: 60 },
    { width: { max: 800 } },
    { facingMode: "user" }
  ]
};
Here is another example, specifically for a video track where I must have a particular camera and have separate preferences for the width and height:
var supports = navigator.mediaDevices.getSupportedConstraints("video");
if (!supports["deviceId"]) {
  // Treat like an error.
}
var constraints = {
  deviceId: {exact: "20983-20o198-109283-098-09812"},
  advanced: [
    { width: { min: 800, max: 1200 } },
    { height: { min: 600 } }
  ]
};
And here's one for an audio track:
var supports = navigator.mediaDevices.getSupportedConstraints("audio");
if (!supports["deviceId"] || !supports["volume"]) {
  // Treat like an error.
}
var constraints = {
  advanced: [
    { deviceId: "64815-wi3c89-1839dk-x82-392aa" },
    { volume: 0.5 }
  ]
};
Here's an example of use of ideal:
var supports = navigator.mediaDevices.getSupportedConstraints("video");
if (!supports["aspectRatio"] || !supports["facingMode"]) {
  // Treat like an error.
}
var gotten = navigator.mediaDevices.getUserMedia({
  video: {
    width: {min: 320, ideal: 1280, max: 1920},
    height: {min: 240, ideal: 720, max: 1080},
    frameRate: 30, // Shorthand for ideal.
    // facingMode: "environment" would be optional.
    facingMode: {exact: "environment"}
  }
});
Here's an example of "I want 720p, but I can accept up to 1080p and down to VGA.":
var supports = navigator.mediaDevices.getSupportedConstraints("video");
if (!supports["width"] || !supports["height"]) {
  // Treat like an error.
}
var gotten = navigator.mediaDevices.getUserMedia({
  video: {
    width: {min: 640, ideal: 1280, max: 1920},
    height: {min: 480, ideal: 720, max: 1080}
  }
});
Here's an example of "I want a front-facing camera and it must be VGA.":
var supports = navigator.mediaDevices.getSupportedConstraints("video");
if (!supports["width"] || !supports["height"] || !supports["facingMode"]) {
  // Treat like an error.
}
var gotten = navigator.mediaDevices.getUserMedia({
  video: {
    facingMode: {exact: "user"},
    width: {exact: 640},
    height: {exact: 480}
  }
});
There is a single IANA registry that defines the constrainable properties of all objects that implement the Constrainable pattern. The registry entries MUST contain the name of each property along with its set of legal values. The registry entries for MediaStreamTrack are defined below. The syntax for the specification of the set of legal values depends on the type of the values. In addition to the standard atomic types (boolean, long, double, DOMString), legal values include lists of any of the atomic types, plus min-max ranges, as defined below.
List values MUST be
interpreted as disjunctions. For example, if a property 'facingMode'
for a camera is defined as having legal values ["left", "right",
"user", "environment"], this means that 'facingMode' can have the
values "left", "right", "environment", and
"user". Similarly Constraints
restricting 'facingMode' to
["user", "left", "right"] would mean that the User Agent should select a
camera (or point the camera, if that is possible) so that "facingMode"
is either "user", "left", or "right". This Constraint would thus
request that the camera not be facing away from the user, but would
allow the User Agent to allow the user to choose other directions.
dictionary ConstrainDoubleRange {
double max;
double min;
double exact;
double ideal;
};
ConstrainDoubleRange Members
exact of type double
The exact required value for this property.
ideal of type double
The ideal (target) value for this property.
max of type double
The maximum legal value of this property.
min of type double
The minimum legal value of this property.
dictionary ConstrainLongRange {
long max;
long min;
long exact;
long ideal;
};
ConstrainLongRange Members
exact of type long
The exact required value for this property.
ideal of type long
The ideal (target) value for this property.
max of type long
The maximum legal value of this property.
min of type long
The minimum legal value of this property.
dictionary ConstrainBooleanParameters {
boolean exact;
boolean ideal;
};
ConstrainBooleanParameters Members
exact of type boolean
The exact required value for this property.
ideal of type boolean
The ideal (target) value for this property.
dictionary ConstrainDOMStringParameters {
(DOMString or sequence<DOMString>) exact;
(DOMString or sequence<DOMString>) ideal;
};
ConstrainDOMStringParameters Members
exact of type (DOMString or sequence<DOMString>)
The exact required value for this property.
ideal of type (DOMString or sequence<DOMString>)
The ideal (target) value for this property.
typedef (long or ConstrainLongRange) ConstrainLong;
typedef (double or ConstrainDoubleRange) ConstrainDouble;
typedef (boolean or ConstrainBooleanParameters) ConstrainBoolean;
typedef (DOMString or sequence<DOMString> or ConstrainDOMStringParameters) ConstrainDOMString;
Capabilities is a dictionary containing one or more key-value pairs, where each key MUST be a constrainable property defined in the registry, and each value MUST be a subset of the set of values defined for that property in the registry. The exact syntax of the value expression depends on the type of the property, and its type is as defined in the Values column of the registry. The Capabilities dictionary specifies the subset of the constrainable properties and values from the registry that the User Agent supports. Note that a User Agent MAY support only a subset of the properties that are defined in the registry, and MAY support a subset of the set values for those properties that it does support. Note that Capabilities are returned from the User Agent to the application, and cannot be specified by the application. However, the application can control the Settings that the User Agent chooses for Capabilities by means of Constraints.
An example of a Capabilities dictionary is shown below. This example is not very realistic in that a browser would actually be required to support more constrainable properties than just these.
{ frameRate: { min: 1.0, max: 60.0 }, facingMode: ["user", "environment"] }
The next example below points out that capabilities for range values provide ranges for individual constrainable properties, not combinations. This is particularly relevant for video width and height, since the ranges for width and height are reported separately. In the example, if the User Agent can only provide 640x480 and 800x600 resolutions the relevant capabilities returned would be:
{ width: { min: 640, max: 800 }, height: { min: 480, max: 600 }, aspectRatio: { min: 1.3333333333, max: 1.3333333333 } }
Note in the example above that the aspectRatio would make clear that arbitrary combinations of widths and heights are not possible, although it would still suggest that more than two resolutions were available.
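A quick check of this point, with illustrative helper names: the width and height ranges alone would admit 640x600, but the aspectRatio capability rules such combinations out.

```javascript
// Capabilities from the example above (aspectRatio pinned to 4:3).
var caps = {
  width: { min: 640, max: 800 },
  height: { min: 480, max: 600 },
  aspectRatio: { min: 4 / 3, max: 4 / 3 }
};

function withinRange(value, range) {
  return value >= range.min && value <= range.max;
}

// A width/height pair is admissible only if all three ranges hold.
function admissible(w, h, caps) {
  return withinRange(w, caps.width) &&
         withinRange(h, caps.height) &&
         withinRange(w / h, caps.aspectRatio);
}

admissible(640, 480, caps); // → true
admissible(800, 600, caps); // → true
admissible(640, 600, caps); // → false (ratio ≈ 1.07, not 4:3)
```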
Settings is a dictionary containing one or more key-value pairs. It MUST contain each key returned in getCapabilities(). There MUST be a single value for each key, and the value MUST be a member of the set defined for that property by getCapabilities(). The Settings dictionary contains the actual values that the User Agent has chosen for the object's constrainable properties. The exact syntax of the value depends on the type of the property.
A conforming User Agent MUST support all the constrainable properties defined in this specification.
An example of a Settings dictionary is shown below. This example is not very realistic in that a browser would actually be required to support more constrainable properties than just these.
{ frameRate: 30.0, facingMode: "user" }
Due to the limitations of WebIDL, interfaces implementing the Constrainable Pattern cannot simply subclass Constraints and ConstraintSet as they are defined here. Instead they must provide their own definitions that follow this pattern. See MediaTrackConstraints for an example of this.
dictionary ConstraintSet {
};
Each member of a ConstraintSet corresponds to a constrainable property and specifies a subset of the property's legal Capability values. Applying a ConstraintSet instructs the User Agent to restrict the settings of the corresponding constrainable properties to the specified values or ranges of values. A given property MAY occur both in the basic Constraints set and in the advanced ConstraintSets list, and MAY occur at most once in each ConstraintSet in the advanced list.
dictionary Constraints : ConstraintSet
{
sequence<ConstraintSet
> advanced;
};
Constraints Members
advanced of type sequence<ConstraintSet>
The list of ConstraintSets that the User Agent MUST attempt to satisfy,
in order, skipping only those that cannot be satisfied. The order of these
ConstraintSets is significant. In particular, when they are passed
as an argument to applyConstraints
, the User Agent MUST
try to satisfy them in the
order that is specified. Thus if optional ConstraintSets C1 and C2
can be satisfied individually, but not together, then whichever of
C1 and C2 is first in this list will be satisfied, and the other
will not. The User Agent MUST
attempt to satisfy all optional ConstraintSets in the list, even
if some cannot be satisfied. Thus, in the preceding example, if
optional constraint C3 is specified after C1 and C2, the User Agent will
attempt to satisfy C3 even though C2 cannot be satisfied. Note
that a given property name may occur only once in each
ConstraintSet but may occur in more than one ConstraintSet.
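The ordering semantics above can be sketched non-normatively as follows. Capabilities are modeled as arrays of allowed values; applying a ConstraintSet narrows them, and a set that would leave some property with no allowed value is skipped. The function applyAdvanced is hypothetical, not part of the API:

```javascript
// Non-normative sketch of the advanced-list semantics: try each
// ConstraintSet in order, narrowing the remaining capability values as
// each satisfiable set is applied, and skipping unsatisfiable sets.
function applyAdvanced(capabilities, advanced) {
  var remaining = {};
  Object.keys(capabilities).forEach(function (p) {
    remaining[p] = capabilities[p].slice();
  });
  var applied = [];
  advanced.forEach(function (set) {
    var narrowed = {};
    var satisfiable = Object.keys(set).every(function (p) {
      narrowed[p] = (remaining[p] || []).filter(function (v) {
        return v === set[p];
      });
      return narrowed[p].length > 0;
    });
    if (satisfiable) {
      Object.keys(narrowed).forEach(function (p) { remaining[p] = narrowed[p]; });
      applied.push(set);
    }
  });
  return applied; // the ConstraintSets that were actually satisfied
}
```

For example, with capabilities { facingMode: ["user", "environment"] }, the list [{ facingMode: "user" }, { facingMode: "environment" }] satisfies the first set and skips the second, mirroring the C1/C2 behavior described above.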
This sample code exposes a button. When clicked, the button is disabled and the user is prompted to offer a stream. The user can cause the button to be re-enabled by providing a stream (e.g., giving the page access to the local camera) and then disabling the stream (e.g., revoking that access).
<input type="button" value="Start" onclick="start()" id="startBtn">
<script>
var startBtn = document.getElementById('startBtn');

function start() {
  navigator.mediaDevices.getUserMedia({
    audio: true,
    video: true
  }).then(gotStream).catch(logError);
  startBtn.disabled = true;
}

function gotStream(stream) {
  stream.oninactive = function () {
    startBtn.disabled = false;
  };
}

function logError(error) {
  log(error.name + ": " + error.message);
}
</script>
This example allows people to take photos of themselves from the local video camera. Note that the Image Capture specification [image-capture] provides a simpler way to accomplish this.
<article>
  <style scoped>
    video { transform: scaleX(-1); }
    p { text-align: center; }
  </style>
  <h1>Snapshot Kiosk</h1>
  <section id="splash">
    <p id="errorMessage">Loading...</p>
  </section>
  <section id="app" hidden>
    <p><video id="monitor" autoplay></video> <canvas id="photo"></canvas>
    <p><input type=button value="📷" onclick="snapshot()">
  </section>
  <script>
    var video = document.getElementById('monitor');
    var canvas = document.getElementById('photo');
    navigator.mediaDevices.getUserMedia({ video: true }).then(function (stream) {
      video.srcObject = stream;
      stream.oninactive = noStream;
      video.onloadedmetadata = function () {
        canvas.width = video.videoWidth;
        canvas.height = video.videoHeight;
        document.getElementById('splash').hidden = true;
        document.getElementById('app').hidden = false;
      };
    }).catch(function (reason) {
      document.getElementById('errorMessage').textContent = 'No camera available.';
    });
    function noStream() {
      document.getElementById('app').hidden = true;
      document.getElementById('splash').hidden = false;
      document.getElementById('errorMessage').textContent = 'Camera stopped.';
    }
    function snapshot() {
      canvas.getContext('2d').drawImage(video, 0, 0);
    }
  </script>
</article>
This section is non-normative; it specifies no new behavior, but instead summarizes information already present in other parts of the specification.
This document extends the Web platform with the ability to manage media input devices - in this iteration, microphones and cameras. It also allows the manipulation of audio output devices (speakers and headphones).
Without authorization (to the "drive-by web"), it offers the ability to tell how many devices there are of each class. The identifiers for the devices are designed not to be useful for a fingerprint that can track the user between origins, but the number of devices adds to the fingerprint surface. It recommends treating the per-origin persistent identifier deviceId as other persistent storage (e.g. cookies) is treated.
When authorization is given, this document describes how to get access to, and use, media data from the devices mentioned. This data may be sensitive; advice is given that indicators should be supplied to indicate that devices are in use, but both the nature of authorization and the indicators of in-use devices are platform decisions.
Authorization may be given on a case-by-case basis, or be persistent. In the case of a case-by-case authorization, it is important that the user be able to say "no" in a way that prevents the UI from blocking user interaction until permission is given - either by offering a way to say a "persistent NO" or by not using a modal permissions dialog.
It is possible to use constraints so that the failure of a getUserMedia call will return information about devices on the system without prompting the user, which increases the surface available for fingerprinting. The User Agent should consider limiting the rate at which failed getUserMedia calls are allowed in order to limit this additional surface.
In the case of persistent authorization, it is important that it is easy to find the list of granted permissions and revoke permissions that the user wishes to revoke.
Once permission has been granted, the User Agent should make two things readily apparent to the user:
Developers of sites with persistent permissions should be careful that these permissions not be abused.
In particular, they should not make it possible to automatically send audio or video streams from authorized media devices to an end point that a third party can select.
Indeed, if a site offered URLs such as https://webrtc.example.org/?call=user that would automatically set up calls and transmit audio/video to user, it would be open, for instance, to the following abuse: users who have granted permanent permissions to https://webrtc.example.org/ could be tricked into sending their audio/video streams to an attacker EvilSpy by following a link or being redirected to https://webrtc.example.org/?call=EvilSpy.
IANA is requested to register the following properties as specified in [RTCWEB-CONSTRAINTS]:
The following constraint names are defined to apply to both video and audio MediaStreamTrack objects:
Property Name | Values | Notes |
---|---|---|
sourceType | | The type of the source of the MediaStreamTrack. Note that the setting of this property is uniquely determined by the source that is attached to the Track. In particular, getCapabilities() will return only a single value for deviceId/Type. This property can therefore be used for initial media selection with getUserMedia(). However, it is not useful for subsequent media control with applyConstraints(), since any attempt to set a different value will result in an unsatisfiable ConstraintSet. |
deviceId | DOMString | The application-unique identifier for this source. The same identifier MUST be valid between sessions of this application, but MUST also be different for other applications. Some sort of GUID is recommended for the identifier. Note that the setting of this property is uniquely determined by the source that is attached to the Track. In particular, getCapabilities() will return only a single value for deviceId/Type. This property can therefore be used for initial media selection with getUserMedia(). However, it is not useful for subsequent media control with applyConstraints(), since any attempt to set a different value will result in an unsatisfiable ConstraintSet. |
groupId | DOMString | The group identifier for this source. Two devices have the same group identifier if they belong to the same physical device; for example the audio input and output devices representing the speaker and microphone of the same headset. |
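As a non-normative sketch, groupId can be used to cluster the devices returned by enumerateDevices() so that related devices are presented together; groupDevices below is a hypothetical helper, not part of this specification:

```javascript
// Hypothetical helper: cluster devices by groupId so that, e.g., the
// microphone and speaker of one headset end up in the same group.
function groupDevices(devices) {
  var groups = {};
  devices.forEach(function (device) {
    (groups[device.groupId] = groups[device.groupId] || []).push(device.kind);
  });
  return groups;
}

// In a browser (sketch):
// navigator.mediaDevices.enumerateDevices().then(function (devices) {
//   console.log(groupDevices(devices));
// });
```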
The following properties are defined to apply only to video MediaStreamTrack objects:
Property Name | Values | Notes |
---|---|---|
width | | The width or width range, in pixels, of the video source. As a capability, the range should span the video source's pre-set width values, with min being the smallest width and max being the largest width. |
height | | The height or height range, in pixels, of the video source. As a capability, the range should span the video source's pre-set height values, with min being the smallest height and max being the largest height. |
frameRate | | The exact desired frame rate (frames per second) or frame rate range of the video source. If the source does not natively provide a frame rate, or the frame rate cannot be determined from the source stream, then this value MUST refer to the User Agent's vsync display rate. |
aspectRatio | | The exact aspect ratio (width in pixels divided by height in pixels), represented as a double rounded to the tenth decimal place. |
facingMode | VideoFacingModeEnum | This string should be one of the members of VideoFacingModeEnum. The members describe the directions that the camera can face, as seen from the user's perspective. Note that getConstraints() may not return exactly the same string for strings not in this enum. This preserves the possibility of using a future version of a WebIDL enum for this property. |
enum VideoFacingModeEnum {
"user",
"environment",
"left",
"right"
};
Enumeration description |
---|---|
user | The source is facing toward the user (a self-view camera). |
environment | The source is facing away from the user (viewing the environment). |
left | The source is facing to the left of the user. |
right | The source is facing to the right of the user. |
Below is an illustration of the video facing modes in relation to
the user.
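A non-normative sketch of how the video properties above might be combined in a getUserMedia() request; the helper name and all specific values are illustrative, not required by this specification:

```javascript
// Hypothetical helper returning a Constraints object for a self-view
// camera. Bare values and { min, max } ranges are both shown; whether a
// device can satisfy them depends on its capabilities.
function selfViewConstraints() {
  return {
    video: {
      width: { min: 640, max: 1280 },   // pixels
      height: { min: 480, max: 720 },   // pixels
      frameRate: 30,                    // frames per second
      facingMode: "user",               // a VideoFacingModeEnum member
      advanced: [{ aspectRatio: 1.7777777778 }] // prefer 16:9 if possible
    }
  };
}

// In a browser (sketch):
// navigator.mediaDevices.getUserMedia(selfViewConstraints()).then(...);
```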
The following properties are defined to apply only to audio MediaStreamTrack objects:
Property Name | Values | Notes |
---|---|---|
volume | | The volume or volume range of the audio source, as a multiplier of the linear audio sample values. A volume of 0.0 is silence, while a volume of 1.0 is the maximum supported volume. A volume of 0.5 will result in an approximately 6 dBSPL change in the sound pressure level from the maximum volume. Note that any ConstraintSet that specifies values outside the range of 0 to 1 can never be satisfied. |
sampleRate | | The sample rate, in samples per second, for the audio data. |
sampleSize | | The linear sample size, in bits. This constraint can only be satisfied for audio devices that produce linear samples. |
echoCancellation | boolean | When one or more audio streams are being played in the presence of various microphones, it is often desirable to attempt to remove the sound being played from the input signals recorded by those microphones. This is referred to as echo cancellation. There are cases where echo cancellation is not needed, and it is desirable to turn it off so that no audio artifacts are introduced. This property allows applications to control this behavior. |
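A non-normative sketch combining the audio properties above into a getUserMedia() request; the helper name and the specific values are illustrative, not required by this specification:

```javascript
// Hypothetical helper returning audio constraints. Whether a device can
// satisfy them depends on its capabilities (e.g. sampleSize applies only
// to devices producing linear samples).
function conferenceAudioConstraints() {
  return {
    audio: {
      volume: { min: 0.0, max: 1.0 },  // multiplier of linear sample values
      sampleRate: 48000,               // samples per second
      sampleSize: 16,                  // bits per linear sample
      echoCancellation: true           // remove played-out audio from capture
    }
  };
}

// In a browser (sketch):
// navigator.mediaDevices.getUserMedia(conferenceAudioConstraints()).then(...);
```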
This section will be removed before publication.
The editors wish to thank the Working Group chairs and Team Contact, Harald Alvestrand, Stefan Håkansson, and Dominique Hazaël-Massieux, for their support. Substantial text in this specification was provided by many people, including Jim Barnett, Harald Alvestrand, Travis Leithead, and Stefan Håkansson.