Copyright © 2014 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved. W3C liability, trademark and document use rules apply.
This specification extends the Media Capture and Streams specification [GETUSERMEDIA] to allow a depth stream to be requested from the web platform using APIs familiar to web authors.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document is not complete and is subject to change. Early experimentation is encouraged to allow the Media Capture Task Force to evolve the specification based on technical discussions within the Task Force, implementation experience gained from early implementations, and feedback from other groups and individuals.
This document was published by the Device APIs Working Group and Web Real-Time Communications Working Group as a First Public Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-media-capture@w3.org (subscribe, archives). All comments are welcome.
Publication as a First Public Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures (Device APIs Working Group, Web Real-Time Communications Working Group) made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 August 2014 W3C Process Document.
This specification extends the MediaStream
interface [GETUSERMEDIA] to enable it to also contain depth-based
MediaStreamTracks. A depth-based
MediaStreamTrack, referred to as a depth
track, represents an abstraction of a stream of frames that can
each be converted to objects that contain an array of pixel data,
where each pixel value represents the distance between the camera and
the objects in the scene at that point in the array. A
MediaStream object that contains one or more
depth tracks is referred to as a depth stream.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MUST, MUST NOT, REQUIRED, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this specification are to be interpreted as described in [RFC2119].
This specification defines conformance criteria that apply to a single product: the user agent that implements the interfaces that it contains.
Implementations that use ECMAScript to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [WEBIDL], as this specification uses that specification and terminology.
The
Constraints,
MediaStreamConstraints, and
MediaTrackConstraints dictionaries, and the
MediaStreamTrack and
MediaStream interfaces that this specification extends,
are defined in [GETUSERMEDIA].
The
NavigatorUserMediaSuccessCallback callback is
defined in [GETUSERMEDIA].
The ImageData
interface and its
data attribute are defined in [2DCONTEXT2].
The ArrayBuffer,
ArrayBufferView
and Uint16Array
types are defined in [TYPEDARRAY].
A depth stream is a MediaStream object
that contains one or more depth tracks.
A depth track represents media sourced from a depth camera or other similar source.
Depth data represents the underlying depth data structure
of an area of a canvas element.
MediaStreamConstraints dictionary
partial dictionary MediaStreamConstraints {
(boolean or MediaTrackConstraints) depth = false;
};
The depth
attribute MUST return the value it was initialized to. When the
object is created, this attribute MUST be initialized to false. If
true, the attribute represents a request that the
MediaStream object returned as an argument of the
NavigatorUserMediaSuccessCallback contains a
depth track. If a Constraints structure is
provided, it further specifies the nature and settings of the
depth track.
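The constraint described above can be illustrated with a short, non-normative sketch. The guarded getUserMedia call assumes a browser that implements this extension; outside such an environment the snippet is inert.

```javascript
// Request a MediaStream containing both a video track and a depth track.
// The `depth` member is defined by this specification; `video` comes
// from [GETUSERMEDIA].
var constraints = {
  video: true,
  depth: true  // or a MediaTrackConstraints dictionary for finer control
};

function gotDepthStream(mediaStream) {
  // The success callback receives a depth stream: a MediaStream whose
  // track set includes at least one track with kind === "depth".
  console.log('Received stream with',
              mediaStream.getDepthTracks().length, 'depth track(s)');
}

function onFailure(error) {
  console.error('getUserMedia failed:', error.name);
}

// Guarded so the sketch does nothing outside a browser that implements
// this extension.
if (typeof navigator !== 'undefined' && navigator.getUserMedia) {
  navigator.getUserMedia(constraints, gotDepthStream, onFailure);
}
```

Passing a MediaTrackConstraints dictionary instead of `true` further specifies the nature and settings of the requested depth track, as described above.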
MediaStream interface
partial interface MediaStream {
sequence<MediaStreamTrack> getDepthTracks ();
};
The getDepthTracks()
method, when invoked, MUST return a sequence of
MediaStreamTrack objects representing the
depth tracks in this stream.
The getDepthTracks()
method MUST return a sequence that represents a snapshot of all the
MediaStreamTrack objects in this stream's track
set whose kind is equal to "depth".
The conversion from the track set to the sequence is user
agent defined and the order does not have to be stable between
calls.
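The behaviour of getDepthTracks() can be modelled non-normatively as a filter over the stream's track set. Plain objects stand in here for MediaStreamTrack (defined in [GETUSERMEDIA]); this is an illustrative sketch, not the normative algorithm.

```javascript
// Return a snapshot of the tracks in the track set whose kind is "depth".
function getDepthTracks(trackSet) {
  return trackSet.filter(function (track) {
    return track.kind === 'depth';
  });
}

// Hypothetical track set containing one track of each kind.
var trackSet = [
  { kind: 'video', id: 'v1' },
  { kind: 'depth', id: 'd1' },
  { kind: 'audio', id: 'a1' }
];

var depthTracks = getDepthTracks(trackSet);
console.log(depthTracks.length); // 1
```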
MediaStreamTrack interface
The kind attribute MUST, on getting, return
the string "depth" if the object represents a depth
track.
DepthData interface
Depth cameras usually produce 16-bit depth values per pixel. However, the canvas drawing surface used to draw and manipulate 2D graphics on the web platform does not currently support 16 bits per pixel.
To address the issue, this specification defines a new
DepthData interface and extends the
CanvasRenderingContext2D interface to provide
pixel manipulation methods that create, and
interact with, DepthData objects.
DepthData
[Constructor(unsigned long sw, unsigned long sh),
Constructor(Uint16Array data, unsigned long sw, optional unsigned long sh),
Exposed=(Window,Worker)]
interface DepthData {
readonly attribute unsigned long width;
readonly attribute unsigned long height;
readonly attribute Uint16Array data;
readonly attribute CameraParameters parameters;
};
New DepthData objects MUST be initialized so that
their width attribute is set to the number of entries
per row in the depth data, their height attribute
is set to the number of rows in the depth data, and their
data attribute, except where an existing array is
provided, is initialized to a new Uint16Array object.
The Uint16Array object MUST use a new Canvas Depth
ArrayBuffer for its storage, and MUST have a zero
start offset and a length equal to the length of its storage, in
bytes.
A Canvas Depth ArrayBuffer is an
ArrayBuffer whose data is represented in
left-to-right order, row by row top to bottom, starting with the top
left, with each pixel's depth component being given in that order for
each pixel. Each depth component of each pixel represented in this
array MUST be in the range 0..65535, representing the 16-bit value
for that depth component. The depth components MUST be assigned
consecutive indices starting with 0 for the top left pixel's depth
component.
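The layout above implies that the depth component for the pixel at coordinates (x, y) lives at index y × width + x. The following non-normative sketch shows that indexing with a plain Uint16Array standing in for the Canvas Depth ArrayBuffer's view; the helper names are illustrative only.

```javascript
// One 16-bit depth component per pixel, row-major, top-left first.
var width = 4, height = 3;
var data = new Uint16Array(width * height); // zero-filled backing store

// Index of the depth component for pixel (x, y) is y * width + x.
function depthAt(data, width, x, y) {
  return data[y * width + x];
}

function setDepthAt(data, width, x, y, value) {
  data[y * width + x] = value; // value must fit in 0..65535
}

setDepthAt(data, width, 2, 1, 1234);
console.log(depthAt(data, width, 2, 1)); // 1234
console.log(data.byteLength);            // 4 * 3 pixels * 2 bytes = 24
```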
CameraParameters interface
What is the minimum set of metadata
should expose? At minimum, the
general pinhole camera model have to be calculable. For related
discussion, see the
Focal length/fov capabilities and general camera intrinsics
thread.
CameraParameters
Each DepthData object is associated with a
CameraParameters object, which represents the parameters of a
pinhole camera model that describes the mathematical relationship
between the coordinates of a 3D point and its projection onto the
image plane.
interface CameraParameters {
readonly attribute double focalLength;
readonly attribute double horizontalViewAngle;
readonly attribute double verticalViewAngle;
};
The focalLength attribute, on getting, MUST return the
focal length of the camera in millimeters.
The horizontalViewAngle attribute, on getting, MUST
return the horizontal angle of view in degrees.
The verticalViewAngle attribute, on getting, MUST return
the vertical angle of view in degrees.
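As a non-normative illustration of why these parameters suffice for the pinhole model: a depth pixel can be deprojected into a 3D camera-space point once a per-pixel focal length is known, and that focal length can be derived from the horizontal view angle and the image width. The spec does not define this algorithm; the derivation and the assumption that the principal point sits at the image centre are illustrative only.

```javascript
// Derive the focal length in pixels from the horizontal view angle:
// f_px = (width / 2) / tan(hfov / 2).
function focalLengthInPixels(imageWidth, horizontalViewAngleDeg) {
  var halfAngle = (horizontalViewAngleDeg * Math.PI / 180) / 2;
  return (imageWidth / 2) / Math.tan(halfAngle);
}

// Deproject pixel (x, y) with the given depth into camera space under
// the pinhole model, assuming the principal point is the image centre.
function deproject(x, y, depth, width, height, horizontalViewAngleDeg) {
  var f = focalLengthInPixels(width, horizontalViewAngleDeg);
  var cx = width / 2, cy = height / 2;
  return {
    x: (x - cx) * depth / f,
    y: (y - cy) * depth / f,
    z: depth
  };
}

// A pixel at the principal point maps straight down the optical axis.
var p = deproject(320, 240, 1000, 640, 480, 60);
console.log(p); // { x: 0, y: 0, z: 1000 }
```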
CanvasRenderingContext2D interface
partial interface CanvasRenderingContext2D {
DepthData createDepthData (double sw, double sh);
DepthData createDepthData (DepthData depthdata);
DepthData getDepthData (double sx, double sy, double sw, double sh);
void putDepthData (DepthData depthdata, double dx, double dy);
void putDepthData (DepthData depthdata, double dx, double dy, double dirtyX, double dirtyY, double dirtyWidth, double dirtyHeight);
};
Define the algorithms for the createDepthData(),
getDepthData(), and putDepthData()
methods.
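Pending those definitions, one plausible reading, by analogy with the createImageData() algorithm in [2DCONTEXT2], is that createDepthData(sw, sh) returns a new, zero-initialized DepthData of the given dimensions. This plain-object sketch is an assumption for illustration, not the normative algorithm.

```javascript
// Non-normative sketch of createDepthData(sw, sh), modelled on
// createImageData(): allocate a zeroed Uint16Array of sw * sh entries.
function createDepthData(sw, sh) {
  var width = Math.abs(Math.round(sw));
  var height = Math.abs(Math.round(sh));
  return {
    width: width,
    height: height,
    data: new Uint16Array(width * height) // one 16-bit component per pixel
  };
}

var dd = createDepthData(640, 480);
console.log(dd.width, dd.height, dd.data.length); // 640 480 307200
```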
We may want to file bugs against [2DCONTEXT2] to add extension points this specification can hook into to facilitate reuse of common algorithms and avoid monkey patching.
Thanks to everyone who contributed to the Use Cases and Requirements document and who sent feedback and comments. Special thanks to Ningxin Hu for experimental implementations, as well as to the Project Tango team for their experiments.