User Interface Security and the Visibility API

1. Introduction

This section is not normative.

Composite or "mash-up" web applications built using iframes are ubiquitous because they allow users to interact seamlessly and simultaneously with content from multiple origins while maintaining isolation boundaries that are essential to security and privacy for both users and applications.

However, those boundaries are not absolute. In particular, the visual and temporal integrity of embedded content is not protected from manipulation by the embedding resource. An embedding resource might constrain the viewport, draw over, transform, reposition, or resize the user’s view of a third-party resource.

Such manipulations are collectively known as User Interface Redressing. Their goal might be to entice the user to interact with embedded content without knowing its context (e.g., to send a payment or share content), commonly known as "clickjacking", or to convince paid content that it is being shown to the user when it is actually obscured, commonly known in the advertising business as "display fraud".

Existing anti-clickjacking measures such as frame-busting scripts and headers granting origin-based embedding permissions have shortcomings which prevent their application to important use-cases. Frame-busting scripts, for example, rely on browser behavior that has not been engineered to provide a security guarantee and as a consequence, such scripts may be unreliable if loaded inside a sandbox or otherwise disabled. The X-Frame-Options header and the frame-ancestors Content Security Policy directive offer an all-or-none approach to display of embedded content that is not appropriate for content which may be embedded in arbitrary locations, or known locations which might still be adversarial.

This document defines mechanisms to allow resources to request to be displayed free of interference by their embedding context and learn if the user agent was able to satisfy such a request, with sufficient granularity to make decisions that can protect both users and content purveyors from various types of fraud.

First, this document defines an imperative API, VisibilityObserver, by which a resource can request that a conforming user agent guarantee unmodified display of its viewport, and report events on the success or failure of meeting such guarantees. This API should be suitable for e.g. paid content such as advertising to receive trustworthy signals about its viewability from a conforming user agent.

Secondly, this specification defines a declarative mechanism (via a Content Security Policy directive) to request visibility protection and receive notification, via event properties or out-of-band reporting, if certain events are delivered to a resource while it does not meet its requested visibility contract.

The declarative CSP interface does not offer the same fine-granularity control as the JavaScript API. Its goal is to allow protection to be retrofitted to legacy applications, with no or minimal code changes, as a replacement for X-Frame-Options, or potentially for use with content that is sandboxed and cannot execute JavaScript.

ISSUE: Do we need to deal with form submissions and navigations that aren’t JS-event-based?

ISSUE: How should this interact with frame-ancestors and X-Frame-Options (XFO)?

A notable non-goal is pixel-accurate information about what was actually displayed beyond its bounding rectangle, as this information can be quite difficult to obtain in an efficient manner, and is extremely difficult to accomplish without exposing timing side channels which leak information across the Same Origin Policy security boundary.

NOTE: Similar to, and modeled on, the Intersection Observer draft, this specification shares a goal of allowing reliable and low-cost calculation of element visibility, e.g., for purposes of reporting ad visibility for monetizing impressions. The current specification adds the goals of preventing clickjacking and other UI redressing attacks, both by enforcing that an iframe which has requested visibility be free of any transforms, movement or re-clipping within a defined time threshold, and by allowing event delivery to be intercepted or annotated when policies are not met.

Distinct from the Intersection Observer proposal, this specification operates internally on entire documents, on a per-iframe basis (although it provides some syntactic sugar for the declarative, event-driven API), rather than observing individual elements, and it affirmatively modifies the final composited result in the global viewport by promoting the graphics layer of an iframe that has requested visibility.

2. Special Conformance Notes

This section is not normative.

UI Redressing attacks rely on fooling the subjective perceptions of human actors to induce them to interact with a web application out of its intended context. Because of this, the specific mechanisms which may be used in attack and defense may vary greatly with the details of a user agent implementation. For example, attacks which rely on redressing the cursor may not apply in a touch environment, or entire classes of attack may be impossible on a text-only browser or screen reader.

Similarly, the implementation of the policies specified herein is highly dependent on internal architecture and implementation strategies of the user agent; such strategies may vary greatly between user agents or even across versions or platforms for a single user agent.

This specification provides a normative means by which a resource owner can communicate to a user agent its desire for additional protective measures, actions to take if violations are detected, and tuning hints which may be useful for certain means of implementation. A user agent is conformant if it understands these directives and makes a best effort to provide the desired security properties, which might require no additional implementation steps, e.g. in the case of a screen reader that does not support embedded resources in a manner that is subject to any of the attack classes of concern.

While the indeterminacy of the user agent implementation protects applications from needing to constantly update their policies as user agents make internal changes, application authors should understand that even a conformant user agent cannot make perfect security guarantees against UI Redressing.

These directives should be used as part of a comprehensive risk mitigation strategy with an appropriate understanding of their limitations.

3. VisibilityObserver API

The VisibilityObserver API provides an imperative API for developers to receive notification of visibility state changes for their document relative to the global viewport.

3.1. The VisibilityObserverCallback

callback VisibilityObserverCallback = void(sequence<VisibilityObserverEntry> entries, VisibilityObserver observer)

This callback will be invoked when there are changes to the document’s visibility state.

3.2. The VisibilityObserverEntry interface

[Constructor(VisibilityObserverCallback callback, optional VisibilityObserverEntryInit visibilityObserverEntryInit), Exposed=Window]
interface VisibilityObserverEntry {
  readonly attribute DOMRectReadOnly globalVisibleBounds;
  readonly attribute DOMRectReadOnly visibleBounds;
  readonly attribute DOMHighResTimeStamp time;
};

dictionary VisibilityObserverEntryInit {
  required DOMRectInit globalVisibleBounds;
  required DOMRectInit visibleBounds;
  required DOMHighResTimeStamp time;
};

globalVisibleBounds The DOMRect corresponding to the visible dimensions of the top-level document in the global viewport’s coordinate space.

visibleBounds The DOMRect corresponding to the document’s boundingClientRect, intersected by each of the document’s ancestor’s clipping rects, intersected with globalVisibleBounds. This value represents the portion of the document actually visible within globalVisibleBounds.

time A DOMHighResTimeStamp that corresponds to the time the visibility state was recorded.
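
As a rough illustration of how a consumer might use these attributes, the sketch below computes what fraction of a rectangle an entry reports as visible. The names RectLike, EntryLike, rectArea, and visibleFraction are hypothetical and are not defined by this specification. The sketch also assumes the caller supplies the document's full bounding rectangle itself, since the entry only carries the visible portion.

```typescript
// Minimal stand-in for the rectangle attributes on an entry (hypothetical type).
interface RectLike {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical shape mirroring the attributes of VisibilityObserverEntry.
interface EntryLike {
  globalVisibleBounds: RectLike;
  visibleBounds: RectLike;
  time: number;
}

// Area of a rectangle, treating negative dimensions as empty.
function rectArea(r: RectLike): number {
  return Math.max(0, r.width) * Math.max(0, r.height);
}

// Fraction of fullBounds (supplied by the caller) that the entry reports
// as visible. Returns 0 when fullBounds has no area.
function visibleFraction(entry: EntryLike, fullBounds: RectLike): number {
  const total = rectArea(fullBounds);
  if (total === 0) return 0;
  return Math.min(1, rectArea(entry.visibleBounds) / total);
}
```

A callback could pass each delivered entry through such a helper before deciding whether an impression counts as viewable.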

3.3. The VisibilityObserver Interface

The VisibilityObserver interface can be used to observe changes in the document’s visibility state relative to the global viewport.

[Constructor(VisibilityObserverCallback callback, optional VisibilityObserverInit options), Exposed=Window]
interface VisibilityObserver {
  void observe ();
  void unobserve ();
  sequence<VisibilityObserverEntry> takeRecords ();
};

new VisibilityObserver(callback, options)

Let this be a new VisibilityObserver object.
Set this’s internal [[callback]] slot to callback.

observe()

Add this to the document’s [[RegisteredVisibilityObservers]] list.

unobserve()

Remove this from the document’s [[RegisteredVisibilityObservers]] list.

takeRecords()

Let queue be a copy of this’s internal [[QueuedEntries]] slot.
Clear this’s internal [[QueuedEntries]] slot.
Return queue.
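
The method steps above can be modeled in a few lines. This is only a sketch under stated assumptions: ObserverSketch, its registry argument (standing in for the document's [[RegisteredVisibilityObservers]] list), and the enqueueForTest helper are hypothetical names, and the duplicate check in observe() is an assumption that the steps above do not state.

```typescript
type ObserverCallbackSketch<E> = (entries: E[], observer: ObserverSketch<E>) => void;

class ObserverSketch<E> {
  // Corresponds to the [[QueuedEntries]] internal slot.
  private queuedEntries: E[] = [];

  constructor(
    // Corresponds to the [[callback]] internal slot.
    readonly callback: ObserverCallbackSketch<E>,
    // Stand-in for the document's [[RegisteredVisibilityObservers]] list.
    private readonly registry: ObserverSketch<E>[],
  ) {}

  // observe(): add this observer to the registry.
  // Assumption: registering twice has no additional effect.
  observe(): void {
    if (this.registry.indexOf(this) === -1) this.registry.push(this);
  }

  // unobserve(): remove this observer from the registry.
  unobserve(): void {
    const i = this.registry.indexOf(this);
    if (i !== -1) this.registry.splice(i, 1);
  }

  // takeRecords(): return a copy of the queue, then clear the queue.
  takeRecords(): E[] {
    const queue = this.queuedEntries.slice();
    this.queuedEntries.length = 0;
    return queue;
  }

  // Hypothetical helper for experimentation; not part of the IDL.
  enqueueForTest(entry: E): void {
    this.queuedEntries.push(entry);
  }
}
```

Calling takeRecords() twice in a row returns the pending entries the first time and an empty list the second time.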

3.4. The VisibilityObserverInit dictionary

dictionary VisibilityObserverInit {
  (double or sequence<double>) areaThreshold = 0;
  boolean displacementAware = false;
  DOMString visibleMargin = "0px";
  Element? observedElement;
};

areaThreshold, of type (double or sequence<double>), defaulting to 0

List of threshold(s) at which to trigger callback. callback will be invoked when visibleBounds area changes from greater than or equal to any threshold to less than that threshold, and vice versa.

Threshold values must be in the range [0, 1.0] and represent a fraction of the area specified by the observed element's getBoundingClientRect().

Note: 0.0 is effectively "any non-zero number of pixels".
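
A minimal sketch of how an implementation might normalize this member follows. The function name normalizeAreaThreshold is hypothetical. Throwing a RangeError for out-of-range values and sorting the list in ascending order are both assumptions, since the text above only states the permitted range.

```typescript
// Accepts a single threshold or a list and returns a validated list.
// Assumption: out-of-range or NaN values raise a RangeError.
// Assumption: the list is kept in ascending order.
function normalizeAreaThreshold(value: number | number[] = 0): number[] {
  const list = Array.isArray(value) ? value.slice() : [value];
  for (const t of list) {
    // The negated range test also rejects NaN.
    if (!(t >= 0 && t <= 1)) {
      throw new RangeError("areaThreshold " + t + " is outside [0, 1.0]");
    }
  }
  return list.sort((a, b) => a - b);
}
```

With no argument this yields a list holding only the default threshold 0.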

displacementAware, of type (boolean), defaulting to false

If true, this observer should trigger the callback when the position of the [[observedElement]] changes relative to the global viewport.

visibleMargin, of type (DOMString), defaulting to "0px"

Uses the same syntax as the CSS margin property, and extends the required visibility rectangle beyond the rectangle returned by getBoundingClientRect() for the observed element. Can be 1, 2, 3 or 4 components, possibly negative lengths.

If there is only one component value, it applies to all sides. If there are two values, the top and bottom margins are set to the first value and the right and left margins are set to the second. If there are three values, the top is set to the first value, the left and right are set to the second, and the bottom is set to the third. If there are four values, they apply to the top, right, bottom, and left, respectively. For example:


  "5px"                // all margins set to 5px
  "5px 10px"           // top & bottom = 5px, right & left = 10px
  "-10px 5px 8px"      // top = -10px, right & left = 5px, bottom = 8px
  "-10px -5px 5px 8px" // top = -10px, right = -5px, bottom = 5px, left = 8px

observedElement, of type (Element), nullable

The Element being observed. If unset, the internal slot will be initialized to the Document element.
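
The one-to-four component expansion described for visibleMargin can be sketched as follows. The names MarginSketch and parseVisibleMargin are hypothetical, and the sketch assumes only px lengths are supplied, matching the examples given for that member.

```typescript
interface MarginSketch {
  top: number;
  right: number;
  bottom: number;
  left: number;
}

// Expands a 1 to 4 component margin string into per-side pixel values.
// Assumption: every component is a (possibly negative) px length.
function parseVisibleMargin(margin: string = "0px"): MarginSketch {
  const parts = margin.trim().split(/\s+/);
  if (parts.length > 4) {
    throw new SyntaxError("visibleMargin accepts at most 4 components");
  }
  const px = parts.map((p) => {
    const m = /^(-?\d+(?:\.\d+)?)px$/.exec(p);
    if (m === null) throw new SyntaxError("unsupported margin component: " + p);
    return parseFloat(m[1]);
  });
  // Missing components fall back as in the prose: right defaults to top,
  // bottom defaults to top, and left defaults to right.
  const [top, right = top, bottom = top, left = right] = px;
  return { top, right, bottom, left };
}
```

For instance, "-10px 5px 8px" expands to top = -10, right = 5, bottom = 8, left = 5.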

4. Content Security Policy Interface

This section describes the Content Security Policy directive introduced in this specification to provide declarative configuration of protection against input when an element does not meet its visibility requirements.

The optional directive-value allows configuration of conditions for which violations will be triggered.

4.1. The input-protection Directive

directive-name    = 'input-protection'
directive-value   = ['area-threshold=' num-val]
                    ['protected-element=' id-selector]
                    ['time-threshold=' num-val]
                    ['visible-margin=' num-val 'px' *3(',' num-val 'px')]

4.1.1. Directive Value

area-threshold A violation will be triggered if an event is delivered to the protected-element or one of its ancestors if the visibility of the protected area is below this threshold.

Threshold values must be in the range [0, 1.0] and represent a fraction of the area specified by protected-element.getBoundingClientRect(), adjusted by visible-margin. Unlike the imperative API, only a single value may be specified.

protected-element A DOMString used as the argument to getElementById() to resolve the Element to which the policy applies.

If unspecified the policy is applied to the resource’s Document node.

time-threshold A numeric value in the range [0, 10000] that specifies how long, in milliseconds, the screen area containing the protected-element must have unmodified viewability properties when an event is delivered to it or one of its ancestors.

If not specified, it defaults to 800. If a value outside of the range stated above is given, it is clamped to the nearest bound of that range.

visible-margin Same as visibleMargin.

If unspecified, it defaults to "0px".
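
To make the defaults and the clamping rule concrete, here is a hedged sketch of a directive-value parser. The names InputProtectionSketch and parseInputProtection are hypothetical. The grammar above does not say how options are separated, so whitespace separation is an assumption, as are ignoring unrecognized tokens and using 0 as the area-threshold default (mirroring the imperative API).

```typescript
interface InputProtectionSketch {
  areaThreshold: number;
  protectedElement: string | null; // null means "apply to the Document node"
  timeThreshold: number; // milliseconds
  visibleMargin: string;
}

// Hypothetical serialized form, assuming whitespace-separated options:
//   input-protection area-threshold=0.8 protected-element=pay-button time-threshold=500
function parseInputProtection(directiveValue: string): InputProtectionSketch {
  const policy: InputProtectionSketch = {
    areaThreshold: 0, // assumption: same default as the imperative API
    protectedElement: null,
    timeThreshold: 800, // default stated for time-threshold
    visibleMargin: "0px", // default stated for visible-margin
  };
  for (const token of directiveValue.trim().split(/\s+/)) {
    const eq = token.indexOf("=");
    if (eq === -1) continue; // assumption: skip tokens without a value
    const name = token.slice(0, eq);
    const value = token.slice(eq + 1);
    if (name === "area-threshold") {
      const n = Number(value);
      if (n >= 0 && n <= 1) policy.areaThreshold = n;
    } else if (name === "protected-element") {
      policy.protectedElement = value;
    } else if (name === "time-threshold") {
      const n = Number(value);
      // Out-of-range values are clamped to the nearest bound of [0, 10000].
      if (!isNaN(n)) policy.timeThreshold = Math.min(10000, Math.max(0, n));
    } else if (name === "visible-margin") {
      policy.visibleMargin = value;
    }
  }
  return policy;
}
```

An empty directive value yields the defaults, so a bare input-protection directive would apply to the whole document with a time threshold of 800 milliseconds.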

5. Processing Model

This section outlines the steps the user agent must take when implementing the VisibilityObserver API.

5.1. Internal Slot Definitions

5.1.1. Browsing Contexts

Each unit of related similar-origin browsing contexts has a VisibilityObserverTaskQueued flag which is initialized to false.

5.1.2. Element

Element objects have an internal [[InputProtectionObservers]] list, which is initially empty.

5.1.3. Document

Document objects have an internal [[RegisteredVisibilityObservers]] list, which is initially empty, and an [[InputProtectionRequested]] flag which is initially false.

5.1.4. VisibilityObserver

VisibilityObserver objects have the following internal slots:

[[QueuedEntries]] which is initialized to an empty list
[[previousVisibleRatio]] which is initialized to 0
[[previousGlobalViewportPosition]]

As well as internal slots initialized by VisibilityObserver(callback,options):

[[callback]]
[[areaThreshold]]
[[displacementAware]]
[[visibleMargin]]
[[observedElement]] which is initialized to the Document Element if not set in the VisibilityObserverInit dictionary

The following internal slots will be initialized to null unless the object was constructed to represent an input-protection directive.

[[timeThreshold]]
[[associatedContentSecurityPolicy]]

5.2. Algorithms

5.2.1. Queue a VisibilityObserver Task

To queue a visibility observer task for a unit of related similar-origin browsing contexts unit, run these steps:

If unit’s VisibilityObserverTaskQueued flag is set to true, return.
Set unit’s VisibilityObserverTaskQueued flag to true.
Post a task to notify visibility observers, or enqueue a task to notify visibility observers in the list of idle request callbacks with an appropriate timeout.
ISSUE: Should we define an appropriate timeout?

5.2.2. Notify VisibilityObservers

To notify visibility observers for a unit of related similar-origin browsing contexts unit, run these steps:

Set unit’s VisibilityObserverTaskQueued flag to false.
For each Document document in unit
1. Let notify list be a copy of document’s [[RegisteredVisibilityObservers]] list.
2. For each VisibilityObserver object observer in notify list, run these steps:
  1. If observer’s internal [[QueuedEntries]] slot is empty, continue.
  2. Let queue be a copy of observer’s internal [[QueuedEntries]] slot.
  3. Clear observer’s internal [[QueuedEntries]] slot.
  4. Invoke callback with queue as the first argument and observer as the second argument and callback this value. If this throws an exception, report the exception.

5.2.3. Queue a VisibilityObserverEntry

To queue a VisibilityObserverEntry for observer, given a unit of related similar-origin browsing contexts unit, VisibilityObserver observer, and VisibilityObserverEntry entry run these steps:

Append entry to observer’s internal [[QueuedEntries]] slot.
Queue a visibility observer task for unit.
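
The three algorithms above cooperate so that many queued entries result in a single notification task. The following sketch models that flow. All names (UnitSketch, QueueObserver, queueObserverTask, notifyObservers, queueEntry) are hypothetical, the pendingTasks array is a stand-in for the event loop's task queue, and collecting exceptions in reportedErrors is an assumption about what "report the exception" might look like.

```typescript
interface QueueObserver<E> {
  queuedEntries: E[]; // [[QueuedEntries]]
  callback: (entries: E[], observer: QueueObserver<E>) => void; // [[callback]]
}

interface UnitSketch<E> {
  taskQueued: boolean; // VisibilityObserverTaskQueued flag
  documents: { observers: QueueObserver<E>[] }[];
  pendingTasks: (() => void)[]; // stand-in for the task queue
  reportedErrors: unknown[]; // stand-in for exception reporting
}

// 5.2.1: post at most one notification task per unit.
function queueObserverTask<E>(unit: UnitSketch<E>): void {
  if (unit.taskQueued) return;
  unit.taskQueued = true;
  unit.pendingTasks.push(() => notifyObservers(unit));
}

// 5.2.2: deliver each observer's queued entries in one callback invocation.
function notifyObservers<E>(unit: UnitSketch<E>): void {
  unit.taskQueued = false;
  for (const doc of unit.documents) {
    const notifyList = doc.observers.slice();
    for (const observer of notifyList) {
      if (observer.queuedEntries.length === 0) continue;
      const queue = observer.queuedEntries.slice();
      observer.queuedEntries.length = 0;
      try {
        // The callback itself is used as the callback this value.
        observer.callback.call(observer.callback, queue, observer);
      } catch (err) {
        unit.reportedErrors.push(err);
      }
    }
  }
}

// 5.2.3: append the entry, then make sure a task is pending.
function queueEntry<E>(unit: UnitSketch<E>, observer: QueueObserver<E>, entry: E): void {
  observer.queuedEntries.push(entry);
  queueObserverTask(unit);
}
```

Queuing two entries before the task runs produces one pending task and one callback invocation that receives both entries.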

5.2.4. Promote Observed GraphicsLayers

This section is non-normative.

NOTE: The full internal details of rendering a document to the pixels actually displayed to the user is not standardized. UA implementations may vary widely.

The implementation strategy detailed in this section is not normative. Any strategy which produces correct outcomes for the normative algorithms is conformant and implementers are encouraged to optimize whenever possible.

The possibility of variance among user agent implementations notwithstanding, the normative algorithms of this specification are designed such that a highly performant implementation should be possible on the most common internal software and hardware architectures that are state-of-the-art for user agents and consumer computing platforms as of the time of writing.

In particular, the approach here deliberately avoids auditing the correctness of the representations displayed to users. In typical architectures, the pixel-level rendering of the global viewport is delegated to a Graphics Processing Unit (GPU) using higher-level abstractions like surfaces, polygons, and vectors. As a consequence, the main execution context of the user agent does not "know" what pixels actually result without reading them back. System architectures are optimized for sending data to a GPU, not returning data from it; therefore, approaches which rely on pixel comparisons are likely to have an unacceptable performance cost. Instead, the approach detailed here relies on correctness by design, by manipulating the order in which instructions are sent to the GPU such that malicious interference is not possible.

Generally, at some point in the rendering of a set of documents in nested browsing contexts into the fully composed graphical representation in the global viewport, a user agent will arrive at a set of intermediate representations we will designate as GraphicsLayers, each of which represents a graphical surface to be painted / clipped / scrolled.

A GraphicsLayer representing the contents of a document in an iframe will be arranged in the layer stack such that at a later phase in the rendering it is automatically clipped and positioned relative to the series of viewports above it, and also subject to being drawn over or transformed by the layers above it.

To prevent potentially malicious composition, the user agent can promote observed GraphicsLayers by manipulating them such that a document with [[RegisteredVisibilityObservers]]:

Is clipped and positioned as-if-unmodified within the set of viewports of its ancestor browsing contexts. A promoted document should not be able to occupy more screen real estate than it is given by its embedding contexts.
Responds to hit testing and events as-if-unmodified. Implementation-specific modifications to internal representations of the document should not change the behavior of the DOM.
Is not subject to being drawn over or transformed by any other GraphicsLayers, except other promoted layers, which should be treated as fully opaque occlusions when reporting the visibility state of the document.

To promote observed graphicsLayers, given a time now, and an initially empty list promotedLayers, run these steps during the rendering loop at the stage where the intermediate representation of a set of Documents is a set of GraphicsLayers graphicsLayers.

For each graphicsLayer in graphicsLayers
For each Document document with an intermediate representation in graphicsLayer
1. If document has an empty list of [[RegisteredVisibilityObservers]], continue.
2. If document has a non-empty list of [[RegisteredVisibilityObservers]]
  1. If document is not the only Document represented in graphicsLayer, apply whatever implementation-specific steps are necessary to place it in its own layer (e.g. apply translateZ(0) to the documentElement). Let graphicsLayer be that new layer.
3. Let rectToRaise be the value of document.getBoundingClientRect().
4. Intersect rectToRaise with document’s viewport clip rect.
5. For every parent browsing context parent between document and the top-level document, intersect rectToRaise with parent’s viewport clip rect, and finally with the global viewport clip rect.
6. Clip graphicsLayer to rectToRaise. (graphicsLayer may have zero width and height if it is scrolled off screen by an ancestor browsing context)
7. Intersect rectToRaise with any items in the promotedLayers list.
8. Add rectToRaise to the promotedLayers list.
9. Without reordering prior intermediate representations in a manner which would change event dispatching, hit testing, or the DOM as exposed to JavaScript, reorder the GraphicsLayers such that rectToRaise is on top of the root GraphicsLayer. (e.g. by making it a direct child of the root layer) but beneath any layers in promotedLayers that clipped it.
10. Let protectedRect be the value of observer’s [[observedElement]].getBoundingClientRect(), adjusted by [[visibleMargin]].
11. Let visibleRatio be the area of the intersection of protectedRect with rectToRaise, divided by the area of protectedRect if that area is non-zero, and 0 otherwise.
12. For each of document’s [[RegisteredVisibilityObservers]] observer
  1. Let threshold be the index of the first entry in observer’s internal [[areaThreshold]] slot whose value is greater than or equal to visibleRatio. If visibleRatio is equal to 0, let threshold be -1.
  2. Let oldVisibleRatio be set to observer’s internal [[previousVisibleRatio]] slot.
  3. Let oldThreshold be the index of the first entry in observer’s internal [[areaThreshold]] slot whose value is greater than or equal to oldVisibleRatio. If oldVisibleRatio is equal to 0, let oldThreshold be -1.
  4. Let oldPosition be the value of the observer’s internal [[previousGlobalViewportPosition]].
  5. If threshold does not equal oldThreshold, or if observer’s internal [[displacementAware]] slot is true and oldPosition is not equal to protectedRect,
    - queue a VisibilityObserverEntry
    - Assign visibleRatio to observer’s internal [[previousVisibleRatio]] slot.
    - Assign protectedRect to the value of the observer’s internal [[previousGlobalViewportPosition]] slot.
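
The geometric parts of the steps above (clipping through ancestor viewports, computing visibleRatio, and deciding whether to queue an entry) can be sketched as below. All names are hypothetical. Returning the list length when no threshold is greater than or equal to the ratio is an assumption, since the steps do not say which index to use in that case.

```typescript
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Intersection of two rectangles; empty results have zero width or height.
function intersectBoxes(a: Box, b: Box): Box {
  const x = Math.max(a.x, b.x);
  const y = Math.max(a.y, b.y);
  const right = Math.min(a.x + a.width, b.x + b.width);
  const bottom = Math.min(a.y + a.height, b.y + b.height);
  return { x, y, width: Math.max(0, right - x), height: Math.max(0, bottom - y) };
}

// Steps 4 and 5: clip a rectangle by each viewport clip rect in turn.
function clipToViewports(rect: Box, viewportClips: Box[]): Box {
  return viewportClips.reduce(intersectBoxes, rect);
}

// Step 11: visible area as a fraction of the protected area.
function computeVisibleRatio(protectedRect: Box, rectToRaise: Box): number {
  const total = protectedRect.width * protectedRect.height;
  if (total <= 0) return 0;
  const visible = intersectBoxes(protectedRect, rectToRaise);
  return (visible.width * visible.height) / total;
}

// Steps 12.1 and 12.3: index of the first threshold >= ratio, or -1 for 0.
// Assumption: thresholds.length is returned when no threshold qualifies.
function thresholdIndex(thresholds: number[], ratio: number): number {
  if (ratio === 0) return -1;
  for (let i = 0; i < thresholds.length; i++) {
    if (thresholds[i] >= ratio) return i;
  }
  return thresholds.length;
}

function sameBox(a: Box | null, b: Box): boolean {
  return a !== null && a.x === b.x && a.y === b.y && a.width === b.width && a.height === b.height;
}

// Step 12.5: queue when a threshold is crossed, or when a
// displacement-aware observer sees the protected rectangle move.
function shouldQueueEntry(
  thresholds: number[],
  oldRatio: number,
  newRatio: number,
  displacementAware: boolean,
  oldPosition: Box | null,
  protectedRect: Box,
): boolean {
  const crossed = thresholdIndex(thresholds, newRatio) !== thresholdIndex(thresholds, oldRatio);
  const moved = displacementAware && !sameBox(oldPosition, protectedRect);
  return crossed || moved;
}
```

Working on rectangles only, as here, is what keeps the approach independent of reading pixels back from the GPU.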

ISSUE: Find exact terms to make sure that we have viewport definitions minus scrollbars.

ISSUE: Need to also clip to any other layers that were promoted ahead of us.

ISSUE: If a parent and child layer both request to be promoted, the parent’s clipping window will have a complex geometry with holes in it that is not accounted for by this algorithm. Likely need to specify that graphics layers be processed by order of depth.

5.2.5. Enforce An input-protection Directive

To enforce an input-protection directive for a Document document, run the following steps:

Parse the policy according to [CSP2].
If a value is set for protected-element, let protectedElement be the Element returned by invoking document.getElementById() with the value as the input, or document if null or unset.
If document’s [[InputProtectionRequested]] flag is false, set it to true.
Construct a new VisibilityObserver observer, with [[areaThreshold]] set to the value of area-threshold, [[visibleMargin]] set to the value of visible-margin, [[observedElement]] set to protectedElement, [[displacementAware]] set to true, and [[callback]] set to a new function with an empty function body.
Set the internal [[timeThreshold]] slot of observer to the value of time-threshold.
Set the internal [[associatedContentSecurityPolicy]] slot of observer to a reference to the Content Security Policy which the input-protection directive is associated with.
When dispatching events, when an Element element will handle an Event event, if event is of type Mouse Event, Pointer Event, Drag-and-Drop, or Clipboard Event, (TODO:linkify) and if element has [[InputProtectionObservers]] observers:
1. If applicable, check the computed style for the cursor. If a cursor is typically displayed but has been hidden or changed to a non-standard bitmap, handle a violation for event and each observer in observers.
2. Otherwise, for each observer in observers:
  1. If observer’s [[previousVisibleRatio]] is less than [[areaThreshold]], handle a violation for observer.
  2. If observer’s [[previousVisibleRatio]] is greater than [[areaThreshold]], get the most recent VisibilityObserverEntry entry from observer’s [[QueuedEntries]]. If the difference between entry.time and now is less than [[timeThreshold]], handle a violation for observer.
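
The per-event check in the final step can be sketched as a pure function. The names ProtectionState and violatesInputProtection are hypothetical. The sketch follows the text literally, so a ratio exactly equal to the threshold triggers neither sub-step. Treating a missing queued entry as "no recent change" is an assumption.

```typescript
interface ProtectionState {
  previousVisibleRatio: number; // [[previousVisibleRatio]]
  areaThreshold: number; // [[areaThreshold]]
  timeThreshold: number; // [[timeThreshold]], in milliseconds
  // Time of the most recent queued entry, or null if none is queued.
  lastEntryTime: number | null;
}

// Returns true when delivering an event now would violate the policy.
function violatesInputProtection(
  state: ProtectionState,
  now: number,
  cursorTampered: boolean,
): boolean {
  // Sub-step 1: a hidden or non-standard cursor is always a violation.
  if (cursorTampered) return true;
  // Sub-step 2.1: not enough of the protected area is visible.
  if (state.previousVisibleRatio < state.areaThreshold) return true;
  // Sub-step 2.2: visible, but visibility changed too recently.
  if (state.previousVisibleRatio > state.areaThreshold && state.lastEntryTime !== null) {
    return now - state.lastEntryTime < state.timeThreshold;
  }
  return false;
}
```

With the default time-threshold of 800, an event arriving 200 milliseconds after the last visibility change counts as a violation, while one arriving a full second later does not.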

5.2.6. Handle a Violation

To handle a violation of an input-protection directive for observer and event, run the following steps:

Follow the steps in [CSP2] to report a violation for observer’s [[associatedContentSecurityPolicy]] policy.
Determine if policy is being enforced or monitored. [CSP2]
If policy is being enforced, set event’s canceled flag and stop propagation flag.
If policy is being monitored, set event.isUnsafe to true.
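
The branching in these steps is small enough to show directly. In this sketch EventFlags, Disposition, and handleViolationSketch are hypothetical names, the report callback stands in for the [CSP2] violation-reporting steps, and the flags are modeled as plain booleans rather than real DOM event state.

```typescript
// "enforce" and "report" model an enforced and a monitored policy.
type Disposition = "enforce" | "report";

// Plain-object model of the event flags touched by the algorithm.
interface EventFlags {
  canceled: boolean;
  stopPropagation: boolean;
  isUnsafe: boolean;
}

function handleViolationSketch(
  event: EventFlags,
  disposition: Disposition,
  report: (message: string) => void,
): void {
  // A violation report is sent in both modes.
  report("input-protection violation");
  if (disposition === "enforce") {
    // Enforced: the event is canceled and stops propagating.
    event.canceled = true;
    event.stopPropagation = true;
  } else {
    // Monitored: the event is still delivered, but annotated.
    event.isUnsafe = true;
  }
}
```

The monitored branch is what lets a page observe potential violations through isUnsafe without changing behavior for users.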

5.3. External Spec Integrations

5.3.1. HTML Processing Model: Event Loop

As part of substep 10 of the update the rendering event loop in the HTML Processing Model, Promote Observed GraphicsLayers, passing in now as the timestamp.

5.3.2. DOM: Dispatching Events

As part of dispatching events in the DOM Standard, add a substep to step 5, ("For each object in event path..."), invoking step 7 of enforce an input-protection directive before proceeding to "invoke object with event".

5.3.3. isUnsafe Attribute

partial interface Event {
  readonly attribute boolean isUnsafe;
};

isUnsafe, of type boolean, readonly: Will be set to true if the event fired when the event did not meet the document’s input-protection requirements.
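
A page whose policy is being monitored might consult this attribute before acting on input. The sketch below wraps a handler so that unsafe events are diverted. MaybeUnsafeEvent and guardedHandler are hypothetical names, and the event is modeled as a plain object so the example runs outside a browser.

```typescript
// Minimal event shape; isUnsafe is optional so that events from user
// agents without this attribute are treated as safe (an assumption).
interface MaybeUnsafeEvent {
  type: string;
  isUnsafe?: boolean;
}

// Wraps an action so it only runs for events not flagged as unsafe.
function guardedHandler<E extends MaybeUnsafeEvent>(
  action: (event: E) => void,
  onUnsafe: (event: E) => void,
): (event: E) => void {
  return (event) => {
    if (event.isUnsafe === true) {
      onUnsafe(event);
      return;
    }
    action(event);
  };
}
```

The onUnsafe branch could, for example, ask the user to confirm the action again instead of silently dropping it.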

6. Privacy Considerations

This section is non-normative.

The timing of visibilityEvents may leak some information across Origin boundaries. An embedded document might have previously been unable to learn that it was obscured, or the timing and nature of repositioning of ancestor frames’ viewports. In some circumstances, this information leak might have privacy implications, but the granularity and nature of the information is such that it should not be of much value to attackers. Compared to anti-clickjacking strategies which rely on pixel comparisons, the side channels exposed by comparing rectangular masks are very low bandwidth. The privacy gains from preventing clickjacking, considered in a holistic system context, may be quite large.

7. Security Considerations

This section is non-normative.

UI Redressing and Clickjacking attacks rely on violating the contextual and temporal integrity of embedded content. Because these attacks target the subjective perception of the user and not well-defined security boundaries, the heuristic protections afforded by the input-protection directive can never be 100% effective for every interface. It provides no protection against certain classes of attacks, such as displaying content around an embedded resource that appears to extend a trusted dialog but provides misleading information.

When used as a mechanism to report visibility for purposes of monetizing content, operators should be aware that a malicious or modified user agent can always report perfect visibility for content it colludes with. Determining, through remote measurement, whether an ostensible viewer of monetizable content is using an agent which faithfully implements and reports in conformance with this specification is out of scope for this document.

8. Accessibility Considerations

Users of accessibility tools MUST NOT be prevented from accessing content because of input-protection or VisibilityEvents. If a user agent’s interaction modality is not subject to UI redressing attacks or definitions of "visibility" do not apply, the user agent SHOULD report a VisibilityEvent indicating 100% visibility, and SHOULD never fire a violation for any input-protection policy.

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example".

Informative notes begin with the word “Note” and are set apart from the normative text with class="note".

Conformant Algorithms

Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.

Conformance requirements phrased as algorithms or specific steps can be implemented in any manner, so long as the end result is equivalent. In particular, the algorithms defined in this specification are intended to be easy to understand and are not intended to be performant. Implementers are encouraged to optimize.

W3C Working Draft, 3 June 2016