EditContext API Explainer

Introduction

The EditContext is a new API that simplifies the process of integrating a web app with advanced text input methods, improves accessibility and performance, and unlocks new capabilities for web-based editors.

Motivation

The web platform provides out-of-the-box editing experiences for single lines of plain-text (input), small amounts of multi-line plain-text (textarea) and a starting point for building an HTML document editing experience (contenteditable elements).

Each of the editable elements provided by the web platform comes with built-in editing behaviors that are often inadequate to power the desired editing experience. As a result, web-based editors don’t incorporate the web platform’s editable elements into their view. Unfortunately, the only API provided by the web platform today to enable advanced text input experiences is to place an editable element in the DOM and focus it.

This contradiction of needing an editable element, but not wanting it to be visible, leads web-based editors to create hidden editable elements to facilitate text input. This approach negatively impacts accessibility and increases complexity, leading to buggy behavior.

An alternative is to incorporate a contenteditable element into the view of the editor, regardless of whether the editor is editing an HTML document. This approach limits the editor’s flexibility in modifying the view, since the view is also powering the text input experience.

Real-world Examples of Text Input Issues in Top Sites and Frameworks

Accessibility Issues in the Monaco Editor

This video demos Windows Narrator reading from a hidden textarea element in the Monaco editor and compares it with the intended experience by showing Narrator reading text from CKEditor, which uses a contenteditable element as part of its view.

Monaco edits plain text - it’s a code editor. The plain text document is presented using a rich view created from HTML, but a hidden textarea is used to integrate with the text input services of the OS. This approach makes the hidden textarea the accessible surface for the editable content.

Two aspects of accessibility suffer as a result:

  1. The focused element is off screen, so Narrator doesn’t place a blue outline around the words as they are read aloud.
  2. Unless Monaco duplicates the whole document into the textarea element, only a fraction of the content can be read before Narrator moves prematurely out of the document content and starts reading elsewhere on the page.

Trouble Collaborating in Word Online while Composing Text

This video shows a collaboration feature in Word Online where two users can see each other’s edits and caret positions. Collaboration is suspended, though, while composition is in progress: when composition is active, updates to the view (especially near the composition) may cancel the composition and prevent text input.

To work around this problem, Word Online waits until the composition finishes before updating the view. Some Chinese IMEs don’t auto-commit their composition; it continues until the user presses Enter. As a result, collaboration may be blocked for some time.

Can’t Use the Windows Emoji Picker in Google Docs

In this video Google Docs is using an off screen contenteditable element to enable text input. This approach gives Google Docs access to text input features like an IME for composition, as well as enabling the emoji picker and other advanced text input options.

Google Docs is listening for events to ensure the contenteditable element is focused and positioned appropriately near the insertion point before composition starts. It isn’t aware of all events, or in some cases doesn’t receive any events, when other text input UI like the emoji picker is displayed. As a result, the emoji window is positioned near the top of the app (not near the insertion point) and input isn’t received since focus isn’t currently in an editable element.

Trouble Composing Across Page Boundaries

In this video Native Word on Windows is shown updating its view while in an active composition. The scenario demonstrated requires Word to relocate the active composition into a different page based on layout constraints.

Because the web platform integrates with the OS text input services through its HTML DOM view, updating the view while composition is in progress may cancel the composition and prevent text input. Using the EditContext, however, the view can be updated and new locations for where composition is occurring can be reported without canceling the composition.

No Support for Type-to-search in Custom Controls with Chinese Characters

This video demonstrates an IE feature that automatically selected an option in a select element based on the text typed by the user - even when that text is being composed.

Custom components have no ability to achieve similar behavior, but with the EditContext API type-to-search can be a reality for arbitrary custom elements. Non-editing scenarios will also benefit from the EditContext.

Proposal: EditContext API

The EditContext addresses the problems above by decoupling text input from the HTML DOM view. Rather than having the web platform infer the data required to enable sophisticated text input mechanisms from the HTML DOM, the author will provide that data explicitly through the API surface of the EditContext.

Specifically, the EditContext allows the author to provide:

Additionally, the EditContext communicates events driven from text input UI to JavaScript:

EditContext Event Sequence:

This section describes the sequence of events fired on the EditContext and the focused element when the EditContext has focus and an IME is active. In this event sequence, the user types two characters, then commits to the first IME candidate by pressing ‘Space’.

| Event | EventTarget | Related key in sequence |
| --- | --- | --- |
| keydown | focused element | Key 1 |
| compositionstart | active EditContext | |
| textupdate | active EditContext | |
| textformatupdate | active EditContext | |
| keyup | focused element | |
| keydown | focused element | Key 2 |
| textupdate | active EditContext | |
| textformatupdate | active EditContext | |
| keyup | focused element | |
| keydown | focused element | Space |
| textupdate | active EditContext (committed IME characters available in event.updateText) | |
| textformatupdate | active EditContext | |
| keyup | focused element | |
| compositionend | active EditContext | |

Note that the composition events are also not fired on the focused element as the composition is operating on the shared buffer that is represented by the EditContext.
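
The effect of the three textupdate events in this sequence on the EditContext’s text buffer can be sketched as follows. This is a minimal illustration, not the browser’s implementation: applyTextUpdate is a hypothetical helper, and the ranges and composition strings ("n", "ni", "你") are assumed values chosen for the example. Only the field names come from TextUpdateEventInit.

```javascript
// Hypothetical helper: apply one textupdate-shaped record to a plain-text
// buffer by replacing [updateRangeStart, updateRangeEnd) with updateText.
function applyTextUpdate(buffer, update) {
    return buffer.slice(0, update.updateRangeStart) +
        update.updateText +
        buffer.slice(update.updateRangeEnd);
}

// Assumed payloads for Key 1, Key 2 and Space (illustrative values only).
const updates = [
    { updateRangeStart: 0, updateRangeEnd: 0, updateText: "n" },
    { updateRangeStart: 0, updateRangeEnd: 1, updateText: "ni" },
    { updateRangeStart: 0, updateRangeEnd: 2, updateText: "你" }
];

// Replay the updates and keep a snapshot of the buffer after each one.
let buffer = "";
const snapshots = updates.map(update => (buffer = applyTextUpdate(buffer, update)));
```

Each update replaces the whole composition range, which is why the last event can swap the two composed characters for the committed one.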

EditContext WebIDL


dictionary TextUpdateEventInit {
    unsigned long updateRangeStart;
    unsigned long updateRangeEnd;
    DOMString updateText;
    unsigned long newSelectionStart;
    unsigned long newSelectionEnd;
};

[Exposed=Window]
interface TextUpdateEvent : Event {
    constructor(optional TextUpdateEventInit options = {});
    readonly attribute unsigned long updateRangeStart;
    readonly attribute unsigned long updateRangeEnd;
    readonly attribute DOMString updateText;
    readonly attribute unsigned long newSelectionStart;
    readonly attribute unsigned long newSelectionEnd;
};

dictionary TextFormatUpdateEventInit {
    unsigned long formatRangeStart;
    unsigned long formatRangeEnd;
    DOMString underlineColor;
    DOMString backgroundColor;
    DOMString suggestionHighlightColor;
    DOMString textColor;
    DOMString underlineThickness;
    DOMString underlineStyle;
};

[Exposed=Window]
interface TextFormatUpdateEvent : Event {
    constructor(optional TextFormatUpdateEventInit options = {});
    readonly attribute unsigned long formatRangeStart;
    readonly attribute unsigned long formatRangeEnd;
    readonly attribute DOMString underlineColor;
    readonly attribute DOMString backgroundColor;
    readonly attribute DOMString suggestionHighlightColor;
    readonly attribute DOMString textColor;
    readonly attribute DOMString underlineThickness;
    readonly attribute DOMString underlineStyle;
};

enum EditContextInputMode {
    "none",
    "text",
    "decimal",
    "search",
    "email",
    "numeric",
    "tel",
    "url",
    "password"
};

enum EditContextEnterKeyHint {
    "enter",
    "done",
    "go",
    "next",
    "previous",
    "search",
    "send"
};

enum EditContextInputPanelPolicy {
    "auto",
    "manual"
};

dictionary EditContextInit {
    DOMString text;
    unsigned long selectionStart;
    unsigned long selectionEnd;
    EditContextInputMode inputMode;
    EditContextInputPanelPolicy inputPanelPolicy;
    EditContextEnterKeyHint enterKeyHint;
};

/// @event name="textupdate", type="TextUpdateEvent"
/// @event name="textformatupdate", type="TextFormatUpdateEvent"
/// @event name="compositionstart", type="CompositionEvent"
/// @event name="compositionend", type="CompositionEvent"
[Exposed=Window]
interface EditContext : EventTarget {
    constructor(optional EditContextInit options = {});

    void updateSelection(unsigned long start, unsigned long end);
    void updateBounds(DOMRect controlBounds, DOMRect selectionBounds);
    void updateText(unsigned long start, unsigned long end, DOMString newText);

    attribute DOMString text;
    attribute unsigned long selectionStart;
    attribute unsigned long selectionEnd;
    attribute EditContextInputMode inputMode;
    attribute EditContextInputPanelPolicy inputPanelPolicy;
    attribute EditContextEnterKeyHint enterKeyHint;

    // Event handler attributes
    attribute EventHandler ontextupdate;
    attribute EventHandler ontextformatupdate;
    attribute EventHandler oncompositionstart;
    attribute EventHandler oncompositionend;
};

Difference between a contenteditable element and an element with an EditContext

Figure: contenteditable vs. EditContext

One can think of a div with contenteditable (on the left in the above figure) as a div with a built-in EditContext that maintains a plain text buffer. This buffer serves as a plain text view (or IME-facing view) for communicating with various text input services (e.g. IME, handwriting recognition, speech detection). When users initiate text input, the text input services update the plain text buffer through the plain text view. The built-in EditContext then sends internal events to the div, which takes the plain text buffer as part of its own model and updates the DOM, which serves as the user-facing view, based on default editing behaviors defined by the browser.

When a div is associated with an EditContext (on the right in the above figure), the “external” EditContext takes over the text input. Instead of directly triggering the default manipulation of the DOM, the text input now updates the plain text buffer in the external EditContext. The external EditContext then sends events to JavaScript, and web-based editors can listen to these events, update their own models, and manipulate the DOM to produce their desired editing experiences.

Note that EditContext only decouples and handles the manipulation of the plain text view coming from the text input services. Manipulation involving the user-facing view (e.g. drag and drop of selected text, spell check replacement, up/down arrow keys to move the caret between lines) or involving formats (e.g. Ctrl+B, outdent/indent) is out of scope for the EditContext. However, the beforeinput events for these manipulations will still fire on the div to convey user intent, and it is the editor’s responsibility to handle the editing operations.
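
One way to structure that responsibility is a small dispatcher keyed on the beforeinput inputType. The sketch below is an assumption about how an editor might organize its handlers: routeBeforeInput and the handler bodies are hypothetical, and only the inputType strings are taken from the beforeinput events discussed in this document.

```javascript
// Hypothetical dispatcher: map a beforeinput inputType to an editor-defined
// handler. Returns true when the editor handled the intent, false otherwise.
function routeBeforeInput(event, handlers) {
    if (!Object.prototype.hasOwnProperty.call(handlers, event.inputType)) {
        return false;
    }
    handlers[event.inputType](event);
    return true;
}

// Example handler table; the bodies only record the intent they received.
const handledIntents = [];
const handlers = {
    formatBold: () => handledIntents.push("toggle bold in the model"),
    insertFromPaste: () => handledIntents.push("insert clipboard text into the model"),
    deleteByCut: () => handledIntents.push("remove the selected range from the model")
};

// In a page this would be wired up roughly as:
//   div.addEventListener("beforeinput", e => routeBeforeInput(e, handlers));
// Here plain objects stand in for InputEvent instances.
const boldHandled = routeBeforeInput({ inputType: "formatBold" }, handlers);
const indentHandled = routeBeforeInput({ inputType: "formatIndent" }, handlers);
```

Returning a boolean lets the caller decide what to do with intents the editor does not recognize, for example logging them during development.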

Here are several key points when a div is associated with an EditContext:

The following table summarizes the differences between a div with contentEditable and a div with an EditContext for common editing commands:

| | <div contentEditable> | <div> with EditContext |
| --- | --- | --- |
| div gets focus (by clicking or .focus()) | <ul><li>Show focus ring</li><li>Show blinking caret</li></ul> | <ul><li>Show focus ring</li><li>Show blinking caret</li></ul> |
| English typing | <ul><li>beforeinput (insertText) -> div</li><li>div.innerHTML gets updated</li><li>input (insertText) -> div</li></ul> | <ul><li>beforeinput (insertText) -> div</li><li>editContext.text gets updated</li><li>textupdate -> EditContext</li></ul> |
| Backspace | <ul><li>beforeinput (deleteContentBackward) -> div</li><li>div.innerHTML gets updated</li><li>input (deleteContentBackward) -> div</li></ul> | <ul><li>beforeinput (deleteContentBackward) -> div</li><li>editContext.text gets updated</li><li>textupdate -> EditContext</li></ul> |
| Delete | <ul><li>beforeinput (deleteContentForward) -> div</li><li>div.innerHTML gets updated</li><li>input (deleteContentForward) -> div</li></ul> | <ul><li>beforeinput (deleteContentForward) -> div</li><li>editContext.text gets updated</li><li>textupdate -> EditContext</li></ul> |
| Very first composition input | <ul><li>compositionstart -> div</li><li>beforeinput (insertCompositionText) -> div</li><li>compositionupdate -> div</li><li>div.innerHTML gets updated</li><li>input (insertCompositionText) -> div</li></ul> | <ul><li>compositionstart -> EditContext</li><li>editContext.text gets updated</li><li>textupdate -> EditContext</li><li>textformatupdate -> EditContext</li></ul> |
| During composition (text input and arrow keys) | <ul><li>beforeinput (insertCompositionText) -> div</li><li>compositionupdate -> div</li><li>div.innerHTML gets updated</li><li>input (insertCompositionText) -> div</li></ul> | <ul><li>editContext.text gets updated</li><li>textupdate -> EditContext</li><li>textformatupdate -> EditContext</li></ul> |
| Commit composition (hit Enter) | <ul><li>beforeinput (insertCompositionText) -> div</li><li>compositionupdate -> div</li><li>div.innerHTML gets updated</li><li>input (insertCompositionText) -> div</li><li>compositionend -> div</li></ul> | <ul><li>editContext.text gets updated</li><li>textupdate -> EditContext</li><li>textformatupdate -> EditContext</li><li>compositionend -> EditContext</li></ul> |
| Ctrl+B / Ctrl+I / etc. | <ul><li>beforeinput (formatBold) -> div</li><li>div.innerHTML gets updated</li><li>input (formatBold) -> div</li></ul> | <ul><li>beforeinput (formatBold) -> div</li></ul> |
| Arrow keys (with shift) / Home / End / PageUp / PageDown / etc. | <ul><li>caret/selection is updated</li><li>selectionchange -> document</li></ul> | <ul><li>caret/selection is updated (in DOM space)</li><li>selectionchange -> document</li><li>EditContext’s selection is NOT auto updated</li><li>It will require web authors to map selection position from DOM space to EditContext’s plain text space</li></ul> |
| Mouse click (with shift) | <ul><li>caret/selection is updated</li><li>selectionstart</li><li>selectionchange -> document</li></ul> | <ul><li>caret/selection is updated (in DOM space)</li><li>selectionchange -> document</li><li>EditContext’s selection is NOT auto updated</li></ul> |
| Spell check replacement | <ul><li>beforeinput (insertReplacementText) -> div</li><li>div.innerHTML gets updated</li><li>input (insertReplacementText) -> div</li></ul> | <ul><li>beforeinput (insertReplacementText) -> div</li></ul> |
| Drag & drop selected words | <ul><li>beforeinput (deleteByDrag) -> div</li><li>input (deleteByDrag) -> div</li><li>beforeinput (insertFromDrop) -> div</li><li>div.innerHTML gets updated</li><li>input (insertFromDrop) -> div</li></ul> | <ul><li>beforeinput (deleteByDrag) -> div</li><li>beforeinput (insertFromDrop) -> div</li></ul> |
| Cut (ctrl+x) | <ul><li>beforeinput (deleteByCut) -> div</li><li>div.innerHTML gets updated</li><li>input (deleteByCut) -> div</li></ul> | <ul><li>beforeinput (deleteByCut) -> div</li></ul> |
| Copy | n/a | n/a |
| Paste (ctrl+v) | <ul><li>beforeinput (insertFromPaste) -> div</li><li>div.innerHTML gets updated</li><li>input (insertFromPaste) -> div</li></ul> | <ul><li>beforeinput (insertFromPaste) -> div</li></ul> |

EditContext Usage

Example 1: initialization

    var editContext = new EditContext();
    div.editContext = editContext;
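
Because the EditContext is a proposal and may not be available in every browser, an editor may want to feature-detect it before use. The following is a minimal sketch, assuming the element.editContext association shown in the example above; attachEditContext is a hypothetical helper, and the fallback strategy is left to the author.

```javascript
// Hypothetical helper: attach a new EditContext to an element when the API
// exists. Returns the EditContext, or null so that the caller can fall back
// to another input strategy (for example a hidden textarea).
function attachEditContext(element, EditContextCtor = globalThis.EditContext) {
    if (typeof EditContextCtor !== "function") {
        return null;
    }
    const editContext = new EditContextCtor();
    element.editContext = editContext;
    return editContext;
}

// In a page this would be called roughly as:
//   const editContext = attachEditContext(document.querySelector("#editContainer"));
//   if (editContext === null) { /* use the editor's existing input path */ }
```

Passing the constructor as a parameter is a design choice made here so the helper can be exercised outside a browser with a stand-in class.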

Example 2: event handler

    editContext.addEventListener("textupdate", e => {
        let s = document.getSelection();
        let textNode = s.anchorNode;
        let offset = s.anchorOffset;
        let string = textNode.textContent;
        // update the text Node
        textNode.textContent = string.substring(0, offset) + e.updateText + string.substring(offset);
    });

    editContext.addEventListener("textformatupdate", e => { 
        decoration.style.borderBottom = "3px " + e.underlineStyle;
    });

Example 3: mapping from DOM space to EditContext (plain text) space

    document.addEventListener("selectionchange", e => {
        let s = document.getSelection();

        // calculate the offset in plain text
        let range = document.createRange();
        range.setEnd(s.anchorNode, s.anchorOffset);
        range.setStartBefore(parentSpan);
        let plainText = range.toString();

        editContext.updateSelection(plainText.length, plainText.length);
    });
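
The same mapping can be expressed without the DOM when the view is built from a known sequence of text runs. The sketch below is an illustration under that assumption; the run list and the helper names toPlainTextOffset and toRunPosition are hypothetical.

```javascript
// Map a position (run index plus offset inside that run) to a plain-text offset.
function toPlainTextOffset(runs, runIndex, offsetInRun) {
    let offset = 0;
    for (let i = 0; i < runIndex; i++) {
        offset += runs[i].length;
    }
    return offset + offsetInRun;
}

// Inverse mapping: find the run that contains a plain-text offset.
// An offset on a run boundary resolves to the end of the earlier run.
function toRunPosition(runs, plainOffset) {
    let remaining = plainOffset;
    for (let i = 0; i < runs.length; i++) {
        if (remaining <= runs[i].length) {
            return { runIndex: i, offsetInRun: remaining };
        }
        remaining -= runs[i].length;
    }
    // Past the end: clamp to the end of the last run.
    const last = runs.length - 1;
    return { runIndex: last, offsetInRun: last >= 0 ? runs[last].length : 0 };
}

// Three runs that together spell "Hello world".
const runs = ["Hello ", "wor", "ld"];
const plainOffset = toPlainTextOffset(runs, 1, 2);
const position = toRunPosition(runs, plainOffset);
```

The forward mapping feeds editContext.updateSelection, while the inverse is useful when a textupdate event reports a new selection that has to be drawn in the view.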

Example 4

Create an EditContext and have it start receiving events when its associated container gets focus. After creating an EditContext, the web application should initialize the text and selection (unless the state of the web application is correctly represented by the empty defaults) via a dictionary passed to the constructor. Additionally, the layout bounds of selection and conceptual location of the EditContext in the view should be provided by calling updateBounds.

let editContainer = document.querySelector("#editContainer");

let editContextInit = {
    text: "Hello world",
    selectionStart: 11,
    selectionEnd: 11,
    inputMode: "text",
    inputPanelPolicy: "auto",
    enterKeyHint: "enter"
};
let editContext = new EditContext(editContextInit);

// EditModel and EditableView are author-supplied code omitted from this example for brevity.
let model = new EditModel(editContext, editContextInit.text, editContextInit.selectionStart, editContextInit.selectionEnd);
let view = new EditableView(editContext, model, editContainer);

window.requestAnimationFrame(() => {
    editContext.updateBounds(editContainer.getBoundingClientRect(), computeSelectionBoundingRect());
});

editContainer.focus();

The following code registers for textupdate and keyboard related events (note that keydown/keyup are still delivered to the edit container, i.e. the activeElement). Note that model represents the document model for the editable content, and view represents an object that produces an HTML view of that document.

editContainer.addEventListener("keydown", e => {
    // Handle control keys that don't result in characters being inserted
    switch (e.key) {
        case "Home":
            model.updateSelection(...);
            view.queueUpdate();
            break;
        case "Backspace":
            model.deleteCharacters(Direction.BACK);
            view.queueUpdate();
            break;
        ...
    }
});

editContext.addEventListener("textupdate", e => {
    model.updateText(e.updateText, e.updateRangeStart, e.updateRangeEnd);

    // Do not call updateText on editContext, as we're accepting
    // the incoming input.

    view.queueUpdate();
});

editContext.addEventListener("textformatupdate", e => {
    // Pass the whole event: the view reads the range and style properties from it.
    view.addFormattedRange(e);
});

Example 5

Example of a user-defined EditModel class that contains the underlying model for the editable content

// User defined class
class EditModel {
    constructor(editContext, text, selectionStart, selectionEnd) {
        // This specific model uses the underlying buffer of the editContext directly
        // and so doesn't have a backing text store of its own.
        this.editContext = editContext;
        this.text = text;
        // Plain object; the DOM Selection interface cannot be constructed directly.
        this.selection = { start: 0, end: 0 };
        this.setSelection(selectionStart, selectionEnd);
    }

    updateText(text, start, end) {
        // Replace the range [start, end) with the incoming text.
        this.text = this.text.slice(0, start) + text + this.text.slice(end);
        // Collapse the selection to the end of the inserted text.
        this.setSelection(start + text.length, start + text.length);
    }

    setSelection(start, end) {
        this.selection.start = start;
        this.selection.end = end;
    }

    updateSelection(...) {
        // Compute new selection, based on shift/ctrl state
        let newSelection = computeSelection(this.editContext.selectionStart, this.editContext.selectionEnd,...);
        this.setSelection(newSelection.start, newSelection.end);
        this.editContext.updateSelection(newSelection.start, newSelection.end);
    }

    deleteCharacters(direction) {
        if (this.selection.start !== this.selection.end) {
            // Notify EditContext that things are changing.
            this.editContext.updateText(this.selection.start, this.selection.end, "");
            this.editContext.updateSelection(this.selection.start, this.selection.start);

            // Update internal model state
            this.text = this.text.slice(0, this.selection.start) +
                this.text.slice(this.selection.end, this.text.length);
            this.setSelection(this.selection.start, this.selection.start);
        } else {
            // Delete a single character, based on direction (forward or back).
            // Notify editContext of changes
            ...
        }
    }
}
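
The collapsed-selection branch is elided above. One possible shape for it, kept separate from the class so it can be tested in isolation, is sketched below. deleteAt and its direction strings are hypothetical, and the sketch assumes one UTF-16 code unit per character, which is not adequate for surrogate pairs or grapheme clusters.

```javascript
// Hypothetical helper: delete one code unit next to a collapsed caret.
// Returns the removed range plus the resulting text and caret, so the caller
// can forward the same range to editContext.updateText(start, end, "").
function deleteAt(text, caret, direction) {
    const start = direction === "back" ? Math.max(0, caret - 1) : caret;
    const end = direction === "back" ? caret : Math.min(text.length, caret + 1);
    return {
        start,
        end,
        text: text.slice(0, start) + text.slice(end),
        caret: start
    };
}

const back = deleteAt("Hello", 5, "back");       // removes "o"
const forward = deleteAt("Hello", 0, "forward"); // removes "H"
const noop = deleteAt("Hello", 0, "back");       // nothing before the caret
```

Returning the range alongside the new text keeps the model update and the EditContext notification in step, since both are derived from the same start and end values.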

Example 6

Example of a user defined class that can compute an HTML view, based on the text model

class EditableView {
    constructor(editContext, editModel, editRegionElement) {
        this.editContext = editContext;
        this.editModel = editModel;
        this.editRegionElement = editRegionElement;

        // When the webpage scrolls, the layout position of the editable view
        // may change - we must tell the EditContext about this.
        window.addEventListener("scroll", this.notifyLayoutChanged.bind(this));

        // Same response is needed when the window is resized.
        window.addEventListener("resize", this.notifyLayoutChanged.bind(this));
    }

    queueUpdate() {
        if (!this.updateQueued) {
            requestAnimationFrame(this.renderView.bind(this));
            this.updateQueued = true;
        }
    }

    addFormattedRange(formatRange) {
        // Replace any previous formatted range by overwriting - there
        // should only ever be one (specific to the current composition).
        this.formattedRange = formatRange;
        this.queueUpdate();
    }

    renderView() {
        this.editRegionElement.innerHTML = this.convertTextToHTML(
            this.editModel.text, this.editModel.selection);

        this.notifyLayoutChanged();

        this.updateQueued = false;
    }

    notifyLayoutChanged() {
        this.editContext.updateBounds(this.computeBoundingBox(), this.computeSelectionBoundingBox());
    }

    convertTextToHTML(text, selection) {
        // compute the view (code omitted for brevity):
        // - if there is no selection, return a string with the text contents
        // - surround the selection by a <span> that has the
        //   appropriate background/foreground colors.
        // - surround the characters represented by this.formattedRange
        //   with a <span> whose style has properties as specified by
        //   the properties on 'this.formattedRange': textColor,
        //   backgroundColor, underlineColor, underlineStyle
    }
}
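
The body of convertTextToHTML is omitted above. A minimal sketch of the selection-wrapping part, written as standalone functions so it can run without a DOM, might look like the following. escapeHTML, renderTextWithSelection and the class name "selection" are assumptions for illustration; composition formatting is not handled.

```javascript
// Escape characters that are significant in HTML text content.
function escapeHTML(text) {
    return text
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

// Hypothetical renderer: wrap the selected range in a <span>. With a
// collapsed selection the escaped text is returned unchanged.
function renderTextWithSelection(text, selection) {
    const start = Math.min(selection.start, selection.end);
    const end = Math.max(selection.start, selection.end);
    if (start === end) {
        return escapeHTML(text);
    }
    return escapeHTML(text.slice(0, start)) +
        '<span class="selection">' + escapeHTML(text.slice(start, end)) + "</span>" +
        escapeHTML(text.slice(end));
}

const html = renderTextWithSelection("a < b", { start: 2, end: 3 });
```

Escaping each slice separately matters because the result is assigned to innerHTML in renderView; unescaped document text would otherwise be parsed as markup.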

Example Application

This example shows how an author can leverage native selection when using EditContext.

This example shows how an author might build a simple editor that leverages the EditContext in a more holistic way.

Interaction with Other Browser Editing Features

By decoupling the view from text input, the EditContext opts out of some editing behaviors that are currently only available through the DOM. An inventory of those features and their interaction with the EditContext follows:

Spellchecking

Web apps have no way today to integrate with spellcheck from the browser except through editable elements. Using the EditContext will make the native spellchecking capabilities of the browser unreachable. There is demand for an independent spellchecking API.

For web apps or editing frameworks relying on editable elements to provide this behavior, it may be a barrier to adoption of the EditContext. Note, however, that there are heavily used web editing experiences (Office Online apps, Google Docs) that have replaced spell checking with a custom solution and will not be blocked from adopting a better text input integration story, even in the absence of a separate spellcheck API. Similarly, there are editing experiences, e.g. Monaco, that don’t use spell checking from the browser because an element like a contenteditable can’t distinguish a string from a class name, which leads to many inappropriate squiggles in the code editing experience.

Undo

Web-based editors rarely want the DOM undo stack. Undo reverses the effect of DOM operations in an editable element that were initiated in response to user input. Since many editors use the editable element to capture text input from the user, but use JavaScript operations to update the view in response to that input, undoing only the DOM changes from user input rarely makes sense.

It is expected that web-based editors using the EditContext will provide their own undo operations. Some performance benefit should be realized as DOM operations will no longer incur the overhead of maintaining a valid undo stack as DOM mutations mix with user-initiated (undoable) actions.
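
An editor-owned undo stack can be built from the same range-replacement primitive that textupdate uses. The following is a sketch under that assumption, not a prescribed design; UndoStack and its method names are hypothetical.

```javascript
// Hypothetical undo stack: each entry records how to reverse one replacement.
class UndoStack {
    constructor(text = "") {
        this.text = text;
        this.entries = [];
    }

    // Replace [start, end) with newText and remember the inverse operation.
    replace(start, end, newText) {
        const removed = this.text.slice(start, end);
        this.text = this.text.slice(0, start) + newText + this.text.slice(end);
        this.entries.push({ start, end: start + newText.length, text: removed });
    }

    // Reverse the most recent replacement; returns false when nothing is left.
    undo() {
        const entry = this.entries.pop();
        if (!entry) {
            return false;
        }
        this.text = this.text.slice(0, entry.start) + entry.text + this.text.slice(entry.end);
        return true;
    }
}

const stack = new UndoStack("Hello world");
stack.replace(6, 11, "there"); // "Hello there"
stack.replace(0, 5, "Hi");     // "Hi there"
stack.undo();
const afterOneUndo = stack.text;
stack.undo();
const afterTwoUndos = stack.text;
```

In an editor, each undo would also be mirrored into the EditContext with updateText and updateSelection so that the input services see the restored buffer.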

Focus

The notion of focus in the DOM, which determines the target for KeyboardEvents, is unaffected by the EditContext. DOM elements can remain focused while the EditContext serves as the recipient of composition and textupdate events.

Built-in Editing Commands that Manipulate the DOM in Response to User Input

Web-based editors which use the EditContext are expected to provide their own editing command implementations. For example, typing Enter on the keyboard will not automatically insert a newline into the HTML view. An editor must handle the KeyboardEvent and perform updates to their own document model and render those changes into the HTML DOM for users to see the impact of the Enter key press.
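
For instance, handling Enter could look like the sketch below. It assumes the updateText and updateSelection methods from the WebIDL above; insertNewline, the plain-object model, and the recording stub that stands in for a real EditContext are hypothetical and exist only so the sketch can run outside a browser.

```javascript
// Hypothetical command: insert "\n" over the current selection, then mirror
// the change into the EditContext so input services see the same buffer.
function insertNewline(model, editContext) {
    const { start, end } = model.selection;
    model.text = model.text.slice(0, start) + "\n" + model.text.slice(end);
    model.selection = { start: start + 1, end: start + 1 };
    editContext.updateText(start, end, "\n");
    editContext.updateSelection(model.selection.start, model.selection.end);
}

// Stub that records calls in place of a real EditContext.
const calls = [];
const stubEditContext = {
    updateText: (start, end, text) => calls.push(["updateText", start, end, text]),
    updateSelection: (start, end) => calls.push(["updateSelection", start, end])
};

// In a page this would run from a keydown listener when e.key === "Enter".
const model = { text: "Hello world", selection: { start: 5, end: 6 } };
insertNewline(model, stubEditContext);
```

After the model changes, the editor would still need to re-render its view and report new bounds through updateBounds.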

As an alternative, basic editing command implementations could be implemented and expressed as textupdate events to the EditContext’s cached text view. Such a feature may make it easier for web-based editors to adopt since the EditContext will behave more like the hidden text area without the side effects.

However, if the EditContext did provide more editing behavior, it may not be used by editors since a key press like Enter or Backspace is often associated with editing heuristics such as ending or outdenting a list, turning a heading into a normal paragraph style, inserting a new table row, removing a hyperlink without removing any characters from the URL, etc.

The current thinking is that a more minimal approach is a better place to start.

Default Key Event Behavior Adaptations for Editing

Some KeyboardEvents are associated with different default behaviors when an editable element is focused than when a read-only element is focused. As an example, the spacebar inserts a space in editable elements, but scrolls when a read-only element is focused.

When an EditContext is active, the web platform will treat the set of KeyboardEvents with special editing behaviors as though the default behavior has been prevented, i.e. there will be no need for the author to call preventDefault to prevent scrolling when a Space key is pressed.

Touch-specific Editing Behaviors

Some browsers may support double-tap to zoom. When double tap occurs on editable text, however, it is commonly used to select the word under the double tap. Editors using read-only elements in conjunction with an EditContext can employ the touch-action CSS property to eliminate unwanted touch behavior.

Native Selection and Caret

Web-based editors using the EditContext that also want to use native selection and the caret don’t currently have a great solution. There are two problems in particular that must be overcome:

  1. A native caret currently can only be rendered in an editable region, so using an EditContext in combination with a read-only element in the DOM doesn’t support a native caret.
  2. Native selection is constrained to stay within the bounds of an editable element. This is likely expected behavior, but no such restriction is placed on read-only elements which could lead to over selection without an editable element that establishes a selection limit.

Option 1

New DOM content attributes could be proposed to constrain selection to a subtree of the DOM and allow display of the native caret.

Option 2

Editors implement their own selection and caret using DOM elements or the proposed Highlight API.

Option 2 is the default and may be the best starting point. It is currently employed by multiple editors as those editors offer specialized behavior related to selection: e.g. multiple insertion point support or rectangular selection or table selection.

Option 3

An editor could combine a contenteditable element with an EditContext. This has the advantage of overcoming both selection related challenges: constraining selection and displaying the native caret. It, however, has the disadvantage that editing behaviors not disabled by having an EditContext, for example clipboard paste and drag and drop, may result in DOM mutations which could break editors.

Highlighting

Editable elements can apply paint-time effects to mark an active composition and spellchecking results. These features won’t happen automatically for web-based editors using the EditContext. Instead, additional elements can be added to the DOM to render these effects, or, when available, the proposed Highlight API can be used.

Alternatives:

Multiple approaches have been discussed during F2F editing meetings and through online discussions.

Appendix

Example Text Input Methods

Virtual Keyboard Shape-writing

Handwriting Recognition

Emoji Picker

IME Composition

