
EditContext API Explainer

Introduction

The EditContext is a new API that simplifies the process of integrating a web app with advanced text input methods, improves accessibility and performance, and unlocks new capabilities for web-based editors.

Motivation

The web platform provides out-of-the-box editing experiences for single lines of plain-text (input), small amounts of multi-line plain-text (textarea) and a starting point for building an HTML document editing experience (contenteditable elements).

Each of the editable elements provided by the web platform comes with built-in editing behaviors that are often inadequate to power the desired editing experience. As a result, web-based editors don’t incorporate the web platform’s editable elements into their view. Unfortunately, the only API provided by the web platform today to enable advanced text input experiences is to place an editable element in the DOM and focus it.

This contradiction of needing an editable element, but not wanting it to be visible, leads web-based editors to create hidden editable elements to facilitate text input. This approach negatively impacts accessibility and increases complexity, leading to buggy behavior.

An alternative is to incorporate a contenteditable element into the view of the editor, regardless of whether the editor is editing an HTML document. This approach limits the editor’s flexibility in modifying the view, since the view is also powering the text input experience.

Real-world Examples of Text Input Issues in Top Sites and Frameworks

Accessibility Issues in the Monaco Editor

This video demos Windows Narrator reading from a hidden textarea element in the Monaco editor and compares it with the intended experience by showing Narrator reading text from CKEditor, which uses a contenteditable element as part of its view.

Monaco edits plain text - it’s a code editor. The plain text document is presented using a rich view created from HTML, but a hidden textarea is used to integrate with the text input services of the OS. This approach makes the hidden textarea the accessible surface for the content being edited.

Two aspects of accessibility suffer as a result:

  1. The focused element is off screen, so Narrator doesn’t place a blue outline around the words as they are read aloud.
  2. Unless Monaco duplicates the whole document into the textarea element, only a fraction of the content can be read before Narrator moves prematurely out of the document content and starts reading elsewhere on the page.

Trouble Collaborating in Word Online while Composing Text

This video shows a collaboration feature in Word Online where two users can see each other’s edits and caret positions. Collaboration is suspended, though, while composition is in progress. When composition is active, updates to the view (especially near the composition) may cancel the composition and prevent text input.

To work around this problem, Word Online waits until the composition finishes before updating the view. Some Chinese IMEs don’t auto commit their composition; it just keeps going until the user types Enter. As a result, collaboration may be blocked for some time.

Can’t Use the Windows Emoji Picker in Google Docs

In this video Google Docs is using an off screen contenteditable element to enable text input. This approach gives Google Docs access to text input features like an IME for composition, as well as enabling the emoji picker and other advanced text input options.

Google Docs is listening for events to ensure the contenteditable element is focused and positioned appropriately near the insertion point before composition starts. It isn’t aware of all events, or in some cases doesn’t receive any events, when other text input UI like the emoji picker is displayed. As a result, the emoji window is positioned near the top of the app (not near the insertion point) and input isn’t received since focus isn’t currently in an editable element.

Trouble Composing Across Page Boundaries

In this video Native Word on Windows is shown updating its view while in an active composition. The scenario demonstrated requires Word to relocate the active composition into a different page based on layout constraints.

Because the web platform integrates with the OS text input services through its HTML DOM view, updating the view while composition is in progress may cancel the composition and prevent text input. Using the EditContext, however, the view can be updated and new locations for where composition is occurring can be reported without canceling the composition.

No Support for Type-to-search in Custom Controls with Chinese Characters

This video demonstrates an IE feature that automatically selected an option in a select element based on the text typed by the user - even when that text is being composed.

Custom components have no ability to achieve similar behavior, but with the EditContext API type-to-search can be a reality for arbitrary custom elements. Non-editing scenarios will also benefit from the EditContext.

Proposal: EditContext API

The EditContext addresses the problems above by decoupling text input from the HTML DOM view. Rather than having the web platform infer the data required to enable sophisticated text input mechanisms from the HTML DOM, the author will provide that data explicitly through the API surface of the EditContext.

Specifically, the EditContext allows the author to provide:

Additionally, the EditContext communicates events driven from text input UI to JavaScript:

EditContext Event Sequence:

This section describes the sequence of events fired on the EditContext and the focused element when an IME is active. In this example, the user types ‘s’ and ‘u’ in Japanese, then commits the first candidate ‘巣’ by hitting ‘Space’.

| Event | EventTarget | key code | event.updateText |
| --- | --- | --- | --- |
| keydown | focused element | ‘S’ | |
| compositionstart | active EditContext | | |
| textupdate | active EditContext | | ‘S’ |
| textformatupdate | active EditContext | | |
| keyup | focused element | ‘S’ | |
| keydown | focused element | ‘U’ | |
| textupdate | active EditContext | | ‘す’ |
| textformatupdate | active EditContext | | |
| keyup | focused element | ‘U’ | |
| keydown | focused element | ‘Space’ | |
| textupdate | active EditContext | | ‘巣’ |
| textformatupdate | active EditContext | | |
| compositionend | active EditContext | | |
| keyup | focused element | ‘Space’ | |
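
The sequence above can be traced against a plain string model. The following is a minimal sketch and not part of the proposal: `applyTextUpdate` is a hypothetical helper, plain objects stand in for the three textupdate events, and the update ranges (which the table does not list) are assumed to replace the whole in-progress composition at each step.

```javascript
// Hypothetical helper: applies the fields of a textupdate event to a string model.
// Assumes [updateRangeStart, updateRangeEnd) is the range replaced by updateText.
function applyTextUpdate(text, e) {
    return text.slice(0, e.updateRangeStart) + e.updateText + text.slice(e.updateRangeEnd);
}

// Plain objects standing in for the three textupdate events in the table.
// The range values are assumptions chosen for illustration.
const updates = [
    { updateRangeStart: 0, updateRangeEnd: 0, updateText: "S" },
    { updateRangeStart: 0, updateRangeEnd: 1, updateText: "す" },
    { updateRangeStart: 0, updateRangeEnd: 1, updateText: "巣" },
];

let composedText = "";
const snapshots = [];
for (const e of updates) {
    composedText = applyTextUpdate(composedText, e);
    snapshots.push(composedText);
}
```

Each snapshot corresponds to what the editor's model would hold after one textupdate, so the view can be re-rendered from the model rather than from the DOM.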

EditContext WebIDL


dictionary TextUpdateEventInit {
    unsigned long updateRangeStart;
    unsigned long updateRangeEnd;
    DOMString updateText;
    unsigned long newSelectionStart;
    unsigned long newSelectionEnd;
};

[Exposed=Window]
interface TextUpdateEvent : Event {
    constructor(optional TextUpdateEventInit options = {});
    readonly attribute unsigned long updateRangeStart;
    readonly attribute unsigned long updateRangeEnd;
    readonly attribute DOMString updateText;
    readonly attribute unsigned long newSelectionStart;
    readonly attribute unsigned long newSelectionEnd;
};

dictionary TextFormatUpdateEventInit {
    unsigned long formatRangeStart;
    unsigned long formatRangeEnd;
    DOMString underlineColor;
    DOMString backgroundColor;
    DOMString suggestionHighlightColor;
    DOMString textColor;
    DOMString underlineThickness;
    DOMString underlineStyle;
};

[Exposed=Window]
interface TextFormatUpdateEvent : Event {
    constructor(optional TextFormatUpdateEventInit options = {});
    readonly attribute unsigned long formatRangeStart;
    readonly attribute unsigned long formatRangeEnd;
    readonly attribute DOMString underlineColor;
    readonly attribute DOMString backgroundColor;
    readonly attribute DOMString suggestionHighlightColor;
    readonly attribute DOMString textColor;
    readonly attribute DOMString underlineThickness;
    readonly attribute DOMString underlineStyle;
};

dictionary EditContextInit {
    DOMString text;
    unsigned long selectionStart;
    unsigned long selectionEnd;
};

/// @event name="textupdate", type="TextUpdateEvent"
/// @event name="textformatupdate", type="TextFormatUpdateEvent"
/// @event name="compositionstart", type="CompositionEvent"
/// @event name="compositionend", type="CompositionEvent"
[Exposed=Window]
interface EditContext : EventTarget {
    constructor(optional EditContextInit options = {});

    void updateSelection(unsigned long start, unsigned long end);
    void updateBounds(DOMRect controlBounds, DOMRect selectionBounds);
    void updateText(unsigned long start, unsigned long end, DOMString newText);

    attribute DOMString text;
    attribute unsigned long selectionStart;
    attribute unsigned long selectionEnd;

    // Event handler attributes
    attribute EventHandler ontextupdate;
    attribute EventHandler ontextformatupdate;
    attribute EventHandler oncompositionstart;
    attribute EventHandler oncompositionend;
};
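
To make the method semantics easier to picture, here is a small stand-in written in plain JavaScript. It is not the browser implementation: `PlainTextBuffer` is a hypothetical class, and it assumes that `updateText(start, end, newText)` replaces the half-open range [start, end) and that `updateSelection` simply stores the new offsets.

```javascript
// Hypothetical stand-in that mirrors the text and selection members of the IDL.
class PlainTextBuffer {
    constructor({ text = "", selectionStart = 0, selectionEnd = 0 } = {}) {
        this.text = text;
        this.selectionStart = selectionStart;
        this.selectionEnd = selectionEnd;
    }

    // Assumed semantics: replace [start, end) with newText.
    updateText(start, end, newText) {
        this.text = this.text.slice(0, start) + newText + this.text.slice(end);
    }

    // Assumed semantics: store the new selection offsets as given.
    updateSelection(start, end) {
        this.selectionStart = start;
        this.selectionEnd = end;
    }
}

const buffer = new PlainTextBuffer({ text: "Hello world", selectionStart: 11, selectionEnd: 11 });
buffer.updateText(6, 11, "EditContext");
buffer.updateSelection(buffer.text.length, buffer.text.length);
```

An editor would make the same two calls on a real EditContext whenever its own model changes, so that the text input services see the current text and selection.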

Difference between a contenteditable element and an element with an EditContext

Figure: a div with contenteditable (left) compared with a div associated with an EditContext (right).

One can think of a div with contenteditable (on the left in the above figure) as a div with a built-in EditContext that maintains a plain text buffer. That buffer serves as a plain text view (or IME-facing view) for communicating with various text input services (e.g. IME, handwriting recognition, speech detection). When users initiate text input, the text input services update the plain text buffer through the plain text view. The built-in EditContext then sends internal events to the div, which takes the plain text buffer as part of its own model and updates the DOM, which serves as the user-facing view, based on default editing behaviors defined by the browser.

When a div is associated with an EditContext (on the right in the above figure), the “external” EditContext takes over the text input. Instead of directly triggering the default manipulation of the DOM, the text input now updates the plain text buffer in the external EditContext. The external EditContext then sends events to JavaScript, and web-based editors can listen to the events, update their own models, and manipulate the DOM per their desired editing experiences.

Note that EditContext only decouples and handles the manipulation of the plain text view coming from the text input services. Manipulation involving the user-facing view (e.g. dragging and dropping selected text, spell check replacement, up/down arrow keys to move the caret between lines) or involving formats (e.g. Ctrl+B, outdent/indent) is out of scope for EditContext. However, the beforeinput events for these manipulations will still fire on the div to signal user intent, and it will be the editor’s responsibility to handle the editing operations.

Here are several key points when a div is associated with an EditContext:

The following table summarizes the differences between a div with contentEditable and a div with an EditContext for common editing commands:

| | <div contentEditable> | <div> with EditContext |
| --- | --- | --- |
| div gets focus (by clicking or .focus()) | <ul><li>Show focus ring</li><li>Show blinking caret</li></ul> | <ul><li>Show focus ring</li><li>Show blinking caret</li></ul> |
| English typing | <ul><li>beforeinput (insertText) -> div</li><li>div.innerHTML gets updated</li><li>input (insertText) -> div</li></ul> | <ul><li>beforeinput (insertText) -> div</li><li>editContext.text gets updated</li><li>textupdate -> EditContext</li></ul> |
| Backspace | <ul><li>beforeinput (deleteContentBackward) -> div</li><li>div.innerHTML gets updated</li><li>input (deleteContentBackward) -> div</li></ul> | <ul><li>beforeinput (deleteContentBackward) -> div</li><li>editContext.text gets updated</li><li>textupdate -> EditContext</li></ul> |
| Delete | <ul><li>beforeinput (deleteContentForward) -> div</li><li>div.innerHTML gets updated</li><li>input (deleteContentForward) -> div</li></ul> | <ul><li>beforeinput (deleteContentForward) -> div</li><li>editContext.text gets updated</li><li>textupdate -> EditContext</li></ul> |
| Very first composition input | <ul><li>compositionstart -> div</li><li>beforeinput (insertCompositionText) -> div</li><li>compositionupdate -> div</li><li>div.innerHTML gets updated</li><li>input (insertCompositionText) -> div</li></ul> | <ul><li>compositionstart -> EditContext</li><li>editContext.text gets updated</li><li>textupdate -> EditContext</li><li>textformatupdate -> EditContext</li></ul> |
| During composition (text input and arrow keys) | <ul><li>beforeinput (insertCompositionText) -> div</li><li>compositionupdate -> div</li><li>div.innerHTML gets updated</li><li>input (insertCompositionText) -> div</li></ul> | <ul><li>editContext.text gets updated</li><li>textupdate -> EditContext</li><li>textformatupdate -> EditContext</li></ul> |
| Commit composition (hit Enter) | <ul><li>beforeinput (insertCompositionText) -> div</li><li>compositionupdate -> div</li><li>div.innerHTML gets updated</li><li>input (insertCompositionText) -> div</li><li>compositionend -> div</li></ul> | <ul><li>editContext.text gets updated</li><li>textupdate -> EditContext</li><li>textformatupdate -> EditContext</li><li>compositionend -> EditContext</li></ul> |
| Ctrl+B / Ctrl+I / etc. | <ul><li>beforeinput (formatBold) -> div</li><li>div.innerHTML gets updated</li><li>input (formatBold) -> div</li></ul> | <ul><li>beforeinput (formatBold) -> div</li></ul> |
| Arrow keys (with shift) / Home / End / PageUp / PageDown / etc. | <ul><li>caret/selection is updated</li><li>selectionchange -> document</li></ul> | <ul><li>caret/selection is updated (in DOM space)</li><li>selectionchange -> document</li><li>EditContext’s selection is NOT auto updated</li><li>It will require web authors to map selection position from DOM space to EditContext’s plain text space</li></ul> |
| Mouse click (with shift) | <ul><li>caret/selection is updated</li><li>selectstart</li><li>selectionchange -> document</li></ul> | <ul><li>caret/selection is updated (in DOM space)</li><li>selectionchange -> document</li><li>EditContext’s selection is NOT auto updated</li></ul> |
| Spell check replacement | <ul><li>beforeinput (insertReplacementText) -> div</li><li>div.innerHTML gets updated</li><li>input (insertReplacementText) -> div</li></ul> | <ul><li>beforeinput (insertReplacementText) -> div</li></ul> |
| Drag & drop selected words | <ul><li>beforeinput (deleteByDrag) -> div</li><li>input (deleteByDrag) -> div</li><li>beforeinput (insertFromDrop) -> div</li><li>div.innerHTML gets updated</li><li>input (insertFromDrop) -> div</li></ul> | <ul><li>beforeinput (deleteByDrag) -> div</li><li>beforeinput (insertFromDrop) -> div</li></ul> |
| Cut (ctrl+x) | <ul><li>beforeinput (deleteByCut) -> div</li><li>div.innerHTML gets updated</li><li>input (deleteByCut) -> div</li></ul> | <ul><li>beforeinput (deleteByCut) -> div</li></ul> |
| Copy | n/a | n/a |
| Paste (ctrl+v) | <ul><li>beforeinput (insertFromPaste) -> div</li><li>div.innerHTML gets updated</li><li>input (insertFromPaste) -> div</li></ul> | <ul><li>beforeinput (insertFromPaste) -> div</li></ul> |
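
For the commands where only beforeinput fires on the div, the editor has to carry out the operation itself. One way to organize that is a lookup from `inputType` to an editor command. This is a sketch under stated assumptions: `routeBeforeInput`, the command bodies, and the `editor.log` array are hypothetical, and a plain object stands in for the event so the routing logic can be read on its own.

```javascript
// Hypothetical editor commands, keyed by inputType values that appear in the table.
const editorCommands = new Map([
    ["formatBold", (editor) => editor.log.push("toggle bold")],
    ["insertReplacementText", (editor) => editor.log.push("apply spelling replacement")],
    ["deleteByCut", (editor) => editor.log.push("cut selection")],
    ["insertFromPaste", (editor) => editor.log.push("paste clipboard")],
]);

// Runs the matching command and reports whether the editor handled the event.
function routeBeforeInput(editor, e) {
    const command = editorCommands.get(e.inputType);
    if (command === undefined) {
        return false;
    }
    command(editor, e);
    return true;
}

// Plain objects stand in for beforeinput events here.
const demoEditor = { log: [] };
const handledBold = routeBeforeInput(demoEditor, { inputType: "formatBold" });
const handledTyping = routeBeforeInput(demoEditor, { inputType: "insertText" });
```

In a page, the same function could be called from a beforeinput listener on the div. Types such as insertText are deliberately left out of the map because, per the table, the EditContext reports them through textupdate.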

EditContext Usage

Example 1: Initialization

    // This makes the div behave like a contenteditable div, except that user input goes to the
    // EditContext instead of the div: the div receives beforeinput events, is focusable, etc.,
    // but the DOM is not changed while the user types.
    var editContext = new EditContext();
    div.editContext = editContext;
    // When the associated element is focused, the EditContext is automatically activated.
    div.focus();

Example 2: Event handler

    // While the user types, the EditContext receives textupdate events carrying text info that can be
    // used to update the editor's model, or directly update the DOM (as shown in this example).
    editContext.addEventListener("textupdate", e => {
        let s = document.getSelection();
        let textNode = s.anchorNode;
        let offset = s.anchorOffset;
        let string = textNode.textContent;
        // update the text Node
        textNode.textContent = string.substring(0, offset) + e.updateText + string.substring(offset);
    });

    // EditContext will also receive textformatupdate events for IME decoration.
    // Ex. thin/thick underline for the "phrase mode" in Japanese IME.
    editContext.addEventListener("textformatupdate", e => { 
        decoration.style.borderBottom = "3px " + e.underlineStyle;
    });

Example 3: Mapping the selection from DOM space to EditContext (plain text) space

    document.addEventListener("selectionchange", e => {
        let s = document.getSelection();

        // Calculate the offset in plain text
        let range = document.createRange();
        range.setEnd(s.anchorNode, s.anchorOffset);
        range.setStartBefore(parentSpan);
        let plainText = range.toString();

        // EditContext doesn't handle caret navigation, so any caret navigation or selection change that
        // happens in DOM space must be mapped to plain text space by web authors and passed to EditContext.
        editContext.updateSelection(plainText.length, plainText.length);
    });
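
The same mapping can be expressed without the DOM when the editor keeps its own list of text runs. The following is a sketch, assuming a hypothetical model in which each array entry holds the text of one rendered text node, in document order; `plainTextOffset` is not part of the API.

```javascript
// Hypothetical helper: converts (run index, offset within run) to a plain text offset.
function plainTextOffset(runs, runIndex, offsetInRun) {
    let offset = 0;
    // Add the lengths of all runs that come before the one containing the caret.
    for (let i = 0; i < runIndex; i++) {
        offset += runs[i].length;
    }
    return offset + offsetInRun;
}

// Three runs, as a syntax-highlighted view might split one line of code.
const runs = ["function ", "greet", "() {}"];
const caretOffset = plainTextOffset(runs, 1, 3);
```

The resulting offset is what would be passed to `editContext.updateSelection` for a collapsed selection.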

Example 4: Update the control bounds and selection bounds for IME

    // The IME needs the control bounds (i.e. the conceptual location of the EditContext in the view)
    // and the selection bounds (if there is no selection, the bounding box of the caret) to show the
    // candidate window in the right position. The bounds are in the client coordinate space.
    let controlBound = editView.getBoundingClientRect();
    let s = document.getSelection();
    let selectionBound = s.getRangeAt(0).getBoundingClientRect();
    editContext.updateBounds(controlBound, selectionBound);
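
When the view is not DOM text (a canvas, for instance), there is no Range to measure, so the editor has to compute the selection bounds itself. Below is a sketch for a single-line monospace view. `selectionBoundsFor`, `charWidth`, and `lineHeight` are hypothetical names, and the result is a plain object with the same x, y, width, and height fields as a DOMRect.

```javascript
// Hypothetical helper: bounding box of the selection [start, end] in a single-line
// monospace view whose top-left corner is at (viewRect.x, viewRect.y) in client coordinates.
function selectionBoundsFor(viewRect, start, end, charWidth, lineHeight) {
    const from = Math.min(start, end);
    const to = Math.max(start, end);
    return {
        x: viewRect.x + from * charWidth,
        y: viewRect.y,
        width: (to - from) * charWidth, // zero width for a collapsed selection (caret)
        height: lineHeight,
    };
}

const viewRect = { x: 100, y: 50, width: 400, height: 20 };
const caretRect = selectionBoundsFor(viewRect, 5, 5, 8, 20);
const rangeRect = selectionBoundsFor(viewRect, 7, 3, 8, 20);
```

In a browser, the plain object could be converted with `new DOMRect(r.x, r.y, r.width, r.height)` before being reported to the EditContext.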

Example Application

This example shows how an author can use EditContext to implement (IME) typing on a <canvas> element. (demo video)

This example shows how an author can leverage native selection when using EditContext.

Interaction with Other Browser Editing Features

By decoupling the view from text input, the EditContext opts out of some editing behaviors that are currently only available through the DOM. An inventory of those features and their interaction with the EditContext follows:

Spellchecking

Web apps have no way today to integrate with spellcheck from the browser except through editable elements. Using the EditContext will make the native spellchecking capabilities of the browser unreachable. There is demand for an independent spellchecking API.

For web apps or editing frameworks relying on editable elements to provide this behavior, it may be a barrier to adoption of the EditContext. Note, however, that there are heavily used web editing experiences (Office Online apps, Google Docs) that have replaced spell checking with a custom solution and will not be blocked from adopting a better text input integration story, even in the absence of a separate spellcheck API. Similarly, there are also editing experiences, e.g. Monaco, that don’t use spell checking from the browser because an element like a contenteditable can’t tell a string from a class name, which leads to many inappropriate squiggles in the code editing experience.

Undo

Web-based editors rarely want the DOM undo stack. Undo reverses the effect of DOM operations in an editable element that were initiated in response to user input. Since many editors use the editable element to capture text input from the user, but use JavaScript operations to update the view in response to that input, undoing only the DOM changes from user input rarely makes sense.

It is expected that web-based editors using the EditContext will provide their own undo operations. Some performance benefit should be realized as DOM operations will no longer incur the overhead of maintaining a valid undo stack as DOM mutations mix with user-initiated (undoable) actions.
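
One way an editor might build its own undo on top of text updates is to record, for each change, the text it replaced. The sketch below is an assumption about how such a stack could look; `UndoableText` is a hypothetical class and is not part of the EditContext proposal.

```javascript
// Hypothetical text model that records an inverse operation for every replacement.
class UndoableText {
    constructor(text = "") {
        this.text = text;
        this.undoStack = [];
    }

    // Replace [start, end) with newText and remember how to reverse it.
    replace(start, end, newText) {
        const removed = this.text.slice(start, end);
        this.text = this.text.slice(0, start) + newText + this.text.slice(end);
        // The inverse replaces the inserted text with what was removed.
        this.undoStack.push({ start, end: start + newText.length, text: removed });
    }

    // Reverse the most recent replacement; returns false when nothing is left to undo.
    undo() {
        const inverse = this.undoStack.pop();
        if (inverse === undefined) {
            return false;
        }
        this.text = this.text.slice(0, inverse.start) + inverse.text + this.text.slice(inverse.end);
        return true;
    }
}

const draft = new UndoableText("abc");
draft.replace(1, 2, "XYZ");
draft.replace(5, 5, "!");
```

A real editor would likely group the updates of one composition into a single undo step; this sketch records every replacement separately to keep the idea visible.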

Focus

The notion of focus in the DOM, which determines the target for KeyboardEvents, is unaffected by the EditContext. DOM elements can remain focused while the EditContext serves as the recipient of composition and textupdate events.

Built-in Editing Commands that Manipulate the DOM in Response to User Input

Web-based editors which use the EditContext are expected to provide their own editing command implementations. For example, typing Enter on the keyboard will not automatically insert a newline into the HTML view. An editor must handle the KeyboardEvent and perform updates to their own document model and render those changes into the HTML DOM for users to see the impact of the Enter key press.
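
As a sketch of what such a command implementation might look like, the function below applies a key to a small model of `{ text, caret }`. Everything here is hypothetical (`applyKey` and the model shape are not part of the API), and the caret moves by UTF-16 code unit for brevity, which a real editor would refine.

```javascript
// Hypothetical command handler: returns a new model for keys the editor handles itself.
function applyKey(model, key) {
    const { text, caret } = model;
    switch (key) {
        case "Enter":
            // Insert a newline at the caret and move the caret past it.
            return { text: text.slice(0, caret) + "\n" + text.slice(caret), caret: caret + 1 };
        case "ArrowLeft":
            return { text, caret: Math.max(0, caret - 1) };
        case "ArrowRight":
            return { text, caret: Math.min(text.length, caret + 1) };
        default:
            // Keys not handled here leave the model untouched.
            return model;
    }
}

const before = { text: "ab", caret: 1 };
const afterEnter = applyKey(before, "Enter");
```

In a page, this could run from a keydown listener on the focused element, followed by re-rendering the view and reporting the new text and selection to the EditContext.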

As an alternative, basic editing commands could be implemented and expressed as textupdate events to the EditContext’s cached text view. Such a feature may make it easier for web-based editors to adopt since the EditContext will behave more like the hidden textarea without the side effects.

However, if the EditContext did provide more editing behavior, it may not be used by editors since a key press like Enter or Backspace is often associated with editing heuristics such as ending or outdenting a list, turning a heading into a normal paragraph style, inserting a new table row, removing a hyperlink without removing any characters from the URL, etc.

The current thinking is that a more minimal approach is a better place to start.

Default Key Event Behavior Adaptations for Editing

Some KeyboardEvents are associated with different default behaviors when an editable element is focused than when a read-only element is focused. As an example, the spacebar inserts a space in editable elements, but scrolls when a read-only element is focused.

When an EditContext is active, the web platform will treat the set of KeyboardEvents with special editing behaviors as though the default behavior has been prevented, i.e. there will be no need for the author to call preventDefault to prevent scrolling when a Space key is pressed.

Touch-specific Editing Behaviors

Some browsers may support double-tap to zoom. When double tap occurs on editable text, however, it is commonly used to select the word under the double tap. Editors using read-only elements in conjunction with an EditContext can employ the touch-action CSS property to eliminate unwanted touch behavior.

Native Selection and Caret

Web-based editors using the EditContext that also want to use native selection and the caret don’t currently have a great solution. There are two problems in particular that must be overcome:

  1. A native caret currently can only be rendered in an editable region, so using an EditContext in combination with a read-only element in the DOM doesn’t support a native caret.
  2. Native selection is constrained to stay within the bounds of an editable element. This is likely expected behavior, but no such restriction is placed on read-only elements which could lead to over selection without an editable element that establishes a selection limit.

Option 1

New DOM content attributes could be proposed to constrain selection to a subtree of the DOM and allow display of the native caret.

Option 2

Editors implement their own selection and caret using DOM elements or the proposed Highlight API.

Option 2 is the default and may be the best starting point. It is currently employed by multiple editors as those editors offer specialized behavior related to selection: e.g. multiple insertion point support or rectangular selection or table selection.
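
To illustrate why editors with specialized selection models tend to own their carets, here is a sketch of multiple insertion point support on a plain string. `insertAtCarets` is a hypothetical helper; it assumes every caret is a collapsed position expressed as a plain text offset.

```javascript
// Hypothetical helper: inserts the same text at every caret and returns the shifted carets.
function insertAtCarets(text, carets, inserted) {
    const sorted = [...carets].sort((a, b) => a - b);
    let result = "";
    let last = 0;
    const newCarets = [];
    sorted.forEach((caret, i) => {
        result += text.slice(last, caret) + inserted;
        last = caret;
        // Each caret shifts by the insertions made at or before it.
        newCarets.push(caret + (i + 1) * inserted.length);
    });
    result += text.slice(last);
    return { text: result, carets: newCarets };
}

const quoted = insertAtCarets("ab\ncd", [3, 0], "> ");
```

Only one of these positions can be reported as the EditContext selection at a time, so an editor would have to decide which caret the text input services follow.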

Option 3

An editor could combine a contenteditable element with an EditContext. This has the advantage of overcoming both selection related challenges: constraining selection and displaying the native caret. It, however, has the disadvantage that editing behaviors not disabled by having an EditContext, for example clipboard paste and drag and drop, may result in DOM mutations which could break editors.

Highlighting

Editable elements can apply paint-time effects to mark an active composition and spellchecking results. These features won’t happen automatically for web-based editors using the EditContext. Instead, additional elements can be added to the DOM to render these effects, or, when available, the proposed Highlight API can be used.
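
As a sketch of the first approach, the text can be split into segments around the range reported by a textformatupdate event, so that the view can wrap the decorated segment in its own element. `splitForDecoration` is hypothetical, and a plain object stands in for the event.

```javascript
// Hypothetical helper: splits text into undecorated and decorated segments
// using the formatRangeStart/formatRangeEnd fields of a textformatupdate event.
function splitForDecoration(text, e) {
    // Clamp the range so that out-of-date offsets cannot index past the text.
    const start = Math.max(0, Math.min(e.formatRangeStart, text.length));
    const end = Math.max(start, Math.min(e.formatRangeEnd, text.length));
    return [
        { text: text.slice(0, start), decorated: false },
        { text: text.slice(start, end), decorated: true, underlineStyle: e.underlineStyle },
        { text: text.slice(end), decorated: false },
    ].filter((segment) => segment.text.length > 0);
}

const segments = splitForDecoration("こんにちは世界", {
    formatRangeStart: 5,
    formatRangeEnd: 7,
    underlineStyle: "solid",
});
```

Each decorated segment can then be rendered as, for example, a span whose bottom border uses the reported underline style.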

Alternatives:

Multiple approaches have been discussed during F2F editing meetings and through online discussions.

Appendix

Example Text Input Methods

Virtual Keyboard Shape-writing

VK shape-writing

Handwriting Recognition

Handwriting Recognition

Emoji Picker

Emoji Picker

IME Composition

IME Compositions

Dictation

Dictation