This document describes requirements for a digital publishing term vocabulary that defines structural semantics of elements of a publication. It is hoped that these principles can aid assistive technologies and user agents as well as authors and publishers.

This is a work in progress. No section should be considered final, and the absence of any content does not imply that such content is out of scope, or may not appear in the future. If you feel something should be covered here, tell us!

Introduction

Several methods for including publishing-specific structural semantic terms in ebooks and online documents have been attempted over the years. These include using @class or @data-*, as well as creating namespaced attributes with associated vocabularies, to convey semantic meaning. Most of these methods are cumbersome for the author, present validation issues, or demonstrate poor tagging practice. This document proposes to develop a method for including digital publishing structural semantics that will also enhance accessibility by providing a clarified AT contract.

Structural semantics give authors and publishers a way to attach intent and specific meaning to HTML tagging in cases where the HTML grammar does not natively provide the needed semantics. A digital publishing structural semantics vocabulary defines a set of properties (with or without explicit associated behaviors) relating to specific elements of a publication.

Having digital publishing structural semantics natively supported in WAI-ARIA would yield multiple benefits. From a user's perspective, it can improve the reading experience, regardless of ability or disability, by potentially enabling richer UA behaviors and by allowing direct interaction with assistive technologies that support ARIA. From a publisher's perspective, benefits include easier content repurposing, native validation, and no longer having to use different forms of expression when publishing ebooks versus online content.

Objective

The Digital Publishing Interest Group and the Protocols and Formats WG will work jointly to establish a formalized method for using publishing-specific term vocabularies in WAI-ARIA. The joint task force will:

It is anticipated that this will require discussion and agreement regarding a non-AT-exclusive usage of the role attribute and support for centralized vs decentralized vocabulary approaches, as well as clarification of the UA behavioral contract for unrecognized values.

Success Criteria and Options

There are several options available for including a rich vocabulary of publishing terms; however, several factors determine the success of a method. The W3C Digital Publishing IG has identified the following criteria as important for success within the digital publishing ecosystem: @@@ TODO link to wiki

  1. Validation on open web platform: The solution should use a native OWP construct, such that the resulting content remains valid HTML5.
  2. Serialization agnostic: The solution must be directly usable in both serializations of HTML5, without any variations in syntax.
  3. Decentralized vocabulary support: The solution should allow expression of domain-specific concerns. Domain-specific values should be usable without an impact on validity, and the solution should also convey how User Agents must deal with unrecognized values.
  4. Clear AT contract: The solution must come with a native OWP contract that describes how Assistive Technologies will make use of the provided semantics when rendering the content. The contract must also describe AT behaviors for when the provided semantics are unrecognized, as well as AT behaviors for the case where the inflected semantics conflict with the given host language semantic.
  5. Static provisioning: The solution must not require support for script execution within the User Agent, so that the provided information remains available in circumstances where scripting is disabled or not supported at all. Note that many ebook reading systems do not support scripting.
  6. Simplicity and terseness: The solution should be easy to learn, and difficult to get wrong.
  7. Appropriateness: The solution's specification makes it appropriate for this specific usage and not overly complex.
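
Criteria 3 and 4 together imply a defined fallback for values that a user agent does not recognize. WAI-ARIA already treats the role attribute as a space-separated list of tokens, from which a user agent uses the first token it recognizes. The Python sketch below illustrates that fallback idea only; the set of known terms, the sample values, and the function name are hypothetical and are not taken from any vocabulary.

```python
# Hypothetical set of terms that one particular user agent recognizes.
KNOWN_TERMS = {"chapter", "glossary", "footnote", "note"}


def resolve_term(attribute_value, known_terms=KNOWN_TERMS):
    """Return the first recognized token in a space-separated value.

    Returns None when no token is recognized, which a user agent could
    treat as "fall back to the host element's own semantics".
    """
    for token in attribute_value.split():
        if token in known_terms:
            return token
    return None


# A domain-specific (unrecognized) value listed first, with a general fallback.
print(resolve_term("x-physics-law note"))
# A value with no recognized token at all.
print(resolve_term("x-physics-law"))
```

Listing a domain-specific term before a more general one lets content remain meaningful to user agents that know only the general term, which is one way a decentralized vocabulary could coexist with a clear contract.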

Enabling Assistive Technologies

Improved Navigation

Navigation and way-finding are key to giving all readers access to a publication. Sighted and non-sighted readers must be able to move efficiently between major sections of a publication and to quickly locate print-equivalent page boundaries, tables, figures, audio and video clips, and other features that they choose to read or skip while browsing content. The inclusion and identification of specialized navigational elements allows a user agent to offer custom navigation options to the user when requested.

Contextual Cues

Contextual cues offer screen reader technology information about deeply nested sectioning structures. This tells a reader who has followed a link where in the publication she is. For example, a student can invoke the screen reader function to report the position within the document hierarchy. Because the publisher has provided structural semantics for relevant sections within the content, the screen reader is able to report the position as “volume 1, part 2, chapter 3, subsection 4” although the actual current nesting level is ten levels deep. The student then invokes the screen reader function to read the current chapter and subchapter titles. The screen reader traverses the sectioning structure upwards until it has located the nearest subchapter and chapter, and renders the chapter and subchapter titles to the user.

Similarly, consider a blind student exploring the content of an infographic in SVG format. Though the SVG elements can have textual values, titles, and descriptions that a screen reader can voice, the student may need additional context to understand which SVG elements are the legend, data elements, or labels.
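
The upward traversal in the hierarchy-reporting scenario can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `Node` type, its fields, the term names, and the sample title are all hypothetical, and a real assistive technology would work against the accessibility tree rather than a hand-built structure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One nesting level; `term` is None for purely generic wrappers."""
    parent: Optional["Node"] = None
    term: Optional[str] = None      # hypothetical structural term, e.g. "chapter"
    ordinal: Optional[int] = None   # position among siblings with the same term
    title: Optional[str] = None


def report_position(node):
    """Walk upward from `node`, keeping only levels that carry a term."""
    parts = []
    current = node
    while current is not None:
        if current.term is not None:
            if current.ordinal is None:
                parts.append(current.term)
            else:
                parts.append(f"{current.term} {current.ordinal}")
        current = current.parent
    return ", ".join(reversed(parts))


def nearest_title(node, term):
    """Return the title of the closest level (node or ancestor) carrying `term`."""
    current = node
    while current is not None:
        if current.term == term:
            return current.title
        current = current.parent
    return None


# Generic wrappers (no term) are interleaved with semantically labeled levels.
volume = Node(term="volume", ordinal=1)
part = Node(parent=Node(parent=volume), term="part", ordinal=2)
chapter = Node(parent=Node(parent=part), term="chapter", ordinal=3, title="Optics")
subsection = Node(parent=chapter, term="subsection", ordinal=4)
here = Node(parent=Node(parent=subsection))

print(report_position(here))
print(nearest_title(here, "chapter"))
```

The point of the sketch is that unlabeled wrapper levels are skipped, so the reported position reflects the publication's logical structure rather than its raw nesting depth.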

Enabling UA Behavior

An additional benefit of a vocabulary of publishing terms is that the terms can signal specific behaviors to user agents. Some ebook reading systems already implement such behaviors using the EPUB vocabulary.

Intrinsic Glossaries

A user is reading a publication that includes specialized terminology. The built-in User Agent dictionary does not include many of these terms. The publication includes explicitly marked-up glossary terms and definitions for the specialized terms in the actual content. As a result, the User Agent is able to harvest these specialized terms and definitions, and expose them to the user via the common term lookup UI.
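
As a rough illustration of the harvesting step, the Python sketch below collects term and definition pairs from statically marked-up content using only the standard library. It assumes, purely for illustration, that glossaries are `<dl>` lists carrying a hypothetical `glossary` token in their role attribute; neither that markup pattern nor the class name comes from this document.

```python
from html.parser import HTMLParser


class GlossaryHarvester(HTMLParser):
    """Collect <dt>/<dd> pairs from <dl> lists flagged as glossaries."""

    def __init__(self):
        super().__init__()
        self.entries = {}
        self._dl_stack = []   # one bool per open <dl>: is it a glossary?
        self._field = None    # "dt", "dd", or None when not capturing
        self._term = ""
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "dl":
            role = dict(attrs).get("role") or ""
            # "glossary" is a hypothetical vocabulary term.
            self._dl_stack.append("glossary" in role.split())
        elif tag in ("dt", "dd") and self._dl_stack and self._dl_stack[-1]:
            self._field = tag
            self._text = []

    def handle_endtag(self, tag):
        if tag == "dl":
            if self._dl_stack:
                self._dl_stack.pop()
        elif tag == self._field:
            # Join captured text and normalize whitespace.
            text = " ".join("".join(self._text).split())
            if tag == "dt":
                self._term = text
            elif self._term:
                self.entries[self._term] = text
            self._field = None

    def handle_data(self, data):
        if self._field is not None:
            self._text.append(data)


sample = (
    '<dl role="glossary"><dt>Albedo</dt>'
    "<dd>Fraction of light <em>reflected</em> by a surface.</dd></dl>"
    "<dl><dt>Other</dt><dd>Not a glossary entry.</dd></dl>"
)
harvester = GlossaryHarvester()
harvester.feed(sample)
print(harvester.entries)
```

Because the information is carried in static markup, no script execution is needed in the content itself, which is consistent with the static provisioning criterion.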

Intelligent Indexes

A user is reviewing the index of a lengthy book about World War II. She can tap on the entries listed under the term “Normandy” to preview the content to which the index refers. This enables her to assess whether she wishes to access that link. Likewise, while reading a section about Normandy, the user can access the index by viewing a display of the relevant index entries and sub-entries. The window might display related index terms or a snippet of the index.

Conditional exposure of optional content

A user is reading a book with a large number of footnotes on a device with limited screen real estate. The user accesses the User Agent’s preferences and activates a mode where footnotes are hidden from view by default and instead exposed in a pop-up window only when the user activates a footnote reference link.

Domain-specific content fragments

A user is reading a STEM textbook. The content is structured into domain-specific fragments (e.g. physics: experiment, law; mathematics: theorem, proof) which frequently reference other fragments (“as Experiment 2 showed, this approximation is not yet accurate”, “by Theorem 4 the function must be continuous”). The user agent can integrate (activated) references into the reading flow, allowing the user to adapt the level of detail of the content.

Content Reuse

A publisher offers customers the option to create customized publications on the fly by selecting content from different source publications. The publisher uses an automated tool to process requests for dozens of customers a month. As the tool that performs the automated process traverses the selected chapters to determine what content to include in the resulting publication, it analyzes the included hyperlinks and includes the content referenced from links that are rearnote references, but excludes content referenced from links that are generic hyperlinks.
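
The link analysis in this scenario could look roughly like the Python sketch below. It is an illustration under assumptions, not a description of any existing tool: the attribute name (`role`), the token used to flag note references (`noteref`), and the class name are all hypothetical and configurable.

```python
from html.parser import HTMLParser


class LinkClassifier(HTMLParser):
    """Separate note-reference links from generic hyperlinks."""

    def __init__(self, attribute="role", note_term="noteref"):
        super().__init__()
        self.attribute = attribute   # hypothetical carrier attribute
        self.note_term = note_term   # hypothetical vocabulary term
        self.include = []            # targets to pull into the new publication
        self.skip = []               # targets left out

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        tokens = (attributes.get(self.attribute) or "").split()
        if self.note_term in tokens:
            self.include.append(href)
        else:
            self.skip.append(href)


chapter_markup = (
    '<p>As noted<a href="#n1" role="noteref">1</a>, see also '
    '<a href="http://example.org/">this page</a>.</p>'
)
classifier = LinkClassifier()
classifier.feed(chapter_markup)
print(classifier.include)
print(classifier.skip)
```

Without a semantic marker on the link, the tool would have no reliable way to tell the two kinds of reference apart, which is the motivation for this use case.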