publ-a11y-extended-desc


title: "Explainer: Extended Descriptions for Publications and the Web"
date: 11-27-2025
author: Publishing Maintenance Group Accessibility Task Force


Explainer: Extended Descriptions for Publications and the Web

Authors

Participate

Introduction

This explainer documents the need for, and the proposed semantics of, extended descriptions: longer, structured content that explains images, diagrams, tables, formulas, and other information-rich material that brief alternative text alone cannot adequately represent, and that must remain closely linked to the content it explains.

The document summarizes goals, non-goals, candidate approaches, examples, alternatives considered, and next steps to enable review by APA, TAG, implementers and the community.

User-Facing Problem

Users encounter images and other non-text content that convey complex information (e.g., technical diagrams, charts, mathematical notation, museum objects) where a short alt is insufficient and an extended, structured description is provided. Without clear affordances, users may not discover these descriptions or understand how they relate to the primary content.

While there is a well-established standard for providing alternative text (alt attributes) for images, there is currently no equivalent standardized mechanism for extended descriptions. This inconsistency creates a gap in accessibility support: authors cannot reliably associate extended descriptions with their images, and assistive technologies cannot uniformly discover and surface them.

Authors and reading systems need a reliable, discoverable, and programmatic way to identify and surface extended descriptions in consistent ways without breaking reading flow or excluding non-AT users. Reading solution developers, in particular, require explicit semantics to offer a dedicated, optimized experience for accessing extended descriptions—whether through pop-ups, side panels, or other specialized UI patterns that preserve reading context.

Collection managers and accessibility experts need to be able to identify and collect extended descriptions together with links to the images they describe.

At a platform level, this problem arises because extended image descriptions lack explicit, programmatically identifiable semantics. Web accessibility guidance requires that text alternatives and their relationships to non-text content be programmatically determinable, rather than inferred from presentation or author conventions.

HTML and ARIA follow this same architectural principle by providing explicit semantics for meaningful content, enabling consistent discovery and interaction by assistive technologies. While short text alternatives are programmatically associated with images, there is currently no equivalent semantic mechanism for extended descriptions. As a result, assistive technologies cannot reliably identify, announce, or expose extended descriptions consistently, even when authors provide them.

Goals

Non-goals

User research

Publisher feedback from Brazil, Europe, and North America has informed this work, with proposed solutions reviewed and validated by some publishers. Publisher associations in Italy and France have been engaged. Community use cases and testing are documented in publishing and accessibility working groups (see References).

A proof-of-concept (POC) in both HTML and EPUB formats has been developed and refined over three years by the DAISY Transition to EPUB working group, demonstrating practical patterns and their effectiveness across reading systems. For more information on best practices, see the Extended Descriptions Best Practices.

Further user testing is recommended to validate discoverability and presentation patterns in paginated vs. continuous reading contexts.

State of the art

Today, best practice relies on the use of aria-details to identify either a link to the extended description or the extended description itself. The aria-details attribute creates a programmatic relationship between an image and its extended description. The extended description can be implemented in two complementary ways depending on the publication format and authoring context:

Extended description adjacent to the image in the reading order

Extended descriptions can be embedded directly before or after the image they explain. This approach works well for content that is primarily web-based or when authors prefer to keep all content in a single file. The aria-details attribute points directly to the description, with no need for intermediate links.

Example pattern:

<!-- Main content -->
<img id="img1" src="figure1.png" alt="Schematic of the device" aria-details="desc-img1">

<!-- Adjacent extended description -->
<details id="desc-img1">
    <summary>Extended description — Figure 1</summary>
    <p>...detailed structured description...</p>
</details>

Extended descriptions placed in a separate section (possibly in another file)

Extended descriptions can be managed in a separate section rather than adjacent to the image. This approach avoids heavy additions to the original document structure and gives users the choice to consult the extra content without disrupting the reading flow.

Example pattern:

<!-- Main content -->
<img id="img1" src="figure1.png" alt="Schematic of the device" aria-details="extdesc-1">
<a id="extdesc-1" href="extended-descriptions.xhtml#desc-img1">Extended description</a>

<!-- Extended descriptions section -->
<section id="desc-img1">
    <h2>Extended description — Figure 1</h2>
    <img src="figure1.png" role="presentation" alt="">
    <p>...detailed structured description...</p>
    <a role="doc-backlink" href="chapter01.xhtml#extdesc-1">Back to image</a>
</section>

General considerations

Both implementations use aria-details to create the semantic link between image and description. Authors should ensure the referenced content is visible to all users and test the pattern across common reading systems and screen readers because user agent and AT support can vary.

Current limitations

Testing revealed that extended descriptions cannot currently be identified with certainty. This limitation manifests in two critical areas:

Additionally, testing showed that declarative workarounds parsed with XPath are impractical: identification is limited to a single document, preprocessing and DOM traversals are expensive, and the parsing logic required is sophisticated. For many images or large collections, such implementations scale poorly and are more error-prone and resource-intensive than simple, explicit markup approaches.

The footnote precedent

Footnotes provide a well-established, accessible pattern: a reference in the main flow that links to a separate, uniquely identified container with a backlink to return to the reading position.

DPUB ARIA’s doc-noteref / doc-footnote roles are a concrete example of how explicit, paired semantics enable assistive technologies to announce purpose, provide navigation, and allow tooling to extract related pairs programmatically. Thanks to these declarative semantics, reading solutions can enable specific UX features (for example, letting users keep their place in the text), and even when they do not, the pattern still works as ordinary links.

This precedent shows the value of:

- unique IDs for references and targets,
- bidirectional navigation (reference → note, note → backlink),
- exposing semantics in the accessibility tree so user agents and AT can offer specialized affordances.
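As an illustration of this precedent, a footnote pair marked up with the DPUB ARIA roles might look like the following sketch. The IDs (fnref1, fn1) and the placeholder text are assumptions made for this example and are not taken from the proof of concept.

```html
<!-- Reference in the main reading flow (hypothetical IDs) -->
<p>...main text...<a id="fnref1" href="#fn1" role="doc-noteref">1</a></p>

<!-- Uniquely identified note, with a backlink to the reference -->
<aside id="fn1" role="doc-footnote">
    <p>...note text...
        <a href="#fnref1" role="doc-backlink">Back to text</a></p>
</aside>
```

The reference and the note each carry a unique ID and each links to the other, so a reading system can present the note in a pop-up, while a system without such a feature can still follow the links.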

Extended descriptions share these link-and-return needs but differ in scope: they are typically longer, structured, and may include media or complex markup. That difference argues against reusing footnote roles directly.

The footnote model is a useful precedent for linking and navigation patterns, but extended descriptions benefit from distinct semantics (e.g., extendeddescriptionref / extendeddescription) that reflect their content and afford appropriate UX and tooling behavior. Because the references are ordinary links, they remain usable even with legacy reading systems.

Figure: Screenshot of the Apple Books reader displaying a footnote. The main text includes a superscript number linking to the footnote section. The referenced note text appears in a pop-up panel directly above the superscript.

Figure: Screenshot of the Thorium Reader displaying a footnote. The main text includes a superscript number that links to the footnote section. The referenced note text appears in a pop-up panel at the bottom of the screen.

Hiding and skipping extended descriptions

A key benefit of explicit semantics is that reading systems can enable users to hide or skip extended descriptions based on their preferences. This is especially relevant when extended descriptions are adjacent to the image in the reading order, where they may otherwise disrupt the flow of the main text. Implementing hide/skip functionality relies on standardized markup that explicitly identifies image description content. Specific use cases include:

These capabilities ensure that extended descriptions enhance accessibility for users who need them while not imposing them on users who prefer alternative reading approaches or interaction patterns.
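As a rough sketch of the hide case, a reading system could apply a stylesheet like the one below when the user chooses not to see extended descriptions. This assumes the role values proposed in this explainer (extendeddescription and extendeddescriptionref), which are not part of any published ARIA specification, and the user preference itself is hypothetical.

```html
<!-- Hypothetical reading-system stylesheet, injected only when the user
     has opted to hide extended descriptions. The role values are the
     ones proposed in this explainer, not existing ARIA roles. -->
<style>
    [role="extendeddescription"],
    [role="extendeddescriptionref"] {
        display: none;
    }
</style>

<!-- The image stays in the reading flow; only the description is hidden -->
<img id="img1" src="figure1.png" alt="Schematic of the device" aria-details="extdesc-1">
<details id="extdesc-1" role="extendeddescription">
    <summary>Extended description for Figure 1</summary>
    <p>...detailed structured description...</p>
</details>
```

A single attribute selector is enough here only because the description is explicitly identified; without such markup, the reading system would have to infer which elements are descriptions. Note that display: none also removes the content from the accessibility tree, so this setting should be applied only at the user's request.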

Proposed approach

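We propose two complementary ARIA roles to strengthen link semantics and make extended-description relationships explicit to assistive technologies: extendeddescriptionref, for a link that points to an extended description, and extendeddescription, for the container that holds the description.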

Implementation for extended description adjacent to the image in the reading order

Example pattern:

<!-- Main content -->
<img id="img1" src="figure1.png" alt="Schematic of the device" aria-details="extdesc-1">

<!-- Adjacent extended description -->
<details id="extdesc-1" role="extendeddescription">
    <summary>Extended description — Figure 1</summary>
    <p>...detailed structured description...</p>
</details>

Implementation for extended descriptions collected in a separate section (possibly in another file)

Example pattern:

<!-- Main content -->
<img id="img1" src="figure1.png" alt="Schematic of the device" aria-details="extdesc-1">
<a id="extdesc-1" role="extendeddescriptionref" href="extended-descriptions.xhtml#desc-img1">Extended description</a>

<!-- Extended descriptions section -->
<section id="desc-img1" role="extendeddescription">
    <h2>Extended description — Figure 1</h2>
    <img src="figure1.png" role="presentation" alt="">
    <p>...detailed structured description...</p>
    <a role="doc-backlink" href="chapter01.xhtml#extdesc-1">Back to image</a>
</section>

Approach sustainability

Using explicit markup reduces the computational overhead associated with reverse-checking aria-details to identify links to extended descriptions.

The combination of aria-details, role="extendeddescriptionref" and role="extendeddescription" provides bidirectional programmatic relationships even when extended descriptions reside in separate HTML files.

role="extendeddescription" would identify extended description containers across document boundaries.

Similar semantic identification challenges have been addressed successfully before, demonstrating the value of specific semantic roles for different types of linked supplementary content. For example, DPUB ARIA provides the doc-footnote and doc-noteref roles to identify notes and their references, enabling assistive technologies and text-to-speech engines to announce them appropriately and user agents to implement specialized navigation features.

Dependencies on non-stable features

Solving specific goals

Alternatives considered

Accessibility, Internationalization, Privacy, and Security Considerations

Stakeholder feedback

Next steps

  1. Socialize this explainer with APA, TAG, and the wider web platform community.
  2. Discuss ARIA role proposals and aria-details usage with the ARIA Working Group.
  3. Create sample HTML, EPUB, and web publications demonstrating the recommended patterns.
  4. Produce authoring guidance and lint rules to encourage correct usage and prevent misuse.
  5. Coordinate with reading system and assistive technology vendors on implementation and UX affordances.

References

Example resources

Acknowledgements

This explainer was written by Gautier Chomel, summarizing previous work by Charles LaPierre and Gregorio Pellegrino, and was discussed and reviewed by Matthew Atkinson, Matt Garrish, George Kerscher, Steve Noble, Wendy Reid, James Yanchack, and others. All work was conducted under the coordination of Avneesh Singh.