Accessibility Conformance Testing Rules Format

Editor’s Draft,

This version:
https://w3c.github.io/wcag-act/act-rules-format.html
Previous Versions:
Editors:
Wilco Fiers (Deque Systems)
Maureen Kraft (IBM Corp.)

Abstract

The Accessibility Conformance Testing (ACT) Rules Format 1.0 is a specification designed to harmonize how accessibility rules are described for automated test tools, and how test procedures are written for quality assurance testing. By writing these tests in a defined format, organizations are better able to document and share their method of testing. A defined format for writing tests is the first step in enabling a harmonized approach for accessibility conformance testing.

1. Introduction

There are currently many products available which aid their users in testing web content for conformance to accessibility standards such as WCAG 2.0. As the web develops and grows in both size and complexity, these tools are essential for managing the accessibility of resources available on the web.

This format is intended to provide a consistent interpretation of how to test for accessibility requirements so as to avoid conflicting results of accessibility tests. It is intended both for manual accessibility tests and for automated testing done through accessibility test tools (ATTs).

Describing how to test certain accessibility requirements will result in accessibility tests that are transparent with test results that are reproducible. The Accessibility Conformance Testing (ACT) Rules Format defines the requirements of these test descriptions, known as Accessibility Conformance Testing Rules (ACT Rules).

2. Scope

The ACT Rules Format scope is focused on the development of ACT Rules for the conformance testing of content created using web technologies, such as HTML, CSS, WAI-ARIA, SVG and more, including digital publishing. The ACT Rules Format, however, is designed to be technology agnostic, meaning it has no specific technology in mind. This also means that the ACT Rules Format could conceivably be used for other technologies.

Accessibility requirements such as Web Content Accessibility Guidelines, which is specifically designed for web content, can be tested using ACT Rules.

Other accessibility requirements that may be applicable to web technologies should also be testable with ACT Rules. For example, the User Agent Accessibility Guidelines 2.0 is applicable to web-based user agents, for which ACT Rules could be developed, but other technologies can also be used to develop user agents, which are not web-based. Because some accessibility requirements may be applicable to technologies other than web technologies, the ACT Rules Format may not be suitable for every part of these accessibility requirements.

3. ACT Rule Structure

3.1. Rule Outline

Each ACT Rule MUST provide the following items written in plain language:

3.2. Rule Description

Each ACT Rule MUST have a description that is in plain language and provides a brief explanation of what the rule does.

3.3. Accessibility Requirements

Explain the accessibility requirement being tested, such as the WCAG 2 success criterion and/or the technique the rule maps to, for example WCAG 2.0 Technique H67.

3.4. Limitations, Assumptions or Exceptions

List any limitations, assumptions or exceptions for the test, the test environment, the technologies being used or the subject being tested. For example, a rule for Success Criterion 1.4.1: Use of Color has to assume that CSS is used to make a link visually evident, typically by using the CSS background, border, color, font, or text-decoration properties.

3.5. Accessibility Support

Editor’s Note - The ACT Task Force acknowledges that this approach does not capture all nuances of accessibility support. We are looking for feedback on how to balance the need for organizations to solve issues with buggy or unsupported features in assistive technologies and user agents, with keeping the development cycle lean, and delivering future-proof products.

Determining if a web page is accessible depends partly on the assistive technologies and user agents that are used by the visitors of the page to access it. This is known as Accessibility Supported in WCAG 2.0. When testing a web page for accessibility, it is important to know which assistive technologies and user agents should be supported.

It is important that users of ACT Rules can determine which accessibility features are relied upon in a rule. With this information, the user can determine which rules do not provide results that are in line with the accessibility support baseline set for this particular test.

Developers of Accessibility Test Tools should use the Accessibility Support information to develop a default ruleset that is sensible for their user base. Additionally, they could allow users of the ATT to customize which rules are run, allowing fine grained control of the accessibility test.

A list of Accessibility Support Features is an optional part of an ACT Rule. Not every accessibility requirement involves assistive technologies, and so certain rules will have no use for this feature. For HTML, XML and similar markup documents, the accessibility features should be described using a three part CSS selector.

Example 1: Requires support for the alt attribute on an img element, existing within a link.

  a[href] img[alt]

Example 2: Requires support for role=menuitem on any type of element, existing within an element marked up as a menu

  *[role=menu] *[role=menuitem]
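As a rough illustration of how an ATT might use this information, the following Python sketch filters a ruleset against an accessibility support baseline. The `Rule` structure, the rule identifiers and the baseline set are hypothetical and are not defined by this specification; the sketch only assumes that each rule lists the features it relies on as selector strings.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    rule_id: str  # hypothetical identifier
    # Accessibility support features the rule relies on, written as selectors.
    support_features: list = field(default_factory=list)


def default_ruleset(rules, baseline):
    """Keep only rules whose relied-upon features are all in the baseline."""
    return [r for r in rules if all(f in baseline for f in r.support_features)]


rules = [
    Rule("img-alt-in-link", ["a[href] img[alt]"]),
    Rule("menuitem-in-menu", ["*[role=menu] *[role=menuitem]"]),
    Rule("page-title"),  # relies on no assistive technology feature
]

# Assumed baseline: only the alt attribute inside links is considered supported.
baseline = {"a[href] img[alt]"}

selected_ids = [r.rule_id for r in default_ruleset(rules, baseline)]
print(selected_ids)
```

A rule without any listed features is always kept, which matches the idea that not every accessibility requirement involves assistive technologies.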

4. Test Subject Types (Input Data)

Web pages, including publications and applications, go through many different stages before they are rendered to the end user. For example, PHP may be used to put various pieces of content into a template. That template is then sent as an HTML text document to a web browser. The browser in turn parses it and turns it into a DOM tree before rendering it to the screen. Accessibility tests could be run at each of these stages. It is therefore important to specify the test subject type that an ACT Rule expects.

The following test subject types are common in accessibility testing:

Other types of test subjects may be possible. In those cases the ACT Rule MUST include a detailed description of the test subject type, or a reference to that description.

4.1. HTTP Response Testing

Testing the files as they are served to the web browser (or other user agent) has its limitations, because the files may be manipulated in different ways through presentation and scripting. It is, however, an excellent place for parser testing.

4.2. DOM Tree Testing

After the web browser (or other user agent) has parsed the files and executed scripts to bring the page into a specific state (be it the initial state or any other), tests can be run against the DOM Tree. The DOM Tree can be tested for things like correct parent-child relationships, use of required attributes or properties and more.

4.3. Rendered Page Testing

Testing the rendered page is the next level up from DOM Tree Testing. In addition to building the DOM Tree, the browser styles elements in the DOM tree and positions them. This enables a rule to determine if an element is visible, which is critical for many tests. Additionally, testing things like the color contrast becomes possible at this level.

4.4. Template Testing

A template is a document that has "open" fields that are filled with pieces of content or other templates. E.g. an HTML template contains a head with a variable title, a predefined navigation, and a variable content area.

<img> tags with a variable src, MUST NOT have a static alt value.

4.5. Script Testing

A composition is a class or component that extends other native elements or other compositions, to build a higher level component. E.g. a login form component consists of a form, a few fields, and a label.

Component properties starting with aria- MUST exist in the list of WAI-ARIA attributes.

5. ACT Test Procedure

5.1. Selector

A selector is a filter procedure that is used on the input data to be evaluated against the test procedure. For example, finding specific nodes in the DOM tree, or finding tags that are incorrectly closed in an HTML document.

Selector syntax depends on the test subject type. When a formal syntax can be used, that (part of the) selector should use that syntax. When a formal selector syntax cannot be used, that (part of the) selector MAY be described unambiguously in plain language.

The two can be used together, as in the following example, which uses CSS selector syntax to locate elements, combined with a plain language description to further filter the nodes.

A rule for HTML img elements has the following selector:
  1. Select elements matching the following CSS Selector: img:not([alt=""])

  2. Exclude any elements that have visibility:hidden or display: none
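The two selector steps above could be sketched in Python as follows. This is a simplified illustration that works on an HTML text snippet with the standard library parser; it only looks at inline style attributes on the img element itself, whereas a real implementation would need computed styles and inherited visibility. The class name and the sample snippet are hypothetical.

```python
from html.parser import HTMLParser


class ImgSelector(HTMLParser):
    """Collects img elements matching img:not([alt=""]) that are not hidden inline."""

    def __init__(self):
        super().__init__()
        self.selected = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # Step 1: img:not([alt=""]) drops images whose alt is empty.
        # A valueless alt attribute is treated as empty as well.
        if "alt" in attributes and not attributes["alt"]:
            return
        # Step 2: exclude hidden images (inline style only, a simplification).
        style = (attributes.get("style") or "").replace(" ", "").lower()
        if "visibility:hidden" in style or "display:none" in style:
            return
        self.selected.append(attributes.get("src"))


snippet = (
    '<img src="a.png" alt="Logo">'
    '<img src="b.png" alt="">'
    '<img src="c.png" alt="Chart" style="display: none">'
    '<img src="d.png">'
)

parser = ImgSelector()
parser.feed(snippet)
parser.close()
selected = parser.selected
print(selected)
```

Here the image with an empty alt and the image hidden through its inline style are filtered out, while the image without any alt attribute remains selected for the test cases.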

5.2. Test Cases

Each rule contains one or more test cases written in plain language. Each test case MUST provide the following:

Test case results are combined with other test case results to provide a cumulative outcome, pass or fail, of the rule. When a rule has multiple test cases, the results of all test cases are combined to give a single result for each selected item. The results are cumulative, meaning that as long as one test case passes, the rule has passed.

A rule for HTML img elements has two test cases:
  1. Check if there is a text alternative

  2. Check if the image is marked as decorative.

If either one of these passes, the rule has passed for that element. Only if both fail does the element fail the rule.
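The cumulative behaviour described above can be sketched in Python as a simple "any test case passes" check. The two test case functions are hypothetical stand-ins; in particular, treating role="presentation" or role="none" as the decorative marker is an assumption made for this illustration.

```python
def has_text_alternative(img):
    """Test case 1: the image has a non-empty alt value."""
    return bool((img.get("alt") or "").strip())


def is_marked_decorative(img):
    """Test case 2: the image is marked as decorative (assumed: via its role)."""
    return img.get("role") in ("presentation", "none")


def rule_outcome(item, test_cases):
    """A selected item passes the rule as soon as one test case passes."""
    return "Passed" if any(case(item) for case in test_cases) else "Failed"


test_cases = [has_text_alternative, is_marked_decorative]

outcomes = [
    rule_outcome({"alt": "Company logo"}, test_cases),
    rule_outcome({"role": "presentation"}, test_cases),
    rule_outcome({}, test_cases),
]
print(outcomes)
```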

6. ACT Data Format (Output Data)

With ACT Rules, it becomes important that data coming from different sources can be compared. Only by having a shared vocabulary can accessibility data that is produced by different auditors be compared and, where necessary, aggregated. Every ACT Rule MUST therefore require that the output is expressed in a format that has all of the features described in the ACT Data Format.

Rules are tested in two steps. Firstly, the selector is applied to the web page or other test subject. This results in a list of selected items (elements, tags or other "components") to test. Following this, each selected item will be taken through the test procedure of the rule. This will give the outcome for each selected item. For contextual information, the output data must also include the test subject and the rule identifier.

This will mean that every time a rule is executed on a page, it will return a set with zero or more results, each of which MUST have at least the following properties:

Output data using EARL and JSON-LD
{
  "@context": "https://raw.githubusercontent.com/w3c/wcag-act/master/earl-act.json",
  "@type": "Assertion",
  "subject": "https://example.org/",
  "test": "auto-wcag:rules/SC1-1-1-css-image.html",
  "outcome": "Failed",
  "pointer": "html > body > h1:first-child"
}
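The two-step execution and the resulting output records could be sketched in Python as follows. The function name, the rule identifier "example:img-alt", the pointers and the in-memory representation of the page are all hypothetical; the record keys simply mirror the example output above.

```python
import json


def run_rule(rule_id, subject, items, selector, procedure):
    """Apply the selector, then run the procedure on every selected item."""
    results = []
    for pointer, item in items:
        if not selector(item):
            continue  # step 1: item is not selected, so it yields no result
        results.append({
            "@type": "Assertion",
            "subject": subject,
            "test": rule_id,
            "outcome": procedure(item),  # step 2: outcome per selected item
            "pointer": pointer,
        })
    return results


# Hypothetical page content as (pointer, element) pairs.
items = [
    ("html > body > img:nth-child(1)", {"tag": "img", "alt": "Logo"}),
    ("html > body > img:nth-child(2)", {"tag": "img"}),
    ("html > body > h1:nth-child(3)", {"tag": "h1"}),
]

results = run_rule(
    "example:img-alt",
    "https://example.org/",
    items,
    selector=lambda el: el["tag"] == "img",
    procedure=lambda el: "Passed" if el.get("alt") else "Failed",
)
print(json.dumps(results, indent=2))
```

A page on which the selector matches nothing yields an empty list, which corresponds to the "zero or more results" mentioned above.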

6.1. Selected Item

When representing the selected item in the output data, it is often impractical or impossible to serialize the selected item as a whole. Instead of this, a pointer can be used to indicate where the selected item exists within the web page or other test subject. There are a variety of pointer methods available, such as those defined in Pointer Methods in RDF 1.0.

The pointer method used in the output data of an ACT Rule MUST include the pointer method used in Implementation Test Cases.

6.2. Outcome

The procedure of a rule MUST always return one of the following outcomes:

6.3. Ensure Comparable Results

In addition to the Selected Item and the Outcome, ACT Rule output data MUST include the following contextual information:

6.4. Test Subject

When a single URL can be used to reference the web page, or other test subject, this URL MUST be used. In scenarios where more complex actions are required to obtain the test subject (in the state that it is to be tested), it is left to ATT developers to determine which method is best used to express the test subject.

7. Rule Quality Assurance

7.1. Implementation Validation

Implementation tests are tests for accessibility test tools. These tests enable the developers and users of ATTs to validate the implementation of an ACT Rule. Each rule MUST have implementation tests for the selector, as well as for each test case in the rule.

An implementation test consists of two parts: a piece of input data and an expected result. When applying the test, the piece of input data, for instance an HTML code snippet, is evaluated by following the rule’s test procedure. The result is then compared to the expected result of the test. The expected result consists of a list of pointers and the expected outcome (Passed, Failed, Inapplicable) of the evaluation.
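A minimal Python sketch of such an implementation test is shown below. It assumes the input data has already been reduced to (pointer, tag, attributes) triples and that the rule implementation returns a mapping from pointer to outcome; both conventions, as well as the function names, are hypothetical.

```python
def evaluate_img_alt(elements):
    """Toy rule implementation: an img passes when it has a non-empty alt."""
    return {
        pointer: ("Passed" if attrs.get("alt") else "Failed")
        for pointer, tag, attrs in elements
        if tag == "img"
    }


def check_implementation(evaluate, input_data, expected):
    """Return pointers whose actual outcome differs from the expected one."""
    actual = evaluate(input_data)
    mismatches = {}
    for pointer in sorted(set(expected) | set(actual)):
        if expected.get(pointer) != actual.get(pointer):
            mismatches[pointer] = (expected.get(pointer), actual.get(pointer))
    return mismatches


# Part 1 of the implementation test: a piece of input data.
input_data = [
    ("body > img:nth-child(1)", "img", {"alt": "Logo"}),
    ("body > img:nth-child(2)", "img", {}),
    ("body > p:nth-child(3)", "p", {}),
]

# Part 2: the expected result, as pointers with their expected outcomes.
expected = {
    "body > img:nth-child(1)": "Passed",
    "body > img:nth-child(2)": "Failed",
}

mismatches = check_implementation(evaluate_img_alt, input_data, expected)
print(mismatches)
```

An empty mapping means the implementation agrees with the expected result for this test.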

7.2. Accuracy Benchmarking

The web is ever changing, and technologies are used in such diverse and creative ways that it is impossible to predict in advance all the ways that accessibility issues can occur and all the ways they can be solved. When writing ACT Rules, it is almost inevitable that exceptions will be overlooked during the design of a rule, or that new technologies will emerge that introduce new exceptions.

This makes it important to be able to regularly test if the rule has the accuracy that is expected of it. This can be done by benchmark testing. In benchmark testing, the accuracy of a rule is measured by comparing its results to those obtained through accessibility expert testing.

The accuracy of a rule is the average between the false positives and false negatives, which are in turn calculated as follows:

There are several ways this can be done. For instance, accessibility test tools can implement a feature which lets users indicate that a result is in error, or pages for which accessibility results are known can be tested using an ATT and the results compared. To compare results from ACT Rules to those of expert evaluations, data aggregation may be necessary.

7.3. Rule Aggregation

As described in section §6 ACT Data Format (Output Data), a rule will return a list of results, each of which contains 1) an outcome (Passed, Failed or Cannot tell), 2) the selected item, 3) the Rule ID and 4) the test subject. Data expressed this way has a great deal of detail, as it gives multiple pass / fail results for each rule.

Most expert evaluations do not report results at this level of detail. Often reports are limited to giving a single outcome (Passed, Failed, Inapplicable) per page, for each success criterion (or other accessibility requirement). To compare the data, results from rules should be combined, so that they are at the same level.

When all rules pass, that does not mean that all accessibility requirements are met. Only if the rules can test 100% of what should be tested can this claim be made. Otherwise the outcome for a criterion is inconclusive.

Example: An expert evaluates a success criterion to fail on a specific page. When testing that page using ACT Rules, there are two rules that map to this criterion. The first rule returns no results. The second rule finds 2 elements that pass, and a 3rd element that fails.

In this example, the first rule is inapplicable (0 results), and the second rule has failed (1 fail, 2 pass). Combining an inapplicable rule with a failed rule means the success criterion has failed.

See Appendix 1: Aggregation examples, using JSON-LD and EARL for how this could be expressed.
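One possible way to express this aggregation is sketched below in Python. The function names and the `full_coverage` flag are hypothetical, and mapping an inconclusive result to the outcome "Cannot tell" is an assumption made for this illustration rather than something this section prescribes.

```python
def rule_outcome(results):
    """Collapse the per-item results of one rule into a single outcome."""
    outcomes = [r["outcome"] for r in results]
    if not outcomes:
        return "Inapplicable"  # the selector matched nothing
    if "Failed" in outcomes:
        return "Failed"
    if "Cannot tell" in outcomes:
        return "Cannot tell"
    return "Passed"


def criterion_outcome(rule_outcomes, full_coverage=False):
    """Combine rule outcomes into one outcome for a success criterion."""
    if "Failed" in rule_outcomes:
        return "Failed"
    if "Cannot tell" in rule_outcomes:
        return "Cannot tell"
    if "Passed" in rule_outcomes:
        # Passing rules only prove conformance when they cover everything
        # that should be tested; otherwise the result is inconclusive.
        return "Passed" if full_coverage else "Cannot tell"
    return "Inapplicable"


first_rule = []  # no results
second_rule = [
    {"outcome": "Passed"},
    {"outcome": "Passed"},
    {"outcome": "Failed"},
]

per_rule = [rule_outcome(first_rule), rule_outcome(second_rule)]
combined = criterion_outcome(per_rule)
print(per_rule, combined)
```

With the data from the example, the first rule is reported as Inapplicable, the second as Failed, and the combined outcome for the criterion is Failed, in agreement with the expert evaluation.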

7.4. Update Management

7.4.1. Version Numbers

Each ACT Rule MUST have its own version number. The version number has to follow the semantic versioning schema, using the X.Y.Z format in the following way:

X / Major updates:

The major version number must be increased if the change can lead to new failure results.

Y / Minor updates:

The minor version number must be increased if the test logic has been updated, which could lead to a different result.

Z / Patch updates:

The patch version number must be increased if the change does not affect the result of a rule. This includes editorial changes, new issues on the issues list, an updated description, etc.
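The versioning scheme above can be illustrated with a small Python helper. The function name and the change labels are hypothetical; resetting the lower-order numbers to zero follows the usual semantic versioning convention.

```python
def bump_version(version, change):
    """Return the next X.Y.Z version for a 'major', 'minor' or 'patch' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":  # the change can lead to new failure results
        return f"{major + 1}.0.0"
    if change == "minor":  # updated test logic that could change a result
        return f"{major}.{minor + 1}.0"
    if change == "patch":  # editorial change that does not affect results
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


next_versions = [
    bump_version("1.4.2", "major"),
    bump_version("1.4.2", "minor"),
    bump_version("1.4.2", "patch"),
]
print(next_versions)
```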

7.4.2. Change Log

All major and minor changes to an ACT Rule MUST be recorded in a change log that is part of the updated rule. The change log MUST at least include the changes since the last minor update, as well as a reference to the previous version.

7.4.3. Issues List

An ACT Rule MAY include an issues list. This list must be used to record cases in which the ACT Rule might return a failure where it should have returned a pass or vice versa. There may be several reasons why this might occur, including:

The issues list serves two purposes. For users of ACT Rules, it gives insight into why an inaccurate result might have occurred, as well as providing confidence in the result of that rule. For the designer of the rule, the issues list is also useful to plan future updates to the rule. A new version of the rule might see issues resolved and the item moved to the change log.

Appendix 1: Aggregation examples, using JSON-LD and EARL

Example:

{
  "@context": "https://raw.githubusercontent.com/w3c/wcag-act/master/earl-act.json",
  "@type": "Assertion",
  "subject": "https://example.org/",
  "test": "auto-wcag:SC1-1-1-css-image.html",
  "outcome": "Failed",
  "source": [{
    "outcome": "Failed",
    "pointer": "html > body > h1:first-child"
  }, {
    "outcome": "Passed",
    "pointer": "html > body > h1:nth-child(2)"
  }]
}

Example: Aggregate rules to a WCAG success criterion

{
  "@context": "https://raw.githubusercontent.com/w3c/wcag-act/master/earl-act.json",
  "@type": "Assertion",
  "subject": "https://example.org/",
  "test": {
    "@id": "wcag20:#text-equiv-all",
    "title": "1.1.1 Non-text Content"
  },
  "outcome": "Failed",
  "source": [{
    "outcome": "Failed",
    "test": "auto-wcag:SC1-1-1-css-image.html",
    "pointer": "html > body > h1:first-child"
  }, {
    "outcome": "Passed",
    "test": "auto-wcag:SC1-1-1-longdesc.html",
    "pointer": "html > body > img:nth-child(2)"
  }]
}

Example: Aggregate a list of results to a result for the website

{
  "@context": "https://raw.githubusercontent.com/w3c/wcag-act/master/earl-act.json",
  "@type": "Assertion",
  "subject": {
    "@type": ["WebSite", "TestSubject"],
    "@value": "https://example.org/"
  },
  "test": "http://www.w3.org/WAI/WCAG2A-Conformance",
  "outcome": "Failed",
  "source": [{
    "outcome": "Failed",
    "test": "wcag20:text-equiv-all",
    "source": []
  }, {
    "outcome": "Passed",
    "test": "wcag20:media-equiv-av-only-alt",
    "source": []
  }, {
    "outcome": "Inapplicable",
    "test": "wcag20:media-equiv-captions",
    "source": []
  }]
}

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119