Language matrix
International typography on the Web

Languages and writing systems have varying typographic conventions, and users and content authors expect local conventions to be supported on the Web and in eBooks. This page provides a ‘heat-map’ indicating the degree to which missing support for local features impacts use of the Web. Viewed as a whole, the table gives an idea of the amount of work still to be done. Dark green means there's no problem; red means that, currently, issues with this feature make the language very difficult or impossible to use.

Across the top are feature categories; down the side are languages, listed by script.

For the languages currently listed, we see the following problem areas. This page is updated as information becomes available.

languages need work for advanced publishing

languages need work for basic features

languages don't work well on the Web

? of cells still need investigation.

The matrix

Where a language name has a link, you can click on any cell in that row for more detailed information.

script language summary

Color key

3 All needs covered (i.e. OK), or not applicable
2 Basic needs covered, but work needed for advanced publishing
1 Can create interoperable web pages, but work still needed for basic features
0 Something prevents interoperable or effective use of the language in webpages

An asterisk next to a language name indicates tentative coloring, based on a set of initial expectations; the colors may well change after further investigation.

A language name followed by † indicates that experts are currently investigating issues and developing the gap analysis. In this case, the coloring is expected to be more reliable, though it may still change.

Whatever the current status of a language, a cell that is currently colored green may change to another color later if a previously overlooked issue is discovered.

Detailed information about the chart

The colour to the left of the language is the lowest score of any of the cells to its right, and represents the level for the language as a whole. The column immediately to the right of the language name graphically summarises the available data for that language (by reordering the squares in the main part of the table).
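The two derived columns described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the page's actual implementation: the function names are hypothetical, a question-mark cell is represented as None and left out of the minimum, and the summary is sorted lowest score first with unknowns last (the page only says the squares are reordered).

```python
from typing import Optional

# Scores follow the colour key: 3 = OK or not applicable, 2 = work needed for
# advanced publishing, 1 = work needed for basic features, 0 = blocked.
# None stands for a '?' cell (an assumption made for this sketch).
Score = Optional[int]


def language_level(cells: list[Score]) -> Score:
    """Lowest known score in the row; None if every cell is still unknown."""
    known = [c for c in cells if c is not None]
    return min(known) if known else None


def summary_strip(cells: list[Score]) -> list[Score]:
    """Reorder the row's cells so equal scores sit together (assumed order:
    lowest first, unknown cells last)."""
    known = sorted(c for c in cells if c is not None)
    unknown = [c for c in cells if c is None]
    return known + unknown


row = [3, 2, None, 3, 1, 3]  # hypothetical row, not real data
print(language_level(row))   # 1
print(summary_strip(row))    # [1, 2, 3, 3, 3, None]
```

Taking the minimum means a single low-scoring cell sets the level for the whole language, which matches the description of the colour to the left of the language name.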

A question mark indicates that we don't yet have a reasonable degree of confidence about the level of support for this aspect of the language. There may be no issue, or there may be something that needs fixing of which we are not yet aware.

The chart is arranged by script. Note that some languages use more than one script, and so appear in more than one place in the chart. Within each script, languages are listed from the highest to the lowest number of speakers.
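As a rough illustration of that arrangement, the following Python sketch groups entries by script and orders each group by speaker count. Every name and figure in it is a placeholder invented for the example, and the function name is hypothetical; it does not reflect real data or the page's own code.

```python
from collections import defaultdict

# (language, script, speakers) entries; all names and figures are placeholders.
entries = [
    ("Language A", "Script X", 50),
    ("Language B", "Script X", 200),
    ("Language B", "Script Y", 200),  # uses two scripts, so it is listed twice
    ("Language C", "Script Y", 10),
]


def group_by_script(rows):
    """Map each script to its languages, most speakers first."""
    groups = defaultdict(list)
    for language, script, speakers in rows:
        groups[script].append((language, speakers))
    return {
        script: [name for name, _ in sorted(langs, key=lambda item: item[1], reverse=True)]
        for script, langs in groups.items()
    }


chart = group_by_script(entries)
print(chart["Script X"])  # ['Language B', 'Language A']
print(chart["Script Y"])  # ['Language B', 'Language C']
```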

The list currently targets some of the more common languages, or scripts that have important implications for general support. It is expected that the list will grow as data becomes available, and we welcome contributions for other languages from local experts. Some less common languages are already represented, sometimes because they present features that need to be taken into account while designing the overall technological architecture, other times simply because we have information about them and including them helps the local community ensure the survival of their language on the Web.

Each cell is the intersection of a language and an aspect of language enablement. The colours indicate what needs work, and whether the work is needed to bring the language to the next level. Dark green indicates that no work is needed, either because the feature is fully supported or because it is not applicable to the language.

An asterisk next to a language name indicates that the scores for that language are tentative, i.e. our best guess pending thorough analysis and validation by experts representing the language user community.

When a language name has a link, you can follow that link to find a draft report that gives details about the work needed. If the language name has an asterisk, that report also needs validation and prioritization by experts.

Click the feature categories along the top of the chart to get an idea of what kind of typographic features are included in that category. You are taken to the document Language enablement Index. From there you can follow links to current requirements, requests for information, spec wording, and issues related to that group of features.

The column with the title Page Layout and the two columns to its right all relate to rendering paged media. In the general case, various paged media features are still not widely available on the Web; however, these columns are particularly focused on whether there are language-specific adaptations of the general approach which need attention.

What qualifies a cell for a score of 3 (OK)? A cell can be scored as OK if the feature in question is specified in an appropriate specification, and is supported by user agents. For the latter, a specification that is in CR or later and has two implementations in 'major' browsers will count. This means that the feature may not be supported in all browsers yet. (At some point in the future we may try to distinguish, visually, whether support is available in a specification but still pending in browsers.)
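The implementation half of this rule can be expressed as a small predicate. The sketch below is an assumption-laden illustration: the function name is hypothetical, the stage abbreviations are the usual W3C ones (WD, CR, PR, REC), "two implementations" is read as "at least two", and deciding which browsers count as 'major' is left to the caller because the text does not define it.

```python
# Maturity stages in increasing order (usual W3C abbreviations).
MATURITY_ORDER = ["WD", "CR", "PR", "REC"]


def qualifies_as_ok(maturity: str, major_implementations: int) -> bool:
    """True if the spec is at CR or later and has at least two
    implementations in major browsers (count supplied by the caller)."""
    if maturity not in MATURITY_ORDER:
        raise ValueError(f"unknown maturity stage: {maturity!r}")
    at_cr_or_later = MATURITY_ORDER.index(maturity) >= MATURITY_ORDER.index("CR")
    return at_cr_or_later and major_implementations >= 2


print(qualifies_as_ok("CR", 2))   # True
print(qualifies_as_ok("WD", 3))   # False
print(qualifies_as_ok("REC", 1))  # False
```

Note that a result of True only says the feature meets the threshold for a score of 3; as the paragraph above points out, it may still be unsupported in some browsers.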

How to help

We are actively looking for people and organizations that can help us complete the missing information in the matrix, and compile requirements for spec and browser developers. The following links may be helpful.