International typography on the Web
Language matrix

Languages and writing systems have varying typographic conventions, and users and content authors expect local conventions to be supported on the Web and in eBooks. This page provides a ‘heat-map’ indicating the degree to which missing support for local features impacts use of the Web. Viewed as a whole, the table gives an idea of the amount of work still to be done. Dark green means there's no problem; red means that, currently, issues with this feature make the language very difficult or impossible to use.

Across the top are feature categories; down the side are languages, listed by writing-system.

For the languages currently listed, we see the following problem areas. This page is updated as information becomes available.

languages need work for advanced publishing

languages need work for basic features

languages don't work well on the Web

? of cells still need investigation.
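As a rough illustration of how summary figures like these could be derived, the sketch below tallies languages by level and computes the share of cells still marked with a question mark. The function name `summarize`, the result keys, and the sample data are all hypothetical, and the language levels are assumed to be already known. This is not the page's actual script.

```python
from collections import Counter
from typing import Optional


def summarize(levels: dict[str, int], cells: list[Optional[int]]) -> dict[str, float]:
    """Tally languages per level and the share of cells marked '?'.

    `levels` maps a language name to its overall level (0-3).
    `cells` is a flat list of cell scores; None stands in for a '?' cell.
    """
    per_level = Counter(levels.values())
    unknown = sum(1 for c in cells if c is None)
    return {
        "advanced_publishing": per_level[2],  # level 2 languages
        "basic_features": per_level[1],       # level 1 languages
        "not_working": per_level[0],          # level 0 languages
        "unknown_share": unknown / len(cells) if cells else 0.0,
    }


# Hypothetical sample values, not real matrix data.
sample_levels = {"lang-a": 3, "lang-b": 2, "lang-c": 2, "lang-d": 0}
sample_cells = [3, 2, None, 3, 0, None, 2, 3]
summary = summarize(sample_levels, sample_cells)
```

With the sample above, two languages fall in the "advanced publishing" bucket, one in "don't work well", and a quarter of the cells still need investigation.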

The matrix

Where a language name has a link, you can click on any cell in that row for more detailed information.

Columns: language, Level, Encoding, Fonts, Bidi layout

Color key

3 All needs covered (i.e. OK), or not applicable
2 Basic needs covered, but work needed for advanced publishing
1 Can create interoperable web pages, but work still needed for basic features
0 Something prevents interoperable or effective use of the language in webpages

* next to a language name indicates a tentative score, pending validation by experts

Detailed information about the chart

A question mark indicates that we don't yet have a reasonable degree of confidence about the level of support for this aspect of the language. There may not be an issue, or there may be something that needs fixing that we are not yet aware of.

The chart is arranged by script. Some languages use more than one script, and so appear in more than one place in the chart. Within each script, languages are listed in descending order of number of speakers.

The list currently targets some of the more common languages, or scripts that have important implications for general support. It is expected that the list will grow as data becomes available, and we welcome contributions for other languages from local experts. Some less common languages are already represented, sometimes because they present features that need to be taken into account while designing the overall technological architecture, other times simply because we have information about them and including them helps the local community ensure the survival of their language on the Web.

Each cell is the intersection of a language and a category of typographic features. The colours indicate what needs work, and whether the work is needed to bring the language to the next level. Dark green indicates that no work is needed, either because the feature is fully supported for that language or because it is not applicable.

An asterisk next to a language name indicates that the scores for that language are tentative, i.e. our best guess pending thorough analysis and validation by experts representing the language user community.

When a language name has a link, you can follow that link to find a draft report that gives details about the work needed. If the language name has an asterisk, that report also needs validation and prioritization by experts.

Click the feature categories along the top of the chart to get an idea of what kind of typographic features are included in that category. You are taken to the document International Text Layout and Typography Index. From there you can follow links to current requirements, requests for information, spec wording, and issues related to that group of features.

What qualifies a cell for a score of 3 (OK)? A cell can be scored as OK if the feature in question is specified in an appropriate specification, and is supported by user agents. For the latter, a specification that is at Candidate Recommendation (CR) or later and has two implementations in 'major' browsers counts as supported. This means that the feature may not be supported in all browsers yet. (At some point in the future we may try to distinguish, visually, whether support is available in a specification but still pending in browsers.)
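The criterion above can be sketched as a simple predicate. The `Stage` enum is a simplified, assumed ordering of W3C maturity stages, and `Feature`, `spec_stage`, `major_browser_impls` and `qualifies_as_ok` are hypothetical names introduced only for illustration. What counts as an 'appropriate' specification or a 'major' browser is a judgement this sketch leaves out.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    # Simplified, assumed ordering of specification maturity.
    WD = 1   # Working Draft
    CR = 2   # Candidate Recommendation
    PR = 3   # Proposed Recommendation
    REC = 4  # Recommendation


@dataclass
class Feature:
    spec_stage: Optional[Stage]  # None if the feature is not specified anywhere
    major_browser_impls: int     # number of 'major' browsers implementing it


def qualifies_as_ok(feature: Feature) -> bool:
    """True if the feature is specified at CR or later and has at least two implementations."""
    return (
        feature.spec_stage is not None
        and feature.spec_stage >= Stage.CR
        and feature.major_browser_impls >= 2
    )
```

Under these assumptions, a feature in a CR-stage specification with two major-browser implementations qualifies, while the same feature in a Working Draft, or with only one implementation, does not.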

The colour of the third column from the left is the lowest score of any of the cells to its right, and represents the level for the language as a whole. The numeric score in that column is a non-scientific way of hinting at the amount of work to be done to get the language level to OK. (We may change the algorithm used for that at some future point.)
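A minimal sketch of the "lowest score wins" rule for the language level follows. The name `language_level` is hypothetical, and skipping question-mark cells (represented here as `None`) is an assumption, since the text does not say how unknown cells affect the level. The separate numeric score is not modelled, because its algorithm is not described.

```python
from typing import Optional


def language_level(cells: list[Optional[int]]) -> Optional[int]:
    """Return the lowest known score (0-3) across a language's cells.

    None entries stand in for '?' cells and are skipped (an assumption).
    Returns None when no cell has a known score.
    """
    known = [score for score in cells if score is not None]
    return min(known) if known else None


# Hypothetical row: one category at level 2 pulls the whole language down to 2.
example_level = language_level([3, 2, None, 3])
```

A single low-scoring category therefore determines the colour for the whole language, however many other categories are fully supported.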

How to help

We are actively looking for people and organizations that can help us complete the missing information in the matrix, and compile requirements for spec and browser developers. The following links may be helpful.