Language matrix
International typography on the Web

Languages and writing systems have varying typographic conventions, and users and content authors expect local conventions to be supported on the Web and in eBooks. This page provides a ‘heat-map’ indicating the degree to which missing support for local features impacts use of the Web. Viewed as a whole, the table gives an idea of the amount of work still to be done. Dark green means there's no problem; red means that, currently, issues with this feature make the language very difficult or impossible to use.

Across the top are feature categories; down the side are languages, grouped by writing system.

For the languages currently listed, we see the following problem areas. This page is updated as information becomes available.

languages need work for advanced publishing

languages need work for basic features

languages don't work well on the Web

? of cells still need investigation.

The matrix

Where a language name has a link, you can click on any cell in that row for more detailed information.

script language summary
Vertical text
Bidirectional text
Characters
Fonts
Font styles
Glyph control
Cursive text
Transforms
Boundaries
Text decoration
Quotations
Inline annotation
Numbers, dates ...
Other inline
Line breaking
Hyphenation
Align, justify ...
Letter/word space
Lists, counters
Styling initials
Baselines
Other paragraph
Page layout
Notes, footnotes
Page no.s, heads ...
Forms, interaction
Other pagination

Color key

3 All needs covered (i.e. OK), or not applicable
2 Basic needs covered, but work needed for advanced publishing
1 Can create interoperable web pages, but work still needed for basic features
0 Something prevents interoperable or effective use of the language in webpages

* next to a language name indicates a tentative score, pending validation by experts

Detailed information about the chart

The colour to the left of the language is the lowest score of any of the cells to its right, and represents the level for the language as a whole. The column immediately to the right of the language name graphically summarises the available data for that language (by reordering the squares in the main part of the table).
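As a minimal sketch of the two derived columns described above, the following Python computes a language's overall level as the lowest score in its row, and a reordered copy of the row for the summary column. The data layout is hypothetical: the `Row` type, the function names, and the use of `None` for a question-mark cell are assumptions made for illustration. So are the choices to ignore unknown cells when taking the minimum and to order the summary worst-first; the page itself only says that the squares are reordered.

```python
from typing import Optional

# Hypothetical layout: one row of the matrix, keyed by feature category.
# Scores follow the colour key (3 = OK ... 0 = blocked); None stands for a "?" cell.
Row = dict[str, Optional[int]]


def language_level(row: Row) -> Optional[int]:
    """Return the lowest known score in the row, or None if no cell is scored yet."""
    known = [score for score in row.values() if score is not None]
    return min(known) if known else None


def summary_squares(row: Row) -> list[Optional[int]]:
    """Reorder the row's cells so that equal scores sit together.

    Assumed ordering for this sketch: lowest score first, unknown cells last.
    """
    known = sorted(score for score in row.values() if score is not None)
    unknown = [None] * sum(1 for score in row.values() if score is None)
    return known + unknown


# Invented scores, for illustration only; they do not describe any real language.
example_row: Row = {
    "Vertical text": 3,
    "Line breaking": 2,
    "Hyphenation": 1,
    "Baselines": None,
    "Page layout": 3,
}

print(language_level(example_row))   # 1
print(summary_squares(example_row))  # [1, 2, 3, 3, None]
```

Taking the minimum means a single low-scoring feature category determines the colour shown for the whole language, which matches the description of that colour as the level for the language as a whole.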

A question mark indicates that we don't yet have a reasonable degree of confidence about level of support for this aspect of the language. There may not be an issue, or there may be something that needs fixing which we are as yet unaware of.

The chart is arranged by script. Note that some languages use more than one script, and so appear in more than one place in the chart. Within each script, languages are listed from the highest to the lowest number of speakers.

The list currently targets some of the more common languages, or scripts that have important implications for general support. It is expected that the list will grow as data becomes available, and we welcome contributions for other languages from local experts. Some less common languages are already represented, sometimes because they present features that need to be taken into account while designing the overall technological architecture, other times simply because we have information about them and including them helps the local community ensure the survival of their language on the Web.

Each cell is the intersection of a language and a category of typographic features. The colours indicate what needs work, and whether the work is needed to bring the language to the next level. The dark green colour indicates that no work is needed, either because the feature is fully supported or because it is not applicable to the language.

An asterisk next to a language name indicates that the scores for that language are tentative, i.e. our best guess pending thorough analysis and validation by experts representing the language user community.

When a language name has a link, you can follow that link to find a draft report that gives details about the work needed. If the language name has an asterisk, that report also needs validation and prioritization by experts.

Click the feature categories along the top of the chart to get an idea of what kind of typographic features are included in each category. Doing so takes you to the document International Text Layout and Typography Index. From there you can follow links to current requirements, requests for information, spec wording, and issues related to that group of features.

What qualifies a cell for a score of 3 (OK)? A cell can be scored as OK if the feature in question is specified in an appropriate specification, and is supported by user agents. For the latter, a specification that is in CR or later and has two implementations in 'major' browsers will count. This means that the feature may not be supported in all browsers yet. (At some point in the future we may try to distinguish, visually, whether support is available in a specification but still pending in browsers.)
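The rule above can be sketched as a small check. This is an illustration only, not the procedure actually used to score the matrix: the function name, the stage abbreviations, and the simplified linear ordering of maturity stages are assumptions, and "two implementations" is read here as "at least two".

```python
# Simplified ordering of specification maturity stages, earliest first.
# Assumption for this sketch: WD < CR < PR < REC, ignoring intermediate draft types.
STAGES = ["WD", "CR", "PR", "REC"]


def qualifies_as_ok(spec_stage: str, major_browser_implementations: int) -> bool:
    """Return True if a feature meets the stated bar for a score of 3.

    The bar: the specification is at CR or later, and the feature has
    (at least) two implementations in 'major' browsers.
    """
    at_cr_or_later = STAGES.index(spec_stage) >= STAGES.index("CR")
    return at_cr_or_later and major_browser_implementations >= 2


print(qualifies_as_ok("CR", 2))   # True
print(qualifies_as_ok("WD", 3))   # False: the spec has not reached CR
print(qualifies_as_ok("REC", 1))  # False: only one major implementation
```

Note that a feature can pass this check while still being unsupported in some browsers, which is the caveat the paragraph above points out.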

How to help

We are actively looking for people and organizations that can help us complete the missing information in the matrix, and compile requirements for spec and browser developers. The following links may be helpful.