This document describes and prioritises gaps for the support of languages using the Khmer script on the Web and in eBooks. In particular, it is concerned with text layout. It checks that needed features are supported in W3C specifications, in particular HTML and CSS and those relating to digital publications. It also checks whether the features have been implemented in browsers and ereaders. This document complements the document Khmer Layout Requirements, which describes the requirements for areas where gaps appear. It is linked to from the language matrix that tracks Web support for many languages. This version is a preliminary analysis.

The editor's draft of this document is being developed by the Southeast Asian Layout Task Force, part of the W3C Internationalization Interest Group. It is published by the Internationalization Working Group. The end target for this document is a Working Group Note.

Introduction

The W3C needs to make sure that the requirements of scripts and languages around the world are built into technologies such as HTML, CSS, and SVG, so that Web pages and eBooks can look and behave as people expect around the world.

This page documents difficulties people encounter when trying to use the Cambodian language in the Khmer script on the Web.

Having identified an issue, it investigates the current status with regard to web specifications and implementations by user agents (browsers, e-readers, etc.), and attempts to prioritise the severity of the issue for web users.

A summary of this report and others can be found as part of the language matrix.

For a description of the Khmer script see the (non-W3C) page Khmer, which summarises aspects of the orthography and typographic features, including relevant Unicode characters and their use.

Work flow

This version of the document is a preliminary analysis.

Gap analysis work usually starts with a preliminary analysis, conducted quickly by one or a small group of experts. Then a more detailed analysis is carried out, involving a wider range of experts. The detailed analysis may involve the development of tests, in order to illustrate issues and track results for browsers. The next phase is ongoing maintenance. It is expected that the resulting document will not be frozen: as gaps are fixed, this should be noted in the document. It is also possible that new gaps are noticed or arise, and they can be added to this document when that happens.

As the gap analysis develops, the requirements for features that are problematic should be described in the companion document, Khmer Layout Requirements. Links to the appropriate part of that document should be added to this document as the material is created. Note that the requirements document should not contain any technology-specific information: all of that belongs here.

Prioritisation

This document not only describes gaps, it also attempts to prioritise them in terms of the impact on the local user. The prioritisation is indicated by colour.

Key:

It is important to note that these colours do not indicate to what extent a particular feature is broken. They indicate the impact of a broken or missing feature on the content author or end user.

Basic styling is the level that would be generally accepted as sufficient for most Web pages. Advanced level support would include additional features one might expect to include in ebooks or other advanced typographic formats. There may be features of a script or language that are not supported on the Web, but that are not generally regarded as necessary (usually archaic or obscure features). In this case, the feature can be described here, but the status should be marked as OK.

The decision as to what priority level is assigned to a described gap is down to the experts doing the gap analysis. It may not always be straightforward to decide. If a given section in this document refers to more than one feature that is broken, each with different impacts on Web users, the priority for the section should be the lowest common denominator.

A cell can be scored as OK if the feature in question is specified in an appropriate specification, and is supported by user agents. A specification that is in CR or later and has two implementations in 'major' browsers will count. This means that the feature may not be supported in all browsers yet. (At some point in the future we may try to distinguish, visually, whether support is available in a specification but still pending in major browsers or applications.)

Text direction

See also General page layout & progression for features such as column layout, page turning direction, etc. that are affected by text direction.

Vertical text

Bidirectional text

Characters and phrases

Characters & encoding

Fonts

Font styles, weight, etc

Glyph shaping and positioning

Cursive text

Baselines, line-height, etc

Transforming characters

Grapheme/word segmentation & selection

Inline features & punctuation

Text decoration

Quotations

Inline notes & annotations

Data formats & numbers

Lines and Paragraphs

Line breaking

See also hyphenation below.

Hyphenation

Text alignment & justification

Firefox on macOS keeps all vowel signs and diacritics with their base characters. It also keeps consonant stacks together with their vowel signs, such as ខ្លួ, and renders ligated combinations such as បា and កា as expected.

Chrome and Safari don't support text-justify: inter-character.
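As an illustration of the property mentioned above, the following is a minimal sketch of how an author might request inter-character justification for Khmer text. The class name justified-km is hypothetical and used only for this example.

```css
/* Hypothetical class name, for illustration only. */
.justified-km:lang(km) {
  text-align: justify;
  /* A browser that does not support this value drops the declaration
     and uses its default justification behaviour instead. */
  text-justify: inter-character;
}
```

The rule applies only to elements whose content language is declared as Khmer, for example with lang="km" on the element or an ancestor.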

Letter spacing

Lists, counters, etc.

Two numeric CSS counter styles are defined for Khmer, using Khmer digits, in the document Ready-made Counter Styles: khmer and cambodian. The two styles are identical. They are listed separately because both names are supported in some browsers, and providing two clean instances makes cutting and pasting easier.

In addition, an alphabetic counter style is defined: khmer-consonant.

The CSS Counter Styles specification only specifies the khmer and cambodian styles.

The khmer and cambodian counter styles are supported by Firefox, Chrome, Edge, and Safari, but not by legacy Edge.
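A minimal sketch of how an author might apply the numeric Khmer counter style to an ordered list, with a fallback for user agents that do not recognise the name. The class name km-numbered is hypothetical.

```css
/* Hypothetical class name, for illustration only. */
ol.km-numbered {
  /* Fallback for user agents that reject the next declaration. */
  list-style-type: decimal;
  /* Numbers list items with the Khmer digits U+17E0 to U+17E9. */
  list-style-type: khmer;
}
```

The name cambodian can be substituted for khmer, since the two styles produce the same result.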

The khmer-consonant style is not supported natively by any browser. Although the CSS Counter Styles spec allows authors to define their own counter styles, that feature is only implemented by Firefox at the moment, so in practice this style is not available in other browsers.
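For reference, an author-defined version of this style might look like the sketch below, which only takes effect in a browser that implements @counter-style. The symbol list (the 33 modern Khmer consonants in dictionary order) and the suffix are assumptions made for this example and should be checked against Ready-made Counter Styles before use. The class name km-alpha is hypothetical.

```css
/* Assumed definition: symbols and suffix are not taken from this
   document and should be verified against Ready-made Counter Styles. */
@counter-style khmer-consonant {
  system: alphabetic;
  symbols: 'ក' 'ខ' 'គ' 'ឃ' 'ង' 'ច' 'ឆ' 'ជ' 'ឈ' 'ញ' 'ដ' 'ឋ' 'ឌ' 'ឍ' 'ណ'
           'ត' 'ថ' 'ទ' 'ធ' 'ន' 'ប' 'ផ' 'ព' 'ភ' 'ម' 'យ' 'រ' 'ល' 'វ' 'ស'
           'ហ' 'ឡ' 'អ';
  suffix: '. ';
}

/* Hypothetical class name, for illustration only. */
ol.km-alpha {
  list-style-type: khmer-consonant;
}
```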

See tests: Simple numeric, Khmer script

Unless a case is made that khmer-consonant or some other style is important, the status for this section is marked as OK.

Styling initials

Page & book layout

General page layout & progression

Footnotes, endnotes, etc.

Page headers, footers, etc.

Forms & user interaction

Other

Culture-specific features

Sometimes a script or language does things that are not common outside of its sphere of influence. This is a loose bag of additional items that weren't previously mentioned. This section may also be relevant for observations related to locale formats (such as number, date, currency, format support).

What else?

There are many other CSS modules which may need review for script-specific requirements, not to mention the SVG, HTML, Speech, MathML and other specifications. What else is likely to cause problems for worldwide deployment of the Web, and what requirements need to be addressed to make the Web function well locally?
