This document describes requirements for the layout and presentation of text in languages that use the XXXX script when they are used by Web standards and technologies, such as HTML, CSS, Mobile Web, Digital Publications, and Unicode.
This early draft has not yet been through any review process. Please do not rely on the contents.
This document describes the basic requirements for XXXX script layout and text support on the Web and in eBooks. These requirements provide information for Web technologies such as CSS, HTML and digital publications about how to support users of XXXX scripts. Currently the document focuses on XXXX as used for YYYY. The information here is developed in conjunction with a document that summarises gaps in support on the Web for XXXX.
The editor's draft of this document is being developed by the XXXX Layout Task Force, part of the W3C Internationalization Interest Group. It is published by the Internationalization Working Group. The end target for this document is a Working Group Note.
This document provides information about how the XXX script is used for the XXX language.
This document should contain no reference to a particular technology. For example, it should not say "CSS does/doesn't do such and such", and it should not describe how a technology, such as CSS, should implement the requirements. It is technology agnostic, so that it will be evergreen, and it simply describes how the script works. The gap analysis document is the appropriate place for all kinds of technology-specific information.
This document is pointed to by a separate document, XXXX Gap Analysis, which describes gaps in support for XXX on the Web, and prioritises and describes the impact of those gaps on the user.
Wherever an unsupported feature is identified through the gap analysis process, the requirements for that feature need to be documented. This document is where those requirements are described.
The document Language Enablement Index points to this document and others, and provides a central location for developers and implementers to find information related to various scripts.
The W3C also maintains a tracking system that has links to GitHub issues in W3C repositories. There are separate links for (a) requests from developers to the user community for information about how scripts/languages work, (b) issues raised against a spec, and (c) browser bugs. For example, you can find out what information developers are currently seeking, and the resulting list can also be filtered by script.
Text in italics like this is placeholder text, aimed to give suggestions for content. The questions are only prompts, to help the editor think of topics to address, and should be deleted as actual content is added. Actual content will depend on the needs of the script being described.
What follows is a suggested initial set of headings. Change and reorganize sections as needed. We suggest that you delete all sections initially, other than those you are immediately about to work on. The full set of headings can be found again by looking at the latest template.
For more ideas, see the Language Enablement Index or check for currently needed information (can be filtered by language group).
This section introduces the script & language in general terms, providing some context and terminology that is useful in the remainder of the document. If possible, it's best not to introduce actual requirements in this section, but to leave those and detailed descriptions of expected typography to the sections that follow.
Is the script an alphabet, abugida, abjad, syllabary, etc? What is the basic set of characters for major languages written in the script? How does the script work? Are glyphs in the script regularly joined up, as in Arabic, or not? Are there other special features, such as stacking, syllabic clusters and conjuncts, etc? Is there a set of characters used for special purposes, such as transcription? Are multiple scripts used?
What are the basic directional features of the script? Is the script written right-to-left, or bidirectionally? Is it written vertically? If so, can it also be written horizontally, what is the frequency with which it is written one way or the other, and which side is the first line on?
If this script runs right-to-left, are there any special requirements when handling that? What about numbers and expressions – do they run rtl or ltr?
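As a purely illustrative aid when answering the questions above (not a requirement, and not tied to any Web technology), the following minimal Python sketch looks up the Unicode bidirectional class of individual characters. It shows that letters in right-to-left scripts and digits carry different directional classes, which is why numbers need separate consideration. The helper name `bidi_classes` is hypothetical; only the standard library `unicodedata` module is assumed.

```python
import unicodedata


def bidi_classes(text):
    """Return the Unicode bidirectional class of each character in `text`."""
    return [unicodedata.bidirectional(ch) for ch in text]


# Hebrew letter alef, Arabic letter alef, a European digit, an Arabic-Indic digit.
sample = "\u05D0\u0627" + "1" + "\u0661"
print(bidi_classes(sample))  # ['R', 'AL', 'EN', 'AN']
```

The classes `R` and `AL` mark strongly right-to-left letters, while `EN` and `AN` mark European and Arabic numbers, whose digits are laid out left-to-right even inside right-to-left text.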
Are there special considerations about page layout that should be described briefly at this point, eg. Japanese kihon hanmen?
What are key writing styles or font types used for this script (eg. naskh vs. nastaliq for Arabic, looped vs. loopless for Thai, upright/slanted/mool for Khmer, etc.)?
If this script is cursive (eg. Arabic, N'Ko, Syriac, Mongolian, etc), are there needed features related to the handling of cursive text? Is the basic shape of a letter radically changed? Is it sometimes not cursive? Are Unicode joiner and non-joiner characters needed to override default joining behaviours? Should cursive links break if parts of a word are marked up or styled?
Does the script in question require additional features to support alterations to the position or shape of glyphs, for example adjusting the distance between the base text and diacritics, or changing the glyphs used in a systematic way? Does the rendered text need to be shaped in particular ways, and are there special features related to diacritic or other glyph positioning?
Are italicisation, bolding, oblique, etc relevant? Do italic fonts lean to the right or left? Is synthesised italicisation problematic? Are there other problems relating to bolding or italicisation - perhaps relating to generalised assumptions of applicability?
Does the script have special requirements for baseline alignment between mixed scripts and in general? Is line height special for this script? Are there other aspects that affect line spacing, or positioning of items vertically within a line?
What transformations does your script need? For example, does your script convert letters to uppercase, capitalised and lowercase alternatives? Do you need to convert between half-width and full-width presentation forms?
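To make the two kinds of transformation mentioned above concrete, here is a minimal Python sketch, offered only as an illustration. The function names `to_upper` and `fold_width` are hypothetical. Note the assumption in `fold_width`: it uses Unicode NFKC normalization, which does convert full-width and half-width presentation forms to their ordinary counterparts but also folds many other compatibility characters, so it is broader than a pure width conversion.

```python
import unicodedata


def to_upper(text):
    """Uppercase using Unicode case mappings (lengths may change)."""
    return text.upper()


def fold_width(text):
    """Approximate width folding via NFKC (assumption: NFKC also folds
    other compatibility characters, not only width variants)."""
    return unicodedata.normalize("NFKC", text)


print(to_upper("stra\u00DFe"))                          # STRASSE
print(fold_width("\uFF21\uFF22\uFF23") == "ABC")        # True (full-width Latin)
print(fold_width("\uFF76\uFF85") == "\u30AB\u30CA")     # True (half-width katakana)
```

The first call shows that case conversion is not always one character to one character, and a script without case is simply left unchanged by it.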
What are the basic units of the text, and how are they demarcated, eg. characters, character sequences, syllables, or words? Are spaces or different symbols used between 'words'? Is it important to treat clusters of characters as a single unit? What should happen if you double- or triple-click in the text, whether or not 'words' are separated by spaces? Any special requirements for forward and backwards deletion, cursor movement & selection, character counts, searching & matching, text insertion, line-breaking, justification, case conversions, sorting?
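The difference between code points and user-perceived units can be seen with a small experiment. The Python sketch below is a deliberately rough approximation, not the Unicode grapheme cluster algorithm: it assumes that any character whose general category starts with `M` (a mark) attaches to the preceding character. Real segmentation rules are richer, and for many scripts (for example, conjuncts formed with a virama) this approximation gives different results. The name `approximate_clusters` is hypothetical.

```python
import unicodedata


def approximate_clusters(text):
    """Group each base character with the marks that follow it.

    Assumption: general category M* (Mn, Mc, Me) means "attach to the
    previous character". This is only a rough stand-in for real
    grapheme cluster segmentation.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


word = "\u0915\u093F"  # Devanagari letter KA + vowel sign I
print(len(word))                        # 2 code points
print(len(approximate_clusters(word)))  # 1 unit as a reader would see it
```

Operations such as deletion, cursor movement, and character counts give different results depending on which of these two units they work with, which is why the expected behaviour needs to be described for each script.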
Are words separated by spaces, or other characters? Are there special requirements when double-clicking on the text? Are words hyphenated (ie. to create compounds)?
What characters are used to indicate the boundaries of phrases, sentences, and sections? What about parenthetical or bracketed phrases?
What is the expected behaviour for quotation marks, especially when nested? Should block quotes be indented or handled specially? What characters are used to indicate dialogue?
Bold and italic are not always appropriate for expressing emphasis, and some scripts have their own ways of doing it that are not in the Western tradition at all. How are emphasis and highlighting achieved? If lines are drawn alongside, over or through the text, do they need to be a special distance from the text itself? Is it important to skip characters when underlining, etc? How do things change for vertically set text?
What characters or mechanisms are used to indicate abbreviation, ellipsis & repetition?
What mechanisms, if any, are used to create *inline* notes and annotations? (For referent-type notes such as footnotes, see below.)
What other characters or methods (eg. text decoration) are used to convey information about a range of text? If lines are drawn alongside, over or through the text, do they need to be a special distance from the text itself? Is it important to skip or not skip characters when underlining, etc? Do you need support for special line shapes or widths? How do things change for vertically set text?
Punctuation not already mentioned, such as dashes, connectors, separators, etc. Does the script use special symbols that are worth noting, eg. head marks in Tibetan?
Does the script have its own set of number digits? How are they used, and how frequently? Does the numbering system use base-10, or some other type of base? Does it have special formatting patterns (eg. 12,34,000 in India)? What about date/time formats and selection – are non-Gregorian calendars used? How are percent signs used, and do numbers have special decorations (like in Ethiopic or Syriac)?
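The grouping pattern in the 12,34,000 example can be stated precisely: the last three digits form one group, and the remaining digits are grouped in twos. The Python sketch below is an illustration only, under the assumption that this simple rule is what is wanted. It also shows a mapping from ASCII digits to a script's own decimal digits. The names `group_indian` and `to_native_digits` are hypothetical, and Devanagari digits are used merely as an example.

```python
def group_indian(n):
    """Format an integer with the 3-then-2 digit grouping (eg. 12,34,000)."""
    digits = str(abs(n))
    head, tail = digits[:-3], digits[-3:]
    groups = []
    while head:
        groups.insert(0, head[-2:])  # peel off groups of two from the right
        head = head[:-2]
    groups.append(tail)
    result = ",".join(groups)
    return "-" + result if n < 0 else result


def to_native_digits(text, zero="\u0966"):
    """Replace ASCII digits with the ten digits starting at `zero`
    (default: DEVANAGARI DIGIT ZERO, chosen only as an example)."""
    table = {ord("0") + i: chr(ord(zero) + i) for i in range(10)}
    return text.translate(table)


print(group_indian(1234000))                    # 12,34,000
print(to_native_digits(group_indian(1234000)))  # १२,३४,०००
```

Whether native digits are actually preferred, and in which contexts, is exactly the kind of usage information this section should record.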
Are there special rules about the way text wraps when it hits the end of a line? Does line-breaking wrap whole 'words' at a time, or characters, or something else (such as syllables in Tibetan and Javanese)? What characters should not appear at the end or start of a line, and what should be done to prevent that?
Is it normal to have flush lines down both sides of the text body (ie. full justification)? If so, what rules are used by the script to resolve the positioning of characters on a line? Does the script allow punctuation to hang outside the text box at the start or end of a line? Where adjustments are needed to make a line flush, how is that done? Does the script shrink/stretch space between words and/or letters? Are word baselines stretched, as in Arabic?
Does the script conform to a grid pattern? What about paragraph indents?
Some scripts create emphasis or other effects by spacing out the letters or syllables in a word. Does this script do so, whether by spacing out words, letters or syllables? If so, what are the rules for doing so, eg. what things should be kept together or split? What happens if the text wraps to the next line? How is cursiveness affected? (For justification-related spacing, see above. This feature relates only to text that is separated by inserting/reducing space over an inline range.)
The CSS Counter Styles specification describes a limited set of simple and complex styles for counters to be used in list numbering, chapter heading numbering, etc. The rules plus more counter styles (totalling around 120 for over 30 scripts) are listed in the document Ready-made Counter Styles. Do these cover your needs? Do counters need to be upright in vertical text? Are the details correct? Are there other aspects related to counters and lists that need to be addressed?
Does the script use special styling of the initial letter of a line or paragraph, such as for drop caps or similar? How about the size relationship between the large letter and the lines alongside? Where does the large letter anchor relative to the lines alongside? Is it normal to include initial quote marks or other leading punctuation in the large letter? Is the large letter really a syllable? Are dropped, sunken, and raised types found? etc.
How are the main text area and ancillary areas positioned and defined? Are there any special requirements here, such as dimensions in characters for the Japanese kihon hanmen? The book cover for scripts that are read right-to-left is on the right of the spine, rather than the left. When content can flow vertically and to the left or right, how to specify the location of objects, text, etc. relative to the flow? Do tables and grid layouts work as expected? How do columns work in vertical text? Can you mix blocks of vertical and horizontal text? Does text scroll in a different direction?
Does the script have special requirements for character grids or tables?
Does the script have special requirements for notes, footnotes, endnotes or other necessary annotations of this kind? (There is a section above for purely inline annotations, such as ruby or warichu. This section is more about annotation systems that separate the reference marks and the content of the notes.)
Are vertical form controls needed? Are scroll bars in an unusual position? Other special requirements for user interaction?
Are there special conventions for page numbering, or the way that running headers and the like are handled?
Special thanks to the following people who contributed to this document (contributors' names listed in alphabetical order).
This Person, That Person, etc
Please find the latest information about the contributors in the GitHub contributors list.