This document describes requirements for the layout and presentation of text in languages that use the Hebrew script when they are used by Web standards and technologies, such as HTML, CSS, Mobile Web, Digital Publications, and Unicode.
This document describes the basic requirements for Hebrew script layout and text support on the Web and in eBooks. These requirements provide information for Web technologies such as CSS, HTML and digital publications about how to support users of the Hebrew script. Currently the document focuses on Hebrew as used for the Hebrew language. The information here is developed in conjunction with a document that summarises gaps in support on the Web for Hebrew.
The editor's draft of this document is being developed by the Hebrew Layout Task Force, part of the W3C Internationalization Interest Group. It is published by the Internationalization Working Group. The end target for this document is a Working Group Note.
Sending comments on this document
If you wish to make comments regarding this document, please raise them as GitHub issues. Only send comments by email if you are unable to raise issues on GitHub (see links below). All comments are welcome.
To make it easier to track comments, please raise a separate issue or send a separate email for each comment, and point to the section you are commenting on using a URL.
Some text goes here.
This document is pointed to by a separate document, Hebrew Gap Analysis, which describes gaps in support for Hebrew on the Web, and prioritises and describes the impact of those gaps on the user.
Wherever an unsupported feature is identified through the gap analysis process, the requirements for that feature need to be documented. This document is where those requirements are described.
This document should contain no reference to a particular technology. For example, it should not say "CSS does/doesn't do such and such", and it should not describe how a technology, such as CSS, should implement the requirements. It is technology agnostic, so that it will be evergreen, and it simply describes how the script works. The gap analysis document is the appropriate place for all kinds of technology-specific information.
The document International text layout and typography index (known informally as the text layout index) points to this document and others, and provides a central location for developers and implementers to find information related to various scripts.
The W3C also maintains a tracking system that has links to GitHub issues in W3C repositories. There are separate links for (a) requests from developers to the user community for information about how scripts/languages work, (b) issues raised against a spec, and (c) browser bugs. For example, you can find out what information developers are currently seeking, and the resulting list can also be filtered by script.
This section introduces the Hebrew script in general terms, providing some context and terminology that is useful in the remainder of the document. If possible, it's best not to introduce actual requirements in this section, but to leave those and detailed descriptions of expected typography to the sections that follow.
Is the script an alphabet, abugida, abjad, syllabary, etc? What is the basic set of characters for major languages written in the script? How does the script work? Are glyphs in the script regularly joined up, as in Arabic, or not? Are there other special features, such as stacking, syllabic clusters and conjuncts, etc? Is there a set of characters used for special purposes, such as transcription? Are multiple scripts used?
What are the basic directional features of the script? Is the script written right-to-left, or bidirectionally? Is it written vertically? If so, can it also be written horizontally, what is the frequency with which it is written one way or the other, and which side is the first line on?
Are there special considerations about page layout that should be described briefly at this point, eg. Japanese kihon-hanmen?
The following is placeholder text, aimed to give suggestions for content. The questions are only prompts, to help the editor think of topics to address. Actual content will depend on the needs of the script being described. Change and reorganize sections and headings as needed.
What are key fonts and/or font styles used for this script? Do italic fonts lean in the right direction? Is synthesised italicisation problematic? Are there standard fallback fonts in use in browsers, and if so do they match expectations? Does the script use fixed-width font glyphs, proportionally-spaced fonts, or a combination? See available information or check for currently needed data.
Does the script in question require additional features to support alterations to the position or shape of glyphs, for example adjusting the distance between the base text and diacritics, or changing the glyphs used in a systematic way? See available information or check for currently needed data.
If this script is cursive (eg. Arabic, N'Ko, Syriac, Mongolian, etc), are there problems or needed features related to the handling of cursive text? Do cursive links break if parts of a word are marked up or styled? Do Unicode joiner and non-joiner characters behave as expected? See available information or check for currently needed data.
What are the typical punctuation marks used, and how are they used? (See other sections for information about how quotations work, and how punctuation interacts with boundary detection and line-breaking, etc.) See available information or check for currently needed data.
What is the expected behaviour for quotation marks, especially when nested? Should block quotes be indented or handled specially? See available information or check for currently needed data.
Does the script use special symbols that are worth noting, eg. head marks in Tibetan?
Does the script have its own set of number digits? How are they used, and how frequently? Does the numbering system use base-10, or some other type of base? Does it have special formatting patterns (eg. 12,34,000 in India)? What about date/time formats and selection – are non-Gregorian calendars used? How are percent signs used, and do numbers have special decorations (like in Ethiopic or Syriac)?
What are the basic units of the text, and how are they demarcated, eg. characters, character sequences, syllables, or words? Are spaces or different symbols used between 'words'? Is it important to treat clusters of characters as a single unit? What should happen if you double- or triple-click in the text, whether or not 'words' are separated by spaces? See available information.
What transformations does your script need? For example, does your script convert letters to uppercase, capitalised and lowercase alternatives? Do you need to convert between half-width and full-width presentation forms? See available information or check for currently needed data.
Some scripts create emphasis or other effects by spacing out the letters or syllables in a word. We know there are questions here about how this should work in Indic and SE Asian scripts, and in Arabic-based scripts. Can you provide information? Are there requirements for this script that we should add? (For justification related spacing, see below.) See available information or check for currently needed data.
Are ruby annotations used in this script/language? If so, how? See available information or check for currently needed data.
Some aspects related to the drawing of lines alongside or through text involve local typographic considerations. For example, underlines need to be broken in special ways for some scripts, and the position relative to the text may vary depending on the script. Do you need support for special line shapes or widths? What about vertical text? See available information or check for currently needed data.
Bold and italic are not always appropriate for expressing emphasis, and some scripts have their own unique ways of doing it, that are not in the Western tradition at all. If this applies to this script/language, how is it done? See available information or check for currently needed data.
If this script runs right-to-left, are there any special requirements when handling that? What about numbers and expressions – do they run rtl or ltr? See available information or check for currently needed data.
Does the script/language have special ways of representing inline notes (such as wakiten or kumimoji in Japanese) or other special inline features that need to be supported? See available information or check for currently needed data.
What are the important features about how script wraps when it hits the end of a line? Does line-breaking wrap whole 'words' at a time, or characters, or something else (such as syllables in Tibetan and Javanese)? What characters should not appear at the end or start of a line, and what should be done to prevent that? See available information or check for currently needed data.
Is hyphenation used for your script, or something else? If hyphenation is used, where should the hyphen appear? See available information or check for currently needed data.
Is it normal to have flush lines down both sides of the text body (ie. full justification)? If so, what rules are used by the script to resolve the positioning of characters on a line? Does the script conform to a grid pattern? Does your script allow punctuation to hang outside the text box at the start or end of a line? Where adjustments are needed to make a line flush, how is that done? Do you shrink/stretch space between words and/or letters? Are word baselines stretched, as in Arabic? See available information or check for currently needed data.
The CSS Counter Styles specification describes a limited set of simple and complex styles for counters to be used in list numbering, chapter heading numbering, etc. The rules plus more counter styles (totalling around 120 for over 30 scripts) are listed in the document Ready-made Counter Styles. Do these cover your needs? Are the details correct? Are there other aspects related to counters and lists that need to be addressed? See available information or check for currently needed data.
Does the script/language apply special styling of the initial letter of a line or paragraph, such as for drop caps? If so, what are the rules for positioning: what is the size relationship between the large letter and the lines alongside? where does the large letter anchor relative to the lines alongside? is it normal to include initial quote marks in the large letter? is the large letter really a syllable? etc. See available information or check for currently needed data.
What are the requirements for baseline alignment between mixed scripts and in general? See available information or check for currently needed data.
In this script/language, is the first line of text typically indented at the start of a paragraph? Are there other features of paragraph design that are peculiar to your script? See available information or check for currently needed data.
If this is a RTL or vertically-set writing system, how do you specify the location of objects, text, etc. relative to the flow? Is content mirror-imaged completely in layouts? What other aspects of layout are affected by direction? See available information or check for currently needed data.
What are the requirements for vertically oriented text? What about if you mix vertical text with scripts that are normally only horizontal? Is it normal to use different characters in vertical vs. horizontal text? Do you expect short numbers, acronyms, etc to run horizontally within a vertical line (tate chu yoko)? See available information.
Does your script have special requirements for notes, footnotes, endnotes or other necessary annotations of this kind in the way needed for your culture? See available information or check for currently needed data.
Are there special conventions for page numbering, or the way that running headers and the like are handled? See available information or check for currently needed data.
Some cultures define page areas and page progression direction very differently from those in the West (eg. kihon hanmen in Japanese). Is this an issue for you? Are widows and orphans relevant? In what order do pages progress, RTL or LTR? See available information or check for currently needed data.
Special thanks to the following people who contributed to this document (contributors' names listed in alphabetical order).
This Person, That Person, etc
Please find the latest information about the contributors at the GitHub contributors list.