Structural markup and right-to-left text in HTML

This article looks at ways of handling text direction for structural markup in HTML, ie. at the document level and for elements like paragraphs, tables and forms.

For handling bidirectional text with inline markup you should read the separate article, Inline markup and bidirectional text in HTML. It also describes some other elements and attributes related to direction.

The dir attribute is used to set the base direction of text for display. It is essential to support languages that use right-to-left scripts such as Adlam, Arabic, Hebrew, N'Ko, Syriac, and Thaana. Many different languages are written with these scripts, including Arabic, Dhivehi, Hebrew, Mandinka, Pashto, Persian, Pular, Sindhi, Syriac, Urdu, Yiddish, etc.

Quick answer

If the overall document direction is right-to-left, add dir="rtl" to the html tag.

Below the html tag, only use the dir attribute on structural elements on the rare occasions when the base direction needs to change in order for the text to display correctly.

Never use CSS to apply the base direction. But do use logical keywords ('start' and 'end') in properties or values related to margins, padding, alignment, etc. to make it easy to manage direction changes during localisation. Avoid HTML attributes with values of 'right' and 'left'.

Set the dir attribute to auto on forms and inserted text in order to automatically detect the direction of content supplied at run-time. Consider using the dirname attribute on forms to send information about direction to the server in addition to the usual form data.

Handling bidirectional inline text is dealt with in the separate article, Inline markup and bidirectional text in HTML.

Setting direction at the document level

Base direction

Examples in this document may be shown as images to avoid problems for those with a browser that doesn't produce what was intended.

Code samples containing Arabic and Hebrew text may be displayed in different ways, none of which are usually satisfactory. In this article right-to-left text in code samples may be represented by UPPERCASE TRANSLATIONS, and left-to-right text by lowercase.

At the outset, it is important to understand the concept of base direction (see Unicode Bidirectional Algorithm basics for a simple overview of how it works with the Unicode bidirectional algorithm).

It is fundamentally important to establish the appropriate base direction for the text, so that the Unicode bidirectional algorithm can reorder the text appropriately when it is displayed. Correctly setting the base direction also sets the default paragraph alignment of the text.

In HTML the base direction is either (a) set explicitly by the nearest parent element that uses the dir attribute (which could be the html element), or, (b) in the absence of such an attribute, left-to-right (LTR).

Setting up a right-to-left page

Add dir="rtl" to the html tag any time the overall document direction is right-to-left (RTL). This sets the default base direction for the whole document. All block elements in the document will inherit this setting unless the direction is explicitly overridden.

<!DOCTYPE html>
<html dir="rtl" lang="ar">
<head>
<meta charset="utf-8">
...

No dir attribute is needed for documents that have a base direction of left-to-right, since this is the default, but it doesn't hurt to use it with a value of ltr.

This simple addition to the html element will have the following effects throughout the rendered page.

  1. Paragraphs and other blocks will be right-aligned.
  2. Bidirectional text will correctly flow from right-to-left.
  3. Punctuation will appear in the correct place relative to the text.
  4. Table columns will progress from right-to-left, and their contents will be right-aligned.
  5. Input in form fields will automatically start at the right, by default.
  6. If you write the style sheet correctly, CSS will automatically mirror the layout.
  7. It will set the direction of overflows.
 
What content looks like before (left) and after (right) the dir attribute is added to the html tag.

Language tags

While you are declaring the directionality of the document in the html tag, don't forget to also declare the language of the document using the lang attribute (see Declaring language in HTML). However, do not make the mistake of assuming that language declarations indicate directionality, or vice versa! Even if the language tag includes a script subtag (such as Arab in az-Arab), it won't affect the directionality of the text in the user agent. You must always declare the directionality using the dir attribute.
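As a small illustration, the following sketch declares both attributes side by side. The language tag az-Arab and the placeholder text are assumptions chosen for the example, and uppercase stands in for right-to-left text as elsewhere in this article.

```html
<!DOCTYPE html>
<!-- lang names the language (and, here, the script); it does not set direction. -->
<!-- dir is what establishes the right-to-left base direction. -->
<html lang="az-Arab" dir="rtl">
<head>
<meta charset="utf-8">
<title>PAGE TITLE</title>
</head>
<body>
<p>TEXT IN THE ARABIC SCRIPT.</p>
</body>
</html>
```

Removing dir="rtl" from this sketch would leave the page with the default left-to-right base direction, whatever the lang value says.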

Scroll bars

The LTR/RTL direction of the page shouldn't affect the location of scrollbars, since these are part of the browser chrome that is determined by the user, rather than by the language of the page.

The title element

The text that appears in the title element at the top of an HTML file is often displayed in tab headings, bookmarks, etc. When so displayed, the browser should automatically apply the base direction that the title element had in the original document. For example, if the html tag declares the document direction to be RTL, the title element text should be displayed with a RTL base direction.

At the time of writing, browsers tend to display RTL title text from right-to-left, and LTR title text from left-to-right. However, they do this not by examining the direction applied to the text by the markup, but instead by finding the first strongly directional character in the title and assuming that that indicates the appropriate base direction.

Much of the time this will produce the desired result. However, if the title text in a RTL document begins with, say, an acronym in the Latin script, the order will be incorrect when the text is displayed (see some tests).

A workaround for this scenario is to add &rlm; at the beginning of the title text when it doesn't begin with a RTL character. This adds U+200F RIGHT-TO-LEFT MARK at the start, which is an invisible, strongly directional RTL character.

If you have a LTR text that begins with a strong RTL character, use &lrm; at the start, instead.
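The two workarounds might look something like the following. These are alternative fragments from two separate documents, and the title texts are hypothetical placeholders.

```html
<!-- RTL document whose title starts with a Latin-script acronym:
     &rlm; supplies a strong RTL character at the very start. -->
<title>&rlm;HTML بالعربية</title>

<!-- LTR document whose title starts with a strong RTL character:
     &lrm; plays the same role for left-to-right. -->
<title>&lrm;مرحبا is an Arabic greeting</title>
```

Because both marks are invisible, they change only how the browser guesses the direction, not what the user sees in the title.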

Setting direction on block elements

Don't use CSS for direction!

Do not use CSS to apply base direction in HTML pages.

Basically, this is because you want the directional information to be available even when the CSS is not. The directional information can affect the semantics of your content, and so should be part of the markup. (See a longer explanation).

Both the CSS and HTML specs echo this same admonition.
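To make the difference concrete, here is a minimal sketch that contrasts the two approaches. The paragraph content is a placeholder.

```html
<!-- Preferred: the direction is part of the markup, so it survives
     when the style sheet is missing, stripped or overridden. -->
<p dir="rtl" lang="ar">مرحبا!</p>

<!-- Avoid: this paragraph is only right-to-left while the CSS is applied. -->
<p lang="ar" style="direction: rtl;">مرحبا!</p>
```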

Only use markup for special circumstances

Use the dir attribute on a block element only when you need to change the base direction of content in that block.

Having established the base direction at the html tag level, you may have no need to use the attribute for any block elements on the page, since the direction set at the start of the page percolates down to all block elements.

(You may however need to use it for inline stretches of bidirectional text. That is described in more detail in Inline markup and bidirectional text in HTML.)

The following is an example of how to mark up a block element with a left-to-right base direction in a right-to-left document.

<blockquote dir="ltr" lang="en" cite="Romeo and Juliet (II, ii, 1-2)">But, soft! What light through yonder window breaks? It is the east, and Juliet is the sun.</blockquote>

Using logical properties in CSS

Text that is aligned to the right in an English page generally needs to be aligned to the left in a RTL page. It is possible to make that happen automatically, without the hassle of changing all the CSS in your style sheet. The solution is to use 'logical properties' when setting up your style: ie. use 'start' and 'end', rather than 'left' or 'right'.

Using logical properties by default makes it much easier to localise your content in the future or include text with a different direction. After a while, thinking about start and end rather than left and right becomes natural, and will be useful to you when dealing with layout methods such as CSS grid layout or flexbox which follow the same patterns.

Left and right values may still be useful, occasionally, if you want the item you are positioning to remain in a fixed location, independent of the language of the text. Learning to distinguish when to use the left/right terms rather than the default start/end terms helps you to be more aware of your design intent.

Logical values or property names that enjoy interoperable support on the major browser engines include:

text-align: start | end
justify-content: flex-start | flex-end ...
align-content: flex-start | flex-end ...
grid-column-start: <value>
grid-column-end: <value>
inline-size: <width>
margin-inline-start/end: <value>
padding-inline-start/end: <value>
border-inline-start/end-width: <value>
border-inline-start/end-style: <value>
border-inline-start/end-color: <value>
etc.

For many of these properties it is also possible to replace inline with block. This facilitates changing between horizontal and vertical modes when dealing with Chinese, Japanese, Mongolian, etc.

When you use these properties in your style sheet and set the direction of your content to RTL, the alignment of that content treats start as right, and end as left. If you change the direction of the text, you don't have to worry about also adapting the style sheet.
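The following sketch shows one way this can look in practice. It uses only properties from the list above; the class name note and the sample text are assumptions made up for the illustration.

```html
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
<meta charset="utf-8">
<title>Logical properties sketch</title>
<style>
/* .note is a hypothetical class name used only for this illustration. */
.note {
  text-align: start;               /* left in LTR, right in RTL */
  margin-inline-start: 2em;        /* indent on the side where lines begin */
  padding-inline-end: 1em;         /* space on the side where lines end */
  border-inline-start-width: 4px;  /* accent bar on the start side */
  border-inline-start-style: solid;
  border-inline-start-color: gray;
}
</style>
</head>
<body>
<p class="note">This note is indented from the left.</p>
<!-- Same class, different base direction: the indent and bar move to the right. -->
<p class="note" dir="rtl" lang="ar">مرحبا</p>
</body>
</html>
```

The style rule is written once, and the only thing that differs between the two paragraphs is the dir attribute in the markup.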

At the time of writing, additional properties are still awaiting adoption by some of the main browser engines. These include float, caption-side, clear, and border-radius. Also, the shorthand properties for margin and padding are not yet implemented. See a set of test results for major browsers.

Other recommendations include:

Working with tables

The dir attribute setting also affects the flow of columns in a table. The following picture shows a table in a right-to-left document (ie. the html tag includes dir="rtl"). The content of the table cells is right-aligned, the flow of content in each cell is right-to-left, and the columns also run right-to-left.

Picture of table.

In the table just below, the code dir="ltr" has been added to the table element, like this:

<table dir="ltr"> … </table>

Note how the order of columns has changed, how the contents of the cells are now left-aligned (look at the numbers), and how the flow of words within each cell is now left-to-right (although the words themselves are still read, character by character, in the same direction).

Picture of table.

What hasn't changed, however, is the alignment of the table itself within its containing block. It is still over to the right.

If, for some reason, you wanted to use markup (rather than styling) to make the table appear over on the left as well as reorder the columns (perhaps because you see the table as part of a left-to-right direction block), you would need to wrap it in something like a div element, and add the dir="ltr" to that element to achieve that effect. (Don't use CSS text-align because that will affect the table cells!) See the third rendering of the table below, which is now left-aligned.

Picture of table.

Note that we don't have to repeat the dir attribute on the table itself, and that the columns still run left-to-right.
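A minimal sketch of the wrapped version follows, assuming it sits inside a document whose html tag has dir="rtl". The table content is a hypothetical placeholder.

```html
<!-- Inside a document whose html tag has dir="rtl". -->
<div dir="ltr">
  <!-- The table inherits LTR from the div, so no dir attribute is needed here. -->
  <table>
    <tr><th>Item</th><th>Quantity</th></tr>
    <tr><td>Pens</td><td>12</td></tr>
  </table>
</div>
```

Moving dir="ltr" from the div back onto the table element would keep the left-to-right column order but leave the table itself over to the right.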

dir=auto

If the value of the dir attribute is set to auto, the browser will look at the first strongly typed character in the element (that is, the first character with a strong directionality of its own, such as a letter) and work out from that what the base direction of the element should be. If it's a Hebrew (or Arabic, etc.) character, the element will get a direction of rtl. If it's, say, a Latin character, the direction will be ltr.

There are some corner cases where this may not give the desired outcome, but in the vast majority of cases it should produce the expected result.

Applied to block elements, the auto value comes in handy when you don't know in advance the direction of the text inserted into a page. It is also particularly useful for working with forms.

Inserting text into a page with the correct base direction

Applications often insert text into a page at run time by pulling information from a database or other location, be it via server-side scripting such as PHP, using AJAX, or some other method. Such text can be multilingual/multiscript, and the direction of the text may not be known in advance. (Multiscript text is much more common in pages that are predominantly right-to-left than in other pages.)

Such inserted text is commonly inline, and both the auto value of the dir attribute and the bdi element play a useful role in handling such situations. Their use for inline markup is described in more detail in the article Inline markup and bidirectional text in HTML.

It is sometimes useful to also label block level content, for example in a forum where posts are in both Urdu and English, or where text in a single post is a mixture of Hebrew and English paragraphs. Simply add dir="auto" to the element that surrounds each post and the first strongly-typed character in the element will determine the direction of that element's content.

The HTML5 specification gives an example related to a chat session. Given the following markup:

<p dir="auto" class="u1"><bdi>S</bdi>: <span class="msg">How do you write "What's your name?" in Arabic?</span></p>
<p dir="auto" class="u2"><bdi>T</bdi>: <span class="msg"> ما اسمك؟</span></p>
<p dir="auto" class="u1"><bdi>S</bdi>: <span class="msg">Thanks.</span></p>
<p dir="auto" class="u2"><bdi>T</bdi>: <span class="msg">That's written "شكرًا".</span></p>
<p dir="auto" class="u2"><bdi>T</bdi>: <span class="msg">Do you know how to write "Please"?</span></p>
<p dir="auto" class="u1"><bdi>S</bdi>: <span class="msg">"من فضلك", right?</span></p>

The browser will display the following:

Picture of output

Note how, when searching for the first strongly-typed character, the browser skips over text in a bdi element. It also skips text in script, style, and textarea elements, and any element with a dir attribute.

Note, also, how this approach is not foolproof: the final paragraph in this example is misinterpreted as being right-to-left text, since it begins with an Arabic character. This causes the line to be right-aligned and the text "right?" to be to the left of the Arabic text, with the question mark at the far left.
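If the application happens to know the intended direction of a message (an assumption that will not hold for every data source), one possible remedy is to state the direction explicitly instead of relying on detection. A sketch for the final paragraph of the chat:

```html
<!-- dir="ltr" replaces dir="auto" because the direction is assumed to be known. -->
<p dir="ltr" class="u1"><bdi>S</bdi>: <span class="msg">"من فضلك", right?</span></p>
```

With an explicit left-to-right base direction the line is left-aligned, and the text "right?" follows the Arabic phrase on its right.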

Working with forms

Many web applications with a right-to-left-language interface or a right-to-left-language data source need to display and/or accept as input both LTR and RTL data. The application often doesn't know, and cannot control, the direction of the data.

Correctly displaying text in the input element

An online book store that carries books in many languages needs to work with the original book titles regardless of the language of the user interface. Thus, a Hebrew or Arabic book title may appear in an English interface, and vice-versa (this problem is actually much more widespread in RTL pages). The direction of the title may be available as a separate attribute, but more likely it isn't.

In the following example we search for a Hebrew title, הצהחת קידוד תװי CSS, in an English user interface.

Without taking steps to prevent it, you'll notice that (a) the word 'CSS' comes out in the wrong place (it should be on the left), and (b) the text remains left-aligned rather than over to the right. Perhaps even worse, the user experience of typing opposite-direction data can be quite awkward in some cases due to the cursor and punctuation jumping around during data entry and difficulty in selecting text.

Picture of code with dir attributes on every element.

The solution is to just add dir="auto" to the input tag.

Picture of code with dir attributes on every element.

Since the first strong character is right-to-left, the auto value causes the input field to be right-to-left too.

If the next book that the user searches for has an English title, the text will automatically be left-aligned and the base direction will be set to LTR.
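In markup, the fix could be as small as the following sketch. The form action and the field name are hypothetical placeholders.

```html
<!-- Hypothetical search form; the action URL and field name are placeholders. -->
<form action="/search" method="get">
  <!-- dir="auto" lets the first strong character typed decide the direction. -->
  <label>Title: <input type="text" name="title" dir="auto"></label>
  <button type="submit">Search</button>
</form>
```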

Alternating directionality in textarea (and pre) paragraphs

Both textarea and pre elements can contain more than one paragraph of text, and it is not possible to apply markup to those paragraphs.

If a textarea element inherits or sets a direction of rtl, all paragraphs will be right-aligned, but the paragraphs that should have a LTR base direction won't have it. For example, in the following picture the exclamation mark associated with the word 'two' should appear to the right, not the left.

If you set dir to auto on the element then base direction is assigned to each paragraph independently, according to the direction of the first strong character in that paragraph. RTL and LTR paragraphs are also aligned differently.

When a line contains no strong directional characters, such as '123-456', a LTR base direction is used for the arrangement of the characters; however, the alignment of the line currently varies by browser. WebKit browsers keep the text right-aligned, whereas Blink and Gecko browsers left-align it. It is likely that in future all browsers will base the alignment of such lines on that of the previous paragraph.
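A minimal sketch of a textarea set up this way, with placeholder content made of one left-to-right and one right-to-left paragraph:

```html
<!-- Each line break starts a new paragraph, and each paragraph takes its
     base direction from its own first strong character. -->
<textarea dir="auto" rows="4" cols="30">
one two!
مرحبا!
</textarea>
```

Here the exclamation mark should end up to the right of 'two' in the first paragraph and to the left of the Arabic word in the second.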

Reporting direction to the server with dirname

When you cause the browser to dynamically apply the correct direction to text in a form field, either by using dir="auto", by using JavaScript, or even by using browser-specific keystrokes or context menus, the dirname attribute allows you to pass that information to the server, so that it can be re-used when the text is displayed in another context.

Here's an example of it in use:

<form action="addcomment.cgi" method="get">
  <p><label>Comment: <input type="text" name="comment" dirname="commentdir" required></label></p>
  <p><button name="mode" type="submit" value="add">Post Comment</button></p>
</form>

The value of dirname can be whatever you want (but not empty). When it is set, the form passes the direction of the element to the server, using the name you have provided. So if the user switches the direction of the form entry field in the example above to RTL and enters مرحبا, then when the form is submitted, the submitted form data (the URL query string, since this form uses the get method) will look like this:

comment=%D9%85%D8%B1%D8%AD%D8%A8%D8%A7&commentdir=rtl&mode=add

The directional information can then be used to apply the correct direction to the text when it is displayed on another page.

This attribute can, of course, also be used to submit the direction of the input field when dir is set to rtl or ltr. This could be useful for a database that stores data in a variety of languages.
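The round trip might be sketched as follows. The field names are the ones from the example above; how the server stores the value and writes it back out is left open here, and the second fragment simply assumes it does so.

```html
<!-- Input side: dir="auto" detects the direction, dirname reports it. -->
<input type="text" name="comment" dir="auto" dirname="commentdir">

<!-- Output side (another page): the server is assumed to write the stored
     value of commentdir, here rtl, back into the dir attribute. -->
<p dir="rtl">مرحبا</p>
```

Storing the direction alongside the text means the display page does not have to guess it again from the content.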

Setting direction on forms manually

Browsers may allow users to set the base direction of form entry fields using keystrokes. Having the right base direction set can significantly improve the user's experience, especially if the text they are inputting contains punctuation and numbers. Unfortunately, each browser has a different way of doing this. This section lists how to do it for some major desktop browsers.

In some cases you will need to set up your system for this to work. For example, for Internet Explorer you may need to install the Hebrew package and enable the Hebrew keyboard before this will work.

Chrome: Right-click on input or textarea elements to reveal the Writing Direction submenu. Choose either Right to Left or Left to Right. This sets the value of the element's dir attribute, which is then available to scripts.

Safari: Right-click on input or textarea elements to reveal the Paragraph Direction submenu. Choose either Right to Left or Left to Right. This sets the value of the element's dir attribute, which is then available to scripts.

Firefox: Set direction using the CTRL/CMD+SHIFT+X keyboard shortcut, which cycles through LTR and RTL. It does not set the value of the element's dir attribute, and is thus invisible to scripts.

Internet Explorer: Use CTRL+LEFT SHIFT for LTR and CTRL+RIGHT SHIFT for RTL. (These key combinations are also adopted for this purpose by most Microsoft products, e.g. Windows dialogs, Notepad and Word.) They set the value of the element's dir attribute, which is then available to scripts.
