Complex text layout
Complex text layout (abbreviated CTL), or complex text rendering, is the typesetting of writing systems (known as complex scripts) that require complex transformations between text input and text display in order to render properly on screen or on the printed page. In other words, for these scripts the way text is stored does not map straightforwardly to the way it is displayed. The term is used in the field of software internationalization.
CTL is a generalization of the concept of the ligature. For the Latin alphabet, ligatures are usually considered a marginal aesthetic concern, but there is no fundamental difference between the ligatures required for acceptable typesetting of the Arabic script and those needed to typeset a Latin cursive. Conversely, most characters of the Chinese script are compositional and could be considered ligatures, but they are usually encoded as individual characters, so many that typesetting requires an enormous typeface rather than sophisticated layout. An example of a contextual variant that is not considered a ligature is the Greek final sigma ς, the word-final variant of the usual σ shape. Unicode encodes the two variants separately, at U+03C2 and U+03C3 respectively. For collation and comparison purposes, however, software should probably treat the string "δῖος Ἀχιλλεύς." as equivalent to "δῖοσ Ἀχιλλεύσ.", even though Unicode does not define ς and σ as canonically or compatibility equivalent.
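The final-sigma comparison can be sketched in Python. This is a minimal illustration, not a full collation algorithm: the helper name `sigma_insensitive_key` is hypothetical, and it relies on the fact that Unicode case folding (exposed as `str.casefold`) maps ς to σ, whereas normalization does not.

```python
import unicodedata

FINAL_SIGMA = "\u03c2"  # ς GREEK SMALL LETTER FINAL SIGMA
SIGMA = "\u03c3"        # σ GREEK SMALL LETTER SIGMA


def sigma_insensitive_key(text: str) -> str:
    """Hypothetical comparison key: case-fold, then normalize to NFC.

    Case folding maps U+03C2 to U+03C3, so the two sigma shapes compare
    equal. It also removes case distinctions, which may or may not be
    what a given application wants.
    """
    return unicodedata.normalize("NFC", text.casefold())


a = "δῖος Ἀχιλλεύς."
b = "δῖοσ Ἀχιλλεύσ."

print(a == b)                                               # False
print(sigma_insensitive_key(a) == sigma_insensitive_key(b)) # True

# Normalization alone does not unify the two code points, because they
# are neither canonically nor compatibility equivalent.
print(unicodedata.normalize("NFKC", FINAL_SIGMA) == SIGMA)  # False
```

A production system would more likely use a locale-aware collation library; the key function above only shows that the equivalence has to be supplied by software rather than by the encoding.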
The main sources of complexity in CTL scripts are:
- Bidirectional text, in which characters may run either right to left or left to right.
- Context-sensitive shaping (ligatures), in which a character changes its shape depending on its position and/or the surrounding characters. For example, a character in the Arabic script can have as many as four different forms, depending on context.
- Ordering, in which the displayed order of the characters differs from their logical order. For example, in Devanagari, which is written from left to right, the grapheme for "short i" appears to the left of ("before") the preceding consonant: in कि ki, the ि -i should render on the left, with its bow reaching over the क k- to its right.
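The three characteristics above can be observed from Unicode character data alone. The following Python sketch uses only the standard `unicodedata` module. The function `naive_visual_order` is a hypothetical toy that handles a single vowel sign; it is not how a real shaping engine reorders text, and the Arabic presentation-form code points are used here only to name the four contextual shapes.

```python
import unicodedata

# 1. Bidirectional text: every character carries a bidi class that a
#    layout engine uses to decide run direction.
for ch in ("a", "\u05d0", "\u0628"):  # Latin a, Hebrew alef, Arabic beh
    print(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch))
# U+0061 L   (left-to-right)
# U+05D0 R   (right-to-left)
# U+0628 AL  (right-to-left, Arabic letter)

# 2. Context-sensitive shaping: text stores one code point for Arabic beh,
#    but it has four contextual shapes. Unicode also encodes the shapes as
#    compatibility characters, which all normalize back to the base letter.
BEH = "\u0628"
BEH_FORMS = ["\ufe8f", "\ufe90", "\ufe91", "\ufe92"]
for form in BEH_FORMS:
    assert unicodedata.normalize("NFKC", form) == BEH
    print(unicodedata.name(form))  # isolated, final, initial, medial form

# 3. Ordering: "ki" is stored as consonant followed by vowel sign, although
#    the vowel sign is drawn to the left of the consonant.
KA, VOWEL_SIGN_I = "\u0915", "\u093f"
ki = KA + VOWEL_SIGN_I
print([unicodedata.name(c) for c in ki])


def naive_visual_order(text: str) -> list[str]:
    """Toy reordering: move vowel sign I in front of the preceding character.

    Ignores conjuncts and every other reordering rule of the script.
    """
    out: list[str] = []
    for ch in text:
        if ch == VOWEL_SIGN_I and out:
            out.insert(len(out) - 1, ch)
        else:
            out.append(ch)
    return out


print(naive_visual_order(ki) == [VOWEL_SIGN_I, KA])  # True
```

In practice the full bidirectional algorithm, glyph selection and reordering are carried out by a layout engine together with the font, not by application code.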
Some CTL implementations do not encapsulate information about specific scripts. In these implementations, the script-specific CTL information resides within the font files, so they can in principle render any script.
Other CTL implementations encapsulate information about specific scripts. In these implementations, the script-specific CTL information is provided by the CTL implementation itself, so they can render only the scripts they already implement:
- International Components for Unicode (ICU)
- Pango provides text services to GTK+
- HarfBuzz is the OpenType layout engine used by Pango and Qt
- Uniscribe and its successor, DirectWrite
Note: historically, the Arabic alphabet is simply a cursive form of the Nabataean alphabet, with context-dependent letter shapes that became mandatory from ca. the 4th century AD.
See also 
- Writing systems which require complex text layout:
- Examples of complex rendering — SIL international's examples of complex writing systems around the world
- Complex Text Layout — The Open Group's Desktop Technologies
- Supporting Indic Scripts in Mozilla — also other CTL scripts
- Project SILA — Graphite and Mozilla integration project
- CTL Architecture in Solaris — Solaris Globalization Whitepapers
- Complex Scripts — Microsoft Global Development and Computing Portal
- Theppitak's Homepage — information about Thai language processing
- HarfBuzz' page at Freedesktop.org
- D-Type Unicode Text Module — Portable software library for complex text