Posts in Category "Standards"

Standardized Variants—Part 4

As I described in Part 1, Part 2, and Part 3 of this series, Standardized Variants offer a Normalization-proof representation for the 1,002 CJK Compatibility Ideographs, which are encoded in the BMP and at the end of Plane 2. These 1,002 Standardized Variants have been approved, and will be included in Unicode Version 6.3. They will, of course, also be included in ISO/IEC 10646.

In an effort to provide to font developers advance support for the Standardized Variants that correspond to glyphs in Adobe’s public ROSes, the next version of AFDKO will include a new version of the Adobe-Japan1_sequences.txt file that appends entries that correspond to 89 of these Standardized Variants, along with Adobe-CNS1_sequences.txt and Adobe-Korea1_sequences.txt files that specify 14 and 270 entries, respectively, that correspond to these Standardized Variants. If you click on the file names, you can download the files and use them immediately. These are used with the AFDKO makeotf tool, and specified as the argument of the “-ci” command-line option.

Baby steps…

UTR #50 Released!

The Unicode Consortium announced the release of UTR #50, Unicode Vertical Text Layout, today, via Twitter and their blog. Although I was involved in this Unicode Technical Report to some extent, any congratulatory comments should be directed toward its original and current editors, Eric Muller and Koji ISHII (石井宏治), respectively.

A Tale of Three (OpenType) Features

In an effort to make sure that the infrastructure to support UTR #50 (Unicode Vertical Text Layout) will be in place sooner rather than later, I spent a significant part of last week working with key people within Adobe, and at Microsoft and W3C, to put together a proposal for a new OpenType feature, to be tagged ‘vrtr’, for supporting this soon-to-be-published standard. Below is the full description that we came up with, which was submitted for inclusion in the OpenType Specification and in OFF (ISO/IEC 14496-22 or Open Font Format):

Tag: ‘vrtr’

Friendly name: Vertical Alternates For Rotation

Registered by: Adobe/Microsoft/W3C

Function: Transforms default glyphs into glyphs that are appropriate for sideways presentation in vertical writing mode. While the glyphs for most characters in East Asian writing systems remain upright when set in vertical writing mode, glyphs for other characters—such as those of other scripts or for particular Western-style punctuation—are expected to be presented sideways in vertical writing.

Example: As a first example, the glyphs for FULLWIDTH LESS-THAN SIGN (U+FF1C; “<”) and FULLWIDTH GREATER-THAN SIGN (U+FF1E; “>”) in a font with a non-square em-box are transformed into glyphs whose aspect ratio differs from the default glyphs, which are properly sized for sideways presentation in vertical writing mode. As a second example, the glyph for LEFT SQUARE BRACKET (U+005B; “[”) in a brush-script font that exhibits slightly rising horizontal strokes may use an obtuse angle for its upper-left corner when in horizontal writing mode, but an alternate glyph with an acute angle for that corner is supplied for vertical writing mode.

Recommended implementation: The font includes versions of the glyphs covered by this feature that, when rotated 90 degrees clockwise by the layout engine for sideways presentation in vertical writing, differ in some visual way from rotated versions of the default glyphs, such as by shifting or shape. The vrtr feature maps the default glyphs to the corresponding to-be-rotated glyphs (GSUB lookup type 1).

Application interface: For GIDs found in the vrtr coverage table, the layout engine passes GIDs to the feature, then gets back new GIDs.

UI suggestion: This feature should be active by default for sideways runs in vertical writing mode.

Script/language sensitivity: Applies to any script when set in vertical writing mode.

Feature interaction: The vrtr and vert features are intended to be used in conjunction: vrtr for glyphs intended to be presented sideways in vertical writing, and vert for glyphs to be presented upright. Since they must never be activated simultaneously for a given glyph, there should be no interaction between the two features. These features are intended for layout engines that graphically rotate glyphs for sideways runs in vertical writing mode, such as those conforming to UTR#50. (Layout engines that instead depend on the font to supply pre-rotated glyphs for all sideways glyphs should use the vrt2 feature in lieu of vrtr and vert.) Because vrt2 supplies pre-rotated glyphs, the vrtr feature should never be used with vrt2, but may be used in addition to any other feature.
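The division of labor between vert and vrtr described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the GIDs, the two substitution tables, and the `shape_vertical` function are all hypothetical, and a real layout engine would read the single-substitution lookups (GSUB lookup type 1) from the font itself.

```python
from typing import Dict, List, Tuple

# Hypothetical single-substitution tables: default GID -> alternate GID.
# Real values would come from the font's 'vert' and 'vrtr' GSUB lookups.
VERT: Dict[int, int] = {100: 900}  # alternates for glyphs presented upright
VRTR: Dict[int, int] = {200: 950}  # alternates for glyphs rotated by the engine


def shape_vertical(run: List[Tuple[int, bool]]) -> List[Tuple[int, bool]]:
    """Apply exactly one feature per glyph: vrtr if sideways, vert if upright."""
    shaped = []
    for gid, sideways in run:
        table = VRTR if sideways else VERT
        # GIDs that are not covered by the feature pass through unchanged.
        shaped.append((table.get(gid, gid), sideways))
    return shaped


# Each item is (GID, is_sideways), as decided by the layout engine.
result = shape_vertical([(100, False), (200, True), (200, False), (300, True)])
print(result)
```

Because the orientation of each glyph selects a single table, the two features are never applied to the same glyph, which is the non-interaction that the feature description calls for.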

Continue reading…

Some initial Adobe-Japan1-6 versus UTR #50 thoughts…

UTC (Unicode Technical Committee) Meeting #136 took place last week, and one of the significant outcomes was that UTR (Unicode Technical Report) #50 was advanced from Draft to Approved status. Congratulations to Koji ISHII (石井宏治), its editor, and also to Eric Muller, who took the initiative to start this project and served as its first editor.
Continue reading…

Font Development Via Unicode

Unicode has become the de facto way in which to represent text in digital form, and for good reason: its character set covers the vast majority of the world’s scripts. Other benefits of Unicode include the following:

  • That it is under active and continuous development, meaning that with each new version, more scripts are being supported, and additional characters for existing scripts are being standardized.
  • That it is aligned and kept in sync with ISO/IEC 10646 (available at no charge), which is quite a feat.

With regard to font development, Unicode is the default encoding for OpenType fonts, and is specified in the ‘cmap’ table. The most common ‘cmap’ subtables are Format 4 (BMP-only UTF-16) and Format 12 (UTF-32). The latter is needed only when the font maps code points outside of the BMP (Basic Multilingual Plane), meaning from one or more of the 16 Supplementary Planes.
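As a rough illustration of the Format 4 versus Format 12 distinction, the following Python sketch checks whether a set of mapped code points stays within the BMP. The function names are hypothetical and are not part of any font tool; the sketch only restates the rule given above.

```python
from typing import Iterable

BMP_MAX = 0xFFFF  # last code point of the Basic Multilingual Plane


def needs_format_12(code_points: Iterable[int]) -> bool:
    """True if any mapping lies in a Supplementary Plane (above U+FFFF)."""
    return any(cp > BMP_MAX for cp in code_points)


def utf16_code_units(cp: int) -> int:
    """Number of 16-bit units needed to represent one code point in UTF-16."""
    return len(chr(cp).encode("utf-16-be")) // 2


bmp_only = [0x0041, 0x4E00, 0xFA10]   # all within the BMP
with_plane_2 = bmp_only + [0x20000]   # adds a CJK Extension B code point

print(needs_format_12(bmp_only), needs_format_12(with_plane_2))
print(utf16_code_units(0x4E00), utf16_code_units(0x20000))
```

A Supplementary Plane code point takes two UTF-16 code units (a surrogate pair), which is why a BMP-only subtable cannot express such a mapping.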
Continue reading…

About the CSS Orientation Test OpenType Fonts

[This Japanese version of the May 31, 2013 article entitled CSS Orientation Test OpenType Fonts is courtesy of Hitomi Kudo (工藤仁美).]

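I am pleased to announce that on May 31 we released CSS Orientation Test OpenType Fonts, a new Adobe open source project. This open source project includes two OpenType/CFF fonts that were developed at the request of Koji Ishii (石井宏治), the editor of Unicode’s forthcoming UTR #50 (Unicode Vertical Text Layout). The purpose of these fonts is to make it easier for font developers to test glyph orientation.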
Continue reading…

CSS Orientation Test OpenType Fonts

I am pleased to announce that the new CSS Orientation Test OpenType Fonts open source project was launched on Adobe’s open-source portal, Open@Adobe, today. This open source project consists of two OpenType/CFF fonts that were developed at the request of Koji Ishii (石井宏治), the editor of Unicode’s forthcoming UTR #50 (Unicode Vertical Text Layout). The purpose of these fonts is for developers to be able to more easily test whether glyph orientation in their implementation is correct or not.
Continue reading…

Heisei “StdN” Fonts

We recently released alternate versions of two Heisei (平成) fonts, specifically Heisei Mincho StdN W3 (平成明朝 StdN W3) and Heisei Kaku Gothic StdN W5 (平成角ゴシック StdN W5). As the “StdN” designator suggests, JIS2004 glyphs are the default for these two fonts (the Heisei “Std” fonts use JIS90 glyphs by default).

These two fonts also differ from the Heisei “Std” fonts in that they include significantly more glyphs. The Heisei fonts were developed by a consortium of companies, and Adobe is one of the member companies. Interestingly, JIS X 0213:2004 glyph data was developed only for Heisei Mincho W3 and Heisei Kaku Gothic W5, and JIS X 0212-1990 glyph data was developed only for the former font. So, one of my projects last year was to map as many of these glyphs as possible to Adobe-Japan1-6 CIDs.
Continue reading…

Sequences

Sequences are important in the context of Unicode, and UAX #34 (Unicode Named Character Sequences) is a good reference for Unicode sequences. The first type of sequence that typically comes to mind in the context of Japanese is the Ideographic Variation Sequence (IVS). IVSes are registered and maintained by The Unicode Consortium via the Ideographic Variation Database (IVD). There are also Standardized Variation Sequences that are much more closely bound to the standard.
Continue reading…

Standardized Variants—Part 3

I will close this particular topic by detailing how to support these proposed standardized variants in OpenType/CFF fonts.

For fonts that are currently IVS-enabled, such as those that include Format 14 ‘cmap’ subtables with Adobe-Japan1 or Hanyo-Denshi IVSes, it is important to note that the proposed standardized variants can co-exist with them, at least in terms of being specified in the font. For the former, I created an Adobe-Japan1_sequences.txt file that includes all registered Adobe-Japan1 IVSes, along with 89 of the 1,002 proposed standardized variants. The 89 standardized variants are at the end of the file. AFDKO tools, such as makeotf and spot, already support these standardized variants. When building OpenType/CFF fonts using the makeotf tool, this file is specified as the argument of the “-ci” command-line option.
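To get a feel for what such a sequences file contains, here is a small Python sketch that parses entries and counts them per collection. It assumes the semicolon-delimited layout used by the IVD's sequence listings (base and selector in hexadecimal, then a collection name, then a CID). The sample lines, the `Standardized_Variants` label, and all CID values are placeholders for illustration and are not copied from the actual file.

```python
from collections import Counter
from typing import List, Tuple

# Placeholder data: the layout is an assumption, the CIDs are made up.
SAMPLE = """\
# base selector; collection; CID
3402 E0100; Adobe-Japan1; CID+1
3402 E0101; Adobe-Japan1; CID+2
585A FE00; Standardized_Variants; CID+3
"""

Entry = Tuple[int, int, str, int]  # (base, selector, collection, CID)


def parse_sequences(text: str) -> List[Entry]:
    """Parse 'BASE SELECTOR; collection; CID+n' lines, skipping comments."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        sequence, collection, cid = (field.strip() for field in line.split(";"))
        base, selector = (int(value, 16) for value in sequence.split())
        entries.append((base, selector, collection, int(cid.split("+", 1)[1])))
    return entries


entries = parse_sequences(SAMPLE)
counts = Counter(collection for _, _, collection, _ in entries)
print(dict(counts))
```

Run against a real file, a count like this would be one quick way to confirm how many entries each collection contributes.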
Continue reading…

Standardized Variants—Part 2

To continue from the December 26, 2012 article, I should first point out that there is a relationship between these 1,002 proposed standardized variants and IVSes (Ideographic Variation Sequences). Standardized variants are standardized, hence their name. IVSes, on the other hand, are registered via a process that is described in UTS #37 and administered by the IVD Registrar (which happens to be me at the moment).
Continue reading…

Standardized Variants—Part 1

One problem that has been plaguing CJK Compatibility Ideographs is the fact that they are adversely affected by normalization. Regardless of which of the four normalization forms is applied (NFC, NFD, NFKC, or NFKD), they are converted to their canonical equivalents, which are CJK Unified Ideographs. This is a problem, particularly for Japan, because 75 kanji in JIS X 0213:2004 map to CJK Compatibility Ideograph code points. Furthermore, 57 of these 75 kanji correspond to Jinmei-yō Kanji (人名用漢字), meaning that they are used for personal names. The bottom-line problem with CJK Compatibility Ideographs is that any application of normalization, by any process, will permanently remove any distinctions between a CJK Compatibility Ideograph and its canonical equivalent. Not all processes are under one’s direct control, meaning that it is impossible to guarantee that normalization will not be applied. My opinion is that it is prudent to assume that normalization will be applied, and that preemption is the best solution.
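The effect is easy to reproduce with Python's standard `unicodedata` module. The sketch below uses U+FA10, a CJK Compatibility Ideograph whose canonical equivalent is U+585A, and contrasts it with the unified ideograph followed by a variation selector (U+FE00). The pairing of U+585A with U+FE00 is shown only as an example of a base-plus-selector sequence; whether a given sequence is defined is determined by the standard's data files, not by this snippet.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

compat = "\uFA10"               # CJK COMPATIBILITY IDEOGRAPH-FA10
unified = "\u585A"              # its canonical equivalent, a CJK Unified Ideograph
sequence = unified + "\uFE00"   # unified ideograph + VARIATION SELECTOR-1

# Every normalization form replaces the compatibility ideograph.
lost = {form: unicodedata.normalize(form, compat) == unified for form in FORMS}

# A base character followed by a variation selector passes through unchanged.
kept = {form: unicodedata.normalize(form, sequence) == sequence for form in FORMS}

print(lost)
print(kept)
```

The distinction carried by the variation selector survives all four forms, which is what makes a sequence-based representation a form of preemption.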
Continue reading…

Old Hangul—Redux

In the December 4, 2012 Old Hangul article I mentioned that the ‘ccmp’ GSUB feature that is referenced in Microsoft’s Developing OpenType Fonts for Korean Hangul Script document is not necessary. Jaemin Chung kindly pointed out to me that environments that do not yet support Unicode Version 5.2 still require the ‘ccmp’ (Glyph Composition/Decomposition) GSUB feature to be present, otherwise proper shaping will not happen.

The main purpose of this short article is to provide a revised Perl script, named mkoldhangul-ccmp.pl, that adds a complete ‘ccmp’ GSUB feature definition for environments that do not yet support Unicode Version 5.2 (or greater). The sample glyph-map.txt datafile that maps the Unicode-based glyph names to CIDs is unchanged.

Old Hangul

Okay. It is time to put some “K” into CJK…

Seriously, much of the content of this blog has been focused on Chinese and Japanese issues. This article will provide some much-deserved Korean content.

I spent the last few days coming to grips with Old Hangul (옛한글 yethangeul), specifically how to implement proper shaping using the three registered OpenType GSUB features, ‘ljmo’ (Leading Jamo Forms), ‘vjmo’ (Vowel Jamo Forms), and ‘tjmo’ (Trailing Jamo Forms).
Continue reading…

These are a few of my favorite things…

I like ASCII. Do I like ASCII because of all the wonderful things one can do with its extraordinarily large repertoire of 94 printable characters? Actually, yes. Before I defend that answer, I’d like to point out that ASCII has three important strengths: simplicity, robustness, and ubiquity. In other words, ASCII is simple in that it has a relatively small number of characters; it forms a subset of virtually every encoding, Unicode or otherwise; and it is supported everywhere. In fact, ASCII can be used to represent Unicode through the use of notations. Richard Ishida’s Unicode Code Converter is an excellent way to explore the various notations that are currently in use.
Continue reading…

Adobe’s First Open Source Font: Kenten Generic

Hoping not to detract from the attention that Paul Hunt’s Source Sans Pro, Adobe’s first open source typeface family, deserves, I’d like to use this opportunity to point out that another font, a single typeface design with a very small number of glyphs, was Adobe’s first entry in the open source world, in terms of font offerings. Kenten Generic was released on November 4th, 2010 at the Open @ Adobe portal. It includes only thirteen glyphs, ten of which are functional, that are intended for use in typesetting emphasis marks, which are referred to as kenten (圏点) in Japanese, hence the font’s name. The easiest way to view its glyphs is to download its Unicode-based glyph synopsis.
Continue reading…

CFR Support in Mac OS X Version 10.8 (Mountain Lion)

On July 25, 2012, Apple released to the world Mac OS X Version 10.8 (aka Mountain Lion). Among the many new features in this latest iteration of Mac OS X is support for CFR objects. For those who are not aware, CFR objects are based on ISO/IEC 14496-28:2012 (Composite Font Representation), and are used to define both composite fonts and fallback fonts. CFR objects effectively break the 64K glyph barrier. Mac OS X Version 10.8 is thus the first implementation that has broken the 64K glyph barrier.
Continue reading…

ISO/IEC 10646:2012 Published!

ISO/IEC 10646:2012 (Third Edition) was just published. This is the first version of the standard that includes multiple-column Code Charts for Extension B, and for CJK Compatibility Ideographs. Another significant aspect of ISO/IEC 10646:2012 is that it is equivalent to Unicode Version 6.1.

For Adobe, the publishing of this new version of the standard represents a significant milestone, because it means that every Adobe-Japan1-6 kanji is either directly encoded, or is directly associated with a registered IVS in the IVD (Ideographic Variation Database).

Speaking of Unicode Version 6.1, the printed version of the Core Specification is available via POD from Lulu, and at a very attractive price.

ISO/IEC 14496-28:2012 Published

Born from the conclusion that OpenType’s 64K glyph barrier cannot be broken in the context of the format itself, ISO/IEC 14496-28:2012 (Composite Font Representation) was developed, and was subsequently published three days ago, on April 17, 2012, as a new ISO standard. As described in the January 26, 2012 CJK Type Blog article, CID-keyed fonts can include a maximum of 65,535 glyphs (CIDs 0 through 65534). Considering that Unicode Version 6.1 includes over 100K characters, approximately 75K of which are CJK Unified Ideographs, it becomes immediately apparent that a single font resource cannot support all of Unicode, let alone all of the characters for a single script (referring to CJK Unified Ideographs).
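The arithmetic behind that conclusion can be written out in a short Python sketch. It uses the rounded figures quoted above (100K and 75K) as lower-bound inputs, assumes one glyph per character, and ignores the extra glyphs a real font needs, so the results are minimums rather than a sizing recommendation. The function name is hypothetical.

```python
MAX_GLYPHS_PER_FONT = 65_535  # CIDs 0 through 65534


def font_resources_needed(glyph_count: int) -> int:
    """Minimum number of font resources, assuming one glyph per character."""
    return -(-glyph_count // MAX_GLYPHS_PER_FONT)  # ceiling division


all_of_unicode = font_resources_needed(100_000)      # "over 100K characters"
cjk_unified_only = font_resources_needed(75_000)     # "approximately 75K"

print(all_of_unicode, cjk_unified_only)
```

Even the CJK Unified Ideographs alone need at least two font resources, which is the gap that a composite font is meant to bridge.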
Continue reading…