Adobe-KR-9 Third Draft

This article picks up where the 2017-12-19 article left off, and provides details about the third draft of the forthcoming Adobe-KR-9 character collection that was issued today.

The third draft of the Adobe-KR-9 character collection includes 22,863 glyphs (CIDs 0 through 22862) distributed among ten Supplements. When compared to the second draft, three glyphs were removed, 254 glyphs were added, and the distribution of glyphs among some of the Supplements was changed. Because it is a draft, the details are still subject to change, though I suspect that any changes will be minimal at this point.

The table below details the number of glyphs per Supplement, their CID ranges, and a high-level summary of the glyphs in each:

| Supplement | Glyphs | CID Range | Scope |
|---|---|---|---|
| 0 | 3,044 | 0–3043 | Core glyphs |
| 1 | 1,581 | 3044–4624 | Supplementary modern hangul syllables |
| 2 | 6,814 | 4625–11438 | Tertiary modern hangul syllables |
| 3 | 4,620 | 11439–16058 | KS X 1001 hanja |
| 4 | 280 | 16059–16338 | Enclosed digits, Latin characters & hangul letters/syllables |
| 5 | 144 | 16339–16482 | Full-width Latin characters & vertical forms |
| 6 | 352 | 16483–16834 | KS X 1001 & Microsoft Code Page 949 compatibility |
| 7 | 404 | 16835–17238 | Latin, Greek, Cyrillic & Kana |
| 8 | 2,003 | 17239–19241 | Hangul tone marks, pre-composed hangul syllables for the Jeju dialect (제주말 jejumal) & combining jamo |
| 9 | 3,621 | 19242–22862 | Supplementary hanja |
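
The CID ranges in the table follow directly from the glyph counts, because each Supplement begins where the previous one ends. The following is a minimal Python sketch that recomputes them; the names `GLYPH_COUNTS` and `cid_ranges` are illustrative only and are not part of any Adobe-KR-9 data file:

```python
from itertools import accumulate

# Glyph counts for Supplements 0 through 9, copied from the table.
GLYPH_COUNTS = [3044, 1581, 6814, 4620, 280, 144, 352, 404, 2003, 3621]


def cid_ranges(counts):
    """Return one (first_cid, last_cid) tuple per Supplement."""
    ends = list(accumulate(counts))   # running totals: 3044, 4625, ...
    starts = [0] + ends[:-1]          # each Supplement starts at the previous total
    return [(start, end - 1) for start, end in zip(starts, ends)]


for supplement, (first, last) in enumerate(cid_ranges(GLYPH_COUNTS)):
    print(f"Supplement {supplement}: CIDs {first}-{last}")
```

The running total after Supplement 9 is 22,863, which matches the overall glyph count stated earlier.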

No actual glyphs are provided or shown. As with the first and second drafts, I have put together a data file that specifies, for each glyph, its CID, Unicode-based glyph name, Unicode code point or sequence, and the actual character or sequence with an optional character name. A future draft will include a glyph table to supplement the data file. The final published version will definitely include a glyph table that will use representative glyphs based on the open source Source Han Serif (본명조) Pan-CJK typeface.

I also prepared an updated mapping file that maps 506 additional code points to existing glyphs, 270 of which correspond to CJK Compatibility Ideographs. Added to this file for the third draft are mappings for 176 of the 214 Kangxi Radicals and for the 52 Halfwidth Hangul variants.
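
To make the two newly added groups concrete, the sketch below uses Python's standard `unicodedata` module to enumerate the Kangxi Radicals and the Halfwidth Hangul variants, and to show the compatibility mapping that Unicode itself records for each. It does not reproduce the mapping file: which 176 of the 214 radicals are mapped, and to which glyphs, is not derived here, and the helper name `compat_mapping` is hypothetical.

```python
import unicodedata

# Kangxi Radicals block: U+2F00 through U+2FD5.
kangxi = [chr(cp) for cp in range(0x2F00, 0x2FD5 + 1)]

# Halfwidth Hangul variants, selected by character name from the
# Halfwidth and Fullwidth Forms block (U+FF00 through U+FFEF).
halfwidth_hangul = [
    chr(cp)
    for cp in range(0xFF00, 0xFFEF + 1)
    if unicodedata.name(chr(cp), "").startswith("HALFWIDTH HANGUL")
]


def compat_mapping(ch):
    """Return (tag, target) from the one-step compatibility decomposition, or None."""
    fields = unicodedata.decomposition(ch).split()
    if not fields or not fields[0].startswith("<"):
        return None  # no decomposition, or a canonical one
    return fields[0], "".join(chr(int(field, 16)) for field in fields[1:])


print(len(kangxi), len(halfwidth_hangul))
print(compat_mapping("\u2F00"))  # KANGXI RADICAL ONE
print(compat_mapping("\uFFA1"))  # HALFWIDTH HANGUL LETTER KIYEOK
```

Mappings of this kind let a font support the additional code points by pointing them at glyphs it already contains, without adding new glyphs.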

The sections below provide some brief details about the scope and purpose of each of the ten tentative Supplements:

Supplement 0

Supplement 0 includes a very modest 3,044 glyphs. It is meant to include the core glyphs that should be present in modern Korean font resources, and therefore serves as a minimal glyph set for today’s Unicode-based environments. Of course, glyphs for the core set of 2,350 modern hangul syllables are included, along with glyphs for 418 additional high-frequency modern hangul syllables whose set was determined by KFA (Korea Font Association). Also supported are glyphs for nine additional modern hangul LV syllables, which enable input by ensuring that the LVT syllables built on them are not orphaned. Five of them (U+B894 뢔, U+C330 쌰, U+C3BC 쎼, U+C4D4 쓔, and U+CB2C 쬬) benefit the basic set of 2,350 syllables, and the other four (U+B060 끠, U+B7D0 럐, U+CB80 쮀, and U+D5AC 햬) benefit the 418 additional syllables. In other words, glyphs for 2,777 modern hangul syllables are included in this Supplement. 422 modern hangul syllables were moved from Supplement 1 to this Supplement in the third draft.
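
The relationship between an LVT syllable and the LV syllable it is built on can be computed from the code points alone, because modern hangul syllables are encoded in blocks of 28 (one LV syllable followed by its 27 LVT forms). The sketch below is only an illustration of that arithmetic, with hypothetical helper names; it is not the procedure that was used to pick the nine syllables.

```python
HANGUL_FIRST = 0xAC00  # first modern hangul syllable
HANGUL_LAST = 0xD7A3   # last modern hangul syllable
T_COUNT = 28           # 27 trailing consonants plus "no trailing consonant"


def is_lv(syllable):
    """True if the syllable has no trailing consonant (an LV syllable)."""
    cp = ord(syllable)
    return HANGUL_FIRST <= cp <= HANGUL_LAST and (cp - HANGUL_FIRST) % T_COUNT == 0


def lv_base(syllable):
    """Return the LV syllable that an LV or LVT syllable is built on."""
    cp = ord(syllable)
    if not HANGUL_FIRST <= cp <= HANGUL_LAST:
        raise ValueError("not a modern hangul syllable")
    return chr(cp - (cp - HANGUL_FIRST) % T_COUNT)


# The nine LV syllables listed above.
NINE_LV = [0xB894, 0xC330, 0xC3BC, 0xC4D4, 0xCB2C, 0xB060, 0xB7D0, 0xCB80, 0xD5AC]

print(all(is_lv(chr(cp)) for cp in NINE_LV))
print(f"U+{ord(lv_base(chr(0xD55C))):04X}")  # LV base of U+D55C
```

A glyph set is free of orphans in this sense when `lv_base(s)` is covered for every covered syllable `s`.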

Also included in this Supplement are glyphs for ASCII, some ISO Latin 1 (aka ISO/IEC 8859-1) characters, punctuation, and some symbols. Several of the glyphs, such as those for punctuation, include both Western and Korean forms, and the short-term intent is to use the OpenType 'locl' (Localized Forms) GSUB feature to switch between them. The long-term goal is to define Standardized Variation Sequences (SVSes) for them, as proposed in L2/18-013, which is expected to be discussed during UTC #154 next week.

Supplement 1

The second Supplement includes the glyphs for an additional 1,581 modern hangul syllables that come from the union of those in the KS X 1002 (ROK 🇰🇷), KPS 9566 (DPRK 🇰🇵), and GB 12052 (PRC 🇨🇳) standards, but exclude those that are already supported in Supplement 0. 1,561 of these glyphs correspond to KS X 1002, 11 are specific to KPS 9566 (U+AD98 궘, U+AF31 꼱, U+AFE5 꿥, U+B2FE 닾, U+B570 땰, U+B6CC 뛌, U+B745 띅, U+C836 젶, U+CA34 쨴, U+CD44 쵄 & U+D5D5 헕), and nine are specific to GB 12052 (U+AC03 갃, U+B609 똉, U+B9E7 맧, U+BBC3 믃, U+BF59 뽙, U+BFE5 뿥, U+C6D8 웘, U+CB94 쮔 & U+D63B 혻). 422 modern hangul syllables were moved from this Supplement to Supplement 0 in the third draft.

In other words, Supplements 0 and 1 together provide basic support for the three regions with Korean-speaking populations for which regional standards have been established, at least in terms of the glyphs for pre-composed modern hangul syllables.

Supplement 2

Supplement 2 simply includes the glyphs for the remaining 6,814 modern hangul syllables to form the complete set of 11,172 that have been in Unicode since Version 2.0.

(This Supplement is unchanged from the second draft.)
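
As a quick consistency check, the modern hangul counts for Supplements 0 through 2 should add up to the full set of 11,172, which is also the product of 19 leading consonants, 21 vowels, and 28 trailing states (27 consonants plus none). A minimal sketch, with illustrative variable names:

```python
# Modern hangul syllable counts per Supplement, as stated in the text.
MODERN_HANGUL = {0: 2777, 1: 1581, 2: 6814}

L_COUNT, V_COUNT, T_COUNT = 19, 21, 28    # T_COUNT includes "no trailing consonant"
by_structure = L_COUNT * V_COUNT * T_COUNT
by_code_points = 0xD7A3 - 0xAC00 + 1      # size of the encoded range
by_supplement = sum(MODERN_HANGUL.values())

print(by_structure, by_code_points, by_supplement)
```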

Supplement 3

Supplement 3 includes the glyphs for the 4,888 hanja (aka CJK Unified Ideographs) that are included in the KS X 1001 standard. The number of glyphs is actually 4,620, because 268 of the 4,888 hanja are genuine duplicates that are included due to multiple readings.

(This Supplement is unchanged from the second draft.)
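
The duplicate hanja can be inspected with Python's `unicodedata` module. The sketch assumes that the 268 duplicates are the CJK Compatibility Ideographs encoded at U+F900 through U+FA0B; that range is not stated in this article, although its size matches the count of 268. Each of those code points normalizes to a CJK Unified Ideograph, which is why a single glyph can serve both.

```python
import unicodedata

KS_X_1001_HANJA = 4888

# Assumption: the KS X 1001 duplicate hanja are U+F900 through U+FA0B.
FIRST_DUPLICATE, LAST_DUPLICATE = 0xF900, 0xFA0B

# Map each compatibility ideograph to the unified ideograph it normalizes to.
duplicates = {
    chr(cp): unicodedata.normalize("NFC", chr(cp))
    for cp in range(FIRST_DUPLICATE, LAST_DUPLICATE + 1)
}

unique_glyphs = KS_X_1001_HANJA - len(duplicates)
print(len(duplicates), unique_glyphs)

# Several readings can share one unified ideograph, so the number of
# distinct targets is smaller than the number of duplicates.
print(len(set(duplicates.values())))
```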

Supplement 4

The fifth Supplement includes 280 glyphs for enclosed digits, Latin characters, and hangul letters/syllables. The scope goes beyond what is found in the KS standards, and includes appropriate characters found in the Unicode blocks named Enclosed Alphanumerics, Dingbats, Enclosed CJK Letters and Months, and Enclosed Alphanumeric Supplement. Added in the third draft are glyphs for the 30 characters in the ranges U+3251 ㉑ through U+325F ㉟ and U+32B1 ㊱ through U+32BF ㊿.
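
The two ranges added in the third draft can be listed with a few lines of Python. This sketch only enumerates the code points and reads their numeric values from the Unicode Character Database; the name `ADDED_RANGES` is illustrative.

```python
import unicodedata

# The two ranges named above, as inclusive (first, last) pairs.
ADDED_RANGES = [(0x3251, 0x325F), (0x32B1, 0x32BF)]

added = [chr(cp) for first, last in ADDED_RANGES for cp in range(first, last + 1)]
values = [int(unicodedata.numeric(ch)) for ch in added]

print(len(added))
print(values[0], values[-1])
```

Together the two ranges cover the circled numbers 21 through 50 without a gap.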

Supplement 5

Supplement 5 includes glyphs for the full-width Latin characters and vertical forms. Added in the third draft are additional full-width brackets and their corresponding vertical forms. In addition, the glyphs for the two hangul tone marks, including their vertical forms, were moved to Supplement 8.

Supplement 6

This Supplement is meant to include glyphs for KS X 1001 and Microsoft Code Page 949 compatibility, for the benefit of font developers who feel that they need to support these particular standards in their entirety. Included are glyphs for math (only the basic math symbols are included in Supplement 0), line-drawing characters, and other symbols. Added in the third draft are six glyphs for Microsoft Code Page 949 compatibility.

Supplement 7

Supplement 7 is intended to include glyphs for foreign languages, such as those for extended Latin, Greek, Cyrillic, and Japanese kana. While most of the characters that are supported by these glyphs are in the KS X 1001 standard, I need to point out that this Supplement actually includes glyphs for characters outside of that standard, such as U+03C2 ς GREEK SMALL LETTER FINAL SIGMA for making Greek functional, and additional kana and kana-related characters, such as U+30FC ー KATAKANA-HIRAGANA PROLONGED SOUND MARK, which is necessary for katakana, along with appropriate vertical forms.

(This Supplement is unchanged from the second draft.)

Supplement 8

Supplement 8 includes the two hangul tone marks and their vertical forms, and is meant to include a small set of pre-composed pre-modern hangul syllables that fall outside the modern set of 11,172, and whose scope is well-defined. In contrast to the approach that was used for the Source Han and Noto CJK typeface designs, which involved cherry-picking the 500 most frequently-used pre-modern hangul syllables, I decided to include pre-composed forms of the 160 pre-modern hangul syllables that are necessary for the Jeju dialect (제주말 jejumal), along with an additional LV syllable (<U+1105,U+11A2> ᄅᆢ) that prevents the orphaning of an LVT one that includes it (<U+1105,U+11A2,U+11B8> ᄅᆢᆸ), for a total of 161 pre-modern hangul syllables. The rest of this Supplement includes the nominal forms of combining jamo, along with the combining forms themselves. Included in the latter are six sets of leading jamo, two sets of vowel jamo, and four sets of trailing jamo. Of course, this is modeled after what was done for the successful and broadly-deployed Source Han and Noto CJK typeface designs. The OpenType 'ljmo' (Leading Jamo Forms), 'vjmo' (Vowel Jamo Forms), and 'tjmo' (Trailing Jamo Forms) GSUB features are expected to be used. The glyphs for the two hangul tone marks, including their vertical forms, were moved from Supplement 5 to this Supplement, and a sixth set of leading jamo forms was added so that the corresponding 124 nominal forms can have their own glyphs.
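
One way to see why pre-modern syllables need font-level handling is to compare how Unicode normalization treats modern and pre-modern jamo sequences. In the sketch below, which uses only Python's standard `unicodedata` module, a modern L+V+T sequence composes into a single code point under NFC, while the Jeju sequences mentioned above remain sequences of jamo because Unicode has no pre-composed code points for them. The variable and function names are illustrative.

```python
import unicodedata

modern_lvt = "\u1105\u1161\u11B8"  # modern leading, vowel, and trailing jamo
jeju_lv = "\u1105\u11A2"           # the additional LV syllable named above
jeju_lvt = "\u1105\u11A2\u11B8"    # the LVT syllable that includes it


def nfc_length(sequence):
    """Number of code points that remain after NFC normalization."""
    return len(unicodedata.normalize("NFC", sequence))


for label, sequence in [("modern LVT", modern_lvt),
                        ("Jeju LV", jeju_lv),
                        ("Jeju LVT", jeju_lvt)]:
    print(label, nfc_length(sequence))
```

A font therefore has to render such a sequence either by substituting a pre-composed glyph for the whole sequence or by combining positional jamo forms selected through the GSUB features.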

The 2,003 glyphs in this Supplement include a modest subset of 1,838 glyphs for combining jamo that can represent a staggering 1,638,750 hangul syllables (11,875 LV plus 1,626,875 LVT sequences), with the 11,172 modern hangul syllables being a very tiny subset.
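
The totals above can be reproduced by counting conjoining jamo. The article gives only the totals; the breakdown in this sketch (125 leading, 95 vowel, and 137 trailing jamo, with the leading and vowel fillers counted) is an assumption, chosen because it matches both the 11,875 and 1,626,875 figures.

```python
# Inclusive code point ranges for conjoining jamo (assumed breakdown).
LEADING = [(0x1100, 0x115F), (0xA960, 0xA97C)]   # includes U+115F CHOSEONG FILLER
VOWEL = [(0x1160, 0x11A7), (0xD7B0, 0xD7C6)]     # includes U+1160 JUNGSEONG FILLER
TRAILING = [(0x11A8, 0x11FF), (0xD7CB, 0xD7FB)]


def count(ranges):
    """Total number of code points covered by inclusive ranges."""
    return sum(last - first + 1 for first, last in ranges)


leading, vowel, trailing = count(LEADING), count(VOWEL), count(TRAILING)
lv_sequences = leading * vowel
lvt_sequences = lv_sequences * trailing
total = lv_sequences + lvt_sequences

print(leading, vowel, trailing)
print(f"{lv_sequences:,} LV + {lvt_sequences:,} LVT = {total:,}")
print(f"modern share: {11172 / total:.2%}")
```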

Supplement 9

The tenth and final Supplement includes 3,621 glyphs for additional hanja beyond those in Supplement 3. Of course, glyphs for the 2,856 hanja in the KS X 1002 standard are included. The rest of the glyphs are for hanja found in the Korean Supreme Court’s list, 665 of which are encoded in the URO and Extensions A, B, E, and F. 18 are supported by the IVD (Ideographic Variation Database) via the recently-registered KRName IVD collection, and one outlier will be in Extension G with U+30726 as its tentative code point. Also included—and added in the third draft—are 81 additional hanja, 73 of which are from PRC’s GB 12052 standard, with the remaining eight coming from DPRK’s KPS 9566 standard. The ordering of the glyphs was changed in the third draft to reflect Unicode order, with those that correspond to KRName IVSes at the end.
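
The figures in this section can be tallied to confirm that they account for every glyph in the Supplement. This is a minimal sketch; the dictionary keys are informal labels, not official source names.

```python
# Hanja counts by source, as stated in the paragraph above.
SUPPLEMENT_9_SOURCES = {
    "KS X 1002": 2856,
    "Supreme Court list, encoded (URO and Extensions A, B, E, F)": 665,
    "Supreme Court list, KRName IVD collection": 18,
    "Supreme Court list, tentative Extension G": 1,
    "GB 12052": 73,
    "KPS 9566": 8,
}

supplement_9_total = sum(SUPPLEMENT_9_SOURCES.values())
print(supplement_9_total)
```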

In closing, I once again welcome any and all actionable feedback. While I don’t expect any glyphs to be removed at this point, a small number of glyphs may still be added, and the chance that the distribution of some glyphs will change among the Supplements—particularly those for modern hangul syllables in Supplements 0 and 1—is non-zero. Anyway, this third draft is currently under review by my friends at Sandoll Communications, along with the Korea Font Association (KFA), but anyone is welcome to provide feedback by submitting comments against this article.

The constructive comments and feedback received thus far—from Sandoll Communications, KFA, and my friend Jaemin Chung—have been extraordinarily helpful in preparing this third draft. 감사합니다!


