Author Archive: Dr. Ken Lunde

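The Adobe-KR-9 Character Collection: First Public Release (Korean version)

English (영어) here

When I visited Sandoll’s office in Seoul on 2017-06-23, discussions began about developing a new Korean glyph set to replace Adobe-Korea1-2 (if you click the link, the PDF will download), which was last updated nearly 20 years ago. Those discussions have culminated in the First Public Release of the new Adobe-KR-9 Character Collection. This glyph set went through four drafts: the First Draft on 2017-10-01, the Second Draft on 2017-12-19, the Third Draft on 2018-01-18, and the Fourth Draft on 2018-03-02. These were followed by a Beta Release on 2018-05-15 that included a complete set of data files, a complete set of representative glyphs, two fully-functional example OpenType/CFF fonts, and other collateral materials.

A little over a year after the idea for this glyph set was born, I am pleased to announce the First Public Release today. Those who are curious about what changed between the Beta Release and the First Public Release should refer to the Changes Since Earlier Versions section of the specification. The Adobe-KR-9 CMap resources that correspond to the First Public Release are now available in the CMap Resources project. While visiting that project, be sure to download the bookmarked 1,990-page UTF-32.pdf file from the latest release, which provides glyph tables for the UTF-32 CMap resources.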


Adobe-Japan1-6 Is Expecting!

Dr. Ken Lunde, the attending physician at Adobe Regional Hospital, would like to share some Very Good News™ about Adobe-Japan1-6. According to a recent pregnancy test, Adobe-Japan1-6 is expecting, and Supplement 7—to be referred to as Adobe-Japan1-7—is now scheduled to be born sometime during the first half of next year. What’s more, the ultrasounds have revealed that near-identical twins are expected.

After a thorough investigation into this newsworthy matter, Dr. Lunde strongly suspects that Supplement 7 was originally conceived on December 1, 2017. Based on the current growth rate of the twins, as evidenced through ultrasounds, the expected delivery date will be on or around May 1, 2019, but definitely not before February 24, 2019.
Continue reading…

The Adobe-KR-9 Character Collection—First Public Release

Korean (한국어) here

What began on 2017-06-23, when I visited Sandoll’s office in Seoul to discuss developing a new Korean glyph set to replace Adobe-Korea1-2 (if you click the link, the PDF will download), which was last updated nearly 20 years ago, has culminated in the First Public Release of the new Adobe-KR-9 Character Collection. This glyph set went through four drafts: the First Draft on 2017-10-01, the Second Draft on 2017-12-19, the Third Draft on 2018-01-18, and the Fourth Draft on 2018-03-02. These were followed by a Beta Release on 2018-05-15 that included a complete set of data files, a complete set of representative glyphs, two fully-functional example OpenType/CFF fonts, and other collateral materials.

A little over a year after the idea for this glyph set was born, I am pleased to announce that the First Public Release was issued today. For those who are curious about what changed between the Beta Release and the First Public Release, please reference the Changes Since Earlier Versions section of the specification. The Adobe-KR-9 CMap resources that correspond to the First Public Release are now available in the CMap Resources project. While visiting that project, be sure to download the bookmarked 1,990-page UTF-32.pdf file from the latest release, which provides glyph tables for all UTF-32 CMap resources.


“Tally Marks” OpenType-SVG Font

As a follow-up to my Ideographic Tally Marks article from over two years ago, the characters for two tally mark systems, ideographic (called 正の字 sei-no ji in Japanese, and 正字 zhèng zì in Chinese) and Western-style, are among the 684 new characters in Unicode Version 11.0, which was released exactly a week ago. These seven new characters can be found in the existing Counting Rod Numerals block, from U+1D372 through U+1D378.
Continue reading…

Unicode Version 11.0

Unicode Version 11.0 was released today, and—as usual—new CJK Unified Ideographs were added, albeit a very modest number. I used this opportunity to update my trusty single-page PDF that keeps track of the CJK Unified Ideographs and CJK Compatibility Ideographs in Unicode, and which provides additional details, such as version information, the number of remaining code points in each block, and so on.

Interestingly, Extension G (aka IRG Working Set 2015) is unlikely to be included in Unicode Version 12.0 (2019), given its accelerated schedule, so we’re looking at Version 13.0 (2020) for the official opening of Plane 3 (aka TIP or Tertiary Ideographic Plane).

Or, are you more interested in the new emoji that were added? 🤔


The Adobe-KR-9 Character Collection—Beta Release

I am pleased to announce that the Adobe-KR-9 character collection, which went through four drafts, is now available as a Beta release that includes all of the expected collateral pieces, including two fully-functional OpenType fonts that contain all of its glyphs. The Adobe-KR-9 project includes the specification proper, along with most of the collateral pieces. The two OpenType fonts are available for convenient download on the latest release page.

The CMap resources are also available in the CMap Resources project, and an updated UTF-32.pdf file that includes a Unicode-based glyph synopsis for the Adobe-KR-9 character collection is available on the latest release page.
Continue reading…

Contextual Spacing GPOS Features: ‘cspc’ & ‘vcsp’

Japanese line layout is very complex, and the first attempt to standardize its rules and principles was in the JIS X 4051 standard, which was first issued in 1993 with the title 日本語文書の行組版方法 (Line Composition Rules for Japanese Documents in English). There was a revision issued in 1995, and the latest version was issued in 2004 with the slightly different title 日本語文書の組版方法 (Formatting rules for Japanese documents). Another important document is the W3C Working Group Note JLREQ (Requirements for Japanese Text Layout), which provides much of what is described in JIS X 4051, but covers additional areas, and is tailored toward web technologies. Although still considered working drafts, W3C is also preparing similar documents for Chinese and Korean as CLREQ (Requirements for Chinese Text Layout) and KLREQ (Requirements for Hangul Text Layout and Typography), respectively.

This article is not about these standards per se, which are intended for apps and environments that implement sophisticated line layout. Rather, this article is about harsher “plain text” or comparable environments that generally do not need such treatment, yet still benefit from a modest amount of context-based spacing adjustment, particularly to get rid of unwanted space between full-width brackets and other punctuation whose glyphs generally fill half of the em-box. App menus, app dialogs, and simple text editors are examples of where such adjustments can improve text layout in these modest ways.
Continue reading…


This is a brief article to let the readership know that the Unicode Consortium now offers lifetime memberships for individual members. My lifetime membership certificate is shown above.
Continue reading…

UTC #155

The next UTC (Unicode Technical Committee) meeting—the 155th one—takes place during the week of April 30th, and will be hosted at the Adobe headquarters in San José, California. Of course, all voting members of the Unicode Consortium are strongly encouraged to attend.
Continue reading…

CMap Resources & Character Collections

The CMap resources that are associated with our public glyph sets—called character collections—were first open-sourced on 2009-09-21 via Adobe’s first open source portal, and about a year later the project was moved to SourceForge. I then migrated the project to GitHub on 2015-03-27 where it is likely to remain for the foreseeable future. The main purpose for open-sourcing our CMap resources was to make it easier for developers to include them in their own open source projects, many of which require that the components themselves be open source.

I then open-sourced three of our four character collections on GitHub—Adobe-GB1-5, Adobe-CNS1-7, and Adobe-Japan1-6—in October of last year. The Adobe-Korea1-2 character collection was intentionally not open-sourced, because it will soon be replaced by the Adobe-KR-9 character collection that is expected to be published in mid-May.
Continue reading…

Adobe-KR-9 Fourth Draft

This article picks up where the 2018-01-18 article left off, and provides details about the fourth—and hopefully final—draft of the forthcoming Adobe-KR-9 character collection that was issued today.

The fourth draft of the Adobe-KR-9 character collection includes 22,860 glyphs (CIDs 0 through 22859) distributed among ten Supplements. When compared to the third draft, four glyphs were removed, only one glyph was added, a small number of glyphs were moved from Supplement 0 to later Supplements, and the ordering of Supplements 3 through 9 was changed. Because it is a draft, the details are still subject to change, though my hope is that this draft represents what will become the final character collection specification.
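The counts above can be cross-checked with a little arithmetic. The following is only an illustrative sketch: it starts from the 22,863 glyphs reported for the third draft, applies the removals and additions described for the fourth draft, and derives the last CID on the assumption that CIDs are numbered contiguously from 0. The variable names are made up for this example.

```python
# Figures quoted in the third- and fourth-draft announcements.
THIRD_DRAFT_GLYPHS = 22_863
REMOVED_IN_FOURTH_DRAFT = 4
ADDED_IN_FOURTH_DRAFT = 1

# Net glyph count after applying the fourth-draft changes.
fourth_draft_glyphs = (
    THIRD_DRAFT_GLYPHS - REMOVED_IN_FOURTH_DRAFT + ADDED_IN_FOURTH_DRAFT
)

# CIDs start at 0, so the last CID is one less than the glyph count.
last_cid = fourth_draft_glyphs - 1

print(fourth_draft_glyphs, last_cid)  # 22860 22859
```

The result agrees with the 22,860 glyphs and the CID range of 0 through 22859 stated for the fourth draft.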
Continue reading…

Exploring IICore—Part 5

Part 1, Part 2, Part 3, and Part 4 of this series scrutinized the ideographs that are associated with each of the seven region tags of the kIICore property. In this fifth and final article of this series, I will provide some details about the earlier versions of IICore, and what changed between them.
Continue reading…

Exploring IICore—Part 4

In Part 1, Part 2, and Part 3 of this series, we examined and scrutinized the ideographs that are tagged “K” (for ROK or South Korea), “P” (for DPRK or North Korea), “J” (for Japan), and “G” (for PRC or China) in the kIICore property. In Part 4, which is today’s article, we will explore the ideographs that are tagged “T” (for ROC or Taiwan), “H” (for Hong Kong SAR), and “M” (for Macao SAR).
Continue reading…

Year of the Dog

I’d like to use this opportunity to welcome the year of the dog, which is expressed using the CJK Unified Ideograph 戌 (U+620C), and to wish a Happy Chinese New Year to all of my friends, colleagues, and blog readers who are celebrating this holiday. May this year be safe, prosperous, and enjoyable.
Continue reading…

Exploring IICore—Part 3

In Part 1 and Part 2 of this series, we examined and scrutinized the ideographs that are tagged “K” (for ROK or South Korea), “P” (for DPRK or North Korea), and “J” (for Japan) in the kIICore property. In Part 3, which is today’s article, we will explore the 5,825 ideographs that are tagged “G” (for PRC or China).
Continue reading…

Exploring IICore—Part 2

In Part 1 of this series, which is intended to scrutinize the 9,810 CJK Unified Ideographs that comprise IICore, we explored some of the oddities that related to ROK (aka South Korea). In Part 2 of this series, we will explore the ideographs that are tagged “P” and “J” for DPRK (aka North Korea) and Japan use, respectively.
Continue reading…

Exploring IICore—Part 1

Today’s article is the very first one that references IICore (International Ideographs Core), which is best described as a region-agnostic subset that includes the most commonly used CJK Unified Ideographs in Unicode, and is intended for use in memory-challenged devices and environments. Included are 9,810 ideographs, the bulk of which are in the URO (9,706), with the remaining ones in Extensions A (42) and B (62).

IICore is instantiated as the kIICore property of the Unihan Database, and documented in UAX #38. The kIICore property values consist of an initial letter—A, B, or C—that indicates priority, followed by one or more letters that specify a source that more or less corresponds to a region: G, H, J, K, M, P (short for KP), and T.
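To make the value format concrete, here is a minimal sketch of a parser for kIICore property values as described above: one priority letter (A, B, or C) followed by one or more source letters. The function name, the tuple type, and the sample value are hypothetical and chosen only for illustration; UAX #38 remains the authoritative definition of the syntax. The sketch also confirms that the per-block counts add up to 9,810.

```python
from typing import NamedTuple, Tuple

PRIORITIES = frozenset("ABC")    # initial letter: priority
SOURCES = frozenset("GHJKMPT")   # following letters: sources (regions)


class IICoreValue(NamedTuple):
    priority: str
    sources: Tuple[str, ...]


def parse_kiicore(value: str) -> IICoreValue:
    """Split a kIICore value into its priority and source letters."""
    if len(value) < 2:
        raise ValueError("expected a priority letter plus at least one source")
    priority, sources = value[0], value[1:]
    if priority not in PRIORITIES:
        raise ValueError(f"unknown priority letter: {priority!r}")
    unknown = [s for s in sources if s not in SOURCES]
    if unknown:
        raise ValueError(f"unknown source letter(s): {unknown}")
    return IICoreValue(priority, tuple(sources))


# Hypothetical sample value: priority A, tagged for all seven sources.
parsed = parse_kiicore("AGTJHKMP")
print(parsed.priority, parsed.sources)

# Block breakdown quoted above: URO + Extension A + Extension B.
iicore_total = 9_706 + 42 + 62
print(iicore_total)  # 9810
```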
Continue reading…

Unihan & Moji Jōhō Kiban Project: The Tip of the Iceberg

As evidenced by the very last paragraph of IRG N1964 (aka L2/13-192), which was discussed during IRG #41 that took place in Tōkyō, Japan at the end of 2013, I have been curious as to why many ideographs that are commonly used in Japan lack a UAX #38 kIRG_JSource property value. As suggested by this recent tweet, I have been thinking about this again…
Continue reading…

Standardized Variation Sequences—Part 1

This is a brief article to report that the 16 SVSes (Standardized Variation Sequences) that I proposed in L2/17-436 were accepted for Unicode Version 12.0 during UTC #154 this week. They cover eight full-width punctuation characters: U+3001 、 IDEOGRAPHIC COMMA, U+3002 。 IDEOGRAPHIC FULL STOP, U+FF01 ! FULLWIDTH EXCLAMATION MARK, U+FF0C , FULLWIDTH COMMA, U+FF0E . FULLWIDTH FULL STOP, U+FF1A : FULLWIDTH COLON, U+FF1B ; FULLWIDTH SEMICOLON, and U+FF1F ? FULLWIDTH QUESTION MARK. After reading the Script Ad Hoc group’s comments, I prepared a revised version (L2/17-436R) that responded to the two comments with additional information, including the table that is shown above, and this served as the basis for the discussions.
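For readers unfamiliar with how an SVS is formed, the sketch below builds 16 sequences by pairing each of the eight base characters with two variation selectors. It rests on two assumptions that the text above does not state: that each base character gets exactly two sequences, and that the selectors are VS1 (U+FE00) and VS2 (U+FE01). It also takes the eighth base character to be U+FF0E FULLWIDTH FULL STOP. Treat it as an illustration of the mechanism and not as a copy of the proposal's data.

```python
# The eight full-width punctuation base characters.
BASES = [0x3001, 0x3002, 0xFF01, 0xFF0C, 0xFF0E, 0xFF1A, 0xFF1B, 0xFF1F]

# Assumption: two variation selectors per base character (VS1 and VS2).
SELECTORS = [0xFE00, 0xFE01]

# An SVS is the base character immediately followed by a variation selector.
svs_list = [chr(base) + chr(vs) for base in BASES for vs in SELECTORS]

for seq in svs_list:
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in seq)
    print(code_points)

print(len(svs_list))  # 16
```

A renderer that does not support a given sequence simply ignores the variation selector and displays the base character in its default form.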

This all began with a proposal that I submitted four years ago, L2/14-006, which was resurrected as L2/17-056 and finally discussed during UTC #153, where I received constructive feedback. This prompted me to split the proposal into two parts. The first part proposed the less-controversial SVSes, which are the ones that were accepted. The second part, L2/18-013, proposes the more controversial ones. I am fully expecting to revise the second part before it is discussed during UTC #155, which begins on 2018-04-30.

I would like to use this opportunity to solicit comments and feedback for L2/18-013, which would be taken into account when I revise it. (I also hope to receive feedback from the Script Ad Hoc group prior to UTC #155, which would also be taken into account.)

In closing, the 16 new SVSes should soon appear in The Pipeline.


Adobe-KR-9 Third Draft

This article picks up where the 2017-12-19 article left off, and provides details about the third draft of the forthcoming Adobe-KR-9 character collection that was issued today.

The third draft of the Adobe-KR-9 character collection includes 22,863 glyphs (CIDs 0 through 22862) distributed among ten Supplements. When compared to the second draft, three glyphs were removed, 254 glyphs were added, and the distribution of glyphs among some of the Supplements was changed. Because it is a draft, the details are still subject to change, though I suspect that any changes will be minimal at this point.
Continue reading…