“Out of the blue.”
I was sitting near the back of a crowded room, filled with many familiar faces, when the two-day conference proper of IUC42 (42nd Unicode & Internationalization Conference) began promptly at 9AM on September 11, 2018, with Mark Davis, President of Unicode, making the opening statements. When Mark announced the recipient of this year’s Unicode Bulldog Award, it took me by complete surprise when I heard my name called. Wow. What an absolute honor. In fact, I would claim that this is one of the biggest honors of my life, especially given that Unicode now transcends so many aspects of our society. Looking back at the 27 prior recipients of this award, almost all of whom I consider to be friends, I am definitely in good company.
Luckily for this blog’s readership, Rick McGowan managed to capture all of my embarrassing moments in a three-minute video. You can also see the tweets that were published by @unicode and @AdobeType.
For better or worse, the proverbial bar has been raised, in terms of others’ expectations of me. I shall therefore endeavor not to disappoint. #MaximumEffort
This week once again proved that one is never too old to learn something new.
My friends at Sandoll Communications (산돌커뮤니케이션) kindly informed me earlier this week that the official Korean Standards Association (한국표준협회/韓國標準協會) logo, U+327F ㉿ KOREAN STANDARD SYMBOL, which has been encoded in Unicode from the very beginning (Version 1.1), is generic, both in terms of typeface design and weight, and that there is an actual specification for its design. This character is included in Unicode because it was also included in the KS X 1001 (정보 교환용 부호(한글 및 한자)) standard at position 02-62. The very bottom of the specification page on the KSA website includes a link to a ZIP file that contains the image for the KSA logo in two forms: a 592×840-pixel JPEG image and an Adobe Illustrator vector image file.
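For readers who want to see how a KS X 1001 position such as 02-62 relates to bytes, here is a minimal sketch. It assumes the usual EUC-KR convention of adding 0xA0 to both the row and the cell. The function name is hypothetical, and the round trip relies on Python's built-in euc_kr codec.

```python
def ksx1001_position_to_euc_kr(row: int, cell: int) -> bytes:
    """Convert a KS X 1001 row-cell position to its two EUC-KR bytes.

    Assumption: the standard EUC-KR packing, in which each of the
    94 rows and 94 cells is offset by 0xA0.
    """
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must both be in the range 1..94")
    return bytes([0xA0 + row, 0xA0 + cell])


# Position 02-62 is where the KOREAN STANDARD SYMBOL lives.
encoded = ksx1001_position_to_euc_kr(2, 62)
print(encoded.hex())                          # a2de
print(encoded.decode("euc_kr") == "\u327f")   # True if the codec maps it to U+327F
```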
The Japanese (日本語) version is here
The feedback that we received from the previous article on this subject has been extraordinarily valuable. Our proposal to leave the names of Adobe-Japan1-7 subset fonts unchanged met with virtually unanimous agreement, but given the relatively minor nature of the Adobe-Japan1-7 additions, to the tune of only two glyphs, the same naming policy seems to benefit Adobe-Japan1-6 fonts as well.
In other words, fonts that currently support Adobe-Japan1-6 in its entirety can be updated to Adobe-Japan1-7 without changing their names. Of course, the advertised Supplement value as recorded in the 'CFF ' table should reflect 7. The following is a revised version of the table from the previous article on this subject:
Character Collection || /CIDFontName & Menu Name Examples
Adobe-Japan1-3 || KozMinStd-Regular, 小塚明朝 Std R; KozMinStdN-Regular, 小塚明朝 StdN R
Adobe-Japan1-4 || KozMinPro-Regular, 小塚明朝 Pro R; KozMinProN-Regular, 小塚明朝 ProN R
Adobe-Japan1-5 || KozMinPr5-Regular, 小塚明朝 Pr5 R; KozMinPr5N-Regular, 小塚明朝 Pr5N R
Adobe-Japan1-6 || KozMinPr6-Regular, 小塚明朝 Pr6 R; KozMinPr6N-Regular, 小塚明朝 Pr6N R
Adobe-Japan1-7 || KozMinPr6-Regular, 小塚明朝 Pr6 R; KozMinPr6N-Regular, 小塚明朝 Pr6N R
Of course, we still welcome any and all feedback about this font-naming issue.
Page 697 of the Unicode Version 11.0 Core Specification includes an IICore subsection that states the following about its 9,810 ideographs: “This coverage is of particular use on devices such as cell phones or PDAs, which have relatively stringent resource limitations.” Various iterations of L2/18-066 attempted to tweak the existing IICore set based on my recent five-part analysis. I was even given a UTC #156 Action Item to update L2/18-066 once again.
However, after realizing that the “resource limitations” from about 15 years ago, which was when the current IICore set was established, no longer apply to today’s devices, I opted to propose a completely new version of IICore that would be tagged with its year of vintage, 2020, which is also the year in which the earliest version of Unicode that includes it would be released. While I intend to reveal more details later, what I plan to propose will include a little over 20K ideographs, and will be discussed during UTC #157 and IRG #51 later this year.
Edited To Add: The proposal was posted to the UTC document register on 2018-09-04 as L2/18-279, and is also available in the IRG #51 document register as IRG N2334. If anyone has any formal feedback for the UTC to consider during UTC #157, please use Unicode’s Contact Form.
The Japanese (日本語) version is here
Per the previous article, the Adobe-Japan1-6 Character Collection specification will be updated to Adobe-Japan1-7 shortly after Japan’s new era name is announced. This article notes some of the changes that need to be considered as part of that update, and I am therefore soliciting feedback on the ideas that are presented below.
For OpenType Japanese fonts that already support Adobe-Japan1-6 in its entirety, meaning that all 23,058 glyphs are included, updating to Adobe-Japan1-7 is a relatively simple matter of adding two glyphs and their associated mappings, along with renaming the fonts to use an Adobe-Japan1-7 designator. Of course, not all fonts need to be updated to include the Adobe-Japan1-7 glyphs, and this article is meant to benefit Japanese font developers who plan to do so for some or all of their fonts.
(After realizing that the retargeting of Adobe-Japan1-7 to include only two glyphs, and with a fairly predictable release date range, exhibited characteristics of a pregnancy, I became inspired to write the text for the Adobe-Japan1-6 is Expecting! article while flying from SJC to ORD on the morning of 2018-07-20. I also prepared the article’s images while in-flight. The passenger sitting next to me was justifiably giving me funny looks. My flight to MSN, which was the final destination to attend my 35th high school class reunion in greater-metropolitan Mount Horeb, was delayed three hours, and this gave me an opportunity to publish the article while still on the ground at ORD.)
What do we know about Japan’s new era name? First and foremost, its announcement is unlikely to occur before 2019-02-25, because doing so would divert attention away from the 30th anniversary of the enthronement, 2019-02-24, but it may occur as late as 2019-05-01, which is the date on which the new era begins. That’s effectively a two-month window of uncertainty.
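The size of that window is easy to check. The following is only a quick sketch that uses the two boundary dates mentioned above, and the variable names are my own.

```python
from datetime import date

# Boundary dates taken from the paragraph above.
earliest_announcement = date(2019, 2, 25)  # day after the 30th anniversary
new_era_begins = date(2019, 5, 1)          # first day of the new era

window_days = (new_era_begins - earliest_announcement).days
print(window_days)  # 65, which is a little over two months
```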
Interestingly, the date 2019-05-01 takes place not only during UTC #159, which will be hosted by me at Adobe, but also during Japan’s Golden Week (ゴールデンウィーク), which may begin early to prepare for the imperial transition.
Dr. Ken Lunde, the attending physician at Adobe Regional Hospital, would like to share some Very Good News™ about Adobe-Japan1-6. According to a recent pregnancy test, Adobe-Japan1-6 is expecting, and Supplement 7—to be referred to as Adobe-Japan1-7—is now scheduled to be born sometime during the first half of next year. What’s more, the ultrasounds have revealed that near-identical twins are expected.
After a thorough investigation into this news-worthy matter, Dr. Lunde strongly suspects that Supplement 7 was originally conceived on December 1, 2017, and based on the current growth rate of the twins, as evidenced through ultrasounds, the expected delivery date will be on or around May 1, 2019, but definitely not before February 24, 2019.
The Korean (한국어) version is here
What began on 2017-06-23 when I visited Sandoll’s office in Seoul, which included discussions about developing a new Korean glyph set to replace Adobe-Korea1-2 (if you click the link, the PDF will download) that was last updated nearly 20 years ago, has culminated in the First Public Release of the new Adobe-KR-9 Character Collection. This glyph set went through four drafts (First Draft on 2017-10-01, Second Draft on 2017-12-19, Third Draft on 2018-01-18, and Fourth Draft on 2018-03-02), followed by a Beta Release on 2018-05-15 that included a complete set of data files, a complete set of representative glyphs, two fully-functional example OpenType/CFF fonts, and other collateral materials.
After a little over a year since the idea for this glyph set was born, I am pleased to announce that the First Public Release was issued today. For those who are curious about what changed between the Beta Release and the First Public Release, please reference the Changes Since Earlier Versions section of the specification. The Adobe-KR-9 CMap resources that correspond to the First Public Release are now available in the CMap Resources project. While visiting that project, be sure to download the bookmarked 1,990-page UTF-32.pdf file from the latest release that provides glyph tables for all UTF-32 CMap resources.
As a follow-up to my Ideographic Tally Marks article from over two years ago, the characters for two tally mark systems, ideographic (called 正の字 sei-no ji in Japanese, and 正字 zhèng zì in Chinese) and Western-style, are among the 684 new characters in Unicode Version 11.0, which was released exactly a week ago. These seven new characters can be found in the existing Counting Rod Numerals block from U+1D372 through U+1D378.
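To see the seven tally mark code points for yourself, a short sketch along these lines should work. It assumes a Python build whose unicodedata module is based on Unicode Version 11.0 or later; on older builds the names simply fall back to a placeholder string.

```python
import unicodedata

# Range of the tally mark characters in the Counting Rod Numerals block.
TALLY_FIRST, TALLY_LAST = 0x1D372, 0x1D378

tally_code_points = list(range(TALLY_FIRST, TALLY_LAST + 1))

for cp in tally_code_points:
    # The fallback string is used when this Python's Unicode data
    # predates the version that added these characters.
    name = unicodedata.name(chr(cp), "<name not available in this build>")
    print(f"U+{cp:04X}  {name}")

print(len(tally_code_points))  # 7
```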
Unicode Version 11.0 was released today, and—as usual—new CJK Unified Ideographs were added, albeit a very modest number. I used this opportunity to update my trusty single-page PDF that keeps track of the CJK Unified Ideographs and CJK Compatibility Ideographs in Unicode, and which provides additional details, such as version information, the number of remaining code points in each block, and so on.
Interestingly, Extension G (aka IRG Working Set 2015) is unlikely to be included in Unicode Version 12.0 (2019), given its accelerated schedule, so we’re looking at Version 13.0 (2020) for the official opening of Plane 3 (aka TIP or Tertiary Ideographic Plane).
Or, are you more interested in the new emoji that were added? 🤔
I am pleased to announce that the Adobe-KR-9 character collection, which went through four drafts, is now available as a Beta release that includes all of the expected collateral pieces, including two fully-functional OpenType fonts with all of its glyphs. The Adobe-KR-9 project includes the specification proper, along with most of the collateral pieces. The two OpenType fonts are available for convenient download on the latest release page.
The CMap resources are also available in the CMap Resources project, and an updated UTF-32.pdf file that includes a Unicode-based glyph synopsis for the Adobe-KR-9 character collection is available on the latest release page.
Japanese line layout is very complex, and the first attempt to standardize its rules and principles was the JIS X 4051 standard, which was first issued in 1993 with the title 日本語文書の行組版方法 (“Line Composition Rules for Japanese Documents” in English). A revision was issued in 1995, and the latest version was issued in 2004 with the slightly different title 日本語文書の組版方法 (“Formatting rules for Japanese documents”). Another important document is the W3C Working Group Note JLREQ (Requirements for Japanese Text Layout), which provides much of what is described in JIS X 4051, but covers additional areas, and is tailored toward web technologies. W3C is also preparing similar documents for Chinese and Korean, CLREQ (Requirements for Chinese Text Layout) and KLREQ (Requirements for Hangul Text Layout and Typography), respectively, although both are still considered working drafts.
This article is not about these standards per se, which are intended for apps and environments that implement sophisticated line layout. Rather, this article is about harsher “plain text” or comparable environments that generally do not need such treatment, yet still benefit from a modest amount of context-based spacing adjustment, particularly to get rid of unwanted space between full-width brackets and other punctuation whose glyphs generally fill half of the em-box. App menus, app dialogs, and simple text editors are examples of where such adjustments can improve text layout in these modest ways.
This is a brief article to let the readership know that the Unicode Consortium now offers lifetime memberships for individual members. My lifetime membership certificate is shown above.
The next UTC (Unicode Technical Committee) meeting—the 155th one—takes place during the week of April 30th, and will be hosted at the Adobe headquarters in San José, California. Of course, all voting members of the Unicode Consortium are strongly encouraged to attend.
The CMap resources that are associated with our public glyph sets—called character collections—were first open-sourced on 2009-09-21 via Adobe’s first open source portal, and about a year later the project was moved to SourceForge. I then migrated the project to GitHub on 2015-03-27 where it is likely to remain for the foreseeable future. The main purpose for open-sourcing our CMap resources was to make it easier for developers to include them in their own open source projects, many of which require that the components themselves be open source.
I then open-sourced three of our four character collections on GitHub—Adobe-GB1-5, Adobe-CNS1-7, and Adobe-Japan1-6—in October of last year. The Adobe-Korea1-2 character collection was intentionally not open-sourced, because it will soon be replaced by the Adobe-KR-9 character collection that is expected to be published in mid-May.
This article picks up where the 2018-01-18 article left off, and provides details about the fourth—and hopefully final—draft of the forthcoming Adobe-KR-9 character collection that was issued today.
The fourth draft of the Adobe-KR-9 character collection includes 22,860 glyphs (CIDs 0 through 22859) distributed among ten Supplements. When compared to the third draft, four glyphs were removed, only one glyph was added, a small number of glyphs were moved from Supplement 0 to later Supplements, and the ordering of Supplements 3 through 9 was changed. Because it is a draft, the details are still subject to change, though my hope is that this draft represents what will become the final character collection specification.
Part 1, Part 2, Part 3, and Part 4 of this series scrutinized the ideographs that are associated with each of the seven region tags of the kIICore property. In this fifth and final article of this series, I will provide some details about the earlier versions of IICore, and what changed between them.
In Part 1, Part 2, and Part 3 of this series, we examined and scrutinized the ideographs that are tagged “K” (for ROK or South Korea), “P” (for DPRK or North Korea), “J” (for Japan), and “G” (for PRC or China) in the kIICore property. In Part 4, which is today’s article, we will explore the ideographs that are tagged “T” (for ROC or Taiwan), “H” (for Hong Kong SAR), and “M” (for Macao SAR).
I’d like to use this opportunity to welcome the year of the dog, which is expressed using the CJK Unified Ideograph 戌 (U+620C), and to wish a Happy Chinese New Year to all of my friends, colleagues, and blog readers who are celebrating this holiday. May this year be safe, prosperous, and enjoyable.
In Part 1 and Part 2 of this series, we examined and scrutinized the ideographs that are tagged “K” (for ROK or South Korea), “P” (for DPRK or North Korea), and “J” (for Japan) in the kIICore property. In Part 3, which is today’s article, we will explore the 5,825 ideographs that are tagged “G” (for PRC or China).