Exploring IICore—Part 5

Part 1, Part 2, Part 3, and Part 4 of this series scrutinized the ideographs that are associated with each of the seven region tags of the kIICore property. In this fifth and final article of this series, I will provide some details about the earlier versions of IICore, and what changed between them.
Continue reading…

Exploring IICore—Part 4

In Part 1, Part 2, and Part 3 of this series, we examined and scrutinized the ideographs that are tagged “K” (for ROK or South Korea), “P” (for DPRK or North Korea), “J” (for Japan), and “G” (for PRC or China) in the kIICore property. In Part 4, which is today’s article, we will explore the ideographs that are tagged “T” (for ROC or Taiwan), “H” (for Hong Kong SAR), and “M” (for Macao SAR).
Continue reading…

Year of the Dog

I’d like to use this opportunity to welcome the year of the dog, which is expressed using the CJK Unified Ideograph 戌 (U+620C), and to wish a Happy Chinese New Year to all of my friends, colleagues, and blog readers who are celebrating this holiday. May this year be safe, prosperous, and enjoyable.
Continue reading…

Exploring IICore—Part 3

In Part 1 and Part 2 of this series, we examined and scrutinized the ideographs that are tagged “K” (for ROK or South Korea), “P” (for DPRK or North Korea), and “J” (for Japan) in the kIICore property. In Part 3, which is today’s article, we will explore the 5,825 ideographs that are tagged “G” (for PRC or China).
Continue reading…

Exploring IICore—Part 2

In Part 1 of this series, which is intended to scrutinize the 9,810 CJK Unified Ideographs that comprise IICore, we explored some of the oddities that relate to ROK (aka South Korea). In Part 2 of this series, we will explore the ideographs that are tagged “P” and “J” for DPRK (aka North Korea) and Japan use, respectively.
Continue reading…

Exploring IICore—Part 1

Today’s article is the very first one that references IICore (International Ideographs Core), which is best described as a region-agnostic subset that includes the most commonly used CJK Unified Ideographs in Unicode, and is intended for use in memory-challenged devices and environments. Included are 9,810 ideographs, the bulk of which are in the URO (9,706), with the remaining ones in Extensions A (42) and B (62).

IICore is instantiated as the kIICore property of the Unihan Database, and documented in UAX #38. The kIICore property values consist of an initial letter—A, B, or C—that indicates priority, followed by one or more letters that specify a source that more or less corresponds to a region: G, H, J, K, M, P (short for KP), and T.
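The value structure described above can be sketched in a few lines of Python. This is a minimal illustration, not code from UAX #38: the function name `parse_kiicore`, the regular expression, and the sample value are assumptions made for the example, and the region labels are the ones used in this series.

```python
import re

# Assumed shape of a kIICore value: one priority letter (A, B, or C)
# followed by one to seven source letters.
KIICORE_RE = re.compile(r"^([ABC])([GHJKMPT]{1,7})$")

# Source letters and the regions they more or less correspond to.
SOURCES = {
    "G": "PRC (China)",
    "H": "Hong Kong SAR",
    "J": "Japan",
    "K": "ROK (South Korea)",
    "M": "Macao SAR",
    "P": "DPRK (North Korea)",
    "T": "ROC (Taiwan)",
}


def parse_kiicore(value):
    """Split a kIICore value into its priority letter and a list of regions."""
    match = KIICORE_RE.match(value)
    if match is None:
        raise ValueError(f"not a well-formed kIICore value: {value!r}")
    priority, letters = match.groups()
    return priority, [SOURCES[letter] for letter in letters]


if __name__ == "__main__":
    # Hypothetical sample value, used only to exercise the parser.
    priority, regions = parse_kiicore("AGTJHKMP")
    print(priority, regions)
```

A stricter version might also reject repeated source letters, which this sketch does not check.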
Continue reading…

Unihan & Moji Jōhō Kiban Project: The Tip of the Iceberg

As evidenced by the very last paragraph of IRG N1964 (aka L2/13-192), which was discussed during IRG #41 that took place in Tōkyō, Japan at the end of 2013, I have been curious as to why many ideographs that are commonly used in Japan lack a UAX #38 kIRG_JSource property value. As suggested by this recent tweet, I have been thinking about this again…
Continue reading…

Standardized Variation Sequences—Part 1

This is a brief article to report that the 16 SVSes (Standardized Variation Sequences) for eight full-width punctuation characters—U+3001 、 IDEOGRAPHIC COMMA, U+3002 。 IDEOGRAPHIC FULL STOP, U+FF01 ！ FULLWIDTH EXCLAMATION MARK, U+FF0C ， FULLWIDTH COMMA, U+FF0E ． FULLWIDTH FULL STOP, U+FF1A ： FULLWIDTH COLON, U+FF1B ； FULLWIDTH SEMICOLON & U+FF1F ？ FULLWIDTH QUESTION MARK—that I proposed in L2/17-436 were accepted for Unicode Version 12.0 during UTC #154 this week. After reading the Script Ad Hoc group’s comments, I prepared a revised version (L2/17-436R) that provided additional information as a response to the two comments, which included the table that is shown above, and this served as the basis for the discussions.
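To make the count of 16 concrete, here is a small Python sketch that pairs each of the eight base characters with two variation selectors. It rests on two assumptions that the text above does not spell out: that the eighth base character is U+FF0E FULLWIDTH FULL STOP, and that the selectors are VS1 (U+FE00) and VS2 (U+FE01). The function name `build_svses` is purely illustrative.

```python
import unicodedata

# The eight full-width punctuation characters (U+FF0E is an assumption here).
BASES = [0x3001, 0x3002, 0xFF01, 0xFF0C, 0xFF0E, 0xFF1A, 0xFF1B, 0xFF1F]

# Assumed selectors: VS1 and VS2.
SELECTORS = [0xFE00, 0xFE01]


def build_svses(bases=BASES, selectors=SELECTORS):
    """Return every (base, selector) pair as a list of code point tuples."""
    return [(base, vs) for base in bases for vs in selectors]


if __name__ == "__main__":
    for base, vs in build_svses():
        # Print each pair in a StandardizedVariants.txt-like layout.
        print(f"{base:04X} {vs:04X}; # {unicodedata.name(chr(base))}")
```

Eight bases times two selectors gives the 16 sequences; in running text, each sequence is simply the base character immediately followed by its selector.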

This all began with a proposal that I submitted four years ago, L2/14-006, which was resurrected as L2/17-056 and finally discussed during UTC #153, where I received constructive feedback. This prompted me to split the proposal into two parts. The first part proposed the less-controversial SVSes, which are the ones that were accepted. The second part, L2/18-013, proposes the more controversial ones. I fully expect to revise the second part before it is discussed during UTC #155, which begins on 2018-04-30.

I would like to use this opportunity to solicit comments and feedback for L2/18-013, which would be taken into account when I revise it. (I also hope to receive feedback from the Script Ad Hoc group prior to UTC #155, which would also be taken into account.)

In closing, the 16 new SVSes should soon appear in The Pipeline.


Adobe-KR-9 Third Draft

This article picks up where the 2017-12-19 article left off, and provides details about the third draft of the forthcoming Adobe-KR-9 character collection that was issued today.

The third draft of the Adobe-KR-9 character collection includes 22,863 glyphs (CIDs 0 through 22862) distributed among ten Supplements. When compared to the second draft, three glyphs were removed, 254 glyphs were added, and the distribution of glyphs among some of the Supplements was changed. Because it is a draft, the details are still subject to change, though I suspect that any changes will be minimal at this point.
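As a quick sanity check, the third-draft total can be reconciled with the second-draft figure of 22,612 glyphs that was reported in the earlier Adobe-KR-9 article. The variable names below are illustrative only.

```python
# Glyph count of the second draft (CIDs 0 through 22611).
second_draft = 22_612

# Changes reported for the third draft.
removed, added = 3, 254

third_draft = second_draft - removed + added
last_cid = third_draft - 1  # CIDs are zero-based

print(third_draft, last_cid)  # prints "22863 22862"
```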
Continue reading…

UTC #154: SVSes, IDCs, KPS 9566 & Unicode 11.0

The 154th UTC (Unicode Technical Committee) meeting, which starts one week from tomorrow, will have a very interesting agenda for me, based on the latest documents at the end of the 2017 document register, and in the 2018 one.
Continue reading…

Standards 102—Silent Corrections

Continuing where my Standards 101 article left off, class is once again in session as Standards 102, and today’s topic is “silent corrections.”

The ultimate focus of this particular article is on the first three pages of WG2 N4008 (2011), Resolution M58.03 of WG2 N4104 (2011), and the Unicode mappings for two ideographs in GB 12052-89 (1989; 信息交换用朝鲜文字编码字符集), a standard from China that is a regional Korean character set. The two ideographs in question are at positions 72-33 and 72-67 in that standard. All of this started when I submitted L2/10-362 (2010), which proposed better source references for 94 ideographs that were appended to the special version of the GB/T 12345-90 (1990; 信息交换用汉字编码字符集―辅助集) standard that was used to compile the URO (Unified Repertoire & Ordering) in Unicode Version 1.1, but which are not actually present in that standard proper. It turns out that these ideographs originated in the GB 12052-89 standard.

But first, let’s briefly discuss the issue of “silent corrections” in standards, particularly in GB standards…
Continue reading…

Adobe-KR-9 Second Draft

This article picks up where the 2017-10-01 article left off, and provides details about the second draft of the forthcoming Adobe-KR-9 character collection that was issued today.

The second draft of the Adobe-KR-9 character collection includes 22,612 glyphs (CIDs 0 through 22611) distributed among ten Supplements. When compared to the first draft, 35 glyphs were removed, ten glyphs were added, three Supplements were added, and the distribution of glyphs among some of the Supplements was changed. Because it is the second draft, the details are still subject to change—and most certainly will change, though I hope that the changes are minimal.
Continue reading…

Unicode IVD: Six Versions & Five Collections

The sixth version of the Unicode IVD (Ideographic Variation Database) was released today, and is named based on today’s date: 2017-12-12.

This new version of the IVD incorporates three PRIs, #349, #351, and #354, which resulted in the registration of a fifth IVD collection, KRName, and its 36 IVSes, along with additional IVSes for the registered Adobe-Japan1 and Moji_Joho IVD collections. Be sure to read Unicode’s official announcement, and consider following @IVD_Registrar on Twitter.

As the image below confirms, the road to ideographic hell is indeed paved with turtles and dragons.


Ten Mincho: To Boldly Go Where No Font Has Gone Before

(All of the marten photos that are used in this article can be found on Adobe Stock)

The Japanese (日本語) version is here

The purpose of this article is to provide technical details of how the Ten Mincho (貂明朝 in Japanese) typeface and its fonts, which are initially being offered as a Typekit exclusive, were developed, and how they boldly go where no Japanese font has gone before. For more details about the Ten Mincho typeface design itself, which is probably much more interesting than this really long and technical article, I encourage you to read the official announcement (日本語) on the Typekit Blog. As stated in the official announcement, this new Adobe Originals Japanese typeface is unique in many ways, and should serve as inspiration for type foundries and typeface designers in Japan and elsewhere.

Continue reading…

Unicode Beyond-BMP Top Ten List—2017 Redux

Another three years have elapsed since I posted an update to the always-enjoyable Unicode Beyond-BMP Top Ten List, so I figured that an updated version—taking into account standardization developments that have occurred since then—was in order for the current year of 2017.



OpenType SVG Fonts in Creative Cloud Apps

Today’s article provides useful details for our relatively small number of customers who author documents with our flagship Creative Cloud apps and make use of CID-keyed OpenType SVG fonts. A rather broadly-deployed CID-keyed OpenType SVG typeface is the open source Source Han Code JP family, whose development details are described in the very first section of this article.

While it is fully possible to build OpenType fonts—CID-keyed or otherwise—that include an 'SVG ' (Scalable Vector Graphics) table, the infrastructure to support them in apps is still maturing. That is the purpose of this article, so please continue reading if the details interest or otherwise affect you.
Continue reading…

Three Down, One To Go…

Earlier this month, I decided to move the Adobe-Japan1-6 character collection specification to the Adobe Type Tools organization on GitHub, which was partly motivated by constantly-changing URLs on our Font Technical Notes page. Another motivation was to make the specification itself easier to maintain. At some point, I will be adding a more complete list of Supplement 7 (aka Adobe-Japan1-7) candidates to its wiki.

To this end, I decided to do the same for the Adobe-CNS1-7 and Adobe-GB1-5 character collection specifications while on vacation in South Dakota. For the former, I also used the opportunity to update the specification to include Supplement 7 (aka Adobe-CNS1-7), by adding its representative glyphs and other details.

So, that’s three down, and one to go.
Continue reading…

Adobe-Japan1-6 on GitHub

This is a very brief article whose purpose is to simply state that—due to recent events beyond my control*—the Adobe-Japan1-6 character collection specification is now an open source project that is hosted on GitHub as a new repository in the Adobe Type Tools organization.

Most of my morning was consumed by porting the original text from Adobe InDesign to GitHub-flavored Markdown, and, while I was touching the text, I decided to seize the opportunity to make several corrections and updates. The 500-glyphs-per-page representative glyph charts are now in a separate PDF file. I also used the opportunity to update the aj16-kanji.txt datafile, and also added the latest-and-greatest Adobe-Japan1-6 UVS (Unicode Variation Sequence) definition file. All good stuff, I think.



* Adobe’s IT folks apparently felt compelled to (once again) change the URLs for all of the font-related Adobe Tech Notes, including Adobe Tech Note #5078 (The Adobe-Japan1-6 Character Collection). Its URL is somewhat broadly referenced, including in the IVD_Collection.txt file of the latest version of the IVD (Ideographic Variation Database). The bottom line is that I needed a stable URL.

A Forthcoming Registry & Ordering: Adobe-KR-6

It is difficult to imagine that it has been over 20 years since a new RO—or Adobe CID-keyed glyph set—was born. Of course, I am referring to the static glyph sets, not the ones based on the special-purpose Adobe-Identity-0 ROS.

“RO” stands for Registry and Ordering, which represent compatibility names or identifiers for CID-keyed glyph sets that are referred to as character collections. Adobe CID-keyed glyph sets are usually referred to as ROSes, with the final “S” being an integer that refers to a specific Supplement. The first Supplement, of course, is 0 (zero).
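The naming convention described above lends itself to a short Python sketch that splits an ROS name such as Adobe-Japan1-6 into its three parts. The `ROS` type and the `parse_ros` function are hypothetical names chosen for this example, and the sketch assumes that the Registry contains no hyphen and that the Supplement is the final hyphen-separated field.

```python
from typing import NamedTuple


class ROS(NamedTuple):
    registry: str
    ordering: str
    supplement: int


def parse_ros(name: str) -> ROS:
    """Split a name like 'Adobe-Japan1-6' into Registry, Ordering, Supplement."""
    # The Registry is everything before the first hyphen.
    registry, _, rest = name.partition("-")
    # The Supplement is everything after the last hyphen.
    ordering, _, supplement = rest.rpartition("-")
    if not (registry and ordering and supplement.isdigit()):
        raise ValueError(f"not a well-formed ROS name: {name!r}")
    return ROS(registry, ordering, int(supplement))


if __name__ == "__main__":
    for name in ("Adobe-Japan1-6", "Adobe-Korea1-2", "Adobe-Identity-0"):
        print(parse_ros(name))
```

A bare RO such as Adobe-KR, with no Supplement, is rejected by this sketch, which matches the distinction drawn above between an RO and an ROS.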

One of my recent projects is to revitalize and modernize our Korean glyph set, Adobe-Korea1-2 (see Adobe Tech Note #5093), which was last modified on 1998-10-12 by defining Supplement 2 that added only pre-rotated versions of the proportional and half-width glyphs that are referenced by the effectively-deprecated 'vrt2' (Vertical Alternates and Rotation) GSUB feature. Instead of defining a new Supplement, I decided that it would be better to simply define a completely new glyph set for a variety of reasons. The tentative Registry and Ordering names are Adobe and KR (meaning “Adobe-KR”), and unlike other ROSes for which Supplements are defined incrementally, my current plan is to simultaneously define seven Supplements, 0 through 6.
Continue reading…

Internationalization & Unicode Conferences

I have attended every Internationalization & Unicode Conference (IUC) since IUC31 in 2007, and Adobe has been a continuous Gold Sponsor since IUC31. Unfortunately, duty calls, in the form of attending and hosting IRG #49 that takes place during the same week as IUC41, which means that I can neither attend nor present this year. Of course, Adobe continues to be a Gold Sponsor of this important event.
Continue reading…