This article is largely a test, but also serves to start the process of resurrecting L2/14-006 (Proposal to add standardized variation sequences for nine characters) for discussion at UTC #151 in early May.
Liang Hai (梁海) brought up this document for discussion at UTC #150 last week, and while I had an opportunity to have it accepted by the UTC, to be included in Unicode Version 10.0 (June, 2017), I decided that it was prudent to instead prepare a revised proposal that is more complete, mainly because L2/14-006 was submitted and discussed prior to the first release of the Adobe-branded Source Han Sans and Google-branded Noto Sans CJK Pan-CJK typeface families. This functionality was implemented in those typeface families via the 'locl' GSUB feature, which requires the text to be language-tagged. In other words, I learned a lot since L2/14-006 was discussed, and prefer to submit a more complete proposal, even if it means waiting for Unicode Version 11.0 (June, 2018).
It is now January 28, 2017 in China and other Chinese-speaking regions.
I’d like to use this opportunity to welcome the Year of the Rooster, and to wish a happy Chinese New Year to all of my Chinese friends, colleagues, and blog readers. May this year be safe, prosperous, and enjoyable.
As recorded on the very first page of Adobe Tech Note #5078, Adobe-Japan1-6 was released on 2004-03-05, and one of the glyphs that was added was CID+20958. According to the Adobe-Japan1-6 ordering file, its glyph name is freedial, and is assigned to the Dingbats FDArray element for the purpose of hinting. Of course, if you look for CID+20958 in Adobe Tech Note #5078, you can find it on the bottom of page 54, immediately to the right of CID+20957 that maps from U+26BD ⚽ SOCCER BALL, though it is blank. This is simply because Adobe does not have the rights to use NTT’s trademarked FreeDial mark. CID+20958 was included in Adobe-Japan1-6 for the benefit of font developers who do have the rights to use this mark, and can thus include the glyph in their fonts.
UTC #150, the 150th Unicode Technical Committee meeting, takes place later this month, from 2017-01-23 through 2017-01-26, and will be hosted by our friends at Apple in Sunnyvale, California. I will attend as Adobe’s representative. As usual, there will be CJK- and IRG-related items to discuss. One item will be the UTC’s review of IRG Working Set 2015 Version 3.0 (aka CJK Unified Ideographs Extension G), L2/17-006, which I recently finished.
A major focus of UTC #150 will be Unicode Version 10.0, which is scheduled to be released in June of this year. Unicode Version 10.0 will include 21 additional characters appended to the URO (Unified Repertoire & Ordering), along with Extension F (7,473 characters), as shown here.
While we’re on the subject of Unicode, be sure to explore the sidebar on the right side of this blog’s landing page, which includes links to several useful Unicode-related resources.
Please pardon the apparent non-CJK interruption in the form of this particular article, but I wanted to bring to the readership’s attention a new open source project that has a very long history: ehandler.ps.
Unlike the first and second similarly-titled articles that I published last month, this article will focus on a minor efficiency for the combining jamo feature of the Adobe-branded Source Han Sans and Google-branded Noto Sans CJK Pan-CJK typeface families.
Prior to Japan’s script reform of 1900, there was more than one shape associated with each syllable of the hiragana syllabary. There is now only one shape associated with each syllable. The now-obsolete and nonstandard shapes are referred to as hentaigana (変体仮名), which simply means variant kana. Hentaigana are still in use today in Japan, but are limited to Japan’s family registry (戸籍 koseki in Japanese) and specialized uses, such as business signage and other decor that are specifically designed to convey a feeling of nostalgia or traditional charm.
In addition to the Wikipedia article that is linked from the previous paragraph, 『変体仮名のこれまでとこれから—情報交換のための標準化』 (The past, present, and future of hentaigana: Standardization for information processing) by TAKADA Tomokazu (高田智和) et al. and About the inclusion of standardized codepoints for Hentaigana by YADA Tsutomu (矢田勉) serve as excellent reading material.
To (significantly) expand yesterday’s super exciting article, and in the continued interest of (stress-)testing the extent to which combining jamo works in various browsers—and when being served as a fully-functional webfont via Adobe Typekit—if you click here, you will open a 40MB HTML file that includes all 1,626,875 possible three-character combining jamo sequences (125 leading consonants, 95 vowels, and 137 trailing consonants) rendered using Adobe Clean Han and its 'ljmo' (Leading Jamo Forms), 'vjmo' (Vowel Jamo Forms), and 'tjmo' (Trailing Jamo Forms) GSUB features.
In the interest of testing the extent to which combining jamo works in various browsers—and when being served as a fully-functional webfont via Adobe Typekit—if you click here, you will open a 200K HTML file that includes all 11,875 possible two-character combining jamo sequences (125 leading consonants and 95 vowels) rendered using Adobe Clean Han and its 'ljmo' (Leading Jamo Forms), 'vjmo' (Vowel Jamo Forms), and 'tjmo' (Trailing Jamo Forms) GSUB features.
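The counts quoted in the two jamo test articles above (125 leading consonants, 95 vowels, 137 trailing consonants) can be reproduced with a short Python sketch. The code point ranges below are an assumption of this sketch, not something stated in the articles: they take the conjoining jamo from the Hangul Jamo block plus Hangul Jamo Extended-A and Extended-B, with the two fillers included. The names `LEADING`, `VOWEL`, `TRAILING`, `count`, `expand`, and `two_char_sequences` are hypothetical.

```python
# Assumed code point ranges for conjoining jamo (fillers included):
# Hangul Jamo, Hangul Jamo Extended-A, and Hangul Jamo Extended-B.
LEADING = [(0x1100, 0x115F), (0xA960, 0xA97C)]
VOWEL = [(0x1160, 0x11A7), (0xD7B0, 0xD7C6)]
TRAILING = [(0x11A8, 0x11FF), (0xD7CB, 0xD7FB)]


def count(ranges):
    """Number of code points covered by a list of inclusive ranges."""
    return sum(last - first + 1 for first, last in ranges)


def expand(ranges):
    """Yield each code point in the ranges as a one-character string."""
    for first, last in ranges:
        for cp in range(first, last + 1):
            yield chr(cp)


def two_char_sequences():
    """Yield every leading-consonant + vowel sequence."""
    vowels = list(expand(VOWEL))
    for lead in expand(LEADING):
        for vowel in vowels:
            yield lead + vowel


if __name__ == "__main__":
    l, v, t = count(LEADING), count(VOWEL), count(TRAILING)
    print(l, v, t)      # sizes of the three sets
    print(l * v)        # two-character sequences
    print(l * v * t)    # three-character sequences
```

Under these assumed ranges, the products agree with the 11,875 and 1,626,875 figures given in the articles. Generating the three-character sequences would be a third nested loop over `expand(TRAILING)`.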
Again. I arrived on the afternoon of 2016-10-16.
This month gave me yet another opportunity to visit Japan, the Land of the Rising Sun and my wife’s home country, thanks to IRG #47 (Ideographic Rapporteur Group Meeting #47) being hosted there. This trip was also the first time for me to visit an island of Japan other than Honshū (本州), specifically Shikoku (四国).
As a follow-up to our seven-year-old article of the same name from May of 2009, several things have happened with the Adobe Clean family that have yet to be reported, and which have CJK implications. Hence the reason for spending my Sunday morning writing this article.
In the following year, 2010, I developed and deployed a Japanese version of Adobe Clean named Ryo Clean PlusN (りょう Clean PlusN in Japanese), and then in 2015, I developed and deployed a Pan-CJK version named Adobe Clean Han (Adobe Clean 黑体 in Simplified Chinese, Adobe Clean 黑體 in Traditional Chinese, Adobe Clean 角ゴシック in Japanese, and Adobe Clean 고딕 in Korean). These typeface families are Adobe corporate fonts that are meant to be used for product literature, for serving to Adobe websites, and for use by Adobe apps. They are not meant to be used by our customers, but I suspect that the readership of this blog may be interested in some of the development details. If this interests you, please continue reading.
Attention, students! Class is in session.
In my experience, the following two statements about standards are seemingly conflicting yet accurate:
- Standards are incredibly useful—and required—for product development.
- Standards cannot be completely trusted.
On one hand, developing products, such as typeface designs and their fonts, depends on standards.
On the other hand, standards themselves are developed by humans, meaning that they are prone to error, especially when they happen to be character set or glyph standards that include thousands or tens of thousands of representative glyphs.
The first day of Autumn this year is Thursday, September 22nd, and my schedule for this upcoming season is filled with several standards-related activities…
UTC #148 took place in Redmond, Washington last week, hosted by our friends at Microsoft. It was a four-day working meeting, and many important Unicode-related issues and proposals were discussed. A total of 7,888 new characters were formally accepted into the standard during this meeting. Among them were the 7,473 CJK Unified Ideographs of Extension F, along with the lone CJK Unified Ideograph U+9FEA that is appended to the URO (Unified Repertoire & Ordering) and is the result of the disunification of 㸂 U+3E02, which were accepted on 2016-08-04 for inclusion into Unicode Version 10.0. Version 10.0 is slated for a June 2017 release. This means that my table above is now less tentative (clicking on the image will reveal the entire PDF file that includes details about the unchanged CJK Compatibility Ideographs).
Other CJK Unified Ideographs that are slated to be included in Unicode Version 10.0 are the 20 characters, U+9FD6 through U+9FE9, which were accepted on 2014-10-28 (UTC #141).
This will bring the total number of CJK Unified Ideographs to 87,882, and as the table at the top of this article suggests, there is not much room left in Plane 2, and Extension G is just around the corner.
For those who are curious about the 414 other new characters that were accepted during UTC #148, please click here, here, here, here, here, here, and here.
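For readers who like to cross-check numbers, the tallies quoted above fit together as in the following Python sketch. Every input is taken from the text; the variable names are hypothetical, and the last value is only what the quoted total implies, not a figure stated in the article.

```python
# Figures quoted in the text above.
extension_f = 7473           # CJK Unified Ideographs of Extension F
uro_disunified = 1           # U+9FEA, appended to the URO
other_new = 414              # the other characters accepted at UTC #148

accepted_at_utc_148 = extension_f + uro_disunified + other_new

# U+9FD6 through U+9FE9, accepted earlier at UTC #141.
uro_earlier = 0x9FE9 - 0x9FD6 + 1
uro_total = uro_earlier + uro_disunified

# New CJK Unified Ideographs slated for Version 10.0.
new_cjk = extension_f + uro_total

# Count of CJK Unified Ideographs before Version 10.0, as implied
# by the quoted total of 87,882.
implied_previous_total = 87882 - new_cjk

if __name__ == "__main__":
    print(accepted_at_utc_148, uro_earlier, uro_total, new_cjk)
    print(implied_previous_total)
```

The first sum matches the 7,888 characters accepted at UTC #148, and the URO count matches the 21 additional characters mentioned for Version 10.0.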
August 2, 2016 is the official release date for Microsoft’s Windows 10 Anniversary Update (aka Redstone or RS1). Although I do not use Windows OS, I am jumping for joy, for the benefit of those who do use this modern and world-class OS.
Thanks to our friends at Microsoft, the DirectWrite that ships with the Windows 10 Anniversary Update supports OpenType/CFF Collections (aka OTCs), such as those deployed as part of the Adobe-branded Source Han Sans and Google-branded Noto Sans CJK open source projects, to include their all-inclusive “one font to rule them all” Super OTCs.
For those who missed the memo, Unicode Version 9.0 was released on June 21, 2016, which added exactly 7,500 characters to the standard. Unicode now includes a total of 128,172 characters, which is just shy of 3,000 characters under two full 256×256 planes.
While Version 9.0 does not add any new CJK Unified Ideographs, I used this opportunity to enhance my single-page CJK Unified/Compatibility Ideographs document to better track unassigned code points for the relevant blocks and planes. The image at the top of this article shows the first half of the document, and if you click on it, you’ll access the original PDF file that can be squirreled away for reference purposes.
I also used this opportunity to update my tentative Unicode Version 10.0 document in the same way.
As usual, enjoy!
Today is Friday, July 1st, 2016, which is a date that has a special significance for me. I am publishing this from Hot Springs, South Dakota where I am enjoying a few days away from work.
My life was put on a new path exactly 25 years ago, on Monday, July 1st, 1991. I was 25 years old at the time, and I am therefore 50 years old now. It was on this date that I started working at Adobe as a member of its Type Development team. My employee number is 879, though at the time there were approximately 500 employees in total. It was a much smaller company back then. As you can see from my very first business card below, I was involved in things related to Japanese type from the very beginning:
This event effectively launched a 25-year career that is still going strong, and which has been in the same department doing essentially the same thing, though the technologies and related standards have changed or evolved.
The rest of this somewhat lengthy article will be used to highlight some of my accomplishments during each five-year period.
This will be a short, sweet, and to-the-point article. Sorry, no graphics nor photos.
When developing name-keyed fonts, glyph names matter. They matter a lot. When developing new fonts, the glyph names should either be explicitly listed in AGLFN (Adobe Glyph List For New Fonts) or derivable via the AGL Specification. Glyph names that adhere to AGLFN or the AGL Specification result in fonts with well-formed 'cmap' tables, which means that their glyphs will behave better in a broader range of environments. I cannot stress the importance of this enough.
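To give a feel for what "derivable" means, here is a minimal Python sketch of the `uniXXXX` and `uXXXX` to `uXXXXXX` naming conventions as I understand them from the AGL Specification. It is a simplification under stated assumptions: it does not consult the AGL or AGLFN name lists at all, and it rejects the whole name when any component fails, which is stricter than the specification. The function name `derive_unicode` is hypothetical.

```python
import re

# "uni" + one or more groups of four uppercase hex digits.
_UNI = re.compile(r"uni((?:[0-9A-F]{4})+)")
# "u" + four to six uppercase hex digits.
_U = re.compile(r"u([0-9A-F]{4,6})")


def _is_scalar(cp):
    """True for code points in range that are not surrogates."""
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


def derive_unicode(glyph_name):
    """Return the code points a glyph name derives to, or None.

    Simplified: names found only in the AGL/AGLFN lists (such as
    "space") are not handled and yield None.
    """
    # Anything after the first period is a suffix and is ignored.
    base = glyph_name.split(".", 1)[0]
    result = []
    # Underscores separate the components of a ligature-style name.
    for component in base.split("_"):
        m = _UNI.fullmatch(component)
        if m:
            digits = m.group(1)
            cps = [int(digits[i:i + 4], 16) for i in range(0, len(digits), 4)]
        else:
            m = _U.fullmatch(component)
            if not m:
                return None
            cps = [int(m.group(1), 16)]
        if not all(_is_scalar(cp) for cp in cps):
            return None
        result.extend(cps)
    return result or None


if __name__ == "__main__":
    for name in ("uni4E00", "u20000.vert", "uni4E00_u20000", "freedial"):
        print(name, derive_unicode(name))
```

A name such as `freedial` derives to nothing in this sketch, which is the kind of name that would not, on its own, produce a 'cmap' table entry.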
CIDs (Character IDs), on the other hand, represent a completely different beast. If a font is genuinely CID-keyed, it means that there are absolutely no glyph names, regardless of whether the source font or fonts that were used to build the CID-keyed font were name-keyed. Once a font resource becomes CID-keyed, the original glyph names are literally jettisoned, and the only way in which to map Unicode values to glyphs is via the 'cmap' table, which is usually done using a UTF-32 CMap resource. In other words, when developing fonts that are intended to be deployed in a CID-keyed fashion, the source glyph names play absolutely no role in how such fonts are processed.
By ESO/José Francisco Salgado (josefrancisco.org) — ALMA antennas under the Milky Way
Five years ago, I wrote this article that described how to manage XUID arrays. Then last year, I wrote this article that suggested that XUID arrays are no longer necessary.
Anyway, there are two messages that are being conveyed in today’s article.
The first message is short and sweet, and meant to be strong: Adobe advises against including XUID arrays in any new or updated font-related resources, meaning fonts themselves and their corresponding CMap resources. The good news is that omitting the XUID array represents one less thing to worry about during font development.
The second message is longer, meant to provide some background information, and describes why Adobe advises against including XUID arrays in font-related resources.
One of my more popular open source fonts is Adobe Blank, and to a lesser extent the related Adobe Blank 2 because it uses a 'cmap' table format, Format 13, that is not broadly supported. Actually, Adobe Blank provides absolutely nothing, because it maps all 1,111,998 Unicode code points to a range of 2,048 non-spacing and non-marking glyphs, yet such a font is useful for particular scenarios, such as addressing the FOUT (Flash Of Unstyled Text) problem.
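For the curious, one way to arrive at the 1,111,998 figure is sketched below in Python. The derivation is an assumption of this sketch rather than something spelled out above: it starts from all 17 planes and subtracts the surrogate code points and the noncharacters. The names `is_counted` and `EXPECTED` are hypothetical.

```python
# All 17 planes of 65,536 code points each.
TOTAL_CODE_POINTS = 17 * 0x10000

# Surrogates: U+D800 through U+DFFF.
SURROGATES = 0xDFFF - 0xD800 + 1

# Noncharacters: U+FDD0 through U+FDEF, plus the last two code
# points (xxFFFE and xxFFFF) of each of the 17 planes.
NONCHARACTERS = (0xFDEF - 0xFDD0 + 1) + 2 * 17

EXPECTED = TOTAL_CODE_POINTS - SURROGATES - NONCHARACTERS


def is_counted(cp):
    """True if cp is neither a surrogate nor a noncharacter."""
    if 0xD800 <= cp <= 0xDFFF:
        return False
    if 0xFDD0 <= cp <= 0xFDEF:
        return False
    if (cp & 0xFFFF) >= 0xFFFE:
        return False
    return True


if __name__ == "__main__":
    print(SURROGATES, NONCHARACTERS, EXPECTED)
    # Brute-force confirmation over the entire code space.
    print(sum(1 for cp in range(0x110000) if is_counted(cp)))
```

Both the closed-form subtraction and the brute-force count land on the same number as the one quoted above. It is presumably no coincidence that the surrogate range holds exactly 2,048 code points, the same size as the glyph range, though the text does not say so.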
Allow me to introduce Adobe NotDef, which is modeled after Adobe Blank in that it covers all of Unicode and maps to a range of 2,048 glyphs, but differs in that the functional glyphs are spacing and marking. The original suggestion for Adobe NotDef came from Dave Crossland. The glyphs match the shape and advance width of the standard Adobe .notdef glyph that is invoked in environments that do not support font fallback when the selected font does not include a glyph for a particular character, and as Dave wrote, Adobe NotDef is useful for font fallback purposes in that it can be used to prevent the display of non-standard .notdef glyphs that may be present in some fonts in the font fallback chain.