Over on Typophile, Nick Shinn asked: "What is the difference between a code page, a glyph list, and a Unicode chart?" (By that last I think he meant "Unicode range.") Nick also made some mentions of "encoding" in the same thread, so I thought I would define the whole lot of them. I expect this will be useful to font developers, and perhaps some other folks as well. I've just dashed this off, so I reserve the right to twiddle the descriptions for more accuracy and/or clarity. Especially if my readers point out errors. :)
August 2008 Archives
About two years ago I posted my thoughts on extended Cyrillic character sets. Now we're finally ready to talk about future extended Latin character sets, and to better document what we consider to be the existing Latin character sets as well. The largest character sets here (Adobe Latin 4 and Adobe Latin 5) are drafts; I welcome any feedback, especially (though not only) on things that "ought to be in Adobe Latin 5" but aren't there yet.
This post owes special thanks to my colleague Miguel Sousa, who spent many hours compiling lists based on my spreadsheets and directions, and checked my data repeatedly in various ways. Any errors are probably mine, but he created the HTML tables linked from this page, as well as the tab-delimited text files linked in turn from those pages.
Every so often I get a request (either from within or outside Adobe) for a "Unicode font." Unfortunately, that term is not very meaningful to me. The obvious interpretations are:
1) To me as a font geek, the phrase "a Unicode font" "logically" means "a font with a Unicode encoding (cmap table)." That would be pretty much every one of the 2400+ OpenType fonts Adobe has in our type library. So that interpretation doesn't really narrow things much.
2) They could mean "a font that covers all of Unicode." However, Unicode today has over 100,000 defined code points, and as there is no font format that can include more than 65,535 glyphs, such a font is not technically possible. (There's a separate question as to whether it would be desirable - see below.)
3) They could also mean "a font that covers some useful subset of Unicode that is more than just the basic WinANSI or MacRoman 8-bit (256-character) set." However, for that to be meaningful, they'd have to define exactly what writing systems or languages are important to them.
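To put rough numbers on interpretations 2 and 3, here is a minimal Python sketch using only the standard library. It rests on a few assumptions: Python's `cp1252` codec stands in for WinANSI and `mac_roman` for MacRoman, the count of assigned code points depends on whichever Unicode version your Python build ships with (so it will not match the 2008 figure), and the function names are made up for illustration.

```python
import unicodedata

# Glyph limit per font cited above (glyph IDs are 16-bit values).
MAX_GLYPHS = 65_535


def count_assigned_code_points():
    """Count code points that Python's Unicode database treats as assigned.

    Skips unassigned (Cn), private-use (Co), and surrogate (Cs) code points.
    """
    skipped = {"Cn", "Co", "Cs"}
    return sum(
        1
        for cp in range(0x110000)
        if unicodedata.category(chr(cp)) not in skipped
    )


def repertoire(codec):
    """Return the set of characters an 8-bit codec can represent."""
    chars = set()
    for byte in range(256):
        try:
            chars.add(bytes([byte]).decode(codec))
        except UnicodeDecodeError:
            pass  # this byte value is unassigned in the code page
    return chars


if __name__ == "__main__":
    assigned = count_assigned_code_points()
    print("assigned code points:", assigned)
    print("fits in a single font:", assigned <= MAX_GLYPHS)

    win_ansi = repertoire("cp1252")      # assumption: cp1252 ~ WinANSI
    mac_roman = repertoire("mac_roman")  # assumption: mac_roman ~ MacRoman
    print("WinANSI characters:", len(win_ansi))
    print("MacRoman characters:", len(mac_roman))
    print("in both:", len(win_ansi & mac_roman))
```

Even the two "basic" 8-bit sets only partly overlap, which is one reason a request for "more than WinANSI or MacRoman" still needs a list of the specific languages or writing systems that matter.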
