CID vs GID

When working with OpenType/CFF fonts, particularly those that are CID-keyed, CIDs (Character IDs) and GIDs (Glyph IDs) are often referenced as ways to uniquely identify glyphs in a font resource. But how do CIDs and GIDs differ and, perhaps more importantly, under what circumstances are their values different or the same? Today’s article answers both questions.

CIDs reference glyphs in CIDFont resources, and are also mapped from character codes in CMap resources. One particular attribute of CIDs is that they need not be contiguous. Non-contiguous CIDs are common in CIDFont resources that include a subset of the glyphs in the advertised ROS (/Registry, /Ordering, and /Supplement: the three elements of the /CIDSystemInfo dictionary). Eight of our Heisei fonts, for example, include glyphs only for the following CIDs: 0–8358, 8720–9081, and 9084–9353.

GIDs reference glyphs in the tables of an ‘sfnt’ resource, which would include the ‘CFF’, ‘GPOS’, ‘GSUB’, ‘glyf’, ‘cmap’, and other tables. GIDs, unlike CIDs, must be contiguous. Even if the ‘CFF’ table of an OpenType/CFF font was built from a CIDFont resource, GIDs are used, but the ‘CFF’ table maintains a mapping from GIDs to CIDs. In other words, although a ‘CFF’ table references glyphs by GID, it is possible to perform various testing and verification tasks by explicitly referencing glyphs by CID. Explicitly referencing glyphs by CID will be covered later in this brief article. For the Heisei font example in the previous paragraph, the corresponding (and necessarily contiguous) GID range would be 0–8990.
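The arithmetic behind that GID range can be sketched in a few lines of Python. This is only an illustration, not how the ‘CFF’ table or any AFDKO tool is implemented: the names are hypothetical, and it assumes that GIDs are assigned from 0 in ascending CID order, which is consistent with the Heisei example but is an assumption here.

```python
# Heisei CID ranges quoted in the article (inclusive).
HEISEI_CID_RANGES = [(0, 8358), (8720, 9081), (9084, 9353)]


def expand_ranges(ranges):
    """Expand inclusive (first, last) pairs into a sorted list of CIDs."""
    cids = []
    for first, last in ranges:
        cids.extend(range(first, last + 1))
    return sorted(cids)


# Assumption: GIDs are handed out contiguously, in ascending CID order.
# The list index is the GID and the value is the CID.
gid_to_cid = expand_ranges(HEISEI_CID_RANGES)
cid_to_gid = {cid: gid for gid, cid in enumerate(gid_to_cid)}

print(len(gid_to_cid) - 1)  # last GID: 8990
print(cid_to_gid[8358])     # 8358, still equal to the CID
print(cid_to_gid[8720])     # 8359, the first GID that differs from its CID
print(cid_to_gid[9353])     # 8990
```

The three CID ranges hold 8,359 + 362 + 270 = 8,991 glyphs, which is why the contiguous GID range ends at 8990.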

When developing the materials necessary for building an OpenType/CFF font with our AFDKO tools, specifically makeotf, from a CID-keyed source font (meaning a CIDFont resource), all of the materials should reference CIDs, not GIDs. In other words, the UTF-32 CMap resource should map UTF-32 character codes to CIDs, not GIDs, and any GPOS or GSUB features that are defined in the “features” file should reference CIDs, not GIDs. The makeotf tool handles the conversion from CID to GID automatically.

For OpenType/CFF fonts that are built from a CIDFont resource and include all of the glyphs in the advertised ROS, GIDs equal CIDs. Our Kozuka Gothic/Mincho Pr6N fonts, for example, include all 23,058 glyphs of the Adobe-Japan1-6 ROS, and the CID range, which is the same as the GID range, is 0–23057.
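To make the "GIDs equal CIDs" condition concrete, here is a small Python sketch that looks for the first GID whose CID differs from it. The function name is hypothetical, and the sketch assumes GIDs are assigned in ascending CID order; it works only on lists of integers and does not read a font.

```python
def first_divergence(cids):
    """Return (gid, cid) for the first GID that differs from its CID.

    Returns None when GID == CID for every glyph, which is the case
    when the CIDs run from 0 with no gaps.
    """
    for gid, cid in enumerate(sorted(cids)):
        if gid != cid:
            return gid, cid
    return None


# Complete Adobe-Japan1-6 ROS: CIDs 0 through 23057, no gaps.
full_ros = range(23058)
print(first_divergence(full_ros))  # None

# Heisei subset: CIDs 0-8358, 8720-9081, and 9084-9353.
heisei = list(range(0, 8359)) + list(range(8720, 9082)) + list(range(9084, 9354))
print(first_divergence(heisei))    # (8359, 8720)
```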

Various AFDKO tools include a “-g” option that takes glyphs and glyph ranges as its argument. For name-keyed fonts, the argument of the “-g” option must be comma-separated glyph names or GIDs. For CID-keyed fonts, the argument of the “-g” option must be CIDs (prefixed with a slash, such as “/1200” for CID+1200) or GIDs (with no prefix), and can be single CIDs or GIDs separated by commas, or hyphenated ranges. In other words, prefixing an integer value with a slash (“/”) explicitly specifies a glyph by CID.
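The CID-keyed form of that argument is simple enough to model in Python. The parser below is a rough sketch for illustration and is not the parser used by any AFDKO tool: the function names are hypothetical, it handles only integers (no glyph names), and it assumes both ends of a range carry the same prefix, as in “/0-/23057”.

```python
def parse_one(token):
    """Classify a single reference: '/1200' is a CID, '1200' is a GID."""
    kind = "cid" if token.startswith("/") else "gid"
    digits = token[1:] if kind == "cid" else token
    if not digits.isdigit():
        raise ValueError(f"not an integer glyph reference: {token!r}")
    return kind, int(digits)


def parse_glyph_selector(arg):
    """Parse a '-g'-style argument into (kind, first, last) tuples."""
    result = []
    for item in arg.split(","):
        item = item.strip()
        if not item:
            continue
        first, sep, last = item.partition("-")
        kind, lo = parse_one(first)
        if sep:
            last_kind, hi = parse_one(last)
            # Simplifying assumption: no ranges that mix CIDs and GIDs.
            if last_kind != kind:
                raise ValueError(f"mixed CID/GID range: {item!r}")
        else:
            hi = lo
        if hi < lo:
            raise ValueError(f"descending range: {item!r}")
        result.append((kind, lo, hi))
    return result


print(parse_glyph_selector("/1200"))
# [('cid', 1200, 1200)]
print(parse_glyph_selector("/0-/23057,5,10-20"))
# [('cid', 0, 23057), ('gid', 5, 5), ('gid', 10, 20)]
```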

I have developed a series of simple Perl tools for listing glyphs by CID or GID. The extract-cids.pl tool takes a single font resource as its argument, and outputs a list of CIDs. The extract-gids.pl tool does the same, but outputs GIDs instead. I prefer to pipe the output of both tools through the mkrange.pl tool, because the lists are almost always very long. Below is the output of running extract-cids.pl, piped through mkrange.pl, for one of the eight Heisei fonts mentioned earlier:

% extract-cids.pl HeiseiKakuGoStd-W7.otf | mkrange.pl
0-8358
8720-9081
9084-9353

Below is the same, but using the extract-gids.pl tool instead:

% extract-gids.pl HeiseiKakuGoStd-W7.otf | mkrange.pl
0-8990
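For readers without the Perl tools at hand, the range-collapsing step can be approximated in Python. This is a sketch of the idea only, not the source of mkrange.pl: the function name is hypothetical, and printing an isolated value as a bare number is an assumption.

```python
def make_ranges(values):
    """Collapse integers into 'first-last' strings, one per contiguous run."""
    runs = []
    for value in sorted(set(values)):
        if runs and value == runs[-1][1] + 1:
            runs[-1][1] = value           # extend the current run
        else:
            runs.append([value, value])   # start a new run
    # Assumption: a run of one value is printed without a hyphen.
    return [str(a) if a == b else f"{a}-{b}" for a, b in runs]


# CIDs of the Heisei example, then the GIDs for the same 8,991 glyphs.
cids = list(range(0, 8359)) + list(range(8720, 9082)) + list(range(9084, 9354))
gids = list(range(len(cids)))

print("\n".join(make_ranges(cids)))
# 0-8358
# 8720-9081
# 9084-9353
print("\n".join(make_ranges(gids)))
# 0-8990
```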

The image below is an excerpt from a glyph synopsis that was made with the AFDKO tx tool, by using its “-pdf” command-line option, and shows the point at which GIDs no longer equal CIDs (GIDs are shown in the upper-left corner, and CIDs are shown in the lower-left corner):

Up through CID+8358, GIDs equal CIDs, but starting with CID+8720, they do not. This is consistent with the output of the extract-cids.pl and extract-gids.pl tools.

Note what happens when the same tools are used, but with the source CIDFont resource (the “cidfont.ps” file):

% extract-cids.pl cidfont.ps | mkrange.pl
0-8358
8720-9081
9084-9353

% extract-gids.pl cidfont.ps | mkrange.pl
0-8990

The results are identical. This is expected when using extract-cids.pl, but it may appear to be odd behavior for extract-gids.pl. GIDs are, by definition, contiguous, so when the glyphs in a CIDFont resource are referenced in a GID context, non-contiguous CIDs are simply renumbered into a contiguous sequence of GIDs.

NOTE: The extract-gids.pl tool can also be used with name-keyed fonts, but the extract-cids.pl tool cannot, and issues an error if you try:

% extract-cids.pl font.pfa
ERROR: name-keyed font! Quitting...

My advice is that when working with OpenType/CFF fonts that were built from a CIDFont resource, it is always safest to explicitly reference glyphs by CID, which means prefixing the CID value with a slash, such as /0-/23057 for the complete Adobe-Japan1-6 CID range.

NOTE: The AFDKO tools’ “-g” option requires a slash (“/”) prefix for explicitly referencing glyphs by CID, but the “features” file’s syntax requires a backslash (“\”) for the same purpose.
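The two prefixes are easy to mix up when generating both kinds of material from the same CID list, so a pair of tiny Python formatters may help. These helpers are hypothetical and cover only what the article states: a slash prefix for the “-g” option, with ranges written like /0-/23057, and a backslash prefix for a single CID in the “features” file.

```python
def cid_for_g_option(cid):
    """Single CID as written for the AFDKO tools' -g option, e.g. /1200."""
    return f"/{cid}"


def cid_range_for_g_option(first, last):
    """CID range for the -g option, in the /0-/23057 form used above."""
    return f"/{first}-/{last}"


def cid_for_features_file(cid):
    """Single CID as written in a 'features' file: a backslash, then the number."""
    return f"\\{cid}"


print(cid_for_g_option(1200))            # /1200
print(cid_range_for_g_option(0, 23057))  # /0-/23057
print(cid_for_features_file(1200))       # \1200
```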

2 Responses to CID vs GID

  1. Mike shih says:

    What is AFDKO tool?

    • AFDKO is a collection of powerful command-line OpenType font development tools. To learn more, you can 1) follow the link for the first instance of “AFDKO” in this article; 2) click on the “AFDKO” link in the “LINKS” sidebar at the top level of this blog; or 3) click here. ☺