In Part 1, Part 2, and Part 3 of this series, we examined the ideographs that are tagged “K” (for ROK or South Korea), “P” (for DPRK or North Korea), “J” (for Japan), and “G” (for PRC or China) in the kIICore property. In Part 4, which is today’s article, we will explore the ideographs that are tagged “T” (for ROC or Taiwan), “H” (for Hong Kong SAR), and “M” (for Macao SAR).
Taiwan
A total of 6,566 ideographs are tagged “T” in IICore. When I compared these against the two most basic ideograph sets from Taiwan, namely the 5,401 ideographs in CNS 11643 Plane 1 and the 4,808 ideographs in 常用國字標準字體表 (chángyòng guózì biāozhǔn zìtǐ biǎo), I discovered that only one, U+5F5E 彞, is neither tagged “T” nor present in IICore, though its related ideograph that is included in Big Five Level 1, U+5F5D 彝, is tagged “T” in IICore. (This ideograph pair represents the only difference between CNS 11643 Plane 1 and Big Five Level 1, both of which include 5,401 ideographs.)
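The comparison described above boils down to two set differences. The following is a minimal sketch, not the tooling actually used for this article: it assumes each kIICore value is one priority letter followed by region letters (as in the values quoted in the tables of this article), and `REFERENCE_SAMPLE` is a hypothetical two-entry stand-in for a full reference set such as CNS 11643 Plane 1.

```python
def parse_iicore(value):
    """Split a kIICore value such as 'ATJHKMP' into (priority, region tags)."""
    return value[0], set(value[1:])


def tagged(iicore, tag):
    """Return the code points whose kIICore value carries the given region tag."""
    return {cp for cp, value in iicore.items() if tag in parse_iicore(value)[1]}


# A handful of kIICore values quoted in this article (not the full property data).
IICORE_SAMPLE = {
    0x55EC: "BGTH",
    0x7934: "BGT",
    0x7E4A: "ATJ",
    0x5F65: "ATHM",
    0x5F66: "AGJKMP",
}

# Hypothetical stand-in for a reference set such as CNS 11643 Plane 1.
REFERENCE_SAMPLE = {0x5F5E, 0x5F65}

t_tagged = tagged(IICORE_SAMPLE, "T")
missing = REFERENCE_SAMPLE - t_tagged  # in the reference set, but not tagged "T"
extra = t_tagged - REFERENCE_SAMPLE    # tagged "T", but outside the reference set

for cp in sorted(missing):
    print(f"U+{cp:04X} {chr(cp)} is in the reference set but not tagged T")
```

With real data, `IICORE_SAMPLE` would be built from the kIICore lines of the Unihan database and `REFERENCE_SAMPLE` from the character set being compared.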
Other than the one omission pointed out in the previous paragraph, 1,156 T-tagged ideographs remain outside the scope of what is a reasonably minimal set. Predictably, most of them (1,063 to be exact) map to CNS 11643 Plane 2, which is equivalent to Big Five Level 2. Another 81 map to CNS 11643 Plane 3, and two of those, U+3577 㕷 and U+4CB3 䲳, are in Extension A.
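Breakdowns by block, such as the Extension A count above, can be reproduced from code point ranges alone. This is a small sketch; the helper names `cjk_block` and `block_counts` are made up for illustration, and only the three blocks mentioned in this article are covered.

```python
from collections import Counter

# Inclusive code point ranges of the Unicode blocks mentioned in this article.
CJK_BLOCKS = [
    ("URO", 0x4E00, 0x9FFF),
    ("Extension A", 0x3400, 0x4DBF),
    ("Extension B", 0x20000, 0x2A6DF),
]


def cjk_block(cp):
    """Return the name of the block that contains the code point, or 'other'."""
    for name, first, last in CJK_BLOCKS:
        if first <= cp <= last:
            return name
    return "other"


def block_counts(code_points):
    """Count how many of the given code points fall into each block."""
    return Counter(cjk_block(cp) for cp in code_points)


print(block_counts([0x3577, 0x4CB3, 0x5F5E]))
```

For the three sample code points, two fall in Extension A and one in the URO.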
That leaves a mere 12 T-tagged IICore ideographs outside the scope of the first three planes of CNS 11643. Six of them map to CNS 11643 Plane 4 (with half being in Extension A), one maps to Plane 5, and two map to Plane 15. The three tables below provide their details:
|Ideograph||kIICore||kIRG_TSource—CNS 11643 Plane 4|
|Ideograph||kIICore||kIRG_TSource—CNS 11643 Plane 5|
|Ideograph||kIICore||kIRG_TSource—CNS 11643 Plane 15|
The three remaining ideographs are the only somewhat suspicious ones in that they do not have a kIRG_TSource property value, but are related to ideographs that are tagged “T” in IICore and are in CNS 11643 Plane 1 or 2, per the table below:
|Ideograph||kIICore||Other Source References||Related Ideograph|
|嗬 U+55EC||BGTH||G0-6040, H-8F52||呵 U+5475|
|礴 U+7934||BGT||G0-6D67, H-FEE8, J13-7932, KP1-6109, K2-4D65||礡 U+7921|
|繊 U+7E4A||ATJ||GE-3858, J0-4121, KP1-67CC, K2-5330||纖 U+7E96|
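Whether an ideograph has a Taiwan source can be checked mechanically from its source references. The sketch below is illustrative only: `t_planes` is a hypothetical helper, and it assumes that Taiwan references are written as "T", the plane number in hexadecimal, a hyphen, and four hexadecimal digits (so that Plane 15 would appear as "TF").

```python
import re

# Assumed shape of a Taiwan source reference: T<plane in hex>-<four hex digits>.
T_SOURCE = re.compile(r"T([0-9A-F]+)-[0-9A-F]{4}")


def t_planes(refs):
    """Return the CNS 11643 plane numbers found in a comma-separated reference list."""
    planes = []
    for ref in refs.split(","):
        match = T_SOURCE.fullmatch(ref.strip())
        if match:
            planes.append(int(match.group(1), 16))
    return planes


# Source references of the three ideographs in the table above.
OTHER_SOURCES = {
    0x55EC: "G0-6040, H-8F52",
    0x7934: "G0-6D67, H-FEE8, J13-7932, KP1-6109, K2-4D65",
    0x7E4A: "GE-3858, J0-4121, KP1-67CC, K2-5330",
}

for cp, refs in OTHER_SOURCES.items():
    status = t_planes(refs) or "no Taiwan source"
    print(f"U+{cp:04X} {chr(cp)}: {status}")
```

All three ideographs come back without a Taiwan source, which is what makes their “T” tags look suspicious.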
The only actions that I can suggest are to tag U+5F5E 彞 “T” in IICore, and for Taiwan to consider a horizontal extension for U+55EC 嗬, U+7934 礴, and U+7E4A 繊.
Hong Kong SAR
A total of 5,224 ideographs are tagged “H” in IICore. When I compared these against the 5,401 ideographs in Big Five Level 1, I discovered that 577 of the Level 1 ideographs are not tagged “H.” This leaves 400 H-tagged ideographs outside Big Five Level 1, 171 of which map to Big Five Level 2, and the remaining 229 map to Hong Kong SCS proper (24 are in Extension A, 61 are in Extension B, and the remaining 144 are in the URO).
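The figures in this paragraph can be cross-checked with a few lines of arithmetic. The sketch reads the 577 as the number of Big Five Level 1 ideographs that are not tagged “H,” which is the reading under which the totals add up; the variable names are only for illustration.

```python
h_tagged = 5224      # ideographs tagged "H" in IICore
big5_level1 = 5401   # ideographs in Big Five Level 1
level1_not_h = 577   # Level 1 ideographs that are not tagged "H"

# H-tagged ideographs that are in Big Five Level 1, and those that are not.
h_in_level1 = big5_level1 - level1_not_h
h_outside_level1 = h_tagged - h_in_level1

# Breakdown of the ideographs outside Level 1, as given in the text.
big5_level2 = 171
hkscs = {"Extension A": 24, "Extension B": 61, "URO": 144}

assert h_outside_level1 == big5_level2 + sum(hkscs.values())
print(h_in_level1, h_outside_level1, sum(hkscs.values()))
```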
All looks okay until we consider Hong Kong SCS-2016, which added 24 new characters, 22 of which are best described as the preferred Hong Kong SAR forms of existing Big Five ideographs. Of these 22 ideographs, 14 have corresponding Big Five versions that are tagged “H” in IICore, which strongly suggests that these 14 should be tagged “H” if already present in IICore, or added to IICore and tagged “H.” The following table provides the details:
|HKSCS-2016||kIICore||Big Five Level 1||kIICore|
|兑 U+5151||AG||兌 U+514C||ATJHKMP|
|吿 U+543F||n/a||告 U+544A||AGTJHKMP|
|媪 U+5AAA||CG||媼 U+5ABC||ATJHKM|
|悦 U+60A6||AGJ||悅 U+6085||ATHKMP|
|愠 U+6120||CG||慍 U+614D||ATHM|
|氲 U+6C32||n/a||氳 U+6C33||ATH|
|税 U+7A0E||AGJ||稅 U+7A05||ATHKMP|
|脱 U+8131||AGJ||脫 U+812B||ATHKMP|
|藴 U+85F4||n/a||蘊 U+860A||ATJHKMP|
|蜕 U+8715||AG||蛻 U+86FB||ATHM|
|説 U+8AAC||AJ||說 U+8AAA||ATHKMP|
|醖 U+9196||n/a||醞 U+919E||ATHM|
|鋭 U+92ED||AJ||銳 U+92B3||ATHKMP|
|閲 U+95B2||AJ||閱 U+95B1||ATHKMP|
Macao SAR
A total of 4,955 ideographs are tagged “M” in IICore. When I compared these against the 5,401 ideographs in Big Five Level 1, I discovered that 739 are not included. This leaves 283 ideographs, 223 of which map to Big Five Level 2, and 59 of which map to HKSCS (two are in Extension A, eight are in Extension B, and the remaining 49 are in the URO). The one remaining ideograph, U+5F66 彦, stands out as odd in that its source references don’t suggest Macao SAR use. Its related ideograph, U+5F65 彥, is also tagged “M” in IICore (ATHM), and its source references, particularly T1-507D, more strongly suggest Macao SAR use. The table below provides more details about these two ideographs:
|kIICore—AGJKMP||Source References||kIICore—ATHM||Source References|
|彦 U+5F66||G0-5165, J0-4927, KP0-F8BA, K0-6569, T3-2C50||彥 U+5F65||GE-2955, HB1-ABDB, KP1-41F9, T1-507D|
In addition, 13 of the 14 ideographs—meaning all except for U+6C32 氲—in the first column of the table in the “Hong Kong SAR” section above should probably be tagged “M” in IICore, because Macao SAR has similar regional conventions, and because the ideographs in the third column are already tagged “M” in IICore.
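Both tagging recommendations follow from the same rule applied to the table in the “Hong Kong SAR” section: if the Big Five form carries a region tag and the HKSCS-2016 form does not, the HKSCS-2016 form is a candidate for that tag. The sketch below transcribes that table; `has_tag` and `candidates` are hypothetical helper names, and `None` stands for the “n/a” entries (not present in IICore).

```python
# (HKSCS-2016 form, its kIICore value or None, Big Five form, its kIICore value)
PAIRS = [
    (0x5151, "AG", 0x514C, "ATJHKMP"),
    (0x543F, None, 0x544A, "AGTJHKMP"),
    (0x5AAA, "CG", 0x5ABC, "ATJHKM"),
    (0x60A6, "AGJ", 0x6085, "ATHKMP"),
    (0x6120, "CG", 0x614D, "ATHM"),
    (0x6C32, None, 0x6C33, "ATH"),
    (0x7A0E, "AGJ", 0x7A05, "ATHKMP"),
    (0x8131, "AGJ", 0x812B, "ATHKMP"),
    (0x85F4, None, 0x860A, "ATJHKMP"),
    (0x8715, "AG", 0x86FB, "ATHM"),
    (0x8AAC, "AJ", 0x8AAA, "ATHKMP"),
    (0x9196, None, 0x919E, "ATHM"),
    (0x92ED, "AJ", 0x92B3, "ATHKMP"),
    (0x95B2, "AJ", 0x95B1, "ATHKMP"),
]


def has_tag(value, tag):
    """True if a kIICore value carries the region tag; None means not in IICore."""
    return value is not None and tag in value[1:]


def candidates(tag):
    """HKSCS-2016 forms lacking the tag whose Big Five counterpart already has it."""
    return [new for new, new_value, _old, old_value in PAIRS
            if has_tag(old_value, tag) and not has_tag(new_value, tag)]


for tag in "HM":
    found = candidates(tag)
    print(tag, len(found), " ".join(f"U+{cp:04X}" for cp in found))
```

Run against the table, the rule yields all 14 ideographs for “H” and 13 for “M,” with U+6C32 氲 dropping out because U+6C33 氳 is not tagged “M.”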
Interestingly, I never mentioned the kIRG_MSource property in the previous paragraphs, because none of the M-tagged ideographs in IICore have such source references. Given the fairly close relationship with Big Five and HKSCS, comparing against those sets seemed appropriate, and as it turned out, was completely appropriate.