source-han-sans's People

Contributors

danrhatigan, davelab6, punchcutter

source-han-sans's Issues

Interpolation error in U+7B98

The Japanese glyph for U+7B98 (uni7B98-JP) has an interpolation error that affects all weights except for ExtraLight. This affects only the Japanese fonts and Japanese OTC instances. The Simplified Chinese (uni7B98-CN) and Traditional Chinese (uni7B98-TW) glyphs do not exhibit this issue.

Suggestion about adding some mappings (no new glyphs needed) & save 3 glyphs

The following code points can be covered with the existing glyphs:

U+20457 𠑗 CID+2621
U+20F96 𠾖 CID+13371
U+21428 𡐨 CID+14363
U+237EC 𣟬 CID+22859
U+23F7D 𣽽 CID+24800
U+23F9E 𣾞 CID+60334
U+2420E 𤈎 CID+5278
U+24968 𤥨 CID+26945
U+24A01 𤨁 CID+27076
U+25133 𥄳 CID+28315
U+2592E 𥤮 CID+29809
U+25C83 𥲃 CID+30633
U+25CBB 𥲻 CID+30603
U+26900 𦤀 CID+33613
U+284DC 𨓜 CID+40542
U+28E93 𨺓 CID+43344
U+29460 𩑠 CID+44103
U+29516 𩔖 CID+44210
U+29C18 𩰘 CID+9382

U+FA72 全 CID+11275
U+FA76 勇 CID+11844
U+FA77 勺 CID+11930
U+FA78 喝 CID+13009
U+FA7B 嗢 CID+13137
U+FA7C 塚 CID+14185
U+FA82 廒 CID+17461
U+FA8A 慠 CID+18597
U+FA8B 懲 CID+18874
U+FA8E 搜 CID+19779
U+FA90 敖 CID+20351
U+FA91 晴 CID+20838
U+FA92 朗 CID+21116
U+FA96 殺 CID+23223
U+FA9A 漢 CID+24652
U+FA9C 煮 CID+25758
U+FAA0 猪 CID+26557
U+FAA3 画 CID+27479
U+FAA4 瘝 CID+60606
U+FAA5 瘟 CID+27888
U+FAA6 益 CID+28196
U+FAA8 直 CID+28268
U+FAAA 着 CID+28385
U+FAAB 磌 CID+29015
U+FAAD 節 CID+30477
U+FAAE 类 CID+30890
U+FAB0 練 CID+31674
U+FAB1 缾 CID+62520
U+FAB2 者 CID+32591
U+FAB5 蝹 CID+36589
U+FAB6 襁 CID+37423
U+FAB8 視 CID+37588
U+FAB9 調 CID+38147
U+FABA 諸 CID+38271
U+FABB 請 CID+38172
U+FABC 謁 CID+58907
U+FABD 諾 CID+38283
U+FAC4 醙 CID+41228
U+FAC6 陼 CID+43326
U+FAC7 難 CID+43527
U+FAC8 靖 CID+43763
U+FAC9 韛 CID+44003
U+FACA 響 CID+44063
U+FACC 頻 CID+44156
U+FACD 鬒 CID+45464
U+FAD4 䀹 CID+47521

I don't think it is a bad idea to cover these code points.

ps. You can save three glyphs by removing these:
CID+4122 (mapped to U+39B3 㦳)
CID+6950 (mapped to U+439B 䎛)
CID+60708 (mapped to U+25874 𥡴)

U+363D 㘽 and U+39B3 㦳, and U+3588 㖈 and U+439B 䎛 are exact duplicates, so you can map CID+3084 (mapped to U+363D) to both U+363D and U+39B3, and CID+2877 (mapped to U+3588) to both U+3588 and U+439B.
CID+60708 is the same as CID+29676 (Korean glyph of U+7A3D), so you can map CID+29676 to both U+7A3D 稽 (for Korean) and U+25874 𥡴.

This is similar to mapping CID+61476 to both U+29FCE 𩿎 and U+29FD7 𩿗, which is already done.
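The saving suggested above works because a cmap is many-to-one: several code points may point at the same glyph. Here is a minimal sketch in Python that uses a plain dict as a stand-in for the font's cmap. The dict and the `remap` helper are hypothetical, for illustration only; the CID values are the ones quoted in this issue.

```python
# Stand-in for a cmap: code point -> CID. Values are taken from this issue;
# a real font's cmap would be read with a font library instead.
cmap = {
    0x363D: 3084,    # 㘽
    0x39B3: 4122,    # 㦳, same outline as CID+3084
    0x3588: 2877,    # 㖈
    0x439B: 6950,    # 䎛, same outline as CID+2877
    0x7A3D: 29676,   # 稽 (Korean glyph)
    0x25874: 60708,  # 𥡴, same outline as CID+29676
}

# Proposed remapping: duplicate CID -> the CID that should be reused.
duplicates = {4122: 3084, 6950: 2877, 60708: 29676}


def remap(cmap, duplicates):
    """Return a new cmap in which duplicate CIDs are replaced by their twins."""
    return {cp: duplicates.get(cid, cid) for cp, cid in cmap.items()}


new_cmap = remap(cmap, duplicates)
# Number of distinct glyphs that are no longer referenced after remapping.
saved = len(set(cmap.values())) - len(set(new_cmap.values()))
print(saved)
```

Every code point stays covered after the remapping; only the number of distinct glyphs referenced goes down.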

Create a proper UniSourceHanSansHK

The Hong Kong standardized forms and social norms depart from the Taiwanese MOE variants to the extent that certain characters are exact replicas of the PRC variants rather than the Taiwanese MOE ones. As a result, UniSourceHanSansTWHK is actually unfit for daily use in Hong Kong.

The main standard, 《香港常用字字表》 ("List of Graphemes of Commonly-used Chinese Characters"), is now incorporated into 《香港小學學習字詞表》; an online version can be found at http://www.edbchinese.hk/lexlist_ch/. This list of graphemes is the de facto standard, as all Kai typefaces in primary and secondary school textbooks are required to adhere to it. Local dictionaries also adhere to this standard rather than to the ROC MOE one.

There is also a set of guidelines for the IT industry prepared by the Chinese Language Interface Advisory Committee (CLIAC). Although the guideline was drafted to cover all CJK characters in the BMP and CJK Extension A, it is specified as a list of "basic components", which can also be applied to newly created characters. Certain ROC MOE variations are deemed incorrect, especially 周、告、骨. The guideline also requires that certain characters use a glyph form that is already specified by another code point*.

*The character 兌 in the Big5 character set is required to be mapped to code point U+514C. However, the standard in Hong Kong is to render it exactly as 兑, which happens to be the form mapped to U+5151 by GBK. Therefore, the guideline requires both U+514C and U+5151 to be rendered as 兑 (and similarly for all characters that use this component).

The characters generally fall into five types, roughly in order of prevalence:

  • Exact Replicas of both PRC & ROC Variants
  • Exact replicas of ROC variants
  • Exact replicas of PRC variants
  • Contain components similar to PRC and ROC variants
  • Contain components that do not exist in either PRC or ROC variants

Exact Replicas of both PRC & ROC Variants

  • e.g. 我你什共們原的話**

Exact Replicas of ROC Variants:

  • e.g. 他甚花草這勻鈞敢茲今

Exact Replicas of PRC Variants:

  • e.g. 示隸黃廣總統
  • e.g. those that involve "月" "次" "户" "兌" components: 有育胃臂資扇請 資次 肩房扁 說銳

Contain components similar to PRC and ROC variants

  • 脫能化

Contain components that do not exist in either PRC or ROC variants

  • 滋、磁、龜、告、周

**《香港電腦漢字字形參考指引》 specifies a different glyph in which the top-right component is a 干 instead of a 千. This deviates from the form specified in 《香港常用字字表》. Because the standard was prepared by a committee of teachers and experts and is the de facto standard, while the guideline was prepared by non-experts and is only advisory, the standard should prevail.

As most characters can be directly re-used, the work should be feasible in a short span of time.

Gridfit

Gridfitted TTFs are extremely useful for low-density display devices and web use; however, there is still no CJK font that is well fitted. The gridfits of MSYH often make characters ragged because they lack blue zone alignments.

Manually adding gridfits is impractical given the number of characters. Would it be possible to create an automatic algorithm that generates gridfits for thousands of characters? Adding it to ttfautohint might be a great help, because people may reuse Source Han Sans in their own composite fonts and regenerate the gridfits.

Confirmed JP glyph errors (issue with "Heavy" master)—consolidated

Issues #19, #20, #21, and #24 have been closed and consolidated in this issue. Two additional issues, for U+9A2B and U+9AFA, have been added here.

The Japanese glyph for U+617E (uni617E-JP; CID+18660) has an issue with the Heavy master that affects all weights except for ExtraLight due to interpolation. This issue also affects Korean (it corresponds to a hanja in KS X 1001).

The Japanese glyph for U+7B8F (uni7B8F-JP; CID+30378) has an issue with the Heavy master that affects all weights except for ExtraLight due to interpolation. This issue also affects Korean (it corresponds to a hanja in KS X 1001).

The Japanese glyph for U+7B98 (uni7B98-JP; CID+30400) has an issue with the Heavy master that affects all weights except for ExtraLight due to interpolation. This affects only Japanese.

The Japanese glyph for U+9A2B (uni9A2B-JP; CID+45034) has an issue with the Heavy master that affects all weights except for ExtraLight due to interpolation. This issue also affects Korean (it corresponds to a hanja in KS X 1001).

The Japanese glyph for U+9AFA (uni9AFA-JP; CID+45409) has an issue with the Heavy master that affects all weights except for ExtraLight due to interpolation. This issue also affects Korean (it corresponds to a hanja in KS X 1002).

The Japanese glyph for U+9DD9 (uni9DD9-JP; CID+46597) has an issue with the Heavy master that affects all weights except for ExtraLight due to interpolation. This issue also affects Korean (it corresponds to a hanja in KS X 1002).

The table below shows all seven weights.

six-jp-errors
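To see why a fault in the Heavy master shows up in every weight except ExtraLight, here is a minimal sketch that assumes plain linear interpolation between two masters. The coordinates and the interpolation positions `t` are made up for illustration and are not taken from the font sources.

```python
def interpolate(light, heavy, t):
    """Linearly interpolate two compatible point lists; t=0 is the light master."""
    return [
        (lx + t * (hx - lx), ly + t * (hy - ly))
        for (lx, ly), (hx, hy) in zip(light, heavy)
    ]


# Hypothetical outline points for one glyph in each master.
extralight = [(100, 0), (120, 700)]
heavy_good = [(80, 0), (200, 700)]
heavy_bad = [(80, 0), (200, 650)]  # one misplaced point in the Heavy master

for t in (0.0, 0.25, 0.5, 1.0):
    good = interpolate(extralight, heavy_good, t)
    bad = interpolate(extralight, heavy_bad, t)
    # The two results agree only when the Heavy master contributes nothing.
    print(t, good == bad)
```

Only at `t = 0` (the ExtraLight master itself) does the misplaced point drop out. Every other instance inherits a scaled share of the error, which matches the pattern reported above.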

Is there any chance of releasing a TTF version?

I know OTF has a lot of advantages over TTF.

As a Java programmer, I have to admit that many Java Swing-based IDEs do not support OTF well, especially for non-Latin characters.
Is there any chance of releasing a TTF version, as with Source (Sans | Serif | Code) Pro, or of letting us compile one from source?

Thanks.

Problems with Bopomofo symbols

rect3091

According to the image, the most significant problem is that ㄍ (U+310D BOPOMOFO LETTER G) is not aligned in the Heavy weight, and that ㄢ (U+3122 BOPOMOFO LETTER AN), ㄣ (U+3123 BOPOMOFO LETTER EN), and ㄥ (U+3125 BOPOMOFO LETTER ENG) are not the same width as the other letters and seem a bit slanted.

OTF files are redundant

It would be nice to remove the */OTC/*.otf files, which can in theory be built from sources. These OTF files take up about 481 MB of this repository.

$ find . -name '*.otf' -print | xargs du -ch
 16M    ./Bold/OTC/SourceHanSans-Bold.otf
 18M    ./Bold/OTC/SourceHanSansK-Bold.otf
 18M    ./Bold/OTC/SourceHanSansSC-Bold.otf
 18M    ./Bold/OTC/SourceHanSansTC-Bold.otf
 14M    ./ExtraLight/OTC/SourceHanSans-ExtraLight.otf
 17M    ./ExtraLight/OTC/SourceHanSansK-ExtraLight.otf
 17M    ./ExtraLight/OTC/SourceHanSansSC-ExtraLight.otf
 17M    ./ExtraLight/OTC/SourceHanSansTC-ExtraLight.otf
 17M    ./Heavy/OTC/SourceHanSans-Heavy.otf
 19M    ./Heavy/OTC/SourceHanSansK-Heavy.otf
 19M    ./Heavy/OTC/SourceHanSansSC-Heavy.otf
 19M    ./Heavy/OTC/SourceHanSansTC-Heavy.otf
 16M    ./Light/OTC/SourceHanSans-Light.otf
 17M    ./Light/OTC/SourceHanSansK-Light.otf
 17M    ./Light/OTC/SourceHanSansSC-Light.otf
 17M    ./Light/OTC/SourceHanSansTC-Light.otf
 16M    ./Medium/OTC/SourceHanSans-Medium.otf
 17M    ./Medium/OTC/SourceHanSansK-Medium.otf
 17M    ./Medium/OTC/SourceHanSansSC-Medium.otf
 17M    ./Medium/OTC/SourceHanSansTC-Medium.otf
 16M    ./Normal/OTC/SourceHanSans-Normal.otf
 17M    ./Normal/OTC/SourceHanSansK-Normal.otf
 17M    ./Normal/OTC/SourceHanSansSC-Normal.otf
 17M    ./Normal/OTC/SourceHanSansTC-Normal.otf
 16M    ./Regular/OTC/SourceHanSans-Regular.otf
 17M    ./Regular/OTC/SourceHanSansK-Regular.otf
 17M    ./Regular/OTC/SourceHanSansSC-Regular.otf
 17M    ./Regular/OTC/SourceHanSansTC-Regular.otf
481M    total
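For anyone without `find` and `du` at hand, the same measurement can be sketched in Python. The `total_size` helper is a hypothetical name, and it sums raw file sizes rather than disk blocks, so the figure may differ slightly from what `du` reports.

```python
from pathlib import Path


def total_size(root, pattern="*.otf"):
    """Sum the sizes in bytes of all files under root that match pattern."""
    return sum(p.stat().st_size for p in Path(root).rglob(pattern) if p.is_file())


# Example use from the repository root:
#   print(f"{total_size('.') / 2**20:.0f} MiB")
```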

Placement of hangul tone marks (bangjeom)

A hangul tone mark (U+302E and U+302F) modifies the hangul syllable preceding it, not the one succeeding it.

For example, 한〮글 (U+D55C U+302E U+AE00) should be displayed something like
·한

not something like

·글

But Source Han Sans displays the latter.
bangjeom
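The expected grouping can be written down as a small sketch. It assumes each syllable is a single precomposed code point, which is a simplification (conjoining jamo sequences would need real cluster segmentation), and `attach_tone_marks` is a hypothetical helper name.

```python
TONE_MARKS = {"\u302E", "\u302F"}  # HANGUL SINGLE DOT / DOUBLE DOT TONE MARK


def attach_tone_marks(text):
    """Group each tone mark with the character that precedes it."""
    clusters = []
    for ch in text:
        if ch in TONE_MARKS and clusters:
            # A tone mark belongs to the syllable before it.
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# U+D55C U+302E U+AE00: the mark should end up in the first cluster.
print(attach_tone_marks("\uD55C\u302E\uAE00"))
```

With this grouping, the dot is drawn next to 한 rather than 글, which is the behavior requested above.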

calt duplicates jamo features

The calt feature in Noto Sans CJK Regular includes all lookups from ljmo / vjmo / tjmo. HarfBuzz applies calt in Hangul. This causes shaping of a sequence like U+11A2 to fail.

I'm guessing that this was added for shaping to work with engines that don't have a dedicated Hangul shaper. I'm going to disable calt in HarfBuzz, but wanted to make sure that was the intention.
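The overlap described here can be checked mechanically once the feature-to-lookup assignments have been dumped from the font. Below is a minimal sketch over a plain dict; the lookup indices are invented for illustration and `shared_lookups` is a hypothetical helper, not part of HarfBuzz or any font tool.

```python
def shared_lookups(features, tag, others):
    """Return the lookup indices that `tag` shares with any feature in `others`."""
    own = set(features.get(tag, ()))
    shared = set()
    for other in others:
        shared |= own & set(features.get(other, ()))
    return sorted(shared)


# Hypothetical lookup indices, for illustration only.
features = {
    "ljmo": [10],
    "vjmo": [11],
    "tjmo": [12],
    "calt": [10, 11, 12],
}

print(shared_lookups(features, "calt", ["ljmo", "vjmo", "tjmo"]))
```

A non-empty result means a shaper that applies both the jamo features and calt would run the same lookups twice.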

Move built fonts to Github Release/SourceForge Files

It's not a good idea to keep those huge files in the source tree, since developers with the source code can simply build them when needed.

Moving those files to GitHub Releases, along with a README, is a better solution.

For the binary files that already exist in the repository, there is also a way to clean up:

git filter-branch --force --index-filter \
  'git rm -r --cached --ignore-unmatch --quiet -- "*.otf"' \
  --prune-empty --tag-name-filter cat -- --all

This will rewrite all the commits, but it doesn't matter much since there is only one commit now.

  • Actually the problem also exists in other Adobe Open Source fonts.

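Please restore the truly traditional forms for Traditional Chinese

I think the glyph forms used for Traditional Chinese are seriously problematic. Judging from the samples, the forms lean toward those of the Taiwan Ministry of Education. However, Traditional Chinese is used in many regions, such as Hong Kong, Macau, and overseas Chinese communities, and those regions do not treat the Taiwan forms as the norm. Moreover, even in Taiwan, older books and periodicals, and even today's newspapers, mainly use forms other than the MOE ones. What people mainly use are the orthodox forms found in traditional dictionaries, which some call "old glyph forms" (舊字形) and which Japanese friends call "Kangxi Dictionary style" (康熙字典體). This does not refer to a certain much-abused font, but to the head-character forms in the original Tongwen Shuju (同文書局) edition of the Kangxi Dictionary (《康熙字典》). These forms are well grounded in character etymology, and they are also more attractive from a typographic standpoint. The Taiwan MOE forms, by contrast, forcibly distort printing typefaces such as Ming (明體) and Hei (黑體) by imposing regular-script (Kai) handwriting forms; they lack etymological grounding and are not attractive, and many people have criticized them. I sincerely thank Adobe for its contribution, but I very much hope that Adobe can change the Traditional Chinese glyphs back to the truly orthodox Kangxi Dictionary forms (the "old glyph forms") rather than Taiwan's forms, which distort Hei type with handwritten Kai. I would be most grateful!

I do not object to Taiwanese users who want the Taiwan MOE forms, but other Traditional Chinese users should also be given room to use the truly traditional forms, by splitting "Taiwan" and "Traditional" into two separate styles, rather than being forced to follow Taiwan's Kai-distorted Hei forms.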

Apostrophe has incorrect width

The apostrophe in Source Han Sans has a problematic width. It is significantly wider (almost full-width) than its Latin counterpart.

Here's a comparison among Helvetica, Noto Sans (Latin), Noto Sans T Chinese, and Source Han Sans TWHK:
2014-07-20 1 56 10

Hope this could be fixed soon!

Changed to fallback font when typing

When I use the multilingual OTF version of the font and type Chinese characters with New ChangJie from Microsoft Office IME 2010 (Office 新倉頡 2010), the font changes to a fallback font as soon as I type the first radical: PMingLiU (新細明體) in Microsoft Word 2010, or AdobeMingStd (Adobe 明體 Std) in InDesign CS6. With the TWHK version, however, it works normally.

This may be because the IME inserts a U+3000 character into the input area when the first radical is typed, and the multilingual OTF version does not have this glyph, so the software applies the fallback font. I don't have the tools to verify this.

Interpolation error in U+9DD9

The Japanese glyph for U+9DD9 (uni9DD9-JP; CID+46597) has an interpolation error that affects all weights except for ExtraLight. This also affects Korean because it corresponds to a hanja in KS X 1002. The Simplified Chinese glyph (uni9DD9-CN) does not exhibit this issue.

Some font names differ from those on SourceForge

Sorry I'm not good at English....

For example,
h tg a p9 e5hl08 ou
The first one is from GitHub and the other is from SourceForge.
u59 p7v5 r7vji 52s uwv
The first one is from GitHub and the other is from SourceForge.

Though they are the same font, their names are different.

Splitting TWHK into TW & HK

I am opening this new issue to indicate how we plan to handle the Traditional Chinese situation as it pertains to TW (Taiwan) versus HK (Hong Kong) usage, and will be closing Issues #6, #17, #18, and #32 because the intention here is to indicate the action that will be taken. Please reference these four closed issues for specific user comments and suggestions.

For those who are interested, the background is that we (Adobe and Google) made a decision to follow the Taiwan MOE glyph standard for Big Five. For better or worse, and mainly for the sake of consistency, we also decided to apply the Taiwan MOE glyph standard to the characters that correspond to Hong Kong SCS. Hong Kong SCS is an add-on to Big Five, so from a particular point of view this makes sense. We very much appreciate the feedback from the community about this, and will attempt to remedy the issue, though it may take some time and effort.

We are targeting late August to release the first major update to Source Han Sans (and the Google-branded Noto Sans CJK) that will address most of the issues that have been reported and confirmed as bugs. Below is the current plan for addressing the HK issue:

First, all instances of TWHK will be changed to simply TW, but the glyphs will remain the same (other than any corrections). In other words, the character code coverage will still be Big Five and Hong Kong SCS, and in a way that still follows the Taiwan MOE glyph standard. Note that most of the Hong Kong SCS characters also correspond to CNS 11643 characters, in Plane 3 and beyond. The Traditional Chinese (TW) subset OTFs may be changed in that the HK-specific glyphs are removed. This has not yet been decided.

Second, experimental HK subset OTFs and HK font instances (OTC) will be added, which will attempt to address the community concerns by repurposing existing glyphs to the extent that is possible.

Lastly, about the suggestion to use Kangxi Dictionary–style glyphs, doing so is beyond the scope of this project. One particular difficulty of implementing such glyphs is that there is no Sans Serif (黑体/黑體) typeface design reference or standard.

Glyphs of U+119F ᆟ and U+11A1 ᆡ

The vertical stroke of U+119F ᆟ (preceded by a leading jamo (choseong), NOT followed by a final jamo (jongseong)) is too short compared to other vowels.
u119f

Text used in image:
ᄀퟅᄂퟅᄃퟅᄅퟅᄫퟅᅀퟅᅌퟅᅙퟅ
ᄀᆟᄂᆟᄃᆟᄅᆟᄫᆟᅀᆟᅌᆟᅙᆟ
ᄀퟆᄂퟆᄃퟆᄅퟆᄫퟆᅀퟆᅌퟆᅙퟆ
ᄀᆡᄂᆡᄃᆡᄅᆡᄫᆡᅀᆡᅌᆡᅙᆡ

Also, the vertical strokes of the default glyphs (before GSUB) of U+119F ᆟ and U+11A1 ᆡ are too short.
araea-aeoei

Text used in image:
ퟅ ᆟ ퟆ ᆡ

Old hangul and vertical writing mode on MS Word 2007 on Windows 7

Old hangul is not composed at all in the vertical writing mode of Microsoft Word 2007 on Windows 7.
Also, the default line spacing is too wide (see the size of the cursor).

Font: Source Han Sans Regular
Text size: 12 pt
Line spacing: 1.0, pressed "Remove Space After Paragraph" once
msword2007oldhangul

Text used in image:
사ᄅᆞᆷ마다ᄒᆡᅇᅧ수ᄫᅵ니겨날로ᄡᅮ메
便뼌安ᅙᅡᆫ킈ᄒᆞ고져ᄒᆞᇙᄯᆞᄅᆞ미니라

There is no problem in the horizontal writing mode.
msword2007oldhangulhorizontal

UniSourceHanSansTWHK should be renamed UniSourceHanSansTW

The name "UniSourceHanSansTWHK" suggests that this font is as suitable for Hong Kong as it is for Taiwan. However, this font neither adheres to the government standard nor matches the industry guidelines for fonts, and it does not even resemble the written forms in use by Hong Kong people.

In Hong Kong, there is a pair of standards/guidelines for proper glyph rendering: "the Standard", by the former Education Department (superseded by the Education Bureau), and "the Guideline", by the CLIAC of the Hong Kong Government.

The Standard is mandated by the Hong Kong Government in all primary and secondary Chinese textbooks, and closely resembles the written form in use by Hong Kong people.

The Guideline is an extension of the Standard to cover all typeable characters of Unicode, and applies to all fonts produced for local use. All mainstream Hong Kong newspapers and most printed articles, commercials and posters use fonts that all conform to the latter guideline.

The establishment of the local Standard and Guideline is partly due to the controversy that arose when the Taiwan MOE attempted to standardize Chinese glyphs; the standardized version departed immensely from the written forms that were then in current use. For the vast majority of commonly used characters, the standardized form, the form suggested by the Guideline, and the actual written form consistently depart from the Taiwan MOE standard. For many common characters, they depart to the extent that they are exact replicas of the standardized forms of other locales, such as those mandated in the PRC Standard (for characters such as 有,胃,育,夏,戶,以,化,嘅,次 etc.), or those specified in the Japanese Standards (for characters such as 骨,教 etc.).

In recent years, however, there has been an increase in printed materials that use fonts adhering to the Taiwan MOE Standard. These materials are usually prepared by individuals, and the increase is due to Microsoft JhengHei (a font that adheres to the Taiwan MOE standard) being the default font in newer versions of Microsoft Office. Still, the glyphs of these fonts are often (incorrectly) referred to as "電腦字" or "computer words", which colloquially suggests that these characters are artificially composed and not of "proper" descent.

Given that the font, by design, both departs significantly from the Hong Kong government's standard and fails to resemble Hong Kong people's general writing forms, it would be misleading to continue to refer to it as UniSourceHanSansTWHK.

Character 冒 (U+5192) is wrong

The upper part of 冒 is wrong. It comes from the character 冃 (U+5183), not 日 (U+65E5) or 曰 (U+66F0). The two horizontal lines should not be connected to the two vertical lines; that is, there should be a small gap on either side of the horizontal lines.

Here is a comparison of different typefaces:
comparison of different typeface

Problems with old hangul

  1. The default glyph (before GSUB) of U+D7B0 is wrong; it is the same as that of U+117D.
  2. When I put a final jamo (jongseong) after U+1198, U+1199, or U+D7B0, the final jamo does not form part of a syllable; it is left alone.
  3. The glyph of a leading jamo (choseong) followed by U+11A0 does not change.

oldhangulproblems

Text used in image:

  1. ᄋᆘᆨ ᅀᆙᇹ ᅙힰᇫ
  2. ᄀᆠᄂᆠᄃᆠᅀᆠᄫᆠ ᄀᆠᆨᄂᆠᆨᄃᆠᆨᅀᆠᆨᄫᆠᆨ

Two hanzi/hanja coverage suggestions

  1. U+9FD4 (tentative, ⿰钅哥) and U+2B7F7
    These two are the characters for recently synthesized chemical elements (copernicium (Cn, element 112) and livermorium (Lv, element 116), respectively). With the coverage of these two code points, Source Han Sans will be able to display all the characters for chemical elements. It would be great for a periodic table in Simplified Chinese.
    (For U+9FD4 (⿰钅哥), I guess we need to wait until the code point becomes stable.)
  2. 37 non-BMP ROK Inmyeongnyong Hanja (인명용 한자, hanja for personal names)
    These are used for personal names in the ROK (South Korea). (I guess this is less important than the first suggestion, because hanja is rarely used in Korean.)

U+200D7 U+2012C U+21155 U+21594 U+21727 U+21F5C U+22C6F U+22EE0 U+230FD U+23343 U+2363B U+23AD9 U+242F1 U+2439D U+24A01 U+2533E U+253B5 U+253FE U+2583A U+25978 U+25B97 U+26057 U+267D8 U+27A51 U+27B02 U+27E7D U+27F1B U+283F6 U+284DC U+28D8A U+28DA1 U+294DE U+29509 U+2983B U+2A2AD U+2C115 U+2C7D3

Inconsistent shapes of the hangul consonant hieuh (ㅎ)

Typically, there are two ways to design the hangul consonant hieuh (ㅎ). Which one is used depends on the font.
twohieuh

Source Han Sans seems to go with the first one, but I found two characters that have the second one: 휓 (U+D713) and 휗 (U+D717).
inconsistenthieuh

Although this is a very minor issue, I think there is no reason to leave the design of the same consonant inconsistent.

Unable to preview fonts in Win8.1

OS: Windows 8.1 pro 64bit, with all updates installed
Fonts affected:
SourceHanSansCN-1.000.zip
SourceHanSansJP-1.000.zip
SourceHanSansKR-1.000.zip
SourceHanSansTWHK-1.000.zip

Fonts NOT affected: SourceHanSansOTF-1.000.zip
(All fonts are downloaded from http://sourceforge.net/projects/source-han-sans.adobe/files/)

Steps to reproduce:

  1. Install the affected fonts.
  2. Try to preview them in Windows' font setting.
    (Optional: Restart Windows before previewing.)

Actual result:
The preview window pops up very slowly. Sometimes Windows says it is not responding. In the end no characters are shown.
Video of the slow loading: http://gfycat.com/ConsiderateCommonCurassow
Pic of the final result: https://i.imgur.com/fJzIBqB.png

Expected result:
The preview pops up instantly and characters are shown.

Note:
The affected fonts actually work in common usage, e.g. in LibreOffice and Adobe Photoshop. However, I cannot use them in BabelMap (http://www.babelstone.co.uk/Software/BabelMap.html), as it works similarly to a font preview program.

Cross reference from Noto Sans: https://code.google.com/p/noto/issues/detail?id=65

Glyphs of U+F92C 郎 and U+F9B8 隸

The glyphs of U+F92C 郎 and U+F9B8 隸 should be the same as those of U+90DE 郞 and U+96B7 隷, not those of U+90CE 郎 and U+96B8 隸.
U+F92C and U+F9B8 are originally duplicates of U+90DE and U+96B7 (not U+90CE and U+96B8) respectively; the canonical mappings are wrong.
(ROK wanted to fix the canonical mappings, but it couldn't. So it proposed to add U+FA2E 郞 and U+FA2F 隷 with the correct canonical mappings (U+90DE and U+96B7).)

Related documents:
http://appsrv.cse.cuhk.edu.hk/~irg/irg/irg33/IRGN1614CDcomm_KR.pdf
http://appsrv.cse.cuhk.edu.hk/~irg/irg/irg33/IRGN1638T19T20.pdf
http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3419.pdf
http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3747.pdf
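The canonical mappings in question can be inspected with Python's standard `unicodedata` module. This sketch assumes a Python build whose Unicode database includes U+FA2E and U+FA2F; `canonical_target` is a hypothetical helper name.

```python
import unicodedata


def canonical_target(ch):
    """Return the character that a compatibility ideograph normalizes to (NFC)."""
    return unicodedata.normalize("NFC", ch)


# The two disputed code points, followed by the two added later as corrections.
for compat in ("\uF92C", "\uF9B8", "\uFA2E", "\uFA2F"):
    unified = canonical_target(compat)
    print(f"U+{ord(compat):04X} -> U+{ord(unified):04X}")
```

Because normalization replaces U+F92C and U+F9B8 with U+90CE and U+96B8, any text pipeline that normalizes will lose the distinction regardless of which glyph the font assigns to the compatibility code points.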

Font weights Normal and Regular are not distinguishable in fontconfig 2.11.0

Fontconfig is used by almost every GNU/Linux distribution to manage fonts. In fontconfig, the font weight between Light and Regular is Book, and Normal and Regular are considered the same weight. As a result, on GNU/Linux one of the two cannot be selected from the font chooser.

Note that Fontconfig and Pango have their own ideas of how a particular weight should be named, so the following screenshot does not map to the actual font names.

% fc-list 'Source Han Sans TC' family style weight
Source Han Sans TC,思源黑體,思源黑體 Normal,Source Han Sans TC Normal:style=Normal,Regular:weight=80
Source Han Sans TC,思源黑體,思源黑體 Heavy,Source Han Sans TC Heavy:style=Heavy,Bold:weight=210
Source Han Sans TC,思源黑體,思源黑體 Light,Source Han Sans TC Light:style=Light,Regular:weight=50
Source Han Sans TC,思源黑體,思源黑體 Medium,Source Han Sans TC Medium:style=Medium,Regular:weight=100
Source Han Sans TC,思源黑體,思源黑體 ExtraLight,Source Han Sans TC ExtraLight:style=ExtraLight,Regular:weight=0
Source Han Sans TC,思源黑體,思源黑體 Regular,Source Han Sans TC Regular:style=Regular:weight=80
Source Han Sans TC,思源黑體,思源黑體 Bold,Source Han Sans TC Bold:style=Bold:weight=200

2014-07-16-161101_543x558_scrot
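The collision can be picked out of the `fc-list` output above with a few lines of Python. This is a rough sketch that assumes the `family:style=...:weight=...` layout shown in this issue; `styles_by_weight` is a hypothetical helper name.

```python
from collections import defaultdict

# Output copied from the fc-list command in this issue.
FC_LIST_OUTPUT = """\
Source Han Sans TC,思源黑體,思源黑體 Normal,Source Han Sans TC Normal:style=Normal,Regular:weight=80
Source Han Sans TC,思源黑體,思源黑體 Heavy,Source Han Sans TC Heavy:style=Heavy,Bold:weight=210
Source Han Sans TC,思源黑體,思源黑體 Light,Source Han Sans TC Light:style=Light,Regular:weight=50
Source Han Sans TC,思源黑體,思源黑體 Medium,Source Han Sans TC Medium:style=Medium,Regular:weight=100
Source Han Sans TC,思源黑體,思源黑體 ExtraLight,Source Han Sans TC ExtraLight:style=ExtraLight,Regular:weight=0
Source Han Sans TC,思源黑體,思源黑體 Regular,Source Han Sans TC Regular:style=Regular:weight=80
Source Han Sans TC,思源黑體,思源黑體 Bold,Source Han Sans TC Bold:style=Bold:weight=200
"""


def styles_by_weight(fc_list_output):
    """Map each fontconfig weight to the first style name of every face using it."""
    groups = defaultdict(list)
    for line in fc_list_output.splitlines():
        if not line.strip():
            continue
        # Everything after the family field is a key=value pair.
        fields = dict(part.split("=", 1) for part in line.split(":")[1:])
        groups[int(fields["weight"])].append(fields["style"].split(",")[0])
    return groups


collisions = {w: s for w, s in styles_by_weight(FC_LIST_OUTPUT).items() if len(s) > 1}
print(collisions)
```

Any weight that maps to more than one style is a pair of faces that fontconfig cannot tell apart by weight alone.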

Improve the design of 辶 in TW/HK version

This issue refers to the actual rendering form of radical #162, i.e. the 辶 (U+8FB6) component in the Hong Kong/Taiwan version. This applies to all characters that belong to the said radical, e.g. 這遇過選遍 etc.

Background

The design of 辶 in Source Han Sans is different from the MoE standard and from other font vendors.

Here is an excerpt from MoE's "Chinese Character Fong Style" document:

8fb6-1-moe

And below illustrates how 辶 is drawn in Source Han Sans TWHK:

8fb6-2-sourcehansans

For comparison, here is how the same 辶 is rendered in other fonts:

8fb6-3-3rdparty

One can immediately notice the stroke differences:

8fb6-2

It looks like the 辶 in Source Han Sans TWHK resembles that of MoE's Song style:

8fb6-5-song

The problem

To me, it is completely OK to be different and not follow the MoE standard 100%. This issue isn't about not following the standard, but about the design of this component, which is not appealing:

a) The left side of the 辶 component looks a bit thin. This is more obvious when compared with other fonts. It gives me the feeling that the radical is separated from the component it surrounds. The most obvious example is "導".

b) The second "乛" is shifted rightward compared to the first "乛". This causes a "stair" effect, which makes the component look unbalanced. (It also seems to be the reason why the component feels "thin".)

8fb6-6-b

c) In addition, since the upper "乛" is shifted to the left, the empty space is obvious when the upper part of the top-right component is smaller than the lower part (like 過).

8fb6-6-c

d) The decoration on the corner of the second "乛" looks OK at small sizes. However, when it is viewed at a larger font size, it not only fails to enhance the viewing experience, but also gives the impression that the line was meant to be straight but was accidentally cut or ground down.

8fb6-6-d

e) Finally, the "spike" of the last stroke spoiled the smoothness of the left-slanting downward stroke.

8fb6-6-e

Closing

It seems that the TWHK form of 辶 doesn't play well with other components. It is a bit complicated and unbalanced, which makes me think the component is handwritten.

It is hard to tell whether Source Han Sans deliberately chose a different presentation in Hei (Gothic) style to differentiate itself from other vendors. While I agree with the need to be different, my opinion is that the style in Source Han Sans lacks aesthetics.

It would be appreciated if the team could consider improving the look of this component.

Proposal to extend the coverage of T-Chinese to Japanese Jinmeiyō kanji (人名用漢字)

Sometimes people need to write Japanese names in Traditional Chinese.

The Japanese Jinmeiyō kanji can be divided into 4 categories.

  1. The character is already included in Big5 and HKSCS
    Do nothing
  2. The character is unique but is included in neither Big5 nor HKSCS
    3 characters: 凪 U+51EA, 雫 U+96EB, 匂 U+5302
    Suggestion: cover these characters in T-Chinese
  3. The character is a variant and uses the same glyph for both T-Chinese and Japanese, or an existing glyph
    e.g. 歩, 桜, 来, 亜...
    Suggestion: add the mapping
  4. The character is a variant but uses different glyphs for T-Chinese and Japanese
    Suggestion: add a new glyph if possible, or ignore it.

人名用漢字の一覧
http://ja.wiktionary.org/wiki/Wiktionary:%E4%BA%BA%E5%90%8D%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7

常用漢字一覧
http://ja.wiktionary.org/wiki/Wiktionary:%E5%B8%B8%E7%94%A8%E6%BC%A2%E5%AD%97%E3%81%AE%E4%B8%80%E8%A6%A7

Besides, there are 12 Japanese kanji characters in FA0E-FA29:
﨎, 﨏, 﨑, 﨓, 﨔, 﨟, 﨡, 﨣, 﨤, 﨧, 﨨, 﨩
I'm not sure whether they are common in Japanese names.

GSUB (and GPOS?) bloated

I see multiple copies of the same features. E.g.:

  <FeatureRecord index="0"> 
    <FeatureTag value="aalt"/> 
    <Feature> 
      <!-- LookupCount=2 --> 
      <LookupListIndex index="0" value="0"/> 
      <LookupListIndex index="1" value="1"/> 
    </Feature> 
  </FeatureRecord> 
  <FeatureRecord index="1"> 
    <FeatureTag value="aalt"/> 
    <Feature> 
      <!-- LookupCount=2 --> 
      <LookupListIndex index="0" value="0"/> 
      <LookupListIndex index="1" value="1"/> 
    </Feature> 
  </FeatureRecord> 
  <FeatureRecord index="2"> 
    <FeatureTag value="aalt"/> 
    <Feature> 
      <!-- LookupCount=2 --> 
      <LookupListIndex index="0" value="0"/> 
      <LookupListIndex index="1" value="1"/> 
    </Feature> 
  </FeatureRecord> 
  <FeatureRecord index="3"> 
    <FeatureTag value="aalt"/> 
    <Feature> 
      <!-- LookupCount=2 --> 
      <LookupListIndex index="0" value="0"/> 
      <LookupListIndex index="1" value="1"/> 
    </Feature> 
  </FeatureRecord> 
  <FeatureRecord index="4"> 
    <FeatureTag value="aalt"/> 
    <Feature> 
      <!-- LookupCount=2 --> 
      <LookupListIndex index="0" value="0"/> 
      <LookupListIndex index="1" value="1"/> 
    </Feature> 
  </FeatureRecord> 

It makes reading the GSUB tables unnecessarily hard.
The snippet above is from NotoSansCJK-Regular, but I suppose it's the same with Source Han Sans.
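Repeated records like these can be counted in a TTX dump with the standard library XML parser. This is a minimal sketch: the two-record sample and its `<FeatureList>` wrapper stand in for a full dump, and `duplicate_features` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Shortened sample in the same shape as the snippet above.
TTX_SNIPPET = """
<FeatureList>
  <FeatureRecord index="0">
    <FeatureTag value="aalt"/>
    <Feature>
      <LookupListIndex index="0" value="0"/>
      <LookupListIndex index="1" value="1"/>
    </Feature>
  </FeatureRecord>
  <FeatureRecord index="1">
    <FeatureTag value="aalt"/>
    <Feature>
      <LookupListIndex index="0" value="0"/>
      <LookupListIndex index="1" value="1"/>
    </Feature>
  </FeatureRecord>
</FeatureList>
"""


def duplicate_features(xml_text):
    """Count FeatureRecords that share the same tag and the same lookup list."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for record in root.iter("FeatureRecord"):
        tag = record.find("FeatureTag").get("value")
        lookups = tuple(int(i.get("value")) for i in record.iter("LookupListIndex"))
        counts[(tag, lookups)] += 1
    return {key: n for key, n in counts.items() if n > 1}


print(duplicate_features(TTX_SNIPPET))
```

Each key in the result is a (tag, lookups) pair that appears more than once, which gives a quick measure of how much of the feature list is repetition.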

strange Full name of Bold and Heavy fonts

$ otfinfo -i SourceHanSans-*.otf| grep 'Full name:'
SourceHanSans-Bold.otf:Full name:           Source Han Sans Bold Bold
SourceHanSans-ExtraLight.otf:Full name:           Source Han Sans ExtraLight
SourceHanSans-Heavy.otf:Full name:           Source Han Sans Heavy Bold
SourceHanSans-Light.otf:Full name:           Source Han Sans Light
SourceHanSans-Medium.otf:Full name:           Source Han Sans Medium
SourceHanSans-Normal.otf:Full name:           Source Han Sans Normal
SourceHanSans-Regular.otf:Full name:           Source Han Sans Regular

As shown, the Full names of the Bold and Heavy weights are strange and inconsistent with the others.
otfinfo is a utility from LCDF Typetools, but any other font viewer should be able to show the 'name' table. I also found that the SourceHanSansKR fonts have the same issue.
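The odd entries can be flagged automatically. This sketch assumes, as the other five weights suggest, that the Full name should simply be the family name followed by the style taken from the file name; that rule and the `expected_full_name` helper are assumptions made for illustration.

```python
# Full names as reported by otfinfo in this issue.
FULL_NAMES = {
    "SourceHanSans-Bold.otf": "Source Han Sans Bold Bold",
    "SourceHanSans-ExtraLight.otf": "Source Han Sans ExtraLight",
    "SourceHanSans-Heavy.otf": "Source Han Sans Heavy Bold",
    "SourceHanSans-Light.otf": "Source Han Sans Light",
    "SourceHanSans-Medium.otf": "Source Han Sans Medium",
    "SourceHanSans-Normal.otf": "Source Han Sans Normal",
    "SourceHanSans-Regular.otf": "Source Han Sans Regular",
}


def expected_full_name(filename, family="Source Han Sans"):
    """Derive the expected full name from a file name like Family-Style.otf."""
    style = filename.rsplit(".", 1)[0].split("-", 1)[1]
    return f"{family} {style}"


odd = {f: n for f, n in FULL_NAMES.items() if n != expected_full_name(f)}
print(sorted(odd))
```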

"嘅(U+5605)" is not consistent with other words with the same phonetic component (T Chinese)

This issue has been reported in Google's issue tracker, but I've read from Adobe CJK Type Blog's Twitter that it'd be better to report the NotoSansCJK issues in this Github for better tracking, so I include the original link with a brief summary here.

Same issue in Google's issue tracker: https://code.google.com/p/noto/issues/detail?id=70

This issue is about the representation of "嘅" in the Traditional Chinese version of the font.

  • "嘅(U+5605)" is what we call a "phono-semantic compound" where "口" is the radical and "既" is the phonetic component. So 嘅 = 口 + 既. Similar words are 溉, 概, 慨, 暨.
  • There are a couple of variants of "既", e.g. U+65E2(既), U+65E3(旣) and U+FA42(既).
  • It makes sense for a font to use the same writing form for all words using the same phonetic component.

ge3c

  • The first form in the above screenshot, 既, is chosen by Taiwan Ministry of Education as the standard. As a result, we see 溉 = 氵+ 既, 概 = 木+既, 慨 = 忄+既.... etc. in the Traditional Chinese font. This is correct.
  • However, "嘅" is an exception: it is rendered as "口旣", not "口既".

ge3b

  • According to the conversation in Google's issue tracker, this inconsistency is probably caused by a strange decision made in CNS-11643, so the font is actually behaving correctly according to that standard. However, the point is that this is an irrational decision because it breaks consistency.
  • In practice, 嘅 is rarely used in Taiwan (it isn't even in the Big5 table), but it is frequently used in Cantonese-speaking communities such as Hong Kong.

I would like to know whether this character can be treated specially so that we HK users won't see different forms for the same component. This should have no impact on non-Cantonese-speaking communities such as TW, because 嘅 is an obsolete or dead character to them. It also doesn't involve redrawing the glyph, because the glyph "口既" in the Simplified Chinese version is exactly what I am looking for.

Thank you!
