Programmatically Tell If A Unicode Character Takes Up More Than One Character Space In A Terminal
Solution 1:
While it's not relevant for the specific examples you give (all of which display at the width of a single character for me on Ubuntu), CJK characters have a Unicode property, East Asian Width, which indicates that they are wider than normal, and they display at double width in some terminals.
For example, in Python:
import unicodedata
# 'a' is a narrow character; east_asian_width reports it as 'Na'
assert unicodedata.east_asian_width('a') == 'Na'
# '愛' can be interpreted as a double-width (wide) character, reported as 'W'
assert unicodedata.east_asian_width('愛') == 'W'
Apart from this, I don't think there's a specification for how much space certain characters should take up, other than the size of the glyph in whatever font you are using (which your terminal is probably ignoring for the reason Ignacio gave).
For more info on the "east asian width" property, see http://www.unicode.org/reports/tr11/
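As a rough sketch of how that property could be turned into a width estimate: the helper below (display_width is a name made up for this illustration, not a standard function) counts wide and fullwidth characters as two cells, skips combining marks, and counts everything else as one. It assumes the terminal follows the East Asian Width property, which not every terminal does, and it ignores control characters and the "ambiguous" category, so treat the result as an estimate only.

```python
import unicodedata

def display_width(text):
    """Estimate how many terminal cells `text` occupies (hypothetical helper)."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn on top of the previous character.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide and fullwidth characters: assumed to take two cells.
            width += 2
        else:
            # Everything else, including ambiguous-width characters: one cell.
            width += 1
    return width

if __name__ == "__main__":
    for sample in ("abc", "愛", "a愛b"):
        print(repr(sample), display_width(sample))
```

A character "takes up more than one character space" under this estimate when display_width(ch) > 1. Whether the terminal actually renders it that way still depends on the terminal and its font.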
Solution 2:
No, since there's no way to tell what font the terminal is using. Always use a monospace font, lesson learned.
It happens because the terminal is using a "cell" font layout engine (i.e. characters are printed at specific X and Y coordinates regardless of their actual size) whereas the browser is using a "flow" font layout engine (subsequent characters print where the previous character ended).
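To make the "cell" model concrete, here is a minimal sketch of how a cell-based renderer might assign a starting column to each character. The function name cell_columns and the choice of the East Asian Width property as the width table are assumptions for illustration; real terminals use their own tables. The point is that the column comes from a table lookup, never from the size of the glyph in the font, so a glyph drawn wider than its cell simply overlaps its neighbour.

```python
import unicodedata

def cell_columns(text):
    """Return (character, starting column) pairs on a fixed cell grid.

    Hypothetical model: each character advances the cursor by a width
    looked up from a table, regardless of how wide its glyph is drawn.
    """
    columns = []
    col = 0
    for ch in text:
        columns.append((ch, col))
        # Assumed width table: two cells for wide/fullwidth, one otherwise.
        col += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return columns

if __name__ == "__main__":
    for ch, col in cell_columns("a愛b"):
        print(col, ch)
```

A flow layout engine, by contrast, would place each character where the previous glyph actually ended, which is why the same text can look fine in a browser and overlap in a terminal.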
Solution 3:
This is a bug in the OS X terminal.
I wouldn't recommend trying to work around it, because the workaround will break on other systems (e.g. Linux), and the bug might get fixed eventually on the Mac. It will also confuse anyone who pastes the text into another application.