References, misc links: UnicodeBookmarks
Python coding tips and recipes .. PythonUnicode
Some test & utility programs:
- ucode_info.py
- a program to search the unicodedata database by character name (see the usage help within)
- charmap_out.py
generates sample output (from latin-1) in the specified encoding, for testing use. Examples:
./charmap_out.py utf8 > test.utf8
./charmap_out.py utf_16_le | xxd -g1 -l4
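The two scripts themselves are not reproduced on this page, so the following is only a rough sketch of what comparable helpers might look like. The function names `find_by_name` and `latin1_sample` are made up for illustration, and the choice of the range U+00A0..U+00FF as the "latin-1 sample" is an assumption, not necessarily what charmap_out.py emits.

```python
import sys
import unicodedata


def find_by_name(fragment, limit=20):
    """Return (code point, name) pairs whose Unicode name contains `fragment`.

    Hypothetical helper: scans every code point and asks the unicodedata
    database for its name; unnamed code points yield "" and never match.
    """
    fragment = fragment.upper()
    hits = []
    for cp in range(sys.maxunicode + 1):
        name = unicodedata.name(chr(cp), "")
        if fragment in name:
            hits.append((cp, name))
            if len(hits) >= limit:
                break
    return hits


def latin1_sample(encoding):
    """Encode the upper Latin-1 range (U+00A0..U+00FF, an assumed sample set)
    plus a trailing newline in the requested encoding and return the bytes."""
    text = "".join(chr(cp) for cp in range(0xA0, 0x100)) + "\n"
    return text.encode(encoding)


if __name__ == "__main__":
    for cp, name in find_by_name("LETTER AE", limit=5):
        print(f"U+{cp:04X} {name}")
    # Show the first four bytes, similar in spirit to `xxd -g1 -l4`.
    print(latin1_sample("utf_16_le")[:4].hex(" "))
```

Piping the bytes from `latin1_sample` to a file gives test input much like the `test.utf8` example above.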
- Command line tools and usage:
general recommendation: use a UTF-8 locale:
locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
- echo does a nice job of expanding utf8:
echo -e "\xc3\x86" gives the (ligature) LATIN CAPITAL LETTER AE, which this wiki rejects as non-ASCII <sigh>, illustrating that we haven't quite arrived where we want to be yet, eh?
- Trivia:
Not all terminal emulators (e.g., gnome-terminal, konsole) do equally well at rendering Unicode combining character forms. The old xterm actually does a better job.
- there's also mined http://towo.net/mined/ (run it in an xterm)
- I tried yudit http://www.yudit.org/ some time ago, but can't remember much about it
- vim, abiword and lyx all have Unicode capabilities of some kind that I haven't explored
- and I think a Google search for unicode text editors turns up a few more suggestions, too
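To double-check what the echo example in the list above is expected to produce, the same two bytes can be decoded in Python. This is just a quick sanity check using the standard unicodedata module, not part of the tools listed here.

```python
import unicodedata

# The two bytes that echo -e expands from the \x escapes.
raw = b"\xc3\x86"

# Decoding them as UTF-8 yields a single character.
char = raw.decode("utf-8")

# Print the code point and its official Unicode name.
print(f"U+{ord(char):04X} {unicodedata.name(char)}")
# prints: U+00C6 LATIN CAPITAL LETTER AE
```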
Some more test data:
combined.m2u ..just some of the pre-composed combining forms
ANSEL-2u.ged ..contains a lot of combining forms for which no pre-composed character exists in Unicode. This is a good example for testing your terminal (xterm does quite well).
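The difference between the two kinds of test data can be illustrated with NFC normalization. The sketch below is not derived from the files themselves; the sample strings and the helper name `compose` are assumptions chosen for illustration. A base letter plus combining accent collapses to one code point when a pre-composed character exists, and stays as two code points when none does, which is the case a terminal has to render by overlaying glyphs.

```python
import unicodedata


def compose(text):
    """Return `text` in NFC form (hypothetical helper for this example)."""
    return unicodedata.normalize("NFC", text)


# U+0301 COMBINING ACUTE ACCENT has a non-zero combining class.
acute = "\u0301"
print(unicodedata.name(acute), unicodedata.combining(acute))

# "e" + acute has a pre-composed form, so NFC yields one code point.
with_precomposed = compose("e" + acute)
print(len(with_precomposed), [f"U+{ord(c):04X}" for c in with_precomposed])

# "q" + acute has no pre-composed form, so NFC leaves two code points.
without_precomposed = compose("q" + acute)
print(len(without_precomposed), [f"U+{ord(c):04X}" for c in without_precomposed])
```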