
References, misc links: UnicodeBookmarks

Python coding tips and recipes .. PythonUnicode

Some test & utility progs

ucode_info.py
a prog to search by character name in the unicodedata database (see usage help within)
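
For readers without ucode_info.py at hand, the basic idea of a name search can be sketched with the standard unicodedata module. This is only a rough illustration, not the actual script: the function name find_by_name is made up here, and the real prog may well work differently.

```python
import sys
import unicodedata

def find_by_name(fragment):
    """Return (code point, name) pairs whose Unicode name contains fragment.

    Hypothetical helper for illustration; not taken from ucode_info.py.
    """
    fragment = fragment.upper()
    hits = []
    for cp in range(sys.maxunicode + 1):
        # unicodedata.name() returns the default "" for unnamed code points
        name = unicodedata.name(chr(cp), "")
        if fragment in name:
            hits.append((cp, name))
    return hits

for cp, name in find_by_name("capital letter ae"):
    print(f"U+{cp:04X}  {name}")
```

A plain substring match over the whole code space is slow-ish but simple; it is good enough for occasional lookups.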
charmap_out.py

generates a sample output (from latin-1) in specified encoding, for testing use. ex:

./charmap_out.py utf8 > test.utf8
./charmap_out.py utf_16_le | xxd -g1 -l4
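
The gist of such a generator can be sketched in a few lines. This is an assumption about what charmap_out.py does, not its actual code: the function name charmap_sample and the choice of the range U+00A0..U+00FF (the non-ASCII half of latin-1) are picked here for illustration.

```python
def charmap_sample(encoding):
    """Encode the upper half of latin-1 (U+00A0..U+00FF) in `encoding`.

    Hypothetical helper; the real charmap_out.py may emit a different range.
    """
    text = "".join(chr(cp) for cp in range(0xA0, 0x100)) + "\n"
    return text.encode(encoding)

# Roughly what `| xxd -g1 -l4` would show: the first four bytes in hex.
sample = charmap_sample("utf_16_le")
print(sample[:4].hex(" "))
```

Note that Python's utf_16_le codec does not write a byte order mark, so the first bytes are the first character itself.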
Command line tools and usage:

general recommendation: use a utf8 locale:

locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
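
To check the same thing from a script, one can apply the POSIX precedence rule (a non-empty LC_ALL wins, then LC_CTYPE, then LANG) to the environment. A minimal sketch; the helper names effective_ctype and is_utf8_locale are made up for this example.

```python
import os

def effective_ctype(env):
    """Return the locale name that governs character handling.

    POSIX precedence: non-empty LC_ALL, then LC_CTYPE, then LANG.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var, "")
        if value:
            return value
    return "C"  # nothing set: the default "C" locale applies

def is_utf8_locale(env):
    """True if the codeset part of the effective locale is UTF-8."""
    name = effective_ctype(env)
    # locale names look like language_TERRITORY.codeset@modifier
    codeset = name.partition(".")[2].partition("@")[0]
    return codeset.lower().replace("-", "") == "utf8"

print(is_utf8_locale(os.environ))
```

With the settings listed above (LANG=en_US.UTF-8 and an empty LC_ALL) this reports True.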
  • echo does a nice job of expanding utf8:
echo -e "\xc3\x86" gives the (ligature) LATIN CAPITAL LETTER AE, which this wiki rejects as non-ascii <sigh>, illustrating that we haven't yet quite arrived where we want to be, eh?
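
The same two bytes can be checked from Python, which also confirms the character name. A small sketch using only the standard library; the variable names are arbitrary.

```python
import unicodedata

raw = b"\xc3\x86"            # the two bytes that echo is asked to emit
char = raw.decode("utf-8")   # UTF-8 decodes them to a single code point
print(f"U+{ord(char):04X}", unicodedata.name(char))
```

This prints U+00C6 LATIN CAPITAL LETTER AE.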
Trivia:

Not all terminal emulators (e.g., gnome-terminal, konsole) do equally well at rendering unicode combining character forms. The old xterm actually does a better job.

  • there's also mined http://towo.net/mined/ (run it in an xterm)
  • I've tried yudit http://www.yudit.org/ some time ago, but can't remember much
  • vim abiword lyx all have unicode capabilities of some kind that I haven't explored
  • and I think googling for unicode text editors turns up a few more suggestions, too

Some more test data:

combined.m2u ..just some of the pre-composed combining forms

ANSEL-2u.ged ..contains a lot of combining forms for which there exists no pre-composed character in unicode. This is a good example to test your terminal (xterm does quite well).
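
Whether a given base + combining mark pair has a pre-composed character can be tested with NFC normalization: if it does, the pair collapses to one code point. A minimal sketch; the helper name composes and the sample pairs are chosen here for illustration and are not taken from the test files.

```python
import unicodedata

def composes(base, mark):
    """True if base + combining mark has a single pre-composed code point."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT

print(composes("e", ACUTE))  # e + acute collapses to U+00E9
print(composes("q", ACUTE))  # q + acute has no pre-composed form
```

Pairs of the second kind are the ones that depend entirely on the terminal's handling of combining characters.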
