A while back I made a post about ASCIIizing Text. With it was a simple python application that would convert Unicode characters to ASCII equivalents. It doesn’t do a basic conversion but also Latinizes the characters when they are outside of the ASCII range.

The uni2ascii package I made has a few short comings I’ve decided to fix. The three major problems with it are: 1) Very basic permission checking, 2) Only accepts one file, 3) Required all input to be UTF8 encoded, 4) The decoder was a very literal port of a the ruby version.

To fix these issues I’ve written an entirely new script. Problems 1, 2 and 3 are fixed. It has robust error checking, can handle an arbitrary number of files, and the file encoding can be specified. Number 4 is fixed by using the Python port created by Tomaz Solc.

I’ve put the source code for the new decoder into a Launchpad branch:

$ bzr branch lp:~user-none/+junk/unidecoder