TRENS is a character encoding converter API and tool that lets you batch-convert text files from one encoding to another.
For example, you can use it to convert a directory full of text files while maintaining the directory structure of the original files.
To do so, TRENS wraps the GNU iconv library in an efficient way.
Used correctly, its performance is outstanding.
We’ve seen batch UTF-8 conversion speeds of up to a million Tweets per second (~130 MB/s) on commodity machines.
TRENS has been tested and is known to run on Windows, Linux, and Mac OS X, on both 32-bit and 64-bit systems.
It will likely work fine on most UNIX systems too.
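To make the directory use case concrete, here is a minimal sketch of converting a tree of text files while mirroring its layout. It uses only the Python standard library, not TRENS; the helper name `convert_tree` and its parameters are assumptions made for illustration, and it assumes the destination directory lies outside the source directory.

```python
from pathlib import Path


def convert_tree(src_root, dst_root, src_enc, dst_enc):
    """Re-encode every file under src_root into dst_root, mirroring subdirectories.

    Hypothetical helper for illustration; it is not part of TRENS.
    Returns the list of files written.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    written = []
    # sorted() materializes the listing before any output file is created.
    for src in sorted(src_root.rglob("*")):
        if not src.is_file():
            continue
        # Keep the path relative to the source root so the layout is preserved.
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        # Whole-file read: fine for a sketch, but not a streaming conversion.
        text = src.read_bytes().decode(src_enc)
        dst.write_bytes(text.encode(dst_enc))
        written.append(dst)
    return written
```

A call such as `convert_tree("in", "out", "latin-1", "utf-8")` would write `out/a/b/x.txt` for an input file `in/a/b/x.txt`.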
Immediate benefits for your programs are:
- Only 3 of the 6 API calls are needed to get the job done.
- Provides support for different encodings: European languages, Semitic languages, Japanese, Chinese, full Unicode, etc.
- An adaptive streaming API: can convert unbounded streams of data (far larger than available RAM).
- Never exceeds 10 MB of memory usage: the caller can also set the maximum buffer size (down to only 4 bytes).
- Supports advanced libiconv features such as transliteration.
- Extends your application to an international audience.
- Parallel conversion of several thousand files at a time.
- No Cygwin environment needed: fully native on Windows.
- Offers a portable command-line tool equivalent to the iconv command.
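The streaming and bounded-memory points in the list can be illustrated with a short sketch. It is not the TRENS API: it uses Python's standard `codecs` incremental decoder and encoder, and the name `convert_stream` and the `chunk_size` parameter are assumptions made for illustration. The idea is that only one small chunk is held in memory at a time, and a multi-byte character split across two chunks is still decoded correctly.

```python
import codecs


def convert_stream(reader, writer, src_enc, dst_enc, chunk_size=4096):
    """Re-encode a byte stream, reading at most chunk_size bytes at a time.

    Hypothetical helper for illustration; it is not part of TRENS.
    Returns the number of bytes written.
    """
    decoder = codecs.getincrementaldecoder(src_enc)()
    encoder = codecs.getincrementalencoder(dst_enc)()
    written = 0
    while True:
        chunk = reader.read(chunk_size)
        last = not chunk  # an empty read marks the end of the stream
        # The decoder buffers an incomplete trailing character until the
        # next chunk arrives; final=True raises if the input ends mid-character.
        text = decoder.decode(chunk, final=last)
        out = encoder.encode(text, final=last)
        writer.write(out)
        written += len(out)
        if last:
            break
    return written
```

With binary file objects opened via `open(path, "rb")` and `open(path, "wb")`, memory use stays proportional to `chunk_size` rather than to the size of the input.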