Mozilla have a pretty nice universal character set detector built into their products. It’s modular, it’s quick, and it has a great deal of real-world research and testing behind it. I wanted to use it as part of a project I am working on, but couldn’t find a nice standalone command-line version. There is a Java port, but the overhead of starting up a JVM just to detect the character set of a document was unappealing. Porting the entire codebase to another language myself would have taken too long, and the result would probably have run a lot slower. So I spent an evening learning some C/C++ and came up with just what I needed: a standalone command-line version of the detector. I thought it might be useful to someone else too, so I am releasing it here.
The README.txt contains compilation and usage instructions. That’s all from me. Get it below!