LoadTextW and Unicode encoding

Posted: Wed Apr 25, 2007 12:44 pm
by martindholmes
Hi there,

The documentation for LoadTextW says:
"For LoadTextW: File must be in Unicode encoding."
My question is: what constitutes Unicode encoding? A Unicode text file could be in UTF-16 BE or UTF-16 LE, and the byte order mark (BOM) will reveal which. However, UTF-8, the most commonly used Unicode encoding, does not normally have a byte order mark.

Does LoadTextW recognize and load UTF-8 without a BOM?

Cheers,
Martin

Posted: Wed Apr 25, 2007 6:01 pm
by Sergey Tkachenko
LoadTextW cannot load UTF-8.
It loads UTF-16 files. Little-endian (LE) is assumed; big-endian (BE) is supported if indicated by a BOM.
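
To illustrate the rule described above (little-endian by default, big-endian only when a BOM says so), here is a minimal Python sketch. It is not LoadTextW's actual implementation; the function names `detect_utf16_byte_order` and `decode_utf16` are hypothetical and exist only for this example.

```python
import codecs


def detect_utf16_byte_order(data: bytes) -> str:
    """Pick a UTF-16 codec name from the leading BOM (hypothetical helper).

    Assumption: no BOM, or a little-endian BOM, means little-endian.
    """
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "utf-16-be"
    return "utf-16-le"


def decode_utf16(data: bytes) -> str:
    """Decode UTF-16 bytes using the rule above and drop a leading BOM."""
    text = data.decode(detect_utf16_byte_order(data))
    # The BOM decodes to U+FEFF; remove it so it does not end up in the text.
    if text.startswith("\ufeff"):
        text = text[1:]
    return text
```

For example, `decode_utf16` accepts the same text with a big-endian BOM, with a little-endian BOM, or with no BOM at all (treated as little-endian).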

Posted: Wed Apr 25, 2007 6:05 pm
by martindholmes
OK, I guess I'll have to try to auto-detect UTF-8 then. I've written some code for doing that in the past.

Cheers,
Martin
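
For readers who need the same workaround, one common way to auto-detect UTF-8 is sketched below in Python. This is not the code Martin refers to. The function name `looks_like_utf8` is hypothetical, and the NUL-byte check is a heuristic assumption intended to keep UTF-16 files with mostly ASCII content from being mistaken for UTF-8.

```python
import codecs


def looks_like_utf8(data: bytes) -> bool:
    """Heuristically decide whether data is UTF-8 text (hypothetical helper)."""
    # An explicit UTF-8 BOM (EF BB BF) settles the question.
    if data.startswith(codecs.BOM_UTF8):
        return True
    # Assumption: real text files do not contain NUL bytes, whereas UTF-16
    # encodings of ASCII-range characters contain one per character.
    if b"\x00" in data:
        return False
    # Otherwise accept the data only if it is a valid UTF-8 byte sequence.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Pure ASCII passes this test, since ASCII is a subset of UTF-8. A file that passes could then be converted to UTF-16 before being handed to a loader that only accepts UTF-16.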