[m-users.] Read text files encoded as Latin1 or UTF16-LE
Dirk Ziegemeyer
dirk at ziegemeyer.de
Sat Oct 19 02:47:35 AEDT 2019
Hi,
Is there a best practice for reading text files encoded as Latin-1 or UTF-16LE with Mercury?
When I use Julien’s csv stream reader (https://github.com/juliensf/mercury-csv) to parse a file encoded in Latin1, I first convert it to UTF-8 by calling iconv from within Mercury:
io.call_system("iconv -f ISO-8859-1 -t UTF-8 File.csv > File.csv.utf8-temp", CmdResult, !IO)
I would rather perform the encoding conversion at the stream level to avoid writing a temporary file to disk.
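One way to avoid the temporary file without any support from Mercury itself is to let iconv write to standard output and pipe that into the program. The following is only a sketch under assumptions: it presumes the program can be changed to read its CSV input from standard input (io.stdin_stream) instead of opening the file by name, and `cat` stands in here for that hypothetical program. The sample data is made up for illustration.

```shell
# Create a small Latin-1 sample file (made-up data): "café,1",
# where é is the single byte 0xE9 (octal 351).
printf 'caf\351,1\n' > File.csv

# Convert on the fly. iconv writes UTF-8 to standard output and the
# consumer reads it from standard input, so no converted copy is
# written to disk. `cat` is a stand-in for a hypothetical Mercury
# program that reads CSV from stdin.
converted=$(iconv -f ISO-8859-1 -t UTF-8 File.csv | cat)
```

For a UTF-16LE file the same pipeline should apply with `-f UTF-16LE`. The drawback is that the conversion has to be set up outside the program, for example in a wrapper script, rather than at the stream level inside Mercury.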
There is a hint in the io.read_char_code documentation about "converting external character encodings into Mercury's internal character representation", but I could not find out how to tell Mercury which encoding a stream uses.
There is some code dealing with encoding conversion in extras/xml/xml.encoding.m, but it does not seem to be used by the standard library.
Dirk