[m-users.] Read text files encoded as Latin1 or UTF16-LE

Dirk Ziegemeyer dirk at ziegemeyer.de
Sat Oct 19 02:47:35 AEDT 2019


Hi,

Is there a best practice for reading text files encoded as Latin-1 or UTF-16LE with Mercury?

When I use Julien’s csv stream reader (https://github.com/juliensf/mercury-csv) to parse a file encoded in Latin1, I first convert it to UTF-8 by calling iconv from within Mercury:

io.call_system("iconv -f ISO-8859-1 -t UTF-8 File.csv > File.csv.utf8-temp", CmdResult, !IO)

I would rather perform the encoding conversion at the stream level, to avoid writing a temporary file to disk.
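For Latin-1 at least, the conversion could in principle be done in-process, because every Latin-1 byte value is equal to the Unicode code point of the character it encodes. The following is only a minimal sketch under that assumption, not an established practice: it reads the file as a binary stream and builds a Mercury string from the bytes. The module name latin1_reader and the predicate read_latin1 are made up for illustration, and "File.csv" is the file name from the iconv call above. It reads the whole file into memory rather than providing a stream, and it does not handle UTF-16LE.

```mercury
:- module latin1_reader.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module char, list, string.

main(!IO) :-
    % Open the file as a binary stream so that no decoding is applied.
    io.open_binary_input("File.csv", OpenResult, !IO),
    (
        OpenResult = ok(Stream),
        read_latin1(Stream, [], ReadResult, !IO),
        io.close_binary_input(Stream, !IO),
        (
            ReadResult = ok(Text),
            io.write_string(Text, !IO)
        ;
            ReadResult = error(Error),
            io.write_string(io.error_message(Error), !IO),
            io.nl(!IO)
        )
    ;
        OpenResult = error(Error),
        io.write_string(io.error_message(Error), !IO),
        io.nl(!IO)
    ).

    % read_latin1 is a hypothetical helper, not part of the standard library.
    % It accumulates the characters in reverse order and converts them to a
    % string at end of file.
    %
:- pred read_latin1(io.binary_input_stream::in, list(char)::in,
    io.res(string)::out, io::di, io::uo) is det.

read_latin1(Stream, RevChars, Result, !IO) :-
    io.read_byte(Stream, ByteResult, !IO),
    (
        ByteResult = ok(Byte),
        % A Latin-1 byte (0..255) is its own Unicode code point.
        % Note: a NUL byte in the input would make the final string
        % construction throw an exception, since Mercury strings
        % cannot contain null characters.
        Char = char.det_from_int(Byte),
        read_latin1(Stream, [Char | RevChars], Result, !IO)
    ;
        ByteResult = eof,
        Result = ok(string.from_rev_char_list(RevChars))
    ;
        ByteResult = error(Error),
        Result = error(Error)
    ).
```

Feeding the result to the CSV reader would still need a character stream over the converted data, which this sketch does not provide.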

There is a hint in the io.read_char_code documentation about "converting external character encodings into Mercury's internal character representation", but I could not find out how to tell Mercury which encoding a stream uses.

There is some code dealing with encoding conversion in extras/xml/xml.encoding.m, but it seems to be unused by the standard library.

Dirk


