[mercury-users] Problem with non-ASCII characters in term_io__read_term

Ondrej Bojar oboj7042 at ss1000.ms.mff.cuni.cz
Thu May 30 03:52:15 AEST 2002


Hi.

I use term_io__read_term and io__write to read and write configuration
files. My terms contain non-ASCII characters (ISO-8859-2 encoded letters
with Czech accents).

When I read the terms for the first time, Mercury gets correct characters
and is able to print them back, so they appear on my terminal correctly.
When I "compile" the configuration file into another set of terms with exactly
the same string constants in it, the compiled file contains all accented
characters escaped with \ and an octal code. So far so good.

But when I attempt to read the compiled file again, *some* of the escape
sequences are read correctly and recognized as letters, but some are not.

What I get instead of the letter is just the octal code, without the
backslash.
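
To make the symptom concrete, here is a small sketch in Python (not Mercury)
of one hypothetical way a reader could produce exactly this result. It is only
a guess on my part, not a description of what term_io actually does: the
function read_escapes, and the rule that a backslash directly after the octal
digits is swallowed as a terminator, are my assumptions. Under that rule, only
an escape that immediately follows another escape loses its backslash.

```python
# Hypothetical model of the symptom; this is NOT Mercury's actual lexer.
# Assumption: after the octal digits of an escape, the reader swallows one
# directly following backslash as if it terminated the escape.

OCTAL_DIGITS = "01234567"


def read_escapes(text: str, encoding: str = "iso-8859-2") -> str:
    """Decode backslash-octal escapes under the assumed terminator rule."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in OCTAL_DIGITS:
            # Collect the run of octal digits after the backslash.
            j = i + 1
            while j < len(text) and text[j] in OCTAL_DIGITS:
                j += 1
            # Turn the octal value into one byte and decode it as ISO-8859-2.
            out.append(bytes([int(text[i + 1:j], 8)]).decode(encoding))
            # Assumed behaviour: a backslash right after the digits is
            # consumed, so a following escape is left without its backslash.
            if j < len(text) and text[j] == "\\":
                j += 1
            i = j
        else:
            # Ordinary character: copy it through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


# The string exactly as io__write printed it in the attached log.
written = r"abc\350d\357e\351\354fghij"
print(ascii(read_escapes(written)))
```

With the string from the attached log, the sketch decodes the first three
accented letters and then yields the plain text 354fghij, which is the same
pattern as in the "I got:" line of the log.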

I attach a sample module that shows the problem, and also a log of what
happens when I run it.

(I use KDE with national support for Czech; the charset is ISO-8859-2. I do
not set the locale to cs_CZ because, as I have just tested, it makes no
difference.)

Thanks for your help,
  Andrew.
-------------- next part --------------
:- module testaccents.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module term, term_io, string.

:- type foo ---> foo(string).

main -->
  io__write_string("Now I give you a term:\n"),
  {Data = "abcčdďeéěfghij"},
  io__write(foo(Data)), io__write_string(".\n"),
  io__write_string("Now copy the previous line here:\n"),
  term_io__read_term(ThisTerm),
  io__write_string("I got: "),
  io__write(ThisTerm), io__nl,
  (
  if { ThisTerm = term(_varset, Term),
       try_term_to_type(Term, Result),
       Result = ok(Foo) }
  then
    io__write_string("Yes, I got foo: "),
    io__write(Foo), io__nl,
    { Foo = foo(GotData) },
    (
    if {Data = GotData}
    then io__write_string("Superb! The data didn't get twisted!\n")
    else io__write_string("Bad! Data is twisted: "++GotData++" instead of "++Data++"\n")
    )
  else
    io__write_string("Bad! Didn't catch the term. Did you copy it correctly?\n")
  ).
    
-------------- next part --------------
Now I give you a term:
foo("abc\350d\357e\351\354fghij").
Now copy the previous line here:
foo("abc\350d\357e\351\354fghij").
I got: term(varset(var_supply(0), empty, empty), functor(atom("foo"), [functor(string("abc\350d\357e\351354fghij"), [], context("<standard input>", 1))], context("<standard input>", 1)))
Yes, I got foo: foo("abc\350d\357e\351354fghij")
Bad! Data is twisted: abcčdďeé354fghij instead of abcčdďeéěfghij

