[m-users.] Function to remove unicode byte order mark
Dirk Ziegemeyer
dirk at ziegemeyer.de
Sat Feb 25 18:57:58 AEDT 2017
Hi,
I’d like to share a little piece of code that removes the byte order mark (BOM) from the top of a UTF-8 file.
I didn’t even know that byte order marks existed until I saved an Excel table as Unicode.txt in order to be able to read it with Mercury. The BOM is preserved during conversion to UTF-8.
As Mercury doesn’t remove the BOM by itself, it’s up to the application to deal with it.
Dirk
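% Remove the optional Unicode byte order mark (BOM) from the beginning
% of a UTF-8 file. Pass in the first line of the file; the result is
% the same line without a leading BOM.
%
% Needs these modules from the standard library:
:- import_module char.
:- import_module string.

:- func remove_byte_order_mark(string) = string.

remove_byte_order_mark(RawFirstLine) = FirstLine :-
    ( if
        string.first_char(RawFirstLine, FirstChar, Rest),
        char.to_int(FirstChar) = 0xfeff  % Unicode byte order mark (BOM)
    then
        FirstLine = Rest
    else
        FirstLine = RawFirstLine
    ).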
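
For illustration, here is a minimal sketch of how the function above might be called from a complete program. It assumes the text arrives on standard input and that only the first line needs checking; the module name strip_bom and the choice to print the cleaned line are made up for this example and are not part of the original code.

```mercury
% Hypothetical module name, chosen only for this sketch.
:- module strip_bom.
:- interface.
:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.
:- import_module char.
:- import_module string.

main(!IO) :-
    % Assumption: the input is read from standard input.
    io.read_line_as_string(Result, !IO),
    (
        Result = ok(RawFirstLine),
        % Print the first line with any leading BOM removed.
        io.write_string(remove_byte_order_mark(RawFirstLine), !IO)
    ;
        % Empty input: nothing to do.
        Result = eof
    ;
        Result = error(Error),
        io.write_string(io.error_message(Error), !IO),
        io.nl(!IO)
    ).

:- func remove_byte_order_mark(string) = string.

remove_byte_order_mark(RawFirstLine) = FirstLine :-
    ( if
        string.first_char(RawFirstLine, FirstChar, Rest),
        char.to_int(FirstChar) = 0xfeff  % Unicode byte order mark (BOM)
    then
        FirstLine = Rest
    else
        FirstLine = RawFirstLine
    ).
```

Only the first line is passed through the function, since a BOM can only appear at the very start of the file; the remaining lines can be read and processed as usual.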