[m-rev.] for review: new implementation of string__format in mercury.
Ralph Becket
rafe at cs.mu.OZ.AU
Thu Jan 31 18:24:35 AEDT 2002
p.incani, Thursday, 31 January 2002:
>
> Index: string.m
> ===================================================================
> @@ -1253,14 +1254,17 @@
> zero_or_more_occurences(digit),
> =(Final),
>
> - { list__remove_suffix(Init, Final, Width) },
> + { list__remove_suffix(Init, Final, WidthCharList) },
> + { string__from_char_list(WidthCharList, WidthString)},
> + { string__to_int(WidthString, Width)},
There is a func verstion of string__from_char_list which I'd encourage
you to use rather than the pred version.
If you know that the WidthString only includes digits, you can also
improve matters by using the func string__det_to_int rather than just
string__to_int.
> @@ -1280,15 +1284,18 @@
> zero_or_more_occurences(digit),
> =(Final)
> ->
> - { list__remove_suffix(Init, Final, Prec) },
> + { list__remove_suffix(Init, Final, PrecCharList) },
> + { string__from_char_list(PrecCharList, PrecString)},
> + { string__to_int(PrecString, Prec)},
Ditto.
This code duplication may warrant hoisting it out into a separate
function.
> % NB the capital letter specifiers are proceeded with a 'c'.
An alternative scheme would be to use quoted terms, e.g. 'X', 'E', 'F',
'G'.
> + %
> + % Format a signed int (d,i).
> + %
> +:- func format_int(list(char), maybe(int), maybe(int), int) = string.
> +format_int(Flags, Width, Prec, Int) = String :-
> + %
> + % Find integers absolute value, and take care of special
> + % case of precision zero with an integer of 0.
> + %
> + ( Int = 0, Prec = yes(0)
> + ->
Quirky syntax: Mercury programmers would typically write
(
Int = 0,
Prec = yes(0)
->
...
)
or
( Int = 0, Prec = yes(0) ->
...
)
> + AbsIntStr = ""
> + ;
> + Integer = integer(Int),
> + AbsInteger = integer__abs(Integer),
> + AbsIntStr = integer__to_string(AbsInteger)
> + ),
> + AbsIntStrLength = string__length(AbsIntStr),
> + %
> + % Do we need to increase precision?
> + %
> + (Prec = yes(Precision),
> + Precision > AbsIntStrLength
> + ->
> + PrecStr = string__pad_left(AbsIntStr, '0', Precision)
> + ;
> + PrecStr = AbsIntStr
> + ),
> + %
> + % Do we need to pad to the field width.
> + %
> + ( Width = yes(FieldWidth),
> + FieldWidth > string__length(PrecStr),
> + member('0', Flags),
> + \+member('-', Flags),
> + Prec = no
> + ->
> + FieldStr = string__pad_left(PrecStr, '0', FieldWidth - 1),
> + ZeroPadded = yes
> + ;
> + FieldStr = PrecStr,
> + ZeroPadded = no
> + ),
> + %
> + % Prefix with appropriate sign or zero padding.
> + % The previous step has deliberately left room for this.
> + %
> + ( Int < 0 ->
> + SignedStr = string__append("-", FieldStr)
> + ;
> + member('+', Flags) ->
> + SignedStr = string__append("+", FieldStr)
> + ;
> + member(' ', Flags) ->
> + SignedStr = string__append(" ", FieldStr)
> + ;
> + ZeroPadded = yes ->
> + SignedStr = string__append("0", FieldStr)
> + ;
> + SignedStr = FieldStr
> + ),
Slightly quirky if-then-else formatting; a more common convention is
( Int < 0 ->
SignedStr = string__append("-", FieldStr)
; member('+', Flags) ->
SignedStr = string__append("+", FieldStr)
; member(' ', Flags) ->
SignedStr = string__append(" ", FieldStr)
; ZeroPadded = yes ->
SignedStr = string__append("0", FieldStr)
;
SignedStr = FieldStr
),
> + %
> + % Do we need to justify?
> + %
> + String = justify_string(Flags, Width, SignedStr).
If you're unsure, that comment should probably have an XXX prefix.
> +
> +
> + %
> + % Format an unsigned int, unsigned octal, unsigned hexadecimal (u,o,x,X).
> + %
> +:- func format_unsigned_int(list(char), maybe(int), maybe(int), int, int, string) = string.
Don't wrap lines past 80 cols (and people will think you're using Visual
Studio :)
> +format_unsigned_int(Flags, Width, Prec, Base, Int, Prefix) = String :-
> + %
> + % Find integers absolute value, and take care of special
> + % case of precision zero with an integer of 0.
> + %
> + ( Int = 0, Prec = yes(0)
> + ->
> + AbsIntStr = ""
> + ;
> + integer__pow(integer(2), integer(int__bits_per_int), Div),
> + UnsignedInteger = integer(Int) mod Div,
> + ( Base = 10
> + ->
> + AbsIntStr0 = integer__to_string(UnsignedInteger)
> + ; Base = 8
> + ->
> + AbsIntStr0 = to_octal(UnsignedInteger)
> + ; Prefix = "0x"
> + ->
> + AbsIntStr0 = to_hex(UnsignedInteger)
> + ;
Presumably Prefix = "0x" => Base = 16? If so, it would be more
consistent to test Base rather than Prefix here.
If Base can only take on certain values, it might be better to turn it
into an enumerated type rather than an int, in which case you could
switch on it rather than resort to if-then-elses.
> + ;
> + AbsIntStr = AbsIntStr0
> + )
> + ),
> + AbsIntStrLength = string__length(AbsIntStr),
> + %
> + % Do we need to increase precision?
> + %
> + (Prec = yes(Precision),
s/(Prec/( Prec/
> + Precision > AbsIntStrLength
s/P/ P/
> + ->
> + PrecStr = string__pad_left(AbsIntStr, '0', Precision)
> + ;
> + PrecStr = AbsIntStr
> + ),
> + %
> + % Do we need to increase the precision of an octal?
> + %
> + ( Base = 8,
> + member('#', Flags),
> + \+string__prefix(PrecStr, "0")
Indent the \+ to the same level as the other goals in the condition.
Add a space between the \+ and string__prefix (repeat throughout).
> + ->
> + PrecModStr = append("0", PrecStr)
> + ;
> + PrecModStr = PrecStr
> + ),
> + %
> + % Do we need to pad to the field width.
> + %
> + ( Width = yes(FieldWidth),
> + FieldWidth > string__length(PrecModStr),
> + member('0', Flags),
> + \+member('-', Flags),
> + Prec = no
> + ->
> + %
> + % Do we need to make room for "0x" or "0X" ?
> + %
> + (Base = 16, member('#', Flags)
Inconsistent formatting.
> + ->
> + FieldStr = string__pad_left(PrecModStr, '0', FieldWidth - 2)
> + ;
> + FieldStr = string__pad_left(PrecModStr, '0', FieldWidth)
> + )
> + ;
> + FieldStr = PrecModStr
> + ),
> + %
> + % Do we have to prefix "0x" or "0X"?
> + %
> + ( Base = 16,
> + member('#', Flags),
> + FieldStr \= "0",
> + FieldStr \= ""
> + ->
> + FieldModStr = string__append(Prefix, FieldStr)
> + ;
> + FieldModStr = FieldStr
> + ),
> + %
> + % Do we need to justify?
> + %
> + String = justify_string(Flags, Width, FieldModStr).
XXX comment.
> +
>
> - % Construct a format string.
> -:- func make_format(list(char), maybe(list(char)),
> - maybe(list(char)), string, string) = string.
> + %
> + % Format a character (c).
> + %
> +:- func format_char(list(char), maybe(int), char) = string.
It might be a good idea to use equivalence types for list(char),
maybe(int) etc. to improve readability.
> + ( Float < 0.0 ->
> + SignedStr = string__append("-", FieldStr)
> + ;
> + member('+', Flags) ->
> + SignedStr = string__append("+", FieldStr)
> + ;
> + member(' ', Flags) ->
> + SignedStr = string__append(" ", FieldStr)
> + ;
> + ZeroPadded = yes ->
> + SignedStr = string__append("0", FieldStr)
> + ;
> + SignedStr = FieldStr
> + ),
SignedStr = "-" ++ FieldStr
is more readable than using string__append/2.
> + ( Float < 0.0 ->
> + SignedStr = string__append("-", FieldStr)
> + ;
> + member('+', Flags) ->
> + SignedStr = string__append("+", FieldStr)
> + ;
> + member(' ', Flags) ->
> + SignedStr = string__append(" ", FieldStr)
> + ;
> + ZeroPadded = yes ->
> + SignedStr = string__append("0", FieldStr)
> + ;
> + SignedStr = FieldStr
> + ),
Ditto.
> + ( Float < 0.0 ->
> + SignedStr = string__append("-", FieldStr)
> + ;
> + member('+', Flags) ->
> + SignedStr = string__append("+", FieldStr)
> + ;
> + member(' ', Flags) ->
> + SignedStr = string__append(" ", FieldStr)
> + ;
> + ZeroPadded = yes ->
> + SignedStr = string__append("0", FieldStr)
> + ;
> + SignedStr = FieldStr
> + ),
This is common enough to separate out into a separate function.
> + % Given a floating point number, this function calculates the size of the exponent
> + % needed to represent the float in scientific notation.
> + %
> +:- func size_of_required_exponent(string, int) = int.
> +size_of_required_exponent(Float, Prec) = Exponent :-
> + UnsafeExponent = decimal_pos(Float),
> + UnsafeBase = calculate_base_unsafe(Float, Prec),
Throughout: add a blank line between :- func/pred/mode declarations and
the first clause.
> + %
> + % Is mantissa one digit long?
> + %
> + MantAndFrac = string__words(is_decimal_point, UnsafeBase),
> + list__index0_det(MantAndFrac, 0, MantissaStr),
There's a func version of list__index0_det.
> + ( string__length(MantissaStr) > 1
> + ->
> + % we will need need to move decimal pt one place to the left: therefore,
More than 80 chars on a line.
> + %
> + % Given a string representing a floating point number, function returns a string with
> + % all trailing zeros removed.
> + %
> +:- func remove_trailing_zeros(string) = string.
> +remove_trailing_zeros(Float) = TrimmedFloat :-
> + FloatCharList = string__to_char_list(Float),
> + FloatCharListRev = list__reverse(FloatCharList),
> + TrimmedFloatRevCharList = remove_zeros(FloatCharListRev),
> + TrimmedFloatCharList = list__reverse(TrimmedFloatRevCharList),
> + TrimmedFloat = string__from_char_list(TrimmedFloatCharList).
You can probably do this more simply and cheaply by iterating from R to
L along the string until you hit a non-zero char and then calling
string__substring.
>
> -:- pred using_sprintf is semidet.
> + %
> + % Given a char list, this function removes all leading zeros, including decimal point,
> + % if need be.
> + %
> +:- func remove_zeros(list(char)) = list(char).
> +remove_zeros(CharNum) = TrimmedNum :-
> + ( CharNum = ['0'|Rest]
Throughout: space around `|' in lists.
> + ->
> + TrimmedNum = remove_zeros(Rest)
> + ;
> + CharNum = ['.'|Rest]
Formatting (indentation).
> + ->
> + TrimmedNum = Rest
> + ;
> + TrimmedNum = CharNum
> + ).
>
> -:- pragma foreign_proc("C", using_sprintf,
> - [will_not_call_mercury, promise_pure, thread_safe], "
> - SUCCESS_INDICATOR = TRUE;
> -").
> -:- pragma foreign_proc("MC++", using_sprintf,
> - [will_not_call_mercury, promise_pure, thread_safe], "
> - SUCCESS_INDICATOR = FALSE;
> -").
> -
>
> - % Construct a format string suitable to passing to sprintf.
> -:- func make_format_sprintf(list(char), maybe(list(char)),
> - maybe(list(char)), string, string) = string.
> + %
> + % Converts a floating point number to a specified number of standard figures.
> + % The style used depends on the value converted; style e (or E) is used only if the
> + % exponent resulting from such a conversion is less than -4 or greater than or equal
> + % to the precision. Trailing zeros are removed from the fractional portion of the
> + % result unless the # flag is specified: a decimal-point character appears only if it
> + % is followed by a digit.
> + %
And below: lines too long.
> +:- func change_to_g_notation(string, int, string, list(char)) = string.
> +change_to_g_notation(Float, Prec, E, Flags) = FormattedFloat :-
> + Exponent = size_of_required_exponent(Float, Prec),
> + ( Exponent >= -4, Exponent < Prec
> + ->
> + % Float will be represented normally.
> + % -----------------------------------
> + % Need to calculate precision to pass to the change_precision
> + % function, because the current precision represents significant
> + % figures, not decimal places.
> + %
> + % now change float's precision.
> + %
> + ( Exponent =< 0
> + ->
> + %
> + % deal with floats such as 0.00000000xyz
> + %
> + DecimalPos = decimal_pos(Float),
> + FormattedFloat0 = change_precision(abs(DecimalPos)-1+Prec, Float)
> + ;
> + %
> + % deal with floats such as ddddddd.mmmmmmmm
> + %
> + ScientificFloat = change_to_e_notation(Float, Prec - 1, "e"),
> + BaseAndExponent = string__words(is_exponent, ScientificFloat),
> + list__index0_det(BaseAndExponent, 0, BaseStr),
> + list__index0_det(BaseAndExponent, 1, ExponentStr),
Use funcs (throughout).
... Remainder of review to follow in near future...
- Ralph
--------------------------------------------------------------------------
mercury-reviews mailing list
post: mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------
More information about the reviews
mailing list