[m-rev.] for review: fix float_to_string so that roundtripping works

Fergus Henderson fjh at cs.mu.OZ.AU
Thu Nov 21 02:26:50 AEDT 2002
Previous message: [m-rev.] for review: fix float_to_string so that roundtripping works
Next message: [m-rev.] for review: fix float_to_string so that roundtripping works
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 20-Nov-2002, Peter Ross <pro at missioncriticalit.com> wrote:
> Estimated hours taken: 8
> Branches: main
> 
> Fix the I/O routines for floats so that can roundtrip.
> 
> library/string.m:
>     Ensure that string__float_to_string returns a float that is
>     round-trippable.  On the C backend we do that by starting at the min
>     precision required and increasing it until the float roundtrips.  On
>     the IL backend the R flag guarantees that a double will be
>     round-trippable, so we just use that.
>     Change string__to_float so that it uses the format string for the
>     precision float that we are using, and export this predicate for use
>     by the trace system.
>     Delete the unused string__float_to_f_string.
> 
> library/io.m:
>     Call string__float_to_string to determine the float to output so
>     that the float output is round-trippable.

This change will decrease efficiency of writing out floats, unfortunately,
for two reasons:

	(1) rather than just formatting the float once,
	    we format the float,
	    then scan it back in again,
	    then check if the two are equal,
	    and in some cases we may have to repeat the whole process
	    two or three times.
	    
	(2) io__write_float allocates space on the Mercury heap
	    and does not reclaim it

(1) is acceptable, since it seems to be the minimum necessary to fix the bug.
But I'd rather avoid (2).

For formatting floats with "%.*g" format, a dynamically sized
heap-allocated buffer is overkill, since the size needed is
bounded by the precision plus a small constant; a fixed-size,
stack-allocated buffer should be sufficient.

> Index: library/io.m
> ===================================================================
> RCS file: /home/staff/zs/imp/mercury/library/io.m,v
> retrieving revision 1.276
> diff -u -r1.276 io.m
> --- library/io.m	6 Nov 2002 04:30:47 -0000	1.276
> +++ library/io.m	20 Nov 2002 10:43:29 -0000
> @@ -4405,16 +4405,6 @@
>  ").
>  
>  :- pragma foreign_proc("C",
> -	io__write_float(Val::in, IO0::di, IO::uo),
> -		[may_call_mercury, promise_pure, tabled_for_io, thread_safe],
> -"
> -	if (ML_fprintf(mercury_current_text_output, ""%#.15g"", Val) < 0) {
> -		mercury_output_error(mercury_current_text_output);
> -	}
> -	MR_update_io(IO0, IO);
> -").
...
> @@ -4846,10 +4805,8 @@
...
> +io__write_float(Stream, Float) -->
> +	io__write_string(Stream, string__float_to_string(Float)).

That is OK as a fall-back, and for the .NET and Java versions,
but the C version should avoid the heap allocation implied by the
call to string__float_to_string.

> Index: library/string.m
> ===================================================================
> RCS file: /home/staff/zs/imp/mercury/library/string.m,v
> retrieving revision 1.184
> diff -u -r1.184 string.m
> --- library/string.m	15 Nov 2002 04:50:37 -0000	1.184
> +++ library/string.m	20 Nov 2002 13:46:46 -0000
> @@ -1824,76 +1824,70 @@
>  %-----------------------------------------------------------------------------%
>  
>  :- pragma foreign_proc("C",
> -	string__float_to_string(FloatVal::in, FloatString::uo),
> +	string__float_to_string(Flt::in, Str::uo),
>  		[will_not_call_mercury, promise_pure, thread_safe], "{
> -	char buf[500];
> -	sprintf(buf, ""%#.15g"", FloatVal);
> -	MR_allocate_aligned_string_msg(FloatString, strlen(buf), MR_PROC_LABEL);
> -	strcpy(FloatString, buf);
> +#ifdef MR_USE_SINGLE_PREC_FLOAT
> +	#define MIN_PRECISION	7
> +	const char *format = ""%f"";
> +#else
> +	#define MIN_PRECISION	15
> +	const char *format = ""%lf"";
> +#endif
> +	int i = MIN_PRECISION;
> +	MR_Float round;
> +
> +		/*
> +		 * Round-trip the float to ensure the precision was 
> +		 * sufficient, and if not then try with the next precision.
> +		*/
> +	Str = MR_make_string(MR_PROC_LABEL, ""%#.*g"", i, Flt);
> +	sscanf(Str, format, &round);
> +
> +	while (round != Flt) {
> +		i++;
> +		Str = MR_make_string(MR_PROC_LABEL, ""#%.*g"", i, Flt);
> +		sscanf(Str, format, &round);
> +	}

You should not use an unbounded loop here.  Some C compilers or libraries
may not be able to round-trip a float no matter how many digits you output.
The C89 standard does not have any guarantees about accuracy in such things.
For such (broken) compilers, it is better to output a value to 17 digits
rather than going into an infinite loop.

Also, your loop structure involves unnecessary (and undesirable) code
duplication.  Better to use

	int i = MIN_PRECISION;
	do {
		Str = MR_make_string(MR_PROC_LABEL, ""#%.*g"", i, Flt);
		sscanf(Str, format, &round);
		i++;
	} while (round != Flt);

Incorporating my earlier suggestion above about avoiding loops gives

	int i = MIN_PRECISION;
	do {
		Str = MR_make_string(MR_PROC_LABEL, ""#%.*g"", i, Flt);
		if (i >= MAX_PRECISION) {
			/*
			** This should be sufficient precision to
			** round-trip any value.  Don't bother checking
			** whether it can actually be round-tripped,
			** since if it can't, this is a bug in the
			** C implementation.
			*/
			break;
		}
		sscanf(Str, format, &round);
		i++;
	} while (round != Flt);

Using a fixed-size buffer instead of MR_make_string() gives

	/*
	** Longest possible string for %.*g format is `-n.nnnnnnE-mmmm',
	** which has size  PRECISION + MAX_EXPONENT_DIGITS + 5
	** (for the `-', `.', `E', '-', and '\0').
	** PRECISION is at most 20, and MAX_EXPONENT_DIGITS is at most 5,
	** so we need at most 30 chars.  80 is way more than enough.
	*/
	#define FLOAT_BUF_SIZE 80

	char buf[FLOAT_BUF_SIZE];

	int i = MIN_PRECISION;
	do {
		sprintf(buf, ""#%.*g"", i, Flt);

		... rest of loop as above ...

This code can be encapsulated in a C macro or a C function (with
`buf' passed in by the caller) which can then be called from both
io__write_float and string__float_to_string.

> +:- pragma foreign_proc("il",
>  	string__float_to_string(FloatVal::in, FloatString::uo),
> +	[will_not_call_mercury, promise_pure, thread_safe, max_stack_size(1)], "
> +	ldloca	'FloatVal'
> +	ldstr	""R""
>  
> +		// The R format string prints the double out such that it
> +		// can be round-tripped.
> +		// XXX According to the documentation it tries the 15 digits of
> +		// precision, then 17 digits skipping 16 digits of precision.
> +		// unlike what we do for the C backend.
> +	call instance string [mscorlib]System.Double::ToString(string)
>  
> +	stloc	'FloatString'
> +").

This should be written in C# rather than IL.  IL is a low-level assembly
language, so we don't want to write it by hand unless we have to.

As proof of this, "ldloca" in the above is incorrect (it should be "ldloc")
and "max_stack_size(1)" is also incorrect (it should be 2).
But please fix these problems by rewriting in C# rather than fixing the IL.

> +:- pragma export(string__to_float(in, out), "ML_string_to_float").
>  :- pragma foreign_proc("C",
>  	string__to_float(FloatString::in, FloatVal::out),
>  		[will_not_call_mercury, promise_pure, thread_safe], "{
> -	/*
> -	** Use a temporary, since we can't don't know whether FloatVal is a
> -	** double or float.  The %c checks for any erroneous characters
> -	** appearing after the float; if there are then sscanf() will
> -	** return 2 rather than 1.

The comment about %c should not be deleted.

> -	** The logic used here is duplicated in the function MR_trace_is_float
> -	** in trace/mercury_trace_util.c.
> -	*/
> -	double tmpf;
> -	char   tmpc;
> +#ifdef MR_USE_SINGLE_PREC_FLOAT
> +  #define FMT    """"
> +#else
> +  #define FMT    ""l""
> +#endif
> +
> +	char   	tmpc;
>  	SUCCESS_INDICATOR =
>  		(!MR_isspace(FloatString[0])) &&
> -		(sscanf(FloatString, ""%lf%c"", &tmpf, &tmpc) == 1);
> +		(sscanf(FloatString, ""%"" FMT ""f%c"", &FloatVal, &tmpc) == 1);
>  		/* MR_TRUE if sscanf succeeds, MR_FALSE otherwise */
> -	FloatVal = tmpf;
> -}").

As a matter of style, I'd put the "f" part of the format in FMT.

The use of `FMT' here is a namespace violation.  s/FMT/ML_FMT/
or s/FMT_ML_FLOAT_FMT/

Also, macros which are defined locally like this should be #undef'd
after their last use.

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.
--------------------------------------------------------------------------
mercury-reviews mailing list
post:  mercury-reviews at cs.mu.oz.au
administrative address: owner-mercury-reviews at cs.mu.oz.au
unsubscribe: Address: mercury-reviews-request at cs.mu.oz.au Message: unsubscribe
subscribe:   Address: mercury-reviews-request at cs.mu.oz.au Message: subscribe
--------------------------------------------------------------------------
Previous message: [m-rev.] for review: fix float_to_string so that roundtripping works
Next message: [m-rev.] for review: fix float_to_string so that roundtripping works
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the reviews mailing list