[m-dev.] for review: string__fmt

Peter Ross peter.ross at miscrit.be
Tue Aug 8 19:57:42 AEST 2000


On Tue, Aug 08, 2000 at 05:26:59PM +1000, Fergus Henderson wrote:
> On 07-Aug-2000, Peter Ross <peter.ross at miscrit.be> wrote:
> > 
> > library/string.m:
> >     Implement a new predicate string__fmt.  It provides almost the same
> >     functionality as string__format, but uses a lot less memory and is
> >     more compatible with the C version of sprintf seeing as that is what
> >     it calls to do all of its formatting.
> 
> It seems like poor interface design to have two functions
> `string__format' and `string__fmt' that do almost the same thing but
> with some subtle differences.  It seems that the differences between
> the two are artifacts of their implementation, rather than being a
> deliberate interface design choice.  Why not just have one function
> that does it right?
> 
For some inputs string__fmt and string__format will give different
outputs which may cause problems for our users.  This is why I left the
old version in.

With my personal experience of string__format, I am inclined to removing
it but didn't want to do it without consultation.

> For example, couldn't you modify the code for string__fmt to handle
> the `*' modifier, and then just replace the old code for string__format
> with the new code for string__fmt?
> 
All the possible solutions that I thought of that handled the `*' modifier
increased the complexity of the code substantially or increased the
memory requirements.

> > Index: library/string.m
> > @@ -341,8 +340,16 @@
> >  %		It is always better to include a `.' to remove ambiguity.  This
> >  %		interpretation is non-standard and may change.
> >  %
> > -%		Numbers are truncated by a precision value, not rounded off.
> 
> Why did you remove that comment?
> I think it should stay.
> 
Cut and paste rather then copy and paste error.  The comment will be
reinstated.


> > +	% Implementation of append_list that uses C as this minimises the
> > +	% amount of garbage created.
> > +:- func fast_append_list(list(string)) = string.
> > +:- pragma c_code(fast_append_list(Strs::in) = (Str::out),
> 
> Why have two seperate procedures `string__append_list' and
> `string__fast_append_list'?  Why not just make the ordinary
> `string__append_list' fast?
> 
Because string__append_list has two modes and I didn't want to implement
the second mode in C.  Any suggestions on how to avoid this problem?

> > +:- type specifier
> > +	--->	conv(list(char))	% a list of chars which contains
> > +					% one conversion specifier.
> > +	;	string(list(char)).	% a list of chars which may only
> > +					% contain the "%%" conversion
> > +					% specifier.
> 
> One way to handle `*' conversions would be to change that to
> 
> 	:- type specifier
> 		--->	zero_arg_conv(list(char))
> 				% a format specifier that requires
> 				% no arguments (i.e. `%%')
> 		;	one_arg_conv(list(char))
> 				% a format specifier that requires
> 				% one argument (e.g. `%d', `%s', etc.)
> 		;	two_arg_conv(list(char))
> 				% a format specifier that requires
> 				% two arguments (e.g. `%*d', `%*s', etc.)
> 		;	string(list(char)).
> 				% a list of chars which does not
> 				% contain `%'
> 
> and modify the rest of the code accordingly.
> 
> > +:- pred length_mod(list(char)::in, list(char)::out) is semidet.
> > +
> > +length_mod --> ['h'], ( ['h'] -> [] ; [] ).
> > +length_mod --> ['l'], ( ['l'] -> [] ; [] ).
> > +length_mod --> ['L'].
> > +length_mod --> ['j'].
> > +length_mod --> ['z'].
> > +length_mod --> ['t'].
> 
> Most of these (`hh', `ll', `j', `z', and `t') are new in C99.
> They were not included in C89.  If someone uses these length
> modifiers, they may run into portability problems.
> 
> But in any case, you can't allow arbitrary length modifiers here,
> since what you actually pass to sprintf() will have a definite fixed size.
> If you pass a value of type `long', but give it the format specifier
> `%d' or `%hd', then bad things will happen...
> 
OK, I need to think a bit more about how to handle this

> > +	% Create a string from a float using the format string.
> > +:- func float_to_string(string, float) = string.
> 
> I think it would be clearer to name these functions
> `format_float/2', `format_string/2', etc.
> rather than `float_to_string/2', `string_to_string/2', etc.
> 
> You should document that it is the caller's responsibility to
> check the validity of the format string.
> 	
> > +:- pragma c_code(float_to_string(FormatStr::in, Val::in) = (Str::out),
> > +		[will_not_call_mercury, thread_safe], "{
> > +	char buf[500];
> > +	Word tmp;
> > +	sprintf(buf, FormatStr, Val);
> 
> Here you are passing an `MR_Float' to sprintf().
> But the format string might not be valid for an MR_Float.
> 
> Probably you should (a) cast the argument to sprintf() from `MR_Float'
> to `long double', so that it has a known type, and (b) replace the
> length modifier in the format string with a length modifier
> appropriate for long double, i.e. "%Lf".
> (a) would be done here, while (b) would be done while parsing the
> format string.
> 
> Likewise for int_to_string, except for that it is a little bit
> more complicated.  There are three cases:
> 
> 	- if MR_Integer is not long long, then you should use `long'
> 	  and `%ld'
> 
> 	- if MR_Integer is long long, and the C standard library supports
> 	  the `%lld' format, then you should use long long and `%lld'.
> 
> 	- if MR_Integer is long long, and the C standard library does not
> 	  support the `%lld' format -- as can happen when you're using
> 	  a modern compiler such as gcc but an old standard library --
> 	  then you're a bit screwed.  Short of reimplementing printf(),
> 	  probably the best approach in that situation is to check
> 	  that the value is in range for `long'; if so, you can cast to
> 	  long and use `%d', and if not, then you can throw an exception,
> 	  or perhaps just insert "<overflow>" in the output.
> 
and this.

> > +	% Create a string from a string using the format string.
> > +:- func string_to_string(string, string) = string.
> > +:- pragma c_code(string_to_string(FormatStr::in, Val::in) = (Str::out),
> > +		[will_not_call_mercury, thread_safe], "{
> > +	char buf[500];
> > +	Word tmp;
> > +	sprintf(buf, FormatStr, Val);
> > +	incr_hp_atomic_msg(tmp, (strlen(buf) + sizeof(Word)) / sizeof(Word),
> > +		MR_PROC_LABEL, ""string:string/0"");
> > +	Str = (char *)tmp;
> > +	strcpy(Str, buf);
> 
> The fixed-length buffer here is a problem.
> At very least, the buffer should be much larger.
> But of course that won't solve the problem.
> 
> A better approach would be to check the amount of space
> you need, and if it doesn't fit in the fixed-size buffer,
> then dynamically allocate a big enough buffer.
> On systems that support snprintf(), you could use that.
> If not, then you could check the length of `Val'...
> 
Good idea.

Pete
--------------------------------------------------------------------------
mercury-developers mailing list
Post messages to:       mercury-developers at cs.mu.oz.au
Administrative Queries: owner-mercury-developers at cs.mu.oz.au
Subscriptions:          mercury-developers-request at cs.mu.oz.au
--------------------------------------------------------------------------



More information about the developers mailing list