[m-dev.] for discussion: design issue for new integer types

Julien Fischer jfischer at opturion.com
Sun Oct 30 12:24:04 AEDT 2016



Hi Zoltan,

On Fri, 28 Oct 2016, Zoltan Somogyi wrote:

> On Fri, 28 Oct 2016 11:51:51 +1100 (AEDT), Julien Fischer <jfischer at opturion.com> wrote:
>
>> In order to not overload the current type checker, literals for each integer
>> type will need to be lexically distinct.  My suggestion is that each integer
>> type have a distinguishing suffix.  For the fixed size integer types these
>> would be:
>>
>>        i8, i16, i32, i64
>>        u8, u16, u32, u64
>
> I think those are fine.
>
>> For int and uint, there are a couple of choices:
>>
>>        i, u
>>        iw, uw     (where w == "word sized")
>
> I prefer i and u.

Ok. I've started writing up a change to the reference manual for all
this.  I'll document the fixed size types as well, and we can comment
those parts out until the types are actually added.

>> The suffix would not be required for literals of type 'int'.
>
> I don't think anyone would argue against that.

I certainly hope not!

...

>> As an aside: it's long since time we allowed some form of separator
>> between groups of digits in integer (and float) literals.  I propose
>> that we allow '_' between digits as in Java and C#.
>
> I agree that is a good idea.
>
> A followup question: should we require that the _s be where Western
> convention dictates the decimal commas should go, i.e. between
> every third digit? I for one would prefer that, but people using the
> Indian numbering system, which puts commas around groups of *two*
> digits above the thousands, would probably prefer that there
> not be such a rule (look up "lakh" or "crore" on Wikipedia).

As Peter has mentioned elsewhere in this thread, there are *good* reasons
why their positioning should be left up to the programmer.

> We would need to delete the _s at some point anyway. If we do it
> in the compiler, we can make the code doing the deletion
> generate a warning if the _s are in the "wrong" place, with the
> notion of "wrong" being selected by compiler options such as
> --warn-misplaced-integer-underscores-{western,indian}.

I don't think that's something the compiler should be concerned with
(except possibly in the formatting of error messages).
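
To make the lexical side concrete, here is a rough sketch of splitting a
literal into its digits and its suffix.  It is in Python purely for
illustration; split_literal is a made-up name, it only handles decimal
literals, and the rule that '_' may only appear between digits is my
reading of the Java/C# convention rather than anything we have settled.

```python
import re

# Suffixes as proposed in this thread: i8..i64, u8..u64, plus i and u.
# A trailing or leading '_' is rejected; grouping is otherwise unrestricted.
_LITERAL = re.compile(
    r"([0-9]+(?:_+[0-9]+)*)(i8|i16|i32|i64|u8|u16|u32|u64|i|u)?"
)


def split_literal(text):
    """Return (digits, suffix) for a decimal integer literal.

    The underscores are deleted from the digits; a literal with no
    suffix is treated as a plain 'int' and reported with suffix "i".
    """
    match = _LITERAL.fullmatch(text)
    if match is None:
        raise ValueError("malformed integer literal: " + text)
    digits = match.group(1).replace("_", "")
    suffix = match.group(2) or "i"
    return digits, suffix
```

Since no grouping rule is imposed, both 1_000_000 and 10_00_000 are
accepted and denote the same number.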

>> 2. Automatic coercion and promotion.
>>
>> There won't be any in Mercury.  If you are converting between integer
>> types then you will be required to say so.
>
> Agreed.
>
> What form would those explicit coercions take? Would we have a specific
> function for each pair of integer types?

Yes.  The existing numeric types (int, float, rational, integer) already
define this sort of coercion; with the new types there are just going
to be a lot more of them.

> How about e.g. i16 to float: would you have to convert the i16 to int first?

I think having the function int16.to_float (or float.from_int16) is
reasonable enough; there's no need to go via an int.

>> 3. Representation of new integer types in the term type.
>>
>> How should the new integer types be represented in the term.const/0
>> type?
>>
>> The obvious way would be:
>>
>>      :- type const
>>          --->    atom(string)
>>          ;       integer(int)
>>          ;       big_integer(integer_base, integer)
>>                  % An integer that is too big for `int'.
>>
>>          ;       unsigned_integer(uint)
>>          ;       big_unsigned_integer(integer_base, integer)
>>                  % An unsigned integer that is too big for `uint'.
>>
>>          ;       string(string)
>>          ;       float(float)
>>          ;       implementation_defined(string)
>>
>>          ;       uint8(uint8)
>>          ;       uint16(uint16)
>>          ;       uint32(uint32)
>>          ;       uint64(uint64)
>>          ;       int8(int8)
>>          ;       int16(int16)
>>          ;       int32(int32)
>>          ;       int64(int64).
>
> I would instead suggest that we keep just the existing
> integer and big_integer functors, and add a new argument to both.
> This argument would say int vs uint, and 8 vs 16 vs 32 vs 64 vs
> default size, *purely on the basis of the suffix, without any check
> in the scanner*, for the reason given above.
>
> To allow the underscore check mentioned above, the existing argument
> of the integer and big_integer functors would need to be a string,
> with the conversion done in the compiler. However, doing that
> would erase the need for the big_integer functor, since the integer
> functor would then be able to represent everything big_integer can.

I prefer the second scheme.
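
As I understand it, the second scheme amounts to something like the
following.  This is a Python sketch for illustration only; IntegerTerm,
integer_term and SUFFIXES are hypothetical names, not anything in the
library.  The point is that the term keeps the literal as a string, and
that the signedness and size come from the suffix alone, with no range
check at this stage.

```python
from dataclasses import dataclass
from typing import Optional

# Suffix -> (signed?, size in bits); None means the default (word) size.
SUFFIXES = {
    "i": (True, None), "u": (False, None),
    "i8": (True, 8), "i16": (True, 16), "i32": (True, 32), "i64": (True, 64),
    "u8": (False, 8), "u16": (False, 16), "u32": (False, 32), "u64": (False, 64),
}


@dataclass(frozen=True)
class IntegerTerm:
    text: str            # the digits exactly as written, underscores included
    signed: bool
    size: Optional[int]  # None = default size


def integer_term(text, suffix="i"):
    """Build the term purely from the suffix; the value is not examined."""
    signed, size = SUFFIXES[suffix]
    return IntegerTerm(text, signed, size)
```

Note that integer_term("300", "u8") is a perfectly good term here even
though 300 does not fit in a uint8; rejecting it is left to a later pass.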

> Two other things. First, some people may be using the library's
> lexer and parser modules for their own purposes (e.g. Prolog interpreters),
> so if we change their basic representation, we should add their old
> versions to e.g. extras under names such as old_{lexer,parser}.m.

Ok, I will add a copy of the existing modules to extras.

> Second, I have a big outstanding change to fact_table.m that would
> be affected by a change to the term type, so please warn me before
> committing such a change.

Will do.  Such a change is some way off yet in any case.

> A question you did not ask was how the representation of integers
> should change in the HLDS, i.e. in the cons_id type. I think I would
> prefer adding a size argument to the int_const and uint_const
> functors to adding new int8_const, int16_const etc. functors
> to the type, because most code would want to treat all integers
> the same regardless of size.

Ok.

> I would even prefer to erase the distinction between int_const and
> uint_const, but realize that this cannot be done, because in the HLDS, we
> definitely want the constant in integer, not string, form, and there is no
> word sized type that can hold both all ints and all uints. However, we could
> switch to int_const(integer, signedness, maybe(size)).

Ok.
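
For the record, the conversion from a term constant to something shaped
like int_const(integer, signedness, maybe(size)), together with the range
check discussed in this thread, could look roughly like this.  Again this
is a Python sketch and not compiler code: to_int_const and int_range are
hypothetical names, Python's unbounded int stands in for our integer
type, and word_bits = 64 is an assumption, since the default size depends
on the target.

```python
from typing import Optional


def int_range(signed, size, word_bits=64):
    """Return (lowest, highest) for the given signedness and size.

    size is in bits; None stands for the default (word sized) type.
    """
    bits = word_bits if size is None else size
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def to_int_const(digits, signed, size: Optional[int]):
    """Convert the literal's text to (value, signed, size), or raise
    ValueError if the value does not fit the type named by the suffix."""
    value = int(digits.replace("_", ""))
    lowest, highest = int_range(signed, size)
    if not lowest <= value <= highest:
        raise ValueError("literal out of range: " + digits)
    return value, signed, size
```

With this, an i8 is accepted exactly when it lies in -128 to 127, and a
u8 exactly when it lies in 0 to 255.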

> The checks I mentioned above (does e.g. an i8 fit in -128 to 127,
> are the _s in the right place) would naturally fit in the code
> (in superhomogeneous.m, I think) that converts from term consts
> to cons_ids.
>
>> 4. poly_type and format.
>> 5. Reverse modes of arithmetic operations.
>
> I will comment on these later.

Do you have a preference as to the type of the second operand of the
shift operations (point 6 in my original post)?

Julien.

