[m-dev.] for discussion: design issue for new integer types
Julien Fischer
jfischer at opturion.com
Sun Oct 30 12:24:04 AEDT 2016
Hi Zoltan,
On Fri, 28 Oct 2016, Zoltan Somogyi wrote:
> On Fri, 28 Oct 2016 11:51:51 +1100 (AEDT), Julien Fischer <jfischer at opturion.com> wrote:
>
>> In order to not overload the current type checker, literals for each integer
>> type will need to be lexically distinct. My suggestion is that each integer
>> type have a distinguishing suffix. For the fixed size integer types these
>> would be:
>>
>> i8, i16, i32, i64
>> u8, u16, u32, u64
>
> I think those are fine.
>
>> For int and uint, there are a couple of choices:
>>
>> i, u
>> iw, uw (where w == "word sized")
>
> I prefer i and u.
Ok. I've started writing up a change to the reference manual for all
this. I'll add the fixed size types as well and we can comment them out
until they're actually added.
>> The suffix would not be required for literals of type 'int'.
>
> I don't think anyone would argue against that.
I certainly hope not!
...
>> As an aside: it's long since time we allowed some form of separator
>> between groups of digits in integer (and float) literals. I propose
>> that we allow '_' between digits as in Java and C#.
>
> I agree that is a good idea.
>
> A followup question: should we require that the _s be where western
> convention dictates the decimal commas should go, i.e. between
> every third digit? I for one would prefer that, but people using the
> Indian number system, which puts commas around groups of *two*
> digits above the thousands, would probably prefer that there
> not be such a rule (look up "lakh" or "crore" on Wikipedia).
As Peter has mentioned elsewhere in this thread, there are *good* reasons
why their positioning should be left up to the programmer.
> We would need to delete the _s at some point anyway. If we do it
> in the compiler, we can make the code doing the deletion
> generate a warning if the _s are in the "wrong" place, with the
> notion of "wrong" being selected by compiler options such as
> --warn-misplaced-integer-underscores-{western,indian}.
I don't think that's something the compiler should be concerned with
(except possibly in the formatting of error messages).
>> 2. Automatic coercion and promotion.
>>
>> There won't be any in Mercury. If you are converting between integer
>> types then you will be required to say so.
>
> Agreed.
>
> What form would those explicit coercions take? Would we have a specific
> function for each pair of integer types?
Yes. The existing numeric types (int, float, rational, integer) already
define these sorts of coercions; with the new types there's just going
to be a lot more of them.
> How about e.g. i16 to float: would you have to convert the i16 to int first?
I think having the function int16.to_float (or float.from_int16) is reasonable
enough; there's no need to go via an int.
>> 3. Representation of new integer types in the term type.
>>
>> How should the new integer types be represented in the term.const/0
>> type?
>>
>> The obvious way would be:
>>
>> :- type const
>> ---> atom(string)
>> ; integer(int)
>> ; big_integer(integer_base, integer)
>> % An integer that is too big for `int'.
>>
>> ; unsigned_integer(uint)
>> ; big_unsigned_integer(integer_base, integer)
>> % An unsigned integer that is too big for `uint'.
>>
>> ; string(string)
>> ; float(float)
>> ; implementation_defined(string)
>>
>> ; uint8(uint8)
>> ; uint16(uint16)
>> ; uint32(uint32)
>> ; uint64(uint64)
>> ; int8(int8)
>> ; int16(int16)
>> ; int32(int32)
>> ; int64(int64).
>
> I would instead suggest that we keep just the existing
> integer and big_integer functors, and add a new argument to both.
> This argument would say int vs uint, and 8 vs 16 vs 32 vs 64 vs
> default size, *purely on the basis of the suffix, without any check
> in the scanner*, for the reason given above.
>
> To allow the underscore check mentioned above, the existing argument
> of the integer and big_integer functors would need to be a string,
> with the conversion done in the compiler. However, doing that
> would erase the need for the big_integer functor, since the integer
> functor would then be able to represent everything it can.
I prefer the second scheme.
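To illustrate the second scheme, here is a minimal sketch in Python (not Mercury, and not the actual lexer) of a scanner that classifies a decimal literal purely by its suffix and keeps the digits as an uninterpreted string. The names IntegerToken and split_literal are hypothetical, and the suffix set assumes the i/u choice for the word sized types.

```python
import re
from typing import NamedTuple, Optional


class IntegerToken(NamedTuple):
    digits: str           # digits exactly as written, underscores included
    signed: bool          # True for the int types, False for the uint types
    size: Optional[int]   # 8, 16, 32 or 64; None means word sized


# Decimal digits (with optional underscores), then an optional suffix:
# 'i' or 'u', optionally followed by a bit size.
_LITERAL = re.compile(r"([0-9][0-9_]*)(?:([iu])(8|16|32|64)?)?")


def split_literal(text: str) -> IntegerToken:
    """Split a literal into digits, signedness and size by suffix alone.

    Deliberately performs no range check and no underscore placement
    check; those are left to a later stage.
    """
    m = _LITERAL.fullmatch(text)
    if m is None:
        raise ValueError(f"not an integer literal: {text!r}")
    digits, sign, size = m.groups()
    # A missing suffix means plain 'int': signed and word sized.
    return IntegerToken(digits, sign != "u", int(size) if size else None)
```

Note that split_literal("999i8") is accepted here even though 999 does not fit in eight bits, because the scanner only records what the suffix says.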
> Two other things. First, some people may be using the library's
> lexer and parser modules for their own purposes (e.g. Prolog interpreters),
> so if we change their basic representation, we should add their old
> versions to e.g. extras under names such as old_{lexer,parser}.m.
Ok, I will add a copy of the existing modules to extras.
> Second, I have a big outstanding change to fact_table.m that would
> be affected by a change to the term type, so please warn me before
> committing such a change.
Will do. Such a change is some way off yet in any case.
> A question you did not ask was how the representation of integers
> should change in the HLDS, i.e. in the cons_id type. I think I would
> prefer adding a size argument to the int_const and uint_const
> functors to adding new int8_const, int16_const etc. functors
> to the type, because most code would want to treat all integers
> the same regardless of size.
Ok.
> I would even prefer to erase the distinction between int_const and
> uint_const, but realize that this cannot be done, because in the HLDS, we
> definitely want the constant in integer, not string, form, and there is no
> word sized type that can hold both all ints and all uints. However, we could
> switch to int_const(integer, signedness, maybe(size)).
Ok.
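To make the int_const(integer, signedness, maybe(size)) shape concrete, here is a rough Python sketch (not Mercury, and not actual compiler code) of such a constant together with the kind of range check it makes easy to write once for all sizes. The names IntConst, to_int_const and in_range are hypothetical, and the 64-bit default for word sized constants is an assumption.

```python
from typing import NamedTuple, Optional


class IntConst(NamedTuple):
    value: int            # arbitrary precision, standing in for `integer'
    signed: bool          # the signedness argument
    size: Optional[int]   # the maybe(size) argument; None means word sized


def to_int_const(digits: str, signed: bool, size: Optional[int]) -> IntConst:
    """Convert the string form of a literal to its numeric form.

    Underscores are dropped wherever they appear; their placement is
    not checked here.
    """
    return IntConst(int(digits.replace("_", "")), signed, size)


def in_range(c: IntConst, word_bits: int = 64) -> bool:
    """Check that the value fits the type implied by signedness and size."""
    bits = c.size if c.size is not None else word_bits
    if c.signed:
        return -(1 << (bits - 1)) <= c.value < (1 << (bits - 1))
    return 0 <= c.value < (1 << bits)
```

With a single functor, code that does not care about the size can ignore the last two fields, while a check such as in_range handles every size in one place.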
> The checks I mentioned above (does e.g. an i8 fit in -128 to 127,
> are the _s in the right place) would naturally fit in the code
> (in superhomogeneous.m, I think) that converts from term consts
> to cons_ids.
>
>> 4. poly_type and format.
>> 5. Reverse modes of arithmetic operations.
>
> I will comment on these later.
Do you have a preference as to the type of the second operand of the
shift operations (point 6 in my original post)?
Julien.