[m-rev.] for review: add builtin 64-bit integer types -- Part 1

Julien Fischer jfischer at opturion.com
Fri Jan 12 22:49:26 AEDT 2018


Hi Peter,

On Fri, 12 Jan 2018, Peter Wang wrote:

> On Thu, 11 Jan 2018 21:45:51 -0500 (EST), Julien Fischer <jfischer at opturion.com> wrote:
>>
>> Hi all,
>>
>> The attached change begins the process of adding 64-bit integer types.
>> Once it is committed everyone will need to update their compiler before
>> part 2 is committed.
>>
>> Julien.
>
>> Add builtin 64-bit integer types -- Part 1.
>>
>> Add the new builitn types: int64 and uint64.
>
> builtin

Fixed.

>> Support for these new types will need to be bootstrapped over several changes.
>> This is the first such change and does the following:
>>
>> - Extends the compiler to recognise 'int64' and 'uint64' as builtin types.
>> - Extends the set of builtin arithmetic, bitwise and relational operators
>>   to cover the new types.
>> - Adds the new internal option '--unboxed-int64s' to the compiler; this will be
>>   used to control whether 64-bit integer types are boxed or not.
>> - Extends all of the code generators to handle the new types.
>> - Extends the runtimes to support the new types.
>> - Adds new modules to the standard library intend to contain basic operations
>>   on the new types.  (These are currently empty and not documented.)
>>
>> There are bunch of limitations marks with "XXX INT64"; these will be lifted in
>> part 2 of this change.  Also, 64-bit integer types are currently always boxed,
>> again this limitation will be lifted in later changes.
>>
>> compiler/options.m:
>>     Add the new option --unboxed-int64s.
>>
>> compiler/prog_type.m:
>> compiler/prog_data.m:
>> compiler/builtin_lib_types.m:
>>      Recognise int64 and uint64 as builitn types.
>
> builtin

Fixed.

>
>> library/private_builtin.m:
>>     Add placeholders for the builtin unify and compare operations for
>>     the new types.  Since the bootstrapping compiler will not recognise
>>     the new types we give the polymorphic arguments.  These can be
>
> give them

Fixed.

>> diff --git a/compiler/global_data.m b/compiler/global_data.m
>> index 8daa37c..9fceb62 100644
>> --- a/compiler/global_data.m
>> +++ b/compiler/global_data.m
>> @@ -82,8 +82,8 @@
>>
>>  :- type static_cell_info.
>>
>> -:- func init_static_cell_info(module_name, have_unboxed_floats, bool)
>> -    = static_cell_info.
>> +:- func init_static_cell_info(module_name, have_unboxed_floats, bool,
>> +    have_unboxed_int64s) = static_cell_info.
>>
>>  :- pred add_scalar_static_cell(list(typed_rval)::in, data_id::out,
>>      static_cell_info::in, static_cell_info::out) is det.
>> @@ -98,8 +98,9 @@
>>      list(alloc_site_info)::out, map(alloc_site_id, layout_slot_name)::out)
>>      is det.
>>
>> -:- pred find_general_llds_types(have_unboxed_floats::in, list(mer_type)::in,
>> -    list(list(rval))::in, list(llds_type)::out) is semidet.
>> +:- pred find_general_llds_types(have_unboxed_floats::in,
>> +    have_unboxed_int64s::in, list(mer_type)::in, list(list(rval))::in,
>> +    list(llds_type)::out) is semidet.
>>
>>  :- pred add_vector_static_cell(list(llds_type)::in,
>>      list(list(rval))::in, data_id::out,
>> @@ -112,13 +113,15 @@
>>      list(scalar_common_data_array)::out, list(vector_common_data_array)::out)
>>      is det.
>>
>> -    % Given an rval, the value of the --unboxed_float option and the width of
>> -    % the constructor argument, figure out the type the rval would have as an
>> -    % argument. Normally that's the same as its usual type; the exception is
>> -    % that for boxed floats, the type is data_ptr (i.e. the type of the boxed
>> -    % value) rather than float (the type of the unboxed value).
>> +    % Given an rval, the value of the --unboxed-float option, the vlaue of the
>> +    % --unboxed-int64s and the width of the constructor argument, figure out
>> +    % the type the rval would have as an argument. Normally that's the same as
>> +    % its usual type; the exception is that for boxed floats, boxed int64s and
>> +    % boxed uint64s the type is data_ptr (i.e. the type of the boxed value)
>> +    % rather than float, int64, uint64 (the type of the unboxed value).
>
> s/vlaue/value

Fixed.

>>      %
>> -:- func rval_type_as_arg(have_unboxed_floats, arg_width, rval) = llds_type.
>> +:- func rval_type_as_arg(have_unboxed_floats,have_unboxed_int64s,  arg_width,
>> +    rval) = llds_type.
>
> whitespace

Fixed.

>> @@ -330,7 +333,8 @@ make_alloc_id_map(AllocSite, Slot, Slot + 1, !Map) :-
>>      --->    static_cell_sub_info(
>>                  scsi_module_name            :: module_name, % base file name
>>                  scsi_unbox_float            :: have_unboxed_floats,
>> -                scsi_common_data            :: bool
>> +                scsi_common_data            :: bool,
>> +                scsi_unbox_int64s           :: have_unboxed_int64s
>>              ).
>
> Maybe reorder the fields?

Done, and similarly for the function used to initialise this structure.

>> diff --git a/configure.ac b/configure.ac
>> index fcd1f52..7012010 100644
>> --- a/configure.ac
>> +++ b/configure.ac
>> @@ -2325,6 +2325,14 @@ AC_SUBST(MR_LONG_DOUBLE_IS_64_BIT)
>>
>>  #-----------------------------------------------------------------------------#
>>
>> +# XXX INT64 -- while we are boostrapping the changes that introduce int64
>> +# and uint65, we always box them.  Support for unboxed 64-bit integer types
>
> uint64

Fixed.

>> diff --git a/runtime/mercury_int.c b/runtime/mercury_int.c
>> new file mode 100644
>> index 0000000..919c136
>> --- /dev/null
>> +++ b/runtime/mercury_int.c
>> @@ -0,0 +1,25 @@
>> +// vim: ts=4 sw=4 expandtab ft=c
>> +
>> +// Copyright (C) 2017 The Mercury team.
>> +// This file may only be copied under the terms of the GNU Library General
>> +// Public License - see the file COPYING.LIB in the Mercury distribution.
>> +
>> +#include "mercury_int.h"
>> +
>> +MR_Integer MR_hash_int64(int64_t i)
>> +{
>> +    if (sizeof(int64_t) <= sizeof(MR_Integer)) {
>> +        return (MR_Integer) i;
>> +    } else {
>> +        return (uint32_t)i ^ ((uint64_t)i >> 32);
>> +    }
>> +}
>> +
>> +MR_Integer MR_hash_uint64(uint64_t i)
>> +{
>> +    if (sizeof(uint64_t) <= sizeof(MR_Integer)) {
>> +        return (MR_Integer) i;
>> +    } else {
>> +        return (uint32_t)i ^ (i >> 32);
>> +    }
>> +}
>
> Ok. (But we could experiment with other mixing functions later.)

I've added a comment to that effect.  (I've also fixed the incorrect
year in the copyright notice for this file.)


>> diff --git a/runtime/mercury_int.h b/runtime/mercury_int.h
>> index 51088b7..548c2b5 100644
>> --- a/runtime/mercury_int.h
>> +++ b/runtime/mercury_int.h
>> @@ -9,10 +9,102 @@
>>  #ifndef MERCURY_INT_H
>>  #define MERCURY_INT_H
>>
>> -#include "mercury_conf.h"       // for MR_BOXED_FLOAT, MR_CONSERVATIVE_GC
>> +#include "mercury_conf.h"       // for MR_BOXED_INT64S, MR_CONSERVATIVE_GC
>>  #include "mercury_types.h"      // for `MR_Word'
>>  #include "mercury_std.h"        // for `MR_bool'
>>
>> +#define MR_INT64_WORDS          ((sizeof(int64_t) + sizeof(MR_Word) - 1) \
>> +                                        / sizeof(MR_Word))
>> +#define MR_UINT64_WORDS         ((sizeof(uint64_t) + sizeof(MR_Word) - 1) \
>> +                                        / sizeof(MR_Word))
>> +#if defined(MR_BOXED_INT64S)
>> +
>> +   #define MR_word_to_int64(w)   (* (int64_t *) (w))
>> +   #define MR_word_to_uint64(w)   (* (uint64_t *) (w))
>
> Alignment

Fixed.

>> +
>> +   #if defined(MR_CONSERVATIVE_GC)
>> +
>> +    #define MR_int64_to_word(i)                                             \
>> +      (                                                                     \
>> +        MR_hp_alloc_atomic_msg(MR_INT64_WORDS,                              \
>> +            (MR_AllocSiteInfoPtr) MR_ALLOC_SITE_INT64, NULL),               \
>
> Delete cast.

Done.

>> +        * (int64_t *) (void *) (MR_hp - MR_INT64_WORDS) = (i),              \
>> +        /* return */ (MR_Word) (MR_hp - MR_INT64_WORDS)                     \
>> +      )
>> +    #define MR_make_hp_int64_aligned() ((void)0)
>> +
>> +    #define MR_uint64_to_word(u)                                            \
>> +      (                                                                     \
>> +        MR_hp_alloc_atomic_msg(MR_UINT64_WORDS,                             \
>> +            (MR_AllocSiteInfoPtr) MR_ALLOC_SITE_UINT64, NULL),              \
>> +        * (uint64_t *) (void *) (MR_hp - MR_UINT64_WORDS) = (u),            \
>> +        /* return */ (MR_Word) (MR_hp - MR_UINT64_WORDS)                    \
>> +      )
>> +    #define MR_make_hp_uint64_aligned() ((void)0)
>> +
>> +   #else // ! defined(MR_CONSERVATIVE_GC)
>> +    // We need to ensure that what we allocated on the heap is properly
>> +    // aligned for a floating-point value, by rounding MR_hp up to the
>> +    // nearest int64-aligned boundary.
>
> for a int64 value

Fixed.

>> +    // XXX This code assumes that sizeof(int64_t) is a power of two,
>> +    // and not greater than 2 * sizeof(MR_Word).
>
> Delete that comment.

Done.

>> +
>> +    #define MR_make_hp_int64_aligned()                                      \
>> +      ( (MR_Word) MR_hp & (sizeof(int64_t) - 1) ?                           \
>> +            MR_hp_alloc_atomic_msg(1, MR_ALLOC_SITE_INT64, NULL)            \
>> +      :                                                                     \
>> +            (void)0                                                         \
>> +      )
>> +    #define MR_int64_to_word(i)                                             \
>> +      (                                                                     \
>> +        MR_make_hp_int64_aligned(),                                         \
>> +        MR_hp_alloc_atomic_msg(MR_INT64_WORDS, MR_ALLOC_SITE_INT64, NULL),  \
>> +        * (int64_t *) (void *)(MR_hp - MR_INT64_WORDS) = (i),               \
>> +        /* return */ (MR_Word) (MR_hp - MR_INT64_WORDS)                     \
>> +      )
>> +
>> +    #define MR_make_hp_uint64_aligned()                                      \
>> +      ( (MR_Word) MR_hp & (sizeof(uint64_t) - 1) ?                           \
>> +            MR_hp_alloc_atomic_msg(1, MR_ALLOC_SITE_UINT64, NULL)            \
>> +      :                                                                      \
>> +            (void)0                                                          \
>> +      )
>> +    #define MR_uint64_to_word(u)                                             \
>> +      (                                                                      \
>> +        MR_make_hp_uint64_aligned(),                                         \
>> +        MR_hp_alloc_atomic_msg(MR_UINT64_WORDS, MR_ALLOC_SITE_UINT64, NULL), \
>> +        * (uint64_t *) (void *)(MR_hp - MR_UINT64_WORDS) = (u),              \
>> +        /* return */ (MR_Word) (MR_hp - MR_UINT64_WORDS)                     \
>> +      )
>> +
>> +  #endif
>> +
>> +  #if defined(MR_GNUC)
>> +    #define MR_int64_const(i) ({ static const int64_t d = i; (MR_Word) &d; })
>> +    #define MR_uint64_const(u) ({ static const uint64_t d = u; (MR_Word) &d; })
>> +  #else
>> +    #define MR_int64_const(i) MR_int64_to_word(i)   // inefficient
>> +    #define MR_uint64_const(u) MR_uint64_to_word(u)   // inefficient
>> +  #endif
>
> Maybe delete these. MR_float_const is not used either.

Done.  I'll look into possibly removing MR_float_const separately.

>> +#else // not MR_BOXED_INT64S
>> +
>> +  // Unboxed float means we can assume sizeof(int64_t) <= sizeof(MR_Word).
>
> Fix comment.

Fixed.

>> +
>> +  #define MR_make_hp_int64_aligned() ((void)0)
>> +  #define MR_make_hp_uint64_aligned() ((void)0)
>> +
>> +  #define MR_int64_const(i) MR_int64_to_word(i)
>> +  #define MR_uint64_const(u) MR_uint64_to_word(u)
>> +
>> +  #define MR_int64_to_word(i) ((MR_Word)(i))
>> +  #define MR_uint64_to_word(u) ((MR_Word)(u))
>> +
>> +#endif // not MR_BOXED_INT64S
>> +
>> +extern MR_Integer MR_hash_int64(int64_t);
>> +extern MR_Integer MR_hash_uint64(uint64_t);
>> +
>>  #if defined(MR_GNUC) || defined(MR_CLANG)
>>    #define MR_uint16_reverse_bytes(U) __builtin_bswap16((U))
>>  #elif defined(MR_MSVC)
>
>> diff --git a/runtime/mercury_ml_expand_body.h b/runtime/mercury_ml_expand_body.h
>> index 0203ac2..8fbec28 100644
>> --- a/runtime/mercury_ml_expand_body.h
>> +++ b/runtime/mercury_ml_expand_body.h
>> @@ -115,6 +115,7 @@
>>  //  instead, but it is not yet safe to throw exceptions across the C interface.
>>
>>  #include <stdio.h>
>> +#include <inttypes.h>
>
>> @@ -900,6 +901,44 @@ EXPAND_FUNCTION_NAME(MR_TypeInfo type_info, MR_Word *data_word_ptr,
>>
>>              handle_zero_arity_args();
>>              return;
>> +
>> +        case MR_TYPECTOR_REP_INT64:
>> +#ifdef  EXPAND_FUNCTOR_FIELD
>> +            {
>> +                MR_Word data_word;
>> +                char    buf[500];
>> +                int64_t i64;
>> +                char    *str;
>> +
>> +                data_word = *data_word_ptr;
>> +                i64 = MR_word_to_int64(data_word);
>> +                sprintf(buf, "%" PRIx64 "i64", i64);
>
> Should be decimal, right?

Should have been; now is.  (And the other instances you noted.)

...

>> diff --git a/runtime/mercury_tabling_preds.h b/runtime/mercury_tabling_preds.h
>> index e3b8749..bc6432a 100644
>> --- a/runtime/mercury_tabling_preds.h
>> +++ b/runtime/mercury_tabling_preds.h
>> @@ -225,6 +237,16 @@
>>          MR_TABLE_UINT32(stats, debug, back, T, T0, (MR_Unsigned) V);        \
>>      } while (0)
>>
>> +#define MR_tbl_lookup_insert_int64(stats, debug, back, T0, V, T)            \
>> +    do {                                                                    \
>> +        MR_TABLE_INT64(stats, debug, back, T, T0, (MR_Integer) V);          \
>> +    } while (0)
>> +
>> +#define MR_tbl_lookup_insert_uint64(stats, debug, back, T0, V, T)           \
>> +    do {                                                                    \
>> +        MR_TABLE_UINT64(stats, debug, back, T, T0, (MR_Unsigned) V);        \
>> +    } while (0)
>> +
>
> Casts look wrong.

Fixed.

>> diff --git a/runtime/mercury_term_size.c b/runtime/mercury_term_size.c
>> index e2115e1..3253a95 100644
>> --- a/runtime/mercury_term_size.c
>> +++ b/runtime/mercury_term_size.c
>> @@ -321,6 +321,24 @@ try_again:
>>              }
>>  #endif
>>              return 0;
>> +
>> +        case MR_TYPECTOR_REP_INT64:
>> +#ifdef MR_DEBUG_TERM_SIZES
>> +            if (MR_heapdebug && MR_lld_print_enabled) {
>> +                printf(
>> +                    "MR_term_size: int64 %p\n", (void *) term);
>> +            }
>> +#endif
>
> I guess this is wrong.

It's how that section of code also handles floats.  I've added a note
that it might also be useful to also see the values of these terms.  (If
anyone works on the term size profiling code again, they can decide what
to do then.)

Thanks for the review.

Cheers,
Julien.



More information about the reviews mailing list