[m-rev.] for review: unboxed float ctor args in high-level C grades on 32-bit

Peter Wang novalazy at gmail.com
Wed Aug 24 15:01:57 AEST 2011
Previous message: [m-rev.] diff: more set_of_var interoperability improvements
Next message: [m-rev.] for review: support using clang as a C compiler with Mercury
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Branches: main

Store double-precision `float' constructor arguments in unboxed form,
in high-level C grades on 32-bit platforms, i.e. `float' (and equivalent)
arguments may occupy two machine words.

As the C code generated by the MLDS back-end makes use of MR_Float variables
and parameters, float (un)boxing may be reduced substantially in many programs.

compiler/prog_data.m:
	Add `double_word' as a new option for constructor argument widths,
	only used for float arguments as yet.

compiler/make_hlds_passes.m:
	Set constructor arguments to have `double_word' width if required,
	and possible.

compiler/type_util.m:
	Add helper predicate.

compiler/builtin_ops.m:
compiler/c_util.m:
compiler/llds.m:
	Add two new binary operators used by the MLDS back-end.

compiler/arg_pack.m:
	Handle `double_word' arguments.

compiler/ml_code_util.m:
	Deciding whether or not a float constructor argument requires boxing
	now depends on the width of the field.

compiler/ml_global_data.m:
	When a float constant appears as an initialiser of a generic array
	element, it is now always unboxed, irrespective of --unboxed-float.

compiler/ml_type_gen.m:
	Take double-word arguments into account when generating structure
	fields.

compiler/ml_unify_gen.m:
	Handle double-word float constructor arguments in (de)constructions.
	In some cases we break a float argument into its two words, so
	generating two assignments statements or two separate rvals.

	Take double-word arguments into account when calculating field offsets.

compiler/mlds_to_c.m:
	The new binary operators require no changes here.

	As a special case, write `MR_float_from_dword_ptr(&X)' instead of
	`MR_float_from_dword(X, Y)' when X, Y are consecutive words within a
	field. The definition of `MR_float_from_dword_ptr' is more
	straightforward, and gcc produces better code than if we use the more
	general `MR_float_from_dword'.

compiler/rtti_out.m:
	For double-word arguments, generate MR_DuArgLocn structures with
	MR_arg_bits set to -1.

compiler/rtti_to_mlds.m:
	Handle double-word arguments in field offset calculation.

compiler/unify_gen.m:
	Partially handle double_word arguments in LLDS back-end.

compiler/handle_options.m:
	Set --unboxed-float when targetting Java, C# and Erlang.

compiler/structure_reuse.direct.choose_reuse.m:
	Rename a predicate.

compiler/bytecode.m:
compiler/equiv_type.m:
compiler/equiv_type_hlds.m:
compiler/llds_to_x86_64.m:
compiler/mlds_to_gcc.m:
compiler/mlds_to_il.m:
compiler/opt_debug.m:
	Conform to changes.

library/construct.m:
library/store.m:
	Handle double-word constructor arguments.

runtime/mercury_conf.h.in:
	Clarify what `MR_BOXED_FLOAT' now means.

runtime/mercury_float.h:
	Add helper macros for converting between doubles and word/dwords.

runtime/mercury_deconstruct.c:
runtime/mercury_deconstruct.h:
	Add a macro `MR_arg_value' and a helper function to extract a
	constructor argument value.  This replaces `MR_unpack_arg'.

runtime/mercury_type_info.h:
	Remove `MR_unpack_arg'.

	Document that MR_DuArgLocn.MR_arg_bits may be -1.

runtime/mercury_deconstruct_macros.h:
runtime/mercury_deep_copy_body.h:
runtime/mercury_ml_arg_body.h:
runtime/mercury_table_type_body.h:
runtime/mercury_tabling.c:
runtime/mercury_type_info.c:
	Handle double-word constructor arguments.

tests/hard_coded/Mercury.options:
tests/hard_coded/Mmakefile:
tests/hard_coded/lco_double.exp:
tests/hard_coded/lco_double.m:
tests/hard_coded/pack_args_float.exp:
tests/hard_coded/pack_args_float.m:
	Add test cases.

trace/mercury_trace_vars.c:
	Conform to changes.

diff --git a/compiler/arg_pack.m b/compiler/arg_pack.m
index 0961005..a84449c 100644
--- a/compiler/arg_pack.m
+++ b/compiler/arg_pack.m
@@ -77,6 +77,9 @@ do_pack_args(ShiftCombine, [Width | Widths], [Arg0 | Args0], [Arg | Args],
         Width = full_word,
         Shift = 0
     ;
+        Width = double_word,
+        Shift = 0
+    ;
         Width = partial_word_first(_Mask),
         Shift = 0
     ;
@@ -116,6 +119,9 @@ count_distinct_words([H | T]) = Words :-
         H = full_word,
         Words = 1 + count_distinct_words(T)
     ;
+        H = double_word,
+        Words = 2 + count_distinct_words(T)
+    ;
         H = partial_word_first(_),
         Words = 1 + count_distinct_words(T)
     ;
@@ -132,6 +138,10 @@ chunk_list_by_words([W | Ws], [X | Xs], Xss) :-
         chunk_list_by_words(Ws, Xs, Xss0),
         Xss = [[X] | Xss0]
     ;
+        W = double_word,
+        % Not yet supported in LLDS grades.
+        sorry($module, $pred, "double_word")
+    ;
         W = partial_word_first(_),
         split_at_next_word(Ws, WsTail, Xs, XsHead, XsTail),
         chunk_list_by_words(WsTail, XsTail, Xss0),
@@ -152,6 +162,7 @@ split_at_next_word([], [], XsTail, [], XsTail).
 split_at_next_word([W | Ws], WsTail, [X | Xs], XsHead, XsTail) :-
     (
         ( W = full_word
+        ; W = double_word
         ; W = partial_word_first(_)
         ),
         WsTail = [W | Ws],
diff --git a/compiler/builtin_ops.m b/compiler/builtin_ops.m
index ee52f39..cf9e5c1 100644
--- a/compiler/builtin_ops.m
+++ b/compiler/builtin_ops.m
@@ -87,6 +87,8 @@
     ;       float_gt
     ;       float_le
     ;       float_ge
+    ;       float_word_bits
+    ;       float_from_dword
 
     ;       compound_eq
     ;       compound_lt.
diff --git a/compiler/bytecode.m b/compiler/bytecode.m
index cd65b97..174c666 100644
--- a/compiler/bytecode.m
+++ b/compiler/bytecode.m
@@ -1069,6 +1069,8 @@ binop_code(unsigned_le,             36).
 binop_code(compound_eq,             37).
 binop_code(compound_lt,             38).
 binop_code(str_cmp,                 39).
+binop_code(float_word_bits,         40).
+binop_code(float_from_dword,        41).
 
 :- pred binop_debug(binary_op::in, string::out) is det.
 
@@ -1112,6 +1114,8 @@ binop_debug(unsigned_le,            "unsigned_le").
 binop_debug(compound_eq,            "compound_eq").
 binop_debug(compound_lt,            "compound_lt").
 binop_debug(str_cmp,                "strcmp").
+binop_debug(float_word_bits,        "float_word_bits").
+binop_debug(float_from_dword,       "float_from_dword").
 
 :- pred unop_code(unary_op::in, int::out) is det.
 
diff --git a/compiler/c_util.m b/compiler/c_util.m
index 0a06f6a..fa6b739 100644
--- a/compiler/c_util.m
+++ b/compiler/c_util.m
@@ -588,6 +588,8 @@ binop_category_string(int_ge, int_or_bool_binary_infix_binop, ">=").
 
 binop_category_string(str_cmp, macro_binop, "MR_strcmp").
 binop_category_string(body, macro_binop, "MR_body").
+binop_category_string(float_word_bits, macro_binop, "MR_float_word_bits").
+binop_category_string(float_from_dword, macro_binop, "MR_float_from_dword").
 
 %-----------------------------------------------------------------------------%
 
diff --git a/compiler/equiv_type.m b/compiler/equiv_type.m
index 20800aa..8fac69c 100644
--- a/compiler/equiv_type.m
+++ b/compiler/equiv_type.m
@@ -1055,6 +1055,8 @@ replace_in_ctor_arg_list_2(Location, EqvMap, Seen,
     (
         Width = full_word
     ;
+        Width = double_word
+    ;
         ( Width = partial_word_first(_)
         ; Width = partial_word_shifted(_, _)
         ),
diff --git a/compiler/equiv_type_hlds.m b/compiler/equiv_type_hlds.m
index f9f7fbd..f712b73 100644
--- a/compiler/equiv_type_hlds.m
+++ b/compiler/equiv_type_hlds.m
@@ -326,7 +326,9 @@ replace_in_constructor_arg(EqvMap, CtorArg0, CtorArg, !TVarSet) :-
     (
         Changed = yes,
         (
-            Width = full_word,
+            ( Width = full_word
+            ; Width = double_word
+            ),
             CtorArg = ctor_arg(MaybeFieldName, Type, Width, Context)
         ;
             ( Width = partial_word_first(_)
diff --git a/compiler/handle_options.m b/compiler/handle_options.m
index b646ffa..8667e98 100644
--- a/compiler/handle_options.m
+++ b/compiler/handle_options.m
@@ -739,6 +739,7 @@ convert_options_to_globals(OptionTable0, Target, GC_Method, TagsMethod0,
     %     and better for interoperability.
     %     (In theory --low-level-data should work too,
     %     but there's no reason to bother supporting it.)
+    %   - unboxed floats
     %   - turning off nested functions
     %     Because Java doesn't support nested functions.
     %   - using copy-out for both det and nondet output arguments
@@ -774,6 +775,7 @@ convert_options_to_globals(OptionTable0, Target, GC_Method, TagsMethod0,
             !Globals),
         globals.set_option(highlevel_code, bool(yes), !Globals),
         globals.set_option(highlevel_data, bool(yes), !Globals),
+        globals.set_option(unboxed_float, bool(yes), !Globals),
         globals.set_option(gcc_nested_functions, bool(no), !Globals),
         globals.set_option(nondet_copy_out, bool(yes), !Globals),
         globals.set_option(det_copy_out, bool(yes), !Globals),
@@ -804,6 +806,7 @@ convert_options_to_globals(OptionTable0, Target, GC_Method, TagsMethod0,
     %   - gc_method `automatic' and no heap reclamation on failure
     %     because GC is handled automatically by the Erlang
     %     implementation.
+    %   - unboxed floats
     %   - delay-partial-instantiations
     %   - no-can-compare-constants-as-ints
     %   - can-compare-compound-values
@@ -817,6 +820,7 @@ convert_options_to_globals(OptionTable0, Target, GC_Method, TagsMethod0,
         Target = target_erlang,
         globals.set_gc_method(gc_automatic, !Globals),
         globals.set_option(gc, string("automatic"), !Globals),
+        globals.set_option(unboxed_float, bool(yes), !Globals),
         globals.set_option(reclaim_heap_on_nondet_failure, bool(no),
             !Globals),
         globals.set_option(reclaim_heap_on_semidet_failure, bool(no),
diff --git a/compiler/llds.m b/compiler/llds.m
index cb6f876..0c8afa6 100644
--- a/compiler/llds.m
+++ b/compiler/llds.m
@@ -1598,6 +1598,8 @@ binop_return_type(float_lt, lt_bool).
 binop_return_type(float_gt, lt_bool).
 binop_return_type(float_le, lt_bool).
 binop_return_type(float_ge, lt_bool).
+binop_return_type(float_word_bits, lt_word).
+binop_return_type(float_from_dword, lt_float).
 binop_return_type(body, lt_word).
 binop_return_type(compound_eq, lt_bool).
 binop_return_type(compound_lt, lt_bool).
diff --git a/compiler/llds_to_x86_64.m b/compiler/llds_to_x86_64.m
index b5a725f..319edc1 100644
--- a/compiler/llds_to_x86_64.m
+++ b/compiler/llds_to_x86_64.m
@@ -871,6 +871,8 @@ binop_instr(float_lt, _, _, [x86_64_comment("<<float_lt>>")]).
 binop_instr(float_gt, _, _, [x86_64_comment("<<float_gt>>")]).
 binop_instr(float_le, _, _, [x86_64_comment("<<float_le>>")]).
 binop_instr(float_ge, _, _, [x86_64_comment("<<float_ge>>")]).
+binop_instr(float_word_bits, _, _, [x86_64_comment("<<float_word_bits>>")]).
+binop_instr(float_from_dword, _, _, [x86_64_comment("<<float_from_dword>>")]).
 binop_instr(compound_eq, _, _, [x86_64_comment("<<compound_eq>>")]).
 binop_instr(compound_lt, _, _, [x86_64_comment("<<compound_lt>>")]).
 
diff --git a/compiler/make_hlds_passes.m b/compiler/make_hlds_passes.m
index ac8ae62..8f03247 100644
--- a/compiler/make_hlds_passes.m
+++ b/compiler/make_hlds_passes.m
@@ -171,19 +171,12 @@ do_parse_tree_to_hlds(Globals, DumpBaseFileName, unit_module(Name, Items),
     (
         InvalidTypes1 = no,
         some [!TypeTable] (
-            % Figure out how arguments should be packed into fields,
+            % Figure out how arguments should be stored into fields
             % before constructors are added to the HLDS.
-            globals.lookup_bool_option(Globals, allow_argument_packing,
-                ArgPacking),
             module_info_get_type_table(!.ModuleInfo, !:TypeTable),
-            (
-                ArgPacking = yes,
-                foldl_over_type_ctor_defns(pack_du_type_args(!.ModuleInfo),
-                    !.TypeTable, !TypeTable),
-                module_info_set_type_table(!.TypeTable, !ModuleInfo)
-            ;
-                ArgPacking = no
-            ),
+            foldl_over_type_ctor_defns(decide_du_type_layout(!.ModuleInfo),
+                !.TypeTable, !TypeTable),
+            module_info_set_type_table(!.TypeTable, !ModuleInfo),
 
             % Add constructors and special preds to the HLDS. This must be done
             % after adding all type and `:- pragma foreign_type' declarations.
@@ -233,16 +226,16 @@ do_parse_tree_to_hlds(Globals, DumpBaseFileName, unit_module(Name, Items),
 
 %-----------------------------------------------------------------------------%
 
-:- pred pack_du_type_args(module_info::in, type_ctor::in, hlds_type_defn::in,
-    type_table::in, type_table::out) is det.
+:- pred decide_du_type_layout(module_info::in, type_ctor::in,
+    hlds_type_defn::in, type_table::in, type_table::out) is det.
 
-pack_du_type_args(ModuleInfo, TypeCtor, TypeDefn, !TypeTable) :-
+decide_du_type_layout(ModuleInfo, TypeCtor, TypeDefn, !TypeTable) :-
     get_type_defn_body(TypeDefn, Body0),
     (
         Body0 = hlds_du_type(Ctors0, ConsTagValues, MaybeCheaperTagTest,
             DuKind, MaybeUserEqComp, DirectArgFunctors, ReservedTag,
             ReservedAddr, MaybeForeign),
-        list.map(decide_du_ctor_packing(ModuleInfo), Ctors0, Ctors),
+        list.map(layout_du_ctor_args(ModuleInfo), Ctors0, Ctors),
         Body = hlds_du_type(Ctors, ConsTagValues, MaybeCheaperTagTest,
             DuKind, MaybeUserEqComp, DirectArgFunctors, ReservedTag,
             ReservedAddr, MaybeForeign),
@@ -257,36 +250,82 @@ pack_du_type_args(ModuleInfo, TypeCtor, TypeDefn, !TypeTable) :-
         % Leave these types alone.
     ).
     
-:- pred decide_du_ctor_packing(module_info::in,
+:- pred layout_du_ctor_args(module_info::in,
     constructor::in, constructor::out) is det.
 
-decide_du_ctor_packing(ModuleInfo, Ctor0, Ctor) :-
+layout_du_ctor_args(ModuleInfo, Ctor0, Ctor) :-
     Ctor0 = ctor(ExistTVars, Constraints, Name, Args0, Context),
     module_info_get_globals(ModuleInfo, Globals),
-    globals.lookup_int_option(Globals, bits_per_word, TargetWordBits),
-    pack_du_ctor_args(ModuleInfo, TargetWordBits, 0, Args0, Args, _),
-    list.length(Args0, UnpackedLength),
-    count_words(Args, 0, PackedLength),
+    use_double_word_floats(Globals, DoubleWordFloats),
     (
+        DoubleWordFloats = yes,
+        set_double_word_floats(ModuleInfo, Args0, Args1)
+    ;
+        DoubleWordFloats = no,
+        Args1 = Args0
+    ),
+    globals.lookup_bool_option(Globals, allow_argument_packing, ArgPacking),
+    (
+        ArgPacking = yes,
+        globals.lookup_int_option(Globals, bits_per_word, TargetWordBits),
+        pack_du_ctor_args(ModuleInfo, TargetWordBits, 0, Args1, Args2, _),
+        WorthPacking = worth_arg_packing(Args1, Args2),
         (
-            PackedLength = UnpackedLength
+            WorthPacking = yes,
+            Args = Args2
         ;
-            % Boehm GC will round up allocations (at least) to the next even
-            % number of words.  There is no point saving a single word if that
-            % word will be allocated anyway.
-            int.even(UnpackedLength),
-            PackedLength = UnpackedLength - 1
+            WorthPacking = no,
+            Args = Args1
         )
-    ->
-        Ctor = Ctor0
     ;
-        PackedLength < UnpackedLength
-    ->
-        Ctor = ctor(ExistTVars, Constraints, Name, Args, Context)
+        ArgPacking = no,
+        Args = Args1
+    ),
+    Ctor = ctor(ExistTVars, Constraints, Name, Args, Context).
+
+:- pred use_double_word_floats(globals::in, bool::out) is det.
+
+use_double_word_floats(Globals, DoubleWordFloats) :-
+    globals.get_target(Globals, Target),
+    globals.lookup_int_option(Globals, bits_per_word, TargetWordBits),
+    globals.lookup_bool_option(Globals, single_prec_float, SinglePrecFloat),
+    (
+        Target = target_c,
+        (
+            TargetWordBits = 32,
+            SinglePrecFloat = no
+        ->
+            % Double-word floats are not yet supported in low-level C grades.
+            globals.lookup_bool_option(Globals, highlevel_code, HighLevelCode),
+            DoubleWordFloats = HighLevelCode
+        ;
+            DoubleWordFloats = no
+        )
     ;
-        unexpected($module, $pred, "packed length exceeds unpacked length")
+        ( Target = target_il
+        ; Target = target_csharp
+        ; Target = target_java
+        ; Target = target_asm
+        ; Target = target_x86_64
+        ; Target = target_erlang
+        ),
+        DoubleWordFloats = no
     ).
 
+:- pred set_double_word_floats(module_info::in,
+    list(constructor_arg)::in, list(constructor_arg)::out) is det.
+
+set_double_word_floats(_ModuleInfo, [], []).
+set_double_word_floats(ModuleInfo, [Arg0 | Args0], [Arg | Args]) :-
+    Arg0 = ctor_arg(Name, Type, _, Context),
+    ( type_is_float_eqv(ModuleInfo, Type) ->
+        ArgWidth = double_word,
+        Arg = ctor_arg(Name, Type, ArgWidth, Context)
+    ;
+        Arg = Arg0
+    ),
+    set_double_word_floats(ModuleInfo, Args0, Args).
+
 :- pred pack_du_ctor_args(module_info::in, int::in, int::in,
     list(constructor_arg)::in, list(constructor_arg)::out,
     arg_width::out) is det.
@@ -295,19 +334,19 @@ pack_du_ctor_args(_ModuleInfo, _TargetWordBits, _Shift, [], [],
         full_word).
 pack_du_ctor_args(ModuleInfo, TargetWordBits, Shift,
         [Arg0 | Args0], [Arg | Args], ArgWidth) :-
-    Arg0 = ctor_arg(Name, Type, _, Context),
+    Arg0 = ctor_arg(Name, Type, ArgWidth0, Context),
     ( type_is_enum_bits(ModuleInfo, Type, NumBits) ->
         Mask = int.pow(2, NumBits) - 1,
         % Try to place the argument in the current word, otherwise move on to
         % the next word.
         ( Shift + NumBits > TargetWordBits ->
-            ArgWidth0 = partial_word_first(Mask),
+            ArgWidth1 = partial_word_first(Mask),
             NextShift = NumBits
         ; Shift = 0 ->
-            ArgWidth0 = partial_word_first(Mask),
+            ArgWidth1 = partial_word_first(Mask),
             NextShift = NumBits
         ;
-            ArgWidth0 = partial_word_shifted(Shift, Mask),
+            ArgWidth1 = partial_word_shifted(Shift, Mask),
             NextShift = Shift + NumBits
         ),
         pack_du_ctor_args(ModuleInfo, TargetWordBits, NextShift, Args0, Args,
@@ -315,20 +354,17 @@ pack_du_ctor_args(ModuleInfo, TargetWordBits, Shift,
         % If this argument starts a word but the next argument is not packed
         % with it, then this argument is not packed.
         (
-            ArgWidth0 = partial_word_first(_),
-            ( NextArgWidth = full_word
-            ; NextArgWidth = partial_word_first(_)
-            )
+            ArgWidth1 = partial_word_first(_),
+            NextArgWidth \= partial_word_shifted(_, _)
         ->
             ArgWidth = full_word
         ;
-            ArgWidth = ArgWidth0
+            ArgWidth = ArgWidth1
         ),
         Arg = ctor_arg(Name, Type, ArgWidth, Context)
     ;
-        % This argument occupies a full word.
         Arg = Arg0,
-        ArgWidth = full_word,
+        ArgWidth = ArgWidth0,
         NextShift = 0,
         pack_du_ctor_args(ModuleInfo, TargetWordBits, NextShift, Args0, Args,
             _)
@@ -362,6 +398,22 @@ max_int_tag(ConsTag, !Max) :-
         unexpected($module, $pred, "non-integer value for enumeration")
     ).
 
+:- func worth_arg_packing(list(constructor_arg), list(constructor_arg)) = bool.
+
+worth_arg_packing(UnpackedArgs, PackedArgs) = Worthwhile :-
+    count_words(UnpackedArgs, 0, UnpackedLength),
+    count_words(PackedArgs, 0, PackedLength),
+    expect(PackedLength =< UnpackedLength, $module, $pred,
+        "packed length exceeds unpacked length"),
+    % Boehm GC will round up allocations (at least) to the next even
+    % number of words.  There is no point saving a single word if that
+    % word will be allocated anyway.
+    ( round_to_even(PackedLength) < round_to_even(UnpackedLength) ->
+        Worthwhile = yes
+    ;
+        Worthwhile = no
+    ).
+
 :- pred count_words(list(constructor_arg)::in, int::in, int::out) is det.
 
 count_words([], !Count).
@@ -371,6 +423,9 @@ count_words([Arg | Args], !Count) :-
         ArgWidth = full_word,
         !:Count = !.Count + 1
     ;
+        ArgWidth = double_word,
+        !:Count = !.Count + 2
+    ;
         ArgWidth = partial_word_first(_),
         !:Count = !.Count + 1
     ;
@@ -378,6 +433,10 @@ count_words([Arg | Args], !Count) :-
     ),
     count_words(Args, !Count).
 
+:- func round_to_even(int) = int.
+
+round_to_even(I) = (int.even(I) -> I ; I + 1).
+
 %-----------------------------------------------------------------------------%
 
 :- pred add_builtin_type_ctor_special_preds(type_ctor::in,
diff --git a/compiler/ml_code_util.m b/compiler/ml_code_util.m
index b8c159d..e0ffb61 100644
--- a/compiler/ml_code_util.m
+++ b/compiler/ml_code_util.m
@@ -321,10 +321,11 @@
 
     % Succeeds iff the specified type must be boxed when used as a field.
     %
-:- pred ml_must_box_field_type(module_info::in, mer_type::in) is semidet.
+:- pred ml_must_box_field_type(module_info::in, mer_type::in, arg_width::in)
+    is semidet.
 
 :- pred ml_gen_box_const_rval(module_info::in, prog_context::in,
-    mlds_type::in, mlds_rval::in, mlds_rval::out,
+    mlds_type::in, bool::in, mlds_rval::in, mlds_rval::out,
     ml_global_data::in, ml_global_data::out) is det.
 
     % Given a source type and a destination type, and given an source rval
@@ -1371,7 +1372,7 @@ ml_gen_field_name(MaybeFieldName, ArgNum) = FieldName :-
     % XXX Currently we box such types even for the other MLDS based back-ends
     % that don't need it, e.g. the .NET back-end.
     %
-ml_must_box_field_type(ModuleInfo, Type) :-
+ml_must_box_field_type(ModuleInfo, Type, Width) :-
     module_info_get_globals(ModuleInfo, Globals),
     globals.get_target(Globals, Target),
     globals.lookup_bool_option(Globals, unboxed_float, UnboxedFloat),
@@ -1384,16 +1385,18 @@ ml_must_box_field_type(ModuleInfo, Type) :-
         ; Target = target_erlang
         ),
         classify_type(ModuleInfo, Type) = Category,
-        MustBox = ml_must_box_field_type_category(Category, UnboxedFloat)
+        MustBox = ml_must_box_field_type_category(Category, UnboxedFloat,
+            Width)
     ;
         Target = target_java,
         MustBox = no
     ),
     MustBox = yes.
 
-:- func ml_must_box_field_type_category(type_ctor_category, bool) = bool.
+:- func ml_must_box_field_type_category(type_ctor_category, bool, arg_width)
+    = bool.
 
-ml_must_box_field_type_category(CtorCat, UnboxedFloat) = MustBox :-
+ml_must_box_field_type_category(CtorCat, UnboxedFloat, Width) = MustBox :-
     (
         ( CtorCat = ctor_cat_builtin(cat_builtin_int)
         ; CtorCat = ctor_cat_builtin(cat_builtin_string)
@@ -1412,10 +1415,27 @@ ml_must_box_field_type_category(CtorCat, UnboxedFloat) = MustBox :-
         MustBox = yes
     ;
         CtorCat = ctor_cat_builtin(cat_builtin_float),
-        MustBox = bool.not(UnboxedFloat)
+        (
+            UnboxedFloat = yes,
+            MustBox = no
+        ;
+            UnboxedFloat = no,
+            (
+                Width = full_word,
+                MustBox = yes
+            ;
+                Width = double_word,
+                MustBox = no
+            ;
+                ( Width = partial_word_first(_)
+                ; Width = partial_word_shifted(_, _)
+                ),
+                unexpected($module, $pred, "partial word for float")
+            )
+        )
     ).
 
-ml_gen_box_const_rval(ModuleInfo, Context, Type, Rval, BoxedRval,
+ml_gen_box_const_rval(ModuleInfo, Context, Type, DoubleWidth, Rval, BoxedRval,
         !GlobalData) :-
     (
         ( Type = mercury_type(type_variable(_, _), _, _)
@@ -1441,6 +1461,8 @@ ml_gen_box_const_rval(ModuleInfo, Context, Type, Rval, BoxedRval,
         HaveUnboxedFloats = ml_global_data_have_unboxed_floats(!.GlobalData),
         (
             HaveUnboxedFloats = do_not_have_unboxed_floats,
+            DoubleWidth = no
+        ->
             % Generate a local static constant for this float.
             module_info_get_name(ModuleInfo, ModuleName),
             MLDS_ModuleName = mercury_module_name_to_mlds(ModuleName),
@@ -1452,7 +1474,6 @@ ml_gen_box_const_rval(ModuleInfo, Context, Type, Rval, BoxedRval,
             % cast to mlds_generic_type.
             BoxedRval = ml_unop(cast(mlds_generic_type), ConstAddrRval)
         ;
-            HaveUnboxedFloats = have_unboxed_floats,
             % This is not a real box, but a cast. The "box" is required as it
             % may be further cast to pointer types.
             BoxedRval = ml_unop(box(Type), Rval)
diff --git a/compiler/ml_global_data.m b/compiler/ml_global_data.m
index db71071..0766a1e 100644
--- a/compiler/ml_global_data.m
+++ b/compiler/ml_global_data.m
@@ -371,9 +371,8 @@ ml_global_data_add_maybe_nonflat_defns(Defns, !GlobalData) :-
 
 ml_gen_static_scalar_const_value(MLDS_ModuleName, ConstBaseName, ConstType0,
         Initializer0, Context, DataRval, !GlobalData) :-
-    HaveUnboxedFloats = !.GlobalData ^ mgd_have_unboxed_floats,
-    ml_maybe_specialize_generic_array_type(HaveUnboxedFloats,
-        ConstType0, ConstType, Initializer0, Initializer),
+    ml_maybe_specialize_generic_array_type(ConstType0, ConstType,
+        Initializer0, Initializer),
     UseCommonCells = !.GlobalData ^ mgd_use_common_cells,
     (
         UseCommonCells = use_common_cells,
@@ -391,9 +390,8 @@ ml_gen_static_scalar_const_value(MLDS_ModuleName, ConstBaseName, ConstType0,
 
 ml_gen_static_scalar_const_addr(MLDS_ModuleName, ConstBaseName, ConstType0,
         Initializer0, Context, DataAddrRval, !GlobalData) :-
-    HaveUnboxedFloats = !.GlobalData ^ mgd_have_unboxed_floats,
-    ml_maybe_specialize_generic_array_type(HaveUnboxedFloats,
-        ConstType0, ConstType, Initializer0, Initializer),
+    ml_maybe_specialize_generic_array_type(ConstType0, ConstType,
+        Initializer0, Initializer),
     UseCommonCells = !.GlobalData ^ mgd_use_common_cells,
     (
         UseCommonCells = use_common_cells,
@@ -489,14 +487,12 @@ ml_gen_plain_static_defn(ConstBaseName, ConstType,
     RevDefns = [Defn | RevDefns0],
     ml_global_data_set_rev_flat_cell_defns(RevDefns, !GlobalData).
 
-:- pred ml_maybe_specialize_generic_array_type(have_unboxed_floats::in,
-    mlds_type::in, mlds_type::out, mlds_initializer::in, mlds_initializer::out)
-    is det.
+:- pred ml_maybe_specialize_generic_array_type(mlds_type::in, mlds_type::out,
+    mlds_initializer::in, mlds_initializer::out) is det.
 
-ml_maybe_specialize_generic_array_type(HaveUnboxedFloats,
-        ConstType0, ConstType, Initializer0, Initializer) :-
+ml_maybe_specialize_generic_array_type(ConstType0, ConstType,
+        Initializer0, Initializer) :-
     (
-        HaveUnboxedFloats = have_unboxed_floats,
         ConstType0 = mlds_array_type(mlds_generic_type),
         Initializer0 = init_array(Inits0),
         list.map2(ml_specialize_generic_array_init, Inits0, Inits, Types),
@@ -593,6 +589,7 @@ ml_specialize_generic_array_binop(Op, IsFloat) :-
         ; Op = float_gt
         ; Op = float_le
         ; Op = float_ge
+        ; Op = float_word_bits
         ; Op = body
         ; Op = array_index(_)   % should not be an initializer anyway
         ; Op = compound_eq
@@ -604,6 +601,7 @@ ml_specialize_generic_array_binop(Op, IsFloat) :-
         ; Op = float_minus
         ; Op = float_times
         ; Op = float_divide
+        ; Op = float_from_dword
         ),
         IsFloat = yes
     ).
diff --git a/compiler/ml_type_gen.m b/compiler/ml_type_gen.m
index f4510fc..7d2f273 100644
--- a/compiler/ml_type_gen.m
+++ b/compiler/ml_type_gen.m
@@ -1038,7 +1038,7 @@ gen_init_tag(Target, ClassType, SecondaryTagClassId, TagVal, Context)
 
 ml_gen_typeclass_info_member(ModuleInfo, Context, Constraint, Defn, !ArgNum) :-
     polymorphism.build_typeclass_info_type(Constraint, Type),
-    ml_gen_field(ModuleInfo, Context, no, Type, Defn, !ArgNum).
+    ml_gen_field(ModuleInfo, Context, no, Type, full_word, Defn, !ArgNum).
 
 :- pred ml_gen_type_info_member(module_info::in, prog_context::in, tvar::in,
     mlds_defn::out, int::in, int::out) is det.
@@ -1049,21 +1049,23 @@ ml_gen_type_info_member(ModuleInfo, Context, TypeVar, Defn, !ArgNum) :-
     % and won't be used in any other way.
     Kind = kind_star,
     polymorphism.build_type_info_type(type_variable(TypeVar, Kind), Type),
-    ml_gen_field(ModuleInfo, Context, no, Type, Defn, !ArgNum).
+    ml_gen_field(ModuleInfo, Context, no, Type, full_word, Defn, !ArgNum).
 
 :- pred ml_gen_du_ctor_field(module_info::in, prog_context::in,
     constructor_arg::in, mlds_defn::out, int::in, int::out) is det.
 
 ml_gen_du_ctor_field(ModuleInfo, Context, Arg, Defn, !ArgNum) :-
-    ml_gen_field(ModuleInfo, Context, Arg ^ arg_field_name, Arg ^ arg_type,
-        Defn, !ArgNum).
+    Arg = ctor_arg(MaybeFieldName, Type, Width, _Context),
+    ml_gen_field(ModuleInfo, Context, MaybeFieldName, Type, Width, Defn,
+        !ArgNum).
 
 :- pred ml_gen_field(module_info::in, prog_context::in,
-    maybe(ctor_field_name)::in, mer_type::in, mlds_defn::out,
+    maybe(ctor_field_name)::in, mer_type::in, arg_width::in, mlds_defn::out,
     int::in, int::out) is det.
 
-ml_gen_field(ModuleInfo, Context, MaybeFieldName, Type, Defn, !ArgNum) :-
-    ( ml_must_box_field_type(ModuleInfo, Type) ->
+ml_gen_field(ModuleInfo, Context, MaybeFieldName, Type, Width, Defn,
+        !ArgNum) :-
+    ( ml_must_box_field_type(ModuleInfo, Type, Width) ->
         MLDS_Type = mlds_generic_type
     ;
         MLDS_Type = mercury_type_to_mlds_type(ModuleInfo, Type)
diff --git a/compiler/ml_unify_gen.m b/compiler/ml_unify_gen.m
index 8d5bf8f..926f4e8 100644
--- a/compiler/ml_unify_gen.m
+++ b/compiler/ml_unify_gen.m
@@ -319,8 +319,9 @@ ml_gen_construct_tag(Tag, Type, Var, ConsId, Args, ArgModes, TakeAddr,
                 ArgGroundTerm = ml_ground_term(ArgRval, _ArgType,
                     MLDS_ArgType),
                 ml_gen_info_get_global_data(!.Info, GlobalData0),
+                DoubleWidth = no,
                 ml_gen_box_const_rval(ModuleInfo, Context, MLDS_ArgType,
-                    ArgRval, Rval0, GlobalData0, GlobalData),
+                    DoubleWidth, ArgRval, Rval0, GlobalData0, GlobalData),
                 ml_gen_info_set_global_data(GlobalData, !Info),
                 Rval = ml_cast_cons_tag(MLDS_Type, Tag, Rval0),
                 GroundTerm = ml_ground_term(Rval, Type, MLDS_Type),
@@ -681,19 +682,21 @@ ml_gen_new_object_dynamically(MaybeConsId, MaybeCtorName, MaybeTag, ExplicitSecT
         MayUseAtomic0 = may_not_use_atomic_alloc
     ),
     ml_gen_info_get_high_level_data(!.Info, HighLevelData),
-    ml_gen_cons_args(ArgVars, ArgLvals, ArgTypes, ConsArgTypes, ArgModes,
-        NumExtraRvals, ConsArgWidths, TakeAddr, ModuleInfo, HighLevelData,
+    ml_gen_cons_args(ArgVars, ArgLvals, ArgTypes, ConsArgTypes, ConsArgWidths,
+        ArgModes, NumExtraRvals, TakeAddr, ModuleInfo, HighLevelData,
         ArgRvals0, MLDS_ArgTypes0, TakeAddrInfos, MayUseAtomic0, MayUseAtomic),
 
-    % Pack arguments into fields.
+    % Replace double-word and packed arguments by uniform single word rvals.
     assoc_list.from_corresponding_lists(ArgRvals0, MLDS_ArgTypes0, ArgsTypes0),
-    pack_args(ml_shift_combine_rval_type, ConsArgWidths,
-        ArgsTypes0, ArgsTypes1, unit, _, unit, _),
-    assoc_list.keys_and_values(ArgsTypes1, ArgRvals1, MLDS_ArgTypes1),
+    ml_expand_double_word_rvals(ConsArgWidths, ArgWidths1,
+        ArgsTypes0, ArgsTypes1),
+    pack_args(ml_shift_combine_rval_type, ArgWidths1, ArgsTypes1, ArgsTypes2,
+        unit, _, unit, _),
+    assoc_list.keys_and_values(ArgsTypes2, ArgRvals2, MLDS_ArgTypes2),
 
     % Add the extra rvals to the start.
-    ArgRvals = ExtraRvals ++ ArgRvals1,
-    MLDS_ArgTypes = ExtraTypes ++ MLDS_ArgTypes1,
+    ArgRvals = ExtraRvals ++ ArgRvals2,
+    MLDS_ArgTypes = ExtraTypes ++ MLDS_ArgTypes2,
 
     % Compute the number of words to allocate.
     list.length(ArgRvals, Size),
@@ -761,14 +764,16 @@ ml_gen_new_object_statically(MaybeConsId, MaybeCtorName, MaybeTag,
                 ArgGroundTerms),
             ml_gen_box_extra_const_rval_list_lld(ModuleInfo, Context,
                 ExtraTypes, ExtraRvals, ExtraArgRvals, !GlobalData),
-            ml_gen_box_const_rval_list(ModuleInfo, Context, ArgGroundTerms,
-                ArgRvals1, !GlobalData)
+            assoc_list.from_corresponding_lists(ArgGroundTerms, ConsArgWidths,
+                ArgGroundTermsWidths),
+            ml_gen_box_const_rval_list_lld(ModuleInfo, Context,
+                ArgGroundTermsWidths, ArgRvals1, !GlobalData)
         ;
             HighLevelData = yes,
             list.map(ml_gen_info_lookup_const_var_rval(!.Info), ArgVars,
                 ArgRvals0),
-            list.map(ml_type_as_field(ModuleInfo, HighLevelData),
-                ConsArgTypes, FieldTypes),
+            list.map_corresponding(ml_type_as_field(ModuleInfo, HighLevelData),
+                ConsArgTypes, ConsArgWidths, FieldTypes),
             ml_gen_box_or_unbox_const_rval_list_hld(ModuleInfo, ArgTypes,
                 FieldTypes, ArgRvals0, Context, ArgRvals1, !GlobalData),
             % For --high-level-data, the ExtraRvals should already have
@@ -1047,27 +1052,28 @@ get_const_type_for_cons_id(Target, HighLevelData, MLDS_Type, UsesBaseClass,
         )
     ).
 
-:- pred ml_type_as_field(module_info::in, bool::in,
-    mer_type::in, mer_type::out) is det.
+:- pred ml_type_as_field(module_info::in, bool::in, mer_type::in,
+    arg_width::in, mer_type::out) is det.
 
-ml_type_as_field(ModuleInfo, HighLevelData, FieldType, BoxedFieldType) :-
+ml_type_as_field(ModuleInfo, HighLevelData, FieldType, FieldWidth,
+        BoxedFieldType) :-
     (
-        % With the low-level data representation, we store all fields as
-        % "boxed" so we ignore the original field type and instead generate a
-        % polymorphic type BoxedFieldType which we use for the type of the
-        % field. This type is used in the calls to ml_gen_box_or_unbox_rval
-        % to ensure that we box values when storing them into fields and
-        % unbox them when extracting them from fields.
-        %
-        % With the high-level data representation, we don't box everything,
-        % but for the MLDS->C and MLDS->asm back-ends we still need to box
-        % floating point fields if they are wider than a word.
-
         (
-            HighLevelData = no
+            HighLevelData = no,
+            % With the low-level data representation, we store all fields
+            % except for double-width floats as "boxed" so we ignore the
+            % original field type and instead generate a polymorphic type
+            % BoxedFieldType which we use for the type of the field. This type
+            % is used in the calls to ml_gen_box_or_unbox_rval to ensure that
+            % we box values when storing them into fields and unbox them when
+            % extracting them from fields.
+            FieldWidth \= double_word
         ;
             HighLevelData = yes,
-            ml_must_box_field_type(ModuleInfo, FieldType)
+            % With the high-level data representation, we don't box everything,
+            % but for the MLDS->C and MLDS->asm back-ends we still need to box
+            % floating point fields if they are wider than a word.
+            ml_must_box_field_type(ModuleInfo, FieldType, FieldWidth)
         )
     ->
         % XXX zs: I do not see any reason why TypeVar cannot be confused with
@@ -1212,8 +1218,9 @@ ml_gen_box_or_unbox_const_rval_hld(ModuleInfo, ArgType, FieldType, ArgRval,
         % -- in that case, we can just box the argument type.
         FieldType = type_variable(_, _),
         MLDS_ArgType = mercury_type_to_mlds_type(ModuleInfo, ArgType),
-        ml_gen_box_const_rval(ModuleInfo, Context, MLDS_ArgType, ArgRval,
-            FieldRval, !GlobalData)
+        DoubleWidth = no,
+        ml_gen_box_const_rval(ModuleInfo, Context, MLDS_ArgType, DoubleWidth,
+            ArgRval, FieldRval, !GlobalData)
     ;
         ( FieldType = defined_type(_, _, _)
         ; FieldType = builtin_type(_)
@@ -1229,17 +1236,27 @@ ml_gen_box_or_unbox_const_rval_hld(ModuleInfo, ArgType, FieldType, ArgRval,
             native_if_possible, ArgRval, FieldRval)
     ).
 
-:- pred ml_gen_box_const_rval_list(module_info::in, prog_context::in,
-    list(ml_ground_term)::in, list(mlds_rval)::out,
+:- pred ml_gen_box_const_rval_list_lld(module_info::in, prog_context::in,
+    assoc_list(ml_ground_term, arg_width)::in, list(mlds_rval)::out,
     ml_global_data::in, ml_global_data::out) is det.
 
-ml_gen_box_const_rval_list(_, _, [], [], !GlobalData).
-ml_gen_box_const_rval_list(ModuleInfo, Context, [GroundTerm | GroundTerms],
+ml_gen_box_const_rval_list_lld(_, _, [], [], !GlobalData).
+ml_gen_box_const_rval_list_lld(ModuleInfo, Context, [GroundTerm - ArgWidth | GroundTerms],
         [BoxedRval | BoxedRvals], !GlobalData) :-
     GroundTerm = ml_ground_term(Rval, _MercuryType, Type),
-    ml_gen_box_const_rval(ModuleInfo, Context, Type, Rval,
+    (
+        ArgWidth = double_word,
+        DoubleWidth = yes
+    ;
+        ( ArgWidth = full_word
+        ; ArgWidth = partial_word_first(_)
+        ; ArgWidth = partial_word_shifted(_, _)
+        ),
+        DoubleWidth = no
+    ),
+    ml_gen_box_const_rval(ModuleInfo, Context, Type, DoubleWidth, Rval,
         BoxedRval, !GlobalData),
-    ml_gen_box_const_rval_list(ModuleInfo, Context, GroundTerms,
+    ml_gen_box_const_rval_list_lld(ModuleInfo, Context, GroundTerms,
         BoxedRvals, !GlobalData).
 
 :- pred ml_gen_box_extra_const_rval_list_lld(module_info::in, prog_context::in,
@@ -1249,7 +1266,9 @@ ml_gen_box_const_rval_list(ModuleInfo, Context, [GroundTerm | GroundTerms],
 ml_gen_box_extra_const_rval_list_lld(_, _, [], [], [], !GlobalData).
 ml_gen_box_extra_const_rval_list_lld(ModuleInfo, Context, [Type | Types],
         [Rval | Rvals], [BoxedRval | BoxedRvals], !GlobalData) :-
-    ml_gen_box_const_rval(ModuleInfo, Context, Type, Rval,
+    % Extras are always a single word.
+    DoubleWidth = no,
+    ml_gen_box_const_rval(ModuleInfo, Context, Type, DoubleWidth, Rval,
         BoxedRval, !GlobalData),
     ml_gen_box_extra_const_rval_list_lld(ModuleInfo, Context, Types, Rvals,
         BoxedRvals, !GlobalData).
@@ -1297,17 +1316,17 @@ ml_cons_name(CompilationTarget, HLDS_ConsId, QualifiedConsId) :-
     % but if the argument is free, we produce a null value.
     %
 :- pred ml_gen_cons_args(list(prog_var)::in, list(mlds_lval)::in,
-    list(mer_type)::in, list(mer_type)::in, list(uni_mode)::in,
-    int::in, list(arg_width)::in, list(int)::in, module_info::in, bool::in,
+    list(mer_type)::in, list(mer_type)::in, list(arg_width)::in,
+    list(uni_mode)::in, int::in, list(int)::in, module_info::in, bool::in,
     list(mlds_rval)::out, list(mlds_type)::out, list(take_addr_info)::out,
     may_use_atomic_alloc::in, may_use_atomic_alloc::out) is det.
 
-ml_gen_cons_args(Vars, Lvals, ArgTypes, ConsArgTypes, UniModes,
-        NumExtraArgs, ConsArgWidths, TakeAddr, ModuleInfo, HighLevelData,
+ml_gen_cons_args(Vars, Lvals, ArgTypes, ConsArgTypes, ConsArgWidths,
+        UniModes, NumExtraArgs, TakeAddr, ModuleInfo, HighLevelData,
         !:Rvals, !:MLDS_Types, !:TakeAddrInfos, !MayUseAtomic) :-
     (
-        ml_gen_cons_args_2(Vars, Lvals, ArgTypes, ConsArgTypes, UniModes,
-            NumExtraArgs, ConsArgWidths, 1, TakeAddr, ModuleInfo, HighLevelData,
+        ml_gen_cons_args_2(Vars, Lvals, ArgTypes, ConsArgTypes, ConsArgWidths,
+            UniModes, NumExtraArgs, 1, TakeAddr, ModuleInfo, HighLevelData,
             !:Rvals, !:MLDS_Types, !:TakeAddrInfos, !MayUseAtomic)
     ->
         true
@@ -1316,18 +1335,18 @@ ml_gen_cons_args(Vars, Lvals, ArgTypes, ConsArgTypes, UniModes,
     ).
 
 :- pred ml_gen_cons_args_2(list(prog_var)::in, list(mlds_lval)::in,
-    list(mer_type)::in, list(mer_type)::in, list(uni_mode)::in,
-    int::in, list(arg_width)::in, int::in, list(int)::in,
-    module_info::in, bool::in,
-    list(mlds_rval)::out, list(mlds_type)::out, list(take_addr_info)::out,
+    list(mer_type)::in, list(mer_type)::in, list(arg_width)::in,
+    list(uni_mode)::in, int::in, int::in, list(int)::in,
+    module_info::in, bool::in, list(mlds_rval)::out, list(mlds_type)::out,
+    list(take_addr_info)::out,
     may_use_atomic_alloc::in, may_use_atomic_alloc::out) is semidet.
 
-ml_gen_cons_args_2([], [], [], [], [],
-        _NumExtraArgs, _ArgWidths, _CurArgNum, _TakeAddr,
+ml_gen_cons_args_2([], [], [], [], _, [],
+        _NumExtraArgs, _CurArgNum, _TakeAddr,
         _ModuleInfo, _HighLevelData, [], [], [], !MayUseAtomic).
 ml_gen_cons_args_2([Var | Vars], [Lval | Lvals], [ArgType | ArgTypes],
-        [ConsArgType | ConsArgTypes], [UniMode | UniModes],
-        NumExtraArgs, ArgWidths, CurArgNum, !.TakeAddr,
+        [ConsArgType | ConsArgTypes], ConsArgWidths,
+        [UniMode | UniModes], NumExtraArgs, CurArgNum, !.TakeAddr,
         ModuleInfo, HighLevelData, [Rval | Rvals],
         [MLDS_Type | MLDS_Types], TakeAddrInfos, !MayUseAtomic) :-
     % It is important to use ArgType instead of ConsArgType here. ConsArgType
@@ -1341,18 +1360,22 @@ ml_gen_cons_args_2([Var | Vars], [Lval | Lvals], [ArgType | ArgTypes],
     update_type_may_use_atomic_alloc(ModuleInfo, ArgType, !MayUseAtomic),
 
     % Figure out the type of the field.
-    ml_type_as_field(ModuleInfo, HighLevelData, ConsArgType, BoxedArgType),
+    list.det_index1(ConsArgWidths, CurArgNum, ConsArgWidth),
+    ml_type_as_field(ModuleInfo, HighLevelData, ConsArgType, ConsArgWidth,
+        BoxedArgType),
     MLDS_Type = mercury_type_to_mlds_type(ModuleInfo, BoxedArgType),
 
     % Compute the value of the field.
     UniMode = ((_LI - RI) -> (_LF - RF)),
     ( !.TakeAddr = [CurArgNum | !:TakeAddr] ->
+        expect(unify(ConsArgWidth, full_word), $module, $pred,
+            "taking address of non word-sized argument"),
         Rval = ml_const(mlconst_null(MLDS_Type)),
-        ml_gen_cons_args_2(Vars, Lvals, ArgTypes, ConsArgTypes, UniModes,
-            NumExtraArgs, ArgWidths, CurArgNum + 1, !.TakeAddr,
+        ml_gen_cons_args_2(Vars, Lvals, ArgTypes, ConsArgTypes, ConsArgWidths,
+            UniModes, NumExtraArgs, CurArgNum + 1, !.TakeAddr,
             ModuleInfo, HighLevelData, Rvals, MLDS_Types, TakeAddrInfosTail,
             !MayUseAtomic),
-        Offset = ml_calc_field_offset(NumExtraArgs, ArgWidths, CurArgNum),
+        Offset = ml_calc_field_offset(NumExtraArgs, ConsArgWidths, CurArgNum),
         OrigMLDS_Type = mercury_type_to_mlds_type(ModuleInfo, ConsArgType),
         TakeAddrInfo = take_addr_info(Var, Offset, OrigMLDS_Type, MLDS_Type),
         TakeAddrInfos = [TakeAddrInfo | TakeAddrInfosTail]
@@ -1367,16 +1390,16 @@ ml_gen_cons_args_2([Var | Vars], [Lval | Lvals], [ArgType | ArgTypes],
         ;
             Rval = ml_const(mlconst_null(MLDS_Type))
         ),
-        ml_gen_cons_args_2(Vars, Lvals, ArgTypes, ConsArgTypes, UniModes,
-            NumExtraArgs, ArgWidths, CurArgNum + 1, !.TakeAddr,
+        ml_gen_cons_args_2(Vars, Lvals, ArgTypes, ConsArgTypes, ConsArgWidths,
+            UniModes, NumExtraArgs, CurArgNum + 1, !.TakeAddr,
             ModuleInfo, HighLevelData, Rvals, MLDS_Types, TakeAddrInfos,
             !MayUseAtomic)
     ).
 
 :- func ml_calc_field_offset(int, list(arg_width), int) = field_offset.
 
-ml_calc_field_offset(NumExtraArgs, ArgWidths, ArgNum) = Offset :-
-    ( list.take(ArgNum - 1, ArgWidths, WidthsBeforeArg) ->
+ml_calc_field_offset(NumExtraArgs, ConsArgWidths, ArgNum) = Offset :-
+    ( list.take(ArgNum - 1, ConsArgWidths, WidthsBeforeArg) ->
         WordsBeforeArg = count_distinct_words(WidthsBeforeArg),
         Offset = offset(NumExtraArgs + WordsBeforeArg)
     ;
@@ -1468,8 +1491,9 @@ ml_gen_det_deconstruct_tag(Tag, Type, Var, ConsId, Args, Modes, Context,
             ml_gen_var(!.Info, Arg, ArgLval),
             ml_gen_var(!.Info, Var, VarLval),
             ml_gen_info_get_module_info(!.Info, ModuleInfo),
-            ml_gen_sub_unify(ModuleInfo, Mode, ArgLval, ArgType, VarLval, Type,
-                full_word, Context, [], Statements)
+            ml_gen_info_get_high_level_data(!.Info, HighLevelData),
+            ml_gen_sub_unify(ModuleInfo, HighLevelData, Mode, ArgLval, ArgType,
+                VarLval, Type, full_word, Context, [], Statements)
         ;
             unexpected($module, $pred, "no_tag: arity != 1")
         )
@@ -1625,28 +1649,74 @@ ml_gen_unify_args_2(_, [], [], [], _, _, _, _, _, _, _, !Statements, !Info).
 ml_gen_unify_args_2(ConsId, [Arg | Args], [Mode | Modes], [ArgType | ArgTypes],
         [Field | Fields], VarType, VarLval, Offset, ArgNum, Tag,
         Context, !Statements, !Info) :-
-    ml_next_field_offset(Fields, Offset, Offset1),
+    ml_next_field_offset(Field, Fields, Offset, Offset1),
     ArgNum1 = ArgNum + 1,
     ml_gen_unify_args_2(ConsId, Args, Modes, ArgTypes, Fields, VarType,
         VarLval, Offset1, ArgNum1, Tag, Context, !Statements, !Info),
     ml_gen_unify_arg(ConsId, Arg, Mode, ArgType, Field, VarType, VarLval,
         Offset, ArgNum, Tag, Context, !Statements, !Info).
 
-:- pred ml_next_field_offset(list(constructor_arg)::in,
+:- pred ml_next_field_offset(constructor_arg::in, list(constructor_arg)::in,
     field_offset::in, field_offset::out) is det.
 
-ml_next_field_offset([], Offset, Offset).
-ml_next_field_offset([NextArg | _], PrevOffset, NextOffset) :-
+ml_next_field_offset(_, [], Offset, Offset).
+ml_next_field_offset(CurArg, [NextArg | _], PrevOffset, NextOffset) :-
+    CurArg = ctor_arg(_, _, CurWidth, _),
     NextArg = ctor_arg(_, _, NextWidth, _),
     (
-        ( NextWidth = full_word
-        ; NextWidth = partial_word_first(_)
-        ),
-        PrevOffset = offset(Int),
-        NextOffset = offset(Int + 1)
+        CurWidth = full_word,
+        (
+            ( NextWidth = full_word
+            ; NextWidth = double_word
+            ; NextWidth = partial_word_first(_)
+            ),
+            PrevOffset = offset(Int),
+            NextOffset = offset(Int + 1)
+        ;
+            NextWidth = partial_word_shifted(_, _),
+            unexpected($module, $pred,
+                "partial_word_shifted follows full_word")
+        )
+    ;
+        CurWidth = double_word,
+        (
+            ( NextWidth = full_word
+            ; NextWidth = double_word
+            ; NextWidth = partial_word_first(_)
+            ),
+            PrevOffset = offset(Int),
+            NextOffset = offset(Int + 2)
+        ;
+            NextWidth = partial_word_shifted(_, _),
+            unexpected($module, $pred,
+                "partial_word_shifted follows double_word")
+        )
     ;
-        NextWidth = partial_word_shifted(_, _),
-        NextOffset = PrevOffset
+        CurWidth = partial_word_first(_),
+        (
+            NextWidth = partial_word_shifted(_, _),
+            NextOffset = PrevOffset
+        ;
+            ( NextWidth = full_word
+            ; NextWidth = double_word
+            ; NextWidth = partial_word_first(_)
+            ),
+            unexpected($module, $pred,
+                "partial_word_first not followed by partial_word_shifted")
+        )
+    ;
+        CurWidth = partial_word_shifted(_, _),
+        (
+            ( NextWidth = full_word
+            ; NextWidth = double_word
+            ; NextWidth = partial_word_first(_)
+            ),
+            PrevOffset = offset(Int),
+            NextOffset = offset(Int + 1)
+        ;
+            NextWidth = partial_word_shifted(_, _),
+            NextOffset = PrevOffset
+        )
     ).
 
 :- pred ml_gen_unify_args_for_reuse(cons_id::in, list(prog_var)::in,
@@ -1673,17 +1743,18 @@ ml_gen_unify_args_for_reuse(ConsId, Args, Modes, ArgTypes, Fields, TakeAddr,
         ArgTypes = [ArgType | ArgTypes1],
         Fields = [Field | Fields1]
     ->
-        ml_next_field_offset(Fields1, Offset, Offset1),
+        ml_next_field_offset(Field, Fields1, Offset, Offset1),
         ArgNum1 = ArgNum + 1,
         ( TakeAddr = [ArgNum | TakeAddr1] ->
             ml_gen_unify_args_for_reuse(ConsId, Args1, Modes1, ArgTypes1,
                 Fields1, TakeAddr1, VarType, VarLval, Offset1, ArgNum1,
                 Tag, Context, Statements, TakeAddrInfos0, !Info),
 
-            FieldType = Field ^ arg_type,
             ml_gen_info_get_module_info(!.Info, ModuleInfo),
             ml_gen_info_get_high_level_data(!.Info, HighLevelData),
-            ml_type_as_field(ModuleInfo, HighLevelData, FieldType,
+            FieldType = Field ^ arg_type,
+            FieldWidth = Field ^ arg_width,
+            ml_type_as_field(ModuleInfo, HighLevelData, FieldType, FieldWidth,
                 BoxedFieldType),
             ml_gen_type(!.Info, FieldType, MLDS_FieldType),
             ml_gen_type(!.Info, BoxedFieldType, MLDS_BoxedFieldType),
@@ -1746,7 +1817,8 @@ ml_gen_unify_arg(ConsId, Arg, Mode, ArgType, Field, VarType, VarLval,
     ),
     % Box the field type, if needed.
     ml_gen_info_get_module_info(!.Info, ModuleInfo),
-    ml_type_as_field(ModuleInfo, HighLevelData, FieldType, BoxedFieldType),
+    ml_type_as_field(ModuleInfo, HighLevelData, FieldType, FieldWidth,
+        BoxedFieldType),
 
     % Generate lvals for the LHS and the RHS.
     ml_gen_type(!.Info, VarType, MLDS_VarType),
@@ -1757,15 +1829,15 @@ ml_gen_unify_arg(ConsId, Arg, Mode, ArgType, Field, VarType, VarLval,
     ml_gen_var(!.Info, Arg, ArgLval),
 
     % Now generate code to unify them.
-    ml_gen_sub_unify(ModuleInfo, Mode, ArgLval, ArgType, FieldLval,
-        BoxedFieldType, FieldWidth, Context, !Statements).
+    ml_gen_sub_unify(ModuleInfo, HighLevelData, Mode, ArgLval, ArgType,
+        FieldLval, BoxedFieldType, FieldWidth, Context, !Statements).
 
-:- pred ml_gen_sub_unify(module_info::in, uni_mode::in, mlds_lval::in,
-    mer_type::in, mlds_lval::in, mer_type::in, arg_width::in, prog_context::in,
-    list(statement)::in, list(statement)::out) is det.
+:- pred ml_gen_sub_unify(module_info::in, bool::in, uni_mode::in,
+    mlds_lval::in, mer_type::in, mlds_lval::in, mer_type::in, arg_width::in,
+    prog_context::in, list(statement)::in, list(statement)::out) is det.
 
-ml_gen_sub_unify(ModuleInfo, Mode, ArgLval, ArgType, FieldLval, FieldType,
-        FieldWidth, Context, !Statements) :-
+ml_gen_sub_unify(ModuleInfo, HighLevelData, Mode, ArgLval, ArgType, FieldLval,
+        FieldType, FieldWidth, Context, !Statements) :-
     % Figure out the direction of data-flow from the mode,
     % and generate code accordingly.
     %
@@ -1802,19 +1874,35 @@ ml_gen_sub_unify(ModuleInfo, Mode, ArgLval, ArgType, FieldLval, FieldType,
         LeftMode = top_in,
         RightMode = top_out
     ->
-        ml_gen_box_or_unbox_rval(ModuleInfo, FieldType, ArgType,
-            native_if_possible, ml_lval(FieldLval), FieldRval),
         (
-            FieldWidth = full_word,
-            Statement = ml_gen_assign(ArgLval, FieldRval, Context)
-        ;
-            FieldWidth = partial_word_first(Mask),
-            UnpackRval = ml_bitwise_and(FieldRval, Mask),
-            Statement = ml_gen_assign(ArgLval, UnpackRval, Context)
+            ( FieldWidth = full_word
+            ; FieldWidth = partial_word_first(_)
+            ; FieldWidth = partial_word_shifted(_, _)
+            ),
+            ml_gen_box_or_unbox_rval(ModuleInfo, FieldType, ArgType,
+                native_if_possible, ml_lval(FieldLval), FieldRval),
+            (
+                FieldWidth = full_word,
+                Statement = ml_gen_assign(ArgLval, FieldRval, Context)
+            ;
+                FieldWidth = partial_word_first(Mask),
+                UnpackRval = ml_bitwise_and(FieldRval, Mask),
+                Statement = ml_gen_assign(ArgLval, UnpackRval, Context)
+            ;
+                FieldWidth = partial_word_shifted(Shift, Mask),
+                UnpackRval = ml_bitwise_and(ml_rshift(FieldRval, Shift), Mask),
+                Statement = ml_gen_assign(ArgLval, UnpackRval, Context)
+            )
         ;
-            FieldWidth = partial_word_shifted(Shift, Mask),
-            UnpackRval = ml_bitwise_and(ml_rshift(FieldRval, Shift), Mask),
-            Statement = ml_gen_assign(ArgLval, UnpackRval, Context)
+            FieldWidth = double_word,
+            ( ml_field_offset_pair(FieldLval, FieldLvalA, FieldLvalB) ->
+                FieldRval = ml_binop(float_from_dword,
+                    ml_lval(FieldLvalA), ml_lval(FieldLvalB))
+            ;
+                ml_gen_box_or_unbox_rval(ModuleInfo, FieldType, ArgType,
+                    native_if_possible, ml_lval(FieldLval), FieldRval)
+            ),
+            Statement = ml_gen_assign(ArgLval, FieldRval, Context)
         ),
         !:Statements = [Statement | !.Statements]
     ;
@@ -1826,7 +1914,8 @@ ml_gen_sub_unify(ModuleInfo, Mode, ArgLval, ArgType, FieldLval, FieldType,
             native_if_possible, ml_lval(ArgLval), ArgRval),
         (
             FieldWidth = full_word,
-            Statement = ml_gen_assign(FieldLval, ArgRval, Context)
+            Statement = ml_gen_assign(FieldLval, ArgRval, Context),
+            !:Statements = [Statement | !.Statements]
         ;
             (
                 FieldWidth = partial_word_first(Mask),
@@ -1835,12 +1924,32 @@ ml_gen_sub_unify(ModuleInfo, Mode, ArgLval, ArgType, FieldLval, FieldType,
                 FieldWidth = partial_word_shifted(Shift, Mask)
             ),
             CastVal = ml_unop(unbox(mlds_native_int_type), ml_lval(FieldLval)),
-            MaskOld = ml_bitwise_and(CastVal, \(Mask << Shift)),
+            MaskOld = ml_bitwise_and(CastVal, \ (Mask << Shift)),
             ShiftNew = ml_lshift(ArgRval, Shift),
             Combined = ml_bitwise_or(MaskOld, ShiftNew),
-            Statement = ml_gen_assign(FieldLval, Combined, Context)
-        ),
-        !:Statements = [Statement | !.Statements]
+            Statement = ml_gen_assign(FieldLval, Combined, Context),
+            !:Statements = [Statement | !.Statements]
+        ;
+            FieldWidth = double_word,
+            ( ml_field_offset_pair(FieldLval, FieldLvalA, FieldLvalB) ->
+                FloatWordA = ml_binop(float_word_bits, ArgRval,
+                    ml_const(mlconst_int(0))),
+                FloatWordB = ml_binop(float_word_bits, ArgRval,
+                    ml_const(mlconst_int(1))),
+                ml_type_as_field(ModuleInfo, HighLevelData, int_type,
+                    full_word, IntFieldType),
+                ml_gen_box_or_unbox_rval(ModuleInfo, int_type, IntFieldType,
+                    native_if_possible, FloatWordA, ArgRvalA),
+                ml_gen_box_or_unbox_rval(ModuleInfo, int_type, IntFieldType,
+                    native_if_possible, FloatWordB, ArgRvalB),
+                StatementA = ml_gen_assign(FieldLvalA, ArgRvalA, Context),
+                StatementB = ml_gen_assign(FieldLvalB, ArgRvalB, Context),
+                !:Statements = [StatementA, StatementB | !.Statements]
+            ;
+                Statement = ml_gen_assign(FieldLval, ArgRval, Context),
+                !:Statements = [Statement | !.Statements]
+            )
+        )
     ;
         % Unused - unused: the unification has no effect.
         LeftMode = top_unused,
@@ -1851,6 +1960,21 @@ ml_gen_sub_unify(ModuleInfo, Mode, ArgLval, ArgType, FieldLval, FieldType,
         unexpected($module, $pred, "some strange unify")
     ).
 
+:- pred ml_field_offset_pair(mlds_lval::in, mlds_lval::out, mlds_lval::out)
+    is semidet.
+
+ml_field_offset_pair(FieldLval, FieldLvalA, FieldLvalB) :-
+    FieldLval = ml_field(Tag, Address, FieldIdA, _, PtrType),
+    FieldIdA = ml_field_offset(FieldOffsetA),
+    ( FieldOffsetA = ml_const(mlconst_int(Offset)) ->
+        FieldIdB = ml_field_offset(ml_const(mlconst_int(Offset + 1))),
+        SubstType = mlds_generic_type,
+        FieldLvalA = ml_field(Tag, Address, FieldIdA, SubstType, PtrType),
+        FieldLvalB = ml_field(Tag, Address, FieldIdB, SubstType, PtrType)
+    ;
+        sorry($module, $pred, "unexpected field offset")
+    ).
+
 :- pred ml_gen_direct_arg_construct(module_info::in, uni_mode::in, int::in,
     mlds_lval::in, mer_type::in, mlds_lval::in, mer_type::in, prog_context::in,
     list(statement)::out) is det.
@@ -2392,8 +2516,9 @@ ml_gen_ground_term_conjunct_tag(ModuleInfo, Target, HighLevelData, VarTypes,
             Args = [Arg],
             map.det_remove(Arg, ArgGroundTerm, !GroundTermMap),
             ArgGroundTerm = ml_ground_term(ArgRval, _ArgType, MLDS_ArgType),
+            DoubleWidth = no,
             ml_gen_box_const_rval(ModuleInfo, Context, MLDS_ArgType,
-                ArgRval, Rval0, !GlobalData),
+                DoubleWidth, ArgRval, Rval0, !GlobalData),
             Rval = ml_cast_cons_tag(MLDS_Type, ConsTag, Rval0),
             GroundTerm = ml_ground_term(Rval, VarType, MLDS_Type),
             map.det_insert(Var, GroundTerm, !GroundTermMap)
@@ -2502,17 +2627,19 @@ ml_gen_ground_term_conjunct_compound(ModuleInfo, Target, HighLevelData,
         ConsArgTypes = ArgTypes,
         ConsArgWidths = list.duplicate(Length, full_word)
     ),
-    assoc_list.from_corresponding_lists(Args, ConsArgTypes, ArgConsArgTypes),
     (
         HighLevelData = yes,
         construct_ground_term_initializers_hld(ModuleInfo, Context,
-            ArgConsArgTypes, ArgRvals0, !GlobalData, !GroundTermMap)
+            Args, ConsArgTypes, ConsArgWidths, ArgRvals1, !GlobalData,
+            !GroundTermMap)
     ;
         HighLevelData = no,
+        assoc_list.from_corresponding_lists(Args, ConsArgWidths,
+            ArgConsArgWidths),
         construct_ground_term_initializers_lld(ModuleInfo, Context,
-            ArgConsArgTypes, ArgRvals0, !GlobalData, !GroundTermMap)
+            ArgConsArgWidths, ArgRvals1, !GlobalData, !GroundTermMap)
     ),
-    pack_args(ml_shift_combine_rval, ConsArgWidths, ArgRvals0, ArgRvals,
+    pack_args(ml_shift_combine_rval, ConsArgWidths, ArgRvals1, ArgRvals,
         unit, _, unit, _),
     ArgInitializers = list.map(func(Init) = init_obj(Init), ArgRvals),
 
@@ -2550,66 +2677,122 @@ ml_gen_ground_term_conjunct_compound(ModuleInfo, Target, HighLevelData,
 %-----------------------------------------------------------------------------%
 
 :- pred construct_ground_term_initializers_hld(module_info::in,
-    prog_context::in,
-    assoc_list(prog_var, mer_type) ::in, list(mlds_rval)::out,
+    prog_context::in, list(prog_var)::in, list(mer_type)::in,
+    list(arg_width)::in, list(mlds_rval)::out,
     ml_global_data::in, ml_global_data::out,
     ml_ground_term_map::in, ml_ground_term_map::out) is det.
 
-construct_ground_term_initializers_hld(_, _, [], [],
-        !GlobalData, !GroundTermMap).
-construct_ground_term_initializers_hld(ModuleInfo, Context,
-        [ArgConsArgType | ArgConsArgTypes], [ArgRval | ArgRvals],
+construct_ground_term_initializers_hld(ModuleInfo, Context, Args, ConsArgTypes,
+        ConsArgWidths, ArgRvals, !GlobalData, !GroundTermMap) :-
+    (
+        Args = [],
+        ConsArgTypes = [],
+        ConsArgWidths = []
+    ->
+        ArgRvals = []
+    ;
+        Args = [Arg | Args1],
+        ConsArgTypes = [ConsArgType | ConsArgTypes1],
+        ConsArgWidths = [ConsArgWidth | ConsArgWidths1]
+    ->
+        construct_ground_term_initializer_hld(ModuleInfo, Context,
+            Arg, ConsArgType, ConsArgWidth, ArgRval,
+            !GlobalData, !GroundTermMap),
+        construct_ground_term_initializers_hld(ModuleInfo, Context,
+            Args1, ConsArgTypes1, ConsArgWidths1, ArgRvals1,
+            !GlobalData, !GroundTermMap),
+        ArgRvals = [ArgRval | ArgRvals1]
+    ;
+        unexpected($module, $pred, "list length mismatch")
+    ).
+
+:- pred construct_ground_term_initializer_hld(module_info::in,
+    prog_context::in, prog_var::in, mer_type::in, arg_width::in,
+    mlds_rval::out, ml_global_data::in, ml_global_data::out,
+    ml_ground_term_map::in, ml_ground_term_map::out) is det.
+
+construct_ground_term_initializer_hld(ModuleInfo, Context,
+        Arg, ConsArgType, ConsArgWidth, ArgRval,
         !GlobalData, !GroundTermMap) :-
-    construct_ground_term_initializer_hld(ModuleInfo, Context,
-        ArgConsArgType, ArgRval, !GlobalData, !GroundTermMap),
-    construct_ground_term_initializers_hld(ModuleInfo, Context,
-        ArgConsArgTypes, ArgRvals, !GlobalData, !GroundTermMap).
+    map.det_remove(Arg, ArgGroundTerm, !GroundTermMap),
+    ArgGroundTerm = ml_ground_term(ArgRval0, ArgType, _MLDS_ArgType),
+    ml_type_as_field(ModuleInfo, yes, ConsArgType, ConsArgWidth, BoxedArgType),
+    ml_gen_box_or_unbox_const_rval_hld(ModuleInfo, ArgType, BoxedArgType,
+        ArgRval0, Context, ArgRval, !GlobalData).
 
 :- pred construct_ground_term_initializers_lld(module_info::in,
-    prog_context::in,
-    assoc_list(prog_var, mer_type) ::in, list(mlds_rval)::out,
-    ml_global_data::in, ml_global_data::out,
+    prog_context::in, assoc_list(prog_var, arg_width)::in,
+    list(mlds_rval)::out, ml_global_data::in, ml_global_data::out,
     ml_ground_term_map::in, ml_ground_term_map::out) is det.
 
 construct_ground_term_initializers_lld(_, _, [], [],
         !GlobalData, !GroundTermMap).
 construct_ground_term_initializers_lld(ModuleInfo, Context,
-        [ArgConsArgType | ArgConsArgTypes], [ArgRval | ArgRvals],
+        [Arg - ConsArgWidth | ArgConsArgWidths], [ArgRval | ArgRvals],
         !GlobalData, !GroundTermMap) :-
     construct_ground_term_initializer_lld(ModuleInfo, Context,
-        ArgConsArgType, ArgRval, !GlobalData, !GroundTermMap),
+        Arg, ConsArgWidth, ArgRval, !GlobalData, !GroundTermMap),
     construct_ground_term_initializers_lld(ModuleInfo, Context,
-        ArgConsArgTypes, ArgRvals, !GlobalData, !GroundTermMap).
-
-%-----------------------------------------------------------------------------%
-
-:- pred construct_ground_term_initializer_hld(module_info::in,
-    prog_context::in, pair(prog_var, mer_type) ::in, mlds_rval::out,
-    ml_global_data::in, ml_global_data::out,
-    ml_ground_term_map::in, ml_ground_term_map::out) is det.
-
-construct_ground_term_initializer_hld(ModuleInfo, Context,
-        Arg - ConsArgType, ArgRval, !GlobalData, !GroundTermMap) :-
-    map.det_remove(Arg, ArgGroundTerm, !GroundTermMap),
-    ArgGroundTerm = ml_ground_term(ArgRval0, ArgType, _MLDS_ArgType),
-    ml_type_as_field(ModuleInfo, yes, ConsArgType, BoxedArgType),
-    ml_gen_box_or_unbox_const_rval_hld(ModuleInfo, ArgType, BoxedArgType,
-        ArgRval0, Context, ArgRval, !GlobalData).
+        ArgConsArgWidths, ArgRvals, !GlobalData, !GroundTermMap).
 
 :- pred construct_ground_term_initializer_lld(module_info::in,
-    prog_context::in, pair(prog_var, mer_type) ::in, mlds_rval::out,
-    ml_global_data::in, ml_global_data::out,
+    prog_context::in, prog_var::in, arg_width::in,
+    mlds_rval::out, ml_global_data::in, ml_global_data::out,
     ml_ground_term_map::in, ml_ground_term_map::out) is det.
 
 construct_ground_term_initializer_lld(ModuleInfo, Context,
-        Arg - _ConsArgType, ArgRval, !GlobalData, !GroundTermMap) :-
+        Arg, ConsArgWidth, ArgRval, !GlobalData, !GroundTermMap) :-
     map.det_remove(Arg, ArgGroundTerm, !GroundTermMap),
     ArgGroundTerm = ml_ground_term(ArgRval0, _ArgType, MLDS_ArgType),
-    ml_gen_box_const_rval(ModuleInfo, Context, MLDS_ArgType,
+    (
+        ConsArgWidth = double_word,
+        DoubleWidth = yes
+    ;
+        ( ConsArgWidth = full_word
+        ; ConsArgWidth = partial_word_first(_)
+        ; ConsArgWidth = partial_word_shifted(_, _)
+        ),
+        DoubleWidth = no
+    ),
+    ml_gen_box_const_rval(ModuleInfo, Context, MLDS_ArgType, DoubleWidth,
         ArgRval0, ArgRval, !GlobalData).
 
 %-----------------------------------------------------------------------------%
 
+:- pred ml_expand_double_word_rvals(list(arg_width)::in, list(arg_width)::out,
+    assoc_list(mlds_rval, mlds_type)::in,
+    assoc_list(mlds_rval, mlds_type)::out) is det.
+
+ml_expand_double_word_rvals([], [], [], []).
+ml_expand_double_word_rvals([Width0 | Widths0], Widths,
+        [Rval0 - Type0 | RvalsTypes0], RvalsTypes) :-
+    ml_expand_double_word_rvals(Widths0, Widths1, RvalsTypes0, RvalsTypes1),
+    (
+        ( Width0 = full_word
+        ; Width0 = partial_word_first(_)
+        ; Width0 = partial_word_shifted(_, _)
+        ),
+        Widths = [Width0 | Widths1],
+        RvalsTypes = [Rval0 - Type0 | RvalsTypes1]
+    ;
+        Width0 = double_word,
+        ( Rval0 = ml_const(mlconst_null(_)) ->
+            SubstType = mlds_generic_type,
+            RvalA = ml_const(mlconst_null(SubstType)),
+            RvalB = ml_const(mlconst_null(SubstType))
+        ;
+            SubstType = mlds_native_int_type,
+            RvalA = ml_binop(float_word_bits, Rval0, ml_const(mlconst_int(0))),
+            RvalB = ml_binop(float_word_bits, Rval0, ml_const(mlconst_int(1)))
+        ),
+        Widths = [full_word, full_word | Widths1],
+        RvalsTypes = [RvalA - SubstType, RvalB - SubstType | RvalsTypes1]
+    ).
+ml_expand_double_word_rvals([], _, [_ | _], _) :-
+    unexpected($module, $pred, "list length mismatch").
+ml_expand_double_word_rvals([_ | _], _, [], _) :-
+    unexpected($module, $pred, "list length mismatch").
+
 :- pred ml_shift_combine_rval(mlds_rval::in, int::in, maybe(mlds_rval)::in,
     mlds_rval::out, unit::in, unit::out, unit::in, unit::out) is det.
 
diff --git a/compiler/mlds_to_c.m b/compiler/mlds_to_c.m
index 2c17b44..01e31ba 100644
--- a/compiler/mlds_to_c.m
+++ b/compiler/mlds_to_c.m
@@ -4503,14 +4503,33 @@ mlds_output_binop(Opts, Op, X, Y, !IO) :-
         io.write_string(")", !IO)
     ;
         Category = macro_binop,
-        io.write_string(OpStr, !IO),
-        io.write_string("(", !IO),
-        mlds_output_rval_as_op_arg(Opts, X, !IO),
-        io.write_string(", ", !IO),
-        mlds_output_rval_as_op_arg(Opts, Y, !IO),
-        io.write_string(")", !IO)
+        (
+            Op = float_from_dword,
+            consecutive_field_offsets(X, Y)
+        ->
+            % gcc produces faster code in this case.
+            io.write_string("MR_float_from_dword_ptr", !IO),
+            io.write_string("(&(", !IO),
+            mlds_output_rval(Opts, X, !IO),
+            io.write_string("))", !IO)
+        ;
+            io.write_string(OpStr, !IO),
+            io.write_string("(", !IO),
+            mlds_output_rval_as_op_arg(Opts, X, !IO),
+            io.write_string(", ", !IO),
+            mlds_output_rval_as_op_arg(Opts, Y, !IO),
+            io.write_string(")", !IO)
+        )
     ).
 
+:- pred consecutive_field_offsets(mlds_rval::in, mlds_rval::in) is semidet.
+
+consecutive_field_offsets(X, Y) :-
+    X = ml_lval(ml_field(Tag, Addr, FieldIdX, Type, PtrType)),
+    Y = ml_lval(ml_field(Tag, Addr, FieldIdY, Type, PtrType)),
+    FieldIdX = ml_field_offset(ml_const(mlconst_int(Offset))),
+    FieldIdY = ml_field_offset(ml_const(mlconst_int(Offset + 1))).
+
 :- pred mlds_output_rval_const(mlds_to_c_opts::in, mlds_rval_const::in,
     io::di, io::uo) is det.
 
diff --git a/compiler/mlds_to_gcc.m b/compiler/mlds_to_gcc.m
index 97fbef0..cae0df3 100644
--- a/compiler/mlds_to_gcc.m
+++ b/compiler/mlds_to_gcc.m
@@ -3569,6 +3569,10 @@ convert_binary_op(float_lt, gcc.lt_expr,        gcc.boolean_type_node).
 convert_binary_op(float_gt, gcc.gt_expr,        gcc.boolean_type_node).
 convert_binary_op(float_le, gcc.le_expr,        gcc.boolean_type_node).
 convert_binary_op(float_ge, gcc.ge_expr,        gcc.boolean_type_node).
+convert_binary_op(float_word_bits, _, _) :-
+    unexpected($module, $pred, "float_word_bits").
+convert_binary_op(float_from_dword, _, _) :-
+    unexpected($module, $pred, "float_from_dword").
 convert_binary_op(compound_eq, _, _) :-
     unexpected($module, $pred, "compound_eq").
 convert_binary_op(compound_lt, _, _) :-
diff --git a/compiler/mlds_to_il.m b/compiler/mlds_to_il.m
index 40d9e8f..20b2916 100644
--- a/compiler/mlds_to_il.m
+++ b/compiler/mlds_to_il.m
@@ -2729,6 +2729,12 @@ binaryop_to_il(ne, from_list(Instrs), !Info) :-
 binaryop_to_il(body, _, !Info) :-
     unexpected($module, $pred, "body").
 
+binaryop_to_il(float_word_bits, _, !Info) :-
+    unexpected($module, $pred, "float_word_bits").
+
+binaryop_to_il(float_from_dword, _, !Info) :-
+    unexpected($module, $pred, "float_from_dword").
+
 binaryop_to_il(array_index(ElemType), singleton(I), !Info) :-
     DataRep = !.Info ^ il_data_rep,
     MLDS_Type = ml_gen_array_elem_type(ElemType),
diff --git a/compiler/opt_debug.m b/compiler/opt_debug.m
index dfe53ea..7e66522 100644
--- a/compiler/opt_debug.m
+++ b/compiler/opt_debug.m
@@ -659,6 +659,8 @@ dump_binop(float_le) = "fl<=".
 dump_binop(float_ge) = "fl>=".
 dump_binop(float_lt) = "fl<".
 dump_binop(float_gt) = "fl>".
+dump_binop(float_word_bits) = "float_word_bits".
+dump_binop(float_from_dword) = "float_from_dword".
 dump_binop(int_add) = "+".
 dump_binop(int_sub) = "-".
 dump_binop(int_mul) = "*".
diff --git a/compiler/prog_data.m b/compiler/prog_data.m
index 0ca9a5e..0c127a5 100644
--- a/compiler/prog_data.m
+++ b/compiler/prog_data.m
@@ -1710,22 +1710,27 @@ equivalent_cons_ids(ConsIdA, ConsIdB) :-
     % How much space does a constructor argument occupy in the underlying
     % representation.
     %
-    % `full_word' indicates that an argument occupies a full word.
-    % This is the case for all arguments except some enumeration arguments.
+    % `full_word' indicates that the argument occupies a single word.
+    % This is the usual case.
+    %
+    % `double_word' indicates that the argument occupies two words.
+    % Currently only double-precision floats may do so.
     %
     % `partial_word_begin(Mask)' indicates that the argument is the first of
-    % two or more arguments which share the same word. The argument occupies
-    % the lowest bits in the word so no shifting is required. The other
-    % arguments can be masked out with the bit-mask `Mask'. The actual number
-    % of bits occupied by the argument is `int.log2(Mask + 1)'.
+    % two or more enumeration arguments which share the same word. The argument
+    % occupies the lowest bits in the word so no shifting is required. The
+    % other arguments can be masked out with the bit-mask `Mask'. The actual
+    % number of bits occupied by the argument is `int.log2(Mask + 1)'.
     %
     % `partial_word_shifted(Shift, Mask)' indicates that the argument is one of
-    % the subsequent arguments which share the same word. `Shift' is the
-    % non-zero number of bits that the argument value is left-shifted by.
-    % `Mask' is the unshifted bit-mask to mask out other arguments.
+    % the subsequent enumeration arguments which share the same word.
+    % `Shift' is the non-zero number of bits that the argument value is
+    % left-shifted by. `Mask' is the unshifted bit-mask to mask out other
+    % arguments.
     %
 :- type arg_width
     --->    full_word
+    ;       double_word
     ;       partial_word_first(int)         % mask
     ;       partial_word_shifted(int, int). % shift, mask
 
diff --git a/compiler/rtti_out.m b/compiler/rtti_out.m
index e70ed32..7d4f9d3 100644
--- a/compiler/rtti_out.m
+++ b/compiler/rtti_out.m
@@ -1138,7 +1138,8 @@ output_du_arg_locns(Info, RttiTypeCtor, Ordinal, ArgInfos, HaveArgLocns,
         !DeclSet, !IO) :-
     (
         list.member(ArgInfo, ArgInfos),
-        ArgInfo = du_arg_info(_, _, partial_word_first(_))
+        ArgInfo = du_arg_info(_, _, Width),
+        Width \= full_word
     ->
         output_generic_rtti_data_defn_start(Info,
             ctor_rtti_id(RttiTypeCtor, type_ctor_field_locns(Ordinal)),
@@ -1159,23 +1160,32 @@ output_du_arg_locns_2([ArgInfo | ArgInfos], PrevSlotNum, !IO) :-
     ArgWidth = ArgInfo ^ du_arg_width,
     (
         ArgWidth = full_word,
-        % Code which examines this structure must check for Bits = 0 as a
-        % special case, meaning that the argument is not packed.
+        % Bits = 0 is a special case.
         Shift = 0,
         Bits = 0,
-        SlotNum = PrevSlotNum + 1
+        SlotNum = PrevSlotNum + 1,
+        Skip = 0
+    ;
+        ArgWidth = double_word,
+        % Bits = -1 is a special case.
+        Shift = 0,
+        Bits = -1,
+        SlotNum = PrevSlotNum + 1,
+        Skip = 1
     ;
         ArgWidth = partial_word_first(Mask),
         Shift = 0,
         int.log2(Mask + 1, Bits),
-        SlotNum = PrevSlotNum + 1
+        SlotNum = PrevSlotNum + 1,
+        Skip = 0
     ;
         ArgWidth = partial_word_shifted(Shift, Mask),
         int.log2(Mask + 1, Bits),
-        SlotNum = PrevSlotNum
+        SlotNum = PrevSlotNum,
+        Skip = 0
     ),
     io.format("\t{ %d, %d, %d },\n", [i(SlotNum), i(Shift), i(Bits)], !IO),
-    output_du_arg_locns_2(ArgInfos, SlotNum, !IO).
+    output_du_arg_locns_2(ArgInfos, SlotNum + Skip, !IO).
 
 %-----------------------------------------------------------------------------%
 
diff --git a/compiler/rtti_to_mlds.m b/compiler/rtti_to_mlds.m
index dd28f46..bce59b5 100644
--- a/compiler/rtti_to_mlds.m
+++ b/compiler/rtti_to_mlds.m
@@ -988,24 +988,35 @@ gen_field_locns(_ModuleInfo, RttiTypeCtor, Ordinal, ArgInfos, HaveArgLocns,
 :- pred gen_field_locn(rtti_id::in, du_arg_info::in, mlds_initializer::out,
     int::in, int::out) is det.
 
-gen_field_locn(RttiId, ArgInfo, ArgLocnInitializer, !Offset) :-
+gen_field_locn(RttiId, ArgInfo, ArgLocnInitializer, PrevOffset,
+        NextPrevOffset) :-
     ArgWidth = ArgInfo ^ du_arg_width,
     (
         ArgWidth = full_word,
-        !:Offset = !.Offset + 1,
+        FieldOffset = PrevOffset + 1,
         Shift = 0,
-        Bits = 0
+        Bits = 0,
+        NextPrevOffset = FieldOffset
+    ;
+        ArgWidth = double_word,
+        FieldOffset = PrevOffset + 1,
+        Shift = 0,
+        Bits = -1,
+        NextPrevOffset = FieldOffset + 1
     ;
         ArgWidth = partial_word_first(Mask),
-        !:Offset = !.Offset + 1,
+        FieldOffset = PrevOffset + 1,
         Shift = 0,
-        int.log2(Mask + 1, Bits)
+        int.log2(Mask + 1, Bits),
+        NextPrevOffset = FieldOffset
     ;
         ArgWidth = partial_word_shifted(Shift, Mask),
-        int.log2(Mask + 1, Bits)
+        FieldOffset = PrevOffset,
+        int.log2(Mask + 1, Bits),
+        NextPrevOffset = FieldOffset
     ),
     ArgLocnInitializer = init_struct(mlds_rtti_type(item_type(RttiId)), [
-        gen_init_int(!.Offset),
+        gen_init_int(FieldOffset),
         gen_init_int(Shift),
         gen_init_int(Bits)
     ]).
diff --git a/compiler/structure_reuse.direct.choose_reuse.m b/compiler/structure_reuse.direct.choose_reuse.m
index 62b6ca1..0b58804 100644
--- a/compiler/structure_reuse.direct.choose_reuse.m
+++ b/compiler/structure_reuse.direct.choose_reuse.m
@@ -954,11 +954,12 @@ compute_reuse_type(Background, NewVar, NewCons, NewCellArgs, DeconSpec,
         SameCons = yes
     ;
         SameCons = no,
-        % XXX All the reuse code was written before packed arguments were
-        % introduced. For now only allow reuse of cells with packed fields when
-        % the dead variable and the new variable have the same constructor.
-        cons_has_no_packed_fields(ModuleInfo, NewCons),
-        cons_has_no_packed_fields(ModuleInfo, DeadCons)
+        % XXX All the reuse code was written before packed and double word
+        % arguments were introduced. For now only allow reuse of cells with
+        % packed fields when the dead variable and the new variable have the
+        % same constructor.
+        cons_has_normal_fields(ModuleInfo, NewCons),
+        cons_has_normal_fields(ModuleInfo, DeadCons)
     ),
 
     NewNumArgs = list.length(NewCellArgs),
@@ -1008,9 +1009,9 @@ compute_reuse_type(Background, NewVar, NewCons, NewCellArgs, DeconSpec,
     Weight > 0,
     ReuseType = reuse_type(SameCons, ReuseFields, float(Weight)).
 
-:- pred cons_has_no_packed_fields(module_info::in, cons_id::in) is semidet.
+:- pred cons_has_normal_fields(module_info::in, cons_id::in) is semidet.
 
-cons_has_no_packed_fields(ModuleInfo, Cons) :-
+cons_has_normal_fields(ModuleInfo, Cons) :-
     (
         Cons = cons(_, _, TypeCtor),
         get_cons_defn_det(ModuleInfo, TypeCtor, Cons, ConsDefn),
diff --git a/compiler/type_util.m b/compiler/type_util.m
index b71a09f..0865395 100644
--- a/compiler/type_util.m
+++ b/compiler/type_util.m
@@ -257,6 +257,10 @@
 :- pred type_is_no_tag_type(module_info::in, mer_type::in, sym_name::out,
     mer_type::out) is semidet.
 
+    % Check whether a type is float or a type equivalent to float.
+    %
+:- pred type_is_float_eqv(module_info::in, mer_type::in) is semidet.
+
     % cons_id_adjusted_arity(ModuleInfo, Type, ConsId):
     %
     % Returns the number of arguments of specified constructor id, adjusted
@@ -1082,6 +1086,17 @@ type_is_no_tag_type(ModuleInfo, Type, Ctor, ArgType) :-
 
 %-----------------------------------------------------------------------------%
 
+type_is_float_eqv(ModuleInfo, Type) :-
+    (
+        Type = float_type
+    ;
+        type_to_type_defn_body(ModuleInfo, Type, TypeBody),
+        TypeBody = hlds_eqv_type(EqvType),
+        type_is_float_eqv(ModuleInfo, EqvType)
+    ).
+
+%-----------------------------------------------------------------------------%
+
 cons_id_adjusted_arity(ModuleInfo, Type, ConsId) = AdjustedArity :-
     % Figure out the arity of this constructor, _including_ any type-infos
     % or typeclass-infos inserted for existential data types.
diff --git a/compiler/unify_gen.m b/compiler/unify_gen.m
index fac81c1..63088dc 100644
--- a/compiler/unify_gen.m
+++ b/compiler/unify_gen.m
@@ -1180,20 +1180,26 @@ var_types(CI, Vars, Types) :-
     rval::in, int::in, int::in, list(uni_val)::out, list(uni_val)::out) is det.
 
 make_fields_and_argvars([], [], _, _, _, [], []).
-make_fields_and_argvars([Var | Vars], [Width | Widths], Rval, PrevOffset,
+make_fields_and_argvars([Var | Vars], [Width | Widths], Rval, PrevOffset0,
         TagNum, [F | Fs], [A | As]) :-
     (
         ( Width = full_word
         ; Width = partial_word_first(_Mask)
         ),
-        Offset = PrevOffset + 1
+        Offset = PrevOffset0 + 1,
+        PrevOffset = Offset
     ;
         Width = partial_word_shifted(_Shift, _Mask),
-        Offset = PrevOffset
+        Offset = PrevOffset0,
+        PrevOffset = Offset
+    ;
+        Width = double_word,
+        Offset = PrevOffset0 + 1,
+        PrevOffset = Offset + 1
     ),
     F = lval(field(yes(TagNum), Rval, const(llconst_int(Offset))), Width),
     A = ref(Var),
-    make_fields_and_argvars(Vars, Widths, Rval, Offset, TagNum, Fs, As).
+    make_fields_and_argvars(Vars, Widths, Rval, PrevOffset, TagNum, Fs, As).
 make_fields_and_argvars([], [_ | _], _, _, _, _, _) :-
     unexpected($module, $pred, "mismatched lists").
 make_fields_and_argvars([_ | _], [], _, _, _, _, _) :-
@@ -1443,6 +1449,10 @@ generate_sub_assign(Left, Right, Code, !CI) :-
             Combined = binop(bitwise_or, MaskOld, ShiftNew),
             AssignCode = singleton(llds_instr(assign(Lval, Combined),
                 "Update part of word"))
+        ;
+            LeftWidth = double_word,
+            % Not yet supported in LLDS grades.
+            sorry($module, $pred, "double_word")
         ),
         Code = SourceCode ++ MaterializeCode ++ AssignCode
     ;
@@ -1464,6 +1474,10 @@ generate_sub_assign(Left, Right, Code, !CI) :-
                     ),
                     Rval = binop(bitwise_and, Rval0, const(llconst_int(Mask))),
                     assign_field_lval_expr_to_var(Lvar, Lval, Rval, Code, !CI)
+                ;
+                    RightWidth = double_word,
+                    % Not yet supported in LLDS grades.
+                    sorry($module, $pred, "double_word")
                 )
             ;
                 Right = ref(Rvar),
diff --git a/library/construct.m b/library/construct.m
index a431bf1..a6f5ced 100644
--- a/library/construct.m
+++ b/library/construct.m
@@ -719,8 +719,22 @@ find_functor_2(TypeInfo, Functor, Arity, Num0, FunctorNumber, ArgTypes) :-
                             MR_field(ptag, new_data, 1 + i) = arg_data;
                         } else {
                             const MR_DuArgLocn *locn = &arg_locns[i];
-                            MR_field(ptag, new_data, 1 + locn->MR_arg_offset)
-                                |= (arg_data << locn->MR_arg_shift);
+
+                            if (locn->MR_arg_bits == -1) {
+                              #ifdef MR_BOXED_FLOAT
+                                MR_memcpy(
+                                    &MR_field(ptag, new_data,
+                                        1 + locn->MR_arg_offset),
+                                    (MR_Word *) arg_data, sizeof(MR_Float));
+                              #else
+                                MR_fatal_error(
+                                    ""construct(): double precision float"");
+                              #endif
+                            } else {
+                                MR_field(ptag, new_data,
+                                    1 + locn->MR_arg_offset)
+                                    |= (arg_data << locn->MR_arg_shift);
+                            }
                         }
                         size += MR_term_size(arg_type_info, arg_data);
                         arg_list = MR_list_tail(arg_list);
@@ -760,8 +774,20 @@ find_functor_2(TypeInfo, Functor, Arity, Num0, FunctorNumber, ArgTypes) :-
                             MR_field(ptag, new_data, i) = arg_data;
                         } else {
                             const MR_DuArgLocn *locn = &arg_locns[i];
-                            MR_field(ptag, new_data, locn->MR_arg_offset)
-                                |= (arg_data << locn->MR_arg_shift);
+
+                            if (locn->MR_arg_bits == -1) {
+                              #ifdef MR_BOXED_FLOAT
+                                MR_memcpy(&MR_field(ptag, new_data,
+                                    locn->MR_arg_offset),
+                                    (MR_Word *) arg_data, sizeof(MR_Float));
+                              #else
+                                MR_fatal_error(
+                                    ""construct(): double-precision float"");
+                              #endif
+                            } else {
+                                MR_field(ptag, new_data, locn->MR_arg_offset)
+                                    |= (arg_data << locn->MR_arg_shift);
+                            }
                         }
                         size += MR_term_size(arg_type_info, arg_data);
                         arg_list = MR_list_tail(arg_list);
diff --git a/library/store.m b/library/store.m
index e964657..600dab9 100644
--- a/library/store.m
+++ b/library/store.m
@@ -714,7 +714,7 @@ ref_functor(Ref, Functor, Arity, !Store) :-
         MR_offset_incr_hp_msg(ArgRef, MR_SIZE_SLOT_SIZE,
             MR_SIZE_SLOT_SIZE + 1, MR_ALLOC_ID, ""store.ref/2"");
         MR_define_size_slot(0, ArgRef, 1);
-        * (MR_Word *) ArgRef = MR_unpack_arg(*arg_ref, arg_locn);
+        * (MR_Word *) ArgRef = MR_arg_value(arg_ref, arg_locn);
     } else {
         ArgRef = (MR_Word) arg_ref;
     }
@@ -780,7 +780,7 @@ ref_functor(Ref, Functor, Arity, !Store) :-
         MR_offset_incr_hp_msg(ArgRef, MR_SIZE_SLOT_SIZE,
             MR_SIZE_SLOT_SIZE + 1, MR_ALLOC_ID, ""store.ref/2"");
         MR_define_size_slot(0, ArgRef, 1);
-        * (MR_Word *) ArgRef = MR_unpack_arg(*arg_ref, arg_locn);
+        * (MR_Word *) ArgRef = MR_arg_value(arg_ref, arg_locn);
     } else if (arg_ref == &Val) {
         /*
         ** For no_tag types, the argument may have the same address as the
diff --git a/runtime/mercury_conf.h.in b/runtime/mercury_conf.h.in
index 816d61c..0e9150c 100644
--- a/runtime/mercury_conf.h.in
+++ b/runtime/mercury_conf.h.in
@@ -97,8 +97,11 @@
 #undef	MR_LOW_TAG_BITS
 
 /*
-** MR_BOXED_FLOAT: defined if double precision floats might not fit
-** in a Word, and hence must be boxed.
+** MR_BOXED_FLOAT: defined if MR_Float is wider than a MR_Word,
+** i.e. when using double-precision floats on 32-bit architectures.
+** It does NOT mean that all floats are necessarily boxed.
+** A better name might be MR_OVERSIZED_FLOAT.
+**
 ** Note that when bootstrapping from the source distribution,
 ** we initially build things without MR_BOXED_FLOAT even on machines
 ** for which sizeof(MR_Float) <= sizeof(MR_Word).
diff --git a/runtime/mercury_deconstruct.c b/runtime/mercury_deconstruct.c
index d4beed4..a3af852 100644
--- a/runtime/mercury_deconstruct.c
+++ b/runtime/mercury_deconstruct.c
@@ -344,3 +344,33 @@ MR_expand_type_name(MR_TypeCtorInfo tci, MR_bool wrap)
 
     return (MR_ConstString) str;
 }
+
+MR_Word
+MR_arg_value_uncommon(MR_Word *arg_ptr, const MR_DuArgLocn *arg_locn)
+{
+    MR_Float    flt;
+    MR_Word     val;
+
+    /*
+    ** MR_arg_bits == -1 means the argument is a double-precision floating
+    ** point value occupying two words.
+    */
+    if (arg_locn->MR_arg_bits == -1) {
+#ifdef MR_BOXED_FLOAT
+        flt = MR_float_from_dword_ptr(arg_ptr);
+    #ifdef MR_HIGHLEVEL_CODE
+        return (MR_Word) MR_box_float(flt);
+    #else
+        return MR_float_to_word(flt);
+    #endif
+#else
+        MR_fatal_error("double-word floats should not exist in this grade");
+#endif
+    }
+
+    /* The argument is a packed enumeration value. */
+    val = *arg_ptr;
+    val = (val >> arg_locn->MR_arg_shift)
+        & ((MR_Word) (1 << arg_locn->MR_arg_bits) - 1);
+    return val;
+}
diff --git a/runtime/mercury_deconstruct.h b/runtime/mercury_deconstruct.h
index db1ae5f..e38b4da 100644
--- a/runtime/mercury_deconstruct.h
+++ b/runtime/mercury_deconstruct.h
@@ -154,4 +154,13 @@ extern  MR_bool MR_named_arg(MR_TypeInfo type_info, MR_Word *term,
 extern  MR_bool MR_named_arg_num(MR_TypeInfo type_info, MR_Word *term_ptr,
                     const char *arg_name, int *arg_num_ptr);
 
+#define MR_arg_value(arg_ptr, arg_locn)                                 \
+    ( ((arg_locn) == NULL || (arg_locn)->MR_arg_bits == 0)              \
+    ? *(arg_ptr)                                                        \
+    : MR_arg_value_uncommon(arg_ptr, arg_locn)                          \
+    )
+
+extern  MR_Word MR_arg_value_uncommon(MR_Word *arg_ptr,
+                    const MR_DuArgLocn *arg_locn);
+
 #endif /* MERCURY_DECONSTRUCT_H */
diff --git a/runtime/mercury_deconstruct_macros.h b/runtime/mercury_deconstruct_macros.h
index bc29206..fae0287 100644
--- a/runtime/mercury_deconstruct_macros.h
+++ b/runtime/mercury_deconstruct_macros.h
@@ -66,10 +66,10 @@
             } else {                                                \
                 const MR_DuArgLocn *locn =                          \
                     &(ei).args_field.arg_locns[i];                  \
-                val = (ei).args_field.arg_values[                   \
+                MR_Word *argp = &(ei).args_field.arg_values[        \
                     locn->MR_arg_offset +                           \
                     (ei).args_field.num_extra_args];                \
-                val = MR_unpack_arg(val, locn);                     \
+                val = MR_arg_value(argp, locn);                     \
             }                                                       \
                                                                     \
             /* Create an argument on the heap */                    \
diff --git a/runtime/mercury_deep_copy_body.h b/runtime/mercury_deep_copy_body.h
index 63c7229..e50a227 100644
--- a/runtime/mercury_deep_copy_body.h
+++ b/runtime/mercury_deep_copy_body.h
@@ -289,19 +289,27 @@ try_again:
                     }                                                       \
                                                                             \
                     for (i = 0; i < arity; i++) {                           \
-                        if (arg_locns != NULL &&                            \
-                            arg_locns[i].MR_arg_bits != 0)                  \
-                        {                                                   \
-                            /*                                              \
-                            ** Copy fields holding packed arguments when    \
-                            ** we encounter the first argument.             \
-                            */                                              \
-                            if (arg_locns[i].MR_arg_shift == 0) {           \
+                        if (arg_locns != NULL) {                            \
+                            if (arg_locns[i].MR_arg_bits == -1) {           \
+                                /* Double precision float. */               \
                                 MR_field(0, new_data, cur_slot) =           \
                                     data_value[cur_slot];                   \
-                                cur_slot++;                                 \
+                                MR_field(0, new_data, cur_slot + 1) =       \
+                                    data_value[cur_slot + 1];               \
+                                cur_slot += 2;                              \
+                                continue;                                   \
+                            } else if (arg_locns[i].MR_arg_bits > 0) {      \
+                                /*                                          \
+                                ** Copy fields holding packed arguments     \
+                                ** when we encounter the first argument.    \
+                                */                                          \
+                                if (arg_locns[i].MR_arg_shift == 0) {       \
+                                    MR_field(0, new_data, cur_slot) =       \
+                                        data_value[cur_slot];               \
+                                    cur_slot++;                             \
+                                }                                           \
+                                continue;                                   \
                             }                                               \
-                            continue;                                       \
                         }                                                   \
                                                                             \
                         if (MR_arg_type_may_contain_var(functor_desc, i)) { \
diff --git a/runtime/mercury_float.h b/runtime/mercury_float.h
index 599fe14..b0036f8 100644
--- a/runtime/mercury_float.h
+++ b/runtime/mercury_float.h
@@ -71,6 +71,42 @@
   #define MR_float_const(f) MR_float_to_word(f)	/* inefficient */
 #endif
 
+#ifndef MR_USE_SINGLE_PREC_FLOAT
+  union MR_Float_Dword {
+	MR_Float f;
+	MR_Word	w[2];
+  };
+
+  #define MR_float_word_bits(F, I)                                          \
+    (((union MR_Float_Dword) (F)).w[(I)])
+
+  #define MR_float_from_dword_ptr(ptr)                                      \
+    (((union MR_Float_Dword *) (ptr))->f)
+
+  #if defined(MR_GNUC) || defined(MR_CLANG)
+    #define MR_float_from_dword(w0, w1)                                     \
+      ({                                                                    \
+	union MR_Float_Dword __ffdw;                                        \
+	__ffdw.w[0] = (MR_Word) (w0);                                       \
+	__ffdw.w[1] = (MR_Word) (w1);                                       \
+	__ffdw.f;                                                           \
+      })
+  #else
+    MR_EXTERN_INLINE MR_Float
+    MR_float_from_dword(MR_Word w0, MR_Word w1);
+
+    MR_EXTERN_INLINE MR_Float
+    MR_float_from_dword(MR_Word w0, MR_Word w1)
+    {
+	union MR_Float_Dword __ffdw;
+	__ffdw.w[0] = (MR_Word) (w0);
+	__ffdw.w[1] = (MR_Word) (w1);
+	return __ffdw.f;
+    }
+  #endif
+
+#endif /* not MR_USE_SINGLE_PREC_FLOAT */
+
 #else /* not MR_BOXED_FLOAT */
 
   /* unboxed float means we can assume sizeof(MR_Float) <= sizeof(MR_Word) */
diff --git a/runtime/mercury_ml_arg_body.h b/runtime/mercury_ml_arg_body.h
index b23884a..58ba440 100644
--- a/runtime/mercury_ml_arg_body.h
+++ b/runtime/mercury_ml_arg_body.h
@@ -62,10 +62,7 @@
         &argument_ptr, &arg_locn_ptr, NONCANON);
     MR_restore_transient_registers();
     if (success) {
-        value = *argument_ptr;
-        if (arg_locn_ptr != NULL) {
-            value = MR_unpack_arg(value, arg_locn_ptr);
-        }
+        value = MR_arg_value(argument_ptr, arg_locn_ptr);
 
         /*
         ** The following code is what *should* be here. The reason it is
diff --git a/runtime/mercury_table_type_body.h b/runtime/mercury_table_type_body.h
index a05ace2..ce5e664 100644
--- a/runtime/mercury_table_type_body.h
+++ b/runtime/mercury_table_type_body.h
@@ -222,13 +222,14 @@
 
                 for (i = 0; i < functor_desc->MR_du_functor_orig_arity; i++) {
                     const MR_DuArgLocn *arg_locn;
+                    MR_Word            *arg_ptr;
                     MR_Word             arg_value;
 
                     if (functor_desc->MR_du_functor_arg_locns != NULL) {
                         arg_locn = &functor_desc->MR_du_functor_arg_locns[i];
-                        arg_value = arg_vector[meta_args +
-                            arg_locn->MR_arg_offset];
-                        arg_value = MR_unpack_arg(arg_value, arg_locn);
+                        arg_ptr = &arg_vector[meta_args
+                            + arg_locn->MR_arg_offset];
+                        arg_value = MR_arg_value(arg_ptr, arg_locn);
                     } else {
                         arg_value = arg_vector[meta_args + i];
                     }
diff --git a/runtime/mercury_tabling.c b/runtime/mercury_tabling.c
index dea0617..563f0e6 100644
--- a/runtime/mercury_tabling.c
+++ b/runtime/mercury_tabling.c
@@ -17,6 +17,7 @@
 #include "mercury_type_info.h"
 #include "mercury_array_macros.h"
 #include "mercury_builtin_types.h"
+#include "mercury_deconstruct.h"
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
diff --git a/runtime/mercury_type_info.c b/runtime/mercury_type_info.c
index be8000a..4a7349e 100644
--- a/runtime/mercury_type_info.c
+++ b/runtime/mercury_type_info.c
@@ -907,10 +907,17 @@ MR_typeclass_ref_error(MR_Word tci, int n, const char *msg)
 int
 MR_cell_size_for_args(int arity, const MR_DuArgLocn *arg_locns)
 {
+    const MR_DuArgLocn *last_arg;
+
     if (arg_locns == NULL) {
         return arity;
+    }
+
+    last_arg = &arg_locns[arity - 1];
+    if (last_arg->MR_arg_bits == -1) {
+        return last_arg->MR_arg_offset + 2;
     } else {
-        return arg_locns[arity - 1].MR_arg_offset + 1;
+        return last_arg->MR_arg_offset + 1;
     }
 }
 
diff --git a/runtime/mercury_type_info.h b/runtime/mercury_type_info.h
index 359d4c5..f653047 100644
--- a/runtime/mercury_type_info.h
+++ b/runtime/mercury_type_info.h
@@ -893,9 +893,10 @@ typedef struct {
     MR_int_least8_t         MR_arg_bits;
     /*
     ** If MR_arg_bits is zero then the argument occupies the entire word.
-    ** Otherwise MR_arg_bits is non-zero and gives the number of bits used by
-    ** the argument. Storing the bit-mask would be more useful, but would not
-    ** be as compact. 
+    ** If MR_arg_bits is -1 then the argument is a double-precision floating
+    ** point value occupying two words. Otherwise MR_arg_bits is non-zero and
+    ** gives the number of bits used by the argument. Storing the bit-mask
+    ** would be more useful, but would not be as compact.
     */
 } MR_DuArgLocn;
 
@@ -943,12 +944,6 @@ typedef const MR_DuFunctorDesc              *MR_DuFunctorDescPtr;
 #define MR_some_arg_type_contains_var(functor_desc)                     \
     ((functor_desc)->MR_du_functor_arg_type_contains_var > 0)
 
-#define MR_unpack_arg(val, arg_locn)                                    \
-    ((arg_locn)->MR_arg_bits == 0                                       \
-     ? (val)                                                            \
-     : ((val) >> (arg_locn)->MR_arg_shift) &                            \
-        ((MR_Word) (1 << (arg_locn)->MR_arg_bits) - 1))
-
 /*---------------------------------------------------------------------------*/
 
 typedef struct {
diff --git a/tests/hard_coded/Mercury.options b/tests/hard_coded/Mercury.options
index 1eb61f9..ae87721 100644
--- a/tests/hard_coded/Mercury.options
+++ b/tests/hard_coded/Mercury.options
@@ -55,6 +55,7 @@ MCFLAGS-intermod_type_qual =	--intermodule-optimization
 MCFLAGS-intermod_type_qual2 =	--intermodule-optimization
 MCFLAGS-intermod_multimode =	--intermodule-optimization
 MCFLAGS-intermod_multimode_main = --intermodule-optimization
+MCFLAGS-lco_double	    =	--optimize-constructor-last-call
 MCFLAGS-lco_mday_bug_1	    =	--optimize-constructor-last-call
 MCFLAGS-lco_mday_bug_2	    =	--optimize-constructor-last-call
 MCFLAGS-lco_no_inline	    =	--optimize-constructor-last-call --no-inline-builtins
diff --git a/tests/hard_coded/Mmakefile b/tests/hard_coded/Mmakefile
index 4f66176..a7cd49e 100644
--- a/tests/hard_coded/Mmakefile
+++ b/tests/hard_coded/Mmakefile
@@ -157,6 +157,7 @@ ORDINARY_PROGS=	\
 	intermod_unused_args \
 	java_rtti_bug \
 	join_list \
+	lco_double \
 	lco_mday_bug_1 \
 	lco_mday_bug_2 \
 	lco_no_inline \
@@ -199,6 +200,7 @@ ORDINARY_PROGS=	\
 	opt_format \
 	pack_args \
 	pack_args_copy \
+	pack_args_float \
 	pack_args_intermod1 \
 	ppc_bug \
 	pprint_test \
diff --git a/tests/hard_coded/lco_double.exp b/tests/hard_coded/lco_double.exp
new file mode 100644
index 0000000..239405a
--- /dev/null
+++ b/tests/hard_coded/lco_double.exp
@@ -0,0 +1 @@
+thing(enum1, enum1, thing(enum2, enum2, thing(enum3, enum3, nil, 3.4567), 2.3456), 1.2345)
diff --git a/tests/hard_coded/lco_double.m b/tests/hard_coded/lco_double.m
new file mode 100644
index 0000000..447f749
--- /dev/null
+++ b/tests/hard_coded/lco_double.m
@@ -0,0 +1,40 @@
+%-----------------------------------------------------------------------------%
+
+:- module lco_double.
+:- interface.
+
+:- import_module io.
+
+:- pred main(io::di, io::uo) is det.
+
+%-----------------------------------------------------------------------------%
+%-----------------------------------------------------------------------------%
+
+:- implementation.
+
+:- import_module list.
+:- import_module pair.
+
+:- type thing
+    --->    thing(enum, enum, thing, float)
+    ;       nil.
+
+:- type enum
+    --->    enum1
+    ;       enum2
+    ;       enum3.
+
+:- pred gen(list(pair(enum, float))::in, thing::out) is det.
+
+gen([], nil).
+gen([E - F | Xs], T) :-
+    gen(Xs, Tail),
+    T = thing(E, E, Tail, F).
+
+main(!IO) :-
+    gen([enum1 - 1.2345, enum2 - 2.3456, enum3 - 3.4567], T),
+    io.write(T, !IO),
+    io.nl(!IO).
+
+%-----------------------------------------------------------------------------%
+% vim: ft=mercury ts=4 sts=4 sw=4 et
diff --git a/tests/hard_coded/pack_args_float.exp b/tests/hard_coded/pack_args_float.exp
new file mode 100644
index 0000000..4b7ab83
--- /dev/null
+++ b/tests/hard_coded/pack_args_float.exp
@@ -0,0 +1,2 @@
+ant, bat, 1.1, 2.2, cat, dog, eel, fox
+pug, owl, 101.101, 202.202, newt, moa, lark, jay
diff --git a/tests/hard_coded/pack_args_float.m b/tests/hard_coded/pack_args_float.m
new file mode 100644
index 0000000..1b9e0f5
--- /dev/null
+++ b/tests/hard_coded/pack_args_float.m
@@ -0,0 +1,77 @@
+%-----------------------------------------------------------------------------%
+
+:- module pack_args_float.
+:- interface.
+
+:- import_module io.
+
+:- pred main(io::di, io::uo) is det.
+
+%-----------------------------------------------------------------------------%
+%-----------------------------------------------------------------------------%
+
+:- implementation.
+
+:- type animal
+    --->    ant         % 0
+    ;       bat         % 1
+    ;       cat         % 2
+    ;       dog         % 3
+    ;       eel         % 4
+    ;       fox         % 5
+    ;       gnu         % 6
+    ;       hog         % 7
+    ;       ibis        % 8
+    ;       jay         % 9
+    ;       kea         % 10
+    ;       lark        % 11
+    ;       moa         % 12
+    ;       newt        % 13
+    ;       owl         % 14
+    ;       pug.        % 15
+
+:- type struct
+    --->    struct(
+                animal, animal,                 % word 0    | word 0
+                float,                          % word 1    | word 1+2
+                float,                          % word 2    | word 3+4
+                animal, animal, animal, animal  % word 3    | word 5
+            ).
+
+%-----------------------------------------------------------------------------%
+
+main(!IO) :-
+    Static = struct(ant, bat, 1.1, 2.2, cat, dog, eel, fox),
+    write_struct(Static, !IO),
+    io.nl(!IO),
+
+    Dynamic = struct(ani(pug), ani(owl), 101.101, 202.202,
+        ani(newt), ani(moa), lark, jay),
+    write_struct(Dynamic, !IO),
+    io.nl(!IO).
+
+:- func ani(animal) = animal.
+:- pragma no_inline(ani/1).
+
+ani(X) = X.
+
+:- pred write_struct(struct::in, io::di, io::uo) is det.
+:- pragma no_inline(write_struct/3).
+
+write_struct(struct(A, B, X, Y, C, D, E, F), !IO) :-
+    write_animal(A, !IO), write_string(", ", !IO),
+    write_animal(B, !IO), write_string(", ", !IO),
+    write_float(X, !IO),  write_string(", ", !IO),
+    write_float(Y, !IO),  write_string(", ", !IO),
+    write_animal(C, !IO), write_string(", ", !IO),
+    write_animal(D, !IO), write_string(", ", !IO),
+    write_animal(E, !IO), write_string(", ", !IO),
+    write_animal(F, !IO).
+
+:- pred write_animal(animal::in, io::di, io::uo) is det.
+
+write_animal(Animal, !IO) :-
+    write(Animal, !IO).
+
+%-----------------------------------------------------------------------------%
+% vim: ft=mercury ts=4 sts=4 sw=4 et
diff --git a/trace/mercury_trace_vars.c b/trace/mercury_trace_vars.c
index d7300aa..bd39673 100644
--- a/trace/mercury_trace_vars.c
+++ b/trace/mercury_trace_vars.c
@@ -1871,7 +1871,8 @@ MR_select_specified_subterm(char *path, MR_TypeInfo type_info, MR_Word *value,
                 MR_Word storage;
 
                 MR_incr_hp(storage, 1);
-                ((MR_Word *) storage)[0] = MR_unpack_arg(*new_value, arg_locn);
+                ((MR_Word *) storage)[0] = MR_arg_value(new_value,
+                    arg_locn);
                 value = (MR_Word *) storage;
             }
         } else {

--------------------------------------------------------------------------
mercury-reviews mailing list
Post messages to:       mercury-reviews at csse.unimelb.edu.au
Administrative Queries: owner-mercury-reviews at csse.unimelb.edu.au
Subscriptions:          mercury-reviews-request at csse.unimelb.edu.au
--------------------------------------------------------------------------
Previous message: [m-rev.] diff: more set_of_var interoperability improvements
Next message: [m-rev.] for review: support using clang as a C compiler with Mercury
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the reviews mailing list