[m-rev.] for review: float registers for higher order calls

Peter Wang novalazy at gmail.com
Fri Oct 14 16:58:09 AEDT 2011


Branches: main

Allow the use of Mercury abstract machine float registers for passing
double-precision float arguments in higher order calls.

In of itself this is not so useful for typical Mercury code.  However, as
all non-local procedures are potentially the targets of higher order calls,
without this change first order calls to non-local procedures could not use
float registers either.  That is the actual motivation for this change.

The basic mechanism is straightforward.  As before, do_call_closure_* is
invoked to place the closure's hidden arguments into r1, ..., rN, and extra
input arguments shifted into rN+1, etc.  With float registers, extra input
arguments may also be in f1, f2, etc. and the closure may also have hidden
float arguments.  Optimising for calls, we order the closure's hidden
arguments so that all float register arguments come after all regular
register arguments in the vector.  Having the arguments out of order does
complicate code which needs to deconstruct closures, but that is not so
important.

Polymorphism complicates things.  A closure with type pred(float) may be
passed to a procedure expecting pred(T).  Due to the `float' argument type,
the closure expects its argument in a float register.  But when passed to the
procedure, the polymorphic argument type means it would be called with the
argument in a regular register.

Higher-order insts already contain information about the calling convention,
without which a higher-order term cannot be called.  We extend higher-order
insts to include information about the register class required for each
argument.  For example, we can distinguish between:

	pred(in) is semidet /* arg regs: [reg_f] */
and
	pred(in) is semidet /* arg regs: [reg_r] */

Using this information, we can create a wrapper around a higher-order
variable if it appears in a context requiring a different calling convention.
We do this in a new HLDS pass, called float_regs.m.

Note: Mercury code has a tendency to lose insts for higher-order terms, then
"recover" them by hacky means.  The float_regs pass depends on higher-order
insts; it is impossible to create a wrapper for a procedure without knowing
how to call it.  The float_regs pass will report errors which we otherwise
accepted, due to higher-order insts being unavailable.  It should be possible
for the user to adjust the code to satify the pass, though the user may not
understand why it should be necessary.  In most cases, it probably really
*is* unnecessary.  We may be able to make the float_regs pass more tolerant
of missing higher-order insts in the future.

Class method calls do not use float registers because I didn't want to deal
with them yet.


compiler/options.m:
compiler/handle_options.m:
	Always enable float registers on low-level C grades when floats are
	wider than a word.

compiler/make_hlds_passes.m:
	Always allow double word floats to be stored unboxed in cells on C
	grades.

compiler/hlds_goal.m:
	Add an extra field to `generic_call' which gives the register class
	to use for each argument.  This is set by the float_regs pass.

compiler/prog_data.m:
	Add an extra field to `pred_inst_info' which records the register class
	to use for each argument.  This is set by the float_regs pass.

compiler/hlds_pred.m:
	Add a field to `proc_sub_info' which lists the headvars which must be
	passed via regular registers despite their types.

	Add a field to `pred_sub_info' to record the original unsubstituted
	argument types for instance method predicates.

compiler/check_typeclass.m:
	In the pred_info of an instance method predicate, record the original
	argument types before substiting the type variables for the instance.

compiler/float_regs.m:
compiler/transform_hlds.m:
	Add the new HLDS pass.

compiler/mercury_compile_middle_passes.m:
	Run the new pass if float registers are enabled.

compiler/lambda.m:
	Export the predicate to produce a predicate from a lambda.
	This is reused by float_regs.m to create wrapper closures.

	Add an argument to `expand_lambda' to set the reg_r_headvars field on
	the newly created procedure.

	Delete some unused fields from `lambda_info'.

compiler/arg_info.m:
	Make `generate_proc_arg_info' no longer always use regular registers
	for calls to exported procedures.  Do always use regular registers for
	class methods calls.

	Add a version of `make_arg_infos' which takes an explicit list of
	argument registers.  Rename the previous version.

	Add `generic_call_arg_reg_types' to return the argument registers
	for a generic call.

	Add a version of `compute_in_and_out_vars' which additionally separates
	arguments for float and regular registers.

compiler/call_gen.m:
	Use float registers for argument passing in higher-order calls, as
	directed by the new field in `generic_call'.

compiler/code_util.m:
	Add a function to encode the number of regular and float register
	arguments when making a higher-order call.

compiler/llds.m:
	Say that the `do_call_closure_N' functions only work for zero float
	register arguments.

compiler/follow_vars.m:
compiler/interval.m:
	Account for the use of float registers by generic call goals in these
	passes.

compiler/unify_gen.m:
	Move float register arguments to the end of a closure's hidden
	arguments vector, after regular register arguments.

	Count hidden regular and float register arguments separately, but
	encode them in the same word in the closure.  This is preferable to
	using two words because it reduces the differences between grades
	with and without float registers present.

	Disable generating code which creates a closure from an existing
	closure, if float registers exist.  That code does not understand the
	reordered hidden arguments vector yet.

compiler/continuation_info.m:
	Replace an argument's type_info in the closure layout if the argument
	is a float *and* is passed via a regular register, when floats are
	normally passed via float registers.  Instead, give it the type_info
	for `private_builtin.float_box'.

compiler/builtin_lib_types.m:
	Add function to return the type of `private_builtin.float_box/0'.

compiler/hlds_out_goal.m:
compiler/hlds_out_pred.m:
compiler/mercury_to_mercury.m:
	Dump the new fields added to `generic_call', `pred_inst_info' and
	`proc_sub_info'.

compiler/prog_type.m:
	Add helper predicate.

compiler/*.m:
	Conform to changes.

library/private_builtin.m:
	Add a type `float_box'.

runtime/mercury_ho_call.h:
	Describe the modified closure representation.

	Rename the field which counts the number of hidden arguments to prevent
	it being used incorrectly, as it now encodes two numbers (potentially).

	Add macros to unpack the encoded field.

runtime/mercury_ho_call.c:
	Update the description of how higher-order calls work.

	Update code which extracts closure arguments to take account the
	arguments being reordered in the hidden arguments vector.

runtime/mercury_deep_copy.c:
runtime/mercury_deep_copy_body.h:
runtime/mercury_layout_util.c:
runtime/mercury_ml_expand_body.h:
	Update code which extracts closure arguments to take account the
	arguments being reordered in the hidden arguments vector.

runtime/mercury_type_info.c:
runtime/mercury_type_info.h:
	Add helper function.

tools/make_spec_ho_call:
	Update the generated do_call_closure_* functions to place float
	register arguments.

tests/hard_coded/Mercury.options:
tests/hard_coded/Mmakefile:
tests/hard_coded/ho_float_reg.exp:
tests/hard_coded/ho_float_reg.m:
	Add new test case.

tests/hard_coded/copy_pred.exp:
tests/hard_coded/copy_pred.m:
tests/hard_coded/deconstruct_arg.exp:
tests/hard_coded/deconstruct_arg.exp2:
tests/hard_coded/deconstruct_arg.m:
	Extend test cases with float arguments in closures.

tests/debugger/higher_order.exp2:
	Add alternative output, changed due to closure wrapping.

tests/hard_coded/ho_univ_to_type.m:
	Adjust test case so that the float_regs pass does not report errors
	about missing higher-order insts.

compiler/notes/compiler_design.html:
	Describe the new module.

	Delete a duplicated paragraph.

compiler/notes/todo.html:
TODO:
	Delete one hundred billion year old todos.


diff --git a/compiler/add_heap_ops.m b/compiler/add_heap_ops.m
index c3813db..44466a2 100644
--- a/compiler/add_heap_ops.m
+++ b/compiler/add_heap_ops.m
@@ -228,7 +228,7 @@ goal_expr_add_heap_ops(GoalExpr0, GoalInfo0, Goal, !Info) :-
         Goal = hlds_goal(GoalExpr, GoalInfo0)
     ;
         ( GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ),
         Goal = hlds_goal(GoalExpr0, GoalInfo0)
diff --git a/compiler/add_pragma.m b/compiler/add_pragma.m
index 540b3f2..5bdc862 100644
--- a/compiler/add_pragma.m
+++ b/compiler/add_pragma.m
@@ -3674,8 +3674,8 @@ match_ho_inst_infos_with_renaming(ModuleInfo, HOInstInfoA, HOInstInfoB,
     ;
         HOInstInfoA = higher_order(PredInstInfoA),
         HOInstInfoB = higher_order(PredInstInfoB),
-        PredInstInfoA = pred_inst_info(PredOrFunc, ModesA, Detism),
-        PredInstInfoB = pred_inst_info(PredOrFunc, ModesB, Detism),
+        PredInstInfoA = pred_inst_info(PredOrFunc, ModesA, _, Detism),
+        PredInstInfoB = pred_inst_info(PredOrFunc, ModesB, _, Detism),
         mode_list_matches_with_renaming(ModesA, ModesB, Renaming, ModuleInfo)
     ).
 
diff --git a/compiler/add_pred.m b/compiler/add_pred.m
index d6f1b53..6de4ef2 100644
--- a/compiler/add_pred.m
+++ b/compiler/add_pred.m
@@ -295,7 +295,7 @@ add_builtin(PredId, Types, CompilationTarget, !PredInfo) :-
         AssignGoal = hlds_goal(AssignExpr, GoalInfoWithZero),
 
         CastExpr = generic_call(cast(unsafe_type_inst_cast),
-            [ZeroVar] ++ HeadVarList, [in_mode, uo_mode], detism_det),
+            [ZeroVar] ++ HeadVarList, [in_mode, uo_mode], no, detism_det),
         goal_info_set_nonlocals(
             set_of_var.list_to_set([ZeroVar | HeadVarList]),
             GoalInfo0, GoalInfoWithZeroHeadVars),
diff --git a/compiler/add_trail_ops.m b/compiler/add_trail_ops.m
index 8e5c9f0..0648f8b 100644
--- a/compiler/add_trail_ops.m
+++ b/compiler/add_trail_ops.m
@@ -337,7 +337,7 @@ goal_expr_add_trail_ops(GoalExpr0, GoalInfo0, Goal, !Info) :-
         Goal = hlds_goal(GoalExpr, GoalInfo0)
     ;
         ( GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ),
         Goal = hlds_goal(GoalExpr0, GoalInfo0)
diff --git a/compiler/arg_info.m b/compiler/arg_info.m
index 0c06ac2..4d0d63d 100644
--- a/compiler/arg_info.m
+++ b/compiler/arg_info.m
@@ -27,6 +27,7 @@
 :- interface.
 
 :- import_module hlds.code_model.
+:- import_module hlds.hlds_goal.
 :- import_module hlds.hlds_llds.
 :- import_module hlds.hlds_module.
 :- import_module hlds.hlds_pred.
@@ -34,6 +35,7 @@
 
 :- import_module assoc_list.
 :- import_module list.
+:- import_module maybe.
 :- import_module set.
 
 %-----------------------------------------------------------------------------%
@@ -46,22 +48,48 @@
     % Annotate a single procedure with information
     % about its argument passing interface.
     %
-:- pred generate_proc_arg_info(import_status::in, list(mer_type)::in,
+:- pred generate_proc_arg_info(pred_markers::in, list(mer_type)::in,
     module_info::in, proc_info::in, proc_info::out) is det.
 
     % Given the list of types and modes of the arguments of a procedure
-    % and its code model, return its argument passing interface.
+    % and its code model, return the standard argument passing interface,
+    % i.e. pass float arguments via float registers if present, and all
+    % other arguments via regular registers.
     %
-:- pred make_arg_infos(list(mer_type)::in, list(mer_mode)::in, code_model::in,
-    module_info::in, reg_type::in, list(arg_info)::out) is det.
+:- pred make_standard_arg_infos(list(mer_type)::in, list(mer_mode)::in,
+    code_model::in, module_info::in, list(arg_info)::out) is det.
+
+    % As above, but pass the register type for each argument explicitly.
+    % This is necessary for procedures with float arguments that must be
+    % passed via the regular registers instead of float registers.
+    %
+:- pred make_arg_infos(list(mer_type)::in, list(mer_mode)::in,
+    list(reg_type)::in, code_model::in, module_info::in,
+    list(arg_info)::out) is det.
+
+    % Return the register type to use for each argument of a generic call.
+    %
+:- pred generic_call_arg_reg_types(module_info::in, vartypes::in,
+    generic_call::in, list(prog_var)::in, maybe(list(ho_arg_reg))::in,
+    list(reg_type)::out) is det.
 
     % Divide the given list of arguments into those treated as inputs
     % by the calling convention and those treated as outputs.
     %
-:- pred compute_in_and_out_vars(module_info::in,
-    list(prog_var)::in, list(mer_mode)::in, list(mer_type)::in,
+:- pred compute_in_and_out_vars(module_info::in, list(prog_var)::in,
+    list(mer_mode)::in, list(mer_type)::in,
     list(prog_var)::out, list(prog_var)::out) is det.
 
+    % Divide the given list of arguments into those treated as inputs
+    % by the calling convention and those treated as outputs.
+    % Arguments are further divided those passed via regular registers
+    % or via float registers.
+    %
+:- pred compute_in_and_out_vars_sep_regs(module_info::in, list(prog_var)::in,
+    list(mer_mode)::in, list(mer_type)::in, list(reg_type)::in,
+    list(prog_var)::out, list(prog_var)::out, list(prog_var)::out,
+    list(prog_var)::out) is det.
+
     % Divide the given list of arguments and the arg_infos into three lists:
     % the inputs, the outputs, and the unused arguments, in that order.
     %
@@ -114,6 +142,7 @@
 :- import_module libs.globals.
 :- import_module libs.options.
 :- import_module parse_tree.builtin_lib_types.
+:- import_module parse_tree.set_of_var.
 
 :- import_module bool.
 :- import_module map.
@@ -150,11 +179,11 @@ generate_proc_list_arg_info(PredId, [ProcId | ProcIds], !ModuleInfo) :-
     module_info_get_preds(!.ModuleInfo, PredTable0),
     map.lookup(PredTable0, PredId, PredInfo0),
     pred_info_get_procedures(PredInfo0, ProcTable0),
-    pred_info_get_import_status(PredInfo0, Status),
+    pred_info_get_markers(PredInfo0, Markers),
     pred_info_get_arg_types(PredInfo0, ArgTypes),
     map.lookup(ProcTable0, ProcId, ProcInfo0),
 
-    generate_proc_arg_info(Status, ArgTypes, !.ModuleInfo,
+    generate_proc_arg_info(Markers, ArgTypes, !.ModuleInfo,
         ProcInfo0, ProcInfo),
 
     map.det_update(ProcId, ProcInfo, ProcTable0, ProcTable),
@@ -163,40 +192,39 @@ generate_proc_list_arg_info(PredId, [ProcId | ProcIds], !ModuleInfo) :-
     module_info_set_preds(PredTable, !ModuleInfo),
     generate_proc_list_arg_info(PredId, ProcIds, !ModuleInfo).
 
-generate_proc_arg_info(Status, ArgTypes, ModuleInfo, !ProcInfo) :-
-    proc_info_get_argmodes(!.ProcInfo, ArgModes),
-    proc_info_get_is_address_taken(!.ProcInfo, AddressTaken),
-    CodeModel = proc_info_interface_code_model(!.ProcInfo),
+generate_proc_arg_info(Markers, ArgTypes, ModuleInfo, !ProcInfo) :-
+    module_info_get_globals(ModuleInfo, Globals),
+    globals.lookup_bool_option(Globals, use_float_registers, UseFloatRegs),
+    proc_info_get_headvars(!.ProcInfo, HeadVars),
     (
-        % If the procedure may be invoked via a higher order call, we cannot
-        % use the float registers. In the future we will want to introduce
-        % wrappers, as in MLDS grades, which unbox their arguments and call
-        % the underlying procedure which does use float registers.
-        (
-            Status \= status_local
-        ;
-            AddressTaken = address_is_taken
-        )
+        UseFloatRegs = yes,
+        % XXX we don't yet use float registers for class method calls
+        not check_marker(Markers, marker_class_instance_method)
     ->
-        FloatRegType = reg_r
+        proc_info_get_reg_r_headvars(!.ProcInfo, RegR_HeadVars),
+        list.map_corresponding(reg_type_for_headvar(RegR_HeadVars),
+            HeadVars, ArgTypes, ArgRegTypes)
     ;
-        reg_type_for_float(ModuleInfo, FloatRegType)
+        list.duplicate(list.length(HeadVars), reg_r, ArgRegTypes)
     ),
-    make_arg_infos(ArgTypes, ArgModes, CodeModel, ModuleInfo, FloatRegType,
+    proc_info_get_argmodes(!.ProcInfo, ArgModes),
+    CodeModel = proc_info_interface_code_model(!.ProcInfo),
+    make_arg_infos(ArgTypes, ArgModes, ArgRegTypes, CodeModel, ModuleInfo,
         ArgInfo),
     proc_info_set_arg_info(ArgInfo, !ProcInfo).
 
-:- pred reg_type_for_float(module_info::in, reg_type::out) is det.
+:- pred reg_type_for_headvar(set_of_progvar::in, prog_var::in, mer_type::in,
+    reg_type::out) is det.
 
-reg_type_for_float(ModuleInfo, FloatRegType) :-
-    module_info_get_globals(ModuleInfo, Globals),
-    globals.lookup_bool_option(Globals, use_float_registers, UseFloatRegs),
-    (
-        UseFloatRegs = yes,
-        FloatRegType = reg_f
+reg_type_for_headvar(RegR_HeadVars, HeadVar, Type, RegType) :-
+    ( set_of_var.contains(RegR_HeadVars, HeadVar) ->
+        RegType = reg_r
     ;
-        UseFloatRegs = no,
-        FloatRegType = reg_r
+        ( Type = float_type ->
+            RegType = reg_f
+        ;
+            RegType = reg_r
+        )
     ).
 
 %---------------------------------------------------------------------------%
@@ -221,7 +249,31 @@ reg_type_for_float(ModuleInfo, FloatRegType) :-
     % module implement this decision consistently. (No code outside this module
     % should know about the outcome of this decision.)
     %
-make_arg_infos(ArgTypes, ArgModes, CodeModel, ModuleInfo, FloatRegType,
+make_standard_arg_infos(ArgTypes, ArgModes, CodeModel, ModuleInfo, ArgInfo) :-
+    module_info_get_globals(ModuleInfo, Globals),
+    globals.lookup_bool_option(Globals, use_float_registers, FloatRegs),
+    (
+        FloatRegs = yes,
+        FloatRegType = reg_f
+    ;
+        FloatRegs = no,
+        FloatRegType = reg_r
+    ),
+    list.map(standard_reg_type_for_type(FloatRegType), ArgTypes, RegTypes),
+    make_arg_infos(ArgTypes, ArgModes, RegTypes, CodeModel, ModuleInfo,
+        ArgInfo).
+
+:- pred standard_reg_type_for_type(reg_type::in, mer_type::in, reg_type::out)
+    is det.
+
+standard_reg_type_for_type(FloatRegType, Type, RegType) :-
+    ( Type = float_type ->
+        RegType = FloatRegType
+    ;
+        RegType = reg_r
+    ).
+
+make_arg_infos(ArgTypes, ArgModes, ArgRegTypes, CodeModel, ModuleInfo,
         ArgInfo) :-
     (
         CodeModel = model_semi,
@@ -235,23 +287,25 @@ make_arg_infos(ArgTypes, ArgModes, CodeModel, ModuleInfo, FloatRegType,
     FirstInRegR = 1,
     FirstInRegF = 1,
     FirstOutRegF = 1,
-    make_arg_infos_list(ArgModes, ArgTypes, FirstInRegR, FirstInRegF,
-        FirstOutRegR, FirstOutRegF, ModuleInfo, FloatRegType, ArgInfo).
+    (
+        make_arg_infos(ArgModes, ArgTypes, ArgRegTypes,
+            FirstInRegR, FirstInRegF, FirstOutRegR, FirstOutRegF,
+            ModuleInfo, ArgInfoPrime)
+    ->
+        ArgInfo = ArgInfoPrime
+    ;
+        unexpected($module, $pred, "length mismatch")
+    ).
 
-:- pred make_arg_infos_list(list(mer_mode)::in, list(mer_type)::in,
-    int::in, int::in, int::in, int::in, module_info::in, reg_type::in,
-    list(arg_info)::out) is det.
+:- pred make_arg_infos(list(mer_mode)::in, list(mer_type)::in,
+    list(reg_type)::in, int::in, int::in, int::in, int::in, module_info::in,
+    list(arg_info)::out) is semidet.
 
-make_arg_infos_list([], [], _, _, _, _, _, _, []).
-make_arg_infos_list([Mode | Modes], [Type | Types],
-        !.InRegR, !.InRegF, !.OutRegR, !.OutRegF,
-        ModuleInfo, FloatRegType, [ArgInfo | ArgInfos]) :-
+make_arg_infos([], [], [], _, _, _, _, _, []).
+make_arg_infos([Mode | Modes], [Type | Types], [RegType | RegTypes],
+        !.InRegR, !.InRegF, !.OutRegR, !.OutRegF, ModuleInfo,
+        [ArgInfo | ArgInfos]) :-
     mode_to_arg_mode(ModuleInfo, Mode, Type, ArgMode),
-    ( Type = float_type ->
-        RegType = FloatRegType
-    ;
-        RegType = reg_r
-    ),
     (
         ArgMode = top_in,
         get_arg_loc(RegType, ArgLoc, !InRegR, !InRegF)
@@ -262,12 +316,8 @@ make_arg_infos_list([Mode | Modes], [Type | Types],
         get_arg_loc(RegType, ArgLoc, !OutRegR, !OutRegF)
     ),
     ArgInfo = arg_info(ArgLoc, ArgMode),
-    make_arg_infos_list(Modes, Types, !.InRegR, !.InRegF, !.OutRegR, !.OutRegF,
-        ModuleInfo, FloatRegType, ArgInfos).
-make_arg_infos_list([], [_|_], _, _, _, _, _, _, _) :-
-    unexpected($module, $pred, "length mismatch").
-make_arg_infos_list([_|_], [], _, _, _, _, _, _, _) :-
-    unexpected($module, $pred, "length mismatch").
+    make_arg_infos(Modes, Types, RegTypes, !.InRegR, !.InRegF,
+        !.OutRegR, !.OutRegF, ModuleInfo, ArgInfos).
 
 :- pred get_arg_loc(reg_type::in, arg_loc::out, int::in, int::out,
     int::in, int::out) is det.
@@ -283,13 +333,47 @@ get_arg_loc(RegType, ArgLoc, !RegR, !RegF) :-
         !:RegF = !.RegF + 1
     ).
 
-%---------------------------------------------------------------------------%
+%-----------------------------------------------------------------------------%
 
-compute_in_and_out_vars(ModuleInfo, Vars, Modes, Types,
-        !:InVars, !:OutVars) :-
+generic_call_arg_reg_types(ModuleInfo, _VarTypes, GenericCall, ArgVars,
+        MaybeArgRegs, ArgRegTypes) :-
+    (
+        GenericCall = higher_order(_, _, _, _),
+        module_info_get_globals(ModuleInfo, Globals),
+        globals.lookup_bool_option(Globals, use_float_registers, UseFloatRegs),
+        (
+            UseFloatRegs = no,
+            list.duplicate(length(ArgVars), reg_r, ArgRegTypes)
+        ;
+            UseFloatRegs = yes,
+            MaybeArgRegs = yes(ArgRegs),
+            ArgRegTypes = list.map(arg_reg_to_reg_type, ArgRegs)
+        ;
+            UseFloatRegs = yes,
+            MaybeArgRegs = no,
+            % This should have been set by the float register wrapper pass.
+            unexpected($module, $pred, "missing ho_arg_regs")
+        )
+    ;
+        % We don't yet use float registers for class method calls.
+        ( GenericCall = class_method(_, _, _, _)
+        ; GenericCall = event_call(_)
+        ; GenericCall = cast(_)
+        ),
+        list.duplicate(length(ArgVars), reg_r, ArgRegTypes)
+    ).
+
+:- func arg_reg_to_reg_type(ho_arg_reg) = reg_type.
+
+arg_reg_to_reg_type(ho_arg_reg_r) = reg_r.
+arg_reg_to_reg_type(ho_arg_reg_f) = reg_f.
+
+%-----------------------------------------------------------------------------%
+
+compute_in_and_out_vars(ModuleInfo, Vars, Modes, Types, !:InVars, !:OutVars) :-
     (
         compute_in_and_out_vars_2(ModuleInfo, Vars, Modes, Types,
-            !:InVars, !:OutVars)
+            [], !:InVars, [], !:OutVars)
     ->
         true
     ;
@@ -298,13 +382,14 @@ compute_in_and_out_vars(ModuleInfo, Vars, Modes, Types,
 
 :- pred compute_in_and_out_vars_2(module_info::in,
     list(prog_var)::in, list(mer_mode)::in, list(mer_type)::in,
-    list(prog_var)::out, list(prog_var)::out) is semidet.
-
-compute_in_and_out_vars_2(_ModuleInfo, [], [], [], [], []).
-compute_in_and_out_vars_2(ModuleInfo, [Var | Vars],
-        [Mode | Modes], [Type | Types], !:InVars, !:OutVars) :-
-    compute_in_and_out_vars_2(ModuleInfo, Vars,
-        Modes, Types, !:InVars, !:OutVars),
+    list(prog_var)::in, list(prog_var)::out,
+    list(prog_var)::in, list(prog_var)::out) is semidet.
+
+compute_in_and_out_vars_2(_ModuleInfo, [], [], [], !InVars, !OutVars).
+compute_in_and_out_vars_2(ModuleInfo, [Var | Vars], [Mode | Modes],
+        [Type | Types], !InVars, !OutVars) :-
+    compute_in_and_out_vars_2(ModuleInfo, Vars, Modes, Types,
+        !InVars, !OutVars),
     mode_to_arg_mode(ModuleInfo, Mode, Type, ArgMode),
     (
         ArgMode = top_in,
@@ -316,6 +401,54 @@ compute_in_and_out_vars_2(ModuleInfo, [Var | Vars],
         !:OutVars = [Var | !.OutVars]
     ).
 
+%-----------------------------------------------------------------------------%
+
+compute_in_and_out_vars_sep_regs(ModuleInfo, Vars, Modes, Types, ArgRegTypes,
+        !:InVarsR, !:InVarsF, !:OutVarsR, !:OutVarsF) :-
+    (
+        compute_in_and_out_vars_sep_regs_2(ModuleInfo, Vars, Modes, Types,
+            ArgRegTypes, !:InVarsR, !:InVarsF, !:OutVarsR, !:OutVarsF)
+    ->
+        true
+    ;
+        unexpected($module, $pred, "length mismatch")
+    ).
+
+:- pred compute_in_and_out_vars_sep_regs_2(module_info::in,
+    list(prog_var)::in, list(mer_mode)::in, list(mer_type)::in,
+    list(reg_type)::in, list(prog_var)::out, list(prog_var)::out,
+    list(prog_var)::out, list(prog_var)::out) is semidet.
+
+compute_in_and_out_vars_sep_regs_2(_ModuleInfo,
+        [], [], [], [], [], [], [], []).
+compute_in_and_out_vars_sep_regs_2(ModuleInfo,
+        [Var | Vars], [Mode | Modes], [Type | Types], [RegType | RegTypes],
+        !:InVarsR, !:InVarsF, !:OutVarsR, !:OutVarsF) :-
+    compute_in_and_out_vars_sep_regs_2(ModuleInfo, Vars, Modes, Types,
+        RegTypes, !:InVarsR, !:InVarsF, !:OutVarsR, !:OutVarsF),
+    mode_to_arg_mode(ModuleInfo, Mode, Type, ArgMode),
+    (
+        ArgMode = top_in,
+        (
+            RegType = reg_r,
+            !:InVarsR = [Var | !.InVarsR]
+        ;
+            RegType = reg_f,
+            !:InVarsF = [Var | !.InVarsF]
+        )
+    ;
+        ( ArgMode = top_out
+        ; ArgMode = top_unused
+        ),
+        (
+            RegType = reg_r,
+            !:OutVarsR = [Var | !.OutVarsR]
+        ;
+            RegType = reg_f,
+            !:OutVarsF = [Var | !.OutVarsF]
+        )
+    ).
+
 %---------------------------------------------------------------------------%
 
 partition_args(Args, Ins, Outs) :-
diff --git a/compiler/assertion.m b/compiler/assertion.m
index 1844c33..0e1e6fd 100644
--- a/compiler/assertion.m
+++ b/compiler/assertion.m
@@ -512,8 +512,8 @@ equal_goal_exprs(GoalExprA, GoalExprB, !Subst) :-
         GoalExprB = plain_call(PredId, _, ArgVarsB, _, _, _),
         equal_vars(ArgVarsA, ArgVarsB, !Subst)
     ;
-        GoalExprA = generic_call(CallDetails, ArgVarsA, _, _),
-        GoalExprB = generic_call(CallDetails, ArgVarsB, _, _),
+        GoalExprA = generic_call(CallDetails, ArgVarsA, _, _, _),
+        GoalExprB = generic_call(CallDetails, ArgVarsB, _, _, _),
         equal_vars(ArgVarsA, ArgVarsB, !Subst)
     ;
         GoalExprA = switch(Var, CanFail, CasesA),
@@ -679,7 +679,7 @@ normalise_goal(Goal0, Goal) :-
 normalise_goal_expr(GoalExpr0, GoalExpr) :-
     (
         ( GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
diff --git a/compiler/build_mode_constraints.m b/compiler/build_mode_constraints.m
index 53cd6b0..2482151 100644
--- a/compiler/build_mode_constraints.m
+++ b/compiler/build_mode_constraints.m
@@ -324,7 +324,7 @@ add_mc_vars_for_goal(PredId, ProgVarset, Goal, !VarInfo) :-
         )
     ;
         ( GoalExpr = plain_call(_, _, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ; GoalExpr = unify(_, _, _, _, _)
         )
@@ -459,7 +459,7 @@ add_goal_expr_constraints(ModuleInfo, ProgVarset, PredId, GoalExpr,
                 !Constraints)
         )
     ;
-        GoalExpr = generic_call(Details, _, _, _),
+        GoalExpr = generic_call(Details, _, _, _, _),
         % XXX Need to do something here.
         (
             % XXX Need to do something here.
diff --git a/compiler/builtin_lib_types.m b/compiler/builtin_lib_types.m
index 0d4f468..0f7e5c3 100644
--- a/compiler/builtin_lib_types.m
+++ b/compiler/builtin_lib_types.m
@@ -31,6 +31,7 @@
 :- func void_type = mer_type.
 :- func c_pointer_type = mer_type.
 :- func heap_pointer_type = mer_type.
+:- func float_box_type = mer_type.
 :- func sample_type_info_type = mer_type.
 :- func sample_typeclass_info_type = mer_type.
 :- func type_info_type = mer_type.
@@ -132,6 +133,10 @@ heap_pointer_type = defined_type(Name, [], kind_star) :-
     BuiltinModule = mercury_private_builtin_module,
     Name = qualified(BuiltinModule, "heap_pointer").
 
+float_box_type = defined_type(Name, [], kind_star) :-
+    BuiltinModule = mercury_private_builtin_module,
+    Name = qualified(BuiltinModule, "float_box").
+
 sample_type_info_type = defined_type(Name, [], kind_star) :-
     BuiltinModule = mercury_private_builtin_module,
     Name = qualified(BuiltinModule, "sample_type_info").
diff --git a/compiler/bytecode_gen.m b/compiler/bytecode_gen.m
index 168de21..ef008d8 100644
--- a/compiler/bytecode_gen.m
+++ b/compiler/bytecode_gen.m
@@ -212,7 +212,7 @@ gen_goal(hlds_goal(GoalExpr, GoalInfo), !ByteInfo, Code) :-
 gen_goal_expr(GoalExpr, GoalInfo, !ByteInfo, Code) :-
     (
         GoalExpr = generic_call(GenericCallType,
-            ArgVars, ArgModes, Detism),
+            ArgVars, ArgModes, _, Detism),
         (
             GenericCallType = higher_order(PredVar, _, _, _),
             gen_higher_order_call(PredVar, ArgVars, ArgModes, Detism,
@@ -366,10 +366,7 @@ gen_higher_order_call(PredVar, ArgVars, ArgModes, Detism, ByteInfo, Code) :-
     determinism_to_code_model(Detism, CodeModel),
     get_module_info(ByteInfo, ModuleInfo),
     list.map(get_var_type(ByteInfo), ArgVars, ArgTypes),
-    % Higher order calls use regular registers for all arguments.
-    FloatRegType = reg_r,
-    make_arg_infos(ArgTypes, ArgModes, CodeModel, ModuleInfo, FloatRegType,
-        ArgInfo),
+    make_standard_arg_infos(ArgTypes, ArgModes, CodeModel, ModuleInfo, ArgInfo),
     assoc_list.from_corresponding_lists(ArgVars, ArgInfo, ArgVarsInfos),
 
     arg_info.partition_args(ArgVarsInfos, InVars, OutVars),
diff --git a/compiler/call_gen.m b/compiler/call_gen.m
index 43e4ab0..da5f7a0 100644
--- a/compiler/call_gen.m
+++ b/compiler/call_gen.m
@@ -27,6 +27,7 @@
 
 :- import_module assoc_list.
 :- import_module list.
+:- import_module maybe.
 
 %---------------------------------------------------------------------------%
 
@@ -35,8 +36,9 @@
     code_info::in, code_info::out) is det.
 
 :- pred generate_generic_call(code_model::in, generic_call::in,
-    list(prog_var)::in, list(mer_mode)::in, determinism::in,
-    hlds_goal_info::in, llds_code::out, code_info::in, code_info::out) is det.
+    list(prog_var)::in, list(mer_mode)::in, maybe(list(ho_arg_reg))::in,
+    determinism::in, hlds_goal_info::in, llds_code::out,
+    code_info::in, code_info::out) is det.
 
 :- pred generate_builtin(code_model::in, pred_id::in, proc_id::in,
     list(prog_var)::in, llds_code::out, code_info::in, code_info::out) is det.
@@ -45,10 +47,11 @@
     --->    ho_call_known_num
     ;       ho_call_unknown.
 
-    % generic_call_info(Globals, GenericCall, NumImmediateInputArgs, CodeAddr,
-    %   SpecifierArgInfos, FirstImmediateInputReg, HoCallVariant).
+    % generic_call_info(Globals, GenericCall, NumImmediateInputArgsR,
+    %   NumImmediateInputArgsF, CodeAddr, SpecifierArgInfos,
+    %   FirstImmediateInputReg, HoCallVariant).
     %
-:- pred generic_call_info(globals::in, generic_call::in, int::in,
+:- pred generic_call_info(globals::in, generic_call::in, int::in, int::in,
     code_addr::out, assoc_list(prog_var, arg_info)::out, int::out,
     known_call_variant::out) is det.
 
@@ -79,6 +82,7 @@
 :- import_module bool.
 :- import_module cord.
 :- import_module int.
+:- import_module map.
 :- import_module maybe.
 :- import_module pair.
 :- import_module require.
@@ -157,8 +161,8 @@ generate_call(CodeModel, PredId, ProcId, ArgVars, GoalInfo, Code, !CI) :-
 
 %---------------------------------------------------------------------------%
 
-generate_generic_call(OuterCodeModel, GenericCall, Args, Modes, Det,
-        GoalInfo, Code, !CI) :-
+generate_generic_call(OuterCodeModel, GenericCall, Args, Modes,
+        MaybeRegTypes, Det, GoalInfo, Code, !CI) :-
     % For a generic_call, we split the arguments into inputs and outputs,
     % put the inputs in the locations expected by mercury.do_call_closure in
     % runtime/mercury_ho_call.c, generate the call to that code, and pick up
@@ -173,7 +177,7 @@ generate_generic_call(OuterCodeModel, GenericCall, Args, Modes, Det,
         ; GenericCall = class_method(_, _, _, _)
         ),
         generate_main_generic_call(OuterCodeModel, GenericCall, Args, Modes,
-            Det, GoalInfo, Code, !CI)
+            MaybeRegTypes, Det, GoalInfo, Code, !CI)
     ;
         GenericCall = event_call(EventName),
         generate_event_call(EventName, Args, GoalInfo, Code, !CI)
@@ -187,36 +191,45 @@ generate_generic_call(OuterCodeModel, GenericCall, Args, Modes, Det,
     ).
 
 :- pred generate_main_generic_call(code_model::in, generic_call::in,
-    list(prog_var)::in, list(mer_mode)::in, determinism::in,
-    hlds_goal_info::in, llds_code::out, code_info::in, code_info::out)
-    is det.
-
-generate_main_generic_call(_OuterCodeModel, GenericCall, Args, Modes, Det,
-        GoalInfo, Code, !CI) :-
-    Types = list.map(variable_type(!.CI), Args),
+    list(prog_var)::in, list(mer_mode)::in, maybe(list(ho_arg_reg))::in,
+    determinism::in, hlds_goal_info::in, llds_code::out,
+    code_info::in, code_info::out) is det.
 
+generate_main_generic_call(_OuterCodeModel, GenericCall, Args, Modes,
+        MaybeRegTypes, Det, GoalInfo, Code, !CI) :-
     get_module_info(!.CI, ModuleInfo),
-    arg_info.compute_in_and_out_vars(ModuleInfo, Args, Modes, Types,
-        InVars, OutVars),
+    VarTypes = get_var_types(!.CI),
+    map.apply_to_list(Args, VarTypes, Types),
+    arg_info.generic_call_arg_reg_types(ModuleInfo, VarTypes, GenericCall,
+        Args, MaybeRegTypes, ArgRegTypes),
+    arg_info.compute_in_and_out_vars_sep_regs(ModuleInfo, Args, Modes, Types,
+        ArgRegTypes, InVarsR, InVarsF, OutVarsR, OutVarsF),
     module_info_get_globals(ModuleInfo, Globals),
-    generic_call_info(Globals, GenericCall, length(InVars), CodeAddr,
-        SpecifierArgInfos, FirstImmInput, HoCallVariant),
+    generic_call_info(Globals, GenericCall, length(InVarsR), length(InVarsF),
+        CodeAddr, SpecifierArgInfos, FirstImmInputR, HoCallVariant),
+    FirstImmInputF = 1,
     determinism_to_code_model(Det, CodeModel),
     (
         CodeModel = model_semi,
-        FirstOutput = 2
+        FirstOutputR = 2
     ;
         ( CodeModel = model_det
         ; CodeModel = model_non
         ),
-        FirstOutput = 1
+        FirstOutputR = 1
     ),
-
-    give_vars_consecutive_arg_infos(InVars, reg_r, FirstImmInput, top_in,
-        InVarArgInfos),
-    give_vars_consecutive_arg_infos(OutVars, reg_r, FirstOutput, top_out,
-        OutArgsInfos),
-    ArgInfos = SpecifierArgInfos ++ InVarArgInfos ++ OutArgsInfos,
+    FirstOutputF = 1,
+
+    give_vars_consecutive_arg_infos(InVarsR, reg_r, FirstImmInputR, top_in,
+        InVarArgInfosR),
+    give_vars_consecutive_arg_infos(InVarsF, reg_f, FirstImmInputF, top_in,
+        InVarArgInfosF),
+    give_vars_consecutive_arg_infos(OutVarsR, reg_r, FirstOutputR, top_out,
+        OutArgsInfosR),
+    give_vars_consecutive_arg_infos(OutVarsF, reg_f, FirstOutputF, top_out,
+        OutArgsInfosF),
+    ArgInfos = list.condense([SpecifierArgInfos, InVarArgInfosR, InVarArgInfosF,
+        OutArgsInfosR, OutArgsInfosF]),
 
     % Save the necessary vars on the stack and move the input args defined
     % by variables to their registers.
@@ -226,10 +239,10 @@ generate_main_generic_call(_OuterCodeModel, GenericCall, Args, Modes, Det,
     % Move the input args not defined by variables to their registers.
     % Setting up these arguments last results in slightly more efficient code,
     % since we can use their registers when placing the variables.
-    generic_call_nonvar_setup(GenericCall, HoCallVariant, InVars, OutVars,
-        NonVarCode, !CI),
+    generic_call_nonvar_setup(GenericCall, HoCallVariant, InVarsR, InVarsF,
+        OutVarsR, OutVarsF, NonVarCode, !CI),
 
-    extra_livevals(FirstImmInput, ExtraLiveVals),
+    extra_livevals(FirstImmInputR, ExtraLiveVals),
     set.insert_list(ExtraLiveVals, LiveVals0, LiveVals),
 
     prepare_for_call(CodeModel, GoalInfo, CallModel, TraceCode, !CI),
@@ -246,6 +259,7 @@ generate_main_generic_call(_OuterCodeModel, GenericCall, Args, Modes, Det,
     instmap.apply_instmap_delta(InstMap, InstMapDelta, ReturnInstMap),
 
     % Update the code generator state to reflect the situation after the call.
+    OutArgsInfos = OutArgsInfosR ++ OutArgsInfosF,
     handle_return(OutArgsInfos, GoalInfo, NonLiveOutputs,
         ReturnInstMap, ReturnLiveLvalues, !CI),
 
@@ -327,7 +341,7 @@ generate_event_attributes([Attribute | Attributes], !.Vars,
 
 %---------------------------------------------------------------------------%
 
-    % The registers before the first input argument are all live.
+    % The registers before the first reg_r input argument are all live.
     %
 :- pred extra_livevals(int::in, list(lval)::out) is det.
 
@@ -345,8 +359,8 @@ extra_livevals_from(Reg, FirstInput, ExtraLiveVals) :-
         ExtraLiveVals = []
     ).
 
-generic_call_info(Globals, GenericCall, NumInputArgs, CodeAddr,
-        SpecifierArgInfos, FirstImmediateInputReg, HoCallVariant) :-
+generic_call_info(Globals, GenericCall, NumInputArgsR, NumInputArgsF,
+        CodeAddr, SpecifierArgInfos, FirstImmediateInputReg, HoCallVariant) :-
     (
         GenericCall = higher_order(PredVar, _, _, _),
         Reg = reg(reg_r, 1),
@@ -355,9 +369,10 @@ generic_call_info(Globals, GenericCall, NumInputArgs, CodeAddr,
             max_specialized_do_call_closure, MaxSpec),
         (
             MaxSpec >= 0,
-            NumInputArgs =< MaxSpec
+            NumInputArgsR =< MaxSpec,
+            NumInputArgsF = 0
         ->
-            CodeAddr = do_call_closure(specialized_known(NumInputArgs)),
+            CodeAddr = do_call_closure(specialized_known(NumInputArgsR)),
             HoCallVariant = ho_call_known_num,
             FirstImmediateInputReg = 2
         ;
@@ -369,19 +384,25 @@ generic_call_info(Globals, GenericCall, NumInputArgs, CodeAddr,
         GenericCall = class_method(TCVar, _, _, _),
         Reg = reg(reg_r, 1),
         SpecifierArgInfos = [TCVar - arg_info(Reg, top_in)],
-        globals.lookup_int_option(Globals,
-            max_specialized_do_call_class_method, MaxSpec),
-        (
-            MaxSpec >= 0,
-            NumInputArgs =< MaxSpec
-        ->
-            CodeAddr = do_call_class_method(specialized_known(NumInputArgs)),
-            HoCallVariant = ho_call_known_num,
-            FirstImmediateInputReg = 3
+        % XXX we do not use float registers for method calls yet
+        ( NumInputArgsF = 0 ->
+            globals.lookup_int_option(Globals,
+                max_specialized_do_call_class_method, MaxSpec),
+            (
+                MaxSpec >= 0,
+                NumInputArgsR =< MaxSpec
+            ->
+                CodeAddr = do_call_class_method(
+                    specialized_known(NumInputArgsR)),
+                HoCallVariant = ho_call_known_num,
+                FirstImmediateInputReg = 3
+            ;
+                CodeAddr = do_call_class_method(generic),
+                HoCallVariant = ho_call_unknown,
+                FirstImmediateInputReg = 4
+            )
         ;
-            CodeAddr = do_call_class_method(generic),
-            HoCallVariant = ho_call_unknown,
-            FirstImmediateInputReg = 4
+            sorry($module, $pred, "float reg inputs")
         )
     ;
         % Events and casts are generated inline.
@@ -407,25 +428,33 @@ generic_call_info(Globals, GenericCall, NumInputArgs, CodeAddr,
     % constants.
     %
 :- pred generic_call_nonvar_setup(generic_call::in, known_call_variant::in,
-    list(prog_var)::in, list(prog_var)::in, llds_code::out,
-    code_info::in, code_info::out) is det.
+    list(prog_var)::in, list(prog_var)::in, list(prog_var)::in, list(prog_var)::in,
+    llds_code::out, code_info::in, code_info::out) is det.
 
 generic_call_nonvar_setup(higher_order(_, _, _, _), HoCallVariant,
-        InVars, _OutVars, Code, !CI) :-
+        InVarsR, InVarsF, _OutVarsR, _OutVarsF, Code, !CI) :-
     (
         HoCallVariant = ho_call_known_num,
         Code = empty
     ;
         HoCallVariant = ho_call_unknown,
         clobber_regs([reg(reg_r, 2)], !CI),
-        list.length(InVars, NInVars),
+        list.length(InVarsR, NumInVarsR),
+        list.length(InVarsF, NumInVarsF),
+        NumInVars = encode_num_generic_call_vars(NumInVarsR, NumInVarsF),
         Code = singleton(
-            llds_instr(assign(reg(reg_r, 2), const(llconst_int(NInVars))),
+            llds_instr(assign(reg(reg_r, 2), const(llconst_int(NumInVars))),
                 "Assign number of immediate input arguments")
         )
     ).
 generic_call_nonvar_setup(class_method(_, Method, _, _), HoCallVariant,
-        InVars, _OutVars, Code, !CI) :-
+        InVarsR, InVarsF, _OutVarsR, _OutVarsF, Code, !CI) :-
+    (
+        InVarsF = []
+    ;
+        InVarsF = [_ | _],
+        sorry($module, $pred, "float input reg")
+    ),
     (
         HoCallVariant = ho_call_known_num,
         clobber_regs([reg(reg_r, 2)], !CI),
@@ -436,17 +465,20 @@ generic_call_nonvar_setup(class_method(_, Method, _, _), HoCallVariant,
     ;
         HoCallVariant = ho_call_unknown,
         clobber_regs([reg(reg_r, 2), reg(reg_r, 3)], !CI),
-        list.length(InVars, NInVars),
+        list.length(InVarsR, NumInVarsR),
+        % Currently we do not use float registers for method calls.
+        NumInVarsF = 0,
+        NumInVars = encode_num_generic_call_vars(NumInVarsR, NumInVarsF),
         Code = from_list([
             llds_instr(assign(reg(reg_r, 2), const(llconst_int(Method))),
                 "Index of class method in typeclass info"),
-            llds_instr(assign(reg(reg_r, 3), const(llconst_int(NInVars))),
-                "Assign number of immediate input arguments")
+            llds_instr(assign(reg(reg_r, 3), const(llconst_int(NumInVars))),
+                "Assign number of immediate regular input arguments")
         ])
     ).
-generic_call_nonvar_setup(event_call(_), _, _, _, _, !CI) :-
+generic_call_nonvar_setup(event_call(_), _, _, _, _, _, _, !CI) :-
     unexpected($module, $pred, "event_call").
-generic_call_nonvar_setup(cast(_), _, _, _, _, !CI) :-
+generic_call_nonvar_setup(cast(_), _, _, _, _, _, _, !CI) :-
     unexpected($module, $pred, "cast").
 
 %---------------------------------------------------------------------------%
diff --git a/compiler/check_typeclass.m b/compiler/check_typeclass.m
index ea88729..2a887a3 100644
--- a/compiler/check_typeclass.m
+++ b/compiler/check_typeclass.m
@@ -624,6 +624,7 @@ produce_auxiliary_procs(ClassId, ClassVars, MethodName, Markers0,
     Info0 = instance_method_info(ModuleInfo0, QualInfo0, PredName,
         Arity, ExistQVars0, ArgTypes0, ClassMethodClassContext0,
         ArgModes, TVarSet0, Status0, PredOrFunc),
+    UnsubstArgTypes = ArgTypes0,
 
     % Rename the instance variables apart from the class variables.
     tvarset_merge_renaming(TVarSet0, InstanceVarSet, TVarSet1, Renaming),
@@ -712,6 +713,8 @@ produce_auxiliary_procs(ClassId, ClassVars, MethodName, Markers0,
         goal_type_none, Markers, ArgTypes, TVarSet, ExistQVars, ClassContext,
         Proofs, ConstraintMap, ClausesInfo, VarNameRemap, PredInfo0),
     pred_info_set_clauses_info(ClausesInfo, PredInfo0, PredInfo1),
+    pred_info_set_instance_method_arg_types(UnsubstArgTypes,
+        PredInfo1, PredInfo2),
 
     % Add procs with the expected modes and determinisms
     AddProc = (pred(ModeAndDet::in, NewProcId::out,
@@ -721,7 +724,7 @@ produce_auxiliary_procs(ClassId, ClassVars, MethodName, Markers0,
             detism_decl_implicit, MaybeDet, Context, address_is_taken,
             OldPredInfo, NewPredInfo, NewProcId)
     ),
-    list.map_foldl(AddProc, ArgModes, InstanceProcIds, PredInfo1, PredInfo),
+    list.map_foldl(AddProc, ArgModes, InstanceProcIds, PredInfo2, PredInfo),
 
     module_info_get_predicate_table(ModuleInfo1, PredicateTable1),
     module_info_get_partial_qualifier_info(ModuleInfo1, PQInfo),
diff --git a/compiler/clause_to_proc.m b/compiler/clause_to_proc.m
index 98d292d..aa6c745 100644
--- a/compiler/clause_to_proc.m
+++ b/compiler/clause_to_proc.m
@@ -208,7 +208,7 @@ copy_clauses_to_proc(ProcId, ClausesInfo, !Proc) :-
                 "trace runtime cond")
         ;
             ( SingleExpr = plain_call(_, _, _, _, _, _)
-            ; SingleExpr = generic_call(_, _, _, _)
+            ; SingleExpr = generic_call(_, _, _, _, _)
             ; SingleExpr = unify(_, _, _, _, _)
             ; SingleExpr = conj(_, _)
             ; SingleExpr = disj(_)
diff --git a/compiler/closure_analysis.m b/compiler/closure_analysis.m
index 90fc8e7..726d3f7 100644
--- a/compiler/closure_analysis.m
+++ b/compiler/closure_analysis.m
@@ -226,7 +226,7 @@ process_goal(VarTypes, ModuleInfo, Goal0, Goal, !ClosureInfo) :-
         set_of_var.fold(insert_unknown, OutputArgs, !ClosureInfo),
         Goal = hlds_goal(GoalExpr0, GoalInfo)
     ;
-        GoalExpr0 = generic_call(Details, GCallArgs, GCallModes, _),
+        GoalExpr0 = generic_call(Details, GCallArgs, GCallModes, _, _),
         partition_arguments(ModuleInfo, VarTypes, GCallArgs, GCallModes,
             set_of_var.init, InputArgs0, set_of_var.init, OutputArgs),
 
@@ -456,7 +456,7 @@ dump_closure_info_expr(Varset, conj(_ConjType, Goals), _, !IO) :-
     list.foldl(dump_closure_info(Varset), Goals, !IO).
 dump_closure_info_expr(Varset, plain_call(_,_,_,_,_,_), GoalInfo, !IO) :-
     dump_ho_values(GoalInfo, Varset, !IO).
-dump_closure_info_expr(Varset, generic_call(_,_,_,_), GoalInfo, !IO) :-
+dump_closure_info_expr(Varset, generic_call(_,_,_,_,_), GoalInfo, !IO) :-
     dump_ho_values(GoalInfo, Varset, !IO).
 dump_closure_info_expr(Varset, scope(_, Goal), _, !IO) :-
     dump_closure_info(Varset, Goal, !IO).
diff --git a/compiler/code_gen.m b/compiler/code_gen.m
index 0da46ba..00a5faa 100644
--- a/compiler/code_gen.m
+++ b/compiler/code_gen.m
@@ -274,9 +274,9 @@ generate_goal_expr(GoalExpr, GoalInfo, CodeModel, ForwardLiveVarsBeforeGoal,
                 ForwardLiveVarsBeforeGoal, SubGoal, Code, !CI)
         )
     ;
-        GoalExpr = generic_call(GenericCall, Args, Modes, Det),
+        GoalExpr = generic_call(GenericCall, Args, Modes, MaybeRegTypes, Det),
         call_gen.generate_generic_call(CodeModel, GenericCall, Args,
-            Modes, Det, GoalInfo, Code, !CI)
+            Modes, MaybeRegTypes, Det, GoalInfo, Code, !CI)
     ;
         GoalExpr = plain_call(PredId, ProcId, Args, BuiltinState, _, _),
         (
diff --git a/compiler/code_util.m b/compiler/code_util.m
index b57f60f..c4c3959 100644
--- a/compiler/code_util.m
+++ b/compiler/code_util.m
@@ -87,6 +87,12 @@
 :- pred build_input_arg_list(proc_info::in, assoc_list(prog_var, lval)::out)
     is det.
 
+    % Encode the number of regular register and float register arguments
+    % into a single word. This representation is in both the MR_Closure
+    % num_hidden_args_rf field, and for the input to do_call_closure et al.
+    %
+:- func encode_num_generic_call_vars(int, int) = int.
+
 :- func size_of_cell_args(list(cell_arg)) = int.
 
 %---------------------------------------------------------------------------%
@@ -242,7 +248,7 @@ goal_may_alloc_temp_frame(hlds_goal(GoalExpr, _GoalInfo), May) :-
 :- pred goal_may_alloc_temp_frame_2(hlds_goal_expr::in, bool::out)
     is det.
 
-goal_may_alloc_temp_frame_2(generic_call(_, _, _, _), no).
+goal_may_alloc_temp_frame_2(generic_call(_, _, _, _, _), no).
 goal_may_alloc_temp_frame_2(plain_call(_, _, _, _, _, _), no).
 goal_may_alloc_temp_frame_2(unify(_, _, _, _, _), no).
     % We cannot safely say that a foreign code fragment does not allocate
@@ -431,6 +437,10 @@ build_input_arg_list_2([V - Arg | Rest0], VarArgs) :-
 
 %-----------------------------------------------------------------------------%
 
+encode_num_generic_call_vars(NumR, NumF) = (NumR \/ (NumF << 16)).
+
+%-----------------------------------------------------------------------------%
+
 size_of_cell_args([]) = 0.
 size_of_cell_args([CellArg | CellArgs]) = Size + Sizes :-
     (
diff --git a/compiler/common.m b/compiler/common.m
index 816470d..70fce93 100644
--- a/compiler/common.m
+++ b/compiler/common.m
@@ -793,7 +793,7 @@ generate_assign(ToVar, FromVar, UniMode, OldGoalInfo, GoalExpr, GoalInfo,
         % equivalence of the input and output.
         Modes = [(ToVarInst -> ToVarInst), (free -> ToVarInst)],
         GoalExpr = generic_call(cast(unsafe_type_cast), [FromVar, ToVar],
-            Modes, detism_det)
+            Modes, no, detism_det)
     ),
 
     % `ToVar' may not appear in the original instmap_delta, so we can't just
diff --git a/compiler/constraint.m b/compiler/constraint.m
index 879c320..4bc7162 100644
--- a/compiler/constraint.m
+++ b/compiler/constraint.m
@@ -205,7 +205,7 @@ propagate_conj_sub_goal_2(hlds_goal(GoalExpr, GoalInfo), Constraints,
         FinalGoals = [hlds_goal(negation(NegGoal), GoalInfo) | ConstraintGoals]
     ;
         ( GoalExpr = plain_call(_, _, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ; GoalExpr = unify(_, _, _, _, _)
         ),
@@ -843,7 +843,7 @@ strip_constraint_markers_expr(if_then_else(Vars, If, Then, Else)) =
 strip_constraint_markers_expr(Goal) = Goal :-
     Goal = call_foreign_proc(_, _, _, _, _, _, _).
 strip_constraint_markers_expr(Goal) = Goal :-
-    Goal = generic_call(_, _, _, _).
+    Goal = generic_call(_, _, _, _, _).
 strip_constraint_markers_expr(Goal) = Goal :-
     Goal = plain_call(_, _, _, _, _, _).
 strip_constraint_markers_expr(Goal) = Goal :-
diff --git a/compiler/continuation_info.m b/compiler/continuation_info.m
index e049be0..f4c4cc2 100644
--- a/compiler/continuation_info.m
+++ b/compiler/continuation_info.m
@@ -411,6 +411,7 @@
 :- import_module hlds.hlds_llds.
 :- import_module libs.options.
 :- import_module ll_backend.code_util.
+:- import_module parse_tree.builtin_lib_types.
 :- import_module parse_tree.prog_type.
 
 :- import_module int.
@@ -800,6 +801,8 @@ generate_layout_for_var(_ModuleInfo, ProcInfo, _InstMap, Var, LiveValueType,
 %---------------------------------------------------------------------------%
 
 generate_closure_layout(ModuleInfo, PredId, ProcId, ClosureLayout) :-
+    module_info_get_globals(ModuleInfo, Globals),
+    globals.lookup_bool_option(Globals, use_float_registers, UseFloatRegs),
     module_info_pred_proc_info(ModuleInfo, PredId, ProcId, PredInfo, ProcInfo),
     proc_info_get_headvars(ProcInfo, HeadVars),
     proc_info_arg_info(ProcInfo, ArgInfos),
@@ -809,7 +812,7 @@ generate_closure_layout(ModuleInfo, PredId, ProcId, ClosureLayout) :-
     set.init(TypeVars0),
     (
         build_closure_info(HeadVars, ArgTypes, ArgInfos, ArgLayouts, InstMap,
-            VarLocs0, VarLocs, TypeVars0, TypeVars)
+            UseFloatRegs, VarLocs0, VarLocs, TypeVars0, TypeVars)
     ->
         set.to_sorted_list(TypeVars, TypeVarsList),
         find_typeinfos_for_tvars(TypeVarsList, VarLocs, ProcInfo,
@@ -822,14 +825,27 @@ generate_closure_layout(ModuleInfo, PredId, ProcId, ClosureLayout) :-
 
 :- pred build_closure_info(list(prog_var)::in,
     list(mer_type)::in, list(arg_info)::in,  list(closure_arg_info)::out,
-    instmap::in, map(prog_var, set(lval))::in, map(prog_var, set(lval))::out,
+    instmap::in, bool::in,
+    map(prog_var, set(lval))::in, map(prog_var, set(lval))::out,
     set(tvar)::in, set(tvar)::out) is semidet.
 
-build_closure_info([], [], [], [], _, !VarLocs, !TypeVars).
-build_closure_info([Var | Vars], [Type | Types],
-        [ArgInfo | ArgInfos], [Layout | Layouts], InstMap,
+build_closure_info([], [], [], [], _, _, !VarLocs, !TypeVars).
+build_closure_info([Var | Vars], [Type0 | Types],
+        [ArgInfo | ArgInfos], [Layout | Layouts], InstMap, UseFloatRegs,
         !VarLocs, !TypeVars) :-
     ArgInfo = arg_info(ArgLoc, _ArgMode),
+    % If the float argument is passed via a regular register then replace the
+    % type_ctor_info in the closure layout so that we can distinguish those
+    % arguments from float arguments passed via float registers.
+    (
+        UseFloatRegs = yes,
+        Type0 = float_type,
+        ArgLoc = reg(reg_r, _)
+    ->
+        Type = float_box_type
+    ;
+        Type = Type0
+    ),
     instmap_lookup_var(InstMap, Var, Inst),
     Layout = closure_arg_info(Type, Inst),
     arg_loc_to_register(ArgLoc, Reg),
@@ -837,7 +853,7 @@ build_closure_info([Var | Vars], [Type | Types],
     map.det_insert(Var, Locations, !VarLocs),
     type_vars(Type, VarTypeVars),
     set.insert_list(VarTypeVars, !TypeVars),
-    build_closure_info(Vars, Types, ArgInfos, Layouts, InstMap,
+    build_closure_info(Vars, Types, ArgInfos, Layouts, InstMap, UseFloatRegs,
         !VarLocs, !TypeVars).
 
 %---------------------------------------------------------------------------%
diff --git a/compiler/coverage_profiling.m b/compiler/coverage_profiling.m
index 1e2507b..bd85a6e 100644
--- a/compiler/coverage_profiling.m
+++ b/compiler/coverage_profiling.m
@@ -240,7 +240,7 @@ coverage_prof_second_pass_goal(Goal0, Goal,
                 GathersCoverageAfter = no
             )
         ;
-            GoalExpr0 = generic_call(GenericCall, _, _, _),
+            GoalExpr0 = generic_call(GenericCall, _, _, _, _),
             (
                 ( GenericCall = higher_order(_, _, _, _)
                 ; GenericCall = class_method(_, _, _, _)
@@ -878,7 +878,7 @@ coverage_prof_first_pass(CPOptions, Goal0, Goal, PortCountsCoverageAfterBefore,
         ),
         GoalExpr = GoalExpr0
     ;
-        GoalExpr0 = generic_call(GenericCall, _, _, _),
+        GoalExpr0 = generic_call(GenericCall, _, _, _, _),
         (
             ( GenericCall = higher_order(_, _, _, _)
             ; GenericCall = class_method(_, _, _, _)
diff --git a/compiler/cse_detection.m b/compiler/cse_detection.m
index 4933099..e9a2bfd 100644
--- a/compiler/cse_detection.m
+++ b/compiler/cse_detection.m
@@ -285,7 +285,7 @@ detect_cse_in_goal_expr(GoalExpr0, GoalExpr, !CseInfo, GoalInfo, InstMap0,
         Redo) :-
     (
         ( GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
         ),
         GoalExpr = GoalExpr0,
diff --git a/compiler/dead_proc_elim.m b/compiler/dead_proc_elim.m
index c64f805..ebaf8e4 100644
--- a/compiler/dead_proc_elim.m
+++ b/compiler/dead_proc_elim.m
@@ -497,7 +497,7 @@ dead_proc_examine_goal(Goal, CurrProc, !Queue, !Needed) :-
         dead_proc_examine_goal(Then, CurrProc, !Queue, !Needed),
         dead_proc_examine_goal(Else, CurrProc, !Queue, !Needed)
     ;
-        GoalExpr = generic_call(_, _, _, _)
+        GoalExpr = generic_call(_, _, _, _, _)
     ;
         GoalExpr = plain_call(PredId, ProcId, _,_,_,_),
         Entity = entity_proc(PredId, ProcId),
@@ -1104,7 +1104,7 @@ pre_modecheck_examine_goal_expr(GoalExpr, !DeadInfo) :-
         GoalExpr = plain_call(_, _, _, _, _, PredName),
         dead_pred_info_add_pred_name(PredName, !DeadInfo)
     ;
-        GoalExpr = generic_call(_, _, _, _)
+        GoalExpr = generic_call(_, _, _, _, _)
     ;
         GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
     ;
diff --git a/compiler/deep_profiling.m b/compiler/deep_profiling.m
index 685b9cb..5efe3c7 100644
--- a/compiler/deep_profiling.m
+++ b/compiler/deep_profiling.m
@@ -236,7 +236,7 @@ goal_contains_builtin_unify_or_compare(Goal) = Contains :-
         GoalExpr = unify(_, _, _, _, _),
         Contains = no
     ;
-        ( GoalExpr = generic_call(_, _, _, _)
+        ( GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = plain_call(_, _, _, _, _, _)
         ),
         % Unfortunately, even if the procedure we are calling is neither
@@ -355,7 +355,7 @@ apply_deep_prof_tail_rec_to_goal(Goal0, Goal, TailRecInfo, !FoundTailCall,
         ),
         Continue = no
     ;
-        GoalExpr0 = generic_call(_, _, _, _),
+        GoalExpr0 = generic_call(_, _, _, _, _),
         Goal = Goal0,
         Continue = no
     ;
@@ -518,7 +518,7 @@ figure_out_rec_call_numbers(Goal, !N, !TailCallSites) :-
             BuiltinState = inline_builtin
         )
     ;
-        GoalExpr = generic_call(_, _, _, _),
+        GoalExpr = generic_call(_, _, _, _, _),
         !:N = !.N + 1
     ;
         GoalExpr = unify(_, _, _, _, _)
@@ -1047,7 +1047,7 @@ deep_prof_transform_goal(Goal0, Goal, AddedImpurity, !DeepInfo) :-
             AddedImpurity = no
         )
     ;
-        GoalExpr0 = generic_call(GenericCall, _, _, _),
+        GoalExpr0 = generic_call(GenericCall, _, _, _, _),
         (
             ( GenericCall = higher_order(_, _, _, _)
             ; GenericCall = class_method(_, _, _, _)
@@ -1631,7 +1631,7 @@ classify_call(ModuleInfo, Expr) = Class :-
             Class = call_class_normal(proc(PredId, ProcId))
         )
     ;
-        Expr = generic_call(Generic, _, _, _),
+        Expr = generic_call(Generic, _, _, _, _),
         Class = call_class_generic(Generic)
     ;
         ( Expr = call_foreign_proc(_, _, _, _, _, _, _)
diff --git a/compiler/deforest.m b/compiler/deforest.m
index 7a179d8..522c72c 100644
--- a/compiler/deforest.m
+++ b/compiler/deforest.m
@@ -361,7 +361,7 @@ deforest_goal_expr(GoalExpr0, GoalExpr, !GoalInfo, !PDInfo) :-
             !PDInfo)
     ;
         ( GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr = unify(_, _, _, _, _)
         ),
         GoalExpr = GoalExpr0
@@ -448,7 +448,7 @@ deforest_get_branch_vars_goal(Goal, MaybeBranchInfo, !PDInfo) :-
         pd_util.get_branch_vars_goal(Goal, MaybeBranchInfo, !PDInfo)
     ;
         ( GoalExpr = unify(_, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ; GoalExpr = conj(_, _)
         ; GoalExpr = negation(_)
@@ -1767,7 +1767,7 @@ push_goal_into_goal(NonLocals, DeforestInfo, EarlierGoal,
     ;
         ( EarlierGoalExpr = unify(_, _, _, _, _)
         ; EarlierGoalExpr = plain_call(_, _, _, _, _, _)
-        ; EarlierGoalExpr = generic_call(_, _, _, _)
+        ; EarlierGoalExpr = generic_call(_, _, _, _, _)
         ; EarlierGoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ; EarlierGoalExpr = conj(_, _)
         ; EarlierGoalExpr = negation(_)
diff --git a/compiler/delay_construct.m b/compiler/delay_construct.m
index dbb4ab3..8e1e7ad 100644
--- a/compiler/delay_construct.m
+++ b/compiler/delay_construct.m
@@ -157,7 +157,7 @@ delay_construct_in_goal(Goal0, InstMap0, DelayInfo, Goal) :-
             Goal = hlds_goal(scope(Reason, SubGoal), GoalInfo0)
         )
     ;
-        ( GoalExpr0 = generic_call(_, _, _, _)
+        ( GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
diff --git a/compiler/delay_partial_inst.m b/compiler/delay_partial_inst.m
index dd05a68..10a98c1 100644
--- a/compiler/delay_partial_inst.m
+++ b/compiler/delay_partial_inst.m
@@ -428,7 +428,7 @@ delay_partial_inst_in_goal(InstMap0, Goal0, Goal, !ConstructMap, !DelayInfo) :-
             Goal = Goal0
         )
     ;
-        ( GoalExpr0 = generic_call(_, _, _, _)
+        ( GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
diff --git a/compiler/dep_par_conj.m b/compiler/dep_par_conj.m
index 2259bda..88611b4 100644
--- a/compiler/dep_par_conj.m
+++ b/compiler/dep_par_conj.m
@@ -416,7 +416,7 @@ sync_dep_par_conjs_in_goal(Goal0, Goal, InstMap0, InstMap, !SyncInfo) :-
     ;
         ( GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         Goal = Goal0
@@ -855,7 +855,7 @@ insert_wait_in_goal(ModuleInfo, AllowSomePathsOnly, FutureMap, ConsumedVar,
         ;
             ( GoalExpr0 = unify(_, _, _, _, _)
             ; GoalExpr0 = plain_call(_, _, _, _, _, _)
-            ; GoalExpr0 = generic_call(_, _, _, _)
+            ; GoalExpr0 = generic_call(_, _, _, _, _)
             ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
             ),
             InvariantEstablished = no,
@@ -1193,7 +1193,7 @@ insert_signal_in_goal(ModuleInfo, FutureMap, ProducedVar,
             ;
                 ( GoalExpr0 = unify(_, _, _, _, _)
                 ; GoalExpr0 = plain_call(_, _, _, _, _, _)
-                ; GoalExpr0 = generic_call(_, _, _, _)
+                ; GoalExpr0 = generic_call(_, _, _, _, _)
                 ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
                 ),
                 insert_signal_after_goal(ModuleInfo, FutureMap, ProducedVar,
@@ -1798,7 +1798,7 @@ specialize_sequences_in_goal(Goal0, Goal, !SpecInfo) :-
     ;
         ( GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         Goal = Goal0
@@ -2456,7 +2456,7 @@ should_we_push_wait(Var, Goal, Wait) :-
             )
         )
     ;
-        ( GoalExpr = generic_call(_, _, _, _)
+        ( GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         ( set_of_var.member(NonLocals, Var) ->
@@ -2735,7 +2735,7 @@ should_we_push_signal(Var, Goal, !Signal) :-
             % With generic calls, the only safe assumption is that they produce
             % Var just before return. With foreign code, the signal is done
             % after the return to Mercury execution.
-            ( GoalExpr = generic_call(_, _, _, _)
+            ( GoalExpr = generic_call(_, _, _, _, _)
             ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
             ),
             ( set_of_var.member(NonLocals, Var) ->
diff --git a/compiler/dependency_graph.m b/compiler/dependency_graph.m
index 8dc3d84..8e46fbf 100644
--- a/compiler/dependency_graph.m
+++ b/compiler/dependency_graph.m
@@ -386,7 +386,7 @@ add_dependency_arcs_in_goal(Caller, Goal, !DepGraph) :-
             add_dependency_arcs_in_goal(Caller, SubGoal, !DepGraph)
         )
     ;
-        GoalExpr = generic_call(_, _, _, _)
+        GoalExpr = generic_call(_, _, _, _, _)
     ;
         GoalExpr = plain_call(PredId, ProcId, _, Builtin, _, _),
         (
diff --git a/compiler/det_analysis.m b/compiler/det_analysis.m
index 33887de..98c7014 100644
--- a/compiler/det_analysis.m
+++ b/compiler/det_analysis.m
@@ -648,7 +648,8 @@ det_infer_goal_2(GoalExpr0, GoalExpr, GoalInfo, InstMap0, SolnContext,
         GoalExpr = plain_call(PredId, ProcId, Args, Builtin, UnifyContext,
             Name)
     ;
-        GoalExpr0 = generic_call(GenericCall, _ArgVars, _Modes, CallDetism),
+        GoalExpr0 = generic_call(GenericCall, _ArgVars, _Modes, _MaybArgRegs,
+            CallDetism),
         det_infer_generic_call(GenericCall, CallDetism, GoalInfo, SolnContext,
             RightFailingContexts, Detism, GoalFailingContexts, !DetInfo),
         GoalExpr = GoalExpr0
diff --git a/compiler/det_report.m b/compiler/det_report.m
index 99ea5c7..67e6b41 100644
--- a/compiler/det_report.m
+++ b/compiler/det_report.m
@@ -651,7 +651,7 @@ det_diagnose_goal_expr(GoalExpr, GoalInfo, InstMap0, Desired, Actual,
             AtomicMsgs),
         Msgs = InitMsgs ++ AtomicMsgs
     ;
-        GoalExpr = generic_call(GenericCall, _, _, _),
+        GoalExpr = generic_call(GenericCall, _, _, _, _),
         Context = goal_info_get_context(GoalInfo),
         hlds_goal.generic_call_id(GenericCall, CallId),
         StartingPieces = [words(call_id_to_string(CallId))],
@@ -1059,7 +1059,7 @@ reqscope_check_goal(Goal, InstMap0, !DetInfo) :-
         )
     ;
         ( GoalExpr = plain_call(_, _, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = unify(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         )
diff --git a/compiler/distance_granularity.m b/compiler/distance_granularity.m
index 7bf0cb8..9ade83c 100644
--- a/compiler/distance_granularity.m
+++ b/compiler/distance_granularity.m
@@ -318,7 +318,7 @@ apply_dg_to_goal(!Goal, CallerPredId, CallerProcId, PredIdSpecialized,
         GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _),
         IsRecursiveCallInParallelConj = no
     ;
-        GoalExpr0 = generic_call(_, _, _, _),
+        GoalExpr0 = generic_call(_, _, _, _, _),
         IsRecursiveCallInParallelConj = no
     ;
         GoalExpr0 = conj(Type, Goals0),
@@ -879,7 +879,7 @@ update_original_predicate_goal(!Goal, CallerPredId, CallerProcId,
     ;
         GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
     ;
-        GoalExpr0 = generic_call(_, _, _, _)
+        GoalExpr0 = generic_call(_, _, _, _, _)
     ;
         GoalExpr0 = conj(Type, Goals0),
         update_original_predicate_goals(Goals0, [], Goals1, CallerPredId,
diff --git a/compiler/equiv_type.m b/compiler/equiv_type.m
index 8fac69c..004fb6b 100644
--- a/compiler/equiv_type.m
+++ b/compiler/equiv_type.m
@@ -1414,7 +1414,7 @@ replace_in_pred_mode(Location, PredName, OrigArity, Context, DeclType,
         (
             WithInst = ground(_, GroundInstInfo),
             GroundInstInfo = higher_order(HOInst),
-            HOInst = pred_inst_info(PredOrFunc, ExtraModes0, DetPrime),
+            HOInst = pred_inst_info(PredOrFunc, ExtraModes0, _, DetPrime),
             ( MaybePredOrFunc0 = no
             ; MaybePredOrFunc0 = yes(PredOrFunc)
             )
diff --git a/compiler/equiv_type_hlds.m b/compiler/equiv_type_hlds.m
index f712b73..3aad35a 100644
--- a/compiler/equiv_type_hlds.m
+++ b/compiler/equiv_type_hlds.m
@@ -727,11 +727,12 @@ type_may_occur_in_insts([Inst | Insts]) =
 replace_in_inst_2(_, any(_, none) @ Inst, Inst, no, !TVarSet, !Cache).
 replace_in_inst_2(EqvMap, any(Uniq, higher_order(PredInstInfo0)) @ Inst0, Inst,
         Changed, !TVarSet, !Cache) :-
-    PredInstInfo0 = pred_inst_info(PorF, Modes0, Det),
+    PredInstInfo0 = pred_inst_info(PorF, Modes0, MaybeArgRegs, Det),
     replace_in_modes(EqvMap, Modes0, Modes, Changed, !TVarSet, !Cache),
     (
         Changed = yes,
-        Inst = any(Uniq, higher_order(pred_inst_info(PorF, Modes, Det)))
+        PredInstInfo = pred_inst_info(PorF, Modes, MaybeArgRegs, Det),
+        Inst = any(Uniq, higher_order(PredInstInfo))
     ;
         Changed = no,
         Inst = Inst0
@@ -753,11 +754,12 @@ replace_in_inst_2(EqvMap, bound(Uniq, BoundInsts0) @ Inst0, Inst,
 replace_in_inst_2(_, ground(_, none) @ Inst, Inst, no, !TVarSet, !Cache).
 replace_in_inst_2(EqvMap, ground(Uniq, higher_order(PredInstInfo0)) @ Inst0,
         Inst, Changed, !TVarSet, !Cache) :-
-    PredInstInfo0 = pred_inst_info(PorF, Modes0, Det),
+    PredInstInfo0 = pred_inst_info(PorF, Modes0, MaybeArgRegs, Det),
     replace_in_modes(EqvMap, Modes0, Modes, Changed, !TVarSet, !Cache),
     (
         Changed = yes,
-        Inst = ground(Uniq, higher_order(pred_inst_info(PorF, Modes, Det)))
+        PredInstInfo = pred_inst_info(PorF, Modes, MaybeArgRegs, Det),
+        Inst = ground(Uniq, higher_order(PredInstInfo))
     ;
         Changed = no,
         Inst = Inst0
@@ -1022,7 +1024,7 @@ replace_in_goal_expr(EqvMap, GoalExpr0, GoalExpr, Changed, !Info) :-
             GoalExpr = GoalExpr0
         )
     ;
-        GoalExpr0 = generic_call(Details, Args, Modes0, Detism),
+        GoalExpr0 = generic_call(Details, Args, Modes0, MaybeArgRegs, Detism),
         TVarSet0 = !.Info ^ ethri_tvarset,
         Cache0 = !.Info ^ ethri_inst_cache,
         replace_in_modes(EqvMap, Modes0, Modes, Changed, TVarSet0, TVarSet,
@@ -1031,7 +1033,7 @@ replace_in_goal_expr(EqvMap, GoalExpr0, GoalExpr, Changed, !Info) :-
             Changed = yes,
             !Info ^ ethri_tvarset := TVarSet,
             !Info ^ ethri_inst_cache := Cache,
-            GoalExpr = generic_call(Details, Args, Modes, Detism)
+            GoalExpr = generic_call(Details, Args, Modes, MaybeArgRegs, Detism)
         ;
             Changed = no,
             GoalExpr = GoalExpr0
diff --git a/compiler/erl_code_gen.m b/compiler/erl_code_gen.m
index 07e6cbf..9d74615 100644
--- a/compiler/erl_code_gen.m
+++ b/compiler/erl_code_gen.m
@@ -675,7 +675,7 @@ erl_gen_goal_expr(GoalExpr, CodeModel, Detism, InstMap, Context,
         erl_gen_disj(Goals, CodeModel, InstMap, Context, MaybeSuccessExpr,
             Statement, !Info)
     ;
-        GoalExpr = generic_call(GenericCall, Vars, Modes, CallDetism),
+        GoalExpr = generic_call(GenericCall, Vars, Modes, _, CallDetism),
         determinism_to_code_model(CallDetism, CallCodeModel),
         expect(unify(CodeModel, CallCodeModel), $module, $pred,
             "code model mismatch"),
diff --git a/compiler/exception_analysis.m b/compiler/exception_analysis.m
index a4f4b48..e778fd4 100644
--- a/compiler/exception_analysis.m
+++ b/compiler/exception_analysis.m
@@ -391,7 +391,7 @@ check_goal_for_exceptions_2(SCC, VarTypes, GoalExpr, GoalInfo,
         check_goal_for_exceptions_plain_call(SCC, VarTypes,
             CallPredId, CallProcId, Args, !Result, !ModuleInfo)
     ;
-        GoalExpr = generic_call(Details, Args, _, _),
+        GoalExpr = generic_call(Details, Args, _, _, _),
         check_goal_for_exceptions_generic_call(VarTypes, Details, Args,
             GoalInfo, !Result, !ModuleInfo)
     ;
diff --git a/compiler/export.m b/compiler/export.m
index e591349..e24eabc 100644
--- a/compiler/export.m
+++ b/compiler/export.m
@@ -382,13 +382,14 @@ get_export_info_for_lang_c(Preds, PredId, ProcId, _Globals, ModuleInfo,
     pred_info_get_procedures(PredInfo, ProcTable),
     map.lookup(ProcTable, ProcId, ProcInfo),
     proc_info_maybe_arg_info(ProcInfo, MaybeArgInfos),
+    pred_info_get_markers(PredInfo, Markers),
     pred_info_get_arg_types(PredInfo, ArgTypes),
     (
         MaybeArgInfos = yes(ArgInfos0),
         ArgInfos = ArgInfos0
     ;
         MaybeArgInfos = no,
-        generate_proc_arg_info(Status, ArgTypes, ModuleInfo, ProcInfo,
+        generate_proc_arg_info(Markers, ArgTypes, ModuleInfo, ProcInfo,
             NewProcInfo),
         proc_info_arg_info(NewProcInfo, ArgInfos)
     ),
diff --git a/compiler/fact_table.m b/compiler/fact_table.m
index 07c9817..ebd3f8a 100644
--- a/compiler/fact_table.m
+++ b/compiler/fact_table.m
@@ -2977,17 +2977,7 @@ generate_argument_vars_code(PragmaVars, Types, ModuleInfo, DeclCode, InputCode,
         OutputCode, SaveRegsCode, GetRegsCode, NumInputArgs) :-
     list.map((pred(X::in, Y::out) is det :- X = pragma_var(_, _, Y, _)),
         PragmaVars, Modes),
-    module_info_get_globals(ModuleInfo, Globals),
-    globals.lookup_bool_option(Globals, use_float_registers, FloatRegs),
-    (
-        FloatRegs = yes,
-        FloatRegType = reg_f
-    ;
-        FloatRegs = no,
-        FloatRegType = reg_r
-    ),
-    make_arg_infos(Types, Modes, model_non, ModuleInfo, FloatRegType,
-        ArgInfos),
+    make_standard_arg_infos(Types, Modes, model_non, ModuleInfo, ArgInfos),
     generate_argument_vars_code_2(PragmaVars, ArgInfos, Types, ModuleInfo,
         DeclCode, InputCode, OutputCode, SaveRegsCode, GetRegsCode, 1,
         NumInputArgs).
diff --git a/compiler/float_regs.m b/compiler/float_regs.m
new file mode 100644
index 0000000..2779c6d
--- /dev/null
+++ b/compiler/float_regs.m
@@ -0,0 +1,1622 @@
+%-----------------------------------------------------------------------------%
+% vim: ft=mercury ts=4 sw=4 et
+%-----------------------------------------------------------------------------%
+% Copyright (C) 2011 The University of Melbourne.
+% This file may only be copied under the terms of the GNU General
+% Public License - see the file COPYING in the Mercury distribution.
+%-----------------------------------------------------------------------------%
+%
+% File: float_regs.m
+% Author: wangp.
+%
+% In the following we assume that Mercury `float' is wider than a word, and that
+% we are targeting a Mercury abstract machine with float registers.
+% The module is not used otherwise.
+%
+% Arguments in first-order calls are passed via float registers if the formal
+% parameter has type `float' or equivalent.  All other arguments are passed via
+% regular registers.
+%
+% Higher-order calls are complicated by polymorphism.  A procedure of type
+% `pred(float)' may be an argument to another procedure, where that argument
+% position has type `pred(T)'.  Calling that higher-order term should place
+% its argument into a regular register (since it is polymorphic), but the
+% actual procedure expects its argument in a float register.
+%
+% We deal with these problems of incompatible calling conventions by
+% substituting wrapper closures over the original higher-order terms, when the
+% original higher-order term is passed to a callee or stored in a data term,
+% where the expected calling convention is different.  See below for examples.
+%
+% As we have seen, a higher-order type does not identify the register class for
+% each argument.  A higher-order inst already contains information about the
+% calling convention to use for a term with that inst: the argument modes.
+% In this module, we extend higher-order insts to record the register class
+% that must be used for each argument, e.g.
+%
+%       pred(in, out, out) is det /* arg regs: [reg_r, reg_r, reg_f] */
+%
+% indicates that the first and second arguments must be passed via regular
+% registers.  The third argument must be passed via a float register.
+%
+%-----------------------------------------------------------------------------%
+%
+% EXAMPLE 1
+% ---------
+%
+%   :- pred get_q(pred(T, T)).
+%   :- mode get_q(out(pred(in, out) is det)) is det.
+%
+%   p(X) :-
+%       get_q(Q),
+%       call(Q, 1.0, X).
+%
+% Q has type `pred(float, float)' and we would be misled to pass the float
+% arguments in the higher-order call via the float registers.  The inst
+% contains the information to correct the higher-order call.
+%
+%   :- pred get_q(pred(T, T)).
+%   :- mode get_q(out(pred(in, out) is det /* arg regs: [reg_r, reg_r] */))
+%       is det.
+%
+%   p(X) :-
+%       get_q(Q),
+%       % new insts:
+%       % Q -> pred(in, out) is det /* arg regs: [reg_r, reg_r] */
+%       call(Q, 1.0, X).
+%       % arg regs: [reg_r, reg_r]
+%
+% EXAMPLE 2
+% ---------
+%
+%   :- pred foo(float) is semidet.
+%   :- pred q(pred(T)::in(pred(in) is semidet /* arg regs: [reg_r] */), T::in)
+%       is semidet.
+%
+%   p :-
+%       F = foo,            /* arg regs: [reg_f] */
+%       q(F, 1.0).          /* F incompatible */
+%
+% becomes
+%
+%   p :-
+%       F = foo,            /* arg regs: [reg_f] */
+%       F1 = wrapper1(F),   /* arg regs: [reg_r] */
+%       q(F1, 1.0).
+%
+%   :- pred wrapper(
+%       pred(float)::in(pred(in) is semidet /* arg regs: [reg_f] */),
+%       float::in) is semidet.
+%
+%   wrapper1(F, X) :- /* must use reg_r: X */
+%       call(F, X).
+%
+% The wrapper1 predicate has an annotation that causes the code generator to
+% use a regular register for the argument X.
+%
+% EXAMPLE 3
+% ---------
+%
+%   :- type foo(T) ---> mkfoo(pred(T, T)).
+%   :- inst mkfoo  ---> mkfoo(pred(in, out) is det).
+%
+%   :- pred p(foo(float)::out(mkfoo)) is det.
+%   :- pred q(float::in, float::in) is det.
+%
+%   p(Foo) :-
+%       Q = q,              /* arg regs: [reg_f, reg_f] */
+%       Foo = mkfoo(Q).
+%       ...
+%
+% becomes
+%
+%   p(Foo) :-
+%       Q = q,              /* arg regs: [reg_f, reg_f] */
+%       Q1 = wrapper2(Q),   /* arg regs: [reg_r, reg_r] */
+%       Foo = mkfoo(Q1).
+%
+% `q' needs to be wrapped in argument of `mkfoo'.  Foo has type `foo(float)'
+% but may be passed to a procedure with the argument type `foo(T)'.
+% Then `q' could be extracted from Foo, and called with arguments placed in
+% the regular registers.
+%
+%-----------------------------------------------------------------------------%
+
+:- module transform_hlds.float_regs.
+:- interface.
+
+:- import_module hlds.hlds_module.
+:- import_module parse_tree.error_util.
+
+:- import_module list.
+
+%-----------------------------------------------------------------------------%
+
+:- pred insert_reg_wrappers(module_info::in, module_info::out,
+    list(error_spec)::out) is det.
+
+%----------------------------------------------------------------------------%
+%----------------------------------------------------------------------------%
+
+:- implementation.
+
+:- import_module check_hlds.inst_match.
+:- import_module check_hlds.inst_util.
+:- import_module check_hlds.mode_util.
+:- import_module check_hlds.polymorphism.
+:- import_module check_hlds.type_util.
+:- import_module hlds.goal_util.
+:- import_module hlds.hlds_data.
+:- import_module hlds.hlds_error_util.
+:- import_module hlds.hlds_goal.
+:- import_module hlds.hlds_pred.
+:- import_module hlds.instmap.
+:- import_module hlds.quantification.
+:- import_module mdbcomp.prim_data.
+:- import_module parse_tree.builtin_lib_types.
+:- import_module parse_tree.prog_data.
+:- import_module parse_tree.prog_mode.
+:- import_module parse_tree.prog_type.
+:- import_module parse_tree.set_of_var.
+:- import_module transform_hlds.dependency_graph.
+:- import_module transform_hlds.lambda.
+
+:- import_module assoc_list.
+:- import_module bool.
+:- import_module int.
+:- import_module map.
+:- import_module maybe.
+:- import_module pair.
+:- import_module require.
+:- import_module set.
+:- import_module varset.
+
+%-----------------------------------------------------------------------------%
+
+insert_reg_wrappers(!ModuleInfo, Specs) :-
+    % In the first phase, update the pred_inst_infos in argument modes to
+    % include information about the register type that should be used for
+    % each higher-order argument.
+    module_info_get_valid_predids(PredIds, !ModuleInfo),
+    list.foldl(add_arg_regs_in_pred, PredIds, !ModuleInfo),
+
+    % In the second phase, go over every procedure goal, update instmap deltas
+    % to include the information from pred_inst_infos.  When a higher-order
+    % variable has an inst that indicates it uses a different calling
+    % convention than is required in a given context, replace that variable
+    % with a wrapper closure which has the expected calling convention.
+    list.foldl2(insert_reg_wrappers_pred, PredIds, !ModuleInfo, [], Specs),
+    module_info_clobber_dependency_info(!ModuleInfo).
+
+%-----------------------------------------------------------------------------%
+%
+% First phase
+%
+
+:- pred add_arg_regs_in_pred(pred_id::in, module_info::in, module_info::out)
+    is det.
+
+add_arg_regs_in_pred(PredId, !ModuleInfo) :-
+    module_info_pred_info(!.ModuleInfo, PredId, PredInfo0),
+    ProcIds = pred_info_procids(PredInfo0),
+    list.foldl(add_arg_regs_in_proc(!.ModuleInfo), ProcIds,
+        PredInfo0, PredInfo),
+    module_info_set_pred_info(PredId, PredInfo, !ModuleInfo).
+
+:- pred add_arg_regs_in_proc(module_info::in, proc_id::in,
+    pred_info::in, pred_info::out) is det.
+
+add_arg_regs_in_proc(ModuleInfo, ProcId, PredInfo0, PredInfo) :-
+    pred_info_get_markers(PredInfo0, PredMarkers),
+    pred_info_proc_info(PredInfo0, ProcId, ProcInfo0),
+    proc_info_get_argmodes(ProcInfo0, ArgModes0),
+    ( check_marker(PredMarkers, marker_class_instance_method) ->
+        % For class instance methods use the argument types before
+        % instance types were substituted. The list of arguments in the
+        % procedure may be longer due to type_infos and typeclass_infos.
+        pred_info_get_instance_method_arg_types(PredInfo0, IM_ArgTypes),
+        list.length(IM_ArgTypes, Num_IM_ArgTypes),
+        split_list_from_end(Num_IM_ArgTypes, ArgModes0, FrontModes,
+            ArgModes1),
+        list.map_corresponding(add_arg_regs_in_proc_arg(ModuleInfo),
+            IM_ArgTypes, ArgModes1, ArgModes2),
+        ArgModes = FrontModes ++ ArgModes2
+    ;
+        pred_info_get_arg_types(PredInfo0, ArgTypes),
+        list.map_corresponding(add_arg_regs_in_proc_arg(ModuleInfo),
+            ArgTypes, ArgModes0, ArgModes)
+    ),
+    proc_info_set_argmodes(ArgModes, ProcInfo0, ProcInfo),
+    pred_info_set_proc_info(ProcId, ProcInfo, PredInfo0, PredInfo).
+
+:- pred add_arg_regs_in_proc_arg(module_info::in, mer_type::in,
+    mer_mode::in, mer_mode::out) is det.
+
+add_arg_regs_in_proc_arg(ModuleInfo, RealVarType, ArgMode0, ArgMode) :-
+    (
+        type_to_ctor_and_args(RealVarType, _TypeCtor, TypeArgs),
+        TypeArgs = [_ | _]
+    ->
+        % Even though a type parameter might be substituted by `float', in
+        % another procedure it might be left generic, e.g.
+        %
+        %   :- type f(T) ---> f(pred(T)).
+        %   :- inst f    ---> f(pred(in) is semidet).
+        %
+        %   :- pred p1(f(float)::in(f), float::in) is semidet.
+        %   :- pred p2(f(T)::in(f), T::in) is semidet.
+        %
+        % The same value may be passed to `p1' and `p2' so the higher-order
+        % term contained within must use the same calling convention, namely a
+        % regular register for its argument.  Therefore while processing `p1'
+        % we must undo the type substitution and treat `f(float)' as if it were
+        % `f(T)'.
+        %
+        % A type parameter may also be substituted by a higher-order type, e.g.
+        %
+        %   :- type g(T) ---> g(T).
+        %   :- inst g    ---> g(pred(in) is semidet).
+        %
+        %   :- pred q1(g(pred(float))::in(g), float::in) is semidet.
+        %   :- pred q2(g(pred(T))::in(g), T::in) is semidet.
+        %
+        % When processing `q1' we must treat the `g(pred(float))' argument as
+        % if it were `g(pred(T))'.
+
+        PolymorphicContext = no,
+        make_generic_type(PolymorphicContext, RealVarType, AssumedType),
+        add_arg_regs_in_mode(ModuleInfo, AssumedType, ArgMode0, ArgMode)
+    ;
+        ArgMode = ArgMode0
+    ).
+
+:- pred make_generic_type(bool::in, mer_type::in, mer_type::out) is det.
+
+make_generic_type(PolymorphicContext, Type0, Type) :-
+    (
+        type_is_higher_order_details(Type0, Purity, PredOrFunc, EvalMethod,
+            ArgTypes0)
+    ->
+        list.map(make_generic_type(PolymorphicContext), ArgTypes0, ArgTypes),
+        construct_higher_order_type(Purity, PredOrFunc, EvalMethod, ArgTypes,
+            Type)
+    ;
+        type_to_ctor_and_args(Type0, TypeCtor, ArgTypes0)
+    ->
+        (
+            ArgTypes0 = [],
+            (
+                PolymorphicContext = yes,
+                TypeCtor = float_type_ctor
+            ->
+                % We don't actually need to replace `float' by a type variable.
+                % Any other type will do, so long as it forces the argument to
+                % be passed via a regular register.
+                Type = heap_pointer_type
+            ;
+                Type = Type0
+            )
+        ;
+            ArgTypes0 = [_ | _],
+            list.map(make_generic_type(yes), ArgTypes0, ArgTypes),
+            construct_type(TypeCtor, ArgTypes, Type)
+        )
+    ;
+        Type = Type0
+    ).
+
+:- pred add_arg_regs_in_mode(module_info::in, mer_type::in,
+    mer_mode::in, mer_mode::out) is det.
+
+add_arg_regs_in_mode(ModuleInfo, VarType, ArgMode0, ArgMode) :-
+    add_arg_regs_in_mode_2(ModuleInfo, set.init, VarType, ArgMode0, ArgMode).
+
+:- pred add_arg_regs_in_mode_2(module_info::in, set(inst_name)::in, mer_type::in,
+    mer_mode::in, mer_mode::out) is det.
+
+add_arg_regs_in_mode_2(ModuleInfo, Seen, VarType, ArgMode0, ArgMode) :-
+    mode_get_insts(ModuleInfo, ArgMode0, InitialInst0, FinalInst0),
+    add_arg_regs_in_inst(ModuleInfo, Seen, VarType, InitialInst0, InitialInst),
+    add_arg_regs_in_inst(ModuleInfo, Seen, VarType, FinalInst0, FinalInst),
+    % Avoid expanding insts if unchanged.
+    (
+        InitialInst = InitialInst0,
+        FinalInst = FinalInst0
+    ->
+        ArgMode = ArgMode0
+    ;
+        ArgMode = (InitialInst -> FinalInst)
+    ).
+
+:- pred add_arg_regs_in_inst(module_info::in, set(inst_name)::in, mer_type::in,
+    mer_inst::in, mer_inst::out) is det.
+
+add_arg_regs_in_inst(ModuleInfo, Seen0, Type, Inst0, Inst) :-
+    (
+        Inst0 = ground(Uniq, higher_order(PredInstInfo0)),
+        ( type_is_higher_order_details(Type, _, _, _, ArgTypes) ->
+            add_arg_regs_in_pred_inst_info(ModuleInfo, Seen0, ArgTypes,
+                PredInstInfo0, PredInstInfo)
+        ;
+            PredInstInfo = PredInstInfo0
+        ),
+        Inst = ground(Uniq, higher_order(PredInstInfo))
+    ;
+        Inst0 = any(Uniq, higher_order(PredInstInfo0)),
+        ( type_is_higher_order_details(Type, _, _, _, ArgTypes) ->
+            add_arg_regs_in_pred_inst_info(ModuleInfo, Seen0, ArgTypes,
+                PredInstInfo0, PredInstInfo)
+        ;
+            PredInstInfo = PredInstInfo0
+        ),
+        Inst = any(Uniq, higher_order(PredInstInfo))
+    ;
+        Inst0 = bound(Uniq, BoundInsts0),
+        list.map(add_arg_regs_in_bound_inst(ModuleInfo, Seen0, Type),
+            BoundInsts0, BoundInsts),
+        Inst = bound(Uniq, BoundInsts)
+    ;
+        Inst0 = defined_inst(InstName),
+        % XXX is this correct?
+        ( set.contains(Seen0, InstName) ->
+            Inst = Inst0
+        ;
+            set.insert(InstName, Seen0, Seen1),
+            inst_lookup(ModuleInfo, InstName, Inst1),
+            add_arg_regs_in_inst(ModuleInfo, Seen1, Type, Inst1, Inst2),
+            % Avoid expanding insts if unchanged.
+            ( Inst1 = Inst2 ->
+                Inst = Inst0
+            ;
+                Inst = Inst2
+            )
+        )
+    ;
+        ( Inst0 = ground(_, none)
+        ; Inst0 = any(_, none)
+        ; Inst0 = free
+        ; Inst0 = free(_)
+        ; Inst0 = not_reached
+        ; Inst0 = inst_var(_)
+        ; Inst0 = constrained_inst_vars(_, _)
+        ; Inst0 = abstract_inst(_, _)
+        ),
+        Inst = Inst0
+    ).
+
+:- pred add_arg_regs_in_pred_inst_info(module_info::in, set(inst_name)::in,
+    list(mer_type)::in, pred_inst_info::in, pred_inst_info::out) is det.
+
+add_arg_regs_in_pred_inst_info(ModuleInfo, Seen, ArgTypes, PredInstInfo0,
+        PredInstInfo) :-
+    PredInstInfo0 = pred_inst_info(PredOrFunc, Modes0, _, Det),
+    list.map_corresponding(add_arg_regs_in_mode_2(ModuleInfo, Seen),
+        ArgTypes, Modes0, Modes),
+    list.map(ho_arg_reg_for_type, ArgTypes, ArgRegs),
+    PredInstInfo = pred_inst_info(PredOrFunc, Modes, yes(ArgRegs), Det).
+
+:- pred add_arg_regs_in_bound_inst(module_info::in, set(inst_name)::in,
+    mer_type::in, bound_inst::in, bound_inst::out) is det.
+
+add_arg_regs_in_bound_inst(ModuleInfo, Seen, Type, BoundInst0, BoundInst) :-
+    BoundInst0 = bound_functor(ConsId, ArgInsts0),
+    (
+        get_cons_id_non_existential_arg_types(ModuleInfo, Type, ConsId,
+            ArgTypes)
+    ->
+        (
+            ArgTypes = [],
+            % When a foreign type overrides a d.u. type, the inst may have
+            % arguments but the foreign type does not.
+            ArgInsts = ArgInsts0
+        ;
+            ArgTypes = [_ | _],
+            list.map_corresponding(add_arg_regs_in_inst(ModuleInfo, Seen),
+                ArgTypes, ArgInsts0, ArgInsts)
+        )
+    ;
+        % XXX handle existentially typed cons_ids
+        ArgInsts = ArgInsts0
+    ),
+    BoundInst = bound_functor(ConsId, ArgInsts).
+
+:- pred ho_arg_reg_for_type(mer_type::in, ho_arg_reg::out) is det.
+
+ho_arg_reg_for_type(Type, RegType) :-
+    ( Type = float_type ->
+        RegType = ho_arg_reg_f
+    ;
+        RegType = ho_arg_reg_r
+    ).
+
+%-----------------------------------------------------------------------------%
+%
+% Second phase
+%
+
+:- pred insert_reg_wrappers_pred(pred_id::in,
+    module_info::in, module_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_pred(PredId, !ModuleInfo, !Specs) :-
+    module_info_pred_info(!.ModuleInfo, PredId, PredInfo),
+    ProcIds = pred_info_procids(PredInfo),
+    list.foldl2(insert_reg_wrappers_proc(PredId), ProcIds,
+        !ModuleInfo, !Specs).
+
+:- pred insert_reg_wrappers_proc(pred_id::in, proc_id::in,
+    module_info::in, module_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_proc(PredId, ProcId, !ModuleInfo, !Specs) :-
+    module_info_get_preds(!.ModuleInfo, PredTable0),
+    map.lookup(PredTable0, PredId, PredInfo0),
+    pred_info_get_procedures(PredInfo0, ProcTable0),
+    map.lookup(ProcTable0, ProcId, ProcInfo0),
+
+    insert_reg_wrappers_proc_2(ProcInfo0, ProcInfo, PredInfo0, PredInfo1,
+        !ModuleInfo, !Specs),
+
+    pred_info_get_procedures(PredInfo1, ProcTable1),
+    map.det_update(ProcId, ProcInfo, ProcTable1, ProcTable),
+    pred_info_set_procedures(ProcTable, PredInfo1, PredInfo),
+    module_info_get_preds(!.ModuleInfo, PredTable1),
+    map.det_update(PredId, PredInfo, PredTable1, PredTable),
+    module_info_set_preds(PredTable, !ModuleInfo).
+
+:- pred insert_reg_wrappers_proc_2(proc_info::in, proc_info::out,
+    pred_info::in, pred_info::out, module_info::in, module_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_proc_2(!ProcInfo, !PredInfo, !ModuleInfo, !Specs) :-
+    % Grab the appropriate fields from the pred_info and proc_info.
+    pred_info_get_typevarset(!.PredInfo, TypeVarSet0),
+    proc_info_get_headvars(!.ProcInfo, HeadVars),
+    proc_info_get_varset(!.ProcInfo, VarSet0),
+    proc_info_get_vartypes(!.ProcInfo, VarTypes0),
+    proc_info_get_argmodes(!.ProcInfo, ArgModes),
+    proc_info_get_goal(!.ProcInfo, Goal0),
+    proc_info_get_initial_instmap(!.ProcInfo, !.ModuleInfo, InstMap0),
+    proc_info_get_rtti_varmaps(!.ProcInfo, RttiVarMaps0),
+    proc_info_get_inst_varset(!.ProcInfo, InstVarSet0),
+    proc_info_get_has_parallel_conj(!.ProcInfo, HasParallelConj),
+    MustRecomputeNonLocals0 = no,
+    HaveExpandedLambdas0 = no,
+
+    % Process the goal.
+    Info0 = lambda_info(VarSet0, VarTypes0, TypeVarSet0, InstVarSet0,
+        RttiVarMaps0, HasParallelConj, !.PredInfo, !.ModuleInfo,
+        MustRecomputeNonLocals0, HaveExpandedLambdas0),
+    insert_reg_wrappers_proc_body(HeadVars, ArgModes, Goal0, Goal1, InstMap0,
+        Info0, Info1, !Specs),
+    Info1 = lambda_info(VarSet1, VarTypes1, TypeVarSet, _InstVarSet,
+        RttiVarMaps1, _, _PredInfo, !:ModuleInfo, MustRecomputeNonLocals,
+        _HaveExpandedLambdas),
+
+    % Check if we need to requantify.
+    (
+        MustRecomputeNonLocals = yes,
+        implicitly_quantify_clause_body_general(ordinary_nonlocals_no_lambda,
+            HeadVars, _Warnings, Goal1, Goal2, VarSet1, VarSet2, VarTypes1,
+            VarTypes2, RttiVarMaps1, RttiVarMaps2)
+    ;
+        MustRecomputeNonLocals = no,
+        Goal2 = Goal1,
+        VarSet2 = VarSet1,
+        VarTypes2 = VarTypes1,
+        RttiVarMaps2 = RttiVarMaps1
+    ),
+
+    % We recomputed instmap deltas for atomic goals during the second phase,
+    % so we only need to recompute instmap deltas for compound goals now.
+    recompute_instmap_delta(do_not_recompute_atomic_instmap_deltas,
+        Goal2, Goal, VarTypes2, InstVarSet0, InstMap0, !ModuleInfo),
+
+    VarSet = VarSet2,
+    VarTypes = VarTypes2,
+    RttiVarMaps = RttiVarMaps2,
+
+    % Set the new values of the fields in proc_info and pred_info.
+    proc_info_set_goal(Goal, !ProcInfo),
+    proc_info_set_varset(VarSet, !ProcInfo),
+    proc_info_set_vartypes(VarTypes, !ProcInfo),
+    proc_info_set_rtti_varmaps(RttiVarMaps, !ProcInfo),
+    proc_info_set_headvars(HeadVars, !ProcInfo),
+    ensure_all_headvars_are_named(!ProcInfo),
+    pred_info_set_typevarset(TypeVarSet, !PredInfo).
+
+:- pred insert_reg_wrappers_proc_body(list(prog_var)::in, list(mer_mode)::in,
+    hlds_goal::in, hlds_goal::out, instmap::in,
+    lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_proc_body(HeadVars, ArgModes, Goal0, Goal, InstMap0,
+        !Info, !Specs) :-
+    insert_reg_wrappers_goal(Goal0, Goal1, InstMap0, InstMap1, !Info, !Specs),
+    % Ensure that all arguments match their final insts.
+    ModuleInfo = !.Info ^ li_module_info,
+    mode_list_get_final_insts(ModuleInfo, ArgModes, FinalInsts),
+    assoc_list.from_corresponding_lists(HeadVars, FinalInsts,
+        VarsExpectInsts),
+    fix_branching_goal(VarsExpectInsts, Goal1, InstMap1, Goal, !Info,
+        !Specs).
+
+%-----------------------------------------------------------------------------%
+
+:- pred insert_reg_wrappers_goal(hlds_goal::in, hlds_goal::out,
+    instmap::in, instmap::out, lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_goal(Goal0, Goal, !InstMap, !Info, !Specs) :-
+    ( instmap_is_reachable(!.InstMap) ->
+        insert_reg_wrappers_goal_2(Goal0, Goal, !InstMap, !Info, !Specs)
+    ;
+        Goal = Goal0
+    ).
+
+:- pred insert_reg_wrappers_goal_2(hlds_goal::in, hlds_goal::out,
+    instmap::in, instmap::out, lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_goal_2(Goal0, Goal, !InstMap, !Info, !Specs) :-
+    Goal0 = hlds_goal(GoalExpr0, GoalInfo0),
+    (
+        GoalExpr0 = unify(_LHS, _RHS, _Mode, _Unification, _Context),
+        insert_reg_wrappers_unify_goal(GoalExpr0, GoalInfo0, Goal, !InstMap,
+            !Info, !Specs)
+    ;
+        GoalExpr0 = conj(ConjType, Goals0),
+        (
+            ConjType = plain_conj,
+            insert_reg_wrappers_conj(Goals0, Goals, !InstMap, !Info, !Specs)
+        ;
+            ConjType = parallel_conj,
+            list.map_foldl3(insert_reg_wrappers_goal, Goals0, Goals,
+                !InstMap, !Info, !Specs)
+        ),
+        GoalExpr = conj(ConjType, Goals),
+        Goal = hlds_goal(GoalExpr, GoalInfo0),
+        % For some reason, the instmap delta for the conjunction as a whole may
+        % be unreachable even though all of the conjuncts are reachable.
+        % If we leave the instmap reachable that leads to problems when we
+        % try to fix up at the end of branching goals.
+        % XXX figure out why
+        update_instmap_if_unreachable(Goal, !InstMap)
+    ;
+        GoalExpr0 = disj(Goals0),
+        NonLocals = goal_info_get_nonlocals(GoalInfo0),
+        insert_reg_wrappers_disj(Goals0, Goals, NonLocals, !InstMap, !Info,
+            !Specs),
+        GoalExpr = disj(Goals),
+        Goal = hlds_goal(GoalExpr, GoalInfo0)
+    ;
+        GoalExpr0 = switch(Var, CanFail, Cases0),
+        NonLocals = goal_info_get_nonlocals(GoalInfo0),
+        insert_reg_wrappers_switch(Var, Cases0, Cases, NonLocals, !InstMap,
+            !Info, !Specs),
+        GoalExpr = switch(Var, CanFail, Cases),
+        Goal = hlds_goal(GoalExpr, GoalInfo0)
+    ;
+        GoalExpr0 = negation(SubGoal0),
+        insert_reg_wrappers_goal(SubGoal0, SubGoal, !.InstMap, _, !Info,
+            !Specs),
+        GoalExpr = negation(SubGoal),
+        Goal = hlds_goal(GoalExpr, GoalInfo0),
+        update_instmap_if_unreachable(Goal, !InstMap)
+    ;
+        GoalExpr0 = scope(Reason, SubGoal0),
+        ( Reason = from_ground_term(_, from_ground_term_construct) ->
+            % The subgoal cannot construct higher order values.
+            GoalExpr = GoalExpr0,
+            Goal = hlds_goal(GoalExpr, GoalInfo0),
+            update_instmap(Goal, !InstMap)
+        ;
+            insert_reg_wrappers_goal(SubGoal0, SubGoal, !InstMap, !Info,
+                !Specs),
+            GoalExpr = scope(Reason, SubGoal),
+            Goal = hlds_goal(GoalExpr, GoalInfo0)
+        )
+    ;
+        GoalExpr0 = if_then_else(_, _, _, _),
+        NonLocals = goal_info_get_nonlocals(GoalInfo0),
+        insert_reg_wrappers_ite(NonLocals, GoalExpr0, GoalExpr, !InstMap,
+            !Info, !Specs),
+        Goal = hlds_goal(GoalExpr, GoalInfo0)
+    ;
+        GoalExpr0 = plain_call(PredId, ProcId, Args0, Builtin,
+            MaybeUnifyContext, SymName),
+        Context = goal_info_get_context(GoalInfo0),
+        insert_reg_wrappers_plain_call(PredId, ProcId, Args0, Args, WrapGoals,
+            !.InstMap, Context, !Info, !Specs),
+        GoalExpr1 = plain_call(PredId, ProcId, Args, Builtin,
+            MaybeUnifyContext, SymName),
+        finish_call_goal(WrapGoals, GoalExpr1, GoalInfo0, Goal, !InstMap,
+            !Info)
+    ;
+        GoalExpr0 = generic_call(GenericCall, Args0, Modes0, _MaybeArgRegs0,
+            Determinism),
+        (
+            GenericCall = higher_order(CallVar, _Purity, _PredOrFunc, _Arity),
+            Context = goal_info_get_context(GoalInfo0),
+            insert_reg_wrappers_higher_order_call(CallVar, Args0, Args, Modes,
+                ArgsRegs, WrapGoals, !.InstMap, Context, !Info, !Specs),
+            GoalExpr1 = generic_call(GenericCall, Args, Modes, yes(ArgsRegs),
+                Determinism),
+            finish_call_goal(WrapGoals, GoalExpr1, GoalInfo0, Goal, !InstMap,
+                !Info)
+        ;
+            GenericCall = class_method(_TCIVar, MethodNum, ClassId, _),
+            Context = goal_info_get_context(GoalInfo0),
+            insert_reg_wrappers_method_call(ClassId, MethodNum, Args0, Args,
+                Modes0, Modes, WrapGoals, !.InstMap, Context, !Info, !Specs),
+            % Currently we don't use float registers for method calls.
+            MaybeArgRegs = no,
+            GoalExpr1 = generic_call(GenericCall, Args, Modes, MaybeArgRegs,
+                Determinism),
+            finish_call_goal(WrapGoals, GoalExpr1, GoalInfo0, Goal, !InstMap,
+                !Info)
+        ;
+            ( GenericCall = event_call(_)
+            ; GenericCall = cast(_)
+            ),
+            Goal = Goal0,
+            update_instmap(Goal, !InstMap)
+        )
+    ;
+        GoalExpr0 = call_foreign_proc(Attributes, PredId, ProcId, ForeignArgs0,
+            ExtraArgs, MaybeTraceRuntimeCond, PragmaImpl),
+        Context = goal_info_get_context(GoalInfo0),
+        insert_reg_wrappers_foreign_call(PredId, ProcId, ForeignArgs0,
+            ForeignArgs, WrapGoals, !.InstMap, Context, !Info, !Specs),
+        GoalExpr1 = call_foreign_proc(Attributes, PredId, ProcId, ForeignArgs,
+            ExtraArgs, MaybeTraceRuntimeCond, PragmaImpl),
+        finish_call_goal(WrapGoals, GoalExpr1, GoalInfo0, Goal, !InstMap,
+            !Info)
+    ;
+        GoalExpr0 = shorthand(_),
+        % These should have been expanded out by now.
+        unexpected($module, $pred, "shorthand")
+    ).
+
+:- pred finish_call_goal(list(hlds_goal)::in, hlds_goal_expr::in,
+    hlds_goal_info::in, hlds_goal::out, instmap::in, instmap::out,
+    lambda_info::in, lambda_info::out) is det.
+
+finish_call_goal(WrapGoals, CallGoalExpr0, CallGoalInfo0, Goal,
+        !InstMap, !Info) :-
+    % Recompute the instmap_delta for the call goal to reflect changes to the
+    % callee's argument modes that we made in the first phase.
+    CallGoal0 = hlds_goal(CallGoalExpr0, CallGoalInfo0),
+    do_recompute_atomic_instmap_delta(CallGoal0, CallGoal, !.InstMap, !Info),
+    update_instmap(CallGoal, !InstMap),
+    CallGoal = hlds_goal(_, CallGoalInfo),
+    conj_list_to_goal(WrapGoals ++ [CallGoal], CallGoalInfo, Goal).
+
+:- pred do_recompute_atomic_instmap_delta(hlds_goal::in, hlds_goal::out,
+    instmap::in, lambda_info::in, lambda_info::out) is det.
+
+do_recompute_atomic_instmap_delta(Goal0, Goal, InstMap, !Info) :-
+    VarTypes = !.Info ^ li_vartypes,
+    InstVarSet = !.Info ^ li_inst_varset,
+    ModuleInfo0 = !.Info ^ li_module_info,
+    recompute_instmap_delta(recompute_atomic_instmap_deltas, Goal0, Goal,
+        VarTypes, InstVarSet, InstMap, ModuleInfo0, ModuleInfo),
+    !Info ^ li_module_info := ModuleInfo.
+
+:- pred update_instmap_if_unreachable(hlds_goal::in, instmap::in, instmap::out)
+    is det.
+
+update_instmap_if_unreachable(Goal, InstMap0, InstMap) :-
+    Goal = hlds_goal(_, GoalInfo),
+    InstMapDelta = goal_info_get_instmap_delta(GoalInfo),
+    ( instmap_delta_is_unreachable(InstMapDelta) ->
+        init_unreachable(InstMap)
+    ;
+        InstMap = InstMap0
+    ).
+
+%-----------------------------------------------------------------------------%
+
+:- pred insert_reg_wrappers_unify_goal(hlds_goal_expr::in(goal_expr_unify),
+    hlds_goal_info::in, hlds_goal::out, instmap::in, instmap::out,
+    lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_unify_goal(GoalExpr0, GoalInfo0, Goal, !InstMap, !Info,
+        !Specs) :-
+    GoalExpr0 = unify(LHS, RHS0, Mode, Unification0, Context),
+    (
+        Unification0 = construct(CellVar, ConsId, Args0, UniModes0,
+            HowToConstruct, IsUnique, SubInfo),
+        (
+            RHS0 = rhs_functor(_, IsExistConstruct, _)
+        ;
+            RHS0 = rhs_var(_),
+            unexpected($module, $pred, "construct rhs_var")
+        ;
+            RHS0 = rhs_lambda_goal(_, _, _, _, _, _, _, _, _),
+            unexpected($module, $pred, "construct rhs_lambda_goal")
+        ),
+        (
+            Args0 = [],
+            ModuleInfo = !.Info ^ li_module_info,
+            update_construct_goal_instmap_delta(ModuleInfo, CellVar, ConsId,
+                Args0, GoalInfo0, GoalInfo, !InstMap),
+            Goal = hlds_goal(GoalExpr0, GoalInfo)
+        ;
+            Args0 = [_ | _],
+            GoalContext = goal_info_get_context(GoalInfo0),
+            insert_reg_wrappers_construct(CellVar, ConsId, Args0, Args,
+                UniModes0, UniModes, Maybe, !.InstMap, GoalContext,
+                !Info, !Specs),
+            (
+                Maybe = yes(WrapGoals),
+                list.foldl(update_instmap, WrapGoals, !InstMap),
+                ModuleInfo = !.Info ^ li_module_info,
+                update_construct_goal_instmap_delta(ModuleInfo, CellVar,
+                    ConsId, Args, GoalInfo0, GoalInfo1, !InstMap),
+                RHS = rhs_functor(ConsId, IsExistConstruct, Args),
+                Unification = construct(CellVar, ConsId, Args, UniModes,
+                    HowToConstruct, IsUnique, SubInfo),
+                GoalExpr1 = unify(LHS, RHS, Mode, Unification, Context),
+                Goal1 = hlds_goal(GoalExpr1, GoalInfo1),
+                conj_list_to_goal(WrapGoals ++ [Goal1], GoalInfo1, Goal)
+            ;
+                Maybe = no,
+                ModuleInfo = !.Info ^ li_module_info,
+                update_construct_goal_instmap_delta(ModuleInfo, CellVar,
+                    ConsId, Args0, GoalInfo0, GoalInfo, !InstMap),
+                Goal = hlds_goal(GoalExpr0, GoalInfo)
+            )
+        )
+    ;
+        Unification0 = deconstruct(CellVar, ConsId, Args, UniModes0,
+            CanFail, CanCGC),
+        % Update the uni_modes of the deconstruction using the current inst of
+        % the deconstructed var. Recompute the instmap delta from the new
+        % uni_modes if changed.
+        ModuleInfo = !.Info ^ li_module_info,
+        list.length(Args, Arity),
+        instmap_lookup_var(!.InstMap, CellVar, CellVarInst0),
+        inst_expand(ModuleInfo, CellVarInst0, CellVarInst),
+        (
+            get_arg_insts(CellVarInst, ConsId, Arity, ArgInsts),
+            list.map_corresponding(uni_mode_set_rhs_final_inst,
+                ArgInsts, UniModes0, UniModes),
+            UniModes \= UniModes0
+        ->
+            Unification = deconstruct(CellVar, ConsId, Args, UniModes,
+                CanFail, CanCGC),
+            GoalExpr1 = unify(LHS, RHS0, Mode, Unification, Context),
+            Goal1 = hlds_goal(GoalExpr1, GoalInfo0),
+            do_recompute_atomic_instmap_delta(Goal1, Goal, !.InstMap, !Info)
+        ;
+            Goal = hlds_goal(GoalExpr0, GoalInfo0)
+        ),
+        update_instmap(Goal, !InstMap)
+    ;
+        Unification0 = assign(ToVar, FromVar),
+        Delta0 = goal_info_get_instmap_delta(GoalInfo0),
+        instmap_lookup_var(!.InstMap, FromVar, Inst),
+        instmap_delta_set_var(ToVar, Inst, Delta0, Delta),
+        goal_info_set_instmap_delta(Delta, GoalInfo0, GoalInfo1),
+        Goal = hlds_goal(GoalExpr0, GoalInfo1),
+        update_instmap(Goal, !InstMap)
+    ;
+        Unification0 = simple_test(_, _),
+        Goal = hlds_goal(GoalExpr0, GoalInfo0),
+        update_instmap(Goal, !InstMap)
+    ;
+        Unification0 = complicated_unify(_, _, _),
+        unexpected($module, $pred, "complicated_unify")
+    ).
+
+:- pred insert_reg_wrappers_construct(prog_var::in, cons_id::in,
+    list(prog_var)::in, list(prog_var)::out,
+    list(uni_mode)::in, list(uni_mode)::out, maybe(list(hlds_goal))::out,
+    instmap::in, prog_context::in, lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_construct(CellVar, ConsId, OrigVars, Vars,
+        UniModes0, UniModes, Maybe, InstMap0, Context, !Info, !Specs) :-
+    ModuleInfo = !.Info ^ li_module_info,
+    VarTypes = !.Info ^ li_vartypes,
+    map.lookup(VarTypes, CellVar, CellType),
+    (
+        % Replace all type parameters by phony type variables.
+        % See EXAMPLE 3 at the top of the file.
+        type_to_ctor_and_args(CellType, TypeCtor, TypeArgs),
+        TypeArgs = [_ | _],
+        varset.init(TVarSet0),
+        list.map_foldl(replace_type_params_by_dummy_vars, TypeArgs,
+            PhonyTypeArgs, TVarSet0, _TVarSet),
+        construct_type(TypeCtor, PhonyTypeArgs, PhonyCellType),
+        get_cons_id_non_existential_arg_types(ModuleInfo, PhonyCellType,
+            ConsId, PhonyArgTypes),
+        PhonyArgTypes = [_ | _]
+    ->
+        uni_modes_to_modes(UniModes0, LhsModes0, RhsModes0),
+        list.map_corresponding(add_arg_regs_in_mode(ModuleInfo),
+            PhonyArgTypes, LhsModes0, LhsModes),
+        list.map_corresponding(add_arg_regs_in_mode(ModuleInfo),
+            PhonyArgTypes, RhsModes0, RhsModes),
+        modes_to_uni_modes(ModuleInfo, LhsModes, RhsModes, UniModes),
+        mode_list_get_initial_insts(ModuleInfo, RhsModes, ArgInitialInsts),
+        match_args(InstMap0, Context, PhonyArgTypes, ArgInitialInsts,
+            OrigVars, Vars, [], WrapGoals, !Info, !Specs),
+        Maybe = yes(WrapGoals)
+    ;
+        Vars = OrigVars,
+        UniModes = UniModes0,
+        Maybe = no
+    ).
+
+:- pred replace_type_params_by_dummy_vars(mer_type::in, mer_type::out,
+    tvarset::in, tvarset::out) is det.
+
+replace_type_params_by_dummy_vars(Type0, Type, !TVarSet) :-
+    (
+        type_is_higher_order_details(Type0, Purity, PredOrFunc, EvalMethod,
+            ArgTypes0)
+    ->
+        list.map_foldl(replace_type_params_by_dummy_vars, ArgTypes0, ArgTypes,
+            !TVarSet),
+        construct_higher_order_type(Purity, PredOrFunc, EvalMethod, ArgTypes,
+            Type)
+    ;
+        varset.new_var(TVar, !TVarSet),
+        Type = type_variable(TVar, kind_star)
+    ).
+
+:- pred uni_modes_to_modes(list(uni_mode)::in, list(mer_mode)::out,
+    list(mer_mode)::out) is det.
+
+uni_modes_to_modes([], [], []).
+uni_modes_to_modes([UniMode | UniModes], [L | Ls], [R | Rs]) :-
+    UniMode = ((LI - RI) -> (LF - RF)),
+    L = (LI -> LF),
+    R = (RI -> RF),
+    uni_modes_to_modes(UniModes, Ls, Rs).
+
+:- pred update_construct_goal_instmap_delta(module_info::in, prog_var::in,
+    cons_id::in, list(prog_var)::in, hlds_goal_info::in, hlds_goal_info::out,
+    instmap::in, instmap::out) is det.
+
+update_construct_goal_instmap_delta(ModuleInfo, CellVar, ConsId, Args,
+        GoalInfo0, GoalInfo, !InstMap) :-
+    Delta0 = goal_info_get_instmap_delta(GoalInfo0),
+    ( instmap_delta_search_var(Delta0, CellVar, CellInst0) ->
+        rebuild_cell_inst(ModuleInfo, !.InstMap, ConsId, Args,
+            CellInst0, CellInst),
+        instmap_delta_set_var(CellVar, CellInst, Delta0, Delta),
+        goal_info_set_instmap_delta(Delta, GoalInfo0, GoalInfo),
+        apply_instmap_delta_sv(Delta, !InstMap)
+    ;
+        GoalInfo = GoalInfo0,
+        apply_instmap_delta_sv(Delta0, !InstMap)
+    ).
+
+:- pred rebuild_cell_inst(module_info::in, instmap::in, cons_id::in,
+    list(prog_var)::in, mer_inst::in, mer_inst::out) is det.
+
+rebuild_cell_inst(ModuleInfo, InstMap, ConsId, Args, Inst0, Inst) :-
+    (
+        Inst0 = bound(Uniq, BoundInsts0),
+        list.map(rebuild_cell_bound_inst(InstMap, ConsId, Args),
+            BoundInsts0, BoundInsts),
+        Inst = bound(Uniq, BoundInsts)
+    ;
+        Inst0 = ground(Uniq, higher_order(PredInstInfo0)),
+        PredInstInfo0 = pred_inst_info(PredOrFunc, Modes, _, Determinism),
+        ( ConsId = closure_cons(ShroudedPredProcId, _EvalMethod) ->
+            proc(PredId, _) = unshroud_pred_proc_id(ShroudedPredProcId),
+            module_info_pred_info(ModuleInfo, PredId, PredInfo),
+            pred_info_get_arg_types(PredInfo, ArgTypes),
+            list.length(Args, NumArgs),
+            list.det_drop(NumArgs, ArgTypes, MissingArgTypes),
+            list.map(ho_arg_reg_for_type, MissingArgTypes, ArgRegs),
+            PredInstInfo = pred_inst_info(PredOrFunc, Modes, yes(ArgRegs),
+                Determinism),
+            Inst = ground(Uniq, higher_order(PredInstInfo))
+        ;
+            Inst = Inst0
+        )
+    ;
+        % XXX do we need to handle any of these other cases?
+        ( Inst0 = free
+        ; Inst0 = free(_)
+        ; Inst0 = any(_, _)
+        ; Inst0 = ground(_, none)
+        ; Inst0 = not_reached
+        ; Inst0 = constrained_inst_vars(_, _)
+        ; Inst0 = defined_inst(_)
+        ),
+        Inst = Inst0
+    ;
+        Inst0 = abstract_inst(_, _),
+        unexpected($module, $pred, "abstract_inst")
+    ;
+        Inst0 = inst_var(_),
+        unexpected($module, $pred, "inst_var")
+    ).
+
+:- pred rebuild_cell_bound_inst(instmap::in, cons_id::in, list(prog_var)::in,
+    bound_inst::in, bound_inst::out) is det.
+
+rebuild_cell_bound_inst(InstMap, ConsId, Args, Inst0, Inst) :-
+    Inst0 = bound_functor(BoundConsId, ArgInsts0),
+    ( equivalent_cons_ids(ConsId, BoundConsId) ->
+        list.map_corresponding(rebuild_cell_bound_inst_arg(InstMap),
+            Args, ArgInsts0, ArgInsts),
+        Inst = bound_functor(BoundConsId, ArgInsts)
+    ;
+        Inst = Inst0
+    ).
+
+:- pred rebuild_cell_bound_inst_arg(instmap::in, prog_var::in,
+    mer_inst::in, mer_inst::out) is det.
+
+rebuild_cell_bound_inst_arg(InstMap, Var, ArgInst0, ArgInst) :-
+    instmap_lookup_var(InstMap, Var, VarInst),
+    % To cope with LCO.
+    ( VarInst = free_inst ->
+        ArgInst = ArgInst0
+    ;
+        ArgInst = VarInst
+    ).
+
+:- pred uni_mode_set_rhs_final_inst(mer_inst::in, uni_mode::in, uni_mode::out)
+    is det.
+
+uni_mode_set_rhs_final_inst(RF, UniMode0, UniMode) :-
+    UniMode0 = ((LI - RI0) -> (LF - _RF0)),
+    UniMode = ((LI - RI0) -> (LF - RF)).
+
+%-----------------------------------------------------------------------------%
+
+:- pred insert_reg_wrappers_conj(list(hlds_goal)::in, list(hlds_goal)::out,
+    instmap::in, instmap::out, lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_conj([], [], !InstMap, !Info, !Specs).
+insert_reg_wrappers_conj([Goal0 | Goals0], Goals, !InstMap, !Info, !Specs) :-
+    % Flatten the conjunction as we go.
+    insert_reg_wrappers_goal(Goal0, Goal1, !InstMap, !Info, !Specs),
+    goal_to_conj_list(Goal1, Goal1List),
+    insert_reg_wrappers_conj(Goals0, Goals1, !InstMap, !Info, !Specs),
+    list.append(Goal1List, Goals1, Goals).
+
+%-----------------------------------------------------------------------------%
+
+:- pred insert_reg_wrappers_disj(list(hlds_goal)::in, list(hlds_goal)::out,
+    set_of_progvar::in, instmap::in, instmap::out,
+    lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_disj(Goals0, Goals, NonLocals, InstMap0, InstMap, !Info,
+        !Specs) :-
+    list.map2_foldl2(insert_reg_wrappers_disjunct(InstMap0),
+        Goals0, Goals1, InstMaps1, !Info, !Specs),
+    common_instmap_delta(InstMap0, NonLocals, InstMaps1, CommonDelta, !Info),
+    ( instmap_delta_is_reachable(CommonDelta) ->
+        instmap_delta_to_assoc_list(CommonDelta, VarsExpectInsts),
+        list.map_corresponding_foldl2(fix_branching_goal(VarsExpectInsts),
+            Goals1, InstMaps1, Goals, !Info, !Specs)
+    ;
+        Goals = Goals1
+    ),
+    apply_instmap_delta(InstMap0, CommonDelta, InstMap).
+
+:- pred insert_reg_wrappers_disjunct(instmap::in,
+    hlds_goal::in, hlds_goal::out, instmap::out,
+    lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_disjunct(InstMap0, Goal0, Goal, InstMap, !Info, !Specs) :-
+    insert_reg_wrappers_goal(Goal0, Goal, InstMap0, InstMap, !Info, !Specs).
+
+%-----------------------------------------------------------------------------%
+
+:- pred insert_reg_wrappers_switch(prog_var::in,
+    list(case)::in, list(case)::out, set_of_progvar::in,
+    instmap::in, instmap::out, lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_switch(Var, Cases0, Cases, NonLocals, InstMap0, InstMap,
+        !Info, !Specs) :-
+    VarTypes = !.Info ^ li_vartypes,
+    map.lookup(VarTypes, Var, Type),
+    list.map2_foldl2(insert_reg_wrappers_case(Var, Type, InstMap0),
+        Cases0, Cases1, InstMaps1, !Info, !Specs),
+    common_instmap_delta(InstMap0, NonLocals, InstMaps1, CommonDelta, !Info),
+    ( instmap_delta_is_reachable(CommonDelta) ->
+        instmap_delta_to_assoc_list(CommonDelta, VarsExpectInsts),
+        list.map_corresponding_foldl2(fix_case_goal(VarsExpectInsts),
+            Cases1, InstMaps1, Cases, !Info, !Specs)
+    ;
+        Cases = Cases1
+    ),
+    apply_instmap_delta(InstMap0, CommonDelta, InstMap).
+
+:- pred insert_reg_wrappers_case(prog_var::in, mer_type::in, instmap::in,
+    case::in, case::out, instmap::out, lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_case(Var, Type, InstMap0, Case0, Case, InstMap, !Info,
+        !Specs) :-
+    Case0 = case(MainConsId, OtherConsIds, Goal0),
+    ModuleInfo0 = !.Info ^ li_module_info,
+    bind_var_to_functors(Var, Type, MainConsId, OtherConsIds,
+        InstMap0, InstMap1, ModuleInfo0, ModuleInfo1),
+    !Info ^ li_module_info := ModuleInfo1,
+    insert_reg_wrappers_goal(Goal0, Goal, InstMap1, InstMap, !Info, !Specs),
+    Case = case(MainConsId, OtherConsIds, Goal).
+
+%-----------------------------------------------------------------------------%
+
+:- pred insert_reg_wrappers_ite(set_of_progvar::in,
+    hlds_goal_expr::in(goal_expr_ite), hlds_goal_expr::out(goal_expr_ite),
+    instmap::in, instmap::out, lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_ite(NonLocals, GoalExpr0, GoalExpr, InstMap0, InstMap,
+        !Info, !Specs) :-
+    GoalExpr0 = if_then_else(Vars, Cond0, Then0, Else0),
+    insert_reg_wrappers_goal(Cond0, Cond, InstMap0, InstMapCond,
+        !Info, !Specs),
+    insert_reg_wrappers_goal(Then0, Then1, InstMapCond, InstMapThen,
+        !Info, !Specs),
+    insert_reg_wrappers_goal(Else0, Else1, InstMap0, InstMapElse,
+        !Info, !Specs),
+
+    common_instmap_delta(InstMap0, NonLocals, [InstMapThen, InstMapElse],
+        CommonDelta, !Info),
+    ( instmap_delta_is_reachable(CommonDelta) ->
+        instmap_delta_to_assoc_list(CommonDelta, VarsExpectInsts),
+        fix_branching_goal(VarsExpectInsts, Then1, InstMapThen, Then,
+            !Info, !Specs),
+        fix_branching_goal(VarsExpectInsts, Else1, InstMapElse, Else,
+            !Info, !Specs)
+    ;
+        Then = Then1,
+        Else = Else1
+    ),
+    apply_instmap_delta(InstMap0, CommonDelta, InstMap),
+    GoalExpr = if_then_else(Vars, Cond, Then, Else).
+
+%-----------------------------------------------------------------------------%
+
+:- pred insert_reg_wrappers_plain_call(pred_id::in, proc_id::in,
+    list(prog_var)::in, list(prog_var)::out, list(hlds_goal)::out,
+    instmap::in, prog_context::in, lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_plain_call(PredId, ProcId, Vars0, Vars, WrapGoals,
+        InstMap0, Context, !Info, !Specs) :-
+    ModuleInfo = !.Info ^ li_module_info,
+    module_info_pred_proc_info(ModuleInfo, PredId, ProcId, PredInfo, ProcInfo),
+    pred_info_get_arg_types(PredInfo, ArgTypes),
+    proc_info_get_argmodes(ProcInfo, ArgModes),
+    match_args_for_call(InstMap0, Context, ArgTypes, ArgModes, Vars0, Vars,
+        WrapGoals, !Info, !Specs).
+
+:- pred insert_reg_wrappers_higher_order_call(prog_var::in,
+    list(prog_var)::in, list(prog_var)::out, list(mer_mode)::out,
+    list(ho_arg_reg)::out, list(hlds_goal)::out, instmap::in, prog_context::in,
+    lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_higher_order_call(CallVar, Vars0, Vars, ArgModes, ArgRegs,
+        WrapGoals, InstMap0, Context, !Info, !Specs) :-
+    ModuleInfo = !.Info ^ li_module_info,
+    VarTypes = !.Info ^ li_vartypes,
+    map.lookup(VarTypes, CallVar, CallVarType),
+    instmap_lookup_var(InstMap0, CallVar, CallVarInst),
+    type_is_higher_order_details_det(CallVarType, _, PredOrFunc, _, ArgTypes),
+    list.length(ArgTypes, Arity),
+    lookup_pred_inst_info(ModuleInfo, CallVarInst, PredOrFunc, Arity,
+        CallVarPredInstInfo),
+    CallVarPredInstInfo = pred_inst_info(_, ArgModes, _, _),
+    get_ho_arg_regs(CallVarPredInstInfo, ArgTypes, ArgRegs),
+    match_args_for_call(InstMap0, Context, ArgTypes, ArgModes, Vars0, Vars,
+        WrapGoals, !Info, !Specs).
+
+:- pred insert_reg_wrappers_method_call(class_id::in, int::in,
+    list(prog_var)::in, list(prog_var)::out,
+    list(mer_mode)::in, list(mer_mode)::out, list(hlds_goal)::out,
+    instmap::in, prog_context::in, lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_method_call(ClassId, MethodNum, Vars0, Vars,
+        Modes0, Modes, WrapGoals, InstMap0, Context, !Info, !Specs) :-
+    ModuleInfo = !.Info ^ li_module_info,
+    module_info_get_class_table(ModuleInfo, Classes),
+    map.lookup(Classes, ClassId, ClassDefn),
+    ClassInterface = ClassDefn ^ class_hlds_interface,
+    list.det_index1(ClassInterface, MethodNum, ClassProc),
+    ClassProc = hlds_class_proc(PredId, ProcId),
+    module_info_pred_proc_info(ModuleInfo, PredId, ProcId, PredInfo, ProcInfo),
+    pred_info_get_arg_types(PredInfo, ArgTypes),
+    proc_info_get_argmodes(ProcInfo, ProcArgModes),
+
+    % We need to update the modes in the generic_call using the modes from the
+    % proc_info because we may have changed them during the first phase.
+    % Vars0 is missing the typeclass_info variable, whereas the ArgTypes and
+    % ProcArgModes *do* include the respective data for that variable.
+    % The problem is, we don't know which argument position the typeclass_info
+    % variable appears at, and it's not easy to find out.
+    %
+    % match_args_for_call has no effect on typeclass_info nor type_info
+    % variables, and we know those variables must appear at the head of the
+    % argument list. Therefore we can just call match_args_for_call on the
+    % non typeclass_info/type_info variables at the tail of Vars0, using the
+    % corresponding tails of ArgTypes and ProcArgModes.
+
+    take_non_rtti_types_from_tail(ArgTypes, EndTypes),
+    list.length(EndTypes, N),
+    split_list_from_end(N, Vars0, StartVars, EndVars0),
+    split_list_from_end(N, Modes0, StartModes, _),
+    split_list_from_end(N, ProcArgModes, _, EndProcArgModes),
+
+    match_args_for_call(InstMap0, Context, EndTypes, EndProcArgModes,
+        EndVars0, EndVars, WrapGoals, !Info, !Specs),
+
+    Vars = StartVars ++ EndVars,
+    Modes = StartModes ++ EndProcArgModes.
+
+:- pred take_non_rtti_types_from_tail(list(mer_type)::in, list(mer_type)::out)
+    is det.
+
+take_non_rtti_types_from_tail([], []).
+take_non_rtti_types_from_tail([Type | Types0], Types) :-
+    take_non_rtti_types_from_tail(Types0, Types1),
+    (
+        ( polymorphism.type_is_typeclass_info(Type)
+        ; Type = type_info_type
+        )
+    ->
+        Types = Types1
+    ;
+        Types = [Type | Types1]
+    ).
+
+:- pred insert_reg_wrappers_foreign_call(pred_id::in, proc_id::in,
+    list(foreign_arg)::in, list(foreign_arg)::out, list(hlds_goal)::out,
+    instmap::in, prog_context::in, lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+insert_reg_wrappers_foreign_call(PredId, ProcId, ForeignArgs0, ForeignArgs,
+        WrapGoals, InstMap0, Context, !Info, !Specs) :-
+    Vars0 = list.map(foreign_arg_var, ForeignArgs0),
+    insert_reg_wrappers_plain_call(PredId, ProcId, Vars0, Vars, WrapGoals,
+        InstMap0, Context, !Info, !Specs),
+    list.map_corresponding(set_foreign_arg_var, Vars, ForeignArgs0,
+        ForeignArgs).
+
+:- pred set_foreign_arg_var(prog_var::in, foreign_arg::in, foreign_arg::out)
+    is det.
+
+set_foreign_arg_var(Var, !ForeignArg) :-
+    !ForeignArg ^ arg_var := Var.
+
+%-----------------------------------------------------------------------------%
+
+:- pred match_args_for_call(instmap::in, prog_context::in, list(mer_type)::in,
+    list(mer_mode)::in, list(prog_var)::in, list(prog_var)::out,
+    list(hlds_goal)::out, lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+match_args_for_call(InstMap0, Context, ArgTypes, ArgModes, OrigVars, Vars,
+        WrapGoals, !Info, !Specs) :-
+    ModuleInfo = !.Info ^ li_module_info,
+    mode_list_get_initial_insts(ModuleInfo, ArgModes, InitialInsts),
+    match_args(InstMap0, Context, ArgTypes, InitialInsts, OrigVars, Vars,
+        [], WrapGoals, !Info, !Specs).
+
+:- pred match_args(instmap::in, prog_context::in, list(mer_type)::in,
+    list(mer_inst)::in, list(prog_var)::in, list(prog_var)::out,
+    list(hlds_goal)::in, list(hlds_goal)::out,
+    lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+match_args(InstMap0, Context, ArgTypes, Insts, OrigVars, Vars, !WrapGoals,
+        !Info, !Specs) :-
+    (
+        ArgTypes = [],
+        Insts = [],
+        OrigVars = []
+    ->
+        Vars = []
+    ;
+        ArgTypes = [AT | ATs],
+        Insts = [I | Is],
+        OrigVars = [OV | OVs]
+    ->
+        match_arg(InstMap0, Context, AT, I, OV, V, !WrapGoals, !Info, !Specs),
+        match_args(InstMap0, Context, ATs, Is, OVs, Vs, !WrapGoals, !Info,
+            !Specs),
+        Vars = [V | Vs]
+    ;
+        unexpected($module, $pred, "length mismatch")
+    ).
+
+:- pred match_arg(instmap::in, prog_context::in, mer_type::in, mer_inst::in,
+    prog_var::in, prog_var::out, list(hlds_goal)::in, list(hlds_goal)::out,
+    lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+match_arg(InstMapBefore, Context, ArgType, ExpectInst, OrigVar, Var,
+        !WrapGoals, !Info, !Specs) :-
+    ModuleInfo = !.Info ^ li_module_info,
+    VarTypes = !.Info ^ li_vartypes,
+    (
+        inst_is_bound(ModuleInfo, ExpectInst),
+        type_is_higher_order_details(ArgType, _, PredOrFunc, _,
+            ArgPredArgTypes),
+        ArgPredArgTypes = [_ | _]
+    ->
+        map.lookup(VarTypes, OrigVar, OrigVarType),
+        type_is_higher_order_details_det(OrigVarType, _, _, _,
+            OrigPredArgTypes),
+        list.length(OrigPredArgTypes, Arity),
+        (
+            search_pred_inst_info(ModuleInfo, ExpectInst, PredOrFunc, Arity,
+                ExpectPredInstInfo)
+        ->
+            instmap_lookup_var(InstMapBefore, OrigVar, OrigVarInst),
+            lookup_pred_inst_info(ModuleInfo, OrigVarInst, PredOrFunc, Arity,
+                OrigPredInstInfo),
+            get_ho_arg_regs(ExpectPredInstInfo, ArgPredArgTypes,
+                ExpectArgRegs),
+            get_ho_arg_regs(OrigPredInstInfo, OrigPredArgTypes,
+                OrigArgRegs),
+            ( OrigArgRegs = ExpectArgRegs ->
+                Var = OrigVar
+            ;
+                create_reg_wrapper(OrigVar, OrigPredInstInfo, ExpectArgRegs,
+                    OrigArgRegs, Context, Var, UnifyGoal, !Info),
+                list.cons(UnifyGoal, !WrapGoals)
+            )
+        ;
+            PredInfo = !.Info ^ li_pred_info,
+            VarSet = !.Info ^ li_varset,
+            maybe_report_missing_pred_inst(PredInfo, VarSet, OrigVar, Context,
+                OrigPredArgTypes, ArgPredArgTypes, !Specs),
+            Var = OrigVar
+        )
+    ;
+        Var = OrigVar
+    ).
+
+:- pred lookup_pred_inst_info(module_info::in, mer_inst::in, pred_or_func::in,
+    int::in, pred_inst_info::out) is det.
+
+lookup_pred_inst_info(ModuleInfo, Inst, PredOrFunc, Arity, PredInstInfo) :-
+    (
+        search_pred_inst_info(ModuleInfo, Inst, PredOrFunc, Arity,
+            PredInstInfo0)
+    ->
+        PredInstInfo = PredInstInfo0
+    ;
+        unexpected($module, $pred, "no higher order inst")
+    ).
+
+:- pred search_pred_inst_info(module_info::in, mer_inst::in, pred_or_func::in,
+    int::in, pred_inst_info::out) is semidet.
+
+search_pred_inst_info(ModuleInfo, Inst, PredOrFunc, Arity, PredInstInfo) :-
+    ( search_pred_inst_info_2(ModuleInfo, Inst, PredInstInfo0) ->
+        PredInstInfo = PredInstInfo0
+    ;
+        PredOrFunc = pf_function,
+        PredInstInfo = pred_inst_info_standard_func_mode(Arity)
+    ).
+
+:- pred search_pred_inst_info_2(module_info::in, mer_inst::in,
+    pred_inst_info::out) is semidet.
+
+search_pred_inst_info_2(ModuleInfo, Inst, PredInstInfo) :-
+    (
+        Inst = any(_, higher_order(PredInstInfo))
+    ;
+        Inst = ground(_, higher_order(PredInstInfo))
+    ;
+        Inst = defined_inst(InstName),
+        inst_lookup(ModuleInfo, InstName, InstB),
+        search_pred_inst_info_2(ModuleInfo, InstB, PredInstInfo)
+    ).
+
+:- pred get_ho_arg_regs(pred_inst_info::in, list(mer_type)::in,
+    list(ho_arg_reg)::out) is det.
+
+get_ho_arg_regs(PredInstInfo, ArgTypes, ArgRegs) :-
+    PredInstInfo = pred_inst_info(_, _, MaybeArgRegs, _),
+    (
+        MaybeArgRegs = yes(ArgRegs)
+    ;
+        MaybeArgRegs = no,
+        list.map(ho_arg_reg_for_type, ArgTypes, ArgRegs)
+    ).
+
+    % Emit an error if a higher-order inst cannot be found for the variable,
+    % but only if any of its arguments are floats.  We want to avoid reporting
+    % errors for code which simply copies a higher-order term when updating an
+    % unrelated structure field.
+    %
+    % XXX improve the conditions for which an error is reported
+    %
+:- pred maybe_report_missing_pred_inst(pred_info::in, prog_varset::in,
+    prog_var::in, prog_context::in, list(mer_type)::in, list(mer_type)::in,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+maybe_report_missing_pred_inst(PredInfo, VarSet, Var, Context,
+        ArgTypesA, ArgTypesB, !Specs) :-
+    (
+        ( list.member(float_type, ArgTypesA)
+        ; list.member(float_type, ArgTypesB)
+        ),
+        % Ignore special predicates.
+        not pred_info_get_origin(PredInfo, origin_special_pred(_))
+    ->
+        Spec = report_missing_higher_order_inst(PredInfo, VarSet, Var,
+            Context),
+        list.cons(Spec, !Specs)
+    ;
+        true
+    ).
+
+:- func report_missing_higher_order_inst(pred_info, prog_varset, prog_var,
+    prog_context) = error_spec.
+
+report_missing_higher_order_inst(PredInfo, VarSet, Var, Context) = Spec :-
+    PredPieces = describe_one_pred_info_name(should_module_qualify, PredInfo),
+    varset.lookup_name(VarSet, Var, VarName),
+    InPieces = [words("In") | PredPieces] ++ [suffix(":"), nl],
+    ErrorPieces = [
+        words("error: missing higher-order inst for variable"), quote(VarName),
+        suffix("."), nl
+    ],
+    VerbosePieces = [
+        words("Please provide the higher-order inst to ensure correctness"),
+        words("of the generated code in this grade.")
+    ],
+    Msg = simple_msg(Context, [
+        always(InPieces),
+        always(ErrorPieces),
+        verbose_only(VerbosePieces)
+    ]),
+    Spec = error_spec(severity_error, phase_code_gen, [Msg]).
+
+%-----------------------------------------------------------------------------%
+
+    % At the end of a branching goal, non-local variables bound to higher-order
+    % terms must agree on the calling convention across all branches.
+    % When there is disagreement for a variable, we simply choose one calling
+    % convention, then create wrappers in each branch to make that conform to
+    % the chosen calling convention.
+    %
+    % Currently the choice of calling convention is arbitrary and may lead to
+    % more wrappers, or more boxing of floats, than if we had chosen another
+    % calling convention.  This kind of code should be rare.
+    %
+:- pred common_instmap_delta(instmap::in, set_of_progvar::in,
+    list(instmap)::in, instmap_delta::out, lambda_info::in, lambda_info::out)
+    is det.
+
+common_instmap_delta(InstMap0, NonLocals, InstMaps, CommonDelta, !Info) :-
+    list.filter_map(
+        (pred(InstMap::in, Delta::out) is semidet :-
+            instmap_is_reachable(InstMap),
+            compute_instmap_delta(InstMap0, InstMap, NonLocals, Delta)
+        ), InstMaps, InstMapDeltas),
+    (
+        InstMapDeltas = [],
+        instmap_delta_init_unreachable(CommonDelta)
+    ;
+        InstMapDeltas = [_ | _],
+        VarTypes = !.Info ^ li_vartypes,
+        ModuleInfo0 = !.Info ^ li_module_info,
+        merge_instmap_deltas(InstMap0, NonLocals, VarTypes, InstMapDeltas,
+            CommonDelta, ModuleInfo0, ModuleInfo),
+        !Info ^ li_module_info := ModuleInfo
+    ).
+
+:- pred fix_branching_goal(assoc_list(prog_var, mer_inst)::in,
+    hlds_goal::in, instmap::in, hlds_goal::out,
+    lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+fix_branching_goal(VarsExpectInsts, Goal0, GoalInstMap0, Goal, !Info,
+        !Specs) :-
+    ( instmap_is_reachable(GoalInstMap0) ->
+        % GoalInstMap0 is the instmap at the end of Goal0.
+        Goal0 = hlds_goal(_, GoalInfo0),
+        Context = goal_info_get_context(GoalInfo0),
+        match_vars_insts(VarsExpectInsts, GoalInstMap0, Context,
+            map.init, Renaming, [], WrapGoals0, !Info, !Specs),
+        (
+            WrapGoals0 = [],
+            Goal = Goal0
+        ;
+            WrapGoals0 = [_ | _],
+            conjoin_goal_and_goal_list(Goal0, WrapGoals0, Goal1),
+            rename_some_vars_in_goal(Renaming, Goal1, Goal)
+        )
+    ;
+        Goal = Goal0
+    ).
+
+:- pred fix_case_goal(assoc_list(prog_var, mer_inst)::in,
+    case::in, instmap::in, case::out, lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+fix_case_goal(VarsExpectInsts, Case0, GoalInstMap0, Case, !Info, !Specs) :-
+    Case0 = case(MainConsId, OtherConsIds, Goal0),
+    fix_branching_goal(VarsExpectInsts, Goal0, GoalInstMap0, Goal, !Info,
+        !Specs),
+    Case = case(MainConsId, OtherConsIds, Goal).
+
+:- pred match_vars_insts(assoc_list(prog_var, mer_inst)::in, instmap::in,
+    prog_context::in, prog_var_renaming::in, prog_var_renaming::out,
+    list(hlds_goal)::in, list(hlds_goal)::out,
+    lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+match_vars_insts(VarsExpectInsts, InstMap0, Context, !Renaming, !WrapGoals,
+        !Info, !Specs) :-
+    (
+        VarsExpectInsts = []
+    ;
+        VarsExpectInsts = [Var - Inst | Tail],
+        match_var_inst(Var, Inst, InstMap0, Context,
+            !Renaming, !WrapGoals, !Info, !Specs),
+        match_vars_insts(Tail, InstMap0, Context,
+            !Renaming, !WrapGoals, !Info, !Specs)
+    ).
+
+:- pred match_var_inst(prog_var::in, mer_inst::in, instmap::in,
+    prog_context::in, prog_var_renaming::in, prog_var_renaming::out,
+    list(hlds_goal)::in, list(hlds_goal)::out,
+    lambda_info::in, lambda_info::out,
+    list(error_spec)::in, list(error_spec)::out) is det.
+
+match_var_inst(Var, ExpectInst, InstMap0, Context, !Renaming, !WrapGoals,
+        !Info, !Specs) :-
+    ModuleInfo = !.Info ^ li_module_info,
+    VarTypes = !.Info ^ li_vartypes,
+    ( inst_is_free(ModuleInfo, ExpectInst) ->
+        true
+    ;
+        map.lookup(VarTypes, Var, VarType),
+        match_arg(InstMap0, Context, VarType, ExpectInst, Var, SubstVar,
+            [], WrapGoals, !Info, !Specs),
+        ( Var = SubstVar ->
+            true
+        ;
+            map.det_insert(Var, SubstVar, !Renaming),
+            map.det_insert(SubstVar, Var, !Renaming),
+            list.append(WrapGoals, !WrapGoals)
+        )
+    ).
+
+%-----------------------------------------------------------------------------%
+
+:- pred create_reg_wrapper(prog_var::in, pred_inst_info::in,
+    list(ho_arg_reg)::in, list(ho_arg_reg)::in, prog_context::in,
+    prog_var::out, hlds_goal::out, lambda_info::in, lambda_info::out) is det.
+
+create_reg_wrapper(OrigVar, OrigVarPredInstInfo, OuterArgRegs, InnerArgRegs,
+        Context, Var, UnifyGoal, !Info) :-
+    VarSet0 = !.Info ^ li_varset,
+    VarTypes0 = !.Info ^ li_vartypes,
+    ModuleInfo0 = !.Info ^ li_module_info,
+
+    map.lookup(VarTypes0, OrigVar, OrigVarType),
+    type_is_higher_order_details_det(OrigVarType, Purity, PredOrFunc,
+        EvalMethod, PredArgTypes),
+
+    % Create variables for the head variables of the wrapper procedure.
+    % These are also the variables in the call in the procedure body.
+    create_fresh_vars(PredArgTypes, CallVars, VarSet0, VarSet1,
+        VarTypes0, VarTypes1),
+    list.length(CallVars, Arity),
+
+    % Create the in the body of the wrapper procedure.
+    CallVar = OrigVar,
+    OrigVarPredInstInfo = pred_inst_info(_, ArgModes, _, Determinism),
+    GenericCall = higher_order(CallVar, Purity, PredOrFunc, Arity),
+    CallGoalExpr = generic_call(GenericCall, CallVars, ArgModes,
+        yes(InnerArgRegs), Determinism),
+    CallNonLocals = set_of_var.list_to_set([CallVar | CallVars]),
+    instmap_delta_from_mode_list(CallVars, ArgModes, ModuleInfo0,
+        CallInstMapDelta),
+    goal_info_init(CallNonLocals, CallInstMapDelta, Determinism, Purity,
+        Context, CallGoalInfo),
+    CallGoal = hlds_goal(CallGoalExpr, CallGoalInfo),
+
+    % Create the replacement variable Var.
+    varset.new_var(Var, VarSet1, VarSet),
+    map.det_insert(Var, OrigVarType, VarTypes1, VarTypes),
+    !Info ^ li_varset := VarSet,
+    !Info ^ li_vartypes := VarTypes,
+
+    % RegR_HeadVars are the wrapper procedure's headvars which must use regular
+    % registers.
+    list.foldl_corresponding(make_reg_r_headvars(VarTypes),
+        CallVars, OuterArgRegs, set_of_var.init, RegR_HeadVars),
+
+    % Create the wrapper procedure.
+    DummyPPId = proc(invalid_pred_id, invalid_proc_id),
+    DummyShroudedPPId = shroud_pred_proc_id(DummyPPId),
+    ConsId = closure_cons(DummyShroudedPPId, EvalMethod),
+    InInst = ground(shared, higher_order(OrigVarPredInstInfo)),
+    UniModes0 = [(InInst - InInst) -> (InInst - InInst)],
+    Unification0 = construct(Var, ConsId, LambdaNonLocals, UniModes0,
+        construct_dynamically, cell_is_shared, no_construct_sub_info),
+    LambdaNonLocals = [CallVar],
+    lambda.expand_lambda(Purity, ho_ground, PredOrFunc, EvalMethod,
+        reg_wrapper_proc(RegR_HeadVars), CallVars, ArgModes, Determinism,
+        LambdaNonLocals, CallGoal, Unification0, Functor, Unification, !Info),
+
+    % Create the unification goal for Var.
+    UnifyMode = (out_mode - in_mode),
+    MainContext = umc_implicit("reg_wrapper"),
+    UnifyContext = unify_context(MainContext, []),
+    UnifyGoalExpr = unify(Var, Functor, UnifyMode, Unification, UnifyContext),
+    UnifyNonLocals = set_of_var.make_singleton(Var),
+    UnifyPredInstInfo = pred_inst_info(PredOrFunc, ArgModes, yes(OuterArgRegs),
+        Determinism),
+    UnifyPredVarInst = ground(shared, higher_order(UnifyPredInstInfo)),
+    UnifyInstMapDelta = instmap_delta_from_assoc_list([
+        Var - UnifyPredVarInst]),
+    goal_info_init(UnifyNonLocals, UnifyInstMapDelta, detism_det,
+        purity_pure, UnifyGoalInfo),
+    UnifyGoal = hlds_goal(UnifyGoalExpr, UnifyGoalInfo),
+
+    !Info ^ li_recompute_nonlocals := yes.
+
+:- pred create_fresh_vars(list(mer_type)::in, list(prog_var)::out,
+    prog_varset::in, prog_varset::out, vartypes::in, vartypes::out) is det.
+
+create_fresh_vars([], [], !VarSet, !VarTypes).
+create_fresh_vars([Type | Types], [Var | Vars], !VarSet, !VarTypes) :-
+    varset.new_var(Var, !VarSet),
+    map.det_insert(Var, Type, !VarTypes),
+    create_fresh_vars(Types, Vars, !VarSet, !VarTypes).
+
+:- pred make_reg_r_headvars(vartypes::in, prog_var::in, ho_arg_reg::in,
+    set_of_progvar::in, set_of_progvar::out) is det.
+
+make_reg_r_headvars(VarTypes, Var, RegType, !RegR_HeadVars) :-
+    (
+        RegType = ho_arg_reg_r,
+        map.lookup(VarTypes, Var, VarType),
+        ( VarType = float_type ->
+            set_of_var.insert(Var, !RegR_HeadVars)
+        ;
+            true
+        )
+    ;
+        RegType = ho_arg_reg_f
+    ).
+
+%-----------------------------------------------------------------------------%
+
+:- pred split_list_from_end(int::in, list(T)::in, list(T)::out, list(T)::out)
+    is det.
+
+split_list_from_end(EndLen, List, Start, End) :-
+    list.length(List, Len),
+    StartLen = Len - EndLen,
+    ( StartLen = 0 ->
+        Start = [],
+        End = List
+    ; StartLen > 0 ->
+        list.det_split_list(StartLen, List, Start, End)
+    ;
+        unexpected($module, $pred, "list too short")
+    ).
+
+%-----------------------------------------------------------------------------%
+:- end_module transform_hlds.float_regs.
+%-----------------------------------------------------------------------------%
diff --git a/compiler/follow_code.m b/compiler/follow_code.m
index 5a7bc8f..890163b 100644
--- a/compiler/follow_code.m
+++ b/compiler/follow_code.m
@@ -149,7 +149,7 @@ move_follow_code_in_goal(Goal0, Goal, RttiVarMaps, !Changed) :-
         GoalExpr = scope(Reason, SubGoal),
         Goal = hlds_goal(GoalExpr, GoalInfo)
     ;
-        ( GoalExpr0 = generic_call(_, _, _, _)
+        ( GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
diff --git a/compiler/follow_vars.m b/compiler/follow_vars.m
index 9097455..84da20c 100644
--- a/compiler/follow_vars.m
+++ b/compiler/follow_vars.m
@@ -250,7 +250,7 @@ find_follow_vars_in_goal_expr(GoalExpr0, GoalExpr, !GoalInfo,
                 !:FollowVarsMap, !:NextNonReservedR, !:NextNonReservedF)
         )
     ;
-        GoalExpr0 = generic_call(GenericCall, Args, Modes, Det),
+        GoalExpr0 = generic_call(GenericCall, Args, Modes, MaybeArgRegs, Det),
         GoalExpr = GoalExpr0,
         (
             GenericCall = cast(_)
@@ -262,20 +262,27 @@ find_follow_vars_in_goal_expr(GoalExpr0, GoalExpr, !GoalInfo,
             ),
             determinism_to_code_model(Det, CodeModel),
             map.apply_to_list(Args, VarTypes, Types),
-            % Generic calls use rN registers for all arguments.
-            make_arg_infos(Types, Modes, CodeModel, ModuleInfo, reg_r,
+            generic_call_arg_reg_types(ModuleInfo, VarTypes, GenericCall,
+                Args, MaybeArgRegs, ArgRegTypes),
+            make_arg_infos(Types, Modes, ArgRegTypes, CodeModel, ModuleInfo,
                 ArgInfos),
             assoc_list.from_corresponding_lists(Args, ArgInfos, ArgsInfos),
+            % XXX use arg_info.generic_call_arg_reg_types?
             arg_info.partition_args(ArgsInfos, InVarInfos, _),
-            assoc_list.keys(InVarInfos, InVars),
+            list.filter(is_reg_r_arg, InVarInfos, InVarInfosR, InVarInfosF),
+            assoc_list.keys(InVarInfosR, InVarsR),
+            assoc_list.keys(InVarInfosF, InVarsF),
             module_info_get_globals(ModuleInfo, Globals),
             call_gen.generic_call_info(Globals, GenericCall,
-                length(InVars), _, SpecifierArgInfos, FirstInput, _),
+                length(InVarsR), length(InVarsF), _, SpecifierArgInfos,
+                FirstInputR, _),
+            FirstInputF = 1,
             find_follow_vars_from_arginfo(SpecifierArgInfos,
                 map.init, !:FollowVarsMap, 1, _, 1, _),
-            find_follow_vars_from_generic_in_vars(InVars, !FollowVarsMap,
-                FirstInput, !:NextNonReservedR),
-            !:NextNonReservedF = 1
+            find_follow_vars_from_generic_in_vars(reg_r, InVarsR,
+                !FollowVarsMap, FirstInputR, !:NextNonReservedR),
+            find_follow_vars_from_generic_in_vars(reg_f, InVarsF,
+                !FollowVarsMap, FirstInputF, !:NextNonReservedF)
         )
     ;
         GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _),
@@ -286,6 +293,10 @@ find_follow_vars_in_goal_expr(GoalExpr0, GoalExpr, !GoalInfo,
         unexpected($module, $pred, "shorthand")
     ).
 
+:- pred is_reg_r_arg(pair(prog_var, arg_info)::in) is semidet.
+
+is_reg_r_arg(_ - arg_info(reg(reg_r, _), _)).
+
 %-----------------------------------------------------------------------------%
 
 :- pred find_follow_vars_in_call(pred_id::in, proc_id::in, list(prog_var)::in,
@@ -338,16 +349,14 @@ find_follow_vars_from_arginfo([ArgVar - arg_info(ArgLoc, Mode) | ArgsInfos],
 
 %-----------------------------------------------------------------------------%
 
-    % Generic calls use regular registers (rN) only.
-    %
-:- pred find_follow_vars_from_generic_in_vars(list(prog_var)::in,
+:- pred find_follow_vars_from_generic_in_vars(reg_type::in, list(prog_var)::in,
     abs_follow_vars_map::in, abs_follow_vars_map::out, int::in, int::out)
     is det.
 
-find_follow_vars_from_generic_in_vars([], !FollowVarsMap, !NextRegR).
-find_follow_vars_from_generic_in_vars([InVar | InVars], !FollowVarsMap,
-        !NextRegR) :-
-    Locn = abs_reg(reg_r, !.NextRegR),
+find_follow_vars_from_generic_in_vars(_RegType, [], !FollowVarsMap, !NextReg).
+find_follow_vars_from_generic_in_vars(RegType, [InVar | InVars],
+        !FollowVarsMap, !NextReg) :-
+    Locn = abs_reg(RegType, !.NextReg),
     ( map.insert(InVar, Locn, !FollowVarsMap) ->
         true    % FollowVarsMap is updated
     ;
@@ -358,8 +367,9 @@ find_follow_vars_from_generic_in_vars([InVar | InVars], !FollowVarsMap,
         % variable.
         true    % FollowVarsMap is not updated
     ),
-    !:NextRegR = !.NextRegR + 1,
-    find_follow_vars_from_generic_in_vars(InVars, !FollowVarsMap, !NextRegR).
+    !:NextReg = !.NextReg + 1,
+    find_follow_vars_from_generic_in_vars(RegType, InVars, !FollowVarsMap,
+        !NextReg).
 
 %-----------------------------------------------------------------------------%
 
diff --git a/compiler/format_call.m b/compiler/format_call.m
index 4baa000..e6124cc 100644
--- a/compiler/format_call.m
+++ b/compiler/format_call.m
@@ -664,7 +664,7 @@ format_call_traverse_conj(ModuleInfo, [Goal | Goals], CurId, !FormatCallSites,
                 !FormatCallSites, !Counter, !ConjMaps, !PredMap, !RelevantVars)
         )
     ;
-        GoalExpr = generic_call(_, _, _, _)
+        GoalExpr = generic_call(_, _, _, _, _)
     ;
         GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
     ;
@@ -904,7 +904,7 @@ opt_format_call_sites_in_goal(Goal0, Goal, !GoalIdMap,
             set_of_var.difference(!.ToDeleteVars, NonLocals, !:ToDeleteVars)
         )
     ;
-        ( GoalExpr0 = generic_call(_, _, _, _)
+        ( GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         Goal = Goal0,
diff --git a/compiler/goal_expr_to_goal.m b/compiler/goal_expr_to_goal.m
index 1a06660..471507f 100644
--- a/compiler/goal_expr_to_goal.m
+++ b/compiler/goal_expr_to_goal.m
@@ -484,7 +484,7 @@ transform_goal_expr_to_goal(LocKind, Expr, Context, Renaming, Goal, !:NumAdded,
         list.duplicate(Arity, in_mode, Modes),
         goal_info_init(Context, GoalInfo),
         Details = event_call(EventName),
-        GoalExpr0 = generic_call(Details, HeadVars, Modes, detism_det),
+        GoalExpr0 = generic_call(Details, HeadVars, Modes, no, detism_det),
         Goal0 = hlds_goal(GoalExpr0, GoalInfo),
         CallId = generic_call_id(gcid_event_call(EventName)),
         insert_arg_unifications(HeadVars, Args, Context, ac_call(CallId),
@@ -551,11 +551,13 @@ transform_goal_expr_to_goal(LocKind, Expr, Context, Renaming, Goal, !:NumAdded,
             ->
                 % Initialize some fields to junk.
                 Modes = [],
+                MaybeArgRegs = no,
                 Det = detism_erroneous,
 
                 GenericCall = higher_order(PredVar, Purity, pf_predicate,
                     Arity),
-                Call = generic_call(GenericCall, RealHeadVars, Modes, Det),
+                Call = generic_call(GenericCall, RealHeadVars, Modes,
+                    MaybeArgRegs, Det),
 
                 hlds_goal.generic_call_id(GenericCall, CallId)
             ;
diff --git a/compiler/goal_form.m b/compiler/goal_form.m
index 110ffcf..96137e8 100644
--- a/compiler/goal_form.m
+++ b/compiler/goal_form.m
@@ -271,7 +271,7 @@ goal_can_throw_2(GoalExpr, _GoalInfo, Result, !ModuleInfo) :-
             Result = can_throw
         )
     ;
-        GoalExpr = generic_call(_, _, _, _),
+        GoalExpr = generic_call(_, _, _, _, _),
         % XXX We should use results form closure analysis here.
         Result = can_throw
     ;
@@ -451,7 +451,7 @@ goal_can_loop_func(MaybeModuleInfo, Goal) = CanLoop :-
             CanLoop = yes
         )
     ;
-        GoalExpr = generic_call(_, _, _, _),
+        GoalExpr = generic_call(_, _, _, _, _),
         % We have no idea whether the called goal can throw exceptions,
         % at least without closure analysis.
         CanLoop = yes
@@ -606,7 +606,7 @@ goal_expr_can_throw(MaybeModuleInfo, GoalExpr) = CanThrow :-
             CanThrow = yes
         )
     ;
-        GoalExpr = generic_call(_, _, _, _),
+        GoalExpr = generic_call(_, _, _, _, _),
         % We have no idea whether the called goal can throw exceptions,
         % at least without closure analysis.
         CanThrow = yes
@@ -686,7 +686,7 @@ goal_is_flat(hlds_goal(GoalExpr, _GoalInfo)) = goal_is_flat_expr(GoalExpr).
 
 goal_is_flat_expr(GoalExpr) = IsFlat :-
     (
-        ( GoalExpr = generic_call(_, _, _, _)
+        ( GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = plain_call(_, _, _, _, _, _)
         ; GoalExpr = unify(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
@@ -773,7 +773,7 @@ goal_may_allocate_heap_2(GoalExpr, May) :-
             May = yes
         )
     ;
-        GoalExpr = generic_call(_, _, _, _),
+        GoalExpr = generic_call(_, _, _, _, _),
         May = yes
     ;
         GoalExpr = call_foreign_proc(_, _, _, _, _, _, _),
@@ -925,7 +925,7 @@ cannot_fail_before_stack_flush_conj([Goal | Goals]) :-
             GoalExpr = plain_call(_, _, _, BuiltinState, _, _),
             BuiltinState \= inline_builtin
         ;
-            GoalExpr = generic_call(_, _, _, _)
+            GoalExpr = generic_call(_, _, _, _, _)
         )
     ->
         true
@@ -944,7 +944,7 @@ count_recursive_calls(Goal, PredId, ProcId, Min, Max) :-
     Goal = hlds_goal(GoalExpr, _),
     (
         ( GoalExpr = unify(_, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         Min = 0,
diff --git a/compiler/goal_path.m b/compiler/goal_path.m
index 6858b31..eed7b2a 100644
--- a/compiler/goal_path.m
+++ b/compiler/goal_path.m
@@ -143,7 +143,7 @@ fill_goal_id_slots(SlotInfo, ContainingGoal, !GoalNum, !ContainingGoalMap,
     map.det_insert(GoalId, ContainingGoal, !ContainingGoalMap),
     (
         ( GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         GoalExpr = GoalExpr0
@@ -380,7 +380,7 @@ fill_goal_path_slots(RevPath0, SlotInfo, Goal0, Goal) :-
         GoalExpr = unify(LHS, RHS,  Mode, Kind, Context)
     ;
         ( GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         GoalExpr = GoalExpr0
diff --git a/compiler/goal_util.m b/compiler/goal_util.m
index c4ee2e0..27a5caa 100644
--- a/compiler/goal_util.m
+++ b/compiler/goal_util.m
@@ -566,7 +566,7 @@ goal_vars_2(Goal, !Set) :-
         ),
         rhs_goal_vars(RHS, !Set)
     ;
-        GoalExpr = generic_call(GenericCall, ArgVars, _, _),
+        GoalExpr = generic_call(GenericCall, ArgVars, _, _, _),
         generic_call_vars(GenericCall, GenericCallVars),
         set_of_var.insert_list(GenericCallVars, !Set),
         set_of_var.insert_list(ArgVars, !Set)
@@ -735,7 +735,7 @@ attach_features_to_goal_expr(Features, InFromGroundTerm,
         GoalExpr0, GoalExpr) :-
     (
         ( GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
@@ -870,7 +870,7 @@ proc_body_is_leaf(hlds_goal(GoalExpr, _)) = IsLeaf :-
         )
     ;
         ( GoalExpr = plain_call(_, _, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         IsLeaf = is_not_leaf
@@ -1000,7 +1000,7 @@ cases_size([case(_, _, Goal) | Cases], Size) :-
 goal_expr_size(GoalExpr, Size) :-
     (
         ( GoalExpr = plain_call(_, _, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = unify(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ),
@@ -1228,7 +1228,7 @@ goal_calls_proc_in_list_2(hlds_goal(GoalExpr, _GoalInfo), PredProcIds,
             true
         )
     ;
-        GoalExpr = generic_call(_, _, _, _)
+        GoalExpr = generic_call(_, _, _, _, _)
     ;
         GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
     ;
@@ -1734,7 +1734,7 @@ generate_cast_with_insts(CastType, InArg, OutArg, InInst, OutInst, Context,
     goal_info_init(NonLocals, InstMapDelta, detism_det, purity_pure, Context,
         GoalInfo),
     GoalExpr = generic_call(cast(CastType), [InArg, OutArg],
-        [in_mode(InInst), out_mode(OutInst)], detism_det),
+        [in_mode(InInst), out_mode(OutInst)], no, detism_det),
     Goal = hlds_goal(GoalExpr, GoalInfo).
 
 %-----------------------------------------------------------------------------%
@@ -1772,7 +1772,7 @@ goal_is_atomic(Goal, GoalIsAtomic) :-
     (
         ( GoalExpr = unify(_, _, _, _, _)
         ; GoalExpr = plain_call(_, _, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         GoalIsAtomic = goal_is_atomic
@@ -1805,7 +1805,7 @@ maybe_strip_equality_pretest(Goal0) = Goal :-
     (
         ( GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         Goal = Goal0
@@ -1898,7 +1898,7 @@ maybe_transform_goal_at_goal_path(TransformP, TargetGoalPath,
         (
             ( GoalExpr0 = unify(_, _, _, _, _) 
             ; GoalExpr0 = plain_call(_, _, _, _, _, _)
-            ; GoalExpr0 = generic_call(_, _, _, _)
+            ; GoalExpr0 = generic_call(_, _, _, _, _)
             ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
             ),
             % This search should never reach an atomic goal.
@@ -2068,7 +2068,7 @@ maybe_transform_goal_at_goal_path_with_instmap(TransformP, TargetGoalPath,
         (
             ( GoalExpr0 = unify(_, _, _, _, _) 
             ; GoalExpr0 = plain_call(_, _, _, _, _, _)
-            ; GoalExpr0 = generic_call(_, _, _, _)
+            ; GoalExpr0 = generic_call(_, _, _, _, _)
             ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
             ),
             % This search should never reach an atomic goal.
@@ -2246,7 +2246,7 @@ transform_all_goals(TransformP, Goal0, Goal) :-
     (
         ( GoalExpr0 = unify(_, _, _, _, _) 
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         GoalExpr = GoalExpr0
diff --git a/compiler/granularity.m b/compiler/granularity.m
index 306c6d8..22da69a 100644
--- a/compiler/granularity.m
+++ b/compiler/granularity.m
@@ -200,7 +200,7 @@ runtime_granularity_test_in_goal(Goal0, Goal, !Changed, SCC, ModuleInfo) :-
         )
     ;
         ( GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ),
diff --git a/compiler/handle_options.m b/compiler/handle_options.m
index cc80629..285dbc9 100644
--- a/compiler/handle_options.m
+++ b/compiler/handle_options.m
@@ -1308,8 +1308,8 @@ convert_options_to_globals(OptionTable0, Target, GC_Method, TagsMethod0,
 
     % We only use the float registers if floats would not fit into the
     % regular registers.
-    option_implies(unboxed_float, use_float_registers, bool(no),
-        !Globals),
+    option_implies(highlevel_code, use_float_registers, bool(no), !Globals),
+    option_implies(unboxed_float, use_float_registers, bool(no), !Globals),
 
     % Changing this means that the code in make_hlds_passes.m that
     % handles the declarations for the global variables used by
diff --git a/compiler/hhf.m b/compiler/hhf.m
index 3208c28..1512e70 100644
--- a/compiler/hhf.m
+++ b/compiler/hhf.m
@@ -240,7 +240,7 @@ convert_goal_expr_to_hhf(NonLocals, GoalInfo, GoalExpr0, GoalExpr, !HI) :-
         GoalExpr0 = plain_call(_, _, _, _, _, _),
         GoalExpr = GoalExpr0
     ;
-        GoalExpr0 = generic_call(_, _, _, _),
+        GoalExpr0 = generic_call(_, _, _, _, _),
         GoalExpr = GoalExpr0
     ;
         GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _),
diff --git a/compiler/higher_order.m b/compiler/higher_order.m
index 7f0dcc0..5ed605d 100644
--- a/compiler/higher_order.m
+++ b/compiler/higher_order.m
@@ -554,7 +554,7 @@ ho_traverse_goal(Goal0, Goal, !Info) :-
         GoalExpr = switch(Var, CanFail, Cases),
         Goal = hlds_goal(GoalExpr, GoalInfo0)
     ;
-        GoalExpr0 = generic_call(GenericCall, Args, _, _),
+        GoalExpr0 = generic_call(GenericCall, Args, _, _, _),
         % Check whether this call could be specialized.
         (
             (
@@ -3132,9 +3132,10 @@ construct_higher_order_terms(ModuleInfo, HeadVars0, NewHeadVars, ArgModes0,
         ;
             unexpected($module, $pred, "list.split_list failed.")
         ),
+
         proc_info_interface_determinism(CalledProcInfo, ProcDetism),
         GroundInstInfo = higher_order(pred_inst_info(PredOrFunc,
-            NonCurriedArgModes, ProcDetism))
+            NonCurriedArgModes, no, ProcDetism))
     ;
         in_mode(InMode),
         GroundInstInfo = none,
diff --git a/compiler/hlds_desc.m b/compiler/hlds_desc.m
index 412a196..9983d0a 100644
--- a/compiler/hlds_desc.m
+++ b/compiler/hlds_desc.m
@@ -92,7 +92,7 @@ describe_goal(ModuleInfo, VarSet, Goal) = FullDesc :-
         GoalExpr = plain_call(_, _, Args, _, _, SymName),
         Desc = sym_name_to_string(SymName) ++ describe_args(VarSet, Args)
     ;
-        GoalExpr = generic_call(GCall, Args, _, _),
+        GoalExpr = generic_call(GCall, Args, _, _, _),
         (
             GCall = higher_order(Var, _, _, _),
             Desc = describe_var(VarSet, Var) ++ describe_args(VarSet, Args)
diff --git a/compiler/hlds_goal.m b/compiler/hlds_goal.m
index bc07c7f..a1ee238 100644
--- a/compiler/hlds_goal.m
+++ b/compiler/hlds_goal.m
@@ -123,6 +123,12 @@
                 % this field is junk until after mode analysis.
                 gcall_modes         :: list(mer_mode),
 
+                gcall_reg_types     :: maybe(list(ho_arg_reg)),
+                                    % The register type to use for each of the
+                                    % arguments. This is only needed when float
+                                    % registers exist, and is only set after
+                                    % the float reg wrappers pass.
+
                 % The determinism of the call.
                 gcall_detism        :: determinism
             )
@@ -216,7 +222,7 @@
 :- inst goal_expr_plain_call
     --->    plain_call(ground, ground, ground, ground, ground, ground).
 :- inst goal_expr_generic_call
-    --->    generic_call(ground, ground, ground, ground).
+    --->    generic_call(ground, ground, ground, ground, ground).
 :- inst goal_expr_foreign_proc
     --->    call_foreign_proc(ground, ground, ground, ground, ground,
                 ground, ground).
@@ -2589,10 +2595,10 @@ rename_vars_in_goal_expr(Must, Subn, Expr0, Expr) :-
         rename_vars_in_goal(Must, Subn, Goal0, Goal),
         Expr = scope(Reason, Goal)
     ;
-        Expr0 = generic_call(GenericCall0, Args0, Modes, Det),
+        Expr0 = generic_call(GenericCall0, Args0, Modes, MaybeArgRegs, Det),
         rename_generic_call(Must, Subn, GenericCall0, GenericCall),
         rename_var_list(Must, Subn, Args0, Args),
-        Expr = generic_call(GenericCall, Args, Modes, Det)
+        Expr = generic_call(GenericCall, Args, Modes, MaybeArgRegs, Det)
     ;
         Expr0 = plain_call(PredId, ProcId, Args0, Builtin, Context, Sym),
         rename_var_list(Must, Subn, Args0, Args),
@@ -2827,10 +2833,10 @@ incremental_rename_vars_in_goal_expr(Subn, SubnUpdates, Expr0, Expr) :-
         incremental_rename_vars_in_goal(Subn, SubnUpdates, Goal0, Goal),
         Expr = scope(Reason, Goal)
     ;
-        Expr0 = generic_call(GenericCall0, Args0, Modes, Det),
+        Expr0 = generic_call(GenericCall0, Args0, Modes, MaybeArgRegs, Det),
         rename_generic_call(need_not_rename, Subn, GenericCall0, GenericCall),
         rename_var_list(need_not_rename, Subn, Args0, Args),
-        Expr = generic_call(GenericCall, Args, Modes, Det)
+        Expr = generic_call(GenericCall, Args, Modes, MaybeArgRegs, Det)
     ;
         Expr0 = plain_call(PredId, ProcId, Args0, Builtin, Context, Sym),
         rename_var_list(need_not_rename, Subn, Args0, Args),
@@ -3258,7 +3264,7 @@ goal_has_foreign(Goal) = HasForeign :-
     Goal = hlds_goal(GoalExpr, _),
     (
         ( GoalExpr = plain_call(_, _, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = unify(_, _, _, _, _)
         ),
         HasForeign = no
@@ -3342,7 +3348,7 @@ case_list_has_foreign([Case | Cases]) = HasForeign :-
 goal_expr_has_subgoals(GoalExpr) = HasSubGoals :-
     (
         ( GoalExpr = unify(_, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = plain_call(_, _, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _,  _)
         ),
@@ -3462,7 +3468,7 @@ set_goal_contexts(Context, Goal0, Goal) :-
         GoalExpr = negation(SubGoal)
     ;
         ( GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
diff --git a/compiler/hlds_out_goal.m b/compiler/hlds_out_goal.m
index ad1484a..3886c2b 100644
--- a/compiler/hlds_out_goal.m
+++ b/compiler/hlds_out_goal.m
@@ -803,7 +803,7 @@ write_goal_expr(Info, GoalExpr, ModuleInfo, VarSet,
         write_goal_plain_call(Info, GoalExpr, ModuleInfo, VarSet,
             AppendVarNums, Indent, Follow, TypeQual, !IO)
     ;
-        GoalExpr = generic_call(_, _, _, _),
+        GoalExpr = generic_call(_, _, _, _, _),
         write_goal_generic_call(Info, GoalExpr, ModuleInfo, VarSet,
             AppendVarNums, Indent, Follow, TypeQual, !IO)
     ;
@@ -1340,7 +1340,7 @@ write_sym_name_and_args(PredName, ArgVars, VarSet, AppendVarNums, !IO) :-
 
 write_goal_generic_call(Info, GoalExpr, _ModuleInfo, VarSet,
         AppendVarNums, Indent, Follow, _TypeQual, !IO) :-
-    GoalExpr = generic_call(GenericCall, ArgVars, Modes, _),
+    GoalExpr = generic_call(GenericCall, ArgVars, Modes, MaybeArgRegs, _),
     DumpOptions = Info ^ hoi_dump_hlds_options,
     % XXX we should print more info here
     (
@@ -1349,7 +1349,8 @@ write_goal_generic_call(Info, GoalExpr, _ModuleInfo, VarSet,
             PredOrFunc = pf_predicate,
             ( string.contains_char(DumpOptions, 'l') ->
                 write_indent(Indent, !IO),
-                io.write_string("% higher-order predicate call\n", !IO)
+                io.write_string("% higher-order predicate call\n", !IO),
+                write_ho_arg_regs(Indent, MaybeArgRegs, !IO)
             ;
                 true
             ),
@@ -1362,7 +1363,8 @@ write_goal_generic_call(Info, GoalExpr, _ModuleInfo, VarSet,
             ( string.contains_char(DumpOptions, 'l') ->
                 write_indent(Indent, !IO),
                 io.write_string("% higher-order function application\n",
-                    !IO)
+                    !IO),
+                write_ho_arg_regs(Indent, MaybeArgRegs, !IO)
             ;
                 true
             ),
@@ -1381,7 +1383,8 @@ write_goal_generic_call(Info, GoalExpr, _ModuleInfo, VarSet,
             _MethodId),
         ( string.contains_char(DumpOptions, 'l') ->
             write_indent(Indent, !IO),
-            io.write_string("% class method call\n", !IO)
+            io.write_string("% class method call\n", !IO),
+            write_ho_arg_regs(Indent, MaybeArgRegs, !IO)
         ;
             true
         ),
@@ -1399,7 +1402,8 @@ write_goal_generic_call(Info, GoalExpr, _ModuleInfo, VarSet,
         GenericCall = event_call(EventName),
         ( string.contains_char(DumpOptions, 'l') ->
             write_indent(Indent, !IO),
-            io.write_string("% event call\n", !IO)
+            io.write_string("% event call\n", !IO),
+            write_ho_arg_regs(Indent, MaybeArgRegs, !IO)
         ;
             true
         ),
@@ -1416,7 +1420,8 @@ write_goal_generic_call(Info, GoalExpr, _ModuleInfo, VarSet,
         CastTypeString = cast_type_to_string(CastType),
         ( string.contains_char(DumpOptions, 'l') ->
             write_indent(Indent, !IO),
-            io.write_strings(["% ", CastTypeString, "\n"], !IO)
+            io.write_strings(["% ", CastTypeString, "\n"], !IO),
+            write_ho_arg_regs(Indent, MaybeArgRegs, !IO)
         ;
             true
         ),
@@ -1438,6 +1443,31 @@ write_goal_generic_call(Info, GoalExpr, _ModuleInfo, VarSet,
         io.write_string(Follow, !IO)
     ).
 
+:- pred write_ho_arg_regs(int::in, maybe(list(ho_arg_reg))::in,
+    io::di, io::uo) is det.
+
+write_ho_arg_regs(Indent, MaybeArgRegs, !IO) :-
+    (
+        MaybeArgRegs = yes(ArgRegs),
+        write_indent(Indent, !IO),
+        io.write_string("% arg regs: ", !IO),
+        io.write_list(ArgRegs, ", ", write_ho_arg_reg, !IO),
+        io.nl(!IO)
+    ;
+        MaybeArgRegs = no
+    ).
+
+:- pred write_ho_arg_reg(ho_arg_reg::in, io::di, io::uo) is det.
+
+write_ho_arg_reg(ArgReg, !IO) :-
+    (
+        ArgReg = ho_arg_reg_r,
+        io.write_string("reg_r", !IO)
+    ;
+        ArgReg = ho_arg_reg_f,
+        io.write_string("reg_f", !IO)
+    ).
+
 %-----------------------------------------------------------------------------%
 %
 % Write out calls to foreign procs.
diff --git a/compiler/hlds_out_mode.m b/compiler/hlds_out_mode.m
index 44e8211..8077d76 100644
--- a/compiler/hlds_out_mode.m
+++ b/compiler/hlds_out_mode.m
@@ -206,7 +206,7 @@ inst_to_term_with_context(Inst, Context) = Term :-
 
 ground_pred_inst_info_to_term(_Uniq, PredInstInfo, Context) = Term :-
     % XXX we ignore Uniq
-    PredInstInfo = pred_inst_info(PredOrFunc, Modes, Det),
+    PredInstInfo = pred_inst_info(PredOrFunc, Modes, _, Det),
     (
         PredOrFunc = pf_predicate,
         construct_qualified_term(unqualified("pred"),
@@ -230,7 +230,7 @@ ground_pred_inst_info_to_term(_Uniq, PredInstInfo, Context) = Term :-
 
 any_pred_inst_info_to_term(_Uniq, PredInstInfo, Context) = Term :-
     % XXX we ignore Uniq
-    PredInstInfo = pred_inst_info(PredOrFunc, Modes, Det),
+    PredInstInfo = pred_inst_info(PredOrFunc, Modes, _, Det),
     (
         PredOrFunc = pf_predicate,
         construct_qualified_term(unqualified("any_pred"),
diff --git a/compiler/hlds_out_pred.m b/compiler/hlds_out_pred.m
index 75c3f04..15a86d4 100644
--- a/compiler/hlds_out_pred.m
+++ b/compiler/hlds_out_pred.m
@@ -84,6 +84,7 @@
 :- import_module parse_tree.prog_ctgc.
 :- import_module parse_tree.prog_out.
 :- import_module parse_tree.prog_util.
+:- import_module parse_tree.set_of_var.
 
 :- import_module assoc_list.
 :- import_module int.
@@ -727,6 +728,7 @@ write_proc(Info, Indent, AppendVarNums, ModuleInfo, PredId, ProcId,
     proc_info_get_headvars(Proc, HeadVars),
     proc_info_get_argmodes(Proc, HeadModes),
     proc_info_get_maybe_arglives(Proc, MaybeArgLives),
+    proc_info_get_reg_r_headvars(Proc, RegR_HeadVars),
     proc_info_maybe_arg_info(Proc, MaybeArgInfos),
     proc_info_get_goal(Proc, Goal),
     proc_info_get_context(Proc, ModeContext),
@@ -955,6 +957,15 @@ write_proc(Info, Indent, AppendVarNums, ModuleInfo, PredId, ProcId,
     ;
         MaybeArgLives = no
     ),
+    ( set_of_var.is_non_empty(RegR_HeadVars) ->
+        write_indent(Indent, !IO),
+        io.write_string("% reg_r headvars: ", !IO),
+        io.write_list(set_of_var.to_sorted_list(RegR_HeadVars),
+            ", ", mercury_output_var(VarSet, AppendVarNums), !IO),
+        io.nl(!IO)
+    ;
+        true
+    ),
     (
         string.contains_char(DumpOptions, 'A'),
         MaybeArgInfos = yes(ArgInfos)
diff --git a/compiler/hlds_pred.m b/compiler/hlds_pred.m
index 1d533fc..e5a873c 100644
--- a/compiler/hlds_pred.m
+++ b/compiler/hlds_pred.m
@@ -644,6 +644,8 @@
 :- pred pred_info_get_var_name_remap(pred_info::in,
     map(prog_var, string)::out) is det.
 :- pred pred_info_get_assertions(pred_info::in, set(assert_id)::out) is det.
+:- pred pred_info_get_instance_method_arg_types(pred_info::in,
+    list(mer_type)::out) is det.
 :- pred pred_info_get_clauses_info(pred_info::in, clauses_info::out) is det.
 :- pred pred_info_get_procedures(pred_info::in, proc_table::out) is det.
 
@@ -693,6 +695,8 @@
     pred_info::in, pred_info::out) is det.
 :- pred pred_info_set_assertions(set(assert_id)::in,
     pred_info::in, pred_info::out) is det.
+:- pred pred_info_set_instance_method_arg_types(list(mer_type)::in,
+    pred_info::in, pred_info::out) is det.
 :- pred pred_info_set_clauses_info(clauses_info::in,
     pred_info::in, pred_info::out) is det.
 :- pred pred_info_set_procedures(proc_table::in,
@@ -1041,7 +1045,13 @@ calls_are_fully_qualified(Markers) =
                 var_name_remap      :: map(prog_var, string),
 
                 % List of assertions which mention this predicate.
-                assertions          :: set(assert_id)
+                assertions          :: set(assert_id),
+
+                % If this predicate is a class method implementation, this
+                % list records the argument types before substituting the type
+                % variables for the instance.
+                % XXX does that make sense?
+                instance_method_arg_types :: list(mer_type)
             ).
 
 :- type pred_info
@@ -1117,7 +1127,7 @@ pred_info_init(ModuleName, SymName, Arity, PredOrFunc, Context, Origin, Status,
     PredSubInfo = pred_sub_info(Context, GoalType, Attributes, Kinds,
         ExistQVarBindings, HeadTypeParams, ClassProofs, ClassConstraintMap,
         UnprovenBodyConstraints, inst_graph_info_init, [],
-        VarNameRemap, Assertions),
+        VarNameRemap, Assertions, []),
     PredInfo = pred_info(PredModuleName, PredName, Arity, PredOrFunc,
         Origin, Status, Markers, ArgTypes, TypeVarSet, TypeVarSet,
         ExistQVars, ClassContext, ClausesInfo, Procs, PredSubInfo).
@@ -1158,7 +1168,7 @@ pred_info_create(ModuleName, SymName, PredOrFunc, Context, Origin, Status,
     PredSubInfo = pred_sub_info(Context, goal_type_clause, Attributes, Kinds,
         ExistQVarBindings, HeadTypeParams, ClassProofs, ClassConstraintMap,
         UnprovenBodyConstraints, inst_graph_info_init, [],
-        VarNameRemap, Assertions),
+        VarNameRemap, Assertions, []),
     PredInfo = pred_info(ModuleName, PredName, Arity, PredOrFunc,
         Origin, Status, Markers, ArgTypes, TypeVarSet, TypeVarSet,
         ExistQVars, ClassContext, ClausesInfo, Procs, PredSubInfo).
@@ -1283,6 +1293,8 @@ pred_info_get_inst_graph_info(PI, PI ^ pred_sub_info ^ inst_graph_info).
 pred_info_get_arg_modes_maps(PI, PI ^ pred_sub_info ^ arg_modes_maps).
 pred_info_get_var_name_remap(PI, PI ^ pred_sub_info ^ var_name_remap).
 pred_info_get_assertions(PI, PI ^ pred_sub_info ^ assertions).
+pred_info_get_instance_method_arg_types(PI,
+    PI ^ pred_sub_info ^ instance_method_arg_types).
 pred_info_get_clauses_info(PI, PI ^ clauses_info).
 pred_info_get_procedures(PI, PI ^ procedures).
 
@@ -1326,6 +1338,8 @@ pred_info_set_var_name_remap(X, !PI) :-
     !PI ^ pred_sub_info ^ var_name_remap := X.
 pred_info_set_assertions(X, !PI) :-
     !PI ^ pred_sub_info ^ assertions := X.
+pred_info_set_instance_method_arg_types(X, !PI) :-
+    !PI ^ pred_sub_info ^ instance_method_arg_types := X.
 pred_info_set_clauses_info(X, !PI) :-
     !PI ^ clauses_info := X.
 pred_info_set_procedures(X, !PI) :-
@@ -1997,6 +2011,8 @@ attribute_list_to_attributes(Attributes, Attributes).
 :- pred proc_info_get_is_address_taken(proc_info::in,
     is_address_taken::out) is det.
 :- pred proc_info_get_stack_slots(proc_info::in, stack_slots::out) is det.
+:- pred proc_info_get_reg_r_headvars(proc_info::in, set_of_progvar::out)
+    is det.
 :- pred proc_info_maybe_arg_info(proc_info::in,
     maybe(list(arg_info))::out) is det.
 :- pred proc_info_get_liveness_info(proc_info::in, liveness_info::out) is det.
@@ -2058,6 +2074,8 @@ attribute_list_to_attributes(Attributes, Attributes).
     proc_info::in, proc_info::out) is det.
 :- pred proc_info_set_stack_slots(stack_slots::in,
     proc_info::in, proc_info::out) is det.
+:- pred proc_info_set_reg_r_headvars(set_of_progvar::in,
+    proc_info::in, proc_info::out) is det.
 :- pred proc_info_set_arg_info(list(arg_info)::in,
     proc_info::in, proc_info::out) is det.
 :- pred proc_info_set_liveness_info(liveness_info::in,
@@ -2335,6 +2353,11 @@ attribute_list_to_attributes(Attributes, Attributes).
                 % Allocation of variables to stack slots.
                 stack_slots                 :: stack_slots,
 
+                % The head variables which must be forced to use regular
+                % registers in the calling convention, despite having type
+                % float. This is only meaningful with float registers.
+                reg_r_headvars              :: set_of_progvar,
+
                 % The calling convention of each argument: information computed
                 % by arg_info.m (based on the modes etc.) and used by code
                 % generation to determine how each argument should be passed.
@@ -2542,6 +2565,7 @@ proc_info_init(MContext, Arity, Types, DeclaredModes, Modes, MaybeArgLives,
     ModeErrors = [],
     InferredDet = detism_erroneous,
     map.init(StackSlots),
+    set_of_var.init(RegR_HeadVars),
     set_of_var.init(InitialLiveness),
     ArgInfo = no,
     goal_info_init(GoalInfo),
@@ -2552,8 +2576,9 @@ proc_info_init(MContext, Arity, Types, DeclaredModes, Modes, MaybeArgLives,
     SharingInfo = structure_sharing_info_init,
     ReuseInfo = structure_reuse_info_init,
     ProcSubInfo = proc_sub_info(DetismDecl, no, no, Term2Info, IsAddressTaken,
-        StackSlots, ArgInfo, InitialLiveness, no, no, no, no_tail_call_events,
-        no, no, no, no, no, no, VarNameRemap, [], SharingInfo, ReuseInfo),
+        StackSlots, RegR_HeadVars, ArgInfo, InitialLiveness, no, no,
+        no, no_tail_call_events, no, no, no, no, no, no, VarNameRemap, [],
+        SharingInfo, ReuseInfo),
     ProcInfo = proc_info(MContext, BodyVarSet, BodyTypes, HeadVars, InstVarSet,
         DeclaredModes, Modes, no, MaybeArgLives, MaybeDet, InferredDet,
         ClauseBody, CanProcess, ModeErrors, RttiVarMaps, eval_normal,
@@ -2577,6 +2602,7 @@ proc_info_create_with_declared_detism(Context, VarSet, VarTypes, HeadVars,
         InstVarSet, HeadModes, DetismDecl, MaybeDeclaredDetism, Detism,
         Goal, RttiVarMaps, IsAddressTaken, VarNameRemap, ProcInfo) :-
     map.init(StackSlots),
+    set_of_var.init(RegR_HeadVars),
     set_of_var.init(Liveness),
     MaybeHeadLives = no,
     ModeErrors = [],
@@ -2584,8 +2610,9 @@ proc_info_create_with_declared_detism(Context, VarSet, VarTypes, HeadVars,
     SharingInfo = structure_sharing_info_init,
     ReuseInfo = structure_reuse_info_init,
     ProcSubInfo = proc_sub_info(DetismDecl, no, no, Term2Info, IsAddressTaken,
-        StackSlots, no, Liveness, no, no, no, no_tail_call_events,
-        no, no, no, no, no, no, VarNameRemap, [], SharingInfo, ReuseInfo),
+        StackSlots, RegR_HeadVars, no, Liveness, no, no, no,
+        no_tail_call_events, no, no, no, no, no, no, VarNameRemap, [],
+        SharingInfo, ReuseInfo),
     ProcInfo = proc_info(Context, VarSet, VarTypes, HeadVars,
         InstVarSet, no, HeadModes, no, MaybeHeadLives,
         MaybeDeclaredDetism, Detism, Goal, yes, ModeErrors,
@@ -2618,6 +2645,7 @@ proc_info_get_maybe_termination_info(PI,
     PI ^ proc_sub_info ^ maybe_termination).
 proc_info_get_is_address_taken(PI, PI ^ proc_sub_info ^ is_address_taken).
 proc_info_get_stack_slots(PI, PI ^ proc_sub_info ^ stack_slots).
+proc_info_get_reg_r_headvars(PI, PI ^ proc_sub_info ^ reg_r_headvars).
 proc_info_maybe_arg_info(PI, PI ^ proc_sub_info ^ arg_pass_info).
 proc_info_get_liveness_info(PI, PI ^ proc_sub_info ^ initial_liveness).
 proc_info_get_need_maxfr_slot(PI, PI ^ proc_sub_info ^ need_maxfr_slot).
@@ -2672,6 +2700,8 @@ proc_info_set_address_taken(AT, !PI) :-
     !PI ^ proc_sub_info ^ is_address_taken := AT.
 proc_info_set_stack_slots(SS, !PI) :-
     !PI ^ proc_sub_info ^ stack_slots := SS.
+proc_info_set_reg_r_headvars(RHV, !PI) :-
+    !PI ^ proc_sub_info ^ reg_r_headvars := RHV.
 proc_info_set_arg_info(AP, !PI) :-
     !PI ^ proc_sub_info ^ arg_pass_info := yes(AP).
 proc_info_set_liveness_info(IL, !PI) :-
diff --git a/compiler/hlds_statistics.m b/compiler/hlds_statistics.m
index 15e099f..0baba24 100644
--- a/compiler/hlds_statistics.m
+++ b/compiler/hlds_statistics.m
@@ -163,7 +163,7 @@ accumulate_proc_stats_in_goal(Goal, !UsedVars, !Stats) :-
         set_tree234.insert_list(ExtraArgVars, !UsedVars),
         !Stats ^ ps_foreign_calls := !.Stats ^ ps_foreign_calls + 1
     ;
-        GoalExpr = generic_call(CallKind, ArgVars, _, _),
+        GoalExpr = generic_call(CallKind, ArgVars, _, _, _),
         set_tree234.insert_list(ArgVars, !UsedVars),
         (
             CallKind = higher_order(HOVar, _, _, _),
diff --git a/compiler/implementation_defined_literals.m b/compiler/implementation_defined_literals.m
index 3ddb141..61f4297 100644
--- a/compiler/implementation_defined_literals.m
+++ b/compiler/implementation_defined_literals.m
@@ -186,7 +186,7 @@ subst_literals_in_goal(Info, Goal0, Goal) :-
         Goal = hlds_goal(GoalExpr, GoalInfo0)
     ;
         ( GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
         ),
         Goal = Goal0
diff --git a/compiler/inlining.m b/compiler/inlining.m
index b8eb05f..3bdb88c 100644
--- a/compiler/inlining.m
+++ b/compiler/inlining.m
@@ -565,7 +565,7 @@ inlining_in_goal(Goal0, Goal, !Info) :-
         inlining_in_call(PredId, ProcId, ArgVars, Builtin,
             Context, Sym, GoalExpr, GoalInfo0, GoalInfo, !Info)
     ;
-        ( GoalExpr0 = generic_call(_, _, _, _)
+        ( GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ),
diff --git a/compiler/inst_match.m b/compiler/inst_match.m
index 998526a..d57f479 100644
--- a/compiler/inst_match.m
+++ b/compiler/inst_match.m
@@ -799,7 +799,7 @@ ho_inst_info_matches_initial(HOInstInfoA, none, _, _, !Info) :-
     \+ ho_inst_info_is_nonstandard_func_mode(!.Info ^ imi_module_info,
         HOInstInfoA).
 ho_inst_info_matches_initial(none, higher_order(PredInstB), _, Type, !Info) :-
-    PredInstB = pred_inst_info(pf_function, ArgModes, _Det),
+    PredInstB = pred_inst_info(pf_function, ArgModes, _, _Det),
     Arity = list.length(ArgModes),
     PredInstA = pred_inst_info_standard_func_mode(Arity),
     pred_inst_matches_2(PredInstA, PredInstB, Type, !Info).
@@ -827,11 +827,13 @@ pred_inst_matches_1(PredInstA, PredInstB, MaybeType, ModuleInfo) :-
 :- pred pred_inst_matches_2(pred_inst_info::in, pred_inst_info::in,
     maybe(mer_type)::in, inst_match_info::in, inst_match_info::out) is semidet.
 
-pred_inst_matches_2(pred_inst_info(PredOrFunc, ModesA, Det),
-        pred_inst_info(PredOrFunc, ModesB, Det), MaybeType, !Info) :-
+pred_inst_matches_2(PredInstA, PredInstB, MaybeType, !Info) :-
+    PredInstA = pred_inst_info(PredOrFunc, ModesA, _MaybeArgRegsA, Det),
+    PredInstB = pred_inst_info(PredOrFunc, ModesB, _MaybeArgRegsB, Det),
     maybe_get_higher_order_arg_types(MaybeType, length(ModesA),
         MaybeTypes),
     pred_inst_argmodes_matches(ModesA, ModesB, MaybeTypes, !Info).
+    % XXX do we need to match the arg reg lists? I don't know how.
 
     % pred_inst_argmodes_matches(ModesA, ModesB, !Info):
     %
@@ -1083,7 +1085,7 @@ ho_inst_info_matches_final(HOInstInfoA, none, _, !Info) :-
     \+ ho_inst_info_is_nonstandard_func_mode(!.Info ^ imi_module_info,
         HOInstInfoA).
 ho_inst_info_matches_final(none, higher_order(PredInstB), Type, !Info) :-
-    PredInstB = pred_inst_info(pf_function, ArgModes, _Det),
+    PredInstB = pred_inst_info(pf_function, ArgModes, _, _Det),
     Arity = list.length(ArgModes),
     PredInstA = pred_inst_info_standard_func_mode(Arity),
     pred_inst_matches_2(PredInstA, PredInstB, Type, !Info).
@@ -1239,7 +1241,7 @@ inst_matches_binding_3(not_reached, _, _, !Info).
 ho_inst_info_matches_binding(_, none, _, _).
 ho_inst_info_matches_binding(none, higher_order(PredInstB), MaybeType,
         ModuleInfo) :-
-    PredInstB = pred_inst_info(pf_function, ArgModes, _Det),
+    PredInstB = pred_inst_info(pf_function, ArgModes, _, _Det),
     Arity = list.length(ArgModes),
     PredInstA = pred_inst_info_standard_func_mode(Arity),
     pred_inst_matches_1(PredInstA, PredInstB, MaybeType, ModuleInfo).
@@ -1952,7 +1954,7 @@ inst_contains_inst_var(defined_inst(InstName), InstVar) :-
 inst_contains_inst_var(bound(_Uniq, ArgInsts), InstVar) :-
     bound_inst_list_contains_inst_var(ArgInsts, InstVar).
 inst_contains_inst_var(ground(_Uniq, HOInstInfo), InstVar) :-
-    HOInstInfo = higher_order(pred_inst_info(_PredOrFunc, Modes, _Det)),
+    HOInstInfo = higher_order(pred_inst_info(_PredOrFunc, Modes, _, _Det)),
     mode_list_contains_inst_var(Modes, InstVar).
 inst_contains_inst_var(abstract_inst(_Name, ArgInsts), InstVar) :-
     inst_list_contains_inst_var(ArgInsts, InstVar).
diff --git a/compiler/inst_util.m b/compiler/inst_util.m
index bb4fb23..09654fd 100644
--- a/compiler/inst_util.m
+++ b/compiler/inst_util.m
@@ -1817,7 +1817,7 @@ var_inst_contains_any(ModuleInfo, Instmap, Var) :-
 %-----------------------------------------------------------------------------%
 
 pred_inst_info_is_nonstandard_func_mode(ModuleInfo, PredInstInfo) :-
-    PredInstInfo = pred_inst_info(pf_function, ArgModes, _),
+    PredInstInfo = pred_inst_info(pf_function, ArgModes, _, _),
     Arity = list.length(ArgModes),
     \+ pred_inst_matches(PredInstInfo,
         pred_inst_info_standard_func_mode(Arity), ModuleInfo).
@@ -1827,7 +1827,7 @@ ho_inst_info_is_nonstandard_func_mode(ModuleInfo, HOInstInfo) :-
     pred_inst_info_is_nonstandard_func_mode(ModuleInfo, PredInstInfo).
 
 pred_inst_info_standard_func_mode(Arity) =
-        pred_inst_info(pf_function, ArgModes, detism_det) :-
+        pred_inst_info(pf_function, ArgModes, no, detism_det) :-
     in_mode(InMode),
     out_mode(OutMode),
     ArgModes = list.duplicate(Arity - 1, InMode) ++ [OutMode].
diff --git a/compiler/intermod.m b/compiler/intermod.m
index af05d58..ae15af9 100644
--- a/compiler/intermod.m
+++ b/compiler/intermod.m
@@ -493,7 +493,7 @@ intermod_traverse_goal_expr(GoalExpr0, GoalExpr, DoWrite, !Info) :-
         add_proc(PredId, DoWrite, !Info),
         GoalExpr = GoalExpr0
     ;
-        GoalExpr0 = generic_call(CallType, _, _, _),
+        GoalExpr0 = generic_call(CallType, _, _, _, _),
         GoalExpr = GoalExpr0,
         (
             CallType = higher_order(_, _, _, _),
diff --git a/compiler/interval.m b/compiler/interval.m
index 88eb8d0..42be319 100644
--- a/compiler/interval.m
+++ b/compiler/interval.m
@@ -285,14 +285,19 @@ build_interval_info_in_goal(hlds_goal(GoalExpr, GoalInfo), !IntervalInfo,
             build_interval_info_in_goal(SubGoal, !IntervalInfo, !Acc)
         )
     ;
-        GoalExpr = generic_call(GenericCall, ArgVars, ArgModes, _Detism),
+        GoalExpr = generic_call(GenericCall, ArgVars, ArgModes, MaybeArgRegs,
+            _Detism),
         goal_info_get_maybe_need_across_call(GoalInfo, MaybeNeedAcrossCall),
         IntParams = !.IntervalInfo ^ ii_interval_params,
-        VarTypes = IntParams ^ ip_var_types,
-        list.map(map.lookup(VarTypes), ArgVars, ArgTypes),
         ModuleInfo = IntParams ^ ip_module_info,
-        arg_info.compute_in_and_out_vars(ModuleInfo, ArgVars,
-            ArgModes, ArgTypes, InputArgs, _OutputArgs),
+        VarTypes = IntParams ^ ip_var_types,
+        map.apply_to_list(ArgVars, VarTypes, ArgTypes),
+        arg_info.generic_call_arg_reg_types(ModuleInfo, VarTypes, GenericCall,
+            ArgVars, MaybeArgRegs, ArgRegTypes),
+        arg_info.compute_in_and_out_vars_sep_regs(ModuleInfo, ArgVars,
+            ArgModes, ArgTypes, ArgRegTypes, InputArgsR, InputArgsF,
+            _OutputArgsR, _OutputArgsF),
+        list.append(InputArgsR, InputArgsF, InputArgs),
 
         % Casts are generated inline.
         (
@@ -306,7 +311,8 @@ build_interval_info_in_goal(hlds_goal(GoalExpr, GoalInfo), !IntervalInfo,
             ),
             module_info_get_globals(ModuleInfo, Globals),
             call_gen.generic_call_info(Globals, GenericCall,
-                length(InputArgs), _, GenericVarsArgInfos, _, _),
+                length(InputArgsR), length(InputArgsF), _,
+                GenericVarsArgInfos, _, _),
             assoc_list.keys(GenericVarsArgInfos, GenericVars),
             list.append(GenericVars, InputArgs, Inputs),
             build_interval_info_at_call(Inputs, MaybeNeedAcrossCall, GoalInfo,
@@ -978,7 +984,7 @@ record_decisions_in_goal(Goal0, Goal, !VarInfo, !VarRename, InsertMap,
             Goal = hlds_goal(GoalExpr, GoalInfo0)
         )
     ;
-        GoalExpr0 = generic_call(GenericCall, _, _, _),
+        GoalExpr0 = generic_call(GenericCall, _, _, _, _),
         % Casts are generated inline.
         (
             GenericCall = cast(_),
diff --git a/compiler/lambda.m b/compiler/lambda.m
index adac13d..cf3691c 100644
--- a/compiler/lambda.m
+++ b/compiler/lambda.m
@@ -69,8 +69,16 @@
 :- module transform_hlds.lambda.
 :- interface.
 
+:- import_module hlds.hlds_goal.
 :- import_module hlds.hlds_module.
 :- import_module hlds.hlds_pred.
+:- import_module hlds.hlds_rtti.
+:- import_module mdbcomp.prim_data.
+:- import_module parse_tree.prog_data.
+:- import_module parse_tree.set_of_var.
+
+:- import_module bool.
+:- import_module list.
 
 %-----------------------------------------------------------------------------%
 
@@ -80,6 +88,39 @@
     is det.
 
 %-----------------------------------------------------------------------------%
+
+:- interface.
+
+% The following is exported for float_reg.m
+
+:- type lambda_info
+    --->    lambda_info(
+                li_varset               :: prog_varset,
+                li_vartypes             :: vartypes,
+                li_tvarset              :: tvarset,
+                li_inst_varset          :: inst_varset,
+                li_rtti_varmaps         :: rtti_varmaps,
+                li_has_parallel_conj    :: bool,
+                li_pred_info            :: pred_info,
+                li_module_info          :: module_info,
+                % True iff we need to recompute the nonlocals.
+                li_recompute_nonlocals  :: bool,
+                % True if we expanded some lambda expressions.
+                li_have_expanded_lambda :: bool
+            ).
+
+:- type reg_wrapper_proc
+    --->    reg_wrapper_proc(set_of_progvar)
+    ;       not_reg_wrapper_proc.
+
+:- pred expand_lambda(purity::in, ho_groundness::in,
+    pred_or_func::in, lambda_eval_method::in, reg_wrapper_proc::in,
+    list(prog_var)::in, list(mer_mode)::in, determinism::in,
+    list(prog_var)::in, hlds_goal::in, unification::in,
+    unify_rhs::out, unification::out,
+    lambda_info::in, lambda_info::out) is det.
+
+%-----------------------------------------------------------------------------%
 %-----------------------------------------------------------------------------%
 
 :- implementation.
@@ -115,26 +156,6 @@
 :- import_module varset.
 
 %-----------------------------------------------------------------------------%
-
-:- type lambda_info
-    --->    lambda_info(
-                prog_varset,            % from the proc_info
-                vartypes,               % from the proc_info
-                tvarset,                % from the proc_info
-                inst_varset,            % from the proc_info
-                rtti_varmaps,           % from the proc_info
-                pred_markers,           % from the pred_info
-                bool,                   % has_parallel_conj, from the proc_info
-                pred_or_func,
-                string,                 % pred/func name
-                module_info,
-                bool,                   % true iff we need to recompute
-                                        % the nonlocals
-                bool                    % true if we expanded some lambda
-                                        % expressions
-            ).
-
-%-----------------------------------------------------------------------------%
 %
 % This whole section just traverses the module structure
 %
@@ -174,10 +195,7 @@ expand_lambdas_in_proc(PredId, ProcId, !ModuleInfo) :-
 
 expand_lambdas_in_proc_2(!ProcInfo, !PredInfo, !ModuleInfo) :-
     % Grab the appropriate fields from the pred_info and proc_info.
-    PredName = pred_info_name(!.PredInfo),
-    PredOrFunc = pred_info_is_pred_or_func(!.PredInfo),
     pred_info_get_typevarset(!.PredInfo, TypeVarSet0),
-    pred_info_get_markers(!.PredInfo, Markers),
     proc_info_get_headvars(!.ProcInfo, HeadVars),
     proc_info_get_varset(!.ProcInfo, VarSet0),
     proc_info_get_vartypes(!.ProcInfo, VarTypes0),
@@ -190,11 +208,11 @@ expand_lambdas_in_proc_2(!ProcInfo, !PredInfo, !ModuleInfo) :-
 
     % Process the goal.
     Info0 = lambda_info(VarSet0, VarTypes0, TypeVarSet0, InstVarSet0,
-        RttiVarMaps0, Markers, HasParallelConj, PredOrFunc,
-        PredName, !.ModuleInfo, MustRecomputeNonLocals0, HaveExpandedLambdas0),
+        RttiVarMaps0, HasParallelConj, !.PredInfo, !.ModuleInfo,
+        MustRecomputeNonLocals0, HaveExpandedLambdas0),
     expand_lambdas_in_goal(Goal0, Goal1, Info0, Info1),
     Info1 = lambda_info(VarSet1, VarTypes1, TypeVarSet, _InstVarSet,
-        RttiVarMaps1, _, _, _, _, !:ModuleInfo, MustRecomputeNonLocals,
+        RttiVarMaps1, _, _PredInfo, !:ModuleInfo, MustRecomputeNonLocals,
         HaveExpandedLambdas),
 
     % Check if we need to requantify.
@@ -279,7 +297,7 @@ expand_lambdas_in_goal(Goal0, Goal, !Info) :-
         expand_lambdas_in_goal(Else0, Else, !Info),
         GoalExpr = if_then_else(Vars, Cond, Then, Else)
     ;
-        ( GoalExpr0 = generic_call(_, _, _, _)
+        ( GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
@@ -338,9 +356,9 @@ expand_lambdas_in_unify_goal(LHS, RHS0, Mode, Unification0, Context, GoalExpr,
         expand_lambdas_in_goal(LambdaGoal0, LambdaGoal, !Info),
 
         % Then, convert the lambda expression into a new predicate.
-        expand_lambda(Purity, Groundness, PredOrFunc, EvalMethod, Vars,
-            Modes, Det, NonLocalVars, LambdaGoal, Unification0, Y, Unification,
-            !Info),
+        expand_lambda(Purity, Groundness, PredOrFunc, EvalMethod,
+            not_reg_wrapper_proc, Vars, Modes, Det, NonLocalVars, LambdaGoal,
+            Unification0, Y, Unification, !Info),
         GoalExpr = unify(LHS, Y, Mode, Unification, Context)
     ;
         ( RHS0 = rhs_var(_)
@@ -350,17 +368,11 @@ expand_lambdas_in_unify_goal(LHS, RHS0, Mode, Unification0, Context, GoalExpr,
         GoalExpr = unify(LHS, RHS0, Mode, Unification0, Context)
     ).
 
-:- pred expand_lambda(purity::in, ho_groundness::in,
-    pred_or_func::in, lambda_eval_method::in,
-    list(prog_var)::in, list(mer_mode)::in, determinism::in,
-    list(prog_var)::in, hlds_goal::in, unification::in, unify_rhs::out,
-    unification::out, lambda_info::in, lambda_info::out) is det.
-
-expand_lambda(Purity, _Groundness, PredOrFunc, EvalMethod, Vars, Modes,
+expand_lambda(Purity, _Groundness, PredOrFunc, EvalMethod, RegWrapperProc, Vars, Modes,
         Detism, OrigNonLocals0, LambdaGoal, Unification0, Functor, Unification,
         LambdaInfo0, LambdaInfo) :-
     LambdaInfo0 = lambda_info(VarSet, VarTypes, TVarSet,
-        InstVarSet, RttiVarMaps, Markers, HasParallelConj, POF, OrigPredName,
+        InstVarSet, RttiVarMaps, HasParallelConj, OrigPredInfo,
         ModuleInfo0, MustRecomputeNonLocals0, _HaveExpandedLambdas),
 
     % Calculate the constraints which apply to this lambda expression.
@@ -385,15 +397,12 @@ expand_lambda(Purity, _Groundness, PredOrFunc, EvalMethod, Vars, Modes,
 
     (
         Unification0 = construct(Var, _, OrigNonLocals1, UniModes0, _, _, _),
-        % We use to use OrigVars = OrigNonLocals0 (from rhs_lambda_goal) but
-        % the order of the variables does not necessarily match UniModes0.
+        % We use to use OrigVars = OrigNonLocals0 (from the outer
+        % rhs_lambda_goal) but OrigNonLocals0 may be a reordered version of
+        % OrigNonLocals1.
         OrigVars = OrigNonLocals1,
-        trace [compiletime(flag("lambda_var_order"))] (
-            list.sort(OrigNonLocals0, SortedOrigNonLocals0),
-            list.sort(OrigNonLocals1, SortedOrigNonLocals1),
-            expect(unify(SortedOrigNonLocals0, SortedOrigNonLocals1),
-                $module, $pred, "OrigNonLocals0 != OrigNonLocals1")
-        )
+        expect(unify(sort(OrigNonLocals0), sort(OrigNonLocals1) : list(_)),
+            $module, $pred, "ConstructArgs != OrigVars")
     ;
         ( Unification0 = deconstruct(_, _, _, _, _, _)
         ; Unification0 = assign(_, _)
@@ -512,6 +521,7 @@ expand_lambda(Purity, _Groundness, PredOrFunc, EvalMethod, Vars, Modes,
         list.append(ArgVars, Vars, AllArgVars),
 
         module_info_get_name(ModuleInfo0, ModuleName),
+        OrigPredName = pred_info_name(OrigPredInfo),
         OrigContext = goal_info_get_context(LambdaGoalInfo),
         term.context_file(OrigContext, OrigFile),
         term.context_line(OrigContext, OrigLine),
@@ -577,10 +587,17 @@ expand_lambda(Purity, _Groundness, PredOrFunc, EvalMethod, Vars, Modes,
         (
             MustRecomputeNonLocals0 = yes,
             requantify_proc_general(ordinary_nonlocals_maybe_lambda,
-                ProcInfo2, ProcInfo)
+                ProcInfo2, ProcInfo3)
         ;
             MustRecomputeNonLocals0 = no,
-            ProcInfo = ProcInfo2
+            ProcInfo3 = ProcInfo2
+        ),
+        (
+            RegWrapperProc = reg_wrapper_proc(RegR_HeadVars),
+            proc_info_set_reg_r_headvars(RegR_HeadVars, ProcInfo3, ProcInfo)
+        ;
+            RegWrapperProc = not_reg_wrapper_proc,
+            ProcInfo = ProcInfo3
         ),
         set.init(Assertions),
         pred_info_create(ModuleName, PredName, PredOrFunc, LambdaContext,
@@ -603,7 +620,7 @@ expand_lambda(Purity, _Groundness, PredOrFunc, EvalMethod, Vars, Modes,
         construct_dynamically, cell_is_unique, no_construct_sub_info),
     HaveExpandedLambdas = yes,
     LambdaInfo = lambda_info(VarSet, VarTypes, TVarSet,
-        InstVarSet, RttiVarMaps, Markers, HasParallelConj, POF, OrigPredName,
+        InstVarSet, RttiVarMaps, HasParallelConj, OrigPredInfo,
         ModuleInfo, MustRecomputeNonLocals, HaveExpandedLambdas).
 
 :- pred constraint_contains_vars(list(tvar)::in, prog_constraint::in)
@@ -745,7 +762,7 @@ find_used_vars_in_goal(Goal, !VarUses) :-
             find_used_vars_in_goal(LambdaGoal, !VarUses)
         )
     ;
-        GoalExpr = generic_call(GenericCall, ArgVars, _, _),
+        GoalExpr = generic_call(GenericCall, ArgVars, _, _, _),
         (
             GenericCall = higher_order(Var, _, _, _),
             mark_var_as_used(Var, !VarUses)
diff --git a/compiler/lco.m b/compiler/lco.m
index c49e8f9..b85f4d7 100644
--- a/compiler/lco.m
+++ b/compiler/lco.m
@@ -401,7 +401,7 @@ lco_proc(LowerSCCVariants, SCC, CurProc, !ModuleInfo, !CurSCCVariants,
             proc_info_get_vartypes(ProcInfo0, VarTypes0),
             proc_info_get_headvars(ProcInfo0, HeadVars),
             proc_info_get_argmodes(ProcInfo0, ArgModes),
-            list.map(map.lookup(VarTypes0), HeadVars, ArgTypes),
+            map.apply_to_list(HeadVars, VarTypes0, ArgTypes),
             arg_info.compute_in_and_out_vars(!.ModuleInfo, HeadVars,
                 ArgModes, ArgTypes, _InputHeadVars, OutputHeadVars),
             proc_info_get_inferred_determinism(ProcInfo0, CurProcDetism),
@@ -536,7 +536,7 @@ lco_in_goal(Goal0, Goal, !Info, ConstInfo) :-
         )
     ;
         ( GoalExpr0 = negation(_)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _,  _, _, _)
@@ -1301,7 +1301,7 @@ lco_transform_variant_goal(ModuleInfo, VariantMap, VarToAddr, InstMap0,
         GoalInfo = GoalInfo0,
         Changed = no
     ;
-        GoalExpr0 = generic_call(_, _, _, _),
+        GoalExpr0 = generic_call(_, _, _, _, _),
         lco_transform_variant_atomic_goal(ModuleInfo, VarToAddr, InstMap0,
             GoalInfo0, GoalExpr0, GoalExpr, Changed, !ProcInfo),
         GoalInfo = GoalInfo0
diff --git a/compiler/live_vars.m b/compiler/live_vars.m
index 8dff713..614f336 100644
--- a/compiler/live_vars.m
+++ b/compiler/live_vars.m
@@ -440,7 +440,7 @@ build_live_sets_in_goal_2(GoalExpr0, GoalExpr, GoalInfo0, GoalInfo,
             GoalInfo = GoalInfo0
         )
     ;
-        GoalExpr0 = generic_call(GenericCall, ArgVars, Modes, _Det),
+        GoalExpr0 = generic_call(GenericCall, ArgVars, Modes, _, _),
         GoalExpr = GoalExpr0,
         (
             GenericCall = cast(_),
diff --git a/compiler/liveness.m b/compiler/liveness.m
index b7364e5..0e67155 100644
--- a/compiler/liveness.m
+++ b/compiler/liveness.m
@@ -392,7 +392,7 @@ detect_liveness_in_goal(Goal0, Goal, Liveness0, FinalLiveness, LiveInfo) :-
 
     (
         ( GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_,_,  _, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ),
@@ -661,7 +661,7 @@ detect_deadness_in_goal(Goal0, Goal, !Deadness, !.Liveness, LiveInfo) :-
 
     (
         ( GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ),
@@ -1021,7 +1021,7 @@ update_liveness_goal(Goal, LiveInfo, !Liveness) :-
 update_liveness_expr(GoalExpr, LiveInfo, !Liveness) :-
     (
         ( GoalExpr = plain_call(_, _, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ; GoalExpr = unify(_, _, _, _, _)
         )
@@ -1192,7 +1192,7 @@ delay_death_goal_expr(!GoalExpr, !GoalInfo, !BornVars, !DelayedDead, VarSet) :-
     (
         !.GoalExpr = plain_call(_, _, _, _, _, _)
     ;
-        !.GoalExpr = generic_call(_, _, _, _)
+        !.GoalExpr = generic_call(_, _, _, _, _)
     ;
         !.GoalExpr = unify(_, _, _, _, _)
     ;
@@ -1523,7 +1523,7 @@ detect_resume_points_in_goal(Goal0, Goal, !Liveness, LiveInfo, ResumeVars0) :-
         ),
         GoalExpr = scope(Reason, SubGoal)
     ;
-        ( GoalExpr0 = generic_call(_, _, _, _)
+        ( GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
diff --git a/compiler/llds.m b/compiler/llds.m
index 385cc55..c93338a 100644
--- a/compiler/llds.m
+++ b/compiler/llds.m
@@ -1305,13 +1305,14 @@
 :- type ho_call_variant
     --->    generic
             % This calls for the use of one of do_call_closure_compact and
-            % do_call_class_method_compact, which work for any number of
+            % do_call_class_method_compact which works for any number of
             % visible input arguments.
 
     ;       specialized_known(int).
             % If the integer is N, this calls for the use of do_call_closure_N
             % or do_call_class_method_N. These are specialized to assume N
-            % visible input arguments.
+            % visible regular register input arguments, and zero visible float
+            % register input arguments.
 
     % A tag (used in mkword, create and field expressions and in incr_hp
     % instructions) is a small integer.
diff --git a/compiler/loop_inv.m b/compiler/loop_inv.m
index a381e6f..1f9af39 100644
--- a/compiler/loop_inv.m
+++ b/compiler/loop_inv.m
@@ -304,7 +304,7 @@ invariant_goal_candidates_in_goal(PPId, Goal, !IGCs) :-
             invariant_goal_candidates_handle_primitive_goal(Goal, !IGCs)
         )
     ;
-        ( GoalExpr = generic_call(_, _, _, _)
+        ( GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = unify(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ),
@@ -671,7 +671,7 @@ cannot_succeed(hlds_goal(_GoalExpr, GoalInfo)) :-
 call_has_inst_any(ModuleInfo, Goal) :-
     Goal = hlds_goal(GoalExpr, _GoalInfo),
     (
-        GoalExpr = generic_call(_, _, Modes, _)
+        GoalExpr = generic_call(_, _, Modes, _, _)
     ;
         GoalExpr = plain_call(PredId, ProcId, _, _, _, _),
         Modes = argmodes(ModuleInfo, PredId, ProcId)
@@ -869,7 +869,7 @@ gen_aux_proc_goal(Info, Goal) = AuxGoal :-
             AuxGoal = gen_aux_proc_handle_non_recursive_call(Info, Goal)
         )
     ;
-        ( GoalExpr = generic_call(_, _, _, _)
+        ( GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = unify(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ),
@@ -983,7 +983,7 @@ gen_out_proc_goal(PPId, CallAux, Goal) = AuxGoal :-
             AuxGoal = Goal
         )
     ;
-        ( GoalExpr = generic_call(_, _, _, _)
+        ( GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = unify(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ),
@@ -1092,7 +1092,7 @@ used_vars(ModuleInfo, Goal) = UsedVars :-
             uniquely_used_args(ModuleInfo),
             Args, argmodes(ModuleInfo, PredId, ProcId))
     ;
-        GoalExpr = generic_call(_, Args, Modes, _),
+        GoalExpr = generic_call(_, Args, Modes, _, _),
         UsedVars = list.filter_map_corresponding(
             uniquely_used_args(ModuleInfo),
             Args, Modes)
@@ -1179,7 +1179,7 @@ goal_inputs(ModuleInfo, Goal) = Inputs :-
         Inputs = list.filter_map_corresponding(input_arg(ModuleInfo),
             Args, argmodes(ModuleInfo, PredId, ProcId))
     ;
-        GoalExpr = generic_call(GenericCall, Args, ArgModes, _),
+        GoalExpr = generic_call(GenericCall, Args, ArgModes, _, _),
         generic_call_vars(GenericCall, GenericCallVars),
         Inputs0 = list.filter_map_corresponding(input_arg(ModuleInfo),
             Args, ArgModes),
@@ -1261,7 +1261,7 @@ goal_outputs(ModuleInfo, Goal) = Outputs :-
         Outputs = list.filter_map_corresponding(output_arg(ModuleInfo),
             Args, argmodes(ModuleInfo, PredId, ProcId))
     ;
-        GoalExpr = generic_call(_, Args, ArgModes, _),
+        GoalExpr = generic_call(_, Args, ArgModes, _, _),
         Outputs = list.filter_map_corresponding(output_arg(ModuleInfo),
             Args, ArgModes)
     ;
diff --git a/compiler/make_hlds_passes.m b/compiler/make_hlds_passes.m
index dce9932..b763e6b 100644
--- a/compiler/make_hlds_passes.m
+++ b/compiler/make_hlds_passes.m
@@ -295,17 +295,7 @@ use_double_word_floats(Globals, DoubleWordFloats) :-
             TargetWordBits = 32,
             SinglePrecFloat = no
         ->
-            globals.lookup_bool_option(Globals, highlevel_code, HighLevelCode),
-            (
-                HighLevelCode = yes,
-                DoubleWordFloats = yes
-            ;
-                HighLevelCode = no,
-                % Double word floats in structures works best in conjunction
-                % with float registers. Work on the latter is in progress.
-                globals.lookup_bool_option(Globals, use_float_registers,
-                    DoubleWordFloats)
-            )
+            DoubleWordFloats = yes
         ;
             DoubleWordFloats = no
         )
diff --git a/compiler/make_hlds_warn.m b/compiler/make_hlds_warn.m
index fb0937a..5781a4d 100644
--- a/compiler/make_hlds_warn.m
+++ b/compiler/make_hlds_warn.m
@@ -282,7 +282,7 @@ warn_singletons_in_goal(Goal, QuantVars, !Info) :-
         NonLocals = goal_info_get_nonlocals(GoalInfo),
         warn_singletons_goal_vars(Args, GoalInfo, NonLocals, QuantVars, !Info)
     ;
-        GoalExpr = generic_call(GenericCall, Args0, _, _),
+        GoalExpr = generic_call(GenericCall, Args0, _, _, _),
         goal_util.generic_call_vars(GenericCall, Args1),
         Args = Args0 ++ Args1,
         NonLocals = goal_info_get_nonlocals(GoalInfo),
diff --git a/compiler/mark_static_terms.m b/compiler/mark_static_terms.m
index 9e0602e..e3aaff0 100644
--- a/compiler/mark_static_terms.m
+++ b/compiler/mark_static_terms.m
@@ -108,7 +108,7 @@ goal_mark_static_terms(Goal0, Goal, !SI) :-
         GoalExpr = if_then_else(Vars, Cond, Then, Else)
     ;
         ( GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         GoalExpr = GoalExpr0
diff --git a/compiler/mark_tail_calls.m b/compiler/mark_tail_calls.m
index 3f6808d..a4425af 100644
--- a/compiler/mark_tail_calls.m
+++ b/compiler/mark_tail_calls.m
@@ -135,7 +135,7 @@ mark_tail_calls_in_goal(Feature, PredId, ProcId, Outputs0, MaybeOutputs,
     Goal0 = hlds_goal(GoalExpr0, GoalInfo0),
     (
         ( GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = scope(_, _)
         ; GoalExpr0 = negation(_)
         ),
diff --git a/compiler/mercury_compile_middle_passes.m b/compiler/mercury_compile_middle_passes.m
index 3062904..78e7815 100644
--- a/compiler/mercury_compile_middle_passes.m
+++ b/compiler/mercury_compile_middle_passes.m
@@ -70,6 +70,7 @@
 :- import_module transform_hlds.distance_granularity.
 :- import_module transform_hlds.equiv_type_hlds.
 :- import_module transform_hlds.exception_analysis.
+:- import_module transform_hlds.float_regs.
 :- import_module transform_hlds.granularity.
 :- import_module transform_hlds.higher_order.
 :- import_module transform_hlds.implicit_parallelism.
@@ -240,6 +241,9 @@ middle_pass(ModuleName, !HLDS, !DumpInfo, !IO) :-
     maybe_lco(Verbose, Stats, !HLDS, !IO),
     maybe_dump_hlds(!.HLDS, 210, "lco", !DumpInfo, !IO),
 
+    maybe_float_reg_wrapper(Verbose, Stats, !HLDS, !IO),
+    maybe_dump_hlds(!.HLDS, 213, "float_reg_wrapper", !DumpInfo, !IO),
+
     % If we are compiling in a deep profiling grade then now rerun simplify.
     % The reason for doing this now is that we want to take advantage of any
     % opportunities the other optimizations have provided for constant
@@ -1329,6 +1333,28 @@ maybe_par_loop_control(Verbose, Stats, !HLDS, !IO) :-
 
 %-----------------------------------------------------------------------------%
 
+:- pred maybe_float_reg_wrapper(bool::in, bool::in,
+    module_info::in, module_info::out, io::di, io::uo) is det.
+
+maybe_float_reg_wrapper(Verbose, Stats, !HLDS, !IO) :-
+    module_info_get_globals(!.HLDS, Globals),
+    globals.lookup_bool_option(Globals, use_float_registers, UseFloatRegs),
+    (
+        UseFloatRegs = yes,
+        maybe_write_string(Verbose,
+            "% Inserting float register wrappers...", !IO),
+        maybe_flush_output(Verbose, !IO),
+        float_regs.insert_reg_wrappers(!HLDS, Specs),
+        write_error_specs(Specs, Globals, 0, _NumWarnings, 0, NumErrors, !IO),
+        module_info_incr_num_errors(NumErrors, !HLDS),
+        maybe_write_string(Verbose, " done.\n", !IO),
+        maybe_report_stats(Stats, !IO)
+    ;
+        UseFloatRegs = no
+    ).
+
+%-----------------------------------------------------------------------------%
+
 :- pred maybe_term_size_prof(bool::in, bool::in,
     module_info::in, module_info::out, io::di, io::uo) is det.
 
diff --git a/compiler/mercury_to_mercury.m b/compiler/mercury_to_mercury.m
index 92731eb..c69057a 100644
--- a/compiler/mercury_to_mercury.m
+++ b/compiler/mercury_to_mercury.m
@@ -1310,7 +1310,7 @@ mercury_format_structured_inst(not_reached, Indent, _, !U) :-
     pred_inst_info::in, inst_varset::in, U::di, U::uo) is det <= output(U).
 
 mercury_format_ground_pred_inst_info(Uniq, PredInstInfo, VarSet, !U) :-
-    PredInstInfo = pred_inst_info(PredOrFunc, Modes, Det),
+    PredInstInfo = pred_inst_info(PredOrFunc, Modes, MaybeArgRegs, Det),
     (
         Uniq = shared
     ;
@@ -1355,13 +1355,21 @@ mercury_format_ground_pred_inst_info(Uniq, PredInstInfo, VarSet, !U) :-
         add_string(" is ", !U),
         mercury_format_det(Det, !U),
         add_string(")", !U)
+    ),
+    (
+        MaybeArgRegs = yes(ArgRegs),
+        add_string(" /* arg regs: [", !U),
+        mercury_format_arg_reg_list(ArgRegs, !U),
+        add_string("] */", !U)
+    ;
+        MaybeArgRegs = no
     ).
 
 :- pred mercury_format_any_pred_inst_info(uniqueness::in, pred_inst_info::in,
     inst_varset::in, U::di, U::uo) is det <= output(U).
 
 mercury_format_any_pred_inst_info(Uniq, PredInstInfo, VarSet, !U) :-
-    PredInstInfo = pred_inst_info(PredOrFunc, Modes, Det),
+    PredInstInfo = pred_inst_info(PredOrFunc, Modes, MaybeArgRegs, Det),
     (
         Uniq = shared
     ;
@@ -1406,6 +1414,34 @@ mercury_format_any_pred_inst_info(Uniq, PredInstInfo, VarSet, !U) :-
         add_string(" is ", !U),
         mercury_format_det(Det, !U),
         add_string(")", !U)
+    ),
+    (
+        MaybeArgRegs = yes(ArgRegs),
+        add_string(" /* arg regs: [", !U),
+        mercury_format_arg_reg_list(ArgRegs, !U),
+        add_string("] */", !U)
+    ;
+        MaybeArgRegs = no
+    ).
+
+:- pred mercury_format_arg_reg_list(list(ho_arg_reg)::in, U::di, U::uo) is det
+    <= output(U).
+
+mercury_format_arg_reg_list([], !U).
+mercury_format_arg_reg_list([H | T], !U) :-
+    (
+        H = ho_arg_reg_r,
+        add_string("reg_r", !U)
+    ;
+        H = ho_arg_reg_f,
+        add_string("reg_f", !U)
+    ),
+    (
+        T = []
+    ;
+        T = [_ | _],
+        add_string(", ", !U),
+        mercury_format_arg_reg_list(T, !U)
     ).
 
 :- instance inst_info(simple_inst_info) where [
@@ -1879,7 +1915,7 @@ mercury_format_mode((InstA -> InstB), InstInfo, !U) :-
         % in a nice format
         %
         InstA = ground(_Uniq, higher_order(
-            pred_inst_info(_PredOrFunc, _Modes, _Det))),
+            pred_inst_info(_PredOrFunc, _Modes, _, _Det))),
         InstB = InstA
     ->
         mercury_format_inst(InstA, InstInfo, !U)
diff --git a/compiler/middle_rec.m b/compiler/middle_rec.m
index 2231aea..f47f1ca 100644
--- a/compiler/middle_rec.m
+++ b/compiler/middle_rec.m
@@ -208,7 +208,7 @@ contains_only_builtins_expr(GoalExpr) = OnlyBuiltins :-
         )
     ;
         ( GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ),
         OnlyBuiltins = no
     ;
diff --git a/compiler/ml_code_gen.m b/compiler/ml_code_gen.m
index 5de96ec..9bde86b 100644
--- a/compiler/ml_code_gen.m
+++ b/compiler/ml_code_gen.m
@@ -618,7 +618,7 @@ ml_gen_goal_expr(GoalExpr, CodeModel, Context, GoalInfo, Decls, Statements,
                 Decls, Statements, !Info)
         )
     ;
-        GoalExpr = generic_call(GenericCall, Vars, Modes, Detism),
+        GoalExpr = generic_call(GenericCall, Vars, Modes, _, Detism),
         determinism_to_code_model(Detism, CallCodeModel),
         expect(unify(CodeModel, CallCodeModel), $module, $pred,
             "code model mismatch"),
@@ -771,7 +771,8 @@ goal_expr_find_subgoal_nonlocals(GoalExpr, SubGoalNonLocals) :-
             _Unify_context, _SymName),
         SubGoalNonLocals = set_of_var.list_to_set(ArgVars)
     ;
-        GoalExpr = generic_call(GenericCall, ArgVars, _Modes, _Detism),
+        GoalExpr = generic_call(GenericCall, ArgVars, _Modes, _MaybeArgRegs,
+            _Detism),
         (
             GenericCall = higher_order(HOVar, _Purity, _Kind, _Arity),
             SubGoalNonLocals = set_of_var.list_to_set([HOVar | ArgVars])
diff --git a/compiler/mode_constraints.m b/compiler/mode_constraints.m
index d7ba922..7ea85df 100644
--- a/compiler/mode_constraints.m
+++ b/compiler/mode_constraints.m
@@ -482,7 +482,7 @@ number_robdd_variables_in_goal_2(InstGraph, GoalId, ParentNonLocals, _,
         ParentNonLocals, Args, Occurring, !RInfo).
 number_robdd_variables_in_goal_2(InstGraph, GoalId, ParentNonLocals, _,
         Occurring, GoalExpr, GoalExpr, !RInfo) :-
-    GoalExpr = generic_call(_, Args, _, _),
+    GoalExpr = generic_call(_, Args, _, _, _),
     number_robdd_variables_at_goal_path(InstGraph, GoalId,
         ParentNonLocals, Args, Occurring, !RInfo).
 number_robdd_variables_in_goal_2(InstGraph, GoalId, ParentNonLocals, _,
@@ -1015,7 +1015,7 @@ do_process_inst(ModuleInfo, InstGraph, Free, Bound, DoHO,
             )), Functors, !Constraint, !MDI),
     (
         DoHO = yes,
-        Inst = ground(_, higher_order(pred_inst_info(_, ArgModes, _)))
+        Inst = ground(_, higher_order(pred_inst_info(_, ArgModes, _, _)))
     ->
         HoModes0 = !.MDI ^ ho_modes,
         MCI = !.MDI ^ mc_info,
@@ -1335,7 +1335,8 @@ goal_constraints_2(GoalId, NonLocals, Vars, CanSucceed, GoalExpr0, GoalExpr,
         ),
         GoalExpr = GoalExpr0
     ;
-        GoalExpr0 = generic_call(GenericCall, Args, _Modes, _Det),
+        GoalExpr0 = generic_call(GenericCall, Args, _Modes, _MaybeArgRegs,
+            _Det),
         % Note: `_Modes' is invalid for higher-order calls at this point.
         (
             GenericCall = higher_order(Var, _, _, _),
diff --git a/compiler/mode_ordering.m b/compiler/mode_ordering.m
index 26b5fe1..4e1e3b3 100644
--- a/compiler/mode_ordering.m
+++ b/compiler/mode_ordering.m
@@ -225,7 +225,7 @@ mode_order_goal_2(GoalExpr0, GoalExpr, !GoalInfo, !MOI) :-
         goal_info_set_need_visible_vars(NeedVisibleVars, !GoalInfo),
         GoalExpr = GoalExpr0 ^ call_proc_id := ProcId
     ;
-        GoalExpr0 = generic_call(_GenericCall0, _Args, _Modes0, _Det),
+        GoalExpr0 = generic_call(_GenericCall0, _Args, _Modes0, _, _Det),
         unexpected($module, $pred, "generic_call NYI")
     ;
         GoalExpr0 = switch(_Var, _CanFail0, _Cases0),
diff --git a/compiler/mode_util.m b/compiler/mode_util.m
index b3bba5f..5541bcd 100644
--- a/compiler/mode_util.m
+++ b/compiler/mode_util.m
@@ -641,7 +641,7 @@ propagate_ctor_info(ModuleInfo, Type, Constructors, Inst0, Inst) :-
         )
     ;
         Inst0 = any(Uniq, higher_order(PredInstInfo0)),
-        PredInstInfo0 = pred_inst_info(PredOrFunc, Modes0, Det),
+        PredInstInfo0 = pred_inst_info(PredOrFunc, Modes0, MaybeArgRegs, Det),
         (
             type_is_higher_order_details(Type, _, PredOrFunc, _, ArgTypes),
             list.same_length(ArgTypes, Modes0)
@@ -654,7 +654,7 @@ propagate_ctor_info(ModuleInfo, Type, Constructors, Inst0, Inst) :-
             % with the inst.
             Modes = Modes0
         ),
-        PredInstInfo = pred_inst_info(PredOrFunc, Modes, Det),
+        PredInstInfo = pred_inst_info(PredOrFunc, Modes, MaybeArgRegs, Det),
         Inst = any(Uniq, higher_order(PredInstInfo))
     ;
         Inst0 = free,
@@ -689,7 +689,7 @@ propagate_ctor_info(ModuleInfo, Type, Constructors, Inst0, Inst) :-
         )
     ;
         Inst0 = ground(Uniq, higher_order(PredInstInfo0)),
-        PredInstInfo0 = pred_inst_info(PredOrFunc, Modes0, Det),
+        PredInstInfo0 = pred_inst_info(PredOrFunc, Modes0, MaybeArgRegs, Det),
         (
             type_is_higher_order_details(Type, _, PredOrFunc, _, ArgTypes),
             list.same_length(ArgTypes, Modes0)
@@ -702,7 +702,7 @@ propagate_ctor_info(ModuleInfo, Type, Constructors, Inst0, Inst) :-
             % with the inst.
             Modes = Modes0
         ),
-        PredInstInfo = pred_inst_info(PredOrFunc, Modes, Det),
+        PredInstInfo = pred_inst_info(PredOrFunc, Modes, MaybeArgRegs, Det),
         Inst = ground(Uniq, higher_order(PredInstInfo))
     ;
         Inst0 = not_reached,
@@ -738,7 +738,7 @@ propagate_ctor_info_lazily(ModuleInfo, Subst, Type0, Inst0, Inst) :-
         )
     ;
         Inst0 = any(Uniq, higher_order(PredInstInfo0)),
-        PredInstInfo0 = pred_inst_info(PredOrFunc, Modes0, Det),
+        PredInstInfo0 = pred_inst_info(PredOrFunc, Modes0, MaybeArgRegs, Det),
         apply_type_subst(Type0, Subst, Type),
         (
             type_is_higher_order_details(Type, _, PredOrFunc, _, ArgTypes),
@@ -752,7 +752,7 @@ propagate_ctor_info_lazily(ModuleInfo, Subst, Type0, Inst0, Inst) :-
             % inst.
             Modes = Modes0
         ),
-        PredInstInfo = pred_inst_info(PredOrFunc, Modes, Det),
+        PredInstInfo = pred_inst_info(PredOrFunc, Modes, MaybeArgRegs, Det),
         Inst = any(Uniq, higher_order(PredInstInfo))
     ;
         Inst0 = free,
@@ -789,7 +789,7 @@ propagate_ctor_info_lazily(ModuleInfo, Subst, Type0, Inst0, Inst) :-
         )
     ;
         Inst0 = ground(Uniq, higher_order(PredInstInfo0)),
-        PredInstInfo0 = pred_inst_info(PredOrFunc, Modes0, Det),
+        PredInstInfo0 = pred_inst_info(PredOrFunc, Modes0, MaybeArgRegs, Det),
         apply_type_subst(Type0, Subst, Type),
         (
             type_is_higher_order_details(Type, _, PredOrFunc, _, ArgTypes),
@@ -803,7 +803,7 @@ propagate_ctor_info_lazily(ModuleInfo, Subst, Type0, Inst0, Inst) :-
             % inst.
             Modes = Modes0
         ),
-        PredInstInfo = pred_inst_info(PredOrFunc, Modes, Det),
+        PredInstInfo = pred_inst_info(PredOrFunc, Modes, MaybeArgRegs, Det),
         Inst = ground(Uniq, higher_order(PredInstInfo))
     ;
         Inst0 = not_reached,
@@ -856,7 +856,7 @@ default_higher_order_func_inst(ModuleInfo, PredArgTypes, PredInstInfo) :-
     list.append(FuncArgModes, [FuncRetMode], PredArgModes0),
     propagate_types_into_mode_list(ModuleInfo, PredArgTypes,
         PredArgModes0, PredArgModes),
-    PredInstInfo = pred_inst_info(pf_function, PredArgModes, detism_det).
+    PredInstInfo = pred_inst_info(pf_function, PredArgModes, no, detism_det).
 
 constructors_to_bound_insts(ModuleInfo, Uniq, TypeCtor, Constructors,
         BoundInsts) :-
@@ -1210,7 +1210,7 @@ recompute_instmap_delta_2(RecomputeAtomic, GoalExpr0, GoalExpr, GoalInfo,
         ),
         GoalExpr = scope(Reason, SubGoal)
     ;
-        GoalExpr0 = generic_call(_Details, Vars, Modes, Detism),
+        GoalExpr0 = generic_call(_Details, Vars, Modes, _, Detism),
         ( determinism_components(Detism, _, at_most_zero) ->
             instmap_delta_init_unreachable(InstMapDelta)
         ;
@@ -1623,7 +1623,8 @@ cons_id_to_shared_inst(ModuleInfo, ConsId, NumArgs) = MaybeInst :-
         ;
             unexpected($module, $pred, "list.drop failed")
         ),
-        Inst = ground(shared, higher_order(pred_inst_info(PorF, Modes, Det))),
+        Inst = ground(shared, higher_order(pred_inst_info(PorF, Modes,
+            no, Det))),
         MaybeInst = yes(Inst)
     ;
         ( ConsId = type_ctor_info_const(_, _, _)
diff --git a/compiler/modecheck_call.m b/compiler/modecheck_call.m
index 6111916..8cd1a7a 100644
--- a/compiler/modecheck_call.m
+++ b/compiler/modecheck_call.m
@@ -241,7 +241,7 @@ modecheck_higher_order_call(PredOrFunc, PredVar, Args0, Args, Modes, Det,
             PredInstInfo = pred_inst_info_standard_func_mode(
                 list.length(ArgTypes))
         ),
-        PredInstInfo = pred_inst_info(PredOrFunc, ModesPrime, DetPrime),
+        PredInstInfo = pred_inst_info(PredOrFunc, ModesPrime, _, DetPrime),
         list.length(ModesPrime, Arity)
     ->
         (
diff --git a/compiler/modecheck_conj.m b/compiler/modecheck_conj.m
index 575b057..3106c21 100644
--- a/compiler/modecheck_conj.m
+++ b/compiler/modecheck_conj.m
@@ -531,7 +531,8 @@ candidate_init_vars_3(ModeInfo, Goal, !NonFree, !CandidateVars) :-
         % We assume that generic calls are deterministic. The modes field of
         % higher_order calls is junk until *after* mode analysis, hence we
         % can't handle them here.
-        GoalExpr = generic_call(Details, Args, ArgModes, _JunkDetism),
+        GoalExpr = generic_call(Details, Args, ArgModes, _JunkArgRegs,
+            _JunkDetism),
         Details \= higher_order(_, _, _, _),
         candidate_init_vars_call(ModeInfo, Args, ArgModes,
             !NonFree, !CandidateVars)
diff --git a/compiler/modecheck_goal.m b/compiler/modecheck_goal.m
index 3f2373a..b6f1717 100644
--- a/compiler/modecheck_goal.m
+++ b/compiler/modecheck_goal.m
@@ -206,7 +206,8 @@ modecheck_goal_expr(GoalExpr0, GoalInfo0, GoalExpr, !ModeInfo) :-
             MaybeCallUnifyContext, PredName, GoalInfo0, GoalExpr,
             !ModeInfo)
     ;
-        GoalExpr0 = generic_call(GenericCall, Args0, Modes0, _Detism),
+        GoalExpr0 = generic_call(GenericCall, Args0, Modes0, _MaybeArgRegs,
+            _Detism),
         modecheck_goal_generic_call(GenericCall, Args0, Modes0, GoalInfo0,
             GoalExpr, !ModeInfo)
     ;
@@ -499,7 +500,7 @@ goal_large_flat_constructs(Goal) = LargeFlatConstructs :-
         LargeFlatConstructs = set_of_var.init
     ;
         ( GoalExpr = plain_call(_, _, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         LargeFlatConstructs = set_of_var.init
@@ -559,7 +560,7 @@ set_large_flat_constructs_to_ground_in_goal(LargeFlatConstructs,
         Goal = Goal0
     ;
         ( GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         Goal = Goal0
@@ -1322,7 +1323,7 @@ modecheck_goal_generic_call(GenericCall, Args0, Modes0, GoalInfo0, GoalExpr,
         GenericCall = higher_order(PredVar, _, PredOrFunc, _),
         modecheck_higher_order_call(PredOrFunc, PredVar,
             Args0, Args, Modes, Det, ExtraGoals, !ModeInfo),
-        GoalExpr1 = generic_call(GenericCall, Args, Modes, Det),
+        GoalExpr1 = generic_call(GenericCall, Args, Modes, no, Det),
         AllArgs0 = [PredVar | Args0],
         AllArgs = [PredVar | Args],
         handle_extra_goals(GoalExpr1, ExtraGoals, GoalInfo0, AllArgs0, AllArgs,
@@ -1346,7 +1347,7 @@ modecheck_goal_generic_call(GenericCall, Args0, Modes0, GoalInfo0, GoalExpr,
             unexpected($module, $pred, "unknown event")
         ),
         modecheck_event_call(Modes, Args0, Args, !ModeInfo),
-        GoalExpr = generic_call(GenericCall, Args, Modes, detism_det)
+        GoalExpr = generic_call(GenericCall, Args, Modes, no, detism_det)
     ;
         GenericCall = cast(_CastType),
         (
@@ -1384,7 +1385,7 @@ modecheck_goal_generic_call(GenericCall, Args0, Modes0, GoalInfo0, GoalExpr,
             Modes = Modes0
         ),
         modecheck_builtin_cast(Modes, Args0, Args, Det, ExtraGoals, !ModeInfo),
-        GoalExpr1 = generic_call(GenericCall, Args, Modes, Det),
+        GoalExpr1 = generic_call(GenericCall, Args, Modes, no, Det),
         handle_extra_goals(GoalExpr1, ExtraGoals, GoalInfo0, Args0, Args,
             InstMap0, GoalExpr, !ModeInfo)
     ),
diff --git a/compiler/modecheck_unify.m b/compiler/modecheck_unify.m
index 5218a37..1869be2 100644
--- a/compiler/modecheck_unify.m
+++ b/compiler/modecheck_unify.m
@@ -492,7 +492,7 @@ modecheck_unify_lambda(X, PredOrFunc, ArgVars, LambdaModes, LambdaDet,
     mode_info_get_instmap(!.ModeInfo, InstMap0),
     instmap_lookup_var(InstMap0, X, InstOfX),
     InstOfY = ground(unique, higher_order(LambdaPredInfo)),
-    LambdaPredInfo = pred_inst_info(PredOrFunc, LambdaModes, LambdaDet),
+    LambdaPredInfo = pred_inst_info(PredOrFunc, LambdaModes, no, LambdaDet),
     (
         abstractly_unify_inst(is_dead, InstOfX, InstOfY, real_unify,
             UnifyInst, _Det, ModuleInfo0, ModuleInfo1)
diff --git a/compiler/module_qual.m b/compiler/module_qual.m
index d7cdab2..75a6438 100644
--- a/compiler/module_qual.m
+++ b/compiler/module_qual.m
@@ -1166,9 +1166,10 @@ qualify_inst(abstract_inst(Name, Args0), abstract_inst(Name, Args), !Info,
 
 qualify_ho_inst_info(HOInstInfo0, HOInstInfo, !Info, !Specs) :-
     (
-        HOInstInfo0 = higher_order(pred_inst_info(A, Modes0, Det)),
+        HOInstInfo0 = higher_order(pred_inst_info(A, Modes0, MaybeArgRegs,
+            Det)),
         qualify_mode_list(Modes0, Modes, !Info, !Specs),
-        HOInstInfo = higher_order(pred_inst_info(A, Modes, Det))
+        HOInstInfo = higher_order(pred_inst_info(A, Modes, MaybeArgRegs, Det))
     ;
         HOInstInfo0 = none,
         HOInstInfo = none
diff --git a/compiler/notes/compiler_design.html b/compiler/notes/compiler_design.html
index e107e3e..df73e83 100644
--- a/compiler/notes/compiler_design.html
+++ b/compiler/notes/compiler_design.html
@@ -1168,11 +1168,6 @@ specialized versions without them (unused_args.m); type_infos are often unused.
   even if the user doesn't, and automatically constructed unification and
   comparison predicates are often dead as well.
 
-<li> elimination of dead procedures (dead_proc_elim.m). Inlining, higher-order
-  specialization and the elimination of unused args can make procedures dead
-  even the user doesn't, and automatically constructed unification and
-  comparison predicates are often dead as well.
-
 <li> tupling.m looks for predicates that pass around several arguments,
   and modifies the code to pass around a single tuple of these arguments
   instead if this looks like reducing the cost of parameter passing.
@@ -1203,6 +1198,10 @@ specialized versions without them (unused_args.m); type_infos are often unused.
 	     introduce_parallelism.m to do a better job.
 	</ul>
 
+<dt> float_regs.m wraps higher-order terms which use float registers
+  if passed in contexts where regular registers would be expected,
+  and vice versa.
+
 </ul>
 
 <p>
diff --git a/compiler/notes/todo.html b/compiler/notes/todo.html
index cc3ea1b..cebdf2d 100644
--- a/compiler/notes/todo.html
+++ b/compiler/notes/todo.html
@@ -270,8 +270,6 @@ mercury at csse.unimelb.edu.au.
 <h2> LLDS back-end: </h2>
 
 <ul>
-<li> use floating point registers
-
 <li> inter-procedural register allocation 
 
 <li> other specializations, e.g. if argument is known to be bound to
diff --git a/compiler/options.m b/compiler/options.m
index 340a895..e7cf04d 100644
--- a/compiler/options.m
+++ b/compiler/options.m
@@ -1349,7 +1349,7 @@ option_defaults_2(compilation_model_option, [
     gcc_non_local_gotos                 -   bool(yes),
     gcc_global_registers                -   bool(yes),
     asm_labels                          -   bool(yes),
-    use_float_registers                 -   bool(no),
+    use_float_registers                 -   bool(yes),
 
     % MLDS back-end compilation model options
     highlevel_code                      -   bool(no),
diff --git a/compiler/ordering_mode_constraints.m b/compiler/ordering_mode_constraints.m
index 0d51a10..8b05a45 100644
--- a/compiler/ordering_mode_constraints.m
+++ b/compiler/ordering_mode_constraints.m
@@ -333,7 +333,7 @@ goal_reordering(ContainingGoalMap, PredId, VarMap, Bindings, Goal0, Goal) :-
     Goal0 = hlds_goal(GoalExpr0, GoalInfo),
     (
         ( GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
@@ -770,7 +770,7 @@ dump_goal_goal_paths(Globals, Indent, Goal, !IO) :-
     SubGoalIndent = Indent + 1,
     (
         ( GoalExpr = plain_call(_, _, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = unify(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         )
diff --git a/compiler/par_loop_control.m b/compiler/par_loop_control.m
index 058f644..892e3c8 100644
--- a/compiler/par_loop_control.m
+++ b/compiler/par_loop_control.m
@@ -247,7 +247,7 @@ goal_get_loop_control_par_conjs(Goal, SelfPredProcId, SeenUsableRecursion) :-
                 SeenUsableRecursion0 = have_not_seen_recursive_call
             )
         ;
-            GoalExpr = generic_call(_, _, _, _),
+            GoalExpr = generic_call(_, _, _, _, _),
             % We cannot determine if a generic call is recursive or not,
             % however it most likely is not.  In either case we cannot perform
             % the loop control transformation.
@@ -847,7 +847,7 @@ goal_rewrite_recursive_call(Info, !Goal, FixupGoalInfo) :-
     !.Goal = hlds_goal(GoalExpr0, GoalInfo),
     (
         ( GoalExpr0 = unify(_, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         GoalExpr = GoalExpr0,
@@ -986,7 +986,7 @@ goal_loop_control_fixup(Info, RecParConjIds, FixupGoalInfo, !Goal) :-
         !.Goal = hlds_goal(GoalExpr0, _),
         (
             ( GoalExpr0 = unify(_, _, _, _, _)
-            ; GoalExpr0 = generic_call(_, _, _, _)
+            ; GoalExpr0 = generic_call(_, _, _, _, _)
             ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
             ),
             % These cannot be a recursive call and they cannot be a base case
diff --git a/compiler/parallel_to_plain_conj.m b/compiler/parallel_to_plain_conj.m
index ce885a5..0026f20 100644
--- a/compiler/parallel_to_plain_conj.m
+++ b/compiler/parallel_to_plain_conj.m
@@ -94,7 +94,7 @@ parallel_to_plain_conjs_goal(Goal0, Goal) :-
     ;
         ( GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         GoalExpr = GoalExpr0
diff --git a/compiler/pd_cost.m b/compiler/pd_cost.m
index f597c30..ebe92d1 100644
--- a/compiler/pd_cost.m
+++ b/compiler/pd_cost.m
@@ -100,7 +100,7 @@ goal_expr_cost(GoalExpr, GoalInfo, Cost) :-
             goal_cost(Goal, Cost)
         )
     ;
-        GoalExpr = generic_call(_, Args, _, _),
+        GoalExpr = generic_call(_, Args, _, _, _),
         list.length(Args, Arity),
         Cost0 = cost_of_reg_assign * Arity // 2,
         Cost = Cost0 + cost_of_stack_flush + cost_of_higher_order_call
diff --git a/compiler/pd_util.m b/compiler/pd_util.m
index 1f818aa..4cdcfa9 100644
--- a/compiler/pd_util.m
+++ b/compiler/pd_util.m
@@ -691,7 +691,7 @@ get_sub_branch_vars_goal(ProcArgInfo, [Goal | GoalList],
     ;
         ( GoalExpr = unify(_, _, _, _, _)
         ; GoalExpr = plain_call(_, _, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ; GoalExpr = conj(_, _)
         ; GoalExpr = negation(_)
@@ -1070,9 +1070,9 @@ goals_match_2([OldGoal | OldGoals], [NewGoal | NewGoals], !ONRenaming) :-
             % We don't need to check the modes here - if the goals match
             % and the insts of the argument variables match, the modes
             % of the call must be the same.
-            OldGoal = hlds_goal(generic_call(OldGenericCall, OldArgs1, _, Det),
+            OldGoal = hlds_goal(generic_call(OldGenericCall, OldArgs1, _, _, Det),
                 _),
-            NewGoal = hlds_goal(generic_call(NewGenericCall, NewArgs1, _, Det),
+            NewGoal = hlds_goal(generic_call(NewGenericCall, NewArgs1, _, _, Det),
                 _),
             match_generic_call(OldGenericCall, NewGenericCall),
             goal_util.generic_call_vars(OldGenericCall, OldArgs0),
diff --git a/compiler/polymorphism.m b/compiler/polymorphism.m
index 9b1f489..44b244b 100644
--- a/compiler/polymorphism.m
+++ b/compiler/polymorphism.m
@@ -1066,7 +1066,7 @@ polymorphism_process_goal_expr(GoalExpr0, GoalInfo0, Goal, !Info) :-
         % We don't need to add type_infos for higher order calls, since the
         % type_infos are added when the closures are constructed, not when
         % they are called.
-        GoalExpr0 = generic_call(_, _, _, _),
+        GoalExpr0 = generic_call(_, _, _, _, _),
         Goal = hlds_goal(GoalExpr0, GoalInfo0)
     ;
         GoalExpr0 = plain_call(PredId, _, ArgVars0, _, _, _),
@@ -3625,7 +3625,7 @@ expand_class_method_body(hlds_class_proc(PredId, ProcId), !ProcNum,
     BodyGoalExpr = generic_call(
         class_method(TypeClassInfoVar, !.ProcNum,
             class_id(ClassName, InstanceArity), CallId),
-        HeadVars, Modes, Detism),
+        HeadVars, Modes, no, Detism),
 
     % Make the goal info for the call.
     set_of_var.list_to_set(HeadVars0, NonLocals),
diff --git a/compiler/post_typecheck.m b/compiler/post_typecheck.m
index 03c4af9..3aef772 100644
--- a/compiler/post_typecheck.m
+++ b/compiler/post_typecheck.m
@@ -467,7 +467,7 @@ describe_constrained_goal(ModuleInfo, Goal) = Pieces :-
             CallPieces = describe_one_pred_name(ModuleInfo,
                 should_module_qualify, PredId)
         ;
-            GoalExpr = generic_call(GenericCall, _, _, _),
+            GoalExpr = generic_call(GenericCall, _, _, _, _),
             GenericCall = class_method(_, _, _, SimpleCallId),
             CallPieces = [simple_call(SimpleCallId)]
         ;
@@ -477,11 +477,11 @@ describe_constrained_goal(ModuleInfo, Goal) = Pieces :-
         ),
         Pieces = [words("the call to") | CallPieces]
     ;
-        GoalExpr = generic_call(higher_order(_, _, _, _), _, _, _),
+        GoalExpr = generic_call(higher_order(_, _, _, _), _, _, _, _),
         Pieces = [words("a higher-order call here")]
     ;
-        ( GoalExpr = generic_call(event_call(_), _, _, _)
-        ; GoalExpr = generic_call(cast(_), _, _, _)
+        ( GoalExpr = generic_call(event_call(_), _, _, _, _)
+        ; GoalExpr = generic_call(cast(_), _, _, _, _)
         ; GoalExpr = unify(_, _, _, _, _)
         ; GoalExpr = conj(_, _)
         ; GoalExpr = disj(_)
@@ -700,7 +700,7 @@ in_interface_check(ModuleInfo, PredInfo, Goal, !Specs) :-
             true
         )
     ;
-        GoalExpr = generic_call(_, _, _, _)
+        GoalExpr = generic_call(_, _, _, _, _)
     ;
         GoalExpr = unify(Var, RHS, _, _, _),
         Context = goal_info_get_context(GoalInfo),
@@ -1054,7 +1054,7 @@ resolve_unify_functor(X0, ConsId0, ArgVars0, Mode0, Unification0, UnifyContext,
         adjust_func_arity(pf_function, Arity, FullArity),
         HOCall = generic_call(
             higher_order(FuncVar, Purity, pf_function, FullArity),
-            ArgVars, Modes, Det),
+            ArgVars, Modes, no, Det),
         Goal = hlds_goal(HOCall, GoalInfo0),
         IsPlainUnify = is_not_plain_unify
     ;
diff --git a/compiler/prog_data.m b/compiler/prog_data.m
index d8033cf..d935f43 100644
--- a/compiler/prog_data.m
+++ b/compiler/prog_data.m
@@ -2073,10 +2073,21 @@ get_type_kind(kinded_type(_, Kind)) = Kind.
                                     % of the return value as the last element
                                     % of the list.
 
+                maybe(list(ho_arg_reg)),
+                                    % The register type to use for each of the
+                                    % additional arguments of the pred. This
+                                    % field is only needed when float registers
+                                    % exist, and is only set after the float
+                                    % reg wrappers pass.
+
                 determinism         % The determinism of the predicate or
                                     % function.
             ).
 
+:- type ho_arg_reg
+    --->    ho_arg_reg_r
+    ;       ho_arg_reg_f.
+
 :- type inst_id
     --->    inst_id(sym_name, arity).
 
diff --git a/compiler/prog_io.m b/compiler/prog_io.m
index 657d53f..a59c7a0 100644
--- a/compiler/prog_io.m
+++ b/compiler/prog_io.m
@@ -2309,9 +2309,9 @@ constrain_inst_vars_in_inst(InstConstraints, Inst0, Inst) :-
     pred_inst_info::in, pred_inst_info::out) is det.
 
 constrain_inst_vars_in_pred_inst_info(InstConstraints, PII0, PII) :-
-    PII0 = pred_inst_info(PredOrFunc, Modes0, Det),
+    PII0 = pred_inst_info(PredOrFunc, Modes0, MaybeArgRegs, Det),
     list.map(constrain_inst_vars_in_mode(InstConstraints), Modes0, Modes),
-    PII = pred_inst_info(PredOrFunc, Modes, Det).
+    PII = pred_inst_info(PredOrFunc, Modes, MaybeArgRegs, Det).
 
 :- pred constrain_inst_vars_in_inst_name(inst_var_sub::in,
     inst_name::in, inst_name::out) is det.
@@ -2394,7 +2394,7 @@ inst_var_constraints_are_consistent_in_inst(Inst, !Sub) :-
         (
             HOInstInfo = none
         ;
-            HOInstInfo = higher_order(pred_inst_info(_, Modes, _)),
+            HOInstInfo = higher_order(pred_inst_info(_, Modes, _, _)),
             inst_var_constraints_are_consistent_in_modes(Modes, !Sub)
         )
     ;
diff --git a/compiler/prog_io_util.m b/compiler/prog_io_util.m
index 7d4d7d2..f7015dd 100644
--- a/compiler/prog_io_util.m
+++ b/compiler/prog_io_util.m
@@ -543,7 +543,7 @@ convert_mode(AllowConstrainedInstVar, Term, Mode) :-
         DetTerm = term.functor(term.atom(DetString), [], _),
         standard_det(DetString, Detism),
         convert_mode_list(AllowConstrainedInstVar, ArgModesTerms, ArgModes),
-        PredInstInfo = pred_inst_info(pf_predicate, ArgModes, Detism),
+        PredInstInfo = pred_inst_info(pf_predicate, ArgModes, no, Detism),
         Inst = ground(shared, higher_order(PredInstInfo)),
         Mode = (Inst -> Inst)
     ;
@@ -564,7 +564,7 @@ convert_mode(AllowConstrainedInstVar, Term, Mode) :-
         convert_mode_list(AllowConstrainedInstVar, ArgModesTerms, ArgModes0),
         convert_mode(AllowConstrainedInstVar, RetModeTerm, RetMode),
         list.append(ArgModes0, [RetMode], ArgModes),
-        FuncInstInfo = pred_inst_info(pf_function, ArgModes, Detism),
+        FuncInstInfo = pred_inst_info(pf_function, ArgModes, no, Detism),
         Inst = ground(shared, higher_order(FuncInstInfo)),
         Mode = (Inst -> Inst)
     ;
@@ -582,7 +582,7 @@ convert_mode(AllowConstrainedInstVar, Term, Mode) :-
         DetTerm = term.functor(term.atom(DetString), [], _),
         standard_det(DetString, Detism),
         convert_mode_list(AllowConstrainedInstVar, ArgModesTerms, ArgModes),
-        PredInstInfo = pred_inst_info(pf_predicate, ArgModes, Detism),
+        PredInstInfo = pred_inst_info(pf_predicate, ArgModes, no, Detism),
         Inst = any(shared, higher_order(PredInstInfo)),
         Mode = (Inst -> Inst)
     ;
@@ -603,7 +603,7 @@ convert_mode(AllowConstrainedInstVar, Term, Mode) :-
         convert_mode_list(AllowConstrainedInstVar, ArgModesTerms, ArgModes0),
         convert_mode(AllowConstrainedInstVar, RetModeTerm, RetMode),
         list.append(ArgModes0, [RetMode], ArgModes),
-        FuncInstInfo = pred_inst_info(pf_function, ArgModes, Detism),
+        FuncInstInfo = pred_inst_info(pf_function, ArgModes, no, Detism),
         Inst = any(shared, higher_order(FuncInstInfo)),
         Mode = (Inst -> Inst)
     ;
@@ -641,7 +641,7 @@ convert_inst(AllowConstrainedInstVar, Term, Result) :-
         DetTerm = term.functor(term.atom(DetString), [], _),
         standard_det(DetString, Detism),
         convert_mode_list(AllowConstrainedInstVar, ArgModesTerm, ArgModes),
-        PredInst = pred_inst_info(pf_predicate, ArgModes, Detism),
+        PredInst = pred_inst_info(pf_predicate, ArgModes, no, Detism),
         Result = ground(shared, higher_order(PredInst))
     ;
         % The syntax for a ground higher-order func inst is
@@ -660,7 +660,7 @@ convert_inst(AllowConstrainedInstVar, Term, Result) :-
         convert_mode_list(AllowConstrainedInstVar, ArgModesTerm, ArgModes0),
         convert_mode(AllowConstrainedInstVar, RetModeTerm, RetMode),
         list.append(ArgModes0, [RetMode], ArgModes),
-        FuncInst = pred_inst_info(pf_function, ArgModes, Detism),
+        FuncInst = pred_inst_info(pf_function, ArgModes, no, Detism),
         Result = ground(shared, higher_order(FuncInst))
     ;
         % The syntax for an `any' higher-order pred inst is
@@ -676,7 +676,7 @@ convert_inst(AllowConstrainedInstVar, Term, Result) :-
         DetTerm = term.functor(term.atom(DetString), [], _),
         standard_det(DetString, Detism),
         convert_mode_list(AllowConstrainedInstVar, ArgModesTerm, ArgModes),
-        PredInst = pred_inst_info(pf_predicate, ArgModes, Detism),
+        PredInst = pred_inst_info(pf_predicate, ArgModes, no, Detism),
         Result = any(shared, higher_order(PredInst))
     ;
         % The syntax for an `any' higher-order func inst is
@@ -695,7 +695,7 @@ convert_inst(AllowConstrainedInstVar, Term, Result) :-
         convert_mode_list(AllowConstrainedInstVar, ArgModesTerm, ArgModes0),
         convert_mode(AllowConstrainedInstVar, RetModeTerm, RetMode),
         list.append(ArgModes0, [RetMode], ArgModes),
-        FuncInst = pred_inst_info(pf_function, ArgModes, Detism),
+        FuncInst = pred_inst_info(pf_function, ArgModes, no, Detism),
         Result = any(shared, higher_order(FuncInst))
 
     ; Name = "bound", Args0 = [Disj] ->
diff --git a/compiler/prog_mode.m b/compiler/prog_mode.m
index 27764be..4dc38af 100644
--- a/compiler/prog_mode.m
+++ b/compiler/prog_mode.m
@@ -351,9 +351,11 @@ alt_list_apply_substitution(Subst, [Alt0 | Alts0], [Alt | Alts]) :-
 
 ho_inst_info_apply_substitution(_, none, none).
 ho_inst_info_apply_substitution(Subst, HOInstInfo0, HOInstInfo) :-
-    HOInstInfo0 = higher_order(pred_inst_info(PredOrFunc, Modes0, Det)),
+    HOInstInfo0 = higher_order(pred_inst_info(PredOrFunc, Modes0, MaybeArgRegs,
+        Det)),
     mode_list_apply_substitution(Subst, Modes0, Modes),
-    HOInstInfo = higher_order(pred_inst_info(PredOrFunc, Modes, Det)).
+    HOInstInfo = higher_order(pred_inst_info(PredOrFunc, Modes, MaybeArgRegs,
+        Det)).
 
 mode_list_apply_substitution(Subst, Modes0, Modes) :-
     ( map.is_empty(Subst) ->
@@ -392,9 +394,11 @@ rename_apart_inst_vars_in_mode(Sub, user_defined_mode(Name, Insts0),
 rename_apart_inst_vars_in_inst(Sub, any(Uniq, HOInstInfo0),
         any(Uniq, HOInstInfo)) :-
     (
-        HOInstInfo0 = higher_order(pred_inst_info(PorF, Modes0, Det)),
+        HOInstInfo0 = higher_order(pred_inst_info(PorF, Modes0, MaybeArgRegs,
+            Det)),
         list.map(rename_apart_inst_vars_in_mode(Sub), Modes0, Modes),
-        HOInstInfo = higher_order(pred_inst_info(PorF, Modes, Det))
+        HOInstInfo = higher_order(pred_inst_info(PorF, Modes, MaybeArgRegs,
+            Det))
     ;
         HOInstInfo0 = none,
         HOInstInfo = none
@@ -409,9 +413,9 @@ rename_apart_inst_vars_in_inst(Sub, bound(U, BIs0), bound(U, BIs)) :-
 rename_apart_inst_vars_in_inst(Sub, ground(Uniq, HOInstInfo0),
         ground(Uniq, HOInstInfo)) :-
     (
-        HOInstInfo0 = higher_order(pred_inst_info(PorF, Modes0, Det)),
+        HOInstInfo0 = higher_order(pred_inst_info(PorF, Modes0, ArgRegs, Det)),
         list.map(rename_apart_inst_vars_in_mode(Sub), Modes0, Modes),
-        HOInstInfo = higher_order(pred_inst_info(PorF, Modes, Det))
+        HOInstInfo = higher_order(pred_inst_info(PorF, Modes, ArgRegs, Det))
     ;
         HOInstInfo0 = none,
         HOInstInfo = none
@@ -462,7 +466,7 @@ inst_contains_unconstrained_var(bound(_Uniqueness, BoundInsts)) :-
     inst_contains_unconstrained_var(ArgInst).
 inst_contains_unconstrained_var(ground(_Uniqueness, GroundInstInfo)) :-
     GroundInstInfo = higher_order(PredInstInfo),
-    PredInstInfo = pred_inst_info(_PredOrFunc, Modes, _Detism),
+    PredInstInfo = pred_inst_info(_PredOrFunc, Modes, _MaybeArgRegs, _Detism),
     list.member(Mode, Modes),
     (
         Mode = (Inst -> _)
@@ -683,9 +687,9 @@ strip_builtin_qualifiers_from_inst_name(typed_inst(Type, InstName0),
 strip_builtin_qualifiers_from_ho_inst_info(none, none).
 strip_builtin_qualifiers_from_ho_inst_info(higher_order(Pred0),
         higher_order(Pred)) :-
-    Pred0 = pred_inst_info(PorF, Modes0, Det),
-    Pred = pred_inst_info(PorF, Modes, Det),
-    strip_builtin_qualifiers_from_mode_list(Modes0, Modes).
+    Pred0 = pred_inst_info(PorF, Modes0, ArgRegs, Det),
+    strip_builtin_qualifiers_from_mode_list(Modes0, Modes),
+    Pred = pred_inst_info(PorF, Modes, ArgRegs, Det).
 
 %-----------------------------------------------------------------------------%
 :- end_module parse_tree.prog_mode.
diff --git a/compiler/prog_rep.m b/compiler/prog_rep.m
index 4a6b6f4..16934dd 100644
--- a/compiler/prog_rep.m
+++ b/compiler/prog_rep.m
@@ -293,7 +293,7 @@ goal_to_goal_rep(Info, Instmap0, hlds_goal(GoalExpr, GoalInfo), GoalRep) :-
         GoalExprRep = scope_rep(SubGoalRep, MaybeCut)
     ;
         ( GoalExpr = unify(_, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = plain_call(_, _, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ),
@@ -342,7 +342,7 @@ goal_to_goal_rep(Info, Instmap0, hlds_goal(GoalExpr, GoalInfo), GoalRep) :-
                 unexpected($module, $pred, "complicated_unify")
             )
         ;
-            GoalExpr = generic_call(GenericCall, Args, _, _),
+            GoalExpr = generic_call(GenericCall, Args, _, _, _),
             ArgsRep = map(var_to_var_rep(Info), Args),
             (
                 GenericCall = higher_order(PredVar, _, _, _),
diff --git a/compiler/prog_type.m b/compiler/prog_type.m
index a310df6..db27c6e 100644
--- a/compiler/prog_type.m
+++ b/compiler/prog_type.m
@@ -66,6 +66,10 @@
     pred_or_func::out, lambda_eval_method::out, list(mer_type)::out)
     is semidet.
 
+:- pred type_is_higher_order_details_det(mer_type::in, purity::out,
+    pred_or_func::out, lambda_eval_method::out, list(mer_type)::out)
+    is det.
+
     % Succeed if the given type is a tuple type, returning
     % the argument types.
     %
@@ -440,6 +444,17 @@ type_is_higher_order_details(Type, Purity, PredOrFunc, EvalMethod,
         PredArgTypes = ArgTypes
     ).
 
+type_is_higher_order_details_det(Type, !:Purity, !:PredOrFunc, !:EvalMethod,
+        !:PredArgTypes) :-
+    (
+        type_is_higher_order_details(Type, !:Purity, !:PredOrFunc,
+            !:EvalMethod, !:PredArgTypes)
+    ->
+        true
+    ;
+        unexpected($module, $pred, "type is not higher-order")
+    ).
+
 type_is_tuple(Type, ArgTypes) :-
     strip_kind_annotation(Type) = tuple_type(ArgTypes, _).
 
@@ -536,7 +551,7 @@ type_to_ctor_and_args_det(Type, TypeCtor, Args) :-
         TypeCtor = TypeCtorPrime,
         Args = ArgsPrime
     ;
-        unexpected($module, $pred, "type_to_ctor_and_args failed")
+        unexpected($module, $pred, "type_to_ctor_and_args failed: " ++ string(Type))
     ).
 
 type_to_ctor(Type, TypeCtor) :-
diff --git a/compiler/prop_mode_constraints.m b/compiler/prop_mode_constraints.m
index 796e8d6..8b591d8 100644
--- a/compiler/prop_mode_constraints.m
+++ b/compiler/prop_mode_constraints.m
@@ -304,7 +304,8 @@ ensure_unique_arguments_in_goal(!Goal, !SeenSoFar, !VarSet, !VarTypes) :-
             )
 
         ;
-            !.GoalExpr = generic_call(Details, Args0, Modes, Determinism),
+            !.GoalExpr = generic_call(Details, Args0, Modes, MaybeArgRegs,
+                Determinism),
             Context = goal_info_get_context(!.GoalInfo),
             make_unifications(Context, Unifications, Args0, Args, !SeenSoFar,
                 !VarSet, !VarTypes),
@@ -316,7 +317,8 @@ ensure_unique_arguments_in_goal(!Goal, !SeenSoFar, !VarSet, !VarTypes) :-
                 % Need to put the call with its new args in a conjunction
                 % with the unifications.
                 Unifications = [_ | _],
-                CallGoalExpr = generic_call(Details, Args, Modes, Determinism),
+                CallGoalExpr = generic_call(Details, Args, Modes, MaybeArgRegs,
+                    Determinism),
                 replace_call_with_conjunction(CallGoalExpr, Unifications,
                     Args, !:GoalExpr, !GoalInfo)
             )
diff --git a/compiler/purity.m b/compiler/purity.m
index 869d21d..6d92e2f 100644
--- a/compiler/purity.m
+++ b/compiler/purity.m
@@ -190,6 +190,7 @@
 :- import_module bool.
 :- import_module list.
 :- import_module map.
+:- import_module maybe.
 :- import_module pair.
 :- import_module require.
 :- import_module set.
@@ -516,7 +517,7 @@ compute_expr_purity(GoalExpr0, GoalExpr, GoalInfo, Purity, ContainsTrace,
                 ArgVars = [InputArg, OutputArg]
             ->
                 GoalExpr = generic_call(cast(unsafe_type_cast),
-                    [InputArg, OutputArg], [in_mode, out_mode], detism_det)
+                    [InputArg, OutputArg], [in_mode, out_mode], no, detism_det)
             ;
                 GoalExpr = plain_call(PredId, ProcId, ArgVars, Status,
                     MaybeUnifyContext, SymName)
@@ -532,7 +533,8 @@ compute_expr_purity(GoalExpr0, GoalExpr, GoalInfo, Purity, ContainsTrace,
         Purity = ActualPurity,
         ContainsTrace = contains_no_trace_goal
     ;
-        GoalExpr0 = generic_call(GenericCall0, _ArgVars, _Modes0, _Det),
+        GoalExpr0 = generic_call(GenericCall0, _ArgVars, _Modes0,
+            _MaybeArgRegs, _Det),
         GoalExpr = GoalExpr0,
         (
             GenericCall0 = higher_order(_, Purity, _, _)
diff --git a/compiler/push_goals_together.m b/compiler/push_goals_together.m
index 8c416c5..1a4394b 100644
--- a/compiler/push_goals_together.m
+++ b/compiler/push_goals_together.m
@@ -490,7 +490,7 @@ is_pushable_goal(PushInfo, Goal, Pushable) :-
             )
         ;
             ( GoalExpr = plain_call(_, _, _, _, _, _)
-            ; GoalExpr = generic_call(_, _, _, _)
+            ; GoalExpr = generic_call(_, _, _, _, _)
             ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
             ),
             Pushable = pushable
@@ -627,7 +627,7 @@ push_into_goal(LoHi, HeadPath, TailPaths, Goal0, Goal, Pushable) :-
         (
             ( GoalExpr0 = unify(_, _, _, _, _)
             ; GoalExpr0 = plain_call(_, _, _, _, _, _)
-            ; GoalExpr0 = generic_call(_, _, _, _)
+            ; GoalExpr0 = generic_call(_, _, _, _, _)
             ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
             ),
             Goal = Goal0,
diff --git a/compiler/quantification.m b/compiler/quantification.m
index 51a1015..f160b8a 100644
--- a/compiler/quantification.m
+++ b/compiler/quantification.m
@@ -463,7 +463,7 @@ implicitly_quantify_goal_quant_info_2(GoalExpr0, GoalExpr, GoalInfo0,
         goal_expr_vars_bitset(NonLocalsToRecompute, GoalExpr0,
             PossiblyNonLocalGoalVars0)
     ;
-        GoalExpr0 = generic_call(GenericCall, CallArgVars, _, _),
+        GoalExpr0 = generic_call(GenericCall, CallArgVars, _, _, _),
         GoalExpr = GoalExpr0,
         goal_util.generic_call_vars(GenericCall, ArgVars0),
         list.append(ArgVars0, CallArgVars, ArgVars),
@@ -1815,7 +1815,7 @@ goal_expr_vars_maybe_lambda_2(NonLocalsToRecompute, GoalExpr,
         GoalExpr = plain_call(_, _, ArgVars, _, _, _),
         set_of_var.insert_list(ArgVars, !Set)
     ;
-        GoalExpr = generic_call(GenericCall, ArgVars1, _, _),
+        GoalExpr = generic_call(GenericCall, ArgVars1, _, _, _),
         goal_util.generic_call_vars(GenericCall, ArgVars0),
         set_of_var.insert_list(ArgVars0, !Set),
         set_of_var.insert_list(ArgVars1, !Set)
@@ -1986,7 +1986,7 @@ goal_expr_vars_maybe_lambda_and_bi_impl_2(GoalExpr, !Set, !LambdaSet) :-
         GoalExpr = plain_call(_, _, ArgVars, _, _, _),
         set_of_var.insert_list(ArgVars, !Set)
     ;
-        GoalExpr = generic_call(GenericCall, ArgVars1, _, _),
+        GoalExpr = generic_call(GenericCall, ArgVars1, _, _, _),
         goal_util.generic_call_vars(GenericCall, ArgVars0),
         set_of_var.insert_list(ArgVars0, !Set),
         set_of_var.insert_list(ArgVars1, !Set)
@@ -2150,7 +2150,7 @@ goal_expr_vars_no_lambda_2(NonLocalsToRecompute, GoalExpr, !Set) :-
         GoalExpr = plain_call(_, _, ArgVars, _, _, _),
         set_of_var.insert_list(ArgVars, !Set)
     ;
-        GoalExpr = generic_call(GenericCall, ArgVars1, _, _),
+        GoalExpr = generic_call(GenericCall, ArgVars1, _, _, _),
         goal_util.generic_call_vars(GenericCall, ArgVars0),
         set_of_var.insert_list(ArgVars0, !Set),
         set_of_var.insert_list(ArgVars1, !Set)
diff --git a/compiler/rbmm.add_rbmm_goal_infos.m b/compiler/rbmm.add_rbmm_goal_infos.m
index 28ef700..3783af7 100644
--- a/compiler/rbmm.add_rbmm_goal_infos.m
+++ b/compiler/rbmm.add_rbmm_goal_infos.m
@@ -255,7 +255,7 @@ keep_allocated_regions([Input | Inputs], [RemovedRegion | RemovedRegions],
     % forgotten when we deal with these explicitly in the future.
     %
 collect_rbmm_goal_info_goal_expr(_, _, _, _, _, _, _, !Expr, !Info) :-
-    !.Expr = generic_call(_, _, _, _),
+    !.Expr = generic_call(_, _, _, _, _),
     sorry($module, $pred, "generic call not handled"). 
 collect_rbmm_goal_info_goal_expr(_, _, _, _, _, _, _, !Expr, !Info) :-
     !.Expr = call_foreign_proc(_, _, _, _, _, _, _),
diff --git a/compiler/rbmm.condition_renaming.m b/compiler/rbmm.condition_renaming.m
index cdf88a0..04794b4 100644
--- a/compiler/rbmm.condition_renaming.m
+++ b/compiler/rbmm.condition_renaming.m
@@ -259,7 +259,7 @@ collect_non_local_and_in_cond_regions_expr(_, _, _, _, _,
         plain_call(_, _, _, _, _, _),
         !NonLocalRegionsProc, !InCondRegionsProc).
 collect_non_local_and_in_cond_regions_expr(_, _, _, _, _,
-        generic_call(_, _, _, _),
+        generic_call(_, _, _, _, _),
         !NonLocalRegionsProc, !InCondRegionsProc).
 collect_non_local_and_in_cond_regions_expr(_, _, _, _, _,
         call_foreign_proc(_, _, _, _, _, _, _),
@@ -543,7 +543,7 @@ collect_non_local_regions_in_ite_compound_goal(Graph, LRBeforeProc,
         ; Expr = conj(_, [])
         ; Expr = disj([])
         ; Expr = call_foreign_proc(_, _, _, _, _, _, _)
-        ; Expr = generic_call(_, _, _, _)
+        ; Expr = generic_call(_, _, _, _, _)
         ; Expr = shorthand(_)
         ),
         unexpected($module, $pred, "atomic or unsupported goal")
@@ -720,7 +720,7 @@ collect_regions_created_in_condition_compound_goal(Graph,
         ; Expr = conj(_, [])
         ; Expr = disj([])
         ; Expr = call_foreign_proc(_, _, _, _, _, _, _)
-        ; Expr = generic_call(_, _, _, _)
+        ; Expr = generic_call(_, _, _, _, _)
         ; Expr = shorthand(_)
         ),
         unexpected($module, $pred, "atomic or unsupported goal")
@@ -822,7 +822,7 @@ collect_ite_renaming_goal(IteRenamedRegionProc, Graph, Goal,
     (
         ( GoalExpr = unify(_, _, _, _, _)
         ; GoalExpr = plain_call(_, _, _, _, _, _)
-        ; GoalExpr = generic_call(_, _, _, _)
+        ; GoalExpr = generic_call(_, _, _, _, _)
         ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         )
     ;
@@ -969,7 +969,7 @@ collect_ite_renaming_in_condition_compound_goal(IteRenamedRegionProc,
         ; Expr = conj(_, [])
         ; Expr = disj([])
         ; Expr = call_foreign_proc(_, _, _, _, _, _, _)
-        ; Expr = generic_call(_, _, _, _)
+        ; Expr = generic_call(_, _, _, _, _)
         ; Expr = shorthand(_)
         ),
         unexpected($module, $pred, "atomic or unsupported goal")
diff --git a/compiler/rbmm.execution_path.m b/compiler/rbmm.execution_path.m
index 5f5ce91..779f132 100644
--- a/compiler/rbmm.execution_path.m
+++ b/compiler/rbmm.execution_path.m
@@ -162,7 +162,7 @@ execution_paths_covered_compound_goal(ProcInfo, CompoundGoal, !ExecPaths) :-
         ; Expr = conj(_, [])
         ; Expr = disj([])
         ; Expr = call_foreign_proc(_, _, _, _, _, _, _)
-        ; Expr = generic_call(_, _, _, _)
+        ; Expr = generic_call(_, _, _, _, _)
         ; Expr = shorthand(_)
         ),
         unexpected($module, $pred, "encountered atomic or unsupported goal")
diff --git a/compiler/rbmm.points_to_analysis.m b/compiler/rbmm.points_to_analysis.m
index 4211e43..4bb9a07 100644
--- a/compiler/rbmm.points_to_analysis.m
+++ b/compiler/rbmm.points_to_analysis.m
@@ -127,7 +127,7 @@ intra_analyse_goal_expr(conj(_ConjType, Goals), !RptaInfo) :-
     % analysis.
     %
 intra_analyse_goal_expr(plain_call(_, _, _, _, _, _), !RptaInfo).
-intra_analyse_goal_expr(generic_call(_, _, _, _), !RptaInfo).
+intra_analyse_goal_expr(generic_call(_, _, _, _, _), !RptaInfo).
 intra_analyse_goal_expr(call_foreign_proc(_, _, _, _, _, _, _), !RptaInfo).
 
 intra_analyse_goal_expr(switch(_, _, Cases), !RptaInfo) :-
@@ -421,7 +421,7 @@ inter_analyse_goal_expr(Goal, GoalInfo, ModuleInfo, InfoTable,
             CallerRptaInfo1, !:CallerRptaInfo)
     ).
 
-inter_analyse_goal_expr(generic_call(_, _, _, _), _, _, _, !FPTable,
+inter_analyse_goal_expr(generic_call(_, _, _, _, _), _, _, _, !FPTable,
         !RptaInfo) :-
     sorry($module, $pred, "generic_call not handled").
 
diff --git a/compiler/rbmm.region_arguments.m b/compiler/rbmm.region_arguments.m
index 6c7c327..47f0b6c 100644
--- a/compiler/rbmm.region_arguments.m
+++ b/compiler/rbmm.region_arguments.m
@@ -244,7 +244,7 @@ record_actual_region_arguments_expr(ModuleInfo, GoalExpr, GoalInfo, CallerPPId,
                 RptaInfoTable, ConstantRTable, DeadRTable, BornRTable),
             Cases, !FormalRegionArgTable, !ActualRegionArgProc)
     ;
-        GoalExpr = generic_call(_, _, _, _),
+        GoalExpr = generic_call(_, _, _, _, _),
         sorry($module, $pred, "generic_call NYI")
     ;
         GoalExpr = call_foreign_proc(_, _, _, _, _, _, _),
diff --git a/compiler/rbmm.region_transformation.m b/compiler/rbmm.region_transformation.m
index 3acd03b..b0fee25 100644
--- a/compiler/rbmm.region_transformation.m
+++ b/compiler/rbmm.region_transformation.m
@@ -459,7 +459,7 @@ region_transform_goal_expr(ModuleInfo, Graph, ResurRenaming, IteRenaming,
 
 region_transform_goal_expr(_, _, _, _, _, _, !GoalExpr, !GoalInfo, !NameToVar,
         !VarSet, !VarTypes) :-
-    !.GoalExpr = generic_call(_, _, _, _),
+    !.GoalExpr = generic_call(_, _, _, _, _),
     sorry($module, $pred, "generic call").
 
 region_transform_goal_expr(_, _, _, _, _, _, !GoalExpr, !GoalInfo, !NameToVar,
@@ -574,7 +574,7 @@ region_transform_compound_goal(ModuleInfo, Graph,
         ( !.GoalExpr = shorthand(_)
         ; !.GoalExpr = unify(_, _, _, _, _)
         ; !.GoalExpr = plain_call(_, _, _, _, _, _)
-        ; !.GoalExpr = generic_call(_, _, _, _)
+        ; !.GoalExpr = generic_call(_, _, _, _, _)
         ; !.GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ; !.GoalExpr = conj(_, [])
         ; !.GoalExpr = disj([])
diff --git a/compiler/recompilation.usage.m b/compiler/recompilation.usage.m
index f9734f3..69dc18e 100644
--- a/compiler/recompilation.usage.m
+++ b/compiler/recompilation.usage.m
@@ -1327,7 +1327,7 @@ find_items_used_by_insts(Modes, !Info) :-
 
 find_items_used_by_inst(any(_, HOInstInfo), !Info) :-
     (
-        HOInstInfo = higher_order(pred_inst_info(_, Modes, _)),
+        HOInstInfo = higher_order(pred_inst_info(_, Modes, _, _)),
         find_items_used_by_modes(Modes, !Info)
     ;
         HOInstInfo = none
@@ -1338,7 +1338,7 @@ find_items_used_by_inst(bound(_, BoundInsts), !Info) :-
     list.foldl(find_items_used_by_bound_inst, BoundInsts, !Info).
 find_items_used_by_inst(ground(_, HOInstInfo), !Info) :-
     (
-        HOInstInfo = higher_order(pred_inst_info(_, Modes, _)),
+        HOInstInfo = higher_order(pred_inst_info(_, Modes, _, _)),
         find_items_used_by_modes(Modes, !Info)
     ;
         HOInstInfo = none
diff --git a/compiler/saved_vars.m b/compiler/saved_vars.m
index e4cd697..ca74234 100644
--- a/compiler/saved_vars.m
+++ b/compiler/saved_vars.m
@@ -167,7 +167,7 @@ saved_vars_in_goal(Goal0, Goal, !SlotInfo) :-
         GoalExpr = scope(Reason, SubGoal),
         Goal = hlds_goal(GoalExpr, GoalInfo0)
     ;
-        ( GoalExpr0 = generic_call(_, _, _, _)
+        ( GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
@@ -298,7 +298,7 @@ can_push(Var, Goal) = CanPush :-
             ( GoalExpr = conj(parallel_conj, _)
             ; GoalExpr = unify(_, _, _, _, _)
             ; GoalExpr = plain_call(_, _, _, _, _, _)
-            ; GoalExpr = generic_call(_, _, _, _)
+            ; GoalExpr = generic_call(_, _, _, _, _)
             ; GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
             ),
             CanPush = no
@@ -392,7 +392,7 @@ saved_vars_delay_goal([Goal0 | Goals0], Goals, Construct, Var, IsNonLocal,
                 IsNonLocal, !SlotInfo),
             Goals = [NewConstruct, Goal1 | Goals1]
         ;
-            Goal0Expr = generic_call(_, _, _, _),
+            Goal0Expr = generic_call(_, _, _, _, _),
             rename_var(Var, _NewVar, Subst, !SlotInfo),
             rename_some_vars_in_goal(Subst, Construct, NewConstruct),
             rename_some_vars_in_goal(Subst, Goal0, Goal1),
diff --git a/compiler/simplify.m b/compiler/simplify.m
index 19cd8cb..8a22349 100644
--- a/compiler/simplify.m
+++ b/compiler/simplify.m
@@ -1005,7 +1005,7 @@ simplify_goal_expr(!GoalExpr, !GoalInfo, !Info) :-
         !.GoalExpr = plain_call(_, _, _, _, _, _),
         simplify_goal_plain_call(!GoalExpr, !GoalInfo, !Info)
     ;
-        !.GoalExpr = generic_call(_, _, _, _),
+        !.GoalExpr = generic_call(_, _, _, _, _),
         simplify_goal_generic_call(!GoalExpr, !GoalInfo, !Info)
     ;
         !.GoalExpr = call_foreign_proc(_, _, _, _, _, _, _),
@@ -1278,7 +1278,7 @@ simplify_goal_switch(GoalExpr0, GoalExpr, GoalInfo0, GoalInfo, !Info) :-
     simplify_info::in, simplify_info::out) is det.
 
 simplify_goal_generic_call(GoalExpr0, GoalExpr, GoalInfo, GoalInfo, !Info) :-
-    GoalExpr0 = generic_call(GenericCall, Args, Modes, Det),
+    GoalExpr0 = generic_call(GenericCall, Args, Modes, _, Det),
     (
         GenericCall = higher_order(Closure, Purity, _, _),
         (
@@ -1750,7 +1750,7 @@ warn_switch_for_ite_cond(ModuleInfo, VarTypes, Cond, !CondCanSwitch) :-
         )
     ;
         ( CondExpr = plain_call(_, _, _, _, _, _)
-        ; CondExpr = generic_call(_, _, _, _)
+        ; CondExpr = generic_call(_, _, _, _, _)
         ; CondExpr = call_foreign_proc(_, _, _, _, _, _, _)
         ; CondExpr = conj(_, _)
         ; CondExpr = switch(_, _, _)
@@ -3719,7 +3719,7 @@ goal_contains_trace(hlds_goal(GoalExpr0, GoalInfo0),
     (
         ( GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         GoalExpr = GoalExpr0,
@@ -4181,7 +4181,7 @@ will_flush(plain_call(_, _, _, BuiltinState, _, _), BeforeAfter) = WillFlush :-
             WillFlush = yes
         )
     ).
-will_flush(generic_call(GenericCall, _, _, _), BeforeAfter) = WillFlush :-
+will_flush(generic_call(GenericCall, _, _, _, _), BeforeAfter) = WillFlush :-
     (
         GenericCall = higher_order(_, _, _, _),
         WillFlush0 = yes
diff --git a/compiler/size_prof.m b/compiler/size_prof.m
index b6bbc29..48a22ac 100644
--- a/compiler/size_prof.m
+++ b/compiler/size_prof.m
@@ -315,7 +315,7 @@ size_prof_process_goal(Goal0, Goal, !Info) :-
         !Info ^ spi_rev_type_ctor_map := map.init,
         GoalExpr = GoalExpr0
     ;
-        GoalExpr0 = generic_call(_, _, _, _),
+        GoalExpr0 = generic_call(_, _, _, _, _),
         % We don't want to save type_ctor_info variables across calls,
         % because saving/restoring them is more expensive than defining
         % them again.
diff --git a/compiler/ssdebug.m b/compiler/ssdebug.m
index 4a8bb29..1d02408 100755
--- a/compiler/ssdebug.m
+++ b/compiler/ssdebug.m
@@ -316,7 +316,7 @@ ssdebug_first_pass_in_goal(!Goal, !ProcInfo, !ProxyMap, !ModuleInfo) :-
             Builtin = out_of_line_builtin
         )
     ;
-        GoalExpr0 = generic_call(_, _, _, _),
+        GoalExpr0 = generic_call(_, _, _, _, _),
         insert_context_update_call(!.ModuleInfo, !Goal, !ProcInfo)
     ;
         GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
diff --git a/compiler/stm_expand.m b/compiler/stm_expand.m
index 5f04bb3..cfc8b5a 100644
--- a/compiler/stm_expand.m
+++ b/compiler/stm_expand.m
@@ -373,7 +373,7 @@ stm_process_goal(Instmap, Goal0, Goal, !Info) :-
         GoalExpr = if_then_else(Vars, Cond, Then, Else),
         Goal = hlds_goal(GoalExpr, GoalInfo0)
     ;
-        ( GoalExpr0 = generic_call(_, _, _, _)
+        ( GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
@@ -2278,7 +2278,7 @@ create_closure(PredProcID, Args, ArgTypes, ArgModes, ClosureVar,
 
     ClosureAssignInstmapDeltaList = assoc_list.from_corresponding_lists(
         [ClosureVar], [ground(shared, higher_order(pred_inst_info(
-        pf_predicate, ArgModes, detism_det)))]),
+        pf_predicate, ArgModes, no, detism_det)))]),
     ClosureAssignInstmapDelta =
         instmap_delta_from_assoc_list(ClosureAssignInstmapDeltaList),
 
diff --git a/compiler/store_alloc.m b/compiler/store_alloc.m
index 82e7921..3821b35 100644
--- a/compiler/store_alloc.m
+++ b/compiler/store_alloc.m
@@ -267,7 +267,7 @@ store_alloc_in_goal_2(GoalExpr0, GoalExpr, !Liveness, !LastLocns,
         ),
         BranchedGoal = is_not_branched_goal
     ;
-        ( GoalExpr0 = generic_call(_, _, _, _)
+        ( GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
diff --git a/compiler/stratify.m b/compiler/stratify.m
index 9d811ea..3d31f16 100644
--- a/compiler/stratify.m
+++ b/compiler/stratify.m
@@ -226,7 +226,7 @@ first_order_check_goal(Goal, Negated, WholeScc, ThisPredProcId, ErrorOrWarning,
             true
         )
     ;
-        GoalExpr = generic_call(_Var, _Args, _Modes, _Det)
+        GoalExpr = generic_call(_Var, _Args, _Modes, _MaybeArgRegs, _Det)
         % Do nothing.
     ;
         GoalExpr = unify(_LHS, _RHS, _Mode, _Unification, _UnifyContext)
@@ -386,7 +386,8 @@ higher_order_check_goal(Goal, Negated, WholeScc, ThisPredProcId,
             true
         )
     ;
-        GoalExpr = generic_call(GenericCall, _Vars, _Modes, _Det),
+        GoalExpr = generic_call(GenericCall, _Vars, _Modes, _MaybeArgRegs,
+            _Det),
         (
             Negated = yes,
             HighOrderLoops = yes,
@@ -818,7 +819,7 @@ stratify_analyze_goal(Goal, !Calls, !HasAT, !CallsHO) :-
         % XXX If the foreign proc may_call_mercury, then we may be missing
         % some calls.
     ;
-        GoalExpr = generic_call(_Var, _Vars, _Modes, _Det),
+        GoalExpr = generic_call(_Var, _Vars, _Modes, _MaybeArgRegs, _Det),
         % Record that the higher order call was made.
         !:CallsHO = calls_higher_order
     ;
@@ -943,7 +944,7 @@ get_called_procs(Goal, !Calls) :-
         GoalExpr = call_foreign_proc(_Attrib, _CPred, _CProc, _, _, _, _)
         % Do nothing.
     ;
-        GoalExpr = generic_call(_Var, _Vars, _Modes, _Det)
+        GoalExpr = generic_call(_Var, _Vars, _Modes, _MaybeArgRegs, _Det)
         % Do nothing.
     ;
         ( GoalExpr = conj(_ConjType, Goals)
diff --git a/compiler/structure_reuse.direct.choose_reuse.m b/compiler/structure_reuse.direct.choose_reuse.m
index 0b58804..7ec1e79 100644
--- a/compiler/structure_reuse.direct.choose_reuse.m
+++ b/compiler/structure_reuse.direct.choose_reuse.m
@@ -593,7 +593,7 @@ compute_match_table_with_continuation(Background, DeadCellTable,
         compute_match_table_goal_list(Background, DeadCellTable,
             Cont, !Table)
     ;
-        GoalExpr = generic_call( _, _, _, _),
+        GoalExpr = generic_call( _, _, _, _, _),
         compute_match_table_goal_list(Background, DeadCellTable,
             Cont, !Table)
     ;
@@ -837,7 +837,7 @@ find_match_in_goal_2(Background, Goal, !Match) :-
     ;
         GoalExpr = plain_call(_, _, _, _, _, _)
     ;
-        GoalExpr = generic_call( _, _, _, _)
+        GoalExpr = generic_call( _, _, _, _, _)
     ;
         GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
     ;
@@ -1175,7 +1175,7 @@ annotate_reuses_in_goal(Background, Match, !Goal) :-
         GoalExpr = GoalExpr0,
         GoalInfo = GoalInfo0
     ;
-        GoalExpr0 = generic_call( _, _, _, _),
+        GoalExpr0 = generic_call( _, _, _, _, _),
         GoalExpr = GoalExpr0,
         GoalInfo = GoalInfo0
     ;
@@ -1451,7 +1451,7 @@ check_for_cell_caching_2(DeadCellTable, !Goal) :-
         GoalExpr = GoalExpr0,
         GoalInfo = GoalInfo0
     ;
-        GoalExpr0 = generic_call( _, _, _, _),
+        GoalExpr0 = generic_call( _, _, _, _, _),
         GoalExpr = GoalExpr0,
         GoalInfo = GoalInfo0
     ;
diff --git a/compiler/structure_reuse.direct.detect_garbage.m b/compiler/structure_reuse.direct.detect_garbage.m
index a068cac..e084a05 100644
--- a/compiler/structure_reuse.direct.detect_garbage.m
+++ b/compiler/structure_reuse.direct.detect_garbage.m
@@ -116,7 +116,8 @@ determine_dead_deconstructions_2(Background, TopGoal, !SharingAs,
         lookup_sharing_and_comb(ModuleInfo, PredInfo, ProcInfo, SharingTable,
             PredId, ProcId, ActualVars, !SharingAs)
     ;
-        GoalExpr = generic_call(GenDetails, CallArgs, Modes, _Detism),
+        GoalExpr = generic_call(GenDetails, CallArgs, Modes, _MaybeArgRegs,
+            _Detism),
         determine_dead_deconstructions_generic_call(ModuleInfo, ProcInfo,
             GenDetails, CallArgs, Modes, GoalInfo, !SharingAs)
     ;
diff --git a/compiler/structure_reuse.indirect.m b/compiler/structure_reuse.indirect.m
index 7bddf43..5d17b86 100644
--- a/compiler/structure_reuse.indirect.m
+++ b/compiler/structure_reuse.indirect.m
@@ -469,7 +469,8 @@ indirect_reuse_analyse_goal(BaseInfo, !Goal, !IrInfo) :-
         indirect_reuse_analyse_plain_call(BaseInfo,
             hlds_goal(GoalExpr0, GoalInfo0), !:Goal, !IrInfo)
     ;
-        GoalExpr0 = generic_call(GenDetails, CallArgs, Modes, _Detism),
+        GoalExpr0 = generic_call(GenDetails, CallArgs, Modes, _MaybeArgRegs,
+            _Detism),
         indirect_reuse_analyse_generic_call(BaseInfo, GenDetails, CallArgs,
             Modes, GoalInfo0, !IrInfo)
     ;
diff --git a/compiler/structure_reuse.lbu.m b/compiler/structure_reuse.lbu.m
index 25ca445..4f3993f 100644
--- a/compiler/structure_reuse.lbu.m
+++ b/compiler/structure_reuse.lbu.m
@@ -102,7 +102,7 @@ backward_use_in_goal_2(VarTypes, Info0, !Expr, !LBU) :-
             true
         )
     ;
-        !.Expr = generic_call(_, _, _, _)
+        !.Expr = generic_call(_, _, _, _, _)
     ;
         % XXX Can they be nondet? If so, LFU variables need to be added
         % to !LBU.
diff --git a/compiler/structure_reuse.lfu.m b/compiler/structure_reuse.lfu.m
index d632f63..3f6415e 100644
--- a/compiler/structure_reuse.lfu.m
+++ b/compiler/structure_reuse.lfu.m
@@ -160,7 +160,7 @@ forward_use_in_composite_goal(VarTypes, !Goal, !InstantiatedVars,
     ;
         ( GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         unexpected($module, $pred, "atomic goal")
@@ -278,7 +278,7 @@ add_vars_to_lfu_in_goal_expr(ForceInUse, Expr0, Expr) :-
         add_vars_to_lfu_in_goal(ForceInUse, Goal0, Goal),
         Expr = scope(Reason, Goal)
     ;
-        Expr0 = generic_call(_, _, _, _),
+        Expr0 = generic_call(_, _, _, _, _),
         Expr = Expr0
     ;
         Expr0 = plain_call(_, _, _, _, _, _),
diff --git a/compiler/structure_reuse.versions.m b/compiler/structure_reuse.versions.m
index 75ba575..69de4ad 100644
--- a/compiler/structure_reuse.versions.m
+++ b/compiler/structure_reuse.versions.m
@@ -370,7 +370,7 @@ process_goal(ConvertPotentialReuse, ReuseTable, ModuleInfo, !Goal) :-
             true
         )
     ;
-        GoalExpr0 = generic_call(_, _, _, _)
+        GoalExpr0 = generic_call(_, _, _, _, _)
     ;
         GoalExpr0 = unify(_, _, _, Unification0, _),
         ReuseDescription0 = goal_info_get_reuse(GoalInfo0),
diff --git a/compiler/structure_sharing.analysis.m b/compiler/structure_sharing.analysis.m
index 28b71c2..fcbfc0d 100644
--- a/compiler/structure_sharing.analysis.m
+++ b/compiler/structure_sharing.analysis.m
@@ -607,7 +607,8 @@ analyse_goal(ModuleInfo, PredInfo, ProcInfo, SharingTable, Verbose, Goal,
             RenamedSharing, !.SharingAs),
         !:Status = lub(CalleeStatus, !.Status)
     ;
-        GoalExpr = generic_call(GenDetails, CallArgs, Modes, _Detism),
+        GoalExpr = generic_call(GenDetails, CallArgs, Modes, _MaybeArgRegs,
+            _Detism),
         analyse_generic_call(ModuleInfo, ProcInfo, GenDetails, CallArgs,
             Modes, GoalInfo, !SharingAs)
     ;
diff --git a/compiler/switch_detection.m b/compiler/switch_detection.m
index e89c277..342a76a 100644
--- a/compiler/switch_detection.m
+++ b/compiler/switch_detection.m
@@ -332,7 +332,7 @@ detect_switches_in_goal_expr(InstMap0, GoalInfo, GoalExpr0, GoalExpr,
             GoalExpr = GoalExpr0
         )
     ;
-        ( GoalExpr0 = generic_call(_, _, _, _)
+        ( GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
@@ -986,7 +986,7 @@ find_bind_var_2(Var, ProcessUnify, Goal0, Goal, !Subst, !Result, !Info,
         )
     ;
         ( GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ; GoalExpr0 = disj(_)
         ; GoalExpr0 = switch(_, _, _)
diff --git a/compiler/tabling_analysis.m b/compiler/tabling_analysis.m
index c83594b..780ff6a 100644
--- a/compiler/tabling_analysis.m
+++ b/compiler/tabling_analysis.m
@@ -336,7 +336,7 @@ check_goal_for_mm_tabling(SCC, VarTypes, Goal, Result, MaybeAnalysisStatus,
         check_call_for_mm_tabling(CalleePPId, CallArgs, SCC, VarTypes, Result,
             MaybeAnalysisStatus, !ModuleInfo)
     ;
-        GoalExpr = generic_call(Details, _Args, _ArgModes, _),
+        GoalExpr = generic_call(Details, _Args, _ArgModes, _, _),
         (
             % XXX We should use any results from closure analysis here.
             Details = higher_order(_Var, _, _, _),
@@ -633,7 +633,7 @@ annotate_goal_2(VarTypes, !GoalExpr, Status, !ModuleInfo) :-
         !.GoalExpr = call_foreign_proc(Attributes, _, _, _, _, _, _),
         Status = get_mm_tabling_status_from_attributes(Attributes)
     ;
-        !.GoalExpr = generic_call(GenericCall, _Args, _Modes, _Detism),
+        !.GoalExpr = generic_call(GenericCall, _Args, _Modes, _, _Detism),
         (
             % XXX We should use any results from closure analysis here.
             GenericCall = higher_order(_Var, _, _, _),
diff --git a/compiler/term_constr_build.m b/compiler/term_constr_build.m
index a5f967c..c1b85ed 100644
--- a/compiler/term_constr_build.m
+++ b/compiler/term_constr_build.m
@@ -478,7 +478,7 @@ build_abstract_goal_2(GoalExpr, GoalInfo, AbstractGoal, !Info) :-
         Polyhedron = polyhedron.from_constraints(Constraints),
         AbstractGoal = term_primitive(Polyhedron, [], [])
     ;
-        GoalExpr = generic_call(_, _, _, _),
+        GoalExpr = generic_call(_, _, _, _, _),
         % XXX At the moment all higher-order calls are eventually treated
         % as an error. We do not record them as a normal type of error
         % because this is going to change. To approximate their effect here
diff --git a/compiler/term_pass1.m b/compiler/term_pass1.m
index 2443107..2e301e1 100644
--- a/compiler/term_pass1.m
+++ b/compiler/term_pass1.m
@@ -346,7 +346,7 @@ check_goal_non_term_calls(PPId, VarTypes, Goal, !Errors, !ModuleInfo, !IO) :-
         GoalExpr = call_foreign_proc(_, _, _, _, _, _, _)
         % XXX This looks incomplete - juliensf.
     ;
-        GoalExpr = generic_call(_, _, _, _),
+        GoalExpr = generic_call(_, _, _, _, _),
         % XXX We should use any results from closure analysis here.
         Context = goal_info_get_context(GoalInfo),
         Error = termination_error_context(horder_call, Context),
diff --git a/compiler/term_traversal.m b/compiler/term_traversal.m
index 6bb39a5..f4d3c0e 100644
--- a/compiler/term_traversal.m
+++ b/compiler/term_traversal.m
@@ -260,7 +260,7 @@ term_traverse_goal(Goal, Params, !Info, !ModuleInfo, !IO) :-
             )
         )
     ;
-        GoalExpr = generic_call(Details, Args, ArgModes, _),
+        GoalExpr = generic_call(Details, Args, ArgModes, _, _),
         Context = goal_info_get_context(GoalInfo),
         (
             Details = higher_order(Var, _, _, _),
diff --git a/compiler/trailing_analysis.m b/compiler/trailing_analysis.m
index 644d139..0e2fc6a 100644
--- a/compiler/trailing_analysis.m
+++ b/compiler/trailing_analysis.m
@@ -420,7 +420,7 @@ check_goal_for_trail_mods(SCC, VarTypes, Goal, Result, MaybeAnalysisStatus,
             )
         )
     ;
-        GoalExpr = generic_call(Details, _Args, _ArgModes, _),
+        GoalExpr = generic_call(Details, _Args, _ArgModes, _, _),
         (
             % XXX Use results of closure analysis to handle this.
             Details = higher_order(_Var, _, _, _),
@@ -935,7 +935,7 @@ annotate_goal_2(VarTypes, GoalInfo, !GoalExpr, Status, !ModuleInfo) :-
         Status = attributes_imply_trail_mod(Attributes)
     ;
         % XXX We should use any results from closure analysis here.
-        !.GoalExpr = generic_call(GenericCall, _, _, _),
+        !.GoalExpr = generic_call(GenericCall, _, _, _, _),
         (
             GenericCall = higher_order(_, _, _, _),
             Status = trail_may_modify
diff --git a/compiler/transform_hlds.m b/compiler/transform_hlds.m
index b6fda1f..98549cf 100644
--- a/compiler/transform_hlds.m
+++ b/compiler/transform_hlds.m
@@ -71,7 +71,7 @@
 :- include_module trailing_analysis.
 :- include_module tabling_analysis.
 
-% Optimizations (HLDS -> HLDS)
+% Mostly optimizations (HLDS -> HLDS)
 :- include_module higher_order.
 :- include_module inlining.
 :- include_module deforest.
@@ -98,12 +98,13 @@
 :- include_module parallel_to_plain_conj.
 :- include_module implicit_parallelism.
 :- include_module par_loop_control.
+:- include_module lco.
+:- include_module float_regs.
 
 :- include_module mmc_analysis.
 
 % XXX The following modules are all currently unused.
 :- include_module transform.
-:- include_module lco.
 
 %-----------------------------------------------------------------------------%
 
diff --git a/compiler/try_expand.m b/compiler/try_expand.m
index acae9ca..f70b4ba 100644
--- a/compiler/try_expand.m
+++ b/compiler/try_expand.m
@@ -402,7 +402,7 @@ expand_try_goals_in_goal(Instmap, Goal0, Goal, !Info) :-
         GoalExpr = if_then_else(Vars, Cond, Then, Else),
         Goal = hlds_goal(GoalExpr, GoalInfo0)
     ;
-        ( GoalExpr0 = generic_call(_, _, _, _)
+        ( GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
diff --git a/compiler/tupling.m b/compiler/tupling.m
index cc5c59c..eecc041 100644
--- a/compiler/tupling.m
+++ b/compiler/tupling.m
@@ -876,9 +876,9 @@ prepare_proc_for_counting(PredProcId, !ReverseGoalPathMapMap, !ModuleInfo) :-
     some [!ProcInfo] (
         module_info_pred_proc_info(!.ModuleInfo, PredId, ProcId,
             PredInfo, !:ProcInfo),
-        pred_info_get_import_status(PredInfo, Status),
+        pred_info_get_markers(PredInfo, Markers),
         pred_info_get_arg_types(PredInfo, ArgTypes),
-        generate_proc_arg_info(Status, ArgTypes, !.ModuleInfo, !ProcInfo),
+        generate_proc_arg_info(Markers, ArgTypes, !.ModuleInfo, !ProcInfo),
 
         detect_liveness_proc(!.ModuleInfo, PredProcId, !ProcInfo),
         initial_liveness(!.ProcInfo, PredId, !.ModuleInfo, Liveness0),
@@ -1062,14 +1062,20 @@ count_load_stores_in_goal(Goal, CountInfo, !CountState) :-
                 CountInfo, !CountState)
         )
     ;
-        GoalExpr = generic_call(GenericCall, ArgVars, ArgModes, _Detism),
+        GoalExpr = generic_call(GenericCall, ArgVars, ArgModes, MaybeArgRegs,
+            _Detism),
         ProcInfo = CountInfo ^ ci_proc,
         ModuleInfo = CountInfo ^ ci_module,
         goal_info_get_maybe_need_across_call(GoalInfo, MaybeNeedAcrossCall),
         proc_info_get_vartypes(ProcInfo, VarTypes),
-        list.map(map.lookup(VarTypes), ArgVars, ArgTypes),
-        arg_info.compute_in_and_out_vars(ModuleInfo, ArgVars,
-            ArgModes, ArgTypes, InputArgs, OutputArgs),
+        map.apply_to_list(ArgVars, VarTypes, ArgTypes),
+        arg_info.generic_call_arg_reg_types(ModuleInfo, VarTypes,
+            GenericCall, ArgVars, MaybeArgRegs, ArgRegTypes),
+        arg_info.compute_in_and_out_vars_sep_regs(ModuleInfo, ArgVars,
+            ArgModes, ArgTypes, ArgRegTypes, InputArgsR, InputArgsF,
+            OutputArgsR, OutputArgsF),
+        InputArgs = InputArgsR ++ InputArgsF,
+        OutputArgs = OutputArgsR ++ OutputArgsF,
 
         (
             ( GenericCall = higher_order(_, _, _, _)
@@ -1078,7 +1084,8 @@ count_load_stores_in_goal(Goal, CountInfo, !CountState) :-
             ),
             module_info_get_globals(ModuleInfo, Globals),
             call_gen.generic_call_info(Globals, GenericCall,
-                length(InputArgs), _, GenericVarsArgInfos, _, _),
+                length(InputArgsR), length(InputArgsF), _,
+                GenericVarsArgInfos, _, _),
             assoc_list.keys(GenericVarsArgInfos, GenericVars),
             list.append(GenericVars, InputArgs, Inputs),
             Outputs = set.list_to_set(OutputArgs),
@@ -1725,7 +1732,7 @@ fix_calls_in_goal(Goal0, Goal, !VarSet, !VarTypes, !RttiVarMaps,
     Goal0 = hlds_goal(GoalExpr0, GoalInfo0),
     (
         ( GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ),
         Goal = hlds_goal(GoalExpr0, GoalInfo0)
diff --git a/compiler/type_constraints.m b/compiler/type_constraints.m
index d2b5aac..94e592c 100644
--- a/compiler/type_constraints.m
+++ b/compiler/type_constraints.m
@@ -1637,7 +1637,7 @@ goal_to_constraint(Environment, Goal, !TCInfo) :-
         foreign_proc_goal_to_constraint(Environment, GoalExpr, GoalInfo,
             !TCInfo)
     ;
-        GoalExpr = generic_call(_, _, _, _),
+        GoalExpr = generic_call(_, _, _, _, _),
         generic_call_goal_to_constraint(Environment, GoalExpr, GoalInfo,
             !TCInfo)
     ;
@@ -1809,7 +1809,7 @@ foreign_proc_goal_to_constraint(Environment, GoalExpr, GoalInfo, !TCInfo) :-
     type_constraint_info::in, type_constraint_info::out) is det.
 
 generic_call_goal_to_constraint(Environment, GoalExpr, GoalInfo, !TCInfo) :-
-    GoalExpr = generic_call(Details, Vars, _, _),
+    GoalExpr = generic_call(Details, Vars, _, _, _),
     Context = goal_info_get_context(GoalInfo),
     list.map_foldl(get_var_type, Vars, ArgTVars, !TCInfo),
     ArgTypes = list.map(tvar_to_type, ArgTVars),
diff --git a/compiler/typecheck.m b/compiler/typecheck.m
index bfbdfa6..a03dd9d 100644
--- a/compiler/typecheck.m
+++ b/compiler/typecheck.m
@@ -1318,7 +1318,8 @@ typecheck_goal_2(GoalExpr0, GoalExpr, GoalInfo, !Info) :-
         typecheck_call_pred(CurCall, Args, GoalId, PredId, !Info),
         GoalExpr = plain_call(PredId, ProcId, Args, BI, UC, Name)
     ;
-        GoalExpr0 = generic_call(GenericCall0, Args, Modes, Detism),
+        GoalExpr0 = generic_call(GenericCall0, Args, Modes, MaybeArgRegs,
+            Detism),
         hlds_goal.generic_call_id(GenericCall0, CallId),
         typecheck_info_set_called_predid(CallId, !Info),
         (
@@ -1344,7 +1345,7 @@ typecheck_goal_2(GoalExpr0, GoalExpr, GoalInfo, !Info) :-
             % so nothing needs to be done here.
             GenericCall = GenericCall0
         ),
-        GoalExpr = generic_call(GenericCall, Args, Modes, Detism)
+        GoalExpr = generic_call(GenericCall, Args, Modes, MaybeArgRegs, Detism)
     ;
         GoalExpr0 = unify(LHS, RHS0, UnifyMode, Unification, UnifyContext),
         trace [compiletime(flag("type_checkpoint")), io(!IO)] (
diff --git a/compiler/unify_gen.m b/compiler/unify_gen.m
index 820b342..aa95234 100644
--- a/compiler/unify_gen.m
+++ b/compiler/unify_gen.m
@@ -722,7 +722,7 @@ generate_closure(PredId, ProcId, EvalMethod, Var, Args, GoalInfo, Code, !CI) :-
         Args = [CallPred | CallArgs],
         ProcHeadVars = [ProcPred | ProcArgs],
         ProcInfoGoal = hlds_goal(generic_call(higher_order(ProcPred, _, _, _),
-            ProcArgs, _, CallDeterminism), _GoalInfo),
+            ProcArgs, _, _, CallDeterminism), _GoalInfo),
         determinism_to_code_model(CallDeterminism, CallCodeModel),
         % Check that the code models are compatible. Note that det is not
         % compatible with semidet, and semidet is not compatible with nondet,
@@ -738,7 +738,12 @@ generate_closure(PredId, ProcId, EvalMethod, Var, Args, GoalInfo, Code, !CI) :-
         % in deep profiling grades.
         module_info_get_globals(ModuleInfo, Globals),
         globals.lookup_bool_option(Globals, profile_deep, Deep),
-        Deep = no
+        Deep = no,
+        % XXX If float registers are used, float register arguments are placed
+        % after regular register arguments in the hidden arguments vector.
+        % The code below does not handle that layout.
+        globals.lookup_bool_option(Globals, use_float_registers, UseFloatRegs),
+        UseFloatRegs = no
     ->
         (
             CallArgs = [],
@@ -848,17 +853,20 @@ generate_closure(PredId, ProcId, EvalMethod, Var, Args, GoalInfo, Code, !CI) :-
         % is never looked at.
         add_scalar_static_cell(ClosureLayoutRvalsTypes, ClosureDataAddr, !CI),
         ClosureLayoutRval = const(llconst_data_addr(ClosureDataAddr, no)),
-        list.length(Args, NumArgs),
         proc_info_arg_info(ProcInfo, ArgInfo),
         VarTypes = get_var_types(!.CI),
         MayUseAtomic0 = initial_may_use_atomic(ModuleInfo),
-        generate_pred_args(!.CI, VarTypes, Args, ArgInfo, PredArgs,
+        generate_pred_args(!.CI, VarTypes, Args, ArgInfo, ArgsR, ArgsF,
             MayUseAtomic0, MayUseAtomic),
+        list.length(ArgsR, NumArgsR),
+        list.length(ArgsF, NumArgsF),
+        NumArgsRF = encode_num_generic_call_vars(NumArgsR, NumArgsF),
+        list.append(ArgsR, ArgsF, ArgsRF),
         Vector = [
             cell_arg_full_word(ClosureLayoutRval, complete),
             cell_arg_full_word(CodeAddrRval, complete),
-            cell_arg_full_word(const(llconst_int(NumArgs)), complete)
-            | PredArgs
+            cell_arg_full_word(const(llconst_int(NumArgsRF)), complete)
+            | ArgsRF
         ],
         % XXX construct_dynamically is just a dummy value. We just want
         % something which is not construct_in_region(_).
@@ -902,15 +910,15 @@ generate_extra_closure_args([Var | Vars], LoopCounter, NewClosure, Code,
     Code = ProduceCode ++ AssignCode ++ IncrCode ++ VarsCode.
 
 :- pred generate_pred_args(code_info::in, vartypes::in, list(prog_var)::in,
-    list(arg_info)::in, list(cell_arg)::out,
+    list(arg_info)::in, list(cell_arg)::out, list(cell_arg)::out,
     may_use_atomic_alloc::in, may_use_atomic_alloc::out) is det.
 
-generate_pred_args(_, _, [], _, [], !MayUseAtomic).
-generate_pred_args(_, _, [_ | _], [], _, !MayUseAtomic) :-
+generate_pred_args(_, _, [], _, [], [], !MayUseAtomic).
+generate_pred_args(_, _, [_ | _], [], _, _, !MayUseAtomic) :-
     unexpected($module, $pred, "insufficient args").
 generate_pred_args(CI, VarTypes, [Var | Vars], [ArgInfo | ArgInfos],
-        [CellArg | CellArgs], !MayUseAtomic) :-
-    ArgInfo = arg_info(_, ArgMode),
+        ArgsR, ArgsF, !MayUseAtomic) :-
+    ArgInfo = arg_info(reg(RegType, _), ArgMode),
     (
         ArgMode = top_in,
         IsDummy = variable_is_of_dummy_type(CI, Var),
@@ -931,8 +939,17 @@ generate_pred_args(CI, VarTypes, [Var | Vars], [ArgInfo | ArgInfos],
     map.lookup(VarTypes, Var, Type),
     get_module_info(CI, ModuleInfo),
     update_type_may_use_atomic_alloc(ModuleInfo, Type, !MayUseAtomic),
-    generate_pred_args(CI, VarTypes, Vars, ArgInfos, CellArgs,
-        !MayUseAtomic).
+    generate_pred_args(CI, VarTypes, Vars, ArgInfos, ArgsR0, ArgsF0,
+        !MayUseAtomic),
+    (
+        RegType = reg_r,
+        ArgsR = [CellArg | ArgsR0],
+        ArgsF = ArgsF0
+    ;
+        RegType = reg_f,
+        ArgsR = ArgsR0,
+        ArgsF = [CellArg | ArgsF0]
+    ).
 
 :- pred generate_cons_args(list(prog_var)::in, list(mer_type)::in,
     list(uni_mode)::in, list(arg_width)::in, list(int)::in,
diff --git a/compiler/unique_modes.m b/compiler/unique_modes.m
index 8ab26ec..9e406dd 100644
--- a/compiler/unique_modes.m
+++ b/compiler/unique_modes.m
@@ -273,9 +273,10 @@ unique_modes_check_goal_expr(GoalExpr0, GoalInfo0, GoalExpr, !ModeInfo) :-
             Builtin0, MaybeUnifyContext0, SymName0, GoalInfo0, GoalExpr,
             !ModeInfo)
     ;
-        GoalExpr0 = generic_call(GenericCall0, ArgVars0, ArgModes0, Detism0),
+        GoalExpr0 = generic_call(GenericCall0, ArgVars0, ArgModes0,
+            MaybeRegTypes, Detism0),
         unique_modes_check_goal_generic_call(GenericCall0, ArgVars0, ArgModes0,
-            Detism0, GoalExpr, !ModeInfo)
+            MaybeRegTypes, Detism0, GoalExpr, !ModeInfo)
     ;
         GoalExpr0 = call_foreign_proc(Attributes0, PredId0, ProcId0,
             Args0, ExtraArgs0, MaybeTraceRuntimeCond0, PragmaCode0),
@@ -576,11 +577,12 @@ unique_modes_check_goal_scope(Reason, SubGoal0, GoalInfo0, GoalExpr,
     ).
 
 :- pred unique_modes_check_goal_generic_call(generic_call::in,
-    list(prog_var)::in, list(mer_mode)::in, determinism::in,
-    hlds_goal_expr::out, mode_info::in, mode_info::out) is det.
+    list(prog_var)::in, list(mer_mode)::in, maybe(list(ho_arg_reg))::in,
+    determinism::in, hlds_goal_expr::out, mode_info::in, mode_info::out)
+    is det.
 
-unique_modes_check_goal_generic_call(GenericCall, ArgVars, Modes, Detism,
-        GoalExpr, !ModeInfo) :-
+unique_modes_check_goal_generic_call(GenericCall, ArgVars, Modes,
+        MaybeRegTypes, Detism, GoalExpr, !ModeInfo) :-
     mode_checkpoint(enter, "generic_call", !ModeInfo),
     hlds_goal.generic_call_id(GenericCall, CallId),
     mode_info_set_call_context(call_context_call(CallId), !ModeInfo),
@@ -607,7 +609,8 @@ unique_modes_check_goal_generic_call(GenericCall, ArgVars, Modes, Detism,
     ),
     unique_modes_check_call_modes(ArgVars, Modes, ArgOffset, Detism,
         NeverSucceeds, !ModeInfo),
-    GoalExpr = generic_call(GenericCall, ArgVars, Modes, Detism),
+    GoalExpr = generic_call(GenericCall, ArgVars, Modes, MaybeRegTypes,
+        Detism),
     mode_info_unset_call_context(!ModeInfo),
     mode_checkpoint(exit, "generic_call", !ModeInfo).
 
diff --git a/compiler/unneeded_code.m b/compiler/unneeded_code.m
index e5b596e..645e61f 100644
--- a/compiler/unneeded_code.m
+++ b/compiler/unneeded_code.m
@@ -658,7 +658,7 @@ unneeded_process_goal_internal(UnneededInfo, Goal0, Goal,
     (
         ( GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         Goal = Goal0,
@@ -967,7 +967,7 @@ unneeded_refine_goal(Goal0, Goal, !RefinedGoals) :-
     (
         ( GoalExpr0 = unify(_, _, _, _, _)
         ; GoalExpr0 = plain_call(_, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
         ),
         Goal = Goal0
diff --git a/compiler/untupling.m b/compiler/untupling.m
index 36c0c19..a3b014e 100644
--- a/compiler/untupling.m
+++ b/compiler/untupling.m
@@ -501,7 +501,7 @@ fix_calls_in_goal(Goal0, Goal, !VarSet, !VarTypes, TransformMap, ModuleInfo) :-
     Goal0 = hlds_goal(GoalExpr0, GoalInfo0),
     (
         ( GoalExpr0 = call_foreign_proc(_, _, _, _, _, _, _)
-        ; GoalExpr0 = generic_call(_, _, _, _)
+        ; GoalExpr0 = generic_call(_, _, _, _, _)
         ; GoalExpr0 = unify(_, _, _, _, _)
         ),
         Goal = Goal0
diff --git a/compiler/unused_args.m b/compiler/unused_args.m
index 58ab67a..f65e17a 100644
--- a/compiler/unused_args.m
+++ b/compiler/unused_args.m
@@ -656,7 +656,7 @@ unused_args_traverse_goal(Info, Goal, !VarDep) :-
             unused_args_traverse_goal(Info, SubGoal, !VarDep)
         )
     ;
-        GoalExpr = generic_call(GenericCall, Args, _, _),
+        GoalExpr = generic_call(GenericCall, Args, _, _, _),
         goal_util.generic_call_vars(GenericCall, CallArgs),
         set_list_vars_used(CallArgs, !VarDep),
         set_list_vars_used(Args, !VarDep)
@@ -1504,7 +1504,7 @@ unused_args_fixup_goal_expr(Goal0, Goal, !Info, Changed) :-
         ),
         Goal = hlds_goal(GoalExpr, GoalInfo0)
     ;
-        GoalExpr0 = generic_call(_, _, _, _),
+        GoalExpr0 = generic_call(_, _, _, _, _),
         Goal = hlds_goal(GoalExpr0, GoalInfo0),
         Changed = no
     ;
diff --git a/compiler/unused_imports.m b/compiler/unused_imports.m
index 7950f6f..82f368a 100644
--- a/compiler/unused_imports.m
+++ b/compiler/unused_imports.m
@@ -398,7 +398,7 @@ hlds_goal_used_modules(Goal, !UsedModules) :-
         GoalExpr = plain_call(_, _, _, _, _, Name),
         add_sym_name_module(visibility_private, Name, !UsedModules)
     ;
-        GoalExpr = generic_call(Call, _, _, _),
+        GoalExpr = generic_call(Call, _, _, _, _),
         (
             Call = class_method(_, _, ClassId, CallId),
             ClassId = class_id(ClassName, _),
@@ -587,7 +587,7 @@ bound_inst_info_used_modules(Visibility, bound_functor(ConsId, Insts),
     ho_inst_info::in, used_modules::in, used_modules::out) is det.
 
 ho_inst_info_used_modules(Visibility,
-        higher_order(pred_inst_info(_, Modes, _)), !UsedModules) :-
+        higher_order(pred_inst_info(_, Modes, _, _)), !UsedModules) :-
     list.foldl(mer_mode_used_modules(Visibility), Modes, !UsedModules).
 ho_inst_info_used_modules(_, none, !UsedModules).
 
diff --git a/library/private_builtin.m b/library/private_builtin.m
index f624ffc..96e2424 100644
--- a/library/private_builtin.m
+++ b/library/private_builtin.m
@@ -853,6 +853,18 @@ __Compare____base_typeclass_info_1_0(
 
 %-----------------------------------------------------------------------------%
 
+    % In LLDS grades with float registers, we require a type_ctor_info in
+    % closure layouts to represent hidden float values which are passed via
+    % regular registers. The standard type_ctor_info represents hidden float
+    % arguments passed via float registers.
+
+:- interface.
+
+:- type float_box
+    --->    float_box(float).
+
+%-----------------------------------------------------------------------------%
+
     % This section of the module contains predicates that are used
     % by the MLDS back-end, to implement trailing.
     % (The LLDS back-end does not use these; instead it inserts
diff --git a/runtime/mercury_deep_copy.c b/runtime/mercury_deep_copy.c
index 2dac9ef..8954ec9 100644
--- a/runtime/mercury_deep_copy.c
+++ b/runtime/mercury_deep_copy.c
@@ -16,6 +16,7 @@
 */
 
 #include "mercury_imp.h"
+#include "mercury_builtin_types.h"
 #include "mercury_deep_copy.h"
 #include "mercury_type_info.h"
 #include "mercury_ho_call.h"
diff --git a/runtime/mercury_deep_copy_body.h b/runtime/mercury_deep_copy_body.h
index e50a227..bc56b6d 100644
--- a/runtime/mercury_deep_copy_body.h
+++ b/runtime/mercury_deep_copy_body.h
@@ -504,7 +504,12 @@ try_again:
             ** and then the argument typeinfos themselves.
             */
             {
-                MR_Unsigned         args, i;
+                MR_Unsigned         num_r_args;
+                MR_Unsigned         num_f_args;
+                MR_Unsigned         num_args;
+                MR_Unsigned         r_offset;
+                MR_Unsigned         f_offset;
+                MR_Unsigned         i;
                 MR_Closure          *old_closure;
                 MR_Closure          *new_closure;
                 MR_Word             new_closure_word;
@@ -514,18 +519,21 @@ try_again:
 
                 old_closure = (MR_Closure *) data_value;
                 closure_layout = old_closure->MR_closure_layout;
-                args = old_closure->MR_closure_num_hidden_args;
+                num_r_args = MR_closure_num_hidden_r_args(old_closure);
+                num_f_args = MR_closure_num_hidden_f_args(old_closure);
+                num_args = num_r_args + num_f_args;
 
                 /* create new closure */
                 attrib = maybe_attrib(data_value);
-                MR_offset_incr_saved_hp(new_closure_word, 0, args + 3,
+                MR_offset_incr_saved_hp(new_closure_word, 0, num_args + 3,
                     attrib, NULL);
                 new_closure = (MR_Closure *) new_closure_word;
 
                 /* copy the fixed fields */
                 new_closure->MR_closure_layout = closure_layout;
-                new_closure->MR_closure_num_hidden_args = args;
                 new_closure->MR_closure_code = old_closure->MR_closure_code;
+                new_closure->MR_closure_num_hidden_args_rf =
+                    old_closure->MR_closure_num_hidden_args_rf;
 
                 /*
                 ** Fill in the pseudo_typeinfos in the closure layout
@@ -535,15 +543,28 @@ try_again:
                     old_closure);
 
                 /* copy the arguments */
-                for (i = 0; i < args; i++) {
-                    MR_PseudoTypeInfo arg_pseudo_type_info;
+                r_offset = 0;
+                f_offset = num_r_args;
 
-                    arg_pseudo_type_info =
+                for (i = 0; i < num_args; i++) {
+                    MR_PseudoTypeInfo   arg_pti;
+                    MR_Unsigned         offset;
+
+                    arg_pti =
                         closure_layout->MR_closure_arg_pseudo_type_info[i];
-                    new_closure->MR_closure_hidden_args_0[i] =
+#ifdef MR_MAY_REORDER_CLOSURE_HIDDEN_ARGS
+                    if (MR_unify_pseudo_type_info_float(arg_pti)) {
+                        offset = f_offset++;
+                    } else {
+                        offset = r_offset++;
+                    }
+#else
+                    offset = i;
+#endif
+                    new_closure->MR_closure_hidden_args_0[offset] =
                         copy_arg(NULL,
-                            old_closure->MR_closure_hidden_args_0[i], NULL,
-                            type_info_arg_vector, arg_pseudo_type_info,
+                            old_closure->MR_closure_hidden_args_0[offset],
+                            NULL, type_info_arg_vector, arg_pti,
                             lower_limit, upper_limit);
                 }
 
diff --git a/runtime/mercury_ho_call.c b/runtime/mercury_ho_call.c
index 3cb8e63..a2e517e 100644
--- a/runtime/mercury_ho_call.c
+++ b/runtime/mercury_ho_call.c
@@ -334,8 +334,12 @@ static  MR_Word MR_compare_closures_representation(MR_Closure *x,
 ** returned in registers MR_r1, MR_r2, etc for det and nondet calls or
 ** registers MR_r2, MR_r3, etc for semidet calls.
 **
-** The placement of the extra input arguments into MR_r3, MR_r4 etc is done by
-** the code generator, as is the movement of the output arguments to their
+** When float registers are in effect, float input arguments are passed
+** in MR_f1, MR_f2, etc. Float output arguments are returned in registers
+** MR_f1, MR_f2, etc.
+**
+** The placement of the extra input arguments into MR_rN and MR_fN etc. is done
+** by the code generator, as is the movement of the output arguments to their
 ** eventual destinations.
 **
 ** Each do_call_closure_N variant is specialized for the case where the number
@@ -346,7 +350,10 @@ static  MR_Word MR_compare_closures_representation(MR_Closure *x,
     /*
     ** Number of input arguments to do_call_*_closure_compact,
     ** MR_r1 -> closure
-    ** MR_r2 -> number of immediate input arguments.
+    ** MR_r2 -> number of immediate input arguments: (R | F<<16)
+    **
+    ** R is the number of regular register input arguments
+    ** F is the number of float register input arguments.
     */
 #define MR_HO_CALL_INPUTS_COMPACT   2
 
@@ -355,6 +362,8 @@ static  MR_Word MR_compare_closures_representation(MR_Closure *x,
     ** MR_r1 -> typeclass info
     ** MR_r2 -> index of method in typeclass info
     ** MR_r3 -> number of immediate input arguments.
+    **
+    ** Method calls do not yet use float registers for input or output.
     */
 #define MR_CLASS_METHOD_CALL_INPUTS_COMPACT 3
 
@@ -809,6 +818,8 @@ MR_compare_closures_representation(MR_Closure *x, MR_Closure *y)
     int                 x_num_args;
     int                 y_num_args;
     int                 num_args;
+    int                 r_offset;
+    int                 f_offset;
     int                 i;
     int                 result;
 
@@ -856,8 +867,8 @@ MR_compare_closures_representation(MR_Closure *x, MR_Closure *y)
         }
     }
 
-    x_num_args = x->MR_closure_num_hidden_args;
-    y_num_args = y->MR_closure_num_hidden_args;
+    x_num_args = MR_closure_num_hidden_r_args(x) + MR_closure_num_hidden_f_args(x);
+    y_num_args = MR_closure_num_hidden_r_args(y) + MR_closure_num_hidden_f_args(y);
     if (x_num_args < y_num_args) {
         return MR_COMPARE_LESS;
     } else if (x_num_args > y_num_args) {
@@ -867,10 +878,15 @@ MR_compare_closures_representation(MR_Closure *x, MR_Closure *y)
     num_args = x_num_args;
     x_type_params = MR_materialize_closure_type_params(x);
     y_type_params = MR_materialize_closure_type_params(y);
+
+    r_offset = 0;
+    f_offset = MR_closure_num_hidden_r_args(x);
+
     for (i = 0; i < num_args; i++) {
         MR_TypeInfo x_arg_type_info;
         MR_TypeInfo y_arg_type_info;
         MR_TypeInfo arg_type_info;
+        MR_Unsigned offset;
 
         x_arg_type_info = MR_create_type_info(x_type_params,
             x_layout->MR_closure_arg_pseudo_type_info[i]);
@@ -882,9 +898,20 @@ MR_compare_closures_representation(MR_Closure *x, MR_Closure *y)
         }
 
         arg_type_info = x_arg_type_info;
+#ifdef MR_MAY_REORDER_CLOSURE_HIDDEN_ARGS
+        if (MR_unify_type_ctor_info((MR_TypeCtorInfo) MR_FLOAT_CTOR_ADDR,
+            MR_TYPEINFO_GET_TYPE_CTOR_INFO(arg_type_info)))
+        {
+            offset = f_offset++;
+        } else {
+            offset = r_offset++;
+        }
+#else
+        offset = i;
+#endif
         result = MR_generic_compare_representation(arg_type_info,
-            x->MR_closure_hidden_args_0[i],
-            y->MR_closure_hidden_args_0[i]);
+            x->MR_closure_hidden_args_0[offset],
+            y->MR_closure_hidden_args_0[offset]);
         if (result != MR_COMPARE_EQUAL) {
             goto finish_closure_compare;
         }
@@ -928,7 +955,7 @@ MR_make_closure(MR_Code *proc_addr)
     MR_ClosureId                *closure_id;
     MR_Closure_Dyn_Link_Layout  *closure_layout;
     char                        buf[80];
-    int                         num_hidden_args;
+    int                         num_hidden_r_args;
 
     MR_restore_transient_hp();
 
@@ -973,17 +1000,17 @@ MR_make_closure(MR_Code *proc_addr)
     ** Construct the MR_Closure.
     */
 #ifdef MR_HIGHLEVEL_CODE
-    num_hidden_args = 1;
+    num_hidden_r_args = 1;
 #else
-    num_hidden_args = 0;
+    num_hidden_r_args = 0;
 #endif
-    MR_offset_incr_hp_msg(closure_word, 0, 3 + num_hidden_args,
+    MR_offset_incr_hp_msg(closure_word, 0, 3 + num_hidden_r_args,
         MR_ALLOC_SITE_RUNTIME, NULL);
     closure = (MR_Closure *) closure_word;
 
     closure->MR_closure_layout = (MR_Closure_Layout *) closure_layout;
     closure->MR_closure_code = proc_addr;
-    closure->MR_closure_num_hidden_args = num_hidden_args;
+    closure->MR_closure_num_hidden_args_rf = num_hidden_r_args;
 #ifdef MR_HIGHLEVEL_CODE
     closure->MR_closure_hidden_args(1) = (MR_Word) &MR_generic_closure_wrapper;
 #endif
diff --git a/runtime/mercury_ho_call.h b/runtime/mercury_ho_call.h
index c431909..62446e2 100644
--- a/runtime/mercury_ho_call.h
+++ b/runtime/mercury_ho_call.h
@@ -48,10 +48,17 @@
 ** contain values for all the arguments of the procedure, but the closure
 ** layout structure has information about all arguments. This is to make
 ** the creation of a closure from another closure by adding some more
-** hidden arguments as fast as possible. There is no problem in finding
-** out which pseudotypeinfo describes which hidden argument, because if
-** the closure contains n hidden arguments, these must be the first n arguments
-** of the procedure.
+** hidden arguments as fast as possible.
+**
+** Without float registers, there is no problem in finding out which
+** pseudotypeinfo describes which hidden argument, because if the closure
+** contains n hidden arguments, these must be the first n arguments of the
+** procedure. With float registers, the hidden arguments are reordered so that
+** float register arguments come after the regular register arguments. To tell
+** if an argument is passed via a float register, check if the pseudotypeinfo
+** is MR_FLOAT_CTOR_ADDR. If a float argument is passed via a regular register,
+** the pseudotypeinfo must be replaced by the type_ctor_info for
+** private_builtin.float_box.
 **
 ** The typeinfo and typeclassinfo arguments describing the actual types bound
 ** to type vars are always at the start of the argument list. A closure can
@@ -100,8 +107,14 @@ typedef struct MR_Closure_Dyn_Link_Layout_Struct {
 **
 **	one word pointing to the closure layout structure of the procedure
 **	one word pointing to the code of the procedure
-**	one word giving the number of arguments hidden in the closure (N)
-**	N words representing the N hidden arguments
+**	one word giving the number of hidden arguments: (R | F<<16)
+**	R words representing the R hidden regular register arguments
+**	F words representing the F hidden float register arguments (in boxed form)
+**
+** The num_hidden_args_rf field holds the number of arguments to place into
+** regular registers in the lower 16-bits, and the number of arguments to place
+** into float registers in the high bits. If float registers are not used
+** then F = 0 so num_hidden_args_rf = R.
 **
 ** The reason why the closure layout pointer is first is that most operations
 ** on closures do not need to access that word, and this way it does not
@@ -117,13 +130,25 @@ typedef struct MR_Closure_Dyn_Link_Layout_Struct {
 struct MR_Closure_Struct {
 	MR_Closure_Layout	    *MR_closure_layout;
 	MR_Code			        *MR_closure_code;
-	MR_Unsigned		        MR_closure_num_hidden_args;
+	MR_Unsigned		        MR_closure_num_hidden_args_rf;
 	MR_Word			        MR_closure_hidden_args_0[MR_VARIABLE_SIZED];
 };
 
 /* in mercury_types.h: typedef struct MR_Closure_Struct MR_Closure; */
 
-#define	MR_closure_hidden_args(i)	MR_closure_hidden_args_0[(i) - 1]
+#if !defined(MR_HIGHLEVEL_CODE) && defined(MR_BOXED_FLOAT)
+    #define MR_MAY_REORDER_CLOSURE_HIDDEN_ARGS
+    #define MR_closure_num_hidden_r_args(c) \
+        ((c)->MR_closure_num_hidden_args_rf & 0xffff)
+    #define MR_closure_num_hidden_f_args(c) \
+        ((c)->MR_closure_num_hidden_args_rf >> 16)
+#else
+    #define MR_closure_num_hidden_r_args(c) \
+        ((c)->MR_closure_num_hidden_args_rf)
+    #define MR_closure_num_hidden_f_args(c) 0
+#endif
+
+#define MR_closure_hidden_args(i)   MR_closure_hidden_args_0[(i) - 1]
 
 #ifndef MR_HIGHLEVEL_CODE
   #ifdef MR_DO_CALL_STATS
diff --git a/runtime/mercury_layout_util.c b/runtime/mercury_layout_util.c
index efe59e0..3f2f185 100644
--- a/runtime/mercury_layout_util.c
+++ b/runtime/mercury_layout_util.c
@@ -284,6 +284,8 @@ MR_lookup_closure_long_lval(MR_LongLval locn, MR_Closure *closure,
     MR_bool *succeeded)
 {
     int             locn_num;
+    int             num_r_args;
+    int             num_f_args;
     int             offset;
     MR_Word         value;
     MR_Word         baseaddr;
@@ -299,7 +301,8 @@ MR_lookup_closure_long_lval(MR_LongLval locn, MR_Closure *closure,
             if (MR_print_locn) {
                 printf("closure r%d\n", locn_num);
             }
-            if (locn_num <= closure->MR_closure_num_hidden_args) {
+            num_r_args = MR_closure_num_hidden_r_args(closure);
+            if (locn_num <= num_r_args) {
                 value = closure->MR_closure_hidden_args(locn_num);
                 *succeeded = MR_TRUE;
             }
@@ -309,6 +312,12 @@ MR_lookup_closure_long_lval(MR_LongLval locn, MR_Closure *closure,
             if (MR_print_locn) {
                 printf("closure f%d\n", locn_num);
             }
+            num_r_args = MR_closure_num_hidden_r_args(closure);
+            num_f_args = MR_closure_num_hidden_f_args(closure);
+            if (locn_num <= num_f_args) {
+                value = closure->MR_closure_hidden_args(num_r_args + locn_num);
+                *succeeded = MR_TRUE;
+            }
             break;
 
         case MR_LONG_LVAL_TYPE_STACKVAR:
diff --git a/runtime/mercury_ml_expand_body.h b/runtime/mercury_ml_expand_body.h
index c64c0e6..a5c59f7 100644
--- a/runtime/mercury_ml_expand_body.h
+++ b/runtime/mercury_ml_expand_body.h
@@ -890,12 +890,16 @@ EXPAND_FUNCTION_NAME(MR_TypeInfo type_info, MR_Word *data_word_ptr,
                 MR_UserProcId       *user_proc_id;
                 MR_UCIProcId        *uci_proc_id;
                 MR_ConstString      name;
+                int                 num_r_args;
+                int                 num_f_args;
                 int                 num_args;
                 int                 i;
 
                 closure = (MR_Closure *) *data_word_ptr;
                 closure_layout = closure->MR_closure_layout;
-                num_args = closure->MR_closure_num_hidden_args;
+                num_r_args = MR_closure_num_hidden_r_args(closure);
+                num_f_args = MR_closure_num_hidden_f_args(closure);
+                num_args = num_r_args + num_f_args;
                 expand_info->arity = num_args;
 
 #ifdef  EXPAND_FUNCTOR_FIELD
@@ -922,14 +926,41 @@ EXPAND_FUNCTION_NAME(MR_TypeInfo type_info, MR_Word *data_word_ptr,
                     type_params =
                         MR_materialize_closure_type_params(closure);
                     expand_info->EXPAND_ARGS_FIELD.num_extra_args = 0;
-                    expand_info->EXPAND_ARGS_FIELD.arg_values = &closure->
-                        MR_closure_hidden_args_0[0];
+                    expand_info->EXPAND_ARGS_FIELD.arg_values =
+                        &closure->MR_closure_hidden_args_0[0];
+  #ifdef MR_MAY_REORDER_CLOSURE_HIDDEN_ARGS
+                    /*
+                    ** If hidden arguments may have been reordered, create an
+                    ** new vector with arguments in the correct order.
+                    */
+                    if (num_r_args != 0 && num_f_args != 0) {
+                        int r_offset = 0;
+                        int f_offset = num_r_args;
+
+                        expand_info->EXPAND_ARGS_FIELD.arg_values =
+                            MR_GC_NEW_ARRAY(MR_Word, num_args);
+                        for (i = 0; i < num_args; i++) {
+                            MR_PseudoTypeInfo arg_pti;
+                            int offset;
+
+                            arg_pti = closure_layout->
+                                MR_closure_arg_pseudo_type_info[i];
+                            if (MR_unify_pseudo_type_info_float(arg_pti)) {
+                                offset = f_offset++;
+                            } else {
+                                offset = r_offset++;
+                            }
+                            expand_info->EXPAND_ARGS_FIELD.arg_values[i] =
+                                closure->MR_closure_hidden_args_0[offset];
+                        }
+                    }
+  #endif    /* MR_MAY_REORDER_CLOSURE_HIDDEN_ARGS */
                     expand_info->EXPAND_ARGS_FIELD.arg_locns = NULL;
                     expand_info->EXPAND_ARGS_FIELD.arg_type_infos =
                         MR_GC_NEW_ARRAY(MR_TypeInfo, num_args);
                     expand_info->EXPAND_ARGS_FIELD.can_free_arg_type_infos =
                         MR_TRUE;
-                    for (i = 0; i < num_args ; i++) {
+                    for (i = 0; i < num_args; i++) {
                         expand_info->EXPAND_ARGS_FIELD.arg_type_infos[i] =
                             MR_create_type_info(type_params,
                                 closure_layout->
@@ -944,10 +975,28 @@ EXPAND_FUNCTION_NAME(MR_TypeInfo type_info, MR_Word *data_word_ptr,
 #ifdef  EXPAND_CHOSEN_ARG
                 if (0 <= chosen && chosen < num_args) {
                     MR_TypeInfo *type_params;
+                    MR_Unsigned offset;
+                    MR_Unsigned r_offset;
+                    MR_Unsigned f_offset;
 
                     expand_info->chosen_index_exists = MR_TRUE;
+  #ifdef MR_MAY_REORDER_CLOSURE_HIDDEN_ARGS
+                    r_offset = 0;
+                    f_offset = MR_closure_num_hidden_r_args(closure);
+                    for (i = 0; i <= chosen; i++) {
+                        MR_PseudoTypeInfo arg_pti =
+                            closure_layout->MR_closure_arg_pseudo_type_info[i];
+                        if (MR_unify_pseudo_type_info_float(arg_pti)) {
+                            offset = f_offset++;
+                        } else {
+                            offset = r_offset++;
+                        }
+                    }
+  #else
+                    offset = chosen;
+  #endif
                     expand_info->chosen_value_ptr = 
-                        &closure->MR_closure_hidden_args_0[chosen];
+                        &closure->MR_closure_hidden_args_0[offset];
                     expand_info->chosen_arg_locn = NULL;
                     /* the following code could be improved */
                     type_params = MR_materialize_closure_type_params(closure);
diff --git a/runtime/mercury_type_info.c b/runtime/mercury_type_info.c
index 4a7349e..adc0d03 100644
--- a/runtime/mercury_type_info.c
+++ b/runtime/mercury_type_info.c
@@ -456,6 +456,21 @@ MR_unify_pseudo_type_info(MR_PseudoTypeInfo pti1, MR_PseudoTypeInfo pti2)
     return MR_TRUE;
 }
 
+MR_bool
+MR_unify_pseudo_type_info_float(MR_PseudoTypeInfo pti)
+{
+    MR_TypeCtorInfo tci1;
+    MR_TypeCtorInfo tci2;
+
+    if (MR_PSEUDO_TYPEINFO_IS_VARIABLE(pti)) {
+        return MR_FALSE;
+    }
+
+    tci1 = MR_PSEUDO_TYPEINFO_GET_TYPE_CTOR_INFO(pti);
+    tci2 = (MR_TypeCtorInfo) MR_FLOAT_CTOR_ADDR;
+    return MR_unify_type_ctor_info(tci1, tci2);
+}
+
 int
 MR_compare_type_info(MR_TypeInfo ti1, MR_TypeInfo ti2)
 {
diff --git a/runtime/mercury_type_info.h b/runtime/mercury_type_info.h
index f653047..bc6ff68 100644
--- a/runtime/mercury_type_info.h
+++ b/runtime/mercury_type_info.h
@@ -1661,6 +1661,17 @@ extern  MR_bool MR_unify_pseudo_type_info(MR_PseudoTypeInfo pti1,
                     MR_PseudoTypeInfo pti2);
 
 /*
+** Unify a pseudo_type_info structure with the type_info for float.
+** Return MR_TRUE if pti represents the type float, and MR_FALSE
+** otherwise.
+**
+** You need to wrap MR_{save/restore}_transient_hp() around
+** calls to this function.
+*/
+
+extern  MR_bool MR_unify_pseudo_type_info_float(MR_PseudoTypeInfo pti);
+
+/*
 ** Compare two pseudo_type_info structures, using an ordering based on the
 ** module names, type names and arities of the types inside the type_info.
 ** Return MR_COMPARE_GREATER, MR_COMPARE_EQUAL, or MR_COMPARE_LESS,

diff --git a/tests/debugger/higher_order.exp2 b/tests/debugger/higher_order.exp2
index 1012860..24d0947 100644
--- a/tests/debugger/higher_order.exp2
+++ b/tests/debugger/higher_order.exp2
@@ -1 +1,69 @@
-# this file is not currently used
+      E1:     C1 CALL pred higher_order.main/2-0 (det) higher_order.m:12
+mdb> echo on
+Command echo enabled.
+mdb> context none
+Contexts will not be printed.
+mdb> format_param size 100
+mdb> format_param depth 100
+mdb> format flat
+mdb> step
+      E2:     C2 CALL pred higher_order.domap/3-0 (det)
+mdb> print *
+       P (arg 1)              	lambda_higher_order_m_18(float_add2(3.0))
+       HeadVar__2             	[1.0, 2.0, 3.0, 4.0, 5.0]
+mdb> finish
+      E3:     C2 EXIT pred higher_order.domap/3-0 (det)
+mdb> print *
+       P (arg 1)              	lambda_higher_order_m_18(float_add2(3.0))
+       HeadVar__2             	[1.0, 2.0, 3.0, 4.0, 5.0]
+       HeadVar__3             	[4.0, 5.0, 6.0, 7.0, 8.0]
+mdb> step
+      E4:     C3 CALL pred higher_order.domap/3-0 (det)
+mdb> print *
+       P (arg 1)              	lambda_higher_order_m_19(float_op3(4.0, 5.0))
+       HeadVar__2             	[1.0, 2.0, 3.0, 4.0, 5.0]
+mdb> finish
+      E5:     C3 EXIT pred higher_order.domap/3-0 (det)
+mdb> print *
+       P (arg 1)              	lambda_higher_order_m_19(float_op3(4.0, 5.0))
+       HeadVar__2             	[1.0, 2.0, 3.0, 4.0, 5.0]
+       HeadVar__3             	[9.0, 14.0, 19.0, 24.0, 29.0]
+mdb> step
+      E6:     C4 CALL pred higher_order.domap/3-0 (det)
+mdb> print *
+       P (arg 1)              	max(3)
+       HeadVar__2             	[1, 2, 3, 4, 5]
+mdb> finish
+      E7:     C4 EXIT pred higher_order.domap/3-0 (det)
+mdb> print *
+       P (arg 1)              	max(3)
+       HeadVar__2             	[1, 2, 3, 4, 5]
+       HeadVar__3             	[3, 3, 3, 4, 5]
+mdb> step
+      E8:     C5 CALL pred higher_order.domap/3-0 (det)
+mdb> print *
+       P (arg 1)              	lambda_higher_order_m_21([6])
+       HeadVar__2             	[[1, 2], [3, 4, 5]]
+mdb> finish
+      E9:     C5 EXIT pred higher_order.domap/3-0 (det)
+mdb> print *
+       P (arg 1)              	lambda_higher_order_m_21([6])
+       HeadVar__2             	[[1, 2], [3, 4, 5]]
+       HeadVar__3             	[[6, 1, 2], [6, 3, 4, 5]]
+mdb> step
+     E10:     C6 CALL pred higher_order.domap/3-0 (det)
+mdb> print *
+       P (arg 1)              	lambda_higher_order_m_22(["a"])
+       HeadVar__2             	[["one", "two"], ["three", "four", "five"]]
+mdb> finish
+     E11:     C6 EXIT pred higher_order.domap/3-0 (det)
+mdb> print *
+       P (arg 1)              	lambda_higher_order_m_22(["a"])
+       HeadVar__2             	[["one", "two"], ["three", "four", "five"]]
+       HeadVar__3             	[["a", "one", "two"], ["a", "three", "four", "five"]]
+mdb> continue -S
+[4.0, 5.0, 6.0, 7.0, 8.0]
+[9.0, 14.0, 19.0, 24.0, 29.0]
+[3, 3, 3, 4, 5]
+[[6, 1, 2], [6, 3, 4, 5]]
+[["a", "one", "two"], ["a", "three", "four", "five"]]
diff --git a/tests/hard_coded/Mercury.options b/tests/hard_coded/Mercury.options
index 2e6cb07..2cff456 100644
--- a/tests/hard_coded/Mercury.options
+++ b/tests/hard_coded/Mercury.options
@@ -40,6 +40,7 @@ MCFLAGS-from_ground_term_bug = 	--from-ground-term-threshold=4
 MCFLAGS-func_test	=	--infer-all
 MCFLAGS-ho_order	=	--optimize-higher-order
 MCFLAGS-ho_order2	=	--optimize-higher-order
+MCFLAGS-ho_float_reg    =       --no-warn-overlapping-scopes
 MCFLAGS-no_fully_strict	=	--no-fully-strict
 MCFLAGS-one_member	=	-O5
 MCFLAGS-impure_foreign	=	--optimize-duplicate-calls
diff --git a/tests/hard_coded/Mmakefile b/tests/hard_coded/Mmakefile
index 05af31d..01004db 100644
--- a/tests/hard_coded/Mmakefile
+++ b/tests/hard_coded/Mmakefile
@@ -132,6 +132,7 @@ ORDINARY_PROGS=	\
 	higher_order_syntax \
 	higher_order_syntax2 \
 	higher_order_type_manip \
+	ho_float_reg \
 	ho_func_default_inst \
 	ho_func_reg \
 	ho_order \
diff --git a/tests/hard_coded/copy_pred.exp b/tests/hard_coded/copy_pred.exp
index 0d0bef4..8431df3 100644
--- a/tests/hard_coded/copy_pred.exp
+++ b/tests/hard_coded/copy_pred.exp
@@ -1 +1 @@
-10, 20, blah
+10, 20.2, 30, 40.4, blah
diff --git a/tests/hard_coded/copy_pred.m b/tests/hard_coded/copy_pred.m
index 2dfd14c..28a04b5 100644
--- a/tests/hard_coded/copy_pred.m
+++ b/tests/hard_coded/copy_pred.m
@@ -1,41 +1,45 @@
-        :- module copy_pred.
-        :- interface.
-        :- import_module io.
-
-        :- pred main(io__state::di, io__state::uo) is det.
-
-        :- implementation.
-
-        :- import_module list, map.
-        :- import_module univ, string.
-
-        main -->
-                { F = foo(10, 20) },
-                { copy(F, F2) },
-                io__set_globals(univ(F2)),
-		io__get_globals(Univ),
-		{ det_univ_to_type(Univ, F3) },
-		{ inst_cast(F3, F4) },
-		{ F4("blah", S) },
-		print(S), nl.
-
-	:- pred inst_cast(pred(string, string), pred(string, string)).
-	:- mode inst_cast(in, out(pred(in, out) is det)) is det.
-	:- pragma foreign_proc("C",
-		inst_cast(X::in, Y::out(pred(in, out) is det)),
-		[will_not_call_mercury, thread_safe, promise_pure], "Y = X").
-	:- pragma foreign_proc("C#",
-		inst_cast(X::in, Y::out(pred(in, out) is det)),
-		[will_not_call_mercury, thread_safe, promise_pure], "Y = X;").
-	:- pragma foreign_proc("Java",
-		inst_cast(X::in, Y::out(pred(in, out) is det)),
-		[will_not_call_mercury, thread_safe, promise_pure], "Y = X;").
-	:- pragma foreign_proc("Erlang",
-		inst_cast(X::in, Y::out(pred(in, out) is det)),
-		[will_not_call_mercury, thread_safe, promise_pure], "Y = X").
-
-        :- pred foo(int, int, string, string) is det.
-        :- mode foo(in, in, in, out) is det.
-        foo(A, B, S0, S) :-
-		string__format("%d, %d, %s", [i(A), i(B), s(S0)], S).
+:- module copy_pred.
+:- interface.
+:- import_module io.
+
+:- pred main(io__state::di, io__state::uo) is det.
+
+:- implementation.
+
+:- import_module list, map.
+:- import_module univ, string.
+
+:- type phloat == float.
+
+main -->
+    { F = foo(10, 20.2, 30, 40.4) },
+    { copy(F, F2) },
+    io__set_globals(univ(F2)),
+    io__get_globals(Univ),
+    { det_univ_to_type(Univ, F3) },
+    { inst_cast(F3, F4) },
+    { F4("blah", S) },
+    print(S), nl.
+
+:- pred inst_cast(pred(string, string), pred(string, string)).
+:- mode inst_cast(in, out(pred(in, out) is det)) is det.
+
+:- pragma foreign_proc("C",
+    inst_cast(X::in, Y::out(pred(in, out) is det)),
+    [will_not_call_mercury, thread_safe, promise_pure], "Y = X").
+:- pragma foreign_proc("C#",
+    inst_cast(X::in, Y::out(pred(in, out) is det)),
+    [will_not_call_mercury, thread_safe, promise_pure], "Y = X;").
+:- pragma foreign_proc("Java",
+    inst_cast(X::in, Y::out(pred(in, out) is det)),
+    [will_not_call_mercury, thread_safe, promise_pure], "Y = X;").
+:- pragma foreign_proc("Erlang",
+    inst_cast(X::in, Y::out(pred(in, out) is det)),
+    [will_not_call_mercury, thread_safe, promise_pure], "Y = X").
+
+:- pred foo(int, float, int, phloat, string, string) is det.
+:- mode foo(in, in, in, in, in, out) is det.
+
+foo(A, B, C, D, S0, S) :-
+    string__format("%d, %g, %d, %g, %s", [i(A), f(B), i(C), f(D), s(S0)], S).
 
diff --git a/tests/hard_coded/deconstruct_arg.exp b/tests/hard_coded/deconstruct_arg.exp
index d4af6b7..ac1c4c6 100644
--- a/tests/hard_coded/deconstruct_arg.exp
+++ b/tests/hard_coded/deconstruct_arg.exp
@@ -208,6 +208,20 @@ deconstruct deconstruct: functor lambda_deconstruct_arg_m_121 arity 1
 deconstruct limited deconstruct 3 of '<<predicate>>'
 functor lambda_deconstruct_arg_m_121 arity 1 [[1, 2]]
 
+deconstruct functor: p/3
+deconstruct argument 0 of '<<predicate>>' is 1
+deconstruct argument 1 of '<<predicate>>' is 2.2
+deconstruct argument 2 of '<<predicate>>' is "three"
+deconstruct argument 'moo' doesn't exist
+deconstruct argument 'mooo!' doesn't exist
+deconstruct argument 'packed1' doesn't exist
+deconstruct argument 'packed2' doesn't exist
+deconstruct argument 'packed3' doesn't exist
+deconstruct deconstruct: functor p arity 3
+[1, 2.2, "three"]
+deconstruct limited deconstruct 3 of '<<predicate>>'
+functor p arity 3 [1, 2.2, "three"]
+
 deconstruct functor: {}/2
 deconstruct argument 0 of {1, 'b'} is 1
 deconstruct argument 1 of {1, 'b'} is 'b'
diff --git a/tests/hard_coded/deconstruct_arg.exp2 b/tests/hard_coded/deconstruct_arg.exp2
index c59f29d..e402c3a 100644
--- a/tests/hard_coded/deconstruct_arg.exp2
+++ b/tests/hard_coded/deconstruct_arg.exp2
@@ -208,6 +208,20 @@ deconstruct deconstruct: functor <<predicate>> arity 0
 deconstruct limited deconstruct 3 of '<<predicate>>'
 functor <<predicate>> arity 0 []
 
+deconstruct functor: <<predicate>>/0
+deconstruct argument 0 of '<<predicate>>' doesn't exist
+deconstruct argument 1 of '<<predicate>>' doesn't exist
+deconstruct argument 2 of '<<predicate>>' doesn't exist
+deconstruct argument 'moo' doesn't exist
+deconstruct argument 'mooo!' doesn't exist
+deconstruct argument 'packed1' doesn't exist
+deconstruct argument 'packed2' doesn't exist
+deconstruct argument 'packed3' doesn't exist
+deconstruct deconstruct: functor <<predicate>> arity 0
+[]
+deconstruct limited deconstruct 3 of '<<predicate>>'
+functor <<predicate>> arity 0 []
+
 deconstruct functor: {}/2
 deconstruct argument 0 of {1, 'b'} is 1
 deconstruct argument 1 of {1, 'b'} is 'b'
diff --git a/tests/hard_coded/deconstruct_arg.m b/tests/hard_coded/deconstruct_arg.m
index af0a70b..eeb983d 100644
--- a/tests/hard_coded/deconstruct_arg.m
+++ b/tests/hard_coded/deconstruct_arg.m
@@ -119,6 +119,7 @@ main -->
 		% test predicates	
 	test_all(newline), newline,
 	test_all(test_all([1, 2])), newline,
+	test_all(p(1, 2.2, "three")), newline,
 		% test tuples
 	test_all({1, 'b'}), newline,
 	test_all({1, 'b', "third"}), newline,
@@ -130,6 +131,10 @@ main -->
 		% test packed fields
 	test_all(packed(100, one, two, three, "four")), newline.
 
+:- pred p(int::in, float::in, string::in, io::di, io::uo) is det.
+
+p(_, _, _, !IO).
+
 %-----------------------------------------------------------------------------%
 
 :- pred test_all(T::in, io.state::di, io.state::uo) is cc_multi.
diff --git a/tests/hard_coded/ho_float_reg.exp b/tests/hard_coded/ho_float_reg.exp
new file mode 100644
index 0000000..863a05a
--- /dev/null
+++ b/tests/hard_coded/ho_float_reg.exp
@@ -0,0 +1,88 @@
+
+plain call
+50.625
+
+higher-order call
+50.625
+
+docall
+50.625
+
+docall_inline
+50.625
+
+get_docall
+50.625
+
+get_docall2
+50.625
+
+get_docall3
+50.625
+
+docall_foreign_poly
+50.625
+
+docall_foreign_float
+50.625
+
+docall_struct_poly_generic_args
+50.625
+
+docall_struct_poly_float_args
+50.625
+
+docall_struct_mono
+50.625
+
+docall_struct_both
+50.625
+50.625
+
+docall_pair_poly
+50.625
+
+docall_pair_float
+50.625
+
+docall_tuple_poly
+50.625
+
+docall_tuple_float
+50.625
+
+docall_tuple2_float
+50.625
+
+get_struct_pred_switch
+50.625
+
+get_struct_pred_disj
+[50.625, 50.625, 50.625]
+
+get_struct_pred_ite
+50.625
+
+meth1
+50.625
+
+meth2
+50.625
+
+meth3
+50.625
+
+meth4
+50.625
+
+meth5
+50.625
+
+cast_inst
+50.625
+
+lost_inst
+50.625
+
+map_recursive_type
+[50.625, 50.625, 50.625]
diff --git a/tests/hard_coded/ho_float_reg.m b/tests/hard_coded/ho_float_reg.m
new file mode 100644
index 0000000..2abba97
--- /dev/null
+++ b/tests/hard_coded/ho_float_reg.m
@@ -0,0 +1,634 @@
+% Test the float_reg.m pass
+
+:- module ho_float_reg.
+:- interface.
+
+:- import_module io.
+
+:- pred main(io::di, io::uo) is cc_multi.
+
+%-----------------------------------------------------------------------------%
+%-----------------------------------------------------------------------------%
+
+:- implementation.
+
+:- import_module float.
+:- import_module list.
+:- import_module pair.
+:- import_module require.
+:- import_module solutions.
+
+%-----------------------------------------------------------------------------%
+
+main(!IO) :-
+    P1 = foo(1.5),
+    P2 = foo(1.5, 2.5),
+
+    begin("plain call", !IO),
+    some [Res] (
+        foo(1.5, 2.5, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    begin("higher-order call", !IO),
+    some [Res] (
+        P1(2.5, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    begin("docall", !IO),
+    some [Res] (
+        docall(P2, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    begin("docall_inline", !IO),
+    some [Res] (
+        docall_inline(P2, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    begin("get_docall", !IO),
+    some [DoCall, Res] (
+        get_docall(DoCall),
+        DoCall(P2, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    begin("get_docall2", !IO),
+    some [DoCall1, DoCall2, Res] (
+        get_docall2(DoCall2),
+        DoCall2(DoCall1),
+        DoCall1(P2, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    begin("get_docall3", !IO),
+    some [DoCall1, DoCall2, DoCall3, Res] (
+        get_docall3(DoCall3),
+        DoCall3(DoCall2),
+        DoCall2(DoCall1),
+        DoCall1(P2, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    begin("docall_foreign_poly", !IO),
+    some [Res] (
+        docall_foreign_poly(P2, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    begin("docall_foreign_float", !IO),
+    some [Res] (
+        docall_foreign_float(P2, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    some [SPoly, SMono, Res1, Res2, Res3] (
+        SPoly = struct_poly(P2),
+        SMono = struct_mono(foo(1.5, 2.5)) : struct(float),
+
+        begin("docall_struct_poly_generic_args", !IO),
+        docall_struct_poly_generic_args(SPoly, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO),
+
+        begin("docall_struct_poly_float_args", !IO),
+        docall_struct_poly_float_args(SPoly, 3.5, Res2),
+        io.write_float(Res2, !IO),
+        io.nl(!IO),
+
+        begin("docall_struct_mono", !IO),
+        docall_struct_mono(SMono, 3.5, Res3),
+        io.write_float(Res3, !IO),
+        io.nl(!IO)
+    ),
+
+    begin("docall_struct_both", !IO),
+    some [S, Res1, Res2] (
+        S = struct_both(P2, P2),
+        docall_struct_both(S, 3.5, Res1, 3.5, Res2),
+        io.write_float(Res1, !IO),
+        io.nl(!IO),
+        io.write_float(Res2, !IO),
+        io.nl(!IO)
+    ),
+
+    some [T, Res1, Res2] (
+        T = foo(1.5, 2.5) - "foo",
+
+        begin("docall_pair_poly", !IO),
+        docall_pair_poly(T, 3.5, Res1),
+        io.write_float(Res1, !IO),
+        io.nl(!IO),
+
+        begin("docall_pair_float", !IO),
+        docall_pair_float(T, 3.5, Res2),
+        io.write_float(Res2, !IO),
+        io.nl(!IO)
+    ),
+
+    some [T, Res1, Res2, Res3] (
+        T = {foo(1.5, 2.5)},
+
+        begin("docall_tuple_poly", !IO),
+        docall_tuple_poly(T, 3.5, Res1),
+        io.write_float(Res1, !IO),
+        io.nl(!IO),
+
+        begin("docall_tuple_float", !IO),
+        docall_tuple_float(T, 3.5, Res2),
+        io.write_float(Res2, !IO),
+        io.nl(!IO),
+
+        begin("docall_tuple2_float", !IO),
+        docall_tuple2_float({T}, 3.5, Res3),
+        io.write_float(Res3, !IO),
+        io.nl(!IO)
+    ),
+
+    begin("get_struct_pred_switch", !IO),
+    some [S, Pred, Res] (
+        S = struct_poly(P2),
+        get_struct_pred_switch(S, Pred),
+        Pred(3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    begin("get_struct_pred_disj", !IO),
+    some [S, Res] (
+        S = struct_poly(P2),
+        unsorted_solutions(
+            (pred(R::out) is multi :-
+                get_struct_pred_disj(S, P),
+                P(3.5, R)
+            ), Res),
+        io.write(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    begin("get_struct_pred_ite", !IO),
+    some [S, Pred, Res] (
+        S = struct_poly(P2),
+        get_struct_pred_ite(S, Pred),
+        Pred(3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    some [Res] (
+        begin("meth1", !IO),
+        meth1(1.5, 2.5, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    some [Res] (
+        begin("meth2", !IO),
+        meth2(1.5, 2.5, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    some [Res] (
+        begin("meth3", !IO),
+        meth3(P2, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    some [Res] (
+        begin("meth4", !IO),
+        meth4(P1, 2.5, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    some [Res] (
+        begin("meth5", !IO),
+        meth5(DoCall),
+        DoCall(P2, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    begin("cast_inst", !IO),
+    some [S1, S2, Res] (
+        S1 = struct_poly(P2),
+        cast_inst(S1, S2),
+        docall_struct_poly_float_args(S2, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    begin("lost_inst", !IO),
+    some [S1, S2, S3, S2_Cast, Res] (
+        S1 = struct_poly(P2),
+        lost_inst(S1, S2),
+        cast_inst(S2, S3),
+        docall_struct_mono(S3, 3.5, Res),
+        io.write_float(Res, !IO),
+        io.nl(!IO)
+    ),
+
+    some [T, Res] (
+        begin("map_recursive_type", !IO),
+        T = foo(1.5, 2.5),
+        map_recursive_type(cons(T, cons(T, cons(T, nil))), 3.5, Res),
+        io.write(Res, !IO),
+        io.nl(!IO)
+    ).
+
+%-----------------------------------------------------------------------------%
+
+:- pred begin(string::in, io::di, io::uo) is det.
+
+begin(Message, !IO) :-
+    io.nl(!IO),
+    io.write_string(Message, !IO),
+    io.nl(!IO),
+    clear_float_regs(!IO).
+
+:- pred clear_float_regs(io::di, io::uo) is det.
+
+:- pragma foreign_proc("C",
+    clear_float_regs(_IO0::di, _IO::uo),
+    [will_not_call_mercury, promise_pure, thread_safe],
+"
+#if defined(MR_f)
+    MR_f(1) = 0.0;
+    MR_f(2) = 0.0;
+    MR_f(3) = 0.0;
+    MR_f(4) = 0.0;
+    MR_f(5) = 0.0;
+    MR_f(6) = 0.0;
+    MR_f(7) = 0.0;
+    MR_f(8) = 0.0;
+    MR_f(9) = 0.0;
+    MR_f(10) = 0.0;
+#endif
+").
+
+clear_float_regs(!IO).
+
+%-----------------------------------------------------------------------------%
+
+:- pred foo(float::in, float::in, float::in, float::out) is det.
+
+foo(A, B, C, X) :-
+    X = A + B*B + C*C*C.
+
+%-----------------------------------------------------------------------------%
+
+:- pred docall(pred(T, T)::in(pred(in, out) is det), T::in, T::out) is det.
+:- pragma no_inline(docall/3).
+
+docall(P, X, Y) :-
+    P(X, Y).
+
+:- pred docall_inline(pred(T, T)::in(pred(in, out) is det), T::in, T::out)
+    is det.
+:- pragma inline(docall_inline/3).
+
+docall_inline(P, X, Y) :-
+    P(X, Y).
+
+%-----------------------------------------------------------------------------%
+
+:- pred get_docall(pred(pred(T1, T1), T1, T1)).
+:- mode get_docall(out(pred(pred(in, out) is det, in, out) is det)) is det.
+
+:- pragma no_inline(get_docall/1).
+
+get_docall(docall).
+
+:- pred get_docall2(pred(pred(pred(T1, T1), T1, T1))).
+:- mode get_docall2(out(pred(out(pred((pred(in, out) is det), in, out) is det))
+    is det)) is det.
+:- pragma no_inline(get_docall2/1).
+
+get_docall2(get_docall).
+
+:- pred get_docall3(pred(pred(pred(pred(T1, T1), T1, T1)))).
+:- mode get_docall3(out(pred(out(pred(out(pred((pred(in, out) is det), in, out)
+    is det)) is det)) is det)) is det.
+:- pragma no_inline(get_docall3/1).
+
+get_docall3(get_docall2).
+
+%-----------------------------------------------------------------------------%
+
+:- pred docall_foreign_poly(pred(T, T)::in(pred(in, out) is det),
+    T::in, T::out) is det.
+
+docall_foreign_poly(P, X, Y) :- % for non-C backends
+    P(X, Y).
+
+:- pragma foreign_proc("C",
+    docall_foreign_poly(P::in(pred(in, out) is det), X::in, Y::out),
+    [may_call_mercury, promise_pure, thread_safe],
+"
+    EXPORT_docall_foreign_2(TypeInfo_for_T, P, X, &Y);
+").
+
+:- pred docall_foreign_2(pred(T, T)::in(pred(in, out) is det), T::in, T::out)
+    is det.
+:- pragma foreign_export("C",
+    docall_foreign_2(in(pred(in, out) is det), in, out),
+    "EXPORT_docall_foreign_2").
+
+docall_foreign_2(P, X, Y) :-
+    P(X, Y).
+
+%-----------------------------------------------------------------------------%
+
+:- pred docall_foreign_float(pred(float, float)::in(pred(in, out) is det),
+    float::in, float::out) is det.
+
+:- pragma foreign_proc("C",
+    docall_foreign_float(P::in(pred(in, out) is det), X::in, Y::out),
+    [may_call_mercury, promise_pure, thread_safe],
+"
+    EXPORT_docall_foreignf_2(P, X, &Y);
+").
+
+docall_foreign_float(P, X, Y) :- % for non-C backends
+    P(X, Y).
+
+:- pred docall_foreign_float_2(pred(float, float)::in(pred(in, out) is det),
+    float::in, float::out) is det.
+
+:- pragma foreign_export("C",
+    docall_foreign_float_2(in(pred(in, out) is det), in, out),
+    "EXPORT_docall_foreignf_2").
+
+docall_foreign_float_2(P, X, Y) :-
+    P(X, Y).
+
+%-----------------------------------------------------------------------------%
+
+:- type struct(T)
+    --->    struct_poly(pred(T, T))
+    ;       struct_mono(pred(float, float)).
+
+:- inst struct
+    --->    struct_poly(pred(in, out) is det)
+    ;       struct_mono(pred(in, out) is det).
+
+:- pred docall_struct_poly_generic_args(struct(T)::in(struct), T::in, T::out)
+    is det.
+:- pragma no_inline(docall_struct_poly_generic_args/3).
+
+docall_struct_poly_generic_args(S, X, Y) :-
+    (
+        S = struct_poly(P),
+        P(X, Y)
+    ;
+        S = struct_mono(_),
+        unexpected($module, $pred)
+    ).
+
+:- pred docall_struct_poly_float_args(struct(float)::in(struct),
+    float::in, float::out) is det.
+:- pragma no_inline(docall_struct_poly_float_args/3).
+
+docall_struct_poly_float_args(S, X, Y) :-
+    (
+        S = struct_poly(P),
+        P(X, Y)
+    ;
+        S = struct_mono(_),
+        unexpected($module, $pred)
+    ).
+
+:- pred docall_struct_mono(struct(T)::in(struct), float::in, float::out)
+    is det.
+:- pragma no_inline(docall_struct_mono/3).
+
+docall_struct_mono(S, X, Y) :-
+    (
+        S = struct_mono(P),
+        P(X, Y)
+    ;
+        S = struct_poly(_),
+        unexpected($module, $pred)
+    ).
+
+%-----------------------------------------------------------------------------%
+
+:- type struct_both(T)
+    --->    struct_both(pred(T, T), pred(float, float)).
+
+:- inst struct_both
+    --->    struct_both(pred(in, out) is det, pred(in, out) is det).
+
+:- pred docall_struct_both(struct_both(T)::in(struct_both), T::in, T::out,
+    float::in, float::out) is det.
+:- pragma no_inline(docall_struct_both/5).
+
+docall_struct_both(struct_both(P, Q), !X, !Y) :-
+    P(!X),
+    Q(!Y).
+
+%-----------------------------------------------------------------------------%
+
+:- inst pairpred
+    --->    (pred(in, out) is det) - ground.
+
+:- pred docall_pair_poly(pair(pred(T, T), U)::in(pairpred), T::in, T::out)
+    is det.
+:- pragma no_inline(docall_pair_poly/3).
+
+docall_pair_poly(P - _, X, Y) :-
+    P(X, Y).
+
+:- pred docall_pair_float(pair(pred(float, float), string)::in(pairpred),
+    float::in, float::out) is det.
+:- pragma no_inline(docall_pair_float/3).
+
+docall_pair_float(P - _, X, Y) :-
+    P(X, Y).
+
+%-----------------------------------------------------------------------------%
+
+:- inst tuplepred
+    --->    { pred(in, out) is det }.
+
+:- inst tupletuplepred
+    --->    { tuplepred }.
+
+:- pred docall_tuple_poly({ pred(T, T) }::in(tuplepred), T::in, T::out) is det.
+:- pragma no_inline(docall_tuple_poly/3).
+
+docall_tuple_poly({P}, X, Y) :-
+    P(X, Y).
+
+:- pred docall_tuple_float({ pred(float, float) }::in(tuplepred),
+    float::in, float::out) is det.
+:- pragma no_inline(docall_tuple_float/3).
+
+docall_tuple_float({P}, X, Y) :-
+    P(X, Y).
+
+:- pred docall_tuple2_float({{ pred(float, float) }}::in(tupletuplepred),
+    float::in, float::out) is det.
+:- pragma no_inline(docall_tuple2_float/3).
+
+docall_tuple2_float({{P}}, X, Y) :-
+    P(X, Y).
+
+%-----------------------------------------------------------------------------%
+
+    % Test inst merging in switches.
+    %
+:- pred get_struct_pred_switch(struct(float)::in(struct),
+    pred(float, float)::out(pred(in, out) is det)) is det.
+:- pragma no_inline(get_struct_pred_switch/2).
+
+get_struct_pred_switch(struct_poly(P), P).
+get_struct_pred_switch(struct_mono(P), P).
+
+    % Test inst merging in disjunctions.
+    %
+:- pred get_struct_pred_disj(struct(float)::in(struct),
+    pred(float, float)::out(pred(in, out) is det)) is multi.
+:- pragma no_inline(get_struct_pred_disj/2).
+
+get_struct_pred_disj(struct_poly(P), P).
+get_struct_pred_disj(struct_mono(P), P).
+get_struct_pred_disj(_, foo(1.5, 2.5)).
+get_struct_pred_disj(_, foo(1.5, 2.5)).
+
+    % Test insts merging in if-then-elses.
+    %
+:- pred get_struct_pred_ite(struct(float)::in(struct),
+    pred(float, float)::out(pred(in, out) is det)) is det.
+:- pragma no_inline(get_struct_pred_ite/2).
+
+get_struct_pred_ite(S, P) :-
+    ( S = struct_poly(P0) ->
+        P = P0
+    ; S = struct_mono(P0) ->
+        P = P0
+    ;
+        P = foo(-1.5, -2.5) % dummy
+    ).
+
+%-----------------------------------------------------------------------------%
+
+:- typeclass tc(T) where [
+    pred meth1(T::in, T::in, T::in, T::out) is det,
+    pred meth2(T::in, T::in, T::in, T::out) is det,
+    pred meth3(pred(T, T)::in(pred(in, out) is det), T::in, T::out) is det,
+    pred meth4(pred(float, T, T)::in(pred(in, in, out) is det), float::in,
+        T::in, T::out) is det,
+    pred meth5(pred(pred(T, T), T, T)::out(pred(pred(in, out) is det, in, out)
+        is det)) is det
+].
+
+:- instance tc(float) where [
+    pred(meth1/4) is foo,
+    ( meth2(A, B, C, D) :-
+        foo(A, B, C, D)
+    ),
+    ( meth3(P, C, D) :-
+        P(C, D)
+    ),
+    ( meth4(P, B, C, D) :-
+        P(B, C, D)
+    ),
+    pred(meth5/1) is get_docall
+].
+
+%-----------------------------------------------------------------------------%
+
+    % Check we are able to introduce wrappers after losing the inst
+    % then "recovering" it.
+    %
+:- pred lost_inst(struct(float)::in, struct(float)::out) is det.
+:- pragma no_inline(lost_inst/2).
+
+lost_inst(S0, S) :-
+    cast_inst(S0, S1),
+    (
+        S1 = struct_poly(P),
+        S = struct_mono(P)  % wrapper here
+    ;
+        S1 = struct_mono(P),
+        S = struct_mono(P)
+    ).
+
+:- pred cast_inst(struct(float)::in, struct(float)::out(struct)) is det.
+
+:- pragma foreign_proc("C",
+    cast_inst(S0::in, S::out(struct)),
+    [will_not_call_mercury, promise_pure, thread_safe],
+"
+    S = S0;
+").
+:- pragma foreign_proc("C#",
+    cast_inst(S0::in, S::out(struct)),
+    [will_not_call_mercury, promise_pure, thread_safe],
+"
+    S = S0;
+").
+:- pragma foreign_proc("Java",
+    cast_inst(S0::in, S::out(struct)),
+    [will_not_call_mercury, promise_pure, thread_safe],
+"
+    S = S0;
+").
+:- pragma foreign_proc("Erlang",
+    cast_inst(S0::in, S::out(struct)),
+    [will_not_call_mercury, promise_pure, thread_safe],
+"
+    S = S0;
+").
+
+%-----------------------------------------------------------------------------%
+
+:- type recursive_type(T)
+    --->    nil
+    ;       cons(pred(T, T), recursive_type(T)).
+
+:- inst recursive_inst
+    --->    nil
+    ;       cons(pred(in, out) is det, recursive_inst).
+
+:- pred map_recursive_type(recursive_type(T)::in(recursive_inst),
+    T::in, list(T)::out) is det.
+
+map_recursive_type(nil, _, []).
+map_recursive_type(cons(P, Ps), X, [Y | Ys]) :-
+    P(X, Y),
+    map_recursive_type(Ps, X, Ys).
+
+%-----------------------------------------------------------------------------%
+
+:- type existstruct(T)
+    --->    some [U] existstruct(pred(T, T), U).
+
+:- inst existstruct
+    --->    existstruct(pred(in, out) is det, ground).
+
+% XXX mode checking fails
+%
+% :- pred docall_existstruct(existstruct(T)::in(existstruct), T::in, T::out)
+%     is det.
+% :- pragma no_inline(docall_existstruct/3).
+%
+% docall_existstruct(existstruct(P, _), X, Y) :-
+%     P(X, Y).
+
+%-----------------------------------------------------------------------------%
+% vim: ft=mercury ts=4 sts=4 sw=4 et
diff --git a/tests/hard_coded/ho_univ_to_type.m b/tests/hard_coded/ho_univ_to_type.m
index 8690423..2e2f014 100644
--- a/tests/hard_coded/ho_univ_to_type.m
+++ b/tests/hard_coded/ho_univ_to_type.m
@@ -27,13 +27,18 @@
 :- type mypred2 == (pred(float, int, int)).
 :- inst mypred == (pred(in, in, out) is det).
 
+:- type fpred
+    --->    fpred(mypred2).
+:- inst fpred
+    --->    fpred(mypred).
+
 main -->
 	{ foo(Pred0) },
 	{ type_to_univ(Pred0, Univ) },
 	( 
 		{ univ_to_type(Univ, Pred1) }
 	->
-		{ convert_inst(Pred1, Pred2) },
+		{ convert_inst(fpred(Pred1), fpred(Pred2)) },
 		{ Pred2(5.0, 1, X) },
 		io__write_int(X),
 		io__write_string("\n")
@@ -50,9 +55,9 @@ foo(X) :- X = (pred(A::in, B::in, C::out) is det :- C = A + B).
 % Some hacky pragma foreign_proc to allow use to change an
 % inst from `ground' to `pred(in, in, out) is det'.
 
-:- pred convert_inst(mypred2::in, mypred2::out(mypred)) is det.
+:- pred convert_inst(fpred::in, fpred::out(fpred)) is det.
 :- pragma foreign_proc("C",
-	convert_inst(Pred1::in, Pred2::out(mypred)),
+	convert_inst(Pred1::in, Pred2::out(fpred)),
 	[will_not_call_mercury, promise_pure],
 "
 {
@@ -60,14 +65,14 @@ foo(X) :- X = (pred(A::in, B::in, C::out) is det :- C = A + B).
 }
 ").
 :- pragma foreign_proc("C#",
-	convert_inst(Pred1::in, Pred2::out(mypred)),
+	convert_inst(Pred1::in, Pred2::out(fpred)),
 	[will_not_call_mercury, promise_pure], "
 {
 	Pred2 = Pred1;
 }
 ").
 :- pragma foreign_proc("Java",
-	convert_inst(Pred1::in, Pred2::out(mypred)),
+	convert_inst(Pred1::in, Pred2::out(fpred)),
 	[will_not_call_mercury, promise_pure],
 "
 {
@@ -75,7 +80,7 @@ foo(X) :- X = (pred(A::in, B::in, C::out) is det :- C = A + B).
 }
 ").
 :- pragma foreign_proc("Erlang",
-	convert_inst(Pred1::in, Pred2::out(mypred)),
+	convert_inst(Pred1::in, Pred2::out(fpred)),
 	[will_not_call_mercury, promise_pure], "
 	Pred2 = Pred1
 ").
diff --git a/tools/make_spec_ho_call b/tools/make_spec_ho_call
index ca25883..8235b77 100755
--- a/tools/make_spec_ho_call
+++ b/tools/make_spec_ho_call
@@ -23,7 +23,12 @@ do
     echo "{"
     echo "    MR_Closure    *closure;"
     echo "    int           num_explicit_args;"
-    echo "    int           num_hidden_args;"
+    echo "    int           num_hidden_rf_args;"
+    echo "    int           num_hidden_r_args;"
+    echo "#ifdef MR_BOXED_FLOAT"
+    echo "    int           num_explicit_f_args = 0;"
+    echo "    int           num_hidden_f_args;"
+    echo "#endif"
     echo "    int           i;"
 
     i=1;
@@ -35,10 +40,20 @@ do
 
     echo
     echo "    closure = (MR_Closure *) MR_r1;"
-    echo "    num_hidden_args = closure->MR_closure_num_hidden_args;"
+    echo "    num_hidden_rf_args = closure->MR_closure_num_hidden_args_rf;"
+    echo "    num_hidden_r_args = MR_closure_num_hidden_r_args(closure);"
+    echo "#ifdef MR_BOXED_FLOAT"
+    echo "    num_hidden_f_args = MR_closure_num_hidden_f_args(closure);"
+    echo "#endif"
+
     if test $spec_explicit_arg -lt 0
     then
+        echo "#ifdef MR_BOXED_FLOAT"
+        echo "    num_explicit_args = (MR_r2 & 0xffff);"
+        echo "    num_explicit_f_args = (MR_r2 >> 16);"
+        echo "#else"
         echo "    num_explicit_args = MR_r2;"
+        echo "#endif"
         num_explicit_args="num_explicit_args"
     else
         i=1;
@@ -51,12 +66,12 @@ do
         num_explicit_args="$spec_explicit_arg"
     fi
     echo
-    echo "    MR_maybe_record_closure_histogram($num_explicit_args, num_hidden_args);"
+    echo "    MR_maybe_record_closure_histogram($num_explicit_args, num_hidden_r_args);"
     echo
 
     if test $max_spec_hidden_arg -ge 0 -a $spec_explicit_arg -ge 0
     then
-        echo "    switch (num_hidden_args) {"
+        echo "    switch (num_hidden_rf_args) {"
         spec_hidden_arg=0
         while test $spec_hidden_arg -le $max_spec_hidden_arg
         do
@@ -90,16 +105,16 @@ do
     echo "    MR_save_registers();"
     if test $spec_explicit_arg -lt 0
     then
-        echo "    if (num_hidden_args < MR_HO_CALL_INPUTS_COMPACT) {"
+        echo "    if (num_hidden_r_args < MR_HO_CALL_INPUTS_COMPACT) {"
         echo "        /* copy the explicit args to the left, from the left */"
         echo "        for (i = 1; i <= num_explicit_args; i++) {"
-        echo "            MR_virtual_reg_assign(i + num_hidden_args,"
+        echo "            MR_virtual_reg_assign(i + num_hidden_r_args,"
         echo "                MR_virtual_reg_value(i + MR_HO_CALL_INPUTS_COMPACT));"
         echo "        }"
-        echo "    } else if (num_hidden_args > MR_HO_CALL_INPUTS_COMPACT) {"
+        echo "    } else if (num_hidden_r_args > MR_HO_CALL_INPUTS_COMPACT) {"
         echo "        /* copy the explicit args to the right, from the right */"
         echo "        for (i = num_explicit_args; i > 0 ; i--) {"
-        echo "            MR_virtual_reg_assign(i + num_hidden_args,"
+        echo "            MR_virtual_reg_assign(i + num_hidden_r_args,"
         echo "                MR_virtual_reg_value(i + MR_HO_CALL_INPUTS_COMPACT));"
         echo "        }"
         echo "    } /* else the explicit args are in the right place */"
@@ -107,16 +122,32 @@ do
         i=1
         while test $i -le $spec_explicit_arg
         do
-            echo "    MR_virtual_reg_assign(num_hidden_args + $i, arg$i);"
+            echo "    MR_virtual_reg_assign(num_hidden_r_args + $i, arg$i);"
             i=`expr $i + 1`
         done
     fi
 
     echo
-    echo "    for (i = 1; i <= num_hidden_args; i++) {"
+    echo "    for (i = 1; i <= num_hidden_r_args; i++) {"
     echo "        MR_virtual_reg_assign(i, closure->MR_closure_hidden_args(i));"
     echo "    }"
     echo "    MR_restore_registers();"
+
+    echo
+    echo "#ifdef MR_BOXED_FLOAT"
+    echo "    if (num_hidden_f_args > 0) {"
+    echo "        /* copy the explicit args to the right, from the right */"
+    echo "        for (i = num_explicit_f_args; i > 0 ; i--) {"
+    echo "            MR_f(i + num_hidden_f_args) = MR_f(i);"
+    echo "        }"
+    echo "        for (i = 1; i <= num_hidden_f_args; i++) {"
+    echo "            MR_f(i) = MR_word_to_float("
+    echo "                closure->MR_closure_hidden_args("
+    echo "                    num_hidden_r_args + i));"
+    echo "        }"
+    echo "    }"
+    echo "#endif"
+
     echo
     echo "    MR_tailcall(closure->MR_closure_code, MR_prof_ho_caller_proc);"
     echo "}"
--------------------------------------------------------------------------
mercury-reviews mailing list
Post messages to:       mercury-reviews at csse.unimelb.edu.au
Administrative Queries: owner-mercury-reviews at csse.unimelb.edu.au
Subscriptions:          mercury-reviews-request at csse.unimelb.edu.au
--------------------------------------------------------------------------



More information about the reviews mailing list