From novalazy at gmail.com Mon Aug 5 14:42:26 2019
From: novalazy at gmail.com (Peter Wang)
Date: Mon, 5 Aug 2019 14:42:26 +1000
Subject: [m-dev.] idea to speed up making trans_opt files
Message-ID: <20190805144226.GP1266@kola.localdomain>

Hi,

I was reminding myself why making .trans_opt files in the library
directory is a largely sequential process. There is a blob of cyclic
dependencies in the standard library (67 modules). The cycle is broken
arbitrarily by name: later modules can't use the .trans_opt files of
earlier modules for the purposes of making a .trans_opt file.

I don't have a better idea for increasing parallelism in the process
except, perhaps, manually removing a module's dependency on a particular
.trans_opt file if we determine beforehand that the dependency will not
(significantly) improve the results of the current module.

To speed up the sequential process of making .trans_opt files:
there probably is a lot of repeated work in reading and processing the
same interface and .opt files since mmc is invoked to generate each
.trans_opt separately. The obvious solution is to call mmc once with the
set of modules it needs to generate .trans_opt files for, and have it
retain the parsed/processed modules in memory across the set.

I'm not intending to work on this any time soon. Does it sound like a
large change?

Peter

From jfischer at opturion.com Mon Aug 5 16:40:13 2019
From: jfischer at opturion.com (Julien Fischer)
Date: Mon, 5 Aug 2019 16:40:13 +1000 (AEST)
Subject: [m-dev.] standard library hash functions
Message-ID:

Hi all,

With commit f495964, we now have two copies of the int (uint) hash
function -- one in library/hash_table.m and one in
library/version_hash_table.m. I propose we add the functions

    uint.hash/1
    int.hash/1
    char.hash/1 (for the sake of uniformity)

and replace the ones in the *hash_table modules. (Note that we already
have both float.hash/1 and string.hash/1.)
Also, if anyone can think of a good home for generic_hash/1, that's also
duplicated between the two modules.

Julien.

From novalazy at gmail.com Mon Aug 5 17:19:45 2019
From: novalazy at gmail.com (Peter Wang)
Date: Mon, 5 Aug 2019 17:19:45 +1000
Subject: [m-dev.] standard library hash functions
In-Reply-To:
References:
Message-ID: <20190805171945.GC26357@kola.localdomain>

On Mon, 5 Aug 2019 16:40:13 +1000 (AEST), Julien Fischer wrote:
>
> Hi all,
>
> With commit f495964, we now have two copies of the int (uint) hash
> function -- one in library/hash_table.m and one in
> library/version_hash_table.m. I propose we add the functions
>
>     uint.hash/1
>     int.hash/1
>     char.hash/1 (for the sake of uniformity)
>
> and replace the ones in the *hash_table modules. (Note that we already
> have both float.hash/1 and string.hash/1.) Also, if anyone can think
> of a good home for generic_hash/1, that's also duplicated between the
> two modules.

That seems okay.
A good home for generic_hash/1 would be in the bin.

Peter

From zoltan.somogyi at runbox.com Mon Aug 5 18:59:04 2019
From: zoltan.somogyi at runbox.com (Zoltan Somogyi)
Date: Mon, 05 Aug 2019 10:59:04 +0200 (CEST)
Subject: [m-dev.] idea to speed up making trans_opt files
In-Reply-To: <20190805144226.GP1266@kola.localdomain>
References: <20190805144226.GP1266@kola.localdomain>
Message-ID:

On Mon, 5 Aug 2019 14:42:26 +1000, Peter Wang wrote:
> I was reminding myself why making .trans_opt files in the library
> directory is a largely sequential process. There is a blob of cyclic
> dependencies in the standard library (67 modules). The cycle is broken
> arbitrarily by name: later modules can't use the .trans_opt files of
> earlier modules for the purposes of making a .trans_opt file.
>
> I don't have a better idea for increasing parallelism in the process
> except, perhaps, manually removing a module's dependency on a particular
> .trans_opt file if we determine beforehand that the dependency will not
> (significantly) improve the results of the current module.

What I have long wanted was the ability to *visualise* the cyclic (and other)
dependencies graphically, with the info about *what* module A needs from
module B being available if you click on the arrow between them. The compiler
can now be asked to generate the info for the links; the visualization part
is still missing.

> To speed up the sequential process of making .trans_opt files:
> there probably is a lot of repeated work in reading and processing the
> same interface and .opt files since mmc is invoked to generate each
> .trans_opt separately. The obvious solution is to call mmc once with the
> set of modules it needs to generate .trans_opt files for, and have it
> retain the parsed/processed modules in memory across the set.
>
> I'm not intending to work on this any time soon. Does it sound like a
> large change?

The compiler already has a mechanism for caching the parsed contents
of the files it has read, which it does not use very extensively.
Extending that mechanism for this use case should not be too difficult.
But I would ask you not to do this soon, as I am working on related code
and such changes will lead to conflicts.

The part about telling mmake to invoke a single command to build
*all* the .trans_opt files may be harder to do in a backward compatible
fashion.

(I haven't worked on make code for a while, and my make-fu is on the fritz :-)

Zoltan.

From zoltan.somogyi at runbox.com Mon Aug 26 13:36:33 2019
From: zoltan.somogyi at runbox.com (Zoltan Somogyi)
Date: Mon, 26 Aug 2019 13:36:33 +1000 (AEST)
Subject: [m-dev.]
 [Mercury-Language/mercury] Segfault in only Mercury after solutions in a particular code path (#72)
Message-ID:

On Mon, 26 Aug 2019 10:59:40 +1000 (AEST), "Zoltan Somogyi" wrote:
> The bug seems to be in simplification, as shown by the attached HLDS dumps.
> It is moving the assignment to R to *before* its value is determined.
>
> I will take a look at it.

I have taken a look at it, and simplification is not to blame.
It is not the cause of the bug: it only reveals the bug.

The true cause of the bug is a fundamental problem in the interaction
between type-specific unification predicates when operating on
partially instantiated data structures, and the direct arg optimization.

The relevant part of the hlc code we generate when we compile
(a slightly modified copy of) the bug72 test case with inlining
looks like this:

main_2_p_0(void)
{
  {
    MR_bool succeeded;
    MR_Word R_4 = (MR_Word) (MR_mkword(MR_mktag(1), (MR_Word) (NULL)));
    MR_Word S_6 = (MR_Word) (MR_mkword(MR_mktag(1), &bug72a_scalar_common_1[1]));
    MR_Word Var_9;
    MR_Word Var_13;
    MR_Word ArgX1_21;

    succeeded = (S_6 != (MR_Word) ((MR_Unsigned) 0U));
    if (succeeded)
    {
      Var_13 = ((MR_Word) ((MR_hl_field(MR_mktag(1), S_6, (MR_Integer) 0))));
      Var_9 = ((MR_Word) ((MR_hl_field(MR_mktag(1), S_6, (MR_Integer) 1))));
      succeeded = ((MR_tag((MR_Word) Var_13)) == (MR_Integer) 1);
      if (succeeded)
      {
        ArgX1_21 = (MR_Word) (MR_body((MR_Word) (Var_13), (MR_Integer) 1));
        R_4 = (MR_Word) (MR_mkword(MR_mktag(1), (MR_Word) (ArgX1_21)));
        succeeded = (Var_9 == (MR_Word) ((MR_Unsigned) 0U));
      }
    }
    ...

This code works.
With --no-inlining, the corresponding code we get is

main_2_p_0(void)
{
  {
    MR_bool succeeded;
    MR_Word R_4 = (MR_Word) (MR_mkword(MR_mktag(1), (MR_Word) (NULL)));
    MR_Word S_6;
    MR_Word Var_9;
    MR_Word Var_13;

    bug72a__sols_1_p_0(&S_6);
    succeeded = (S_6 != (MR_Word) ((MR_Unsigned) 0U));
    if (succeeded)
    {
      Var_13 = ((MR_Word) ((MR_hl_field(MR_mktag(1), S_6, (MR_Integer) 0))));
      Var_9 = ((MR_Word) ((MR_hl_field(MR_mktag(1), S_6, (MR_Integer) 1))));
      succeeded = bug72a____Unify____maybe_reviewed_0_1(R_4, Var_13);
      if (succeeded)
        succeeded = (Var_9 == (MR_Word) ((MR_Unsigned) 0U));
    }
    ...

with the code of the unify function being

bug72a____Unify____maybe_reviewed_0_1(
  MR_Word HeadVar__1_1,
  MR_Word HeadVar__2_2)
{
  {
    MR_bool succeeded = ((MR_tag((MR_Word) HeadVar__2_2)) == (MR_Integer) 1);
    MR_Word ArgX1_5;

    if (succeeded)
    {
      ArgX1_5 = (MR_Word) (MR_body((MR_Word) (HeadVar__2_2), (MR_Integer) 1));
      HeadVar__1_1 = (MR_Word) (MR_mkword(MR_mktag(1), (MR_Word) (ArgX1_5)));
      succeeded = MR_TRUE;
    }
    return succeeded;
  }
}

This is exactly the same code as one would expect from the inlined version,
with some cosmetic differences arising from differences in variable numbering,
and from MLDS optimizations merging successive "if (succeeded) ..." statements.

The root of the problem is the parameter passing. In the inlined version,
when we fill in the argument of R_4, we update R_4. In the non-inlined version,
when we fill in the argument of HeadVar__1, for which main passes R_4,
we update HeadVar__1, but do NOT update R_4 in main. Normally, this is
not a problem: the filled in argument is normally on the heap, and the caller's
pointer also points to it, so the caller also sees the field as being filled in.
However, with the direct arg optimization, the field being filled in is NOT
on the heap; it is in the pointer, next to the primary tag. It is the lack of
any update to this field in the caller's R_4 that leaves the value of R_4 as a
tagged NULL pointer, whose dereferencing leads to the crash.
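
[Editorial illustration, not part of the original message.] The parameter
passing problem described above can be reproduced in miniature outside
Mercury. The sketch below is only an approximation under stated assumptions:
word_t, mkword, tag_of, body_of and the fill_* functions are hypothetical
stand-ins invented for this example, not the runtime's MR_* macros, and the
two-bit tag width is arbitrary. It contrasts an argument stored in a heap
cell (the callee's write is visible to the caller) with an argument stored
in the word itself (the callee only updates its by-value copy), and shows
one possible shape of a convention in which the updated word is handed back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for tagged words; NOT the real MR_* macros.
   A small primary tag is kept in the low bits of a pointer-sized word. */
typedef uintptr_t word_t;

#define TAG_BITS 2
#define TAG_MASK (((word_t) 1 << TAG_BITS) - 1)

/* Combine a tag with a (suitably aligned) body pointer. */
static word_t mkword(word_t tag, const void *body)
{
    assert(tag <= TAG_MASK);
    assert(((word_t) body & TAG_MASK) == 0);
    return (word_t) body | tag;
}

static word_t tag_of(word_t w) { return w & TAG_MASK; }
static void *body_of(word_t w) { return (void *) (w & ~TAG_MASK); }

/* Ordinary representation: the argument lives in a heap cell and the
   word is a tagged pointer to that cell. Filling in the argument writes
   through the pointer, so the caller's word still leads to the new value. */
static void fill_heap_arg(word_t term, word_t arg)
{
    word_t *cell = body_of(term);
    cell[0] = arg;
}

/* Direct-arg representation: the argument IS the body of the word.
   Filling it in builds a new word, but 'term' is a by-value copy,
   so the caller's word keeps its old (here NULL) body. */
static void fill_direct_arg_by_value(word_t term, const void *arg)
{
    term = mkword(tag_of(term), arg);
    (void) term;    /* the updated word is lost on return */
}

/* One possible changed convention: return the updated word so that
   the caller can overwrite its own variable. */
static word_t fill_direct_arg_returning(word_t term, const void *arg)
{
    return mkword(tag_of(term), arg);
}
```

In this model, a caller that starts with mkword(1, NULL) and calls
fill_direct_arg_by_value still holds a tagged NULL afterwards, which
corresponds to the state of R_4 in the non-inlined code; assigning the
result of fill_direct_arg_returning back to the caller's variable avoids that.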

I see two obvious approaches to fixing this. The quick-and-dirty fix would be
to insist that the calls to unify predicates that have this problem (i.e. they are
for a type with one or more direct arg functors, and some args of those direct-arg
functors are initially free) be inlined even in the presence of --no-inlining.
The other would be to modify the parameter passing conventions for such
unifications, which would require not just nontrivial changes to the code
that handles both the caller and the callee sides of such calls, but probably
also significant changes in all the code generators. Currently all code generators
assume that a call's outputs define *new* variables; making them handle situations
in which a call has one or two outputs that update *existing* variables would also
not be trivial.

On that basis, the first approach would seem more pragmatic, even though
conceptually it is far from satisfying.

Opinions?

Zoltan.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bug72a.m
Type: text/x-objcsrc
Size: 1162 bytes
Desc: not available
URL:

From novalazy at gmail.com Mon Aug 26 15:00:55 2019
From: novalazy at gmail.com (Peter Wang)
Date: Mon, 26 Aug 2019 15:00:55 +1000
Subject: [m-dev.]
 [Mercury-Language/mercury] Segfault in only Mercury after solutions in a particular code path (#72)
In-Reply-To:
References:
Message-ID: <20190826150055.GB1344@kola.localdomain>

On Mon, 26 Aug 2019 13:36:33 +1000 (AEST), "Zoltan Somogyi" wrote:
>
> The root of the problem is the parameter passing. In the inlined version,
> when we fill in the argument of R_4, we update R_4. In the non-inlined version,
> when we fill in the argument of HeadVar__1, for which main passes R_4,
> we update HeadVar__1, but do NOT update R_4 in main. Normally, this is
> not a problem: the filled in argument is normally on the heap, and the caller's
> pointer also points to it, so the caller also sees the field as being filled in.
> However, with the direct arg optimization, the field being filled in is NOT
> on the heap; it is in the pointer, next to the primary tag. It is the lack of
> any update to this field in the caller's R_4 that leaves the value of R_4 as a
> tagged NULL pointer, whose dereferencing leads to the crash.
>
> I see two obvious approaches to fixing this. The quick-and-dirty fix would be
> to insist that the calls to unify predicates that have this problem (i.e. they are
> for a type with one or more direct arg functors, and some args of those direct-arg
> functors are initially free) be inlined even in the presence of --no-inlining.
> The other would be to modify the parameter passing conventions for such
> unifications, which would require not just nontrivial changes to the code
> that handles both the caller and the callee sides of such calls, but probably
> also significant changes in all the code generators. Currently all code generators
> assume that a call's outputs define *new* variables; making them handle situations
> in which a call has one or two outputs that update *existing* variables would also
> not be trivial.
>
> On that basis, the first approach would seem more pragmatic, even though
> conceptually it is far from satisfying.
>
> Opinions?

Unfortunately the problem is not limited to unify predicates, as in the
attached test case. We may need to disable the direct arg optimization
until the calling convention can be changed.

Peter
-------------- next part --------------
%-----------------------------------------------------------------------------%
% vim: ft=mercury ts=4 sw=4 et
%-----------------------------------------------------------------------------%

:- module bug72b.
:- interface.

:- import_module io.

:- pred main(io::di, io::uo) is det.

:- implementation.

:- import_module list.
:- import_module string.

:- type maybe_reviewed
    --->    reviewed(package)
    ;       unreviewed(package).

:- type package
    --->    package(string, string).

main(!IO) :-
    R = unreviewed(_),
    fill(R),
    io.write_string(dump_maybe_reviewed(R), !IO),
    io.nl(!IO).

:- pred fill(maybe_reviewed).
:- mode fill(bound(unreviewed(free)) >> ground) is det.
:- pragma no_inline(fill/1).

fill(unreviewed(P)) :-
    P = package("a", "b").

:- func dump_maybe_reviewed(maybe_reviewed) = string.
:- pragma no_inline(dump_maybe_reviewed/1).

dump_maybe_reviewed(reviewed(P)) = "reviewed " ++ dump_package(P).
dump_maybe_reviewed(unreviewed(P)) = "unreviewed " ++ dump_package(P).

:- func dump_package(package) = string.
:- pragma no_inline(dump_package/1).

dump_package(package(A, B)) = A ++ B.

From zoltan.somogyi at runbox.com Tue Aug 27 08:29:57 2019
From: zoltan.somogyi at runbox.com (Zoltan Somogyi)
Date: Tue, 27 Aug 2019 08:29:57 +1000 (AEST)
Subject: [m-dev.]
 [Mercury-Language/mercury] Segfault in only Mercury after solutions in a particular code path (#72)
In-Reply-To: <20190826150055.GB1344@kola.localdomain>
References: <20190826150055.GB1344@kola.localdomain>
Message-ID:

On Mon, 26 Aug 2019 15:00:55 +1000, Peter Wang wrote:
> Unfortunately the problem is not limited to unify predicates, as in the
> attached test case.

Yes, you are right. That leaves changing the arg passing convention
the only feasible approach.

> We may need to disable the direct arg optimization
> until the calling convention can be changed.

Agreed. The attached diff makes that possible.

The option it adds *must* be set consistently in *all* the modules of a program,
including the standard library, so flipping the switch from the current
backward-compatible "yes" to "no" is a significant user-visible change, but then again,
changing the calling convention for predicates with direct-arg arguments
is a user-visible change as well, so we have to bite at least one of those bullets
sometime. Any opinions on which, and when?

Note that we should remove the "where direct_arg is" syntax on type definitions
sometime, since the compiler can now do at least as good a job as humans
at deciding when the optimization is applicable (provided it is switched on, of course).
Flip-the-switch time would be as good a time as any for this, which makes *now*
a good time for its deprecation.

On a tangentially related topic: I want to add both test cases to the suite,
but intend to add them under the name gh72a.m and gh72b.m. We have been
naming test cases bugN.m for Mantis bug N for a long time. Github bug reports
started much more recently, so they are going over numbers that Mantis has
already gone over. Naming test cases from github ghN.m will avoid clashes.
There is already a gh65.m in tests/valid. Any objections to using this naming
convention from now on?

Also, Julien, would you object to commenting out the test_generic_ref test case?
All other tests that we know we can't pass in any grade are commented out
in the relevant test directory's Mmakefile; why is this test treated differently?

Zoltan.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Log.ada
Type: application/octet-stream
Size: 203 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DIFF.ada
Type: application/octet-stream
Size: 3818 bytes
Desc: not available
URL:

From jfischer at opturion.com Tue Aug 27 10:34:07 2019
From: jfischer at opturion.com (Julien Fischer)
Date: Tue, 27 Aug 2019 10:34:07 +1000 (AEST)
Subject: [m-dev.]
 [Mercury-Language/mercury] Segfault in only Mercury after solutions in a particular code path (#72)
In-Reply-To:
References: <20190826150055.GB1344@kola.localdomain>
Message-ID:

Hi Zoltan,

On Tue, 27 Aug 2019, Zoltan Somogyi wrote:

> On a tangentially related topic: I want to add both test cases to the suite,
> but intend to add them under the name gh72a.m and gh72b.m.
> We have been
> naming test cases bugN.m for Mantis bug N for a long time. Github bug reports
> started much more recently, so they are going over numbers that Mantis has
> already gone over. Naming test cases from github ghN.m will avoid clashes.
> There is already a gh65.m in tests/valid. Any objections to using this naming
> convention from now on?

Not from me. We should probably switch over to github as our primary
bug tracking system. It is possible to export the issues data from
github, so the copy on github being the only version is no longer a
problem. (IIRC, that was the main reason we stayed with Mantis when
Mercury moved to git.)

> Also, Julien, would you object to commenting out the test_generic_ref test case?

No.

Julien.

From novalazy at gmail.com Tue Aug 27 11:32:26 2019
From: novalazy at gmail.com (Peter Wang)
Date: Tue, 27 Aug 2019 11:32:26 +1000
Subject: [m-dev.]
 [Mercury-Language/mercury] Segfault in only Mercury after solutions in a particular code path (#72)
In-Reply-To:
References: <20190826150055.GB1344@kola.localdomain>
Message-ID: <20190827113226.GI1306@kola.localdomain>

On Tue, 27 Aug 2019 08:29:57 +1000 (AEST), "Zoltan Somogyi" wrote:
>
>
> On Mon, 26 Aug 2019 15:00:55 +1000, Peter Wang wrote:
> > Unfortunately the problem is not limited to unify predicates, as in the
> > attached test case.
>
> Yes, you are right. That leaves changing the arg passing convention
> the only feasible approach.
>
> > We may need to disable the direct arg optimization
> > until the calling convention can be changed.
>
> Agreed. The attached diff makes that possible.
>
> The option it adds *must* be set consistently in *all* the modules of a program,
> including the standard library, so flipping the switch from the current
> backward-compatible "yes" to "no" is a significant user-visible change, but then again,
> changing the calling convention for predicates with direct-arg arguments
> is a user-visible change as well, so we have to bite at least one of those bullets
> sometime. Any opinions on which, and when?

For me, it's fine to break compatibility across rotds. Few users (zero?)
would have such big programs that rebuilds are of any concern.

(1) switch off direct args,
(2) change the calling convention,
(3) switch on direct args.

> Note that we should remove the "where direct_arg is" syntax on type definitions
> sometime, since the compiler can now do at least as good a job as humans
> at deciding when the optimization is applicable (provided it is switched on, of course).
> Flip-the-switch time would be as good a time as any for this, which makes *now*
> a good time for its deprecation.

Sure.

> On a tangentially related topic: I want to add both test cases to the suite,
> but intend to add them under the name gh72a.m and gh72b.m. We have been
> naming test cases bugN.m for Mantis bug N for a long time. Github bug reports
> started much more recently, so they are going over numbers that Mantis has
> already gone over. Naming test cases from github ghN.m will avoid clashes.
> There is already a gh65.m in tests/valid. Any objections to using this naming
> convention from now on?

No objection.

Peter

From zoltan.somogyi at runbox.com Tue Aug 27 11:38:14 2019
From: zoltan.somogyi at runbox.com (Zoltan Somogyi)
Date: Tue, 27 Aug 2019 11:38:14 +1000 (AEST)
Subject: [m-dev.]
 [Mercury-Language/mercury] Segfault in only Mercury after solutions in a particular code path (#72)
In-Reply-To: <20190827113226.GI1306@kola.localdomain>
References: <20190826150055.GB1344@kola.localdomain> <20190827113226.GI1306@kola.localdomain>
Message-ID:

On Tue, 27 Aug 2019 11:32:26 +1000, Peter Wang wrote:
> For me, it's fine to break compatibility across rotds. Few users (zero?)
> would have such big programs that rebuilds are of any concern.

Agreed. The issue I see is that they have to *know* that they should
rebuild from scratch. If they don't, and try to mix .c files generated
by Mercury compilers from before and after the change, the result will be
mysterious failures.

> > Note that we should remove the "where direct_arg is" syntax on type definitions
> > sometime, since the compiler can now do at least as good a job as humans
> > at deciding when the optimization is applicable (provided it is switched on, of course).
> > Flip-the-switch time would be as good a time as any for this, which makes *now*
> > a good time for its deprecation.
>
> Sure.

OK, I will add a note to NEWS, and post to m-users.

Zoltan.

From novalazy at gmail.com Tue Aug 27 11:40:07 2019
From: novalazy at gmail.com (Peter Wang)
Date: Tue, 27 Aug 2019 11:40:07 +1000
Subject: [m-dev.]
 [Mercury-Language/mercury] Segfault in only Mercury after solutions in a particular code path (#72)
In-Reply-To:
References: <20190826150055.GB1344@kola.localdomain>
Message-ID: <20190827114007.GL1306@kola.localdomain>

On Tue, 27 Aug 2019 10:34:07 +1000 (AEST), Julien Fischer wrote:
>
> Hi Zoltan,
>
> On Tue, 27 Aug 2019, Zoltan Somogyi wrote:
>
> > On a tangentially related topic: I want to add both test cases to the suite,
> > but intend to add them under the name gh72a.m and gh72b.m. We have been
> > naming test cases bugN.m for Mantis bug N for a long time. Github bug reports
> > started much more recently, so they are going over numbers that Mantis has
> > already gone over.
> > Naming test cases from github ghN.m will avoid clashes.
> > There is already a gh65.m in tests/valid. Any objections to using this naming
> > convention from now on?
>
> Not from me. We should probably switch over to github as our primary
> bug tracking system. It is possible to export the issues data from
> github, so the copy on github being the only version is no longer a
> problem. (IIRC, that was the main reason we stayed with Mantis when
> Mercury moved to git.)

Github issues did not support attachments at the time (except for
images, I think). Attachments are supported now, restricted to certain
file extensions. I think it's acceptable.

Peter

From zoltan.somogyi at runbox.com Tue Aug 27 11:55:20 2019
From: zoltan.somogyi at runbox.com (Zoltan Somogyi)
Date: Tue, 27 Aug 2019 11:55:20 +1000 (AEST)
Subject: [m-dev.]
 [Mercury-Language/mercury] Segfault in only Mercury after solutions in a particular code path (#72)
In-Reply-To:
References: <20190826150055.GB1344@kola.localdomain>
Message-ID:

On Tue, 27 Aug 2019 10:34:07 +1000 (AEST), Julien Fischer wrote:
> We should probably switch over to github as our primary
> bug tracking system. It is possible to export the issues data from
> github, so the copy on github being the only version is no longer a
> problem. (IIRC, that was the main reason we stayed with Mantis when
> Mercury moved to git.)

I find Mantis's user interface to be much easier to use than Github's
for this, so I would prefer to keep using Mantis as our primary bug database.
Since people can report issues via Github, they will, but let's not encourage
its use for this.

Zoltan.

From zoltan.somogyi at runbox.com Tue Aug 27 12:23:55 2019
From: zoltan.somogyi at runbox.com (Zoltan Somogyi)
Date: Tue, 27 Aug 2019 12:23:55 +1000 (AEST)
Subject: [m-dev.]
 [Mercury-Language/mercury] Segfault in only Mercury after solutions in a particular code path (#72)
In-Reply-To: <20190827114007.GL1306@kola.localdomain>
References: <20190826150055.GB1344@kola.localdomain> <20190827114007.GL1306@kola.localdomain>
Message-ID:

On Tue, 27 Aug 2019 11:40:07 +1000, Peter Wang wrote:
> Github issues did not support attachments at the time (except for
> images, I think). Attachments are supported now, restricted to certain
> file extensions. I think it's acceptable.

Even if .m is not on the list of extensions they allow?

Is there some documentation of the *reasons* for the limits on extensions?

Zoltan.

From novalazy at gmail.com Tue Aug 27 13:53:00 2019
From: novalazy at gmail.com (Peter Wang)
Date: Tue, 27 Aug 2019 13:53:00 +1000
Subject: [m-dev.]
 [Mercury-Language/mercury] Segfault in only Mercury after solutions in a particular code path (#72)
In-Reply-To:
References: <20190826150055.GB1344@kola.localdomain> <20190827114007.GL1306@kola.localdomain>
Message-ID: <20190827135300.GA1306@kola.localdomain>

On Tue, 27 Aug 2019 12:23:55 +1000 (AEST), "Zoltan Somogyi" wrote:
>
>
> On Tue, 27 Aug 2019 11:40:07 +1000, Peter Wang wrote:
> > Github issues did not support attachments at the time (except for
> > images, I think). Attachments are supported now, restricted to certain
> > file extensions. I think it's acceptable.
>
> Even if .m is not on the list of extensions they allow?

It's not ideal, but I wouldn't be opening issues or attaching files very
often so the workarounds are acceptable. For a single file, rename or
gzip the file. For multiple files, put the files in a zip or tar.gz.

I'm not saying we definitely should migrate, just that the one problem
I had with Github issues has been (mostly) resolved.

>
> Is there some documentation of the *reasons* for the limits on extensions?

Not that I know of.

Peter