<html> <head> <meta http-equiv="Content-Type" content="text/html; charset=us-ascii"> <style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style> </head> <body dir="ltr"> <div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);" class="elementToProof ContentPasted0"> Zoltan and Richard, <div><br class="ContentPasted0"> </div> <div class="ContentPasted0">I drafted a response a few months ago that I never sent: given that my main field is not computer science, I deferred to you. </div> <div><br class="ContentPasted0"> </div> <div class="ContentPasted0 ContentPasted1">As a follow-up: when should one use non-determinism and when should one use Mercury? To motivate my questions: I came to Mercury because I was looking for a typed "Prolog" that allowed for elegant joins between relations (essentially to replace SQL:). Zoltan's implementation for the CSV example is canonical and efficient code, but it is also comparatively long and uses several data structures that could be replaced by less efficient non-determinism. When is the elegance of a non-deterministic solution acceptable given its inefficiency?</div> <div><br class="ContentPasted0"> </div> <div class="ContentPasted0">Sincerely, Mark.</div> <br> </div> <div id="appendonsend"></div> <hr style="display:inline-block;width:98%" tabindex="-1"> <div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Richard O'Keefe &lt;raoknz@gmail.com&gt;<br> <b>Sent:</b> 31 December 2022 04:57<br> <b>To:</b> zoltan.somogyi@runbox.com &lt;zoltan.somogyi@runbox.com&gt;<br> <b>Cc:</b> Mark Clements &lt;mark.clements@ki.se&gt;; users &lt;users@lists.mercurylang.org&gt;<br> <b>Subject:</b> Re: [m-users.] 
Announcement (aggregates module) + questions (window functions)</font> <div> </div> </div> <div> <div> <div dir="ltr"> <div class="x_gmail_default" style="font-family:monospace,monospace">I'd like to second Zoltan's point.</div> <div class="x_gmail_default" style="font-family:monospace,monospace">I've solved that particular Rosetta Code problem in two different</div> <div class="x_gmail_default" style="font-family:monospace,monospace">programming languages, using an imperative style with hash tables,</div> <div class="x_gmail_default" style="font-family:monospace,monospace">and modulo the usual hand-waving about hash-tables being "O(1)"</div> <div class="x_gmail_default" style="font-family:monospace,monospace">-- which is fair enough in this case -- the code is 
obviously</div> <div class="x_gmail_default" style="font-family:monospace,monospace">linear in the size of the two CSV files the program has to read.</div> <div class="x_gmail_default" style="font-family:monospace,monospace"><br> </div> <div class="x_gmail_default" style="font-family:monospace,monospace">I'm reminded of my experience with XPath, XQuery, and XSLT, that</div> <div class="x_gmail_default" style="font-family:monospace,monospace">sometimes it is simpler and more readable NOT to use a specialised</div> <div class="x_gmail_default" style="font-family:monospace,monospace">query language but to use plain high level code in a language with</div> <div class="x_gmail_default" style="font-family:monospace,monospace">decent facilities. (Such as Mercury.) Not coincidentally, XPath,</div> <div class="x_gmail_default" style="font-family:monospace,monospace">XQuery, and XSLT hide practically everything you need to understand</div> <div class="x_gmail_default" style="font-family:monospace,monospace">performance.</div> <div class="x_gmail_default" style="font-family:monospace,monospace"><br> </div> </div> <br> <div class="x_gmail_quote"> <div dir="ltr" class="x_gmail_attr">On Sat, 31 Dec 2022 at 08:35, Zoltan Somogyi <<a href="mailto:zoltan.somogyi@runbox.com">zoltan.somogyi@runbox.com</a>> wrote:<br> </div> <blockquote class="x_gmail_quote" style="margin:0px 0px 0px 0.8ex; border-left:1px solid rgb(204,204,204); padding-left:1ex"> <br> 2022-12-30 00:27 GMT+11:00 "Mark Clements" <<a href="mailto:mark.clements@ki.se" target="_blank">mark.clements@ki.se</a>>:<br> > %% patient and visit are nondet predicates<br> > main(!IO) :-<br> > print_line("{Id, RowNumber, Date, Score, CumScore}", !IO),<br> > aggregate((pred({Id,RowNumber,Datei,Scorei,CumScorei}::out) is nondet :-<br> > patient(Id,_),<br> > Combined = (pred(Date::out,Score::out) is nondet :- visit(Id,Date,Score)),<br> > Combined(Datei,Scorei),<br> > bag_cum_sum(Combined)(Datei,CumScorei),<br> > Dates = 
(pred(Date::out) is nondet :- Combined(Date,_)),<br> > bag_row_number(Dates)(Datei,RowNumber)),<br> > print_line,<br> > !IO).<br> > <br> > Third, I have sought to stay with nondet predicates, with the implementation internally using lists -- is there a better approach?<br> <br> I think that getting your data from nondet predicates is fundamentally a bad idea.<br> The reason is simple: it bakes the data into the program. If you want to run<br> the same task on a different data set, you have to modify the program and recompile it.<br> This is much less convenient than a program that you can run on a different data set<br> simply by invoking it with different file names.<br> <br> The attached code is my solution to the same task on <a href="http://rosettacode.org/" rel="noreferrer" target="_blank">rosettacode.org</a>.<br> You will note that its operative part is longer than your code above,<br> but it is also much simpler, and therefore easier to read and to<br> understand. 
It is also easier to reason about its performance.<br> For example,<br> <br> - the main operation loops over all visits,<br> - the non-constant-time part of each iteration consists of lookup and update operations<br> on its main data structure, VisitDataMap,<br> - VisitDataMap is a map, and is therefore implemented using balanced trees.<br> <br> This makes it clear that its complexity is O(N log N), where N is the number of visits.<br> (Technically, it is O(N log M), where M is the number of unique patients, but the<br> difference is negligible.) By contrast, I cannot tell anything about the performance<br> of the aggregate-using code above, because all the relevant details are hidden<br> behind abstraction boundaries, whose documentation is silent about performance.<br> Note that in the SQL programs from which you draw your inspiration, selecting<br> the right set of indexes for each relation is usually an important design concern.<br> <br> Zoltan.<br> _______________________________________________<br> users mailing list<br> <a href="mailto:users@lists.mercurylang.org" target="_blank">users@lists.mercurylang.org</a><br> <a href="https://lists.mercurylang.org/listinfo/users" rel="noreferrer" target="_blank">https://lists.mercurylang.org/listinfo/users</a><br> </blockquote> </div> </div> </div> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" 
"http://www.w3.org/TR/html4/strict.dtd"> <meta http-equiv="Content-Style-Type" content="text/css"> <title></title> <meta name="Generator" content="Cocoa HTML Writer"> <meta name="CocoaVersion" content="1561.6"> <style type="text/css"> p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Times; color: #000000; -webkit-text-stroke: #000000; min-height: 14.0px} p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Times; color: #000000; -webkit-text-stroke: #000000} span.s1 {font-kerning: none} span.s2 {text-decoration: underline ; font-kerning: none; color: #0000ee; -webkit-text-stroke: 0px #0000ee} </style> <p class="p1"><span class="s1"></span><br> </p> <p class="p1"><span class="s1"></span><br> </p> <p class="p2"><span class="s1"><i>När du skickar e-post till Karolinska Institutet (KI) innebär detta att KI kommer att behandla dina personuppgifter. </i><a href="https://ki.se/medarbetare/integritetsskyddspolicy"><span class="s2">Här finns information om hur KI behandlar personuppgifter</span></a>.<span class="Apple-converted-space"> </span></span></p> <p class="p1"><span class="s1"></span><br> </p> <p class="p2"><span class="s1"><i>Sending email to Karolinska Institutet (KI) will result in KI processing your personal data.</i> <a href="https://ki.se/en/staff/data-protection-policy"><span class="s2">You can read more about KI’s processing of personal data here</span></a>.<span class="Apple-converted-space"> </span></span></p> </body> </html>