[m-users.] Documenting univ data passing to and from the C FFI

Fabrice Nicol fabrnicol at gmail.com
Thu May 20 18:59:34 AEST 2021


Please read "AI" wherever I wrote "IA": first/second/third-wave AI.
(Sorry, the French unpredictably pops up sometimes xD)

On Thu, May 20, 2021 at 10:48 AM, Fabrice Nicol <fabrnicol at gmail.com>
wrote:

> Thanks Julien.
> I think this is quite an interesting debate actually.
>
> I am taking the point of view of a more-or-less-advanced user in the
> course of writing an interface to an external C library.
> There are two independent sets of constraints to consider: Mercury
> language design constraints and, where relevant, the performance of the
> interface. The former may sometimes run contrary to the latter, so the
> question is which must yield first in such cases, and how to strike a
> reasonable balance.
>
>> "Why do you need to pass univs across the Mercury-C boundary?
>> Why do you need to unpack them in C code?"
>
> Performance issues.
> Consider an experimental user with a 50 GB database and a reasonably good
> platform.
> User has been performing 'second-wave IA' jobs (say ML neural network
> stuff, for concreteness), say using R (but it might be TensorFlow and
> Python). User now wants to process results in a 'third wave IA'
> perspective, in other words perform tasks:
> - using logic/functional programming techniques
> - using expert system methods modernized from good ol' 'first wave IA'
> (mostly symbolic rule-based analysis expressing some domain-specific
> knowledge).
>
> User now has to churn big data loads and wants to give Mercury a
> well-deserved chance.
>
> This is the context. It is not that experimental either, despite some
> appearances. It is even quite a trendy thing in some advanced IA curricula,
> with other tools/languages like Haskell. (Check out the open-access IA
> courses at ETH Zurich, for example.)
>
> User now has to minimize Mercury/C interface overhead in a big data
> context. After some tests, User needs to thread the big data loads to and
> from R using the C FFI.
>
> Understandably, the data from R (and Python is no better) will **not** be
> type-safe. R nastily and sometimes unpredictably casts types around, and
> User has to neutralize this by 'boxing' data flows into some universal data
> type. 'univ' looks like a good candidate (at least a bit better than
> falling back on strings).
> User now has a choice: export some Mercury code into C to process the
> univ-typed 50 GB data, or use the runtime RTTI macros to do the job.
> Understandably s/he will prefer the latter option even in the face of
> possible interface modifications in the future: either the software C
> interface code will be fixed to reflect those changes, or the Mercury tools
> will be frozen at user level for the time it takes. This is mostly a team
> resource management issue, not a language design issue (from a user
> viewpoint).
>
>> "The details of the RTTI system are deliberately not documented at the
>> target language level since they are (and have been) subject to change."
>
> Yes :-(
> Understandably so from a Mercury language developer's viewpoint.
> Perhaps a bit less so from a Mercury user's viewpoint, as outlined above.
>
>> If you have to pack values in univs like that you would be far better
>> implementing test/3 as a Mercury predicate and exporting that to C,
>> e.g.
>>
>> (...)
>>
>
> Great code chunk. Thanks again Julien. I would suggest adding some
> version of it to Janet's nice crash course.
> Unfortunately, I tested it against a realistic database and the incurred
> CPU time penalty is a bit stiff compared with the C RTTI macro alternative
> (link below).
>
>> "I assume that you are attempting something a bit more general in
>> practice?"
>
> Sure. As you understood, this was a simplified minimal example for
> demonstrative purposes.
> The real code is here (it runs, yet I must warn that it still is very
> unstable/experimental):
> https://www.github.com/fabnicol/RMercury/tree/library/ri.m,
> lines 4100 and further down.
>
> I'm following a mixed approach there: using 'MR_unravel_univ' at the C
> level yet performing type analysis in Mercury code. This is OK as long as
> you have few columns. But this might have to be changed for transposed
> matrices with millions of columns (and few rows), as in this case the
> Mercury type analysis would be called once per column, i.e. millions of
> times.
>
> Fabrice
>
>
>
>