[m-users.] (cough) "AI"
Bijan Parsia
bijan.parsia at manchester.ac.uk
Mon Oct 27 00:07:56 AEDT 2025
I’d be interested to know more about the precise setup of your experiments, Richard.
The impression I’m getting is that so-called “agentic” approaches work better, especially for things without much explicit training info (e.g., less popular languages). (Roughly, this means firing up multiple LLM-driven “agents” which try to explore the space and cross-check each other.) In Claude, adding more, and more specific, documentation to the project can help as well (e.g., a description of tail accuracy: not necessarily actual tests, but the high-level natural-language description).
(It’s hard to do proper experiments on these public systems because they change a *lot* from moment to moment!)
Cheers,
Bijan.