[m-dev.] for review: new module for handling file offsets

Fergus Henderson fjh at cs.mu.OZ.AU
Tue Mar 9 22:34:04 AEDT 1999


On 09-Mar-1999, Peter Schachte <schachte at cs.mu.OZ.AU> wrote:
> 
> I don't understand why switching from line numbers to file offsets is
> so much easier than switching to file regions.

Currently, to compute the term__context for a compound term,
all you need is the term__context for the first component.
Adding column numbers or changing to file offsets won't change that.
But if we switch to regions, we will also need the term__context
for the last component.  This would require substantial code changes.
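
To illustrate the difference, here is a minimal Python sketch. It is not
the Mercury library code: the Term type, its fields, and the function names
are all hypothetical, and it assumes that only leaf terms record offsets.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Term:
    # Hypothetical term representation: a leaf records its own start and
    # end offsets; a compound term records only its components.
    start: int = 0
    end: int = 0
    args: List["Term"] = field(default_factory=list)


def start_context(t: Term) -> int:
    # Starting point: only the first component is ever inspected.
    while t.args:
        t = t.args[0]
    return t.start


def end_context(t: Term) -> int:
    # A region additionally needs the last component.
    while t.args:
        t = t.args[-1]
    return t.end


def region(t: Term) -> Tuple[int, int]:
    # Region = (start of first component, end of last component).
    return (start_context(t), end_context(t))
```

Replacing line numbers with offsets only changes what start_context
returns; moving to regions adds the walk over the last component at
every place where a context is computed.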

> Alright, how about this as a compromise position: add an abstract type
> called something like file_content_identifier (but preferably shorter)
> to signify that it specifies some part of a file.

In the library, I used the term "context" (term__context, token_context).

> Use this type
> instead of integer offsets in your module.  And then just add the word
> ``start'' somewhere in the predicates that get the line and column
> from an offset so that it's clear that you are getting the line and
> column numbers of the start of the specified part of the file.  This
> just encapsulates what you're doing better so that later it would be
> easier to switch to some fancier way of identifying part of a file.

Well, this doesn't actually help much.  As explained above, "region"
and "starting line [and possibly column]" are fundamentally different
abstractions that require different algorithms to compute them.
If the code is still computing just the start, then the abstraction
we use should probably reflect that.
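
For comparison, getting the line and column of a starting offset can be
sketched as below. This is only an illustration in Python under assumed
conventions (0-based offsets, 1-based lines and columns); the function
names are hypothetical and are not the interface of the module under
review.

```python
import bisect


def line_starts(text):
    # Offsets at which each line begins; line 1 always begins at offset 0.
    starts = [0]
    for i, ch in enumerate(text):
        if ch == "\n":
            starts.append(i + 1)
    return starts


def start_line_and_column(starts, offset):
    # Find the last line whose starting offset is <= the given offset.
    line_index = bisect.bisect_right(starts, offset) - 1
    return (line_index + 1, offset - starts[line_index] + 1)
```

Nothing here needs to know where the term ends, which is why this
computation stays simple for starting points.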

It makes some sense for term__context to abstract away the difference
between regions and starting points, because we might later change it
(keeping our data structures the same, but changing all of our code).

But for a standard library module, it would never be a good idea to change
the implementation from starting points to regions, because doing that
would break all the code that uses it.  Or, to look at it differently,
using a non-committal abstraction like `context' rather than the more
concrete abstractions `region' or `offset' would force all the code that
uses that module to use the more complicated algorithms required for
computing regions, in case the underlying implementation were to change.
In that case, it would be better to just use the `region' abstraction.

-- 
Fergus Henderson <fjh at cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh at 128.250.37.3        |     -- the last words of T. S. Garp.


