[m-rev.] diff: string splitting routines to string.m
Ralph Becket
rafe at csse.unimelb.edu.au
Fri Feb 2 12:57:41 AEDT 2007
Ondrej Bojar, Friday, 2 February 2007:
> Ralph Becket wrote:
> >In this case I'm just looking for a convincing case *for*
> >split_at_string, but I'm not arguing *against* it.
>
> It's a generalization of split_at_char. To be honest, I have come across
> it in only one project so far, but it nicely simplifies situations where
> you want some hierarchy of delimiters:
>
> sentence_id ||| I|pronoun am|verb sleepy|adj ||| scores
>
> Usually, one would use 'tab' to delimit main columns and 'space' to
> delimit words/subfields within fields. Some people are afraid of
> whitespace delimiters and prefer to use a printable character. One needs
> to escape the character (the whitespace too, but it's not that common
> that your data actually contains whitespace). So for each level of
> delimiting, you need another character and another escape sequence. If
> your fields are never blank, you can instead use a different number of
> copies of a single delimiter character to mark each level of
> segmentation. And this is where split_at_string is useful.
>
> Probably not very convincing, though....
Sounds like the computational linguistics people need to learn something
about representing data :-)
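
For concreteness, here is a rough sketch of how the example record in the
quoted text could be taken apart. It is written in Python rather than
Mercury, with str.split on a multi-character separator standing in for the
proposed split_at_string. The function name parse_record and the assumption
that every record has exactly three top-level fields are illustrative only;
neither comes from the patch under review.

```python
def parse_record(line):
    """Split one record using a hierarchy of delimiters.

    Assumed layout (hypothetical, taken from the single example above):
      " ||| " separates the three top-level fields,
      " "     separates words inside the middle field,
      "|"     separates a word from its tag.
    """
    # Top level: split on the whole string " ||| ", not on a single char.
    sentence_id, tagged, scores = line.split(" ||| ")
    # Second and third level: words, then word/tag pairs.
    tokens = [tuple(word.split("|")) for word in tagged.split(" ")]
    return sentence_id, tokens, scores


record = parse_record("sentence_id ||| I|pronoun am|verb sleepy|adj ||| scores")
```

Splitting the same line on the single character '|' would cut the top-level
separators and the word/tag separators alike, which is the case a
string-valued separator avoids.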