[m-users.] Read csv file with variable number and type of fields

Julien Fischer jfischer at opturion.com
Wed Sep 16 00:26:12 AEST 2015


Hi Dirk,

On Tue, 8 Sep 2015, Dirk Ziegemeyer wrote:

>>>>> My idea is to read the file in two steps:
>>>>> 1. the header line (in order to determine number and type of fields), e.g. with some DCG rules or module parsing_utils
>>>>> 2. the rest
>>>>>
>>>>> Is this a valid approach and can I combine io.read_line_as_string/3 to
>>>>> read the header line and stream.get/4 to read the rest of the file?
>>>>
>>>> Yes, it would work.  It's going to be a bit fiddly as you are going to
>>>> have to handle things like quoted header names etc yourself.  A better
>>>> approach would be to use the CSV library's raw_reader to read in just
>>>> the header line, set up the record structure based on that, and then
>>>> initialize a new reader to handle the rest of the data.
>>>>
>>>> I've attached a small example of how this could be done.
>>>
>>> Thank you for this very complete example. The combination of
>>> raw_reader and the „normal“ csv reader is exactly what I’m looking
>>> for.
>>
>> If such functionality is useful, I could add it to the CSV library.
>
> Such functionality is useful in my case. I also need the names of the
> header fields in order to know which data column belongs to which
> header.
>
> The background is that there is a database with all possible column
> names and their data type. But the actual structure of the parsed csv
> file - namely the included columns and their order - is not known
> in advance.
>
> Maybe the column headers could be returned in an additional output
> parameter of predicate init_with_inferred_desc.

I have added the init_reader_from_header family of predicates to
the library which provide this functionality.  (There's an example
of how to use them in the samples directory.)
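
For readers without the samples at hand, the two-step idea discussed in
this thread (read the header, look up each column's type, then read the
remaining records with that description) can be sketched independently
of the library. The following is a Python illustration only; it does not
show the Mercury CSV library's API. The KNOWN_COLUMNS table, the
read_typed_csv function and the sample data are hypothetical names made
up for this sketch.

```python
import csv
import io

# Hypothetical stand-in for the database of all possible column names
# and their data types.
KNOWN_COLUMNS = {"id": int, "price": float, "name": str}


def read_typed_csv(stream):
    """Read the header first, then convert each record by column name."""
    reader = csv.reader(stream)

    # Step 1: the header line. The csv module deals with quoted names.
    header = next(reader)
    # An unknown column name raises KeyError here.
    converters = [KNOWN_COLUMNS[name] for name in header]

    # Step 2: the rest of the file, using the types chosen in step 1.
    rows = []
    for record in reader:
        if len(record) != len(header):
            raise ValueError("record has %d fields, expected %d"
                             % (len(record), len(header)))
        rows.append({name: convert(value)
                     for name, convert, value
                     in zip(header, converters, record)})

    # The header is returned too, so the caller knows which data column
    # belongs to which name.
    return header, rows


# Columns appear in an order that is not known in advance.
sample = io.StringIO('"name",price,id\nwidget,2.5,7\n')
header, rows = read_typed_csv(sample)
```

Returning the header alongside the rows mirrors the request above for
the column names to be available together with the data.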

Julien.

