[m-users.] Read csv file with variable number and type of fields

Dirk Ziegemeyer dirk at ziegemeyer.de
Thu Sep 3 06:25:21 AEST 2015


Hi Julien,

> On 02.09.2015 at 06:44, Julien Fischer <jfischer at opturion.com> wrote:
> 
>> I need to read csv files where the number and the type of fields in a
>> record is not fixed in advance. Instead the record structure is
>> determined by the header record of the csv file together with a
>> database which assigns a data type to every header field name.
>> 
>> I’m wondering if I can use the library https://github.com/juliensf/mercury-csv for that.
>> 
>> It seems that I need to know the record structure in order to
>> initialize a csv reader with csv.init_reader/3.
> 
> Yes, but there's nothing that prevents you from reading in the header
> line, setting up the record structure based on that and then
> initializing a new CSV reader based on that.
> 
>> My idea is to read the file in two steps:
>> 1. the header line (to determine the number and types of fields), e.g. with some DCG rules or the parsing_utils module
>> 2. the rest
>> 
>> Is this a valid approach and can I combine io.read_line_as_string/3 to
>> read the header line and stream.get/4 to read the rest of the file?
> 
> Yes, it would work.  It's going to be a bit fiddly as you are going to
> have to handle things like quoted header names etc yourself.  A better
> approach would be to use the CSV library's raw_reader to read in just
> the header line, set up the record structure based on that, and then
> initialize a new reader to handle the rest of the data.
> 
> I've attached a small example of how this could be done.

Thank you for this very complete example. The combination of raw_reader and the "normal" csv reader is exactly what I’m looking for.

Dirk.
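
For readers without the attachment, the two-step approach discussed above can be sketched roughly as follows. This is not the attached Mercury example and does not use the mercury-csv API. It is a Python analogy using the standard csv module, and FIELD_TYPES and read_typed_csv are hypothetical names invented for the illustration. The dictionary stands in for the database that assigns a data type to every header field name. Unlike the raw_reader plus typed reader combination, the sketch reuses one untyped reader for both steps and applies the per-column conversions itself.

```python
import csv
import io
from datetime import date

# Hypothetical stand-in for the database that maps each header
# field name to a data type (here: a conversion function).
FIELD_TYPES = {
    "id": int,
    "name": str,
    "price": float,
    "date": date.fromisoformat,
}


def read_typed_csv(stream, field_types=FIELD_TYPES):
    """Read a CSV stream whose record structure is given by its header.

    Step 1: read only the header record with an untyped reader, so that
            quoted header names are handled by the CSV parser.
    Step 2: look up a converter for every header field and use the
            resulting list to convert the remaining records.
    """
    raw = csv.reader(stream)

    # Step 1: the header record determines number and type of fields.
    try:
        header = next(raw)
    except StopIteration:
        return [], []  # empty input: no header, no records

    converters = []
    for name in header:
        if name not in field_types:
            raise KeyError(f"no type known for header field {name!r}")
        converters.append(field_types[name])

    # Step 2: convert the rest of the data using that structure.
    records = []
    for record_no, row in enumerate(raw, start=1):
        if len(row) != len(converters):
            raise ValueError(
                f"record {record_no}: expected {len(converters)} fields, "
                f"got {len(row)}"
            )
        records.append([conv(field) for conv, field in zip(converters, row)])

    return header, records


if __name__ == "__main__":
    data = 'id,"name",price\n1,"Smith, J",2.5\n2,Lee,3\n'
    header, records = read_typed_csv(io.StringIO(data))
    print(header)
    print(records)
```

Letting the CSV parser read the header as well is what avoids the fiddly part mentioned in the reply: quoted header names and embedded commas are handled the same way as in the data records.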




