[m-users.] Read csv file with variable number and type of fields

Dirk Ziegemeyer dirk at ziegemeyer.de
Wed Sep 9 02:42:50 AEST 2015

Hi Julien,

> On 03.09.2015 at 03:01, Julien Fischer <jfischer at opturion.com> wrote:
> Hi,
> On Wed, 2 Sep 2015, Dirk Ziegemeyer wrote:
>>> On 02.09.2015 at 06:44, Julien Fischer <jfischer at opturion.com> wrote:
>>> Yes, but there's nothing that prevents you from reading in the header
>>> line, setting up the record structure based on that and then
>>> initializing a new CSV reader based on that.
>>>> My idea is to read the file in two steps:
>>>> 1. the header line (in order to determine number and type of fields), e.g. with some DCG rules or module parsing_utils
>>>> 2. the rest
>>>> Is this a valid approach and can I combine io.read_line_as_string/3 to
>>>> read the header line and stream.get/4 to read the rest of the file?
>>> Yes, it would work.  It's going to be a bit fiddly as you are going to
>>> have to handle things like quoted header names etc yourself.  A better
>>> approach would be to use the CSV library's raw_reader to read in just
>>> the header line, set up the record structure based on that, and then
>>> initialize a new reader to handle the rest of the data.
>>> I've attached a small example of how this could be done.
>> Thank you for this very complete example. The combination of
>> raw_reader and the "normal" csv reader is exactly what I’m looking
>> for.
> If such functionality is useful, I could add it to the CSV library.

Such functionality is useful in my case. I also need the names of the header fields in order to know which data column belongs to which header.

The background is that there is a database with all possible column names and their data types. But the actual structure of the parsed CSV file, namely which columns are included and in what order, is not known in advance.

Maybe the column headers could be returned in an additional output parameter of the predicate init_with_inferred_desc.
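
To make the use case concrete, here is a minimal sketch of the idea, written in Python rather than Mercury and not using the Mercury CSV library at all. The table KNOWN_COLUMNS and the function read_with_header are hypothetical names chosen for illustration; the table stands in for the database of possible column names and types. The header line is read first, each header name is looked up to decide how that column is parsed, and the header names are returned alongside the data so the caller knows which column is which.

```python
import csv
import io

# Hypothetical stand-in for the database of all possible columns and their types.
KNOWN_COLUMNS = {"id": int, "name": str, "price": float}

def read_with_header(stream):
    """Read the header line first, then parse the remaining records
    according to the column types the header implies."""
    reader = csv.reader(stream)
    # Step 1: the header line; the csv module handles quoted header names.
    header = next(reader)
    # Look up one converter per column, in the order found in this file.
    # An unknown column name raises KeyError.
    converters = [KNOWN_COLUMNS[name] for name in header]
    # Step 2: the rest of the file, converted field by field.
    rows = [
        [convert(field) for convert, field in zip(converters, record)]
        for record in reader
    ]
    # Return the header names too, so columns can be matched to their data.
    return header, rows

# Columns appear in an order that is not known in advance.
sample = io.StringIO('price,"name",id\n1.5,widget,7\n2.25,"gadget, large",8\n')
header, rows = read_with_header(sample)
```

Returning the header together with the rows corresponds to the extra output parameter suggested above.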

