I'm trying to read CSV files with Node.js, using code like the following:

  fs.createReadStream(file)
    .pipe(csv.parse({from_line: 6, columns: true, bom: true}, (err, data) => {
      data.forEach((row, i) => {

Because I am using the from_line parameter, parsing starts at line 6, which is the column header row. The problem is that one of the skipped lines above it (the Date line) holds a date range that I also need to use alongside the row data.

What is the best way to resolve this?

The data file looks like this:

Genre: ABC
Date: 2020-01-01, 2020-12-31
Number of Data: 300


No., Code, Name, sales, delivery, return, stock
1, ......
2, ......

Additional question

I have inserted iconv.decodeStream into the second part of the function. How can I apply the same decoder when reading the header?

  fs.createReadStream(file)
    .pipe(iconv.decodeStream("utf-8"))
    .pipe(csv.parse({from_line: 6, columns: true, bom: true}, (err, data) => {
      data.forEach((row, i) => {

1 Answer

I'd suggest reading the header data first; you can then access it in your processing callback(s), as in the example below:

app.js

// Import the csv package's main module, plus fs, util and iconv-lite
const csv = require('csv');
const fs = require('fs');
const { promisify } = require('util');
const parse = promisify(csv.parse);
const iconv = require('iconv-lite');

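async function readHeaderData(file, iconv) {
    // Read only the first 1024 bytes; the three header lines are assumed to fit within that.
    const buffer = Buffer.alloc(1024);
    const fd = fs.openSync(file, 'r');
    const bytesRead = fs.readSync(fd, buffer, 0, buffer.length, 0);
    fs.closeSync(fd);
    // iconv.decode is synchronous, so no await is needed; decode only the bytes actually read.
    const text = iconv.decode(buffer.subarray(0, bytesRead), "utf-8");
    const options = { to_line: 3, delimiter: ':', columns: false, bom: true, trim: true };
    const rows = await parse(text, options);
    // Convert the array of [key, value] pairs to an object
    return Object.fromEntries(rows);
}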

async function readFile(file, iconv) {
    const header = await readHeaderData(file, iconv);
    console.log("readFile: File header:", header);

    fs.createReadStream(file)
    .pipe(iconv.decodeStream("utf-8"))
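    .pipe(csv.parse({ from_line: 6, columns: true, bom: true, trim: true }, (err, data) => {
        // Stop early if parsing failed; data is undefined in that case.
        if (err) {
            console.error("readFile: parse error:", err);
            return;
        }
        // We now have access to the header data along with the row data in the callback.
        data.forEach((row, i) => console.log( { line: i, header, row } ))
    }));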
}

readFile('example.csv', iconv)

This will give an output like:

readFile: File header: {
  Genre: 'ABC',
  Date: '2020-01-01, 2020-12-31',
  'Number of Data': '300'
}

and

{
  line: 0,
  header: {
    Genre: 'ABC',
    Date: '2020-01-01, 2020-12-31',
    'Number of Data': '300'
  },
  row: {
    'No.': '1',
    Code: 'Code1',
    Name: 'Name1',
    sales: 'sales1',
    delivery: 'delivery1',
    return: 'return1',
    stock: 'stock1'
  }
}
{
  line: 1,
  header: {
    Genre: 'ABC',
    Date: '2020-01-01, 2020-12-31',
    'Number of Data': '300'
  },
  row: {
    'No.': '2',
    Code: 'Code2',
    Name: 'Name2',
    sales: 'sales2',
    delivery: 'delivery2',
    return: 'return2',
    stock: 'stock2'
  }
}
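
In the output above, the header's Date entry is still a single string holding both ends of the range. If the two dates are needed separately next to each row, something like the following sketch could be used inside the parse callback. The helper name splitDateRange and the start/end property names are made up for illustration, and the sketch assumes the value always has the form "start, end".

```javascript
// Hypothetical helper (not part of csv or iconv-lite): split the header's
// Date string, e.g. '2020-01-01, 2020-12-31', into its two parts.
// Assumes exactly two comma-separated values.
function splitDateRange(dateField) {
  const [start, end] = dateField.split(',').map((part) => part.trim());
  return { start, end };
}

// Sample values copied from the output shown above.
const header = { Genre: 'ABC', Date: '2020-01-01, 2020-12-31', 'Number of Data': '300' };
const row = { 'No.': '1', Code: 'Code1', Name: 'Name1' };

// Merge the date range into the row, as one might do per row in the callback.
const enriched = { ...row, ...splitDateRange(header.Date) };
console.log(enriched);
```

The dates are kept as strings here; whether to convert them to Date objects depends on how they are used afterwards.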

example.csv

Genre: ABC
Date: 2020-01-01, 2020-12-31
Number of Data: 300


No., Code, Name, sales, delivery, return, stock
1, Code1, Name1, sales1, delivery1, return1, stock1
2, Code2, Name2, sales2, delivery2, return2, stock2
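
For a header as small as the one in this file, the "Key: value" lines could also be split by hand once the text has been decoded, without a second CSV parsing pass. The sketch below is one possible way to do that using only built-in string methods. The function name parseHeaderLines is hypothetical, and it assumes each header line contains a colon separating the key from the value.

```javascript
// Hypothetical helper: turn the first `lineCount` lines of already-decoded
// text into an object, splitting each line at its first colon.
function parseHeaderLines(text, lineCount = 3) {
  const header = {};
  const lines = text
    .replace(/^\uFEFF/, '') // drop a leading BOM if one is present
    .split(/\r?\n/)
    .slice(0, lineCount);
  for (const line of lines) {
    const idx = line.indexOf(':');
    if (idx === -1) continue; // skip lines without a key/value separator
    header[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return header;
}

// The first lines of the sample file, as a string.
const sample = [
  'Genre: ABC',
  'Date: 2020-01-01, 2020-12-31',
  'Number of Data: 300',
  '',
  '',
  'No., Code, Name, sales, delivery, return, stock',
].join('\n');

const parsedHeader = parseHeaderLines(sample);
console.log(parsedHeader);
```

Splitting at the first colon only means a value that itself contains a colon, such as a time, would be kept intact.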

3 Comments

Thanks for your advice. I will run the header part first, as suggested.
I added an additional question about the decoder. It was easy to apply the decoder to the main part, but how could I do the same for the header?
I've updated the answer to apply the decoder!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.