2

I need a robust way to combine multiple CSV files that takes newline characters and similar edge cases into account. Please help me with it, and provide some code if possible.

Or at least tell me which scenarios I should expect when combining them.

Regards

2
  • Do you have any existing code to work with? Commented May 20, 2011 at 4:53
  • You can check the csv-merger GitHub project. Commented Mar 12, 2013 at 6:13

5 Answers

1

If you're on a Unix-based machine (Linux or OS X), you can use cat from the terminal to concatenate the files. Note that cat copies each file verbatim, so every file's header row is repeated in the output.

If you absolutely want to use Java, this forum post covers the topic and provides example code.

Also, this Stack Overflow post covers the same topic.


2 Comments

Hey Ryan, the forum post is exactly what I was looking for. The Stack Overflow post was a bit vague and did not show up in the results when I searched. Thanks.
Any reason why I got a downvote? I would appreciate the feedback so I can continue to improve my answers here on Stack Overflow.
1

You can use something like this to parse the data: http://opencsv.sourceforge.net/

1) I would parse each CSV into a string array per line. Compare the first line (the header) of each file using .equals or compareTo (or Arrays.equals if the line is already split into a String[]) to make sure you're dealing with the same kind of CSV. You can run this check on the first line inside the parser and reject any file whose header doesn't match.

Once that's done, delete the first line from every array except the first, merge them, then write a sort method for the data if you need one and print the result to a file.

2) Even easier: take all the CSVs, scan their first lines, and compare them. If they're the same, read the entire CSVs into string arrays, merge them, and write the arrays to file.csv (for example with a FileWriter).

Another CSV parser: http://commons.apache.org/sandbox/csv/apidocs/org/apache/commons/csv/CSVParser.html
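The compare-headers-then-merge steps above can be sketched with the standard library alone. This is a minimal sketch, not a definitive implementation: the class name `HeaderCheckedMerge` and the method `merge` are made up for illustration, the header comparison is a plain string comparison, and the code works on physical lines, so it assumes the header row itself contains no quoted line breaks. For anything stricter, a real CSV parser such as the ones linked above is the safer choice.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not part of any library.
public class HeaderCheckedMerge {

    /** Merges inputs into output, keeping a single header row; returns the number of data rows written. */
    public static int merge(List<Path> inputs, Path output) throws IOException {
        String header = null;
        int rows = 0;
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (Path input : inputs) {
                List<String> lines = Files.readAllLines(input, StandardCharsets.UTF_8);
                if (lines.isEmpty()) {
                    continue; // nothing to merge from an empty file
                }
                if (header == null) {
                    // The first non-empty file supplies the reference header.
                    header = lines.get(0);
                    out.write(header);
                    out.newLine();
                } else if (!header.equals(lines.get(0))) {
                    throw new IllegalArgumentException("Header mismatch in " + input);
                }
                // Copy everything after the header row.
                for (String line : lines.subList(1, lines.size())) {
                    out.write(line);
                    out.newLine();
                    rows++;
                }
            }
        }
        return rows;
    }
}
```

Because each row is written followed by an explicit line break, a source file that lacks a trailing newline does not get its last row glued to the next file's first row.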


0

Read them in separately and write them out to one file. You can also add a check that the records from the CSV files have the same number of columns, and raise an error if they don't.
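A column-count check like the one described could look roughly like this. It is only a sketch under assumptions: `ColumnCount`, `countFields`, and `requireSameWidth` are hypothetical names, fields are assumed to be quoted with double quotes, and each record is assumed to fit on one line. Counting delimiters only outside quotes matters because a naive `split(",")` miscounts a field such as "Main St, Apt 4".

```java
// Hypothetical helper, not part of any library.
public class ColumnCount {

    /** Counts the fields in one CSV record, ignoring delimiters inside double quotes. */
    public static int countFields(String record, char delimiter) {
        int fields = 1;
        boolean inQuotes = false;
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (c == '"') {
                // An escaped quote ("") toggles twice, leaving the state unchanged.
                inQuotes = !inQuotes;
            } else if (c == delimiter && !inQuotes) {
                fields++;
            }
        }
        return fields;
    }

    /** Throws if the two records do not have the same number of fields. */
    public static void requireSameWidth(String recordA, String recordB, char delimiter) {
        int a = countFields(recordA, delimiter);
        int b = countFields(recordB, delimiter);
        if (a != b) {
            throw new IllegalArgumentException("Column count mismatch: " + a + " vs " + b);
        }
    }
}
```

Calling `requireSameWidth` on the header rows of two files before merging gives an early failure instead of a silently ragged output file.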

This isn't really a Java-specific problem.


0

Continuing from user453441's answer: also check the separator. "Comma"-separated values are often separated by a different delimiter, because business data such as an address line can itself contain a comma.
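One rough way to check the separator is to guess it from the header row by counting candidate characters outside quotes. This is a heuristic sketch, not a reliable detector: `DelimiterSniffer` and `detect` are made-up names, the candidate list (comma, semicolon, tab, pipe) is an assumption, and ties fall back to the earliest candidate in that list.

```java
// Hypothetical helper, not part of any library.
public class DelimiterSniffer {

    // Assumed candidate list; adjust it to the delimiters your data may use.
    private static final char[] CANDIDATES = {',', ';', '\t', '|'};

    /** Returns the candidate that occurs most often outside double quotes; defaults to a comma. */
    public static char detect(String headerLine) {
        char best = ',';
        int bestCount = 0;
        for (char candidate : CANDIDATES) {
            int count = countOutsideQuotes(headerLine, candidate);
            if (count > bestCount) {
                best = candidate;
                bestCount = count;
            }
        }
        return best;
    }

    /** Counts occurrences of target that are not enclosed in double quotes. */
    private static int countOutsideQuotes(String line, char target) {
        int count = 0;
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;
            } else if (c == target && !inQuotes) {
                count++;
            }
        }
        return count;
    }
}
```

Running `detect` on the first line of every input and refusing to merge when the results differ catches the case where one export uses semicolons and another uses commas.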


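0

    import java.io.*;
    import java.util.*;

    // Fragment: assumes listOfFilesToBeMerged (a collection of File objects that
    // does NOT include firstFile) and a custom FileMergeException exist elsewhere.
    String[] headers = null;
    String firstFile = "/path/to/firstFile.dat";

    // Read the header row of the first file; every other file must match it.
    Scanner scanner = new Scanner(new File(firstFile));
    if (scanner.hasNextLine())
        headers = scanner.nextLine().split(",");
    scanner.close();

    // Open the first file in append mode. It must already end with a line
    // break, or the first appended row is glued onto its last row.
    Iterator<File> iterFiles = listOfFilesToBeMerged.iterator();
    BufferedWriter writer = new BufferedWriter(new FileWriter(firstFile, true));

    while (iterFiles.hasNext()) {
      File nextFile = iterFiles.next();
      BufferedReader reader = new BufferedReader(new FileReader(nextFile));

      // Read and split this file's header row.
      String line = null;
      String[] firstLine = null;
      if ((line = reader.readLine()) != null)
        firstLine = line.split(",");

      if (!Arrays.equals(headers, firstLine))
        throw new FileMergeException("Header mismatch between CSV files: '" +
                  firstFile + "' and '" + nextFile.getAbsolutePath() + "'");

      // Copy the remaining rows; the header row was consumed above.
      while ((line = reader.readLine()) != null) {
        writer.write(line);
        writer.newLine();
      }

      reader.close();
    }
    writer.close();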
