I am using univocity to parse a large (6 GB) CSV file in Java. I can already parse the file with the code below, and a sample of the CSV follows it. How can I generate the output shown at the end?

    CsvParserSettings settings = new CsvParserSettings();
    settings.getFormat().setLineSeparator("\n");

    CsvParser parser = new CsvParser(settings);

    File f = new File("test.csv");
    parser.beginParsing(f, "UTF-8");

    String[] row;

    while ((row = parser.parseNext()) != null) {
        String val = Arrays.toString(row);

        val = val.replaceAll("\\[", "");
        val = val.replaceAll("\\]", "");
        val = val.replaceAll("\\s", "");

        System.out.println(val);
    } // end while

test.csv content:

    A,10,2,3
    null,11,A1,null
    null,30,A23,null
    null,44,A34,null
    null,16,A67,null
    A,20,5,6
    null,41,A100,null
    null,60,A56,null
    null,74,A34,null
    null,86,A56,null

Trying to get output like below:

    A,[10;11;30;44;16],[2,A1,A23,A34,A67],3
    A,[20;41;60;74;86],[5,A100,A56,A34,A56],6

1 Answer

Each line of the expected output is built from several input rows, so the cell values have to be accumulated in intermediate variables until the next group starts. The code can be written as follows:

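    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;

    public class GroupCsvRows {
        public static void main(String[] args) throws IOException {
            ArrayList<String> ar1 = new ArrayList<String>(); // second-column values of the current group
            ArrayList<String> ar2 = new ArrayList<String>(); // third-column values of the current group
            String s1 = null, s2 = null; // first and last cell of the row that starts the group

            // try-with-resources closes the reader even if an exception is thrown
            try (BufferedReader csv = new BufferedReader(new FileReader("test.csv"))) {
                String line;
                while ((line = csv.readLine()) != null) {
                    String[] lineSplit = line.split(",");
                    if (lineSplit.length < 4) {
                        continue; // skip blank or incomplete lines
                    }
                    if (!lineSplit[0].equals("null")) {
                        // a new group starts here, so print the previous one first
                        if (!ar1.isEmpty()) {
                            System.out.println(s1 + "," + ar1.toString().replace(", ", ";")
                                               + "," + ar2.toString().replace(", ", ",") + "," + s2);
                        }
                        s1 = lineSplit[0];
                        s2 = lineSplit[3];
                        ar1 = new ArrayList<String>();
                        ar2 = new ArrayList<String>();
                    }
                    // every row, whether it starts a group or continues one, contributes its two middle cells
                    ar1.add(lineSplit[1]);
                    ar2.add(lineSplit[2]);
                }
            }

            // print the last group, if the file contained any data
            if (!ar1.isEmpty()) {
                System.out.println(s1 + "," + ar1.toString().replace(", ", ";")
                                   + "," + ar2.toString().replace(", ", ",") + "," + s2);
            }
        }
    }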
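The code above gets its brackets and separators by post-processing the text that `ArrayList.toString()` produces. As a rough alternative sketch, `java.util.StringJoiner` can build each bracketed list directly with the separator you want. The class name `GroupRows`, the method `group`, and the choice to write to a `Writer` are illustrative assumptions, not part of the original answer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.StringReader;
import java.io.Writer;
import java.util.StringJoiner;

public class GroupRows {

    // Hypothetical helper: merges each row whose first cell is not "null"
    // with the "null" rows that follow it, writing one line per group.
    static void group(BufferedReader in, Writer out) throws IOException {
        String first = null, last = null;
        StringJoiner col1 = null, col2 = null;
        String line;
        while ((line = in.readLine()) != null) {
            String[] cells = line.split(",", -1);
            if (cells.length < 4) {
                continue; // skip blank or incomplete lines
            }
            if (!cells[0].equals("null")) {
                // a new group starts: emit the previous one, if any
                if (col1 != null) {
                    out.write(first + "," + col1 + "," + col2 + "," + last + "\n");
                }
                first = cells[0];
                last = cells[3];
                col1 = new StringJoiner(";", "[", "]"); // second column, ';'-separated
                col2 = new StringJoiner(",", "[", "]"); // third column, ','-separated
            }
            if (col1 == null) {
                continue; // continuation row before any group has started
            }
            col1.add(cells[1]);
            col2.add(cells[2]);
        }
        if (col1 != null) {
            out.write(first + "," + col1 + "," + col2 + "," + last + "\n");
        }
    }

    public static void main(String[] args) throws IOException {
        // Inline sample taken from the question's test.csv (first group only)
        String sample = "A,10,2,3\nnull,11,A1,null\nnull,30,A23,null\n"
                      + "null,44,A34,null\nnull,16,A67,null\n";
        Writer out = new OutputStreamWriter(System.out);
        group(new BufferedReader(new StringReader(sample)), out);
        out.flush(); // prints A,[10;11;30;44;16],[2,A1,A23,A34,A67],3
    }
}
```

Only one group is held in memory at a time, so this should cope with a large file as long as individual groups stay small. The same loop body could presumably be fed from the `String[]` rows that `parser.parseNext()` returns in the question, instead of from `split`.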
2 Comments

I tried it and it worked perfectly. Great work, Nithin. Thanks!
Hi Nitin, can you please help with how to achieve the following?

Input:

    A,B,C,D,E
    101,a1,b1,c1,d1
    101,a2,b2,c2,d2
    101,a3,b3,c3,d3
    102,a21,b21,c21,d21
    102,a22,b22,c22,d22
    102,a23,b23,c23,d23

The output needs to look like:

    101,[a1;a2;a3],[b1;b2;b3],[c1;c2;c3],[d1;d2;d3]
    102,[a21;a22;a23],[b21;b22;b23],[c21;c22;c23],[d21;d22;d23]
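For the follow-up in the comment above, one possible sketch is to group consecutive rows that share the same first cell and join every remaining column with `;`. It assumes that the first line is a header to be skipped and that rows with the same key are adjacent in the file. The names `GroupByKey`, `group`, and `flush` are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.StringReader;
import java.io.Writer;
import java.util.StringJoiner;

public class GroupByKey {

    // Hypothetical helper: merges adjacent rows with the same first cell.
    static void group(BufferedReader in, Writer out) throws IOException {
        String key = null;
        StringJoiner[] cols = null;
        in.readLine(); // assumption: the first line is a header and is skipped
        String line;
        while ((line = in.readLine()) != null) {
            String[] cells = line.split(",", -1);
            if (cells.length < 2) {
                continue; // skip blank lines
            }
            if (!cells[0].equals(key)) {
                flush(key, cols, out); // the key changed: emit the finished group
                key = cells[0];
                cols = new StringJoiner[cells.length - 1];
                for (int i = 0; i < cols.length; i++) {
                    cols[i] = new StringJoiner(";", "[", "]");
                }
            }
            // If a row has a different width than the row that opened the group,
            // only the columns present in both are joined.
            int n = Math.min(cols.length, cells.length - 1);
            for (int i = 0; i < n; i++) {
                cols[i].add(cells[i + 1]);
            }
        }
        flush(key, cols, out); // emit the last group
    }

    private static void flush(String key, StringJoiner[] cols, Writer out) throws IOException {
        if (cols == null) {
            return; // nothing collected yet
        }
        StringBuilder sb = new StringBuilder(key);
        for (StringJoiner c : cols) {
            sb.append(',').append(c.toString());
        }
        out.write(sb.append('\n').toString());
    }

    public static void main(String[] args) throws IOException {
        String sample = "A,B,C,D,E\n101,a1,b1,c1,d1\n101,a2,b2,c2,d2\n101,a3,b3,c3,d3\n"
                      + "102,a21,b21,c21,d21\n102,a22,b22,c22,d22\n102,a23,b23,c23,d23\n";
        Writer out = new OutputStreamWriter(System.out);
        group(new BufferedReader(new StringReader(sample)), out);
        out.flush();
    }
}
```

If rows with the same key can be scattered through the file, this streaming approach no longer works as written; a map keyed by the first cell would be needed, at the cost of keeping every group in memory.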