Creating csv file from text

Question

Using the following text file, I would like to create a CSV file.

input file

time : 5/14/18 10:31:26.832 AM
dt # : 0
Shot # : 587
name : 2851
cdn # : 2306
cdl : C5
Comment : N/A
________________________________________________________________________

time : 5/14/18 10:31:23.280 AM
dt # : 0
Shot # : 974
name : 2852
cdn # : 2306
cdl : C5
Comment : N/A
________________________________________________________________________

time : 5/14/18 6:04:27.880 AM
dt # : 21
Shot # : 316
name : 2854
cdn # : 2306
cdl : C5
Comment : N/A
________________________________________________________________________

time : 5/14/18 10:12:53.932 AM
dt # : 21
Shot # : 731
name : 2849
cdn # : 2306
cdl : C5
Comment : N/A
________________________________________________________________________

I tried to use this code to transpose the rows to columns.

gawk -F'\n' -v RS= -v OFS=',' -v ORS='\n' '{$1=$1}1' file.txt

This is the output I got:

time : 5/14/18 10:31:26.832 AM,dt # : 0,Shot # : 587,name : 2851,cdn # : 2306,cdl : C5,Comment : N/A,________________________________________________________________________
time : 5/14/18 10:31:23.280 AM,dt # : 0,Shot # : 974,name : 2852,cdn # : 2306,cdl : C5,Comment : N/A,________________________________________________________________________
time : 5/14/18 6:04:27.880 AM,dt # : 21,Shot # : 316,name : 2854,cdn # : 2306,cdl : C5,Comment : N/A,________________________________________________________________________
time : 5/14/18 10:12:53.932 AM,dt # : 21,Shot # : 731,name : 2849,cdn # : 2306,cdl : C5,Comment : N/A,________________________________________________________________________

But the desired output file should look like this:

time,dt,Shot,name,cdn,cdl,Comment,
5/14/18 10:31:26.832 AM,0,587,2851,2306,C5,N/A
5/14/18 10:31:23.280 AM,0,974,2852,2306,C5,N/A
5/14/18 6:04:27.880 AM,21,316,2854,2306,C5,N/A
5/14/18 10:12:53.932 AM,21,731,2849,2306,C5,N/A

Thanks in advance.

RavinderSingh13 · Accepted Answer · 2018-05-14 10:55:11Z

EDIT: This version derives the header from the field names in the input instead of hard-coding it:

awk -F" : " '!a[$1]++ && NF && !/^__/{sub(/ #/,"");heading=heading?heading OFS $1:$1} /^__/ && val{val=val ORS;next} NF{val=val?val OFS $2:$2} END{gsub(/\n,/,"\n",val);print heading ORS val}' OFS=,  Input_file

Original answer: the following awk command, which prints a hard-coded header, may help:

awk -F" : " 'BEGIN{print "time,dt,Shot,name,cdn,cdl,Comment,"}/^__/ && val{print val;val="";next} {val=val?val OFS $2:$2}' OFS=,   Input_file

edited May 14, 2018 at 10:55

answered May 14, 2018 at 10:36

6 Comments

OXXO Over a year ago

RavinderSingh13, many thanks. In the original file there are more than 200 lines between each ____ separator. Is it possible to avoid the BEGIN block for the header? Otherwise I need to write out a long list for the header. I reduced the rows here to avoid posting a lot of data. Thanks in advance.

RavinderSingh13 Over a year ago

@OXXO, please check my EDIT solution and let me know if it helps.

OXXO Over a year ago

RavinderSingh13, the code works perfectly, but with 900k lines it takes a very long time to finish :(

RavinderSingh13 Over a year ago

@OXXO, could you run it once and let us know the results, so that we are all aware of them too?

OXXO Over a year ago

RavinderSingh13, I ran it on the full file (900k lines) and it takes more than 20 minutes to finish.
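
On the run time reported in the comments: one possible cause (an assumption, not measured here) is that the EDIT command keeps the entire output in the variable val and only prints it in the END block. Below is a minimal sketch of a streaming variant that prints each record as soon as its ____ separator is seen and takes the header from the first record. The function name records_to_csv is hypothetical. The sketch assumes that every record lists the same fields in the same order as the first one, and that values contain neither commas nor the " : " delimiter.

```shell
# Hypothetical helper: convert "key : value" records separated by ____ lines to CSV.
# Assumes every record has the same keys, in the same order, as the first record.
records_to_csv() {
    awk -F' : ' -v OFS=',' '
    /^__/ {                          # separator line: the current record is complete
        if (n) {
            if (!printed) { print header; printed = 1 }
            print row
        }
        row = ""; n = 0
        next
    }
    NF < 2 { next }                  # skip blank lines and lines without " : "
    {
        if (!printed) {              # still inside the first record: collect the header
            key = $1
            sub(/ #$/, "", key)      # "dt #" -> "dt"
            header = (n ? header OFS : "") key
        }
        row = (n ? row OFS : "") $2
        n++
    }
    END {                            # flush a last record that has no trailing separator
        if (n) {
            if (!printed) print header
            print row
        }
    }' "$@"
}
```

Usage would be along the lines of: records_to_csv file.txt > file.csv. Only one row is held in memory at a time, so the cost per record should not grow with the size of the file.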
