Using the following text file, I would like to create a CSV file.

Input file:

time : 5/14/18 10:31:26.832 AM
dt # : 0
Shot # : 587
name : 2851
cdn # : 2306
cdl : C5
Comment : N/A
________________________________________________________________________

time : 5/14/18 10:31:23.280 AM
dt # : 0
Shot # : 974
name : 2852
cdn # : 2306
cdl : C5
Comment : N/A
________________________________________________________________________

time : 5/14/18 6:04:27.880 AM
dt # : 21
Shot # : 316
name : 2854
cdn # : 2306
cdl : C5
Comment : N/A
________________________________________________________________________

time : 5/14/18 10:12:53.932 AM
dt # : 21
Shot # : 731
name : 2849
cdn # : 2306
cdl : C5
Comment : N/A
________________________________________________________________________

I tried this command to transpose the rows into columns:

gawk -F'\n' -v RS= -v OFS=',' -v ORS='\n' '{$1=$1}1' file.txt

This is the output I got:

time : 5/14/18 10:31:26.832 AM,dt # : 0,Shot # : 587,name : 2851,cdn # : 2306,cdl : C5,Comment : N/A,________________________________________________________________________
time : 5/14/18 10:31:23.280 AM,dt # : 0,Shot # : 974,name : 2852,cdn # : 2306,cdl : C5,Comment : N/A,________________________________________________________________________
time : 5/14/18 6:04:27.880 AM,dt # : 21,Shot # : 316,name : 2854,cdn # : 2306,cdl : C5,Comment : N/A,________________________________________________________________________
time : 5/14/18 10:12:53.932 AM,dt # : 21,Shot # : 731,name : 2849,cdn # : 2306,cdl : C5,Comment : N/A,________________________________________________________________________

But the desired output file should look like this:

time,dt,Shot,name,cdn,cdl,Comment,
5/14/18 10:31:26.832 AM,0,587,2851,2306,C5,N/A
5/14/18 10:31:23.280 AM,0,974,2852,2306,C5,N/A
5/14/18 6:04:27.880 AM,21,316,2854,2306,C5,N/A
5/14/18 10:12:53.932 AM,21,731,2849,2306,C5,N/A

Thanks in advance.

1 Answer


EDIT:

awk -F" : " '!a[$1]++ && NF && !/^__/{sub(/ #/,"");heading=heading?heading OFS $1:$1} /^__/ && val{val=val ORS;next} NF{val=val?val OFS $2:$2} END{gsub(/\n,/,"\n",val);print heading ORS val}' OFS=,  Input_file

The following awk command may help with this:

awk -F" : " 'BEGIN{print "time,dt,Shot,name,cdn,cdl,Comment,"}/^__/ && val{print val;val="";next} {val=val?val OFS $2:$2}' OFS=,   Input_file
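
For large inputs, a streaming variant may be worth trying. The sketch below prints each row as soon as its separator line is reached instead of building the whole output in one string. It is a sketch under stated assumptions, not a benchmarked solution: it assumes every record lists the same fields in the same order, that no value contains " : ", and that the header can be taken from the first record alone (unlike the EDIT command, which collects unique keys across the whole file). The function name records_to_csv is hypothetical.

```shell
# records_to_csv is a hypothetical helper name, not part of the original answer.
# Usage: records_to_csv Input_file > output.csv
records_to_csv() {
    awk -F' : ' -v OFS=',' '
        # A line of underscores closes the current record.
        /^__/ {
            if (n > 0) {
                if (!printed) { print head; printed = 1 }
                print row
            }
            row = ""; n = 0
            next
        }
        # Any "key : value" line adds one column to the current record.
        NF >= 2 {
            if (!printed) {
                key = $1
                sub(/ #$/, "", key)              # "dt #" -> "dt"
                head = (n ? head OFS : "") key   # header comes from the first record only
            }
            row = (n ? row OFS : "") $2
            n++
        }
        # Flush a last record that has no trailing separator.
        END {
            if (n > 0) {
                if (!printed) print head
                print row
            }
        }
    ' "$1"
}
```

Because nothing is kept in memory beyond the current row, the work per input line stays roughly constant. The header has no trailing comma, so it lines up with the seven values in each row.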

6 Comments

RavinderSingh13, many thanks. In the original file I have more than 200 lines before each ____ separator appears. Is it possible to avoid using BEGIN for the header? Otherwise I would need to write a long list for the header. I reduced the rows here to avoid posting a lot of data. Thanks in advance.
@OXXO, please check my EDIT solution and let me know whether it helps.
RavinderSingh13, the code works perfectly, but with 900k lines it takes a very long time to finish :(
@OXXO, could you run it once and let us know the results, so that we are all aware of them too?
RavinderSingh13, I ran it on the full file (900k lines) and it takes more than 20 minutes to finish.