3

I'm working on a bash script.

var=$(ls -t1 | head -n1);
cat $var | sed 's/"//g' > latest.csv
cat latest.csv | sed -e 's/^\|$/"/g' -e 's/,/","/g' > from_epos.csv
echo "LATEST: $var";

That's the whole script. It's meant to delete all quotation marks from the current file and then add new ones around each field.

INPUT:

"sku","item","price","qty"
5135,"ITEM1",1.79,5
5338,"ITEM2",1.39,5
5318,"ITEM3",1.09,5
5235,"ITEM4",1.09,5
9706,"ITEM5",1.99,5

OUTPUT:

"sku","item","price","qty"
"5135","ITEM1","1.79","5
"
"5338","ITEM2","1.39","5
"
"5318","ITEM3","1.09","5
"
"5235","ITEM4","1.09","5
"
"9706","ITEM5","1.99","5
"

My ideal output is:

"sku","item","price","qty"
"5135","ITEM1","1.79","5"
"5338","ITEM2","1.39","5"
"5318","ITEM3","1.09","5"
"5235","ITEM4","1.09","5"
"9706","ITEM5","1.99","5"

It looks as if a stray line break is being inserted at the end of each line in the current output, as though the closing quotation mark lands between the CR and the LF.

What's the problem, and how do I get my ideal output?

Thanks,

Adam

3
  • Your approach seems wrong. If you need to process a CSV, use a real programming language and a proper parser. Commented Jun 18, 2013 at 18:48
  • 1
    I tried running your script against the provided input and it produced the correct output. Encoding might be the issue. Commented Jun 18, 2013 at 19:06
  • possible duplicate of Quotation mark into .csv (per field) AWK/SED Commented Jun 18, 2013 at 19:12

3 Answers

4
awk 'BEGIN{FS=OFS=","}{gsub(/\"/,"");gsub(/[^,]+/,"\"&\"")}1' input

5 Comments

Same output as above; maybe something is wrong with the encoding? The reported type of the input is: text/plain; charset=us-ascii
@AdamLesniak Looks like your file was created on Windows. If you don't have the dos2unix utility, run this: sed -i "s/\r$//g" input.csv
I installed dos2unix, but running dos2unix input.csv doesn't help; still the same output. Same with sed.
@AdamLesniak Technically the best answer should go to anubhava. He was the one who solved your problem. :)
@JS웃: Thanks and much appreciated. Your awk command was much more concise than my original awk solution. +1
2

Solution using sed:

sed -e 's/"//g; s/,/","/g; s/^/"/; s/$/"/'

Long-piped-commented version:

sed -e 's/"//g' | # removes all quotations
sed -e 's/,/","/g' | # changes all commas to ","
sed -e 's/^/"/; s/$/"/' # puts quotations in the start and end of each line
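The commands above assume Unix line endings. If the input ends each line with CR LF, as the `od -c` output posted elsewhere in this thread suggests, the carriage return stays in front of the closing quotation mark. A minimal sketch of the same sed approach with one extra expression to drop a trailing CR, assuming GNU sed (which understands `\r` in a regex) and fields without embedded commas or quotes; the function name `requote_csv` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: reads CSV on stdin, writes re-quoted CSV on stdout.
# Assumes GNU sed (for \r) and no embedded commas or quotes inside fields.
requote_csv() {
    sed -e 's/\r$//' \
        -e 's/"//g' \
        -e 's/,/","/g' \
        -e 's/^/"/' \
        -e 's/$/"/'
}

# Example: one CRLF-terminated record.
printf '5135,"ITEM1",1.79,5\r\n' | requote_csv
# prints: "5135","ITEM1","1.79","5"
```

The CR has to be removed first; otherwise `s/$/"/` appends the quote after it and the terminal shows the quote on its own line.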


1

awk can do all this in one command:

awk -F"," 'NR>1{for(i=1; i<=NF; i++) {if (!($i ~ /^"/)) printf("\"%s\"",$i); 
           else printf("%s",$i); if (i<NF) printf(","); else print "";}}' latest.csv

EDIT:

Try this awk: (modified from JS's suggested command)

awk 'BEGIN{FS=OFS=","}{gsub(/\"/,"");gsub(/[^,\r]+/,"\"&\"")}1' 

OR

awk -F"[,\r]" 'NR==1{print} NR>1{for(i=1; i<NF; i++) {if (!($i ~ /^"/)) 
               printf("\"%s\"",$i); else printf("%s",$i); if (i<NF-1) printf(",");
               else print "";}}'

4 Comments

Same problem: stray line breaks at the end of each line.
Can you show me the output of this command: head -n2 latest.csv | od -c
0000000 s k u , i t e m , p r i c e , q
0000020 t y \r \n 1 2 3 4 5 6 7 8 9 0 8 3
0000040 5 , W a l n u t   H a l f , 2 .
0000060 7 9 , - 2 4 \r \n
0000070
Any way to remove those \r?
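One possible way to check for and remove those `\r` characters, sketched with `tr`, `grep` and `printf` only. The helper names `has_cr` and `strip_cr` are hypothetical, the sample file merely stands in for latest.csv, and `mktemp -d` is assumed to be available (it is common but not required by POSIX):

```shell
#!/bin/sh
# Hypothetical helpers for illustration.

# Exit status 0 if the file contains at least one carriage return.
has_cr() {
    cr=$(printf '\r')          # a literal CR character
    grep -q "$cr" "$1"
}

# Write a copy of file $1 to $2 with every carriage return deleted.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}

# Example: a small CRLF file standing in for latest.csv.
tmp=$(mktemp -d)
printf 'sku,item,price,qty\r\n5135,ITEM1,1.79,5\r\n' > "$tmp/latest.csv"

if has_cr "$tmp/latest.csv"; then
    strip_cr "$tmp/latest.csv" "$tmp/unix.csv"
fi

# Inspect the result; no \r should appear before the \n entries.
od -c "$tmp/unix.csv"
```

Writing to a second file avoids truncating the input while `tr` is still reading it; the cleaned copy can then be fed to any of the sed or awk commands in the answers.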
