I am trying to sum the values in one column of a CSV file, where the column is selected by a name given as input. Here's my code:

#!/bin/bash

updatedata() {

    index=0
    while IFS="" read -r line
    do
        IFS=';' read -ra array <<< "$line"
        for arrpos in "${array[@]}"
        do
            if [ "$arrpos" == *"$1"* ] || [ "$1" == "$arrpos" ]
            then
                break
            else
                let index=index+1
            fi
        done
        break
       
    done < data.csv
    ((index=$index+1))


       
    if [ $pos -eq 0 ]
    then
        v0=$(awk -F";", -v index=$index '{x+=$index}END{print x}' ./data.csv )
    elif [ $pos -eq 1 ]
    then
        v1=$(awk -F";" '{x+=$index}END{print x}' ./data.csv )
    elif [ $pos -eq 2 ]
    then
        v2=$(awk -F";" '{x+=$index}END{print x}' ./data.csv )
    elif [ $pos -eq 3 ]
    then
        v3=$(awk -F";" '{x+=$index}END{print x}' ./data.csv )
    fi
               
                   
         
}

In the middle of the code, at the v0= assignment, you can see that I was experimenting a little, but I just keep getting errors:

First I tried this:

v0=$(awk -F";" '{x+=$index}END{print x}' ./data.csv)

but it gave me this error:

'awk: line 1: syntax error at or near }'

So then I decided to try this (as you can see in the code):

v0=$(awk -F";", -v index=$index '{x+=$index}END{print x}' ./data.csv )

And I got this error: 'awk: run time error: cannot command line assign to index type clash or keyword FILENAME="" FNR=0 NR=0'

I don't know what to do. Can you help me?

  • index is a built-in awk function. You may want to use another name for this variable (and use $(varname) in awk). You also should not have a comma after -F ';'. Not turning this into an answer as a real answer should probably also point to better ways of doing this operation (the shell loop is probably not needed). Commented Aug 28, 2020 at 9:33
  • See "Why is using a shell loop to process text considered bad practice?". If you edit your question to include concise, testable sample input and expected output then we could help you do whatever it is you're trying to do the right way. Commented Aug 28, 2020 at 13:37
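
The fixes suggested in the first comment could look roughly like the sketch below. It assumes the 1-based column number is already known in the shell. The awk variable name col, the file name sample.csv and its contents are illustrative choices, not part of the original script.

```shell
#!/bin/sh
# Hypothetical sample file; the first line is a header.
printf '%s\n' 'A;B;C' '1;2;3' '4;5;6' > sample.csv

col=2   # 1-based column number, e.g. as computed by the shell loop

# -F ';' has no trailing comma, and the awk variable is called "col"
# because "index" is the name of a built-in awk function.
# NR > 1 skips the header line so that it is not added to the sum.
total=$(awk -F ';' -v col="$col" 'NR > 1 { x += $col } END { print x }' sample.csv)
echo "$total"
```

Inside the single-quoted awk program, $col is awk's own field reference (field number col), not a shell expansion. The shell value only reaches awk through -v.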

1 Answer

Given some semicolon-delimited CSV data in data.csv,

A;B;C
1;2;3
4;5;6
-1.2;3;3.3

the following script would calculate the sum of the column named by the colname variable given on the command line:

BEGIN {
        FS = ";"

        if (colname == "") {
                print "Did not get column name (colname) to work with" >"/dev/stderr"
                exit 1
        }
}

FNR == 1 {
        colnum = 0

        for (i = 1; i <= NF; ++i)
                if ($i == colname) {
                        colnum = i
                        break
                }

        if (colnum == 0) {
                printf "Did not find named column (colname = \"%s\")\n", colname >"/dev/stderr"
                exit 1
        }

        sum = 0
        next
}

{
        sum += $colnum
}

END {
        print sum
}

Testing it:

$ awk -v colname='A' -f script.awk data.csv
3.8
$ awk -v colname='B' -f script.awk data.csv
10
$ awk -v colname='C' -f script.awk data.csv
12.3
$ awk -v colname='D' -f script.awk data.csv
Did not find named column (colname = "D")

A shorter variant of the script, with less error checking:

BEGIN { FS = ";" }

FNR == 1 {
        for (i = 1; i <= NF; ++i)
                if ($i == colname) break

        if (i > NF) exit 1
        next
}

{ sum += $i }

END { print sum }

or, as a "one-liner":

$ awk -v colname='A' -F ';' 'FNR == 1 { for (i = 1; i <= NF; ++i) if ($i == colname) break; if (i > NF) exit 1; next } { sum += $i } END { print sum }' data.csv

Ideally, though, you'd use some form of CSV parser, like CSVkit:

$ csvstat --sum -c A data.csv
3.8

The csvstat utility calculates several different statistics for any given CSV file. Here, it figures out that the delimiter is ; on its own. In this example, I ask for the sum of the column named A.

... or using Miller,

$ mlr --csv --fs semicolon stats1 -a sum -f A data.csv
A_sum
3.8
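
To tie this back to the shell function in the question, the awk approach can be wrapped in a small shell function whose output is captured in a variable, with no shell loop needed. This is only a sketch: the function name sum_column and the missing flag are hypothetical additions, and the sample file is recreated here so that the example is self-contained.

```shell
#!/bin/sh
# Recreate the sample data used above.
printf '%s\n' 'A;B;C' '1;2;3' '4;5;6' '-1.2;3;3.3' > data.csv

# sum_column NAME FILE: print the sum of the column whose header is NAME.
# Exits with status 1 and prints nothing if no such column exists.
# (sum_column is a hypothetical helper name.)
sum_column() {
    awk -v colname="$1" -F ';' '
        FNR == 1 {
            # Find the field number of the named column in the header.
            for (i = 1; i <= NF; ++i)
                if ($i == colname) break
            if (i > NF) { missing = 1; exit 1 }
            next
        }
        { sum += $i }
        END {
            # exit in the header rule still runs END, so check the flag here.
            if (missing) exit 1
            print sum
        }
    ' "$2"
}

v0=$(sum_column A data.csv)
echo "$v0"
```

The column name is passed with -v rather than being spliced into the awk program text, so the program itself can stay in single quotes.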
