Null value in CSV with Bash

Question

I am trying to write a Bash script that checks a CSV file and returns the IDs of rows that fail certain criteria. A sample CSV is shown below. I am planning to use the [ -z {$CATEGORY} ] method to identify null (empty) cells in the CATEGORY column. However, my if statement does not seem to catch the null value in the CSV, so I need help.

ID,DATE,PRODUCT CODE,CATERGORY
1,01/01/2000,10009,1
2,02/01/2000,9999,2
3,25/01/2000,1009,3
4,15/09/2000,2001,5
5,09/25/2000,2003,4
6,09/10/01,2091,P
7,20/02/2002,3098,6
8,01/03/2003,4097,3
9,03/04/2004,5000,2
10,05/02/2013,4000,1
11,10/01/2015,9,

This is my Bash script. The null value is in the row with ID = 11.

#!/bin/bash
FILE=${1}
IFS=$'\n'
((c=-1))
for row in $(cat $FILE)
do
        ((c++))
        if ((c==0))
                then
                        continue
        fi
        IFS=','
        read ID DATE PRODUCT CATEGORY <<<${row}

                if [ -z {$CATEGORY} ];
                then
                     echo "$ID" >> file.txt
                fi
done

Don't read lines with a for loop -- there are all sorts of weird things that can go wrong. Also, changing IFS (without restricting it to a specific command) can cause other weird problems.
– Gordon Davisson, Commented Jun 28, 2022 at 8:15
If CATEGORY is empty, {$CATEGORY} expands to {}, which is not an empty string. Therefore your test will always be false, unless CATEGORY contains a space, in which case you will get a syntax error.
– user1934428, Commented Jun 28, 2022 at 8:19
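
To make the second comment concrete, here is a minimal sketch that compares the original test with a quoted expansion. CATEGORY is the variable from the question; braces_result and quoted_result are illustrative names only.

```shell
#!/bin/bash
CATEGORY=""   # simulates the empty last field of row 11

# {$CATEGORY} expands to the literal string {}, which is never empty.
if [ -z {$CATEGORY} ]; then
  braces_result="empty"
else
  braces_result="not empty"
fi

# "${CATEGORY}" expands to the value itself; the quotes keep it a single word.
if [ -z "${CATEGORY}" ]; then
  quoted_result="empty"
else
  quoted_result="not empty"
fi

echo "unquoted braces:  $braces_result"
echo "quoted expansion: $quoted_result"
```

The first test reports "not empty" even though the variable holds nothing, which is why the original if statement never fires.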

Renaud Pacalet · Accepted Answer · 2022-06-28 08:56:21Z


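-z {$CATEGORY} should be -z "${CATEGORY}": the $ goes before the opening brace, and the quotes keep the test well-formed when the value is empty or contains spaces. In addition, read ID ... <<< ${row} will assign only ID... Try: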

#!/bin/bash

while IFS=, read -r ID DATE PRODUCT CATEGORY; do
  if [[ "$CATEGORY" =~ ^[[:space:]]*$ ]]; then
    echo "$ID"
  fi
done < <( tail -n+2 "$1" ) > file.txt
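
As a small check of how the read command in this loop splits a row, the sketch below feeds it row 11 of the sample. Writing the assignment as a prefix (IFS=, read ...) changes IFS for that one command only, which addresses the concern raised in the comments on the question. The names line and ifs_before are illustrative, and a here-document is used instead of a here-string so that the snippet does not depend on how an unquoted here-string is expanded.

```shell
#!/bin/bash
line='11,10/01/2015,9,'   # row 11 from the sample: CATEGORY is empty
ifs_before=$IFS

# IFS=, applies only while this read runs.
IFS=, read -r ID DATE PRODUCT CATEGORY <<EOF
$line
EOF

echo "ID=$ID DATE=$DATE PRODUCT=$PRODUCT CATEGORY='$CATEGORY'"

# The shell-wide IFS is untouched afterwards.
if [ "$IFS" = "$ifs_before" ]; then
  echo "IFS unchanged"
fi
```

The trailing comma yields an empty fourth field, so CATEGORY ends up as the empty string and the whitespace-only pattern in the loop matches it.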

Note that awk or sed would be much faster and simpler for this (see, for instance, https://mywiki.wooledge.org/DontReadLinesWithFor). Example with awk (tested with recent BSD and GNU awk):

awk -F, 'NR>1 && $NF ~ /^[[:space:]]*$/ {print $1}' "$FILE" > file.txt

Example with sed (tested with recent BSD and GNU sed):

sed -En 's/^([^,]*).*,[[:space:]]*$/\1/p' "$FILE" > file.txt
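
Both one-liners can be tried on a shortened copy of the sample. The sketch below writes a few rows to a temporary file that stands in for "$FILE". Row 12 is not in the question's data; it is a hypothetical row whose last field holds a single space, added to exercise the whitespace case. The variable names are illustrative.

```shell
#!/bin/bash
sample=$(mktemp)   # temporary file standing in for "$FILE"

# Rows 1, 6 and 11 come from the question; row 12 is hypothetical.
printf '%s\n' \
  'ID,DATE,PRODUCT CODE,CATERGORY' \
  '1,01/01/2000,10009,1' \
  '6,09/10/01,2091,P' \
  '11,10/01/2015,9,' \
  '12,11/01/2015,10, ' > "$sample"

# Same commands as above, with the output captured instead of redirected.
awk_ids=$(awk -F, 'NR>1 && $NF ~ /^[[:space:]]*$/ {print $1}' "$sample")
sed_ids=$(sed -En 's/^([^,]*).*,[[:space:]]*$/\1/p' "$sample")

echo "awk found: $awk_ids"
echo "sed found: $sed_ids"

rm -f "$sample"
```

Both commands should report IDs 11 and 12. The sed version does not skip the header explicitly; it relies on the header line not ending in a comma.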

edited Jun 28, 2022 at 8:56

answered Jun 28, 2022 at 8:09

4 Comments

Chan Wee How Over a year ago

Renaud, thanks for your answer, but your while loop solution isn't working. I will look at the awk or sed methods you suggest.

Renaud Pacalet Over a year ago

@ChanWeeHow What is not working? I just tested with your own example and it behaves as you want.

Chan Wee How Over a year ago

@RenaudPacalet, I copied your while loop code and ran it but my file.txt is still empty.

Renaud Pacalet Over a year ago

Then you probably have spaces after the comma in your input file. Do you confirm? I updated my answer to also cover this case.
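
If file.txt stays empty, one way to confirm invisible characters is to look at the raw bytes of a suspect row. The sketch below runs od -c on a hypothetical row (not taken from the sample) that ends in a space and a carriage return, as a file saved on Windows might. The names row, last_field and dump are illustrative.

```shell
#!/bin/bash
# Hypothetical last row: trailing space plus a carriage return.
row=$(printf '11,10/01/2015,9, \r')

# Everything after the last comma is the CATEGORY field.
last_field=${row##*,}

# A plain emptiness test misses it, because the field holds " \r".
if [ -z "$last_field" ]; then
  plain_test="empty"
else
  plain_test="not empty"
fi
echo "plain -z test says: $plain_test"

# od -c prints each byte, showing the space and \r explicitly.
dump=$(printf '%s\n' "$row" | od -c)
echo "$dump"
```

On the real file, something like tail -n 1 "$FILE" | od -c shows the same information for the last row.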
