1

I'm very inexperienced with shell scripts, and I need to write one that deletes an entire row when the column named Views contains the value 0. The Views column may not always be in the same position in the file, so I need some way to find its position beforehand. Is this feasible with sed or awk, or is there something else I can use?

Thanks!

2
  • Can you show example input and output? I'd like to see the way the headers are formatted, in particular. Commented Feb 16, 2015 at 18:54
  • @Wintermute Hey, yes, it's just a standard CSV. The headers are the first line of the file: Date,....,Views,...,URL. The sample output would be the exact same CSV file, just with the rows that have 0 views removed. Commented Feb 16, 2015 at 18:58

3 Answers

4

With awk, this could be done like this:

awk -F, 'NR == 1 { for(i = 1; i <= NF; ++i) { col[$i] = i }; next } $col["Views"] != 0' filename.csv

-F, sets the field separator to a comma, since you mentioned a CSV file. The code is

NR == 1 {                    # in the first line
  for(i = 1; i <= NF; ++i) { # go through all fields
    col[$i] = i              # remember their index by name.
                             # ($i is the ith field)
  }
  next                       # and do nothing else
}

$col["Views"] != 0           # after that, select lines in which the field in
                             # the column that was titled "Views" is not zero,
                             # and do the default action on them (i.e., print)

Note that this will only filter out lines where the Views column is exactly 0. If you also want to filter out lines where the Views field is empty, use $col["Views"] instead of $col["Views"] != 0.
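Since the asker wants the output to be the same CSV minus the zero-view rows, here is a minimal sketch of a variant that also prints the header line (the command above consumes the header with next and does not print it). The file name sample.csv and the Date and URL values are made up for illustration.

```shell
# Hypothetical sample data; only the Views column name comes from the question.
printf '%s\n' 'Date,Views,URL' '2015-02-14,12,/a' '2015-02-15,0,/b' '2015-02-16,3,/c' > sample.csv

# Record the column positions from the header, print the header, then filter.
filtered=$(awk -F, '
  NR == 1 {
    for (i = 1; i <= NF; ++i) col[$i] = i   # map header name -> field number
    print                                   # keep the header line in the output
    next
  }
  $col["Views"] != 0                        # print rows whose Views is not 0
' sample.csv)

printf '%s\n' "$filtered"
```

The only change from the original command is the added print before next, so the filtering logic stays the same.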


6 Comments

This looks pretty good, but the only issue is that it's just being output on the console. I need those rows to be physically deleted from the file. Is this possible with awk?
With GNU awk 4.1.0 or later, use awk -i inplace same_as_before_here. Or, because it is nice to have a backup in case the power goes out at the wrong moment, cp foo.csv foo.csv~ && awk same_as_before foo.csv~ > foo.csv.
It doesn't look like those rows are being skipped. Are we sure this part is right: col[$i] = i? i is the index of the field, isn't it? So $col["Views"] would be set to the index rather than the actual value contained in that column, which is what needs to be checked against 0 for every line in the file.
$i is not the value of i, it's the value of the ith field. In the same vein, $col["Views"] is the value of the col["Views"]th field. Can you add some input data to the question so I can see what's different in my guessed test data? It works for me.
Looks like I copied it wrong. It's working now for me. Thanks!
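For awks without -i inplace, the usual workaround discussed in these comments can be sketched as below: write to a temporary file and replace the original only if awk succeeded. The file name foo.csv and its contents are placeholders, and this version prints the header line as well, which is an assumption based on the asker's description of the expected output.

```shell
# Hypothetical file name and data, for illustration only.
printf '%s\n' 'Date,Views,URL' '2015-02-14,0,/a' '2015-02-15,7,/b' > foo.csv

tmp=$(mktemp) || exit 1

# Write the filtered rows to the temporary file; replace the original
# only if awk exited successfully, otherwise leave foo.csv untouched.
if awk -F, '
     NR == 1 { for (i = 1; i <= NF; ++i) col[$i] = i; print; next }
     $col["Views"] != 0
   ' foo.csv > "$tmp"
then
  mv "$tmp" foo.csv
else
  rm -f "$tmp"
fi
```

Note that mktemp creates the file with restrictive permissions, so the replaced foo.csv may end up with a different mode than before; adjust with chmod if that matters.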
0
awk -F ',' 'NR==1{print;for(i=1;i<=NF;++i){if($i=="Views"){y=i}}};NR>1{if($y!=0){print}}' file > new_file

Breakdown of the code:

NR==1{                    # for the first line:
    print                 # print it (this keeps the header),
    for(i=1;i<=NF;++i){   # then loop over all columns to find the
        if($i=="Views"){  # one named "Views" in the first row
            y=i           # and save its column number in y
        }
    }
}

NR>1{                     # from line 2 onwards, look at
    if($y!=0){            # the Views column;
        print             # if it does not contain 0, print the line
    }
}
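One thing to watch with this approach: if no header is named Views, y stays unset, and $y then refers to the whole line ($0) rather than to a column. A minimal sketch of a guard for that case follows; the file noviews.csv, its contents, and the error message are made up, and writing to "/dev/stderr" is assumed to work on your system (it does with gawk and on typical Linux and macOS setups).

```shell
# Hypothetical input whose header has no Views column.
printf '%s\n' 'Date,Hits,URL' '2015-02-14,0,/a' > noviews.csv

status=0
awk -F, '
  NR == 1 {
    for (i = 1; i <= NF; ++i) if ($i == "Views") y = i
    if (!y) {                                  # column not found: stop early
      print "no Views column found" > "/dev/stderr"
      exit 1
    }
    print                                      # keep the header
    next
  }
  $y != 0                                      # print rows whose Views is not 0
' noviews.csv || status=$?

echo "awk exit status: $status"
```

Checking the exit status lets a calling script avoid overwriting the original file when the column is missing.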


0
awk '($1 == "badString") && !($1 ~ /[.]/) { next } 1' inputfile > outputfile

# Skips lines whose first field is exactly badString and contains no dot; every other line is printed to outputfile.
