Using grep in while loop breaks the loop

Question

I want to write a Bash script that prints the least frequently occurring line of standard input.

I wrote this code:

#!/bin/bash
var=1000
while read line
do
    tmp=$(grep -c $line)
    if [ $tmp -lt $var ]
    then
        var=$tmp
        out=$line
    fi
done
var="$var $out"
echo $var

but e.g. when using a test like this

id1
id2
id3
id1
square
id1
id2
id3
id1
circle
id2
id2

the program only enters the loop once, so it gives the wrong output

3 id1

when the correct one should be

1 square

This line

tmp=$(grep -c $line)

seems to be breaking the loop but I can't find out why. Is there any way to bypass using grep in my code or any other way to fix my script?

Why is circle your expected output? It is neither the last repeating nor the last unique line in your example.
– tripleee, Commented Apr 30, 2016 at 10:26
It should be the least repeating, not the last repeating ;) Still, your answer below helped me a lot ;)
– Konrad, Commented Apr 30, 2016 at 10:55
So do you mean the first unique line, then? You have multiple unique lines; they are all the least repeating.
– tripleee, Commented Apr 30, 2016 at 10:58
No, I guess my English skills didn't let me make this clear enough. If there is a unique line in stdin, it should print that line. Say we have one line containing the word square, two lines containing the word circle, and three lines containing the word triangle. It should print "square" because it appears only once in the file (the fewest times).
– Konrad, Commented Apr 30, 2016 at 11:11
That much is clear, but if there are three of each, do you only want the first one?
– tripleee, Commented Apr 30, 2016 at 11:13

jil · Accepted Answer · 2016-04-30 11:07:26Z

2

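The problem in your code is that this grep

    tmp=$(grep -c $line)

has no file argument, so it reads from stdin, the same stream the while loop is reading. On the first iteration, read puts the first line into $line, and grep then consumes all the remaining lines while searching them for that string. When the loop returns to read, stdin is already at end of file, so the loop ends after a single pass. That also explains the output 3 id1: id1 occurs three more times after the first line.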
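The effect can be reproduced in isolation. The following is only a minimal sketch, not part of the original answer; the function names count_iterations_greedy and count_iterations_safe are invented for illustration. Both functions run the same loop over three input lines, but the second one redirects grep's stdin from /dev/null so that grep cannot drain the loop's input.

```shell
#!/bin/bash
# Hypothetical helper: counts loop iterations when grep shares the loop's stdin.
count_iterations_greedy() {
    local n=0 line
    while IFS= read -r line; do
        n=$((n + 1))
        # No file argument: grep reads (and drains) the rest of stdin.
        grep -c -- "$line" >/dev/null
    done
    echo "$n"
}

# Hypothetical helper: same loop, but grep is given its own (empty) stdin.
count_iterations_safe() {
    local n=0 line
    while IFS= read -r line; do
        n=$((n + 1))
        grep -c -- "$line" </dev/null >/dev/null
    done
    echo "$n"
}

greedy=$(printf 'id1\nid2\nid3\n' | count_iterations_greedy)
safe=$(printf 'id1\nid2\nid3\n' | count_iterations_safe)
echo "greedy: $greedy, safe: $safe"   # prints "greedy: 1, safe: 3"
```

Redirecting from /dev/null only shows where the input goes; it does not make the counts meaningful, because grep then has nothing to search.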

You could fix your code by using a temporary file, e.g.:

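#!/bin/bash
# Save stdin to a temporary file so it can be read more than once.
tmpfile=$(mktemp)
cat > "$tmpfile"
min=0
# Loop over each distinct line; the loop reads from sort, not from stdin.
while IFS= read -r line; do
    # -F: fixed string, -x: match whole lines only, --: end of options
    count=$(grep -cFx -- "$line" "$tmpfile")
    if (( min == 0 || count < min )); then
        min=$count
        out="$min $line"
    fi
done < <(sort -u "$tmpfile")
rm -- "$tmpfile"
echo "$out"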

But this is of course quite a horrible solution, as it uses a temporary file and reads the input file once for every distinct line. It would be better to use something like:

#!/bin/bash
sort | uniq -c | sort -n | head -n 1
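As a rough illustration of what this pipeline produces, the sketch below runs it on the sample input from the question. The wrapper name least_frequent and the trailing sed step are additions for this example, not part of the answer: uniq -c pads the count with leading blanks, and sed strips them. Because the final ordering comes from sort, ties between equally rare lines are broken by sort order, not by position in the input, so this prints circle, not square.

```shell
#!/bin/bash
# Hypothetical wrapper around the pipeline, for illustration only.
least_frequent() {
    # Count each distinct line, order by count, keep the smallest,
    # then strip the leading padding that uniq -c adds before the count.
    sort | uniq -c | sort -n | head -n 1 | sed 's/^[[:space:]]*//'
}

# Sample input taken from the question.
result=$(printf '%s\n' id1 id2 id3 id1 square id1 id2 id3 id1 circle id2 id2 | least_frequent)
echo "$result"   # prints "1 circle"
```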

edited Apr 30, 2016 at 11:07

answered Apr 30, 2016 at 10:58

jil


1 Comment

Konrad Over a year ago

Thank you for your answer :)

tripleee · Accepted Answer · 2016-04-30 11:12:48Z

0

The grep command reads the remainder of standard input. You will need to copy the input to a temp file if you want to both grep it and do something else with it.

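If what you want is the last repeated line, a much simpler solution is the following (note that uniq only compares adjacent lines, so unsorted input needs a sort first):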

uniq -d | tail -n 1

More generally, running grep on each line in a loop over a file is an antipattern, which often suggests moving to Awk or sed instead if you can't find a simple pipeline of standard tools to accomplish your goal.
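For the least-repeating case discussed in the comments, an Awk version might look like the sketch below. It is one possible approach and not code from either answer; the function name least_repeated and the choice to prefer the first-seen line on a tie are assumptions made for this example. It reads the input once, counts every distinct line, and remembers the order in which lines first appeared.

```shell
#!/bin/bash
# Hypothetical helper: prints "<count> <line>" for the least frequent line,
# preferring the line that appeared first in the input when counts tie.
least_repeated() {
    awk '
        {
            # Remember first-seen order, then count the line.
            if (!($0 in count)) order[++n] = $0
            count[$0]++
        }
        END {
            for (i = 1; i <= n; i++) {
                line = order[i]
                if (i == 1 || count[line] < min) {
                    min = count[line]
                    best = line
                }
            }
            if (n > 0) print min, best
        }
    '
}

# Sample input taken from the question.
result=$(printf '%s\n' id1 id2 id3 id1 square id1 id2 id3 id1 circle id2 id2 | least_repeated)
echo "$result"   # prints "1 square"
```

Since the whole input is held in Awk arrays, memory use grows with the number of distinct lines, which is usually acceptable for input of this kind.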

edited Apr 30, 2016 at 11:12

answered Apr 30, 2016 at 10:24

tripleee


1 Comment

Konrad Over a year ago

Thanks, you helped me a lot!
