I am trying to loop through files in a directory to find an animal and its value. The command is supposed to only display the animal and total value. For example:

File1 has:

Monkey 11  
Bear 4

File2 has:

Monkey 12

If I wanted the total value of monkeys then I would do:

for f in *; do
    total=$(grep $animal $f | cut -d " " -f 2- | paste -sd+ | bc)
done
echo $animal $total

This would return the correct value of:

Monkey 23

However, if there is only one instance of an animal, for example Bear, the variable total ends up empty and only the name is echoed:

Bear

Why is this the case and how do I fix it?

Note: I'm not allowed to use the find command.

3
  • I'm surprised that you could get any output with grep '$animal' because it searches for the literal string $animal, not the value of the variable Commented Jul 25, 2022 at 20:51
  • sorry, I meant just $animal Commented Jul 25, 2022 at 20:56
  • Sorry about that, I meant to put the echo outside of the for loop Commented Jul 25, 2022 at 21:24

5 Answers

1

You could use this short awk script instead of the for loop with grep, cut, paste and bc:

awk -v animal="Bear" '
    $1 == animal { count += $2 }
    END { print count + 0 }
' *

0

Comments on OP's question about why code behaves as it does:

  • total is reset on each pass through the loop so ...
  • upon leaving the loop total will have the count from the 'last' file processed
  • in the case of Bear the 'last' file processed is File2 and since File2 does not contain any entries for Bear we get total='', which is what's printed by the echo
  • if the Bear entry is moved from File1 to File2 then OP's code should print Bear 4
  • OP's current code effectively ignores all input files and prints whatever's in the 'last' file (File2 in this case)

OP's current code generates the following:

# Monkey

Monkey 12              # from File2

# Bear

Bear                   # no match in File2
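The overwrite can be made visible by printing total after each pass. This is only a minimal sketch: it assumes a scratch directory holding the two sample files from the question, and it uses a reduced form of OP's pipeline (each sample file has at most one match, so the paste/bc summing step is left out). The directory name and the per-pass echo are illustrative additions, not part of OP's code.

```shell
#!/bin/sh
# Scratch directory with the question's sample data (illustrative setup).
dir=$(mktemp -d)
printf 'Monkey 11\nBear 4\n' > "$dir/File1"
printf 'Monkey 12\n' > "$dir/File2"

animal=Bear
for f in "$dir"/*; do
    # total is assigned, not accumulated: each pass discards the previous value.
    total=$(grep "$animal" "$f" | cut -d " " -f 2-)
    echo "after ${f##*/}: total='${total}'"
done
echo "$animal $total"
rm -rf "$dir"
```

With the sample data, total is '4' after File1 and '' after File2, so the final echo shows only the animal name.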

I'd probably opt for replacing the whole grep/cut/paste/bc (4x subprocesses) with a single awk (1x subprocess) call (and assuming no matches we report 0):

for animal in Monkey Bear Hippo
do
    total=$(awk -v a="${animal}" '$1==a {sum+=$2} END {print sum+0}' *)
    echo "${animal} ${total}"
done

This generates:

Monkey 23
Bear 4
Hippo 0

NOTES:

  • I'm assuming OP's real code does more than echo the count to stdout hence the need of the total variable otherwise we could eliminate the total variable and have awk print the animal/sum pair directly to stdout
  • if OP's real code has a parent loop processing a list of animals it's likely possible a single awk call could process all of the animals at once; objective being to have awk generate the entire set of animal/sum pairs that could then be fed to the looping construct; if this is the case, and OP has some issues implementing a single awk solution, a new question should be asked
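The single-call idea from the second note might look roughly like the following. It is a sketch under assumptions, not OP's real code: the scratch directory, the `report` variable and the echo inside the loop are placeholders for whatever the real script does with each pair.

```shell
#!/bin/sh
# Scratch directory with the question's sample data (illustrative setup).
dir=$(mktemp -d)
printf 'Monkey 11\nBear 4\n' > "$dir/File1"
printf 'Monkey 12\n' > "$dir/File2"

# One awk pass over every file; emits one "animal sum" line per animal.
# The order of "for (a in sum)" is unspecified, hence the sort.
report=$(awk 'NF >= 2 { sum[$1] += $2 }
              END { for (a in sum) print a, sum[a] }' "$dir"/* | sort)

# Feed the pairs to a loop; the echo stands in for the real processing.
printf '%s\n' "$report" | while read -r animal total; do
    echo "${animal} ${total}"
done
rm -rf "$dir"
```

Unlike the per-animal loop, this only reports animals that actually occur in the files, so something like Hippo would not get a 0 line.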


0

Why is this the case

When grep outputs nothing, nothing is propagated through the pipe and an empty string is assigned to total.

Because total is reset on every iteration (total=anything, without referencing the previous value), it only holds the value for the last file.

how do I fix it?

Do not try to do everything at once; do less in each step.

total=0
for f in *; do
    count=$(grep "$animal" "$f" | cut -d " " -f 2-)
    total=$((total + count))   # reuse total, reference previous value
done
echo "$animal" "$total"

A programmer fluent in shell will most probably jump to AWK for such problems. Remember to check your scripts with shellcheck.

With what you were trying to do, you could do all files at once:

total=$(
    {
      echo 0                # guarantees a result of 0 if the animal is not found
      grep "$animal" * |
      cut -d " " -f 2-
    } |
    paste -sd+ |
    bc
)


0

With just bash:

declare -A animals=()
for f in *; do
    while read -r animal value; do
        (( animals[$animal] = ${animals[$animal]:-0} + value ))
    done < "$f"
done
declare -p animals

outputs

declare -A animals=([Monkey]="23" [Bear]="4" )

With this approach, you have the totals for all the animals after processing each file exactly once.
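Once the array is filled, a single animal can be looked up directly. A small sketch, assuming bash 4 or later for associative arrays; the totals are hard-coded here so the snippet runs on its own, and the `result` variable and the Hippo entry are only for illustration.

```shell
#!/bin/bash
# Totals as the loop above produces them for the sample files
# (hard-coded so this snippet is self-contained).
declare -A animals=([Monkey]=23 [Bear]=4)

result=""
for animal in Monkey Bear Hippo; do
    # ${...:-0} falls back to 0 for animals that never appeared in any file.
    result+="${animal} ${animals[$animal]:-0}"$'\n'
done
printf '%s' "$result"
```

The `:-0` default plays the same role as `sum+0` in the awk answers: an animal with no entries is reported as 0 instead of an empty string.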


0
$ head File*
==> File1 <==
Monkey 11
Bear 4

==> File2 <==
Monkey 12

==> File3 <==
Bear
Monkey

Using awk and bash array

#!/bin/bash

sumAnimals(){
  awk '
     { NF == 1 ? a[$1]++ : a[$1]=a[$1]+$2 }
     END{
       for (i in a ) printf "[%s]=%d\n",i, a[i]
     }
  ' File*
}

# storing all animals in bash array
declare -A animalsArr="( $(sumAnimals) )"
# show array content
declare -p animalsArr
# getting total from array
echo "Monkey: ${animalsArr[Monkey]}"
echo "Bear: ${animalsArr[Bear]}"

Output

declare -A animalsArr=([Bear]="5" [Monkey]="24" )
Monkey: 24
Bear: 5
