GREP command in loop

Question

I have about 3000 files in a folder. My files have data as given below:

VISITERM_0 VISITERM_20 VISITERM_35 ..... and so on

Not every file contains the same values as above. The numbers range from 0 to 99.

I want to find out how many files in the folder have each of the VISITERMS. For example, if VISITERM_0 is present in 300 files in the folder, then I need it to print

VISITERM_0  300

Similarly, if 1000 files contain VISITERM_1, I need it to print VISITERM_1 1000

So I want to print each VISITERM and the number of files that contain it, from VISITERM_0 through VISITERM_99.

I used the following grep command:

 grep VISITERM_0 * -l | wc -l

However, this handles only a single term, and I want to loop it from VISITERM_0 through VISITERM_99. Please help!

It's unclear what you ask. Please reformulate your question... – willeM_ Van Onsem, commented Mar 2, 2015 at 23:33

Charles Duffy · Accepted Answer · 2015-03-02 23:43:20Z


#!/bin/bash
# ^^- the above is important; #!/bin/sh would allow only POSIX syntax

# use a C-style for loop, which is a bash extension
for ((i=0; i<100; i++)); do
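  # Calculate number of matching files (-w matches whole words only, so
  # VISITERM_1 does not also count files that only contain VISITERM_10)...
  num_matches=$(find . -type f -exec grep -l -w -e "VISITERM_$i" '{}' + | wc -l)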
  # ...and print the result.
  printf 'VISITERM_%d\t%d\n' "$i" "$num_matches"
done

edited Mar 2, 2015 at 23:43

answered Mar 2, 2015 at 23:37


2 Comments

Charles Duffy Over a year ago

...then change your loop to start at 100 and proceed to 0, in the exact same way you would do so in C.

Rudhra Over a year ago

One more question. Suppose I want to reorder the entire output in descending order of the number of files. For instance, if the results are VISITERM_0 300, VISITERM_1 150, VISITERM_2 400, I need them arranged as VISITERM_2 400, VISITERM_0 300, VISITERM_1 150. How can that be done?
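
One possible way to get that ordering, sketched under the assumption that the output consists of tab-separated name and count lines as printed by the loop in the answer, is to pipe it through sort. The variable `counts` below is a hypothetical stand-in for the loop's output and holds the three sample lines from the comment.

```shell
#!/bin/bash
# Hypothetical sample data standing in for the output of the counting loop.
counts=$(printf 'VISITERM_0\t300\nVISITERM_1\t150\nVISITERM_2\t400\n')

# -k2,2 restricts the sort key to the second field;
# n compares numerically, r reverses the order (largest count first).
sorted=$(printf '%s\n' "$counts" | sort -k2,2nr)

printf '%s\n' "$sorted"
```

With the loop from the answer, this would amount to appending `| sort -k2,2nr` after `done`.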

Jotne · Accepted Answer · 2015-03-03 07:34:30Z

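Here is a GNU awk solution (GNU awk is needed because RS contains more than one character) that should do it. Note that it counts occurrences across all files, so a term that appears twice in one file is counted twice: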

awk -v RS=" |\n" '{n=split($1,a,"VISITERM_");if (n==2 && a[2]<100) b[a[2]]++} END {for (i in b) print "VISITERM_"i,b[i]}' *

Example:

cat file1
VISITERM_0 VISITERM_320 VISITERM_35

cat file2
VISITERM_0 VISITERM_20 VISITERM_32
VISITERM_20 VISITERM_42 VISITERM_11

Gives:

awk -v RS=" |\n" '{n=split($1,a,"VISITERM_");if (n==2 && a[2]<100) b[a[2]]++} END {for (i in b) print "VISITERM_"i,b[i]}' file*
VISITERM_0 2
VISITERM_11 1
VISITERM_20 2
VISITERM_32 1
VISITERM_35 1
VISITERM_42 1

How it works:

awk -v RS=" |\n" '              # Set the record separator to a space or a newline
    {n=split($1,a,"VISITERM_")  # Split the record on "VISITERM_" and store the number of pieces in "n"
    if (n==2 && a[2]<100)       # If "n" is 2 (the record contains "VISITERM_") and the number is less than 100
        b[a[2]]++}              # Count the hits for each number and store them in array "b"
END {for (i in b)               # Walk through array "b"
    print "VISITERM_"i,b[i]}    # Print the hits
' file*                         # Read the files

PS
If everything is on a single line, change this to RS=" ". Then it should work with most awk implementations.
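
The awk command in this answer increments its counter for every occurrence, while the question asks for the number of files containing each term. A minimal sketch of a per-file variant follows. It is not part of the original answer, and the `seen` and `count` array names and the temporary sample files are assumptions made for illustration. It relies on the default record and field separators, so it should not need GNU awk.

```shell
#!/bin/bash
# Build the two sample files from the example in a temporary directory.
dir=$(mktemp -d)
printf 'VISITERM_0 VISITERM_320 VISITERM_35\n' > "$dir/file1"
printf 'VISITERM_0 VISITERM_20 VISITERM_32\nVISITERM_20 VISITERM_42 VISITERM_11\n' > "$dir/file2"

per_file=$(awk '
{
    for (f = 1; f <= NF; f++) {
        # Keep only terms with one or two digits (VISITERM_0 .. VISITERM_99).
        if ($f !~ /^VISITERM_[0-9][0-9]?$/) continue
        # Count a term once per file, however often it repeats there.
        if (!seen[FILENAME, $f]++) count[$f]++
    }
}
END { for (t in count) print t, count[t] }
' "$dir"/file* | sort -t_ -k2,2n)   # order the lines by term number

printf '%s\n' "$per_file"
rm -rf "$dir"
```

On this sample data VISITERM_20 is reported once, because both of its occurrences are in file2, and VISITERM_320 is skipped.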
