5

I'm comparing two files to find the lines in one that are not in the other, using grep -v -f file1.txt file2.txt > result.txt

Let's say my files look like this:

file1.txt

alex
peter
zoey

file2.txt

alex
john
peter
zoey

So result.txt should contain john.

This is to be run inside a Jenkins job, and Jenkins is not happy when an empty result.txt is created because there are no differences between the two files.

I can't just do a blind diff/no diff output, I specifically need to know which lines differ if there are any.

Is there a neater way to run this command so that it does not create a file when there are no results?

3 Comments

  • bash is the one responsible here, not grep. A file redirection in bash will always create the file, even if the file is given no data and will thus be empty. You will have to either a) check the file size and delete it if it is empty, b) create a temporary file for the result and only move it into place if it is not empty, or c) run the diff twice, once to determine if there are any differences and again to write those differences to the file (if any differences existed). Commented Aug 29, 2017 at 18:15
  • Alternatively, is there a way you can write the rest of the Jenkins job to not need a result.txt file? Or would a temporary file be sufficient for what you are trying to accomplish? Commented Aug 29, 2017 at 18:16
  • To extend what 0x5453 was saying -- in grep foo >out, the file out is created before grep is ever started (otherwise, grep would have no stdout to which to write its output). Commented Aug 29, 2017 at 19:24
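
Option (b) from the first comment is not shown in any of the answers, so here is a minimal sketch of it. It assumes mktemp is available and recreates the sample files from the question so it can be run on its own; treat it as an illustration rather than a tested Jenkins recipe.

```shell
#!/usr/bin/env bash
# Sample inputs from the question, recreated so the sketch runs on its own.
printf '%s\n' alex peter zoey > file1.txt
printf '%s\n' alex john peter zoey > file2.txt

# Write to a temporary file first; result.txt is not touched yet.
tmp=$(mktemp)
grep -v -f file1.txt file2.txt > "$tmp"

if [ -s "$tmp" ]; then
    mv "$tmp" result.txt   # non-empty: move it into place
else
    rm -f "$tmp"           # empty: discard it, so result.txt is never created
fi
```

Because the redirection targets the temporary file, result.txt only ever appears when grep actually selected some lines.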

5 Answers

5

EDIT: Try running the command first with the quiet option -q, which exits as soon as one match is found and so saves time. If that first run finds a match (meaning file2.txt contains lines that are not in file1.txt), run the same command again and redirect its output to your file.

Try this (edit taken from Charles Duffy's comment):

#!/usr/bin/env bash

if grep -qvf file1.txt file2.txt 
then
   grep -vf file1.txt file2.txt > output.txt
   echo "Differences exist. File output.txt created." 
else
   echo "No difference between the files detected"
fi

Or, more compactly, on one line:

grep -qvf file1.txt file2.txt && grep -vf file1.txt file2.txt > output.txt

7 Comments

Better to run if grep -v -f file1 file2.txt; then ... and not need to check $?. And if you want to suppress output, use -q -- otherwise, the test run will be emitting output to stdout (and running longer than it needs to to get a result, whereas grep -q can exit as soon as it finds a single match)
Thanks @CharlesD. All good points and have been added to the example.
Better to put the -q earlier (though of course it can't be between the -f and the following filename). GNU tools allow optional flags anywhere on a command line, but the POSIX standard only requires them to be supported at the front, so folks on a non-GNU platform might see it looking for a file named -q.
@CharlesDuffy Thanks, I was wondering why it wouldn't work after the f. Just changed it to reflect that.
The others had totally valid solutions, but this is the only approach that worked within Jenkins. Thanks!
5

Could you do something as simple as removing the file if it's empty?

Solution #1:

grep -v -f file1.txt file2.txt > result.txt
[[ $? -ne 0 ]] && 'rm' -f result.txt
  • grep exits with a non-zero status when it selects no lines (1 for no selected lines, greater than 1 for an error)
  • quoting the rm command ('rm') ensures that no alias for rm is called
  • -f suppresses the error message if the file does not exist (this shouldn't happen, but it doesn't hurt to code for it anyway)

Solution #2:

grep -v -f file1.txt file2.txt > result.txt
[[ ! -s result.txt ]] && 'rm' -f result.txt
  • -s => file exists and is greater than 0 in size
  • ! -s => file doesn't exist or file exists and is 0 in size
  • 'rm' -f ... same explanation as for solution #1

1 Comment

Better to use if ! grep ...; then rm or grep ... || rm and not need to check $? explicitly; code that checks $? is fragile -- adding echos to improve logging can break it, and one needs to follow more context (reading vertically as opposed to horizontally) to understand how pieces are connected.
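
As a rough sketch of what this comment suggests, the grep ... || rm form might look like the following. The sample files are recreated from the question, and same.txt is a hypothetical name used only to show the no-difference case.

```shell
#!/usr/bin/env bash
# Sample inputs from the question, recreated so the sketch runs on its own.
printf '%s\n' alex peter zoey > file1.txt
printf '%s\n' alex john peter zoey > file2.txt

# Differences exist: grep exits 0, rm is skipped, and result.txt is kept.
grep -v -f file1.txt file2.txt > result.txt || rm -f result.txt

# No differences (file1.txt compared with itself): grep exits 1,
# so the empty file created by the redirection is removed again.
# same.txt is a hypothetical file name for this illustration.
grep -v -f file1.txt file1.txt > same.txt || rm -f same.txt
```

Note that rm also runs if grep fails with an error, since any non-zero exit status triggers the right-hand side of ||.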
3

You can use this awk command to create the output file only when there are differences:

awk 'NR==FNR{a[$1]; next} !($1 in a){b[$1]} END{
if (length(b) > 0) for (i in b) print i > "result.txt"}' file1.txt file2.txt

The output file result.txt is created only when there is data to write, thanks to the length(b) > 0 check.

3

This should work for your case:

out=$(grep -v -f file1.txt file2.txt); [[ -n "$out" ]] && echo "$out" > result.txt

2 Comments

Good approach. That said, I'd use printf '%s\n' "$out" -- lots of places where echo behavior is poorly-defined or nonportable between shells (see for example ksh following XSI extensions to POSIX and expanding backslash-escape sequences by default).
I like that this approach avoids running potentially time-consuming commands twice (at the cost of higher memory consumption).
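
Putting the printf suggestion from the first comment together with the answer gives something like the sketch below. It recreates the question's sample files so it runs on its own, and it writes to result.txt as in the question.

```shell
#!/usr/bin/env bash
# Sample inputs from the question, recreated so the sketch runs on its own.
printf '%s\n' alex peter zoey > file1.txt
printf '%s\n' alex john peter zoey > file2.txt

# Capture grep's output in a variable; no file is created at this point.
out=$(grep -v -f file1.txt file2.txt)

# Only redirect to result.txt when something was captured.
# printf '%s\n' writes the text as-is and restores the trailing newline
# that command substitution stripped.
if [ -n "$out" ]; then
    printf '%s\n' "$out" > result.txt
fi
```

The whole result is held in memory, so this fits best when the list of differing lines is expected to be small.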
0

Using grep && grep. Positive result:

$ grep -q -m 1 -v -f file1 file2 && grep -v -f file1 file2 > out1 
$ cat out1
john

and negative:

$ grep -q -m 1 -v -f file1 file1 && grep -v -f file1 file1 > out2
$ cat out2
cat: out2: No such file or directory

The first grep exits after the first match to speed things up, but in the worst case the total runtime is still about twice that of a single grep.

Another way, another awk:

$ awk -v outfile=out3 '
NR==FNR { 
    a[$0]
    next
}
!($0 in a) {
    print $0 > outfile  # file is only opened when there is a match
}' file1 file2
$ cat out3
john

That awk compares whole lines exactly; unlike grep -f, which treats each line of file1 as a pattern, it won't recognize partial matches.
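
To make the partial-match difference concrete, here is a small sketch. The file names names1.txt and names2.txt and the name alexander are made up for this illustration; -F (fixed strings) and -x (whole-line match) are standard grep options that should bring grep's behavior close to that of the awk version.

```shell
#!/usr/bin/env bash
# Hypothetical sample data: "alex" is a substring of "alexander".
printf '%s\n' alex peter > names1.txt
printf '%s\n' alex alexander peter > names2.txt

# Plain -f: the pattern "alex" also matches inside "alexander",
# so -v filters that line out and nothing is printed.
grep -v -f names1.txt names2.txt

# -F treats the patterns as fixed strings and -x requires a whole-line
# match, so "alexander" is reported as a line missing from names1.txt.
grep -F -x -v -f names1.txt names2.txt
```

Whether partial matches should count as "the same line" depends on the data, so the choice between the two forms is a judgment call.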
