I'm trying to read two text files so I can check whether some fields are the same in both. I can easily extract the fields on the command line, but something goes wrong when I do it from a bash script.

I generate the file (a list of files) in a for loop (I tried both the echo and printf commands):

printf "$servidor1$gfs1_dir$gfs1_file\n" >> server1

Here is the output of the cat command:

cat server1

ftp://server/pub/data/nccf/com/gfs/prod/gfs.2014011400/gfs.t00z.pgrbf00.grib2
ftp://server/pub/data/nccf/com/gfs/prod/gfs.2014011400/gfs.t00z.pgrbf06.grib2

If I try from the command line it runs fine. Two lines/records in the file are shown:

awk 'BEGIN { FS="/"} {print $11}' server1

gfs.t00z.pgrbf00.grib2
gfs.t00z.pgrbf06.grib2

But when I try to select a single line with FNR, the result is wrong: both lines are still printed (in the script, awk is used to build a variable named fremote):

awk 'BEGIN { FS="/"} { RS="\n"} {FNR == 1} {print $11}' server1

gfs.t00z.pgrbf00.grib2
gfs.t00z.pgrbf06.grib2

The same happens when I create the fremote variable in the bash script (i is the loop variable in the script):

i=1
fremote=`awk -v i=$i 'BEGIN { FS="/"} { RS="\n"} {FNR == i} {print $11}' servidor1-file.list`

echo $fremote

gfs.t00z.pgrbf00.grib2 gfs.t00z.pgrbf06.grib2

Maybe it is related to the way the server1 file is created, or maybe to how awk reads it. I can't pinpoint the problem.

Thanks in advance for your help. I'll keep working on this issue and post the answer if I find it.

EDIT

Following the comments, I am adding the part of the bash script where awk is invoked (I hope it helps to show what I'm trying to do). I have two files: a list of local files and a list of remote files on the server. I try to build two variables, flocal and fremote, and check whether they are the same. Maybe there are easier and smarter ways to check.

while [ $i -le $nlocal ]  
   do
   flocal=`awk -v i=$i 'FNR == i {print $1}' lista.local`
   fremote=`awk -v i=$i 'BEGIN { FS="/"} {FNR == $i} {print $11}' $2`

   if [ "$flocal" != "$fremote" ]; then 
      echo "Some file missing"  >> $log
      flag_check_descarga=0
   else
      contador=$(($contador + 1))
      echo $contador "Download OK" $flocal  >> $log
   fi
   i=$(( $i + 1 ))
done
  • usually FS,OFS,RS,ORS... should be set in BEGIN block. btw, why you want to set FNR? and what do you want to get? Commented Jan 14, 2014 at 15:35
  • What are you doing with RS there? Why are you setting it for each line? == is equality not assignment. If you get an error you should show us the error text. I don't know that setting FNR is going to do what you expect (I don't know what awk is going to do when you change that, possibly nothing). Commented Jan 14, 2014 at 15:36
  • @Kent In the script I use FNR == i where i is the loop variable as I want to check every single line. Commented Jan 14, 2014 at 15:46
  • @EtanReisner There's no error message. It just does not show the expected results, my fault. Commented Jan 14, 2014 at 15:47
  • Your goal here, ultimately, is to find out whether, for each file on the local disk, there is a similar file (by name) on the remote server? If so then you want something more like awk -F/ 'NR==FNR{files[$1]=1; next} !($11 in files) {printf "File missing: %s\n", $11}' lista.local server1 or something like that. Commented Jan 14, 2014 at 15:52

1 Answer

Your syntax is wrong.

awk -v i="$i" 'BEGIN { FS="/"; RS="\n"}
    FNR == i {print $11}' server1

The BEGIN { ... } block contains actions to perform when the script is starting. The FNR==i { ... } block contains actions to perform when reading the i-th line of a file.

An unconditional block { ... } contains actions to perform unconditionally, i.e. for every input line. But FNR==i is not a meaningful action; it is just a boolean which is true when the line number of the file is equal to i. It is an excellent condition but as an action it doesn't do anything (that you can detect from the outside).
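To make the difference concrete, here is a minimal sketch that runs both forms against a throwaway file. The scratch file from mktemp, its contents, and the variable names as_action and as_pattern are made up for illustration; they are not part of the original script.

```shell
# Scratch file with two slash-separated records (sample data only).
tmp=$(mktemp)
printf '%s\n' 'a/b/first' 'a/b/second' > "$tmp"

# Inside braces, FNR == i is an expression whose result is discarded,
# so the following print block still runs for every line.
as_action=$(awk -v i=1 -F/ '{ FNR == i } { print $3 }' "$tmp")

# Outside braces, FNR == i is a pattern, so print runs only on line i.
as_pattern=$(awk -v i=1 -F/ 'FNR == i { print $3 }' "$tmp")

printf 'as action:\n%s\n' "$as_action"
printf 'as pattern:\n%s\n' "$as_pattern"
rm -f "$tmp"
```

The first form prints both names, which matches what the question shows; the second prints only the name from line 1.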

However, the task you appear to be trying to solve would be easier to solve with a single Awk script -- the one posted by @EtanReisner in a comment looks good to me -- or just

comm -23 <(sort lista.remote) <(sort lista.local)

or even, if the files are already sorted,

comm -23 lista.remote lista.local
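As a quick illustration of what -23 selects, here is a small sketch with made-up lists. The file names a.grib2, b.grib2 and c.grib2 and the temporary files are assumptions for the demo only.

```shell
# Sample lists for illustration only; both must be sorted for comm.
remote_list=$(mktemp)
local_names=$(mktemp)
printf '%s\n' 'b.grib2' 'a.grib2' 'c.grib2' | sort > "$remote_list"
printf '%s\n' 'c.grib2' 'a.grib2' | sort > "$local_names"

# -2 suppresses lines found only in the second file, -3 suppresses lines
# common to both, so only lines unique to the first file remain.
only_remote=$(comm -23 "$remote_list" "$local_names")
printf '%s\n' "$only_remote"
rm -f "$remote_list" "$local_names"
```

Here the first argument plays the role of the remote list, so the output is the set of names that have not been downloaded yet.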

Wrapping up, you could end up with something like

sort -o lista.local lista.local  # make sure lista.local is sorted
awk -F/ '{ print $11 }' server1 |
sort |
comm -23 - lista.local

In keeping with the Unix spirit, this prints nothing if there are no differences, and prints the missing entries if something is missing. Note that comm itself exits with status 0 in both cases; if the script needs a failure status, test whether the output is empty.
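If the calling script needs an exit status, one option is to capture the comm output and test whether it is empty, since comm returns 0 whether or not it prints anything. The following is only a sketch: the function name check_download is hypothetical, and the two lists are temporary sample files built from the URLs in the question rather than the real server1 and lista.local.

```shell
# Hypothetical helper (not from the original script): print the names that
# appear in the remote URL list ($1) but not in the local name list ($2).
# Returns 1 if anything is missing, 0 otherwise.
check_download() {
    sorted_local=$(mktemp)
    sort "$2" > "$sorted_local"
    missing=$(awk -F/ '{ print $11 }' "$1" | sort | comm -23 - "$sorted_local")
    rm -f "$sorted_local"
    if [ -n "$missing" ]; then
        printf '%s\n' "$missing"
        return 1
    fi
    return 0
}

# Sample data: two remote URLs, but only the first file exists locally.
remote=$(mktemp)
local_list=$(mktemp)
printf '%s\n' \
    'ftp://server/pub/data/nccf/com/gfs/prod/gfs.2014011400/gfs.t00z.pgrbf00.grib2' \
    'ftp://server/pub/data/nccf/com/gfs/prod/gfs.2014011400/gfs.t00z.pgrbf06.grib2' > "$remote"
printf '%s\n' 'gfs.t00z.pgrbf00.grib2' > "$local_list"

# Capture both the list of missing names and the helper's exit status.
if result=$(check_download "$remote" "$local_list"); then
    status=0
else
    status=$?
fi
printf 'status=%s missing=%s\n' "$status" "$result"
rm -f "$remote" "$local_list"
```

In the real script, the status could then drive flag_check_descarga instead of comparing the lists line by line in a loop.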

If you want to print the successfully downloaded files as well, just cat lista.local, or maybe something like sed 's/^/Successfully downloaded /' lista.local

5 Comments

In the script I use FNR == i where i is the loop variable as I want to check every single line. Thanks @tripleee
I don't understand that FNR comment. awk looks at every line unless you do something to make it skip lines.
Didn't know about comm, will try. Thanks
I added a snippet and refactored some of the answer, and so deleted some comments of mine which were now obsolete. In particular, the logic is now correct for checking a download (remote has all the files, local may be missing some).
Hi @triplee You were right and the syntax was wrong. Now it works in my example files, will translate into the global script. Thanks
