
I'm trying to get this script working to count how many .doc and .pdf files there are, but I keep getting a syntax error at the closing parenthesis of the for loop.

awk: ./parselog.awk:14:     for ($7 in count)
awk: ./parselog.awk:14:                     ^ syntax error

Here's the awk script:

#!/usr/bin/awk -f
BEGIN {}
{
    file = match($7, "/datasheets/")
    doccheck = match(tolower($7), ".doc")
    pdfcheck = match(tolower($7), ".pdf")
    if( doccheck || pdfcheck )
    {
            count[$7]++
    }
}
    END{

    for ($7 in count)
    {
            frequency = count[$7]
            sub(/datasheets/,"",$7)
            minusextension = $7
            sub(/\....$/, "", minusextension)
            print minusextension, $7, frequency
    }
    sort
}

2 Answers

You can't use $7 as the loop variable in a for (... in ...) loop; awk expects a plain variable name there, not a field reference. Change every $7 in your END block to key or something similar.
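As a minimal sketch of that change, the script could look roughly like the following. The function name count_datasheets, the variables key, name and base, the end-anchored extension regex, and the choice to strip the whole "/datasheets/" prefix are illustrative assumptions rather than part of the original script. The bare "sort" statement in the question does nothing useful in awk, so this sketch pipes the print output through the external sort command instead.

```shell
# Hypothetical wrapper: reads log lines from stdin or from the files given as arguments.
count_datasheets() {
  awk '
    # Count every 7th field that ends in .doc or .pdf, ignoring case.
    tolower($7) ~ /\.(doc|pdf)$/ { count[$7]++ }

    END {
      # The loop variable must be an ordinary name, not a field such as $7.
      for (key in count) {
        name = key
        sub(/\/datasheets\//, "", name)   # assumption: drop the directory prefix
        base = name
        sub(/\.[^.]*$/, "", base)         # strip the final extension
        print base, name, count[key] | "sort"
      }
      close("sort")
    }
  ' "$@"
}
```

It would be called as count_datasheets access.log, where access.log stands in for whatever log file the original script was run against.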


3 Comments

Of course you can use $7 as a variable name. It's just not a good idea, and the confusion surrounding it is likely to break your code. Try this: printf 'one\ntwo\nthree\n' | awk '1 END { $3="foo"; print $3; }'
awk 'END { a[1] = 0; for ($1 in a) {} }' => syntax error; awk 'END { a[1] = 0; for (b in a) {} }' no syntax error.
Ah, just in the for loop. Interesting. I take back my -1.

You can do this with a one-liner:

[ghoti@pc ~]$ find . \( -name "*.doc" -or -name "*.pdf" \) -print | awk -F. '{c[$NF]++} END {for(ext in c){printf("%5.0f\t%s\n", c[ext], ext);}}'
  232   pdf
   45   doc
[ghoti@pc ~]$ 

Note that this moves the selection of extensions out of the awk script and into the find command earlier in the pipe. If you really want to make this a stand-alone awk-only script (and not shell), I suppose you could do it like this:

#!/usr/bin/awk -f

BEGIN {

  # List of extensions we're interested in:
  exts["doc"]=1;
  exts["pdf"]=1;

  FS=".";
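  cmd="find . -print";
  # Compare against 0 so that a getline error (-1) cannot loop forever.
  while ((cmd | getline) > 0) {
    # "in" tests membership without creating empty array entries.
    if ($NF in exts) {
      c[$NF]++;
    }
  }
  close(cmd);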
  for (ext in c) {
    printf("%5.0f\t%s\n", c[ext], ext);
  }
  exit;
}

Note that the find command also traverses subdirectories. If you want only the current directory, replace the find command with ls *.pdf *.doc in the one-liner, and with plain ls in the stand-alone script.
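A rough sketch of the current-directory variant, with the extension filter kept inside awk so that plain ls can be used. The function name count_exts_here and its optional directory argument are made up for illustration. The NF > 1 test is an added guard so that a file literally named "pdf", with no dot, is not counted.

```shell
# Hypothetical helper: count .doc and .pdf files in one directory, without recursing.
count_exts_here() {
  # ls prints one name per line when its output is a pipe.
  ls "${1:-.}" | awk -F. '
    # Only names that contain a dot and end in a wanted extension.
    NF > 1 && ($NF == "doc" || $NF == "pdf") { c[$NF]++ }
    END {
      for (ext in c) {
        printf("%5d\t%s\n", c[ext], ext)
      }
    }
  '
}
```

The order in which for (ext in c) visits the extensions is unspecified in awk, so pipe the result through sort if a stable order matters.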
