I have working code:

BEGIN { FS=";"; }   # field separator
{ 
    if (match($2, /[0-9]+/)) {           # matching `ID` value
        m=substr($2, RSTART, RLENGTH);
        a[m]++;                          # accumulating number of lines for each `ID`
        print > m"_count.txt";    # writing lines pertaining to certain `ID` into respective file
    } 
}
END {
    for(i in a) { 
        print "mv "i"_count.txt "i"_"a[i]".txt"  # renaming files with actual counts
    }
} 

Now I need to change it to do something like the following. I have three arrays of IDs, and each array corresponds to a separate folder in which the results should be saved.

BEGIN { FS=";"; }   # field separator
{
    array1=(125 258 698 874)
    array2=(956 887 4455 22)
    array3=(111 444 558 966 332)
    if ($1 == $2) {varR=$3} else {varR=$2}
    if (match(varR, /[0-9]+/)) {           # matching `ID` value
        if ( varR in array1 ) {
            FolderName = "folder1/"
            m1=substr(varR, RSTART, RLENGTH);
            a1[m1]++;                          # accumulating number of lines for each `ID`
            print > (FolderName m1)"_count.txt";    # writing lines pertaining to certain `ID` into respective file
        }
        if ( varR in array2 ) {
            FolderName = "folder2/"
            m2=substr(varR, RSTART, RLENGTH);
            a2[m2]++;                          # accumulating number of lines for each `ID`
            print > (FolderName m2)"_count.txt";    # writing lines pertaining to certain `ID` into respective file
        }
        if ( varR in array3 ) {
            FolderName = "folder3/"
            m3=substr(varR, RSTART, RLENGTH);
            a3[m3]++;                          # accumulating number of lines for each `ID`
            print > (FolderName m3)"_count.txt";    # writing lines pertaining to certain `ID` into respective file
        }
    } 
}
END {
    for(i in a1) { 
        print "mv "i"_count.txt "i"_"a1[i]".txt"  # renaming files with actual counts
    }
    for(i in a2) { 
        print "mv "i"_count.txt "i"_"a2[i]".txt"  # renaming files with actual counts
    }
    for(i in a3) { 
        print "mv "i"_count.txt "i"_"a3[i]".txt"  # renaming files with actual counts
    }
} 

I need to save the lines for each matching ID to a text file and put that file in the corresponding folder. What if I have 100 arrays? Do I need to duplicate the code for each one?

    Without having read your code in detail: to avoid duplicating the code, you can build an awk user-defined function and call it for each array. See this example using functions: stackoverflow.com/questions/42415826/… Commented Mar 31, 2017 at 12:08

2 Answers

Using GNU Awk's multi-dimensional array support, here's a simplified solution that demonstrates the techniques you need:

$ gawk '
  BEGIN {
      FS = ";"   # field separator
      # Initialize the sub-arrays of the multi-dimensional array once, up front.
      # array[n][""] makes array[n] a sub-array; the IDs are then stored as its keys.
      array[1][""]; split("125;258;698;874", aux); for (i in aux) array[1][aux[i]]
      array[2][""]; split("956;887;4455;22", aux); for (i in aux) array[2][aux[i]]
      array[3][""]; split("111;444;558;966;332", aux); for (i in aux) array[3][aux[i]]
      n = length(array) # The count of sub-arrays
  }
  {
      if ($1 == $2) {varR=$3} else {varR=$2}
      if (match(varR, /[0-9]+/)) {           # matching `ID` value
        for (i=1;i<=n;++i) {                 # loop over all arrays
          if (varR in array[i]) {            # look for the ID among the array keys
            print "folder" i
            break
          }
        }
      }
  }
' <<<'1;1;4455'
folder2
  • See this answer of mine for an explanation of the array-initialization and multi-dimensional array techniques used in this command.

  • Note that the array initialization stores the numbers as the keys of the arrays array[<n>], because that is what is needed to look up values with <value> in array[<n>].
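To connect this back to the original goal of writing each line into a per-ID file inside the right folder, here is a minimal sketch rather than a definitive implementation. It swaps the arrays-of-arrays technique for a single flat ID-to-folder lookup that also runs in POSIX awk, and it assumes that each ID appears in only one list. The names route_by_id, lists, folder and count are hypothetical, as is the folder<N> naming.

```shell
# Hypothetical helper: route lines of a ";"-separated file into folder<N>/<ID>_count.txt.
route_by_id() {
    # $1: input file. The folders are created in the current directory.
    mkdir -p folder1 folder2 folder3
    awk '
    BEGIN {
        FS = ";"
        # One list of IDs per folder; supporting more folders means adding lines here.
        lists[1] = "125 258 698 874"
        lists[2] = "956 887 4455 22"
        lists[3] = "111 444 558 966 332"
        for (f in lists) {
            n = split(lists[f], ids, " ")
            for (j = 1; j <= n; j++) folder[ids[j]] = "folder" f   # ID -> folder name
        }
    }
    {
        varR = ($1 == $2) ? $3 : $2
        if (match(varR, /[0-9]+/)) {                 # matching ID value
            id = substr(varR, RSTART, RLENGTH)
            if (id in folder) {                      # a single lookup replaces the per-array if blocks
                base = folder[id] "/" id
                count[base]++                        # number of lines per ID
                print > (base "_count.txt")
            }
        }
    }
    END {
        # Print the rename commands, as the original script does, rather than running them.
        for (base in count) print "mv " base "_count.txt " base "_" count[base] ".txt"
    }' "$1"
}
```

Calling route_by_id input.txt writes the matching lines and prints one mv command per ID. With a very large number of distinct IDs, the number of simultaneously open output files may become a limit in some awk implementations.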


What you tried:

  • Awk has no array-initializer syntax; what array1=(125 258 698 874) in your code creates is a single string: "125258698874":

    • The surrounding () have no effect here (they're just for precedence).
    • Placing tokens - whether numeric or string - right next to each other in Awk performs string concatenation.
    • Perhaps you mistakenly think that Bash's array-initializer syntax works in Awk too.
  • ( varR in array1 ) looks for varR among the indices (keys) of array1, but had your array initialization worked the way it does in Bash, you'd have to check the values instead.
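The two points above can be checked with a small sketch. The variable names concat, lookup, vals and ids are only illustrative; the first command shows the concatenation, the second shows that in tests keys rather than values.

```shell
# Juxtaposed tokens are concatenated, so this assigns one string, not an array.
concat=$(awk 'BEGIN { array1 = (125 258 698 874); print array1 }')
echo "$concat"

# split() stores the IDs as values under keys 1..n, so `in` sees the positions, not the IDs.
lookup=$(awk 'BEGIN {
    n = split("125 258 698 874", vals, " ")
    print ((125 in vals) ? "yes" : "no")     # 125 is a value, not a key
    print ((1 in vals) ? "yes" : "no")       # 1 is a key (a position)
    for (i = 1; i <= n; i++) ids[vals[i]]     # store the IDs again, this time as keys
    print ((125 in ids) ? "yes" : "no")
}')
echo "$lookup"
```

This prints 125258698874, followed by no, yes and yes on separate lines.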

Do you need to use different arrays, or could you do something like this:

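BEGIN {
    # Each key combines an ID and a folder index, joined by ","
    a[1","1] = "abc";
    a[1","2] = "xyz";
    a[2","2] = "123";
    folders[1] = "folder1";
    folders[2] = "folder2";
    var = "1";
    for (f in folders) {
        if ((var","f) in a) {    # parentheses make the key concatenation explicit
            print a[var","f] " >> " folders[f] "/file_" var;
        }
    }
}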

1 Comment

My arrays contain only numbers. I need them only to match IDs and, depending on which array matches, to put the result into a folder whose name depends on the array number.
