
Suppose I want to pass a string to awk so that once I split it (on a pattern) the substrings become the indexes (not the values) of an associative array.

Like so:

$ awk -v s="A:B:F:G" 'BEGIN{ # easy, but can these steps be combined?
                            split(s,temp,":")  # temp[1]="A",temp[2]="B"...
                            for (e in temp) arr[temp[e]] #arr["A"], arr["B"]...
                            for (e in arr) print e 
                            }'
A
B
F
G

Is there an awkism or gawkism that would allow the string s to be split directly into its components, with those components becoming the index entries in arr?


The reason (bigger picture) is that I want something like this (pseudo awk):

awk -v s="1,4,55" 'BEGIN{[split s into arr["1"],arr["4"],arr["55"]]} $3 in arr {action}'

3 Answers


No, there is no better way to map separated substrings to array indices than:

split(str,tmp); for (i in tmp) arr[tmp[i]]

FWIW if you don't like that approach for doing what your final pseudo-code does:

awk -v s="1,4,55" 'BEGIN{split(s,tmp,/,/); for (i in tmp) arr[tmp[i]]} $3 in arr{action}'

then another way to get the same behavior is

awk -v s=",1,4,55," 'index(s,","$3","){action}'
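If the split-then-loop idiom is needed in more than one place, it could be wrapped in a user-defined function. The following is only a minimal sketch: the function name str2set and the three sample input lines are made up for illustration and are not part of the answer above.

```shell
# Sample input (hypothetical): keep only the lines whose 3rd field is in the list s.
result=$(printf 'a b 1\nc d 2\ne f 55\n' |
awk -v s="1,4,55" '
# str2set (hypothetical helper): make every sep-delimited piece of str
# an index of the array set; returns the number of pieces.
function str2set(str, set, sep,    tmp, i, n) {
    n = split(str, tmp, sep)      # tmp[1]="1", tmp[2]="4", ...
    for (i = 1; i <= n; i++)
        set[tmp[i]]               # referencing an element creates the index
    return n
}
BEGIN { str2set(s, arr, ",") }
$3 in arr                         # default action: print the line
')
printf '%s\n' "$result"
```

With this sample input, the lines "a b 1" and "e f 55" are printed. The extra parameters after the gap in the parameter list (tmp, i, n) are the usual awk convention for local variables.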

4 Comments

I think that split(str,tmp); for (i in tmp) arr[tmp[i]] is probably the way to go. Thanks!
To avoid having to add the surrounding separators to s in the second solution, I propose this: awk -v s="A:B:C:G" 's ~ "(^|:)" $3 "(:|$)"{action}'
@NeronLeVelu That turns it into a regexp comparison so then you need to worry about regexp metacharacters in the strings. The original code used a string comparison ($3 in arr) and so does the code I posted using index() so regexp metacharacters will just be treated literally.
OK, I forgot to account for that. You are right about this possible issue.
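The regexp-versus-string point raised in these comments can be seen with a small experiment. This is only an illustrative sketch: the list "abc:def" and the two input lines are invented, and the index() call wraps s in separators inside the script rather than on the command line.

```shell
# Compare the regexp test with the literal index() test for two values of $3.
out=$(printf 'x y a.c\nx y abc\n' | awk -v s="abc:def" '
{
    # Dynamic regexp built from $3: the "." in "a.c" matches any character.
    re  = (s ~ ("(^|:)" $3 "(:|$)")) ? "match" : "no match"
    # Literal substring search: "." is just a dot here.
    str = index(":" s ":", ":" $3 ":") ? "match" : "no match"
    print $3 ": regexp=" re ", index=" str
}')
printf '%s\n' "$out"
```

For the field a.c, the regexp test reports a match against abc (a false positive) while the index() test does not. For the field abc, both report a match.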

Probably useless and unnecessarily complex but I'll open the game with while, match and substr:

$ awk -v s="A:B:F:G" '
BEGIN {
    while(match(s,/[^:]+/)) {
        a[substr(s,RSTART,RLENGTH)]
        s=substr(s,RSTART+RLENGTH)
    }
    for(i in a)
        print i
}'
A
B
F
G

I'm eager to see (if there are) some useful solutions. I tried playing around with asorts and such.


Another way, which is a kind of awkism:

cat file

1 hi
2 hello
3 bonjour
4 hola
5 konichiwa

Run it,

awk 'NR==FNR{d[$1]; next}$1 in d' RS="," <(echo "1,2,4") RS="\n" file

you get,

1 hi
2 hello
4 hola
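The command above relies on two things: awk applies a var=value operand only when it reaches that point in the argument list, so RS can differ per input, and <(...) is process substitution, which is a bash/ksh/zsh feature. A sketch of a variant that reads the list from standard input through the "-" operand, so it should also work in a plain POSIX sh, might look like this (the temporary file stands in for file and is created only to keep the example self-contained):

```shell
# Recreate the sample data in a temporary file (stand-in for "file").
tmp=$(mktemp)
printf '1 hi\n2 hello\n3 bonjour\n4 hola\n5 konichiwa\n' > "$tmp"

# First input is stdin ("-"), read with RS=","; then RS is reset to newline
# before the data file is read. The trailing newline after "4" is dropped by
# default field splitting, so $1 is still "4".
filtered=$(printf '1,2,4\n' |
    awk 'NR==FNR { d[$1]; next } $1 in d' RS="," - RS="\n" "$tmp")

rm -f "$tmp"
printf '%s\n' "$filtered"
```

This prints the same three lines as the original command. Note that the NR==FNR idiom assumes the first input is non-empty.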
