How to split a delimited string into an array in awk?

Question

How do I split a string that contains pipe symbols (|)? I want the pieces stored in an array.

I tried

echo "12:23:11" | awk '{split($0,a,":"); print a[3] a[2] a[1]}'

Which works fine. If my string is like "12|23|11" then how do I split them into an array?

Note that your output is concatenating the array elements, with no separator. If you instead wanted them to be separated with OFS, stick commas in between them, making print see them as separate arguments.
– dubiousjim, Commented Apr 19, 2012 at 12:57
@slushy: your command is not at all what the asker needs. Your command (echo "12:23:11" | sed "s/.*://") deletes everything up to and including the last ":", keeping only the "11". It works to get the last number, but would need to be modified (in a difficult-to-read way) to get the 2nd number, etc. awk (and awk's split) is much more elegant and readable.
– Olivier Dulac, Commented Dec 5, 2019 at 9:13
If you need to split on a single character you can use cut.
– ccpizza, Commented Dec 11, 2019 at 14:37
Just in case it is an XY problem or you are in an environment without sed: set input="12:23:11" and then run output=$(echo -n ":${input}" | tr ':' '\n' | tac -b | tr '\n' ':'); output="${output#:}"; echo "${output}". output now contains 11:23:12.
– Et7f3XIV, Commented Apr 8, 2024 at 23:44
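
dubiousjim's point about separators can be illustrated with a minimal sketch. The helper names reverse_concat and reverse_ofs are hypothetical, made up for this example; only split(), print and OFS come from the thread.

```shell
# Hypothetical helper: the pieces are concatenated, so no separator is printed.
reverse_concat() {
  printf '%s\n' "$1" | awk '{ split($0, a, "|"); print a[3] a[2] a[1] }'
}

# Hypothetical helper: commas make print join its arguments with OFS.
reverse_ofs() {
  printf '%s\n' "$1" | awk -v OFS=':' '{ split($0, a, "|"); print a[3], a[2], a[1] }'
}

reverse_concat '12|23|11'   # prints 112312
reverse_ofs '12|23|11'      # prints 11:23:12
```

With the default OFS (a single space) the second form would print 11 23 12.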

Chris Seymour · Accepted Answer · 2013-03-24 13:05:45Z

412

Have you tried:

echo "12|23|11" | awk '{split($0,a,"|"); print a[3],a[2],a[1]}'

edited Mar 24, 2013 at 13:05

Chris Seymour

answered Nov 4, 2011 at 13:15

Calin Paul Alexandru

11 Comments

Dimitre Radoulov Over a year ago

@Mohamed Saligh, if you're on Solaris, you need to use /usr/xpg4/bin/awk, given the string length.

shellter Over a year ago

'is not working for me'. especially with colons between the echoed values and split set up to split on '|'??? Typo? Good luck to all.

Alston Over a year ago

Better with some syntax explanation.

WhiteWind Over a year ago

This will not work in GNU awk, because third argument to split is regular expression, and | is special symbol, which needs to be escaped. Use split($0, a, "\|")

Olivier Dulac Over a year ago

@WhiteWind: another way to "ensure" that | is seen as a char and not a special symbol is to put it between []: i.e., split($0, a, "[|]"). I like this better than '\|' in some cases, especially as some variants of regexp (perl vs grep vs .. others?) can have "|" interpreted literally and "\|" seen as the regex separator, instead of the opposite... ymmv
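
As a hedged illustration of the two spellings discussed in these comments: awk implementations generally treat a single-character separator other than a space literally, the same way they treat FS, so "|" and "[|]" are expected to split identically. The helper name count_fields is made up for this sketch.

```shell
# Hypothetical helper: split the argument both ways and report both counts
# plus the middle element taken from each array.
count_fields() {
  printf '%s\n' "$1" | awk '{
    n = split($0, a, "|")     # single character: used literally
    m = split($0, b, "[|]")   # bracket expression: a regexp matching only "|"
    print n, m, a[2], b[2]
  }'
}

count_fields '12|23|11'   # prints 3 3 23 23
```

If an awk at hand disagrees, the bracket form is the safer choice, since it is unambiguous as a regexp.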

Maybe · Accepted Answer · 2024-02-23 19:40:33Z

To split a string to an array in awk we use the function split():

awk '{split($0, array, ":")}'
#           \/  \___/  \_/
#           |     |     |
#       string    |     delimiter
#                 |
#               array to store the pieces

If no separator is given, it uses the FS, which defaults to the space:

$ awk '{split($0, array); print array[2]}' <<< "a:b c:d e"
c:d

We can give a separator, for example a colon (:):

$ awk '{split($0, array, ":"); print array[2]}' <<< "a:b c:d e"
b c

Which is equivalent to setting it through the FS:

$ awk -F: '{split($0, array); print array[2]}' <<< "a:b c:d e"
b c

You can also provide the separator as a regexp (shown here with GNU Awk):

$ awk '{split($0, array, ":*"); print array[2]}' <<< "a:::b c::d e"
# note the multiple colons in the input
b c

And even see what the delimiter was on every step by using its fourth parameter:

$ awk '{split($0, array, ":*", sep); print array[2]; print sep[1]}' <<< "a:::b c::d e"
b c
:::

Let's quote the man page of GNU awk:

split(string, array [, fieldsep [, seps ] ])

Divide string into pieces separated by fieldsep and store the pieces in array and the separator strings in the seps array. The first piece is stored in array[1], the second piece in array[2], and so forth. The string value of the third argument, fieldsep, is a regexp describing where to split string (much as FS can be a regexp describing where to split input records). If fieldsep is omitted, the value of FS is used. split() returns the number of elements created. seps is a gawk extension, with seps[i] being the separator string between array[i] and array[i+1]. If fieldsep is a single space, then any leading whitespace goes into seps[0] and any trailing whitespace goes into seps[n], where n is the return value of split() (i.e., the number of elements in array).
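
Two behaviours from the quoted passage, the return value and the special handling of a single-space fieldsep, can be tried with a minimal sketch. It uses only the three-argument form (no seps, which is a gawk extension); the names count_words, s and parts are hypothetical.

```shell
# Hypothetical helper: split on a single space, which skips leading and
# trailing whitespace and treats runs of blanks as one separator.
count_words() {
  awk -v s="$1" 'BEGIN {
    n = split(s, parts, " ")          # n is the number of elements created
    print n ": " parts[1] "," parts[n]
  }'
}

count_words '  a   b  c '   # prints 3: a,c
```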

Dimitre Radoulov · Accepted Answer · 2011-11-04 13:53:46Z

22

Please be more specific! What do you mean by "it doesn't work"? Post the exact output (or error message), your OS and awk version:

% awk -F\| '{
  for (i = 0; ++i <= NF;)
    print i, $i
  }' <<<'12|23|11'
1 12
2 23
3 11

Or, using split:

% awk '{
  n = split($0, t, "|")
  for (i = 0; ++i <= n;)
    print i, t[i]
  }' <<<'12|23|11'
1 12
2 23
3 11

Edit: on Solaris you'll need to use the POSIX awk (/usr/xpg4/bin/awk) in order to process 4000 fields correctly.

edited Nov 4, 2011 at 13:53

answered Nov 4, 2011 at 13:24

Dimitre Radoulov

6 Comments

PiotrNycz Over a year ago

for(i = 0 or for(i = 1 ?

Dimitre Radoulov Over a year ago

i = 0, because I use ++i after (not i++).

PiotrNycz Over a year ago

Ok - I did not notice this. I strongly believe more readable would be for (i = 1; i <= n; ++i) ...

RARE Kpop Manifesto Jun 4 at 21:22

@PiotrNycz @Dimitre : why not for (i = 0; i++ < n; ) { … } - this way it combines the best of the 1-based indexing and 0-based indexing (with the freebie bonus of no longer needing the 3rd argument in the for (…;…;…) { } statement

PiotrNycz Jun 7 at 9:20

if this is not just a strange sense of humor, then: we have for (init; condition; increment) construct. Each of 3 parts have their well defined places in this for-loop. What is the possible benefit of placing two parts in one slot and have one slot empty? The only reason for that is to make code less readable, thus maybe avoiding being fired - because the code will be hard to maintain by someone else?

RARE Kpop Manifesto Jul 12 at 0:45

how is that "less readable" ? for (i = 0; i++ < n; ) { … } vs. for (i = 1; i <= n; ++i) { … } vs. for (i = 0; i < n; ++i) { … } (if 0-based indexing). And you really should consider getting away from the mindset that the 3rd argument is "increment", cuz you're sandboxing yourself for no reason. I think of it as "stuff that could be done after each loop cycle". Don't let those "idioms" imposed upon you by others constrain your creativity. e.g. I've conjured up a way to efficiently perform mod-%- 13 only once per 108,000 decimal digits, without bigint support or external c
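
For readers following the loop discussion, here is a small sketch that runs the three 1-based loop forms mentioned in these comments side by side. The helper name list_pieces and the A/B/C labels are made up for the example.

```shell
# Hypothetical helper: iterate over the split pieces with each loop form.
list_pieces() {
  printf '%s\n' "$1" | awk '{
    n = split($0, t, "|")
    for (i = 0; ++i <= n;)   print "A", i, t[i]   # pre-increment in the condition
    for (i = 1; i <= n; ++i) print "B", i, t[i]   # conventional three-part form
    for (i = 0; i++ < n;)    print "C", i, t[i]   # post-increment in the condition
  }'
}

list_pieces '12|23'
```

All three visit indexes 1 and 2, so each label prints the same two pieces; the choice between them is a matter of readability.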

TrueY · Accepted Answer · 2016-02-10 10:35:00Z

12

I do not like the echo "..." | awk ... solution, as it makes unnecessary fork and exec system calls.

I prefer Dimitre's solution with a little twist:

awk -F\| '{print $3 $2 $1}' <<<'12|23|11'

Or a bit shorter version:

awk -F\| '$0=$3 $2 $1' <<<'12|23|11'

In this case the assignment puts the output record together, and the assigned value is a true condition (non-empty and not zero), so the record gets printed.
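
One caveat of the pattern form, sketched under the assumption that some records may have only empty fields: when the rebuilt record is empty, the condition is false and nothing is printed. The helper names rebuild_pattern and rebuild_action are hypothetical.

```shell
# Hypothetical helper: pattern form, prints only when the new $0 is "true".
rebuild_pattern() { printf '%s\n' "$@" | awk -F'|' '$0 = $3 $2 $1'; }

# Hypothetical helper: explicit action, prints every record.
rebuild_action() { printf '%s\n' "$@" | awk -F'|' '{ print $3 $2 $1 }'; }

rebuild_pattern '12|23|11' '||'   # prints 112312 only; the empty record is dropped
rebuild_action  '12|23|11' '||'   # prints 112312, then an empty line
```

The explicit action is a little longer but does not depend on the data being non-empty.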

In this specific case the stdin redirection can be avoided by passing the string in as an awk variable:

awk -v T='12|23|11' 'BEGIN{split(T,a,"|");print a[3] a[2] a[1]}'

I used ksh for quite a while, but in bash this can be handled with built-in string manipulation. The first variant splits the original string at the separator. The second assumes that the string always consists of two-digit pairs separated by a single-character separator.

T='12|23|11';echo -n ${T##*|};T=${T%|*};echo ${T#*|}${T%|*}
T='12|23|11';echo ${T:6}${T:3:2}${T:0:2}

The result in all cases is:

112312

edited Feb 10, 2016 at 10:35

answered Feb 10, 2016 at 10:12

TrueY

2 Comments

Daniel Liston Over a year ago

I think the end result was supposed to be the awk array variable references, regardless of the print output example given. But you missed a really easy bash case to provide your end result. T='12:23:11';echo ${T//:}

TrueY Over a year ago

@DanielListon You are right! Thanks! I did not know that the trailing / can be left out of this bash expression...

Sven · Accepted Answer · 2018-10-22 11:17:43Z

8

Actually, awk has a feature called the 'Input Field Separator Variable' (FS). This is how to use it. It's not really an array, but it uses the internal $ variables. For splitting a simple string it is easier.

echo "12|23|11" | awk 'BEGIN {FS="|";} { print $1, $2, $3 }'

edited Oct 22, 2018 at 11:17

answered Oct 22, 2018 at 11:08

Sven

duedl0r · Accepted Answer · 2011-11-04 13:14:16Z

7

Joke? :)

How about echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'

This is my output:

p2> echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'
112312

so I guess it's working after all..

answered Nov 4, 2011 at 13:14

duedl0r

1 Comment

Mohamed Saligh Over a year ago

Is that because of the length of the string? My string length is 4000. Any ideas?

Qorbani · Accepted Answer · 2019-11-22 05:41:13Z

7

I know this is kind of an old question, but I thought someone might like my trick, especially since this solution is not limited to a specific number of items.

# Convert to an array
_ITEMS=($(echo "12|23|11" | tr '|' '\n'))

# Output array items
for _ITEM in "${_ITEMS[@]}"; do
  echo "Item: ${_ITEM}"
done

The output will be:

Item: 12
Item: 23
Item: 11

answered Nov 22, 2019 at 5:41

Qorbani

codaddict · Accepted Answer · 2011-11-04 13:14:55Z

5

echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'

should work.

answered Nov 4, 2011 at 13:14

codaddict

Schildmeijer · Accepted Answer · 2011-11-04 13:15:48Z

4

echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'

answered Nov 4, 2011 at 13:15

Schildmeijer

vcatafesta · Accepted Answer · 2022-04-19 16:02:18Z

2

code

awk -F"|" '{split($0,a); print a[1],a[2],a[3]}' <<< '12|23|11'

output

12 23 11

answered Apr 19, 2022 at 16:02

vcatafesta

1 Comment

Tyler2P Over a year ago

Your answer could be improved by adding more information on what the code does and how it helps the OP.

Aviv · Accepted Answer · 2021-04-19 18:03:30Z

The challenge: split whitespace-separated strings and store the pieces in variables.

Solution: the best and simplest choice is to convert the list of strings into an array and then assign its elements to variables by index. Here's an example of how to convert and access the array.

Example: parse disk space statistics on each line:

sudo df -k | awk 'NR>1' | while read -r line; do
   #convert into array:
   array=($line)

   #variables:
   filesystem="${array[0]}"
   size="${array[1]}"
   capacity="${array[4]}"
   mountpoint="${array[5]}"
   echo "filesystem:$filesystem|size:$size|usage:$capacity|mountpoint:$mountpoint"
done

#output:
filesystem:/dev/dsk/c0t0d0s1|size:4000|usage:40%|mountpoint:/
filesystem:/dev/dsk/c0t0d0s2|size:5000|usage:50%|mountpoint:/usr
filesystem:/proc|size:0|usage:0%|mountpoint:/proc
filesystem:mnttab|size:0|usage:0%|mountpoint:/etc/mnttab
filesystem:fd|size:1000|usage:10%|mountpoint:/dev/fd
filesystem:swap|size:9000|usage:9%|mountpoint:/var/run
filesystem:swap|size:1500|usage:15%|mountpoint:/tmp
filesystem:/dev/dsk/c0t0d0s3|size:8000|usage:80%|mountpoint:/export
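
The same report can also be produced inside awk itself, without a shell array. This is only a sketch under stated assumptions: the helper name format_df is hypothetical, the sample input is made up, and mount points are assumed to contain no spaces.

```shell
# Hypothetical helper: reads df -k style output on stdin, skips the header
# line, and formats four of the whitespace-separated fields.
format_df() {
  awk 'NR > 1 {
    printf "filesystem:%s|size:%s|usage:%s|mountpoint:%s\n", $1, $2, $5, $6
  }'
}

# Made-up sample input standing in for real df -k output.
printf '%s\n' \
  'Filesystem 1K-blocks Used Available Use% Mounted' \
  '/dev/sda1 4000 1600 2400 40% /' | format_df
# prints filesystem:/dev/sda1|size:4000|usage:40%|mountpoint:/
```

On a real system the helper would be fed with df -k | format_df.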

jian · Accepted Answer · 2022-01-09 17:43:46Z

0

awk -F'[|]' '{print $1"\t"$2"\t"$3}' <<<'12|23|11'

answered Jan 9, 2022 at 17:43

jian
