Parse a txt file into an array using bash and awk [duplicate]

Question

I have the following CSV file called test.csv:

First Line, 100
Second Line, 200
Third Line, 300
Fourth Line, 400

I want to split each line at the comma and store the first part into an array in a bash script using awk. I have done the following:

#!/bin/bash
declare -a MY_ARRAY

MY_ARRAY=$(awk -F ',' '{print $1}' test.csv)
echo 'Start'
for ENTRY in ${MY_ARRAY}
do
echo ${ENTRY}
done
echo 'Stop'

And the output is as follows:

Start
First
Line
Second
Line
Third
Line
Fourth
Line
Stop

How can I get the array to hold the following?

First Line
Second Line
Third Line
Fourth Line

Why do you want to use awk to populate a bash array? In every case it would be easier to use bash directly. Because of the spaces inside the fields you need more bash code than array=( $(awk ...) ) anyway (note the ( ) around $( ); you forgot them).
– Socowi, Commented Mar 27, 2021 at 0:01
A few issues with the current code: a) assigning values to an array typically requires wrapping the right side of the assignment in parentheses, e.g., MY_ARRAY=( $(awk ... ) ); b) since the awk output lines contain white space, you need to redefine the field separator as \n for proper parsing into the array, e.g., IFS=$'\n' MY_ARRAY=( $(awk ... ) ) (note that IFS stays changed after this line, so restore it afterwards); c) to reference the individual elements of the array you need a wildcard index (wrapped in []'s), e.g., for ENTRY in "${MY_ARRAY[@]}"
– markp-fuso, Commented Mar 27, 2021 at 15:18
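
The array-assignment form described in the comment above can be sketched roughly as follows. This is only an illustration under a few assumptions: the sample data is written to a temporary file (standing in for test.csv), the name old_ifs is made up for this sketch, and globbing is switched off around the assignment as a precaution. To be safe, the sketch saves and restores IFS explicitly rather than relying on the change being limited to one line.

```shell
#!/bin/bash
# Temporary stand-in for test.csv so the sketch is self-contained.
csv_file=$(mktemp)
printf '%s\n' 'First Line, 100' 'Second Line, 200' \
              'Third Line, 300' 'Fourth Line, 400' > "$csv_file"

old_ifs=$IFS          # remember the current field separator
IFS=$'\n'             # split the awk output on newlines only
set -f                # precaution: no filename expansion on the unquoted $(...)
MY_ARRAY=( $(awk -F ',' '{print $1}' "$csv_file") )
set +f
IFS=$old_ifs          # put the field separator back

# Quote the [@] expansion so each element stays intact.
for ENTRY in "${MY_ARRAY[@]}"; do
    echo "${ENTRY}"
done

rm -f "$csv_file"
```

With the sample data, the loop should print one line per CSV row (First Line, Second Line, and so on).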

tshiono · Accepted Answer · 2021-03-27 03:44:46Z

2

If you want to assign the first column of a CSV file to a bash array, you do not have to use awk. Please try this instead:

readarray -t my_array < <(cut -d, -f1 test.csv)
for e in "${my_array[@]}"; do
    echo "$e"
done

Output:

First Line
Second Line
Third Line
Fourth Line

The command cut -d, -f1 file splits each line on the comma and prints the first field.
The expression <(command) is a process substitution; it exposes the output of the command as a file, which is redirected here into readarray.
The command readarray -t my_array reads lines from standard input and stores each line, with its trailing newline removed, as one element of my_array.

Going back to your posted script, the awk output is not assigned to MY_ARRAY as a list of elements. MY_ARRAY just holds a single string (stored as element 0, since it was declared with -a) that contains spaces and newlines. When it is expanded unquoted, as in for ENTRY in ${MY_ARRAY}, bash word splitting breaks the string on both the spaces and the newlines.
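
As a rough illustration of that difference, here is a minimal sketch that keeps awk but fills the array with readarray. It assumes bash 4 or later (for readarray); the temporary file stands in for test.csv, and the names as_string and string_words are made up for this sketch.

```shell
#!/bin/bash
# Temporary stand-in for test.csv so the sketch is self-contained.
csv_file=$(mktemp)
printf '%s\n' 'First Line, 100' 'Second Line, 200' \
              'Third Line, 300' 'Fourth Line, 400' > "$csv_file"

# Plain string: an unquoted expansion is split on spaces AND newlines.
as_string=$(awk -F ',' '{print $1}' "$csv_file")
string_words=0
for word in ${as_string}; do
    string_words=$((string_words + 1))
done

# Real array: readarray -t stores one element per line of awk output.
readarray -t MY_ARRAY < <(awk -F ',' '{print $1}' "$csv_file")

echo "words from string: ${string_words}"
echo "array elements:    ${#MY_ARRAY[@]}"
for ENTRY in "${MY_ARRAY[@]}"; do
    echo "${ENTRY}"
done

rm -f "$csv_file"
```

With the sample data, the string yields eight words while the array holds four elements, one per line.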

edited Mar 27, 2021 at 3:44

answered Mar 27, 2021 at 2:56

tshiono


Denis · Accepted Answer · 2021-03-27 00:20:53Z

1

$ cat input 
First Line, 100
Second Line, 200
Third Line, 300
Fourth Line, 400

$ cat so.sh 
#!/bin/bash

# Split each input line on commas; the first field lands in array[0]
while IFS=',' read -r -a array; do
    echo "${array[0]}"
done

$ cat input | ./so.sh 
First Line
Second Line
Third Line
Fourth Line

answered Mar 27, 2021 at 0:20

Denis

1,6672 gold badges20 silver badges33 bronze badges

2 Comments

Harry Boy Over a year ago

I need the script to open the file and read it, not have it cat-ed to the script

EricSchaefer Over a year ago

@HarryBoy But that is how to read from a file in bash.
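
If the script should open the file itself, one possible variant is to redirect the file into the loop instead of piping it in. This is only a sketch: the temporary file stands in for the real input, and the names input_file, fields and first_fields are made up for this example.

```shell
#!/bin/bash
# Temporary stand-in for the input file so the sketch is self-contained.
input_file=$(mktemp)
printf '%s\n' 'First Line, 100' 'Second Line, 200' \
              'Third Line, 300' 'Fourth Line, 400' > "$input_file"

first_fields=()
# The redirection after "done" makes the loop read from the file directly.
while IFS=',' read -r -a fields; do
    first_fields+=("${fields[0]}")   # keep only the text before the first comma
done < "$input_file"

printf '%s\n' "${first_fields[@]}"

rm -f "$input_file"
```

Because the loop is fed by a redirection rather than a pipe, it runs in the current shell, so first_fields is still populated after the loop ends.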
