This is my file:

$ cat head_datafile_pipe_deleimiter.csv
"Rec_Open_Date"|"MSISDN"|"IMEI"|"Data_Volume_Bytes"|"Device_Manufacturer"|"Device_Model"|"Product_Description"|"Data_Volume_MB"
"2016-07-17"|"686"|"630"|"618320"|"Apple Inc"|"Apple iPhone S A1530"|"PREPAY PLUS - $0 -"|"0.589676"
"2016-07-17"|"560"|"570"|"42841779"|"Motorola Mobility LLC, a Lenovo Company"|"Moto X 2nd Generation, X112360445"|"$39.95 Plan"|"40.8571"
"2016-07-17"|"811"|"340"|"2465082"|"Samsung Korea"|"Samsung SM-G900I"|"$69.95 Plan"|"2.35089"
"2016-07-17"|"785"|"610"|"41498628"|"Apple Inc"|"Apple iPhone 6S Plus A1687"|"$29.95 Carryover Plan 1GB"|"39.5762"
"2016-07-17"|"908"|"310"|"6497563"|"Samsung Korea"|"Samsung GT-I9195"|"PREPAY PLUS - $0 -"|"6.19656"
"2016-07-17"|"919"|"610"|"0"|"Samsung Korea"|"Samsung SM-G925I"|"$19 CO COMBO - NOT RECURRENT"|"0"
"2016-07-17"|"356"|"290"|"33189681"|"Apple Inc"|"Apple iPhone 6S A1688"|"$39.95 Plan"|"31.6521"
"2016-07-17"|"009"|"160"|"30340"|"Samsung Korea"|"Samsung SM-J500Y"|"PREPAY PLUS - $1 - #33"|"0.0289345"
"2016-07-17"|"574"|"400"|"549067"|"HUAWEI Technologies Co Ltd"|"HUAWEI Y6"|"PREPAY PLUS - $0 -"|"0.523631"

I want to store the output of this command in an array:

$ awk -F'|' 'NR>1{print $7}' head_datafile_pipe_deleimiter.csv | sort | uniq
"$19 CO COMBO - NOT RECURRENT"
"$29.95 Carryover Plan 1GB"
"$39.95 Plan"
"$69.95 Plan"
"PREPAY PLUS - $0 -"
"PREPAY PLUS - $1 - #33"

The way I currently do this is to write the output to a log file:

$ awk -F'|' 'NR>1{print $7}' head_datafile_pipe_deleimiter.csv | sort | uniq > logfile
$ cat logfile
"$19 CO COMBO - NOT RECURRENT"
"$29.95 Carryover Plan 1GB"
"$39.95 Plan"
"$69.95 Plan"
"PREPAY PLUS - $0 -"
"PREPAY PLUS - $1 - #33"

and then read that file into an array:

$ u_vals=(`cat "logfile"`)

Printing all the elements in the array:

$ echo "${u_vals[@]}"
"$19 CO COMBO - NOT RECURRENT" "$29.95 Carryover Plan 1GB" "$39.95 Plan" "$69.95 Plan" "PREPAY PLUS - $0 -" "PREPAY PLUS - $1 - #33"

Printing the first element:

$ echo "${u_vals[0]}"
"$19

Getting the length of the array (indices start at zero):

$ echo "${#u_vals[@]}"
25

Printing the last element:

$ echo "${u_vals[24]}"
#33"

I have a two-part question.

First, I want to create the array in one command, without having to write to a file, if possible like this:

$ u_vals=(`awk -F'|' 'NR>1{print $7}' head_datafile_pipe_deleimiter.csv | sort | uniq`)

Second, and more importantly, I want the array to have 6 elements, as below, but the spaces seem to be the issue:

$ cat -n logfile
     1  "$19 CO COMBO - NOT RECURRENT"
     2  "$29.95 Carryover Plan 1GB"
     3  "$39.95 Plan"
     4  "$69.95 Plan"
     5  "PREPAY PLUS - $0 -"
     6  "PREPAY PLUS - $1 - #33"

 ## this loops through the array, but the elements have been split on spaces
for elem in "${u_vals[@]}"; do  echo "$elem"; done

EDIT1

This answers my first question but not the second:

$ u_vals=($(awk -F'|' 'NR>1{print $7}' head_datafile_pipe_deleimiter.csv | sort | uniq))

$ echo "${u_vals[@]}"
"$19 CO COMBO - NOT RECURRENT" "$29.95 Carryover Plan 1GB" "$39.95 Plan" "$69.95 Plan" "PREPAY PLUS - $0 -" "PREPAY PLUS - $1 - #33"

$ echo "${u_vals[0]}"
"$19

EDIT2

Based on the answer below, this is the way I chose to do it. Note that I use awk instead of the while loop posted below. I'm not sure which is best, but I like and understand awk better.

$ mapfile -t u_vals <<<"$(awk -F'|' 'NR>1{print $7}' head_datafile_pipe_deleimiter.csv | sort | uniq)"


$ declare -p u_vals
declare -a u_vals='([0]="\"\$19 CO COMBO - NOT RECURRENT\"" [1]="\"\$29.95 Carryover Plan 1GB\"" [2]="\"\$39.95 Plan\"" [3]="\"\$69.95 Plan\"" [4]="\"PREPAY PLUS - \$0 -\"" [5]="\"PREPAY PLUS - \$1 - #33\"")'
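
One caveat with the here-string form used above: a here-string always appends a newline, so if the pipeline prints nothing, mapfile still reads one empty line and the array ends up with a single empty element. Feeding mapfile through process substitution avoids that. The sketch below is only an illustration; no_output is a hypothetical stand-in for a pipeline that matches nothing, not part of the original commands.

```bash
#!/usr/bin/env bash
# Hypothetical helper standing in for a pipeline that prints nothing.
no_output() { :; }

# Here-string: "$(...)" expands to an empty string and <<< appends a newline,
# so mapfile reads exactly one (empty) line.
mapfile -t from_herestring <<<"$(no_output)"

# Process substitution: mapfile reads the empty stream directly.
mapfile -t from_procsub < <(no_output)

echo "here-string:          ${#from_herestring[@]} element(s)"   # 1
echo "process substitution: ${#from_procsub[@]} element(s)"      # 0
```

For the sample data this makes no difference, since the pipeline always prints at least one line; it only matters if the input file could contain nothing but the header.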


$ for elem in "${u_vals[@]}"; do  echo "$elem"; done
"$19 CO COMBO - NOT RECURRENT"
"$29.95 Carryover Plan 1GB"
"$39.95 Plan"
"$69.95 Plan"
"PREPAY PLUS - $0 -"
"PREPAY PLUS - $1 - #33"


$ printf "%s\n" "${u_vals[@]}"
"$19 CO COMBO - NOT RECURRENT"
"$29.95 Carryover Plan 1GB"
"$39.95 Plan"
"$69.95 Plan"
"PREPAY PLUS - $0 -"
"PREPAY PLUS - $1 - #33"
  • This question, creating-an-array-in-bash-with-quoted-entries-from-command-output, might be what I am looking for. Commented Jul 21, 2016 at 22:07
  • foo=( $(...) ) is not actually a best-practices way to create an array -- lots of bugs. echo "${u_vals[@]}" looks like it works, but it doesn't, actually; use declare -p u_vals to see why. Commented Jul 21, 2016 at 22:10
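
To illustrate the point made in the comment above, here is a minimal sketch using two sample lines in place of the real awk | sort | uniq output (the variable names data, split_vals, and line_vals are made up for this example). An unquoted $(...) is split on every run of whitespace, and the double quotes inside the data are ordinary characters that do nothing to protect the spaces.

```bash
#!/usr/bin/env bash
# Two sample lines standing in for the awk | sort | uniq output.
data=$'"$39.95 Plan"\n"PREPAY PLUS - $0 -"'

# Unquoted command substitution: the result is split on whitespace
# (and is also subject to glob expansion), so each word becomes an element.
split_vals=( $(printf '%s\n' "$data") )

# mapfile -t: one element per input line, trailing newline stripped.
mapfile -t line_vals < <(printf '%s\n' "$data")

echo "word-split: ${#split_vals[@]} elements"   # 7
echo "per-line:   ${#line_vals[@]} elements"    # 2

# declare -p shows the element boundaries that echo "${arr[@]}" hides.
declare -p split_vals line_vals
```

echo joins all elements with single spaces, which is why the word-split array and the per-line array print identically; declare -p makes the difference visible.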

1 Answer

In Bash, you can use mapfile with process substitution:

mapfile -t u_vals < <(
   p=1
   while IFS='|' read -ra arr; do
      (( p )) && p=0 || echo "${arr[6]}"
   done < file.csv | sort -u)

Test output:

printf "%s\n" "${u_vals[@]}"

"$19 CO COMBO - NOT RECURRENT"
"$29.95 Carryover Plan 1GB"
"$39.95 Plan"
"$69.95 Plan"
"PREPAY PLUS - $0 -"
"PREPAY PLUS - $1 - #33"

The commands inside the process substitution do the following:

  1. Discard the first (header) row
  2. Split each row into columns, using the pipe as the delimiter
  3. Extract only column 7
  4. Sort the values and keep only the unique ones
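
mapfile was introduced in Bash 4.0, so on older shells (for example the Bash 3.2 shipped with macOS) the same steps need a different way to fill the array. The following is a sketch of one possible fallback, not part of the original answer: it uses awk for the header skip and column extraction, then appends each line to the array in a read loop. The temporary file and its three sample rows are assumptions made up so the example runs on its own.

```bash
#!/usr/bin/env bash
# Hypothetical sample file with the same layout as the question's CSV.
csv=$(mktemp)
cat > "$csv" <<'EOF'
"Rec_Open_Date"|"MSISDN"|"IMEI"|"Data_Volume_Bytes"|"Device_Manufacturer"|"Device_Model"|"Product_Description"|"Data_Volume_MB"
"2016-07-17"|"686"|"630"|"618320"|"Apple Inc"|"Apple iPhone S A1530"|"PREPAY PLUS - $0 -"|"0.589676"
"2016-07-17"|"560"|"570"|"42841779"|"Motorola Mobility LLC, a Lenovo Company"|"Moto X 2nd Generation, X112360445"|"$39.95 Plan"|"40.8571"
"2016-07-17"|"356"|"290"|"33189681"|"Apple Inc"|"Apple iPhone 6S A1688"|"$39.95 Plan"|"31.6521"
EOF

u_vals=()
# IFS= and -r keep each line intact: no trimming, no backslash processing.
while IFS= read -r line; do
    u_vals+=("$line")
done < <(awk -F'|' 'NR>1{print $7}' "$csv" | sort -u)

rm -f "$csv"

echo "${#u_vals[@]} elements"
printf '%s\n' "${u_vals[@]}"
```

With these three sample rows the array holds two elements, because "$39.95 Plan" appears twice and sort -u keeps one copy.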


2 Comments

Thanks. Out of curiosity, is your while loop better than my awk for any reason?
Awk can also be used instead of the inner while loop. The while loop might be a tad faster, as it does not invoke an external utility for that step.
