0

The bash below goes to a folder and loops over the .html files in it, with each file name held in $f1. It then removes all text after the _ and stores the result in $p. I added a for loop to get the unique ids from $p. The terminal output shows $p is correct, but only the last value is being stored in the new array ($sorted_unique_ids). I am not sure why all three are not.

dir=/path/to
var=$(ls -td "$dir"/*/ | head -1)  ## store newest <run> in var
for f1 in "$var"/qc/*.html ; do
  # Grab file prefix
  bname=`basename $f1` # strip off path
  p="$(echo $bname|cut -d_ -f1)"
  typeset -A myarray  ## define associative array
  myarray[${p}]=yes  ## store p in myarray
  for i in ${!myarray[@]}; do echo ${!myarray[@]} | tr ' ' '\n' | sort; done
done

output

id1
id1
id1
id2
id1
id2
id1
id2
id3
id1
id2
id3

desired sorted_unique_ids

id1
id2
id3
  • Then why do you loop through a single element "$p"? Commented Nov 25, 2019 at 16:52
  • I changed it to "${id[@]}" with the same result. Thank you :). Commented Nov 25, 2019 at 16:57
  • $p is not an array here, neither is $id, so it makes no sense to use the ${var[@]} syntax. Commented Nov 25, 2019 at 17:00
  • Please post an example directory structure. I am not at all sure what exactly you want to do. Do you have files like "$var"/qc/id1_blabla.html "$var"/qc/id2_blabla.html "$var"/qc/id3_blabla.html? And you want to get the list of ids? Then why do you run for i in ${!myarray[@]} for each file? Commented Nov 25, 2019 at 17:39
  • Re: the latest edit: move the line ` for i in ${!myarray[@]}; do echo ${!myarray[@]} | tr ' ' '\n' | sort; done` down in your script, past the last done. Right now you're getting repeated lines of output because you are, basically, printing each p value to stdout as you process it. What you want to do is wait until you've completed all loop processing and all array assignments; then you should find that you have a unique set of p values stored in the array. Commented Nov 25, 2019 at 17:41

3 Answers

1

Maybe something like this:

dir=$(ls -td "$dir"/*/ | head -1)
find "$dir" -maxdepth 1 -type f -name '*_*.html' -printf "%f\n" |
cut -d_ -f1 | sort -u

For an input directory structure created like this:

dir=dir
mkdir -p dir/dir
touch dir/dir/id{1,2,3}_{a,b,c}.html

So it looks like this:

dir/dir/id2_b.html
dir/dir/id1_c.html
dir/dir/id2_c.html
dir/dir/id1_b.html
dir/dir/id3_b.html
dir/dir/id2_a.html
dir/dir/id3_a.html
dir/dir/id1_a.html
dir/dir/id3_c.html

The script will output:

id1
id2
id3

Tested on repl.

1
latest=`ls -t "$dir"|head -1`    # or …|sed q` if you're really jonesing for keystrokes
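# ls -t "$dir" prints names relative to $dir, so prefix them with $dir;
# %%_* cuts at the first underscore, matching cut -d_ -f1
for f in "$dir/$latest"/qc/*_*.html; do f=${f##*/}; printf %s\\n "${f%%_*}"; done | sort -u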
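
The loop above uses shell parameter expansion instead of basename and cut. Here is a small sketch of what each expansion does, using a made-up path (the file name and the variable names name, short and first are only for illustration). Note that % removes the shortest matching suffix and %% the longest, which matters when a file name contains more than one underscore:

```shell
#!/usr/bin/env bash
f=/path/to/run1/qc/id1_sample_report.html   # hypothetical file name

name=${f##*/}        # drop the longest prefix ending in '/'  -> id1_sample_report.html
short=${name%_*}     # drop the shortest suffix starting at '_' -> id1_sample
first=${name%%_*}    # drop the longest suffix starting at '_'  -> id1

printf '%s\n' "$name" "$short" "$first"
```

Only the %% form gives the same result as cut -d_ -f1 for names with several underscores.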

1

Define an associative array:

typeset -A myarray

Use each p value as the index for an array element; assign any value you want to the array element (the value just acts as a placeholder):

myarray[${p}]=yes

If you run across the same p value more than once, each assignment to the array will overwrite the previous assignment; the net result is that the array only ever holds a single element for each distinct value of p.
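
To see the overwrite behaviour in isolation, here is a minimal sketch. It assumes bash 4 or later (associative arrays are not available before that), and the sample prefixes and the array name seen are made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical sample data: prefixes as they might come out of the cut step.
prefixes=(id1 id1 id2 id1 id3 id2)

typeset -A seen            # associative array; the keys are the prefixes
for p in "${prefixes[@]}"; do
    seen[$p]=yes           # assigning to an existing key just overwrites it
done

echo "number of keys: ${#seen[@]}"   # six assignments, three distinct keys
```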

To obtain your unique list of p values, you can loop through the indexes for the array, eg:

for i in ${!myarray[@]}
do
    echo ${i}
done

If you need the array indexes generated in sorted order try:

echo ${!myarray[@]} | tr ' ' '\n' | sort

You can then use this sorted result set as needed (eg, dump to stdout, feed to a loop, etc).
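
If the sorted keys need to end up in a regular array (the question calls it sorted_unique_ids), one possible sketch is shown here. It assumes bash 4 or later for mapfile and keys that contain no newlines; the three keys are hypothetical stand-ins for the values collected in the file loop:

```shell
#!/usr/bin/env bash
typeset -A myarray
# Hypothetical keys standing in for the p values gathered from the file names.
myarray[id3]=yes; myarray[id1]=yes; myarray[id2]=yes

# printf emits one key per line, sort orders the lines, and mapfile reads
# them into an indexed array (-t strips the trailing newline from each line).
mapfile -t sorted_unique_ids < <(printf '%s\n' "${!myarray[@]}" | sort)

for id in "${sorted_unique_ids[@]}"; do
    echo "$id"
done
```

Using printf with a quoted expansion rather than echo piped through tr keeps keys that contain spaces intact.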


So, adding my code to the OP's original code would give us:

typeset -A myarray  ## define associative array

dir=/path/to
var=$(ls -td "$dir"/*/ | head -1)  ## store newest <run> in var
for f1 in "$var"/qc/*.html ; do
  # Grab file prefix
  bname=$(basename "$f1")  # strip off path
  p=$(echo "$bname" | cut -d_ -f1)
  myarray[${p}]=yes  ## store p in myarray
done

# display sorted, unique set of p values (printed once, after the loop)
echo ${!myarray[@]} | tr ' ' '\n' | sort

5 Comments

I added the output in the edit of the post, the lines are repeating :).
You populate the array inside your loop; after you've completed all of your looping (and array assignments), then you add my processing. You're getting repeated lines of output because you're basically just printing each p value to stdout as you process it.
I've updated the answer to show proper use of the typeset and display of the final array indexes
I think the usage of an associative array is questionable. Why not a regular array, where instead of myarray[${p}]=yes you just add the element with myarray+=("$p")? What is that yes for?
'yes' ==> acts as a placeholder ... need to assign something to the array, right? and I'm not sure your proposal works ... would need to see your complete solution
