Indexing an array with duplicate elements before and after sorting

Question

This is the basic problem: I have an array of integers with possibly duplicate elements. I need to know the indices of each element, but when I sort the array, whenever I select an element from the new array, I want to be able to reference the same element from the original array.

I am looking for a solution to the problem, or maybe a solution to the approach I am taking.

Here is an array

a = [1, 2, 3, 4, 3, 5, 2]

There are two 2's and two 3's, but if I'm working with the first 2 (from the left), I want to work with index 1, and if I'm working with the second 2, I want to be working with index 6. So I use a helper array to allow me to do this:

helper = [0, 1, 2, 3, 4, 5, 6]

Which I will iterate over and use to access each element from a.
I could have accomplished this with each_with_index, but the problem begins when I sort the array.

Now I have a sort order

sort_order = [2, 4, 1, 5, 3]

I use sort_by to sort a according to sort_order, to produce

sorted_a = [2, 2, 4, 1, 5, 3, 3]

You may assume all elements in the input exist in sort_order to avoid sort_by exceptions.

Now the problem is that my helper array should be updated to match the new positions. Each element should be sorted the same way as a was sorted, because it is unclear whether the first 2 in the new array was at index 1 or at index 6 of the original array.

So my new helper array might look like

new_helper = [1, 6, 3, 0, 5, 2, 4]

So if I were to go with this approach, how would I produce the new_helper array, given the original array and the sort order?

Maybe there is a better way to do this?

Does it matter if the helper array points to a different element from the original as long as the value of that element is the same ? — Hunter McMillen
– Hunter McMillen, Commented Jul 25, 2012 at 18:46
The values are not important (in the context of my methods that are using them), but the positions are. That's what I had in mind when I created my helper array, so the new helper array should point to the same element. — MxLDevs
– MxLDevs, Commented Jul 25, 2012 at 19:11
You need to implement the sorting logic yourself then, and whenever you swap a position in your array swap it also inside of your helper array. — Hunter McMillen
– Hunter McMillen, Commented Jul 25, 2012 at 19:13

sxu · Accepted Answer · 2012-07-25 19:25:59Z

1

I would suggest first zip the original array with the helper array, sort the zipped array according the component coming from the original array, then unzip them (this method does not exist unfortunately, but you can do transpose). Or you can implement your own sorting logic as pointed out by Hunter.

edited Jul 25, 2012 at 19:25

answered Jul 25, 2012 at 19:16

sxu

5514 silver badges14 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

Keith Randall · Accepted Answer · 2012-07-26 03:15:55Z

0

Make a list of pairs of the original data and that data's index. Like this:

a = [(1, 0), (2, 1), (3, 2), (4, 3), (3, 4), (5, 5), (2,6)]

Sort that list (lexicographically, or just ignore the second part of the pair except to carry it along). The second item in every pair tells you where the element was in the original array.

answered Jul 26, 2012 at 3:15

Keith Randall

23.3k3 gold badges38 silver badges54 bronze badges

Comments

Hunter McMillen · Accepted Answer · 2012-07-25 19:23:55Z

0

You need to swap the values in the helper array when you swap then in your main array.

loop do
   swapped = false
   0.upto(list.size-2) do |i|
      if list[i] > list[i+1]
         list[i], list[i+1] = list[i+1], list[i] # swap values
         helper[i], helper[i+1] = helper[i+1], helper[i]; #swap helper values
         swapped = true
      end
   end
   break unless swapped
end

Example

irb(main):001:0> def parallel_sort(list, helper)
irb(main):002:1> loop do
irb(main):003:2*    swapped = false
irb(main):004:2>    0.upto(list.size-2) do |i|
irb(main):005:3*       if list[i] > list[i+1]
irb(main):006:4>          list[i], list[i+1] = list[i+1], list[i] # swap values
irb(main):007:4>          helper[i], helper[i+1] = helper[i+1], helper[i]; #swap helper values
irb(main):008:4*          swapped = true
irb(main):009:4>       end
irb(main):010:3>    end
irb(main):011:2>    break unless swapped
irb(main):012:2> end
irb(main):013:1> return [list, helper]
irb(main):014:1> end
=> nil
irb(main):015:0> a = [3,2,1]
=> [3, 2, 1]
irb(main):016:0> b = ["three","two","one"]
=> ["three", "two", "one"]
irb(main):017:0> parallel_sort(a,b)
=> [[1, 2, 3], ["one", "two", "three"]]
irb(main):018:0>

edited Jul 25, 2012 at 19:23

answered Jul 25, 2012 at 19:18

Hunter McMillen

61.8k25 gold badges124 silver badges176 bronze badges

2 Comments

MxLDevs Over a year ago

The sort order is based on the custom sort order array though, and I'm not sure how to implement that kind of sorting efficiently. But the idea works. I was hoping to just use sort_by to get things done for me.

Hunter McMillen Over a year ago

@Keikoku Well as long as you aren't sorting hundreds of thousands of elements; what I posted above you do fine.

user1277476 · Accepted Answer · 2012-07-25 20:53:08Z

0

Sorting inside a loop is rarely a good idea.... If you are doing so, you might be better off with a treap (fast on average but infrequently an operation will take a while) or red-black tree (relatively slow, but gives pretty consistent operation times). These are rather like hash tables, except they're not as fast, and they keep elements stored in order using trees.

Either way, why not use a class that saves both the value to sort by, and the helper value? Then they're always together, and you don't need a custom sorting algorithm.

answered Jul 25, 2012 at 20:53

user1277476

2,90714 silver badges10 bronze badges

1 Comment

MxLDevs Over a year ago

Yes that's the solution I was going towards since the original design is pretty bad. But I imagine someone might run into this kind of problem at some point where they don't have the option of changing the design.

galymzhan · Accepted Answer · 2012-07-26 04:02:48Z

0

Since you have sort_order, your array is already kind of sorted, so we should use this fact as an advantage. I came up with this simple solution:

a = [1, 2, 3, 4, 3, 5, 2]
sort_order = [2, 4, 1, 5, 3]

# Save indices
indices = Hash.new { |hash, key| hash[key] = [] }
a.each_with_index { |elem, index| indices[elem] << index }

# Sort the array by placing elements into "right" positions
sorted = []
helper = []
sort_order.each do |elem|
  indices[elem].each do |index|
    sorted << elem
    helper << index
  end
end

p sorted
p helper

The algorithm is based on idea of Counting sort, I slightly modified it to save indices.

answered Jul 26, 2012 at 4:02

galymzhan

5,5132 gold badges32 silver badges45 bronze badges

Collectives™ on Stack Overflow

Indexing an array with duplicate elements before and after sorting

5 Answers 5

Comments

Comments

2 Comments

1 Comment

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

5 Answers 5

Comments

Comments

2 Comments

1 Comment

Comments

Your Answer

Sign up or log in

Post as a guest

Related