6

I need a Java data structure that I can efficiently add, delete, and access a random object.

This is what doesn't work:

ArrayList has efficient add (constant time), and random access (just "get" with a random integer), but deletes can take linear time because it has to potentially search the whole list for it.

TreeSet or HashSet has efficient add and delete, but I can't figure out how to get a random object.

Any ideas?

In theory, a B Tree would work, if I could traverse the tree myself with random Lefts or Rights, but I don't think a standard Java class gives me this ability.

I'm willing to use a third party library if nothing in the standard Java classes will work.

I do not need to support duplicates or nulls, nor does it need to be thread safe.

Thanks.

7
  • ArrayList.remove(int index) runs in amortized constant time. As far as I can tell, this means individual calls are linear time, but the average time over a series of calls approaches constant time. Commented May 5, 2014 at 2:06
  • Thanks for that Aarowaim, but I wouldn't know the index of the object to remove. I suppose I could store that information separately in a HashMap or something, but as soon as I removed an object from the ArrayList, the other objects in the list would change indexes, which would mean some of the values in the HashMap would be wrong, and it would take me linear time to fix them. Commented May 5, 2014 at 2:20
  • @Magmatic Do you need to allow duplicate elements? Commented May 5, 2014 at 2:42
  • I do not need to support duplicates or nulls, nor does it need to be thread safe. Commented May 5, 2014 at 2:47
  • 1
    How many elements are you talking about? Hundreds? Thousands? Millions? Are they simple numbers, strings, or more complex objects? Commented May 5, 2014 at 3:16

3 Answers 3

9

You can get this with an ArrayList/HashMap pair (where the map will store the index of objects in the list) if you make list removal unordered. On removal, instead of shifting all subsequent elements in the list down one position, you just move the last element to fill the space of the removed item. Then there are at most two different elements that need to be touched during removal. A quick example (which I haven't tested and hope I didn't bungle):

class SpecialSet<E> extends AbstractSet<E> implements Set<E> {
    private final List<E> list = new ArrayList<>();
    private final Map<E,Integer> indexMap = new HashMap<>();

    @Override
    public boolean add(E e) {
        if (contains(e)) return false;
        indexMap.put(e, list.size());
        list.add(e);
        return true;
    }

    @Override
    public boolean remove(Object o) {
        Integer indexBoxed = indexMap.remove(o);
        if (indexBoxed == null) return false;
        int index = indexBoxed;
        int last = list.size() - 1;
        E element = list.remove(last);
        if (index != last) {
            indexMap.put(element, index);
            list.set(index, element);
        }
        return true;
    }

    public E removeRandom() {
        E element = list.get((int)(Math.random() * size()));
        remove(element);
        return element;
    }

    @Override
    public boolean contains(Object o) {
        return indexMap.containsKey(o);
    }

    @Override
    public Iterator<E> iterator() {
        return list.iterator();
    }

    @Override
    public int size() {
        return list.size();
    }
}

Depending on what object equality testing behavior you want, you maybe could swap out the HashMap for an IdentityHashMap for a little better speed.

Sign up to request clarification or add additional context in comments.

4 Comments

Wow, thanks for the code! I just threw it into my project. (I only changed the "removeRandom" to "getRandom" and took out the line that removes it). It worked perfectly! I had been using an ArrayList with the inefficient remove, but when I changed to this code, the processing time dropped from 1900 ms to 628 ms. Wow, wow, wow! Thank you, thank you, thank you!
I don't see the importance of HashMap here. The most important trick here is: removal is not removing the element directly, it is removing the last element, and put the last element at the position that need to be removed. In such case, just a single ArrayList as internal structure will help
@AdrianShum You have to find an element before you can remove it; that's the part that the HashMap makes fast.
@Boann Oops, you are right, I overlooked the part of delete (I was thinking OP was asking for remove by index)
1

I'd propose that you use a Map, with an Integer key. That way you can generate a random number for the remove.

Problem left for the reader: on subsequent removes (or any operation) there will be gaps.

1 Comment

But how can you remove a value if you don't know its key? You would still need to do an inefficient linear search to find it.
1

Note that a simple array will do the job, if you pre-allocate to the max size you'll need. Keep a "top" index value and insert by incrementing "top" and storing into that array element. When you delete, copy the "top" value down into the deleted element and decrement "top".

If you don't know your max size you can still allocate an array of arrays and use the same basic algorithm, only you need to first compute the major and minor array index values before accessing an element.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.