25

I'm not too concerned about time efficiency (the operation will be rare), but rather about memory efficiency: Can I grow the array without temporarily having all the values twice?

Is there a more efficient way to grow a large array than creating a new one and copying over all the values? Like, concatenating it with a new one?

What about having fixed-size arrays stored in another array and reallocate / copy that top-level one? Would that leave the actual values in place?

I'm aware of ArrayList, but I need a lot of control about accessing the array and the access needs to be very fast. For instance, I think I prefer a[i] to al.get(i).

The main reason why I care about this is that the array in question (or a number of such arrays) might very well occupy a large enough portion of main memory that the usual strategy of creating a double sized copy before discarding the original might not work out. This may mean that I need to reconsider the overall strategy (or up my hardware recommendations).

9
  • Nice question! I like it Commented Sep 15, 2009 at 13:36
  • 4
    what's the problem with the way ArrayList works? Commented Sep 15, 2009 at 13:37
  • Are you specifically disallowing use of ArrayList? If so, why? Commented Sep 15, 2009 at 13:41
  • Added a note about ArrayList to the post. Commented Sep 15, 2009 at 13:56
  • 1
    Why not use a LinkedList? It has constant time insertion. Commented Sep 15, 2009 at 14:13

12 Answers 12

29

The best way to have a dynamically resizing 'array' or list of items is to use an ArrayList.

Java has already built in very efficient resizing algorithms into that data structure.

But, if you must resize your own array, it is best to use System.arraycopy() or Arrays.copyOf().

Arrays.copyOf() can most simply be used like so:

int[] oldArr;
int newArr = Arrays.copyOf(oldArr, oldArr.length * 2);

This will give you a new array with the same elements as the old array, but now with room to spare.

The Arrays class in general has lots of great methods for dealing with arrays.

Also

It is important to make sure that you aren't just growing your array by one element each time an element is added. It is best to implement some strategy where you only have to resize the array every once in a while. Resizing arrays is a costly operation.

Sign up to request clarification or add additional context in comments.

4 Comments

Well, the growing strategy would most likely be the classic "double each time".
I amended my post to clarify that I'm not concerned about the time it takes to grow the array, but rather about the extra memory that's required by the operation. If this has to be 100% of the array size, I can't grow arrays but have to add arrays.
+1 for a very good answer to the more sloppy first revision of my question!
I'll keep this one here though, it still has some good information.
18

Is there a more efficient way to grow a large array than creating a new one and copying over all the values? Like, concatenating it with a new one?

No. And probably there is no language, that guarantees growing an array will always take place without copying. Once you allocate the space for the array and do something else, you most likely have other objects in memory right after the end of the array. At that point, it's fundamentally impossible to grow the array without copying it.

What about having fixed-size arrays stored in another array and reallocate / copy that top-level one? Would that leave the actual values in place?

You mean have an array of arrays and treat it as one large array consisting of a concatenation of the underlying arrays? Yes, that would work (the "faking it by doing indirection" approach), as in Java, Object[][] is simply an array of pointers to Object[] instances.

2 Comments

Thanks for commenting on my indirection idea, that's exactly what I was wondering.
Really minor side note on the topic of "no language does this"... C will do exactly this--resize an array without copying the data--so long as the realloc'd memory does not fill the entire block and\or no blocks lie beyond the allocated block. With a large block allocator or fastest fit allocator this is not uncommon depending on the size and alignment of your array. With best fit allocators, naturally the memory will be moved to a different block most of the time, however it still eliminates the "double copy" as the memory may be moved a segment at a time if the implementation supports it.
6

Arrays are constant-size, so there is no way to grow them. You can only copy them, using System.arrayCopy to be efficient.


ArrayList does exactly what you need. It's optimized much better than any of us could do, unless you devote a considerable time to it. It uses internally System.arrayCopy.


Even more, if you have some huge phases where you need the list to grow/reduce, and others where it doesn't grow/reduce and you make thousands of read or write in it. Suppose also you have a huge performance need, that you prooved that ArrayList is too slow when read/writing. You could still use the ArrayList for one huge phase, and convert it to an array for the other. Note this would be effective only if your application phases are huge.

Comments

4

How about a linked list coupled with an array that holds only references.

The linked list can grow without having to allocate new memory, the array would ensure you have easy access. And every time the array becomes to small, you can simply trash the entire array and build it up again from the linked list.

alt text

14 Comments

Wouldn't that mean you have a permanent copy of all the data in memory? With the memory limitations he talks about, don't think this will help.
For my case, Yannick is right, but it's still an interesting idea, might be an inspiration for similar problems.
Well if were to be used only as a backup to rebuild a fixed size array, you might aswell use an ArrayList :-)
@Yannick If you'd use an arraylist, don't you have a full copy in memory again when you increase the size of the ArrayList.
@Yannick (again, 1st comment now) I don't believe I'd have a copy of all data in memory, if I'm not parsing your comment correctly or miss some fundamentals here please set me straight.
|
3

"Can I grow the array without temporarily having all the values twice?"

Even if you copy the arrays, you're only going to have all the values once. Unless you call clone() on your values, they're passed by reference into the new array.

If you already have your values in memory, the only additional memory expense when copying into a new array is allocating the new Object[] array, which doesn't take much memory at all, as it's just a list of pointers to value objects.

1 Comment

It's a large array of primitives. No object references. Hence copying the array does need at least twice the size of the array allocated.
2

Is the array itself large, or are you referencing large ReferenceTypes?

There is a difference between an array of a PrimitiveType with billions of elements, and an array with thousands of elements, but they refer to large class instances.

int[] largeArrayWithSmallElements = new int[1000000000000];
myClass[] smallArrayWithLargeElements = new myClass[10000];

Edit:

If you have performance considerations using ArrayList, I can assure you it will perform more or less exactly as Array indexing.

And if the application has limited memory resources, you can try to play around with the initial size of the ArrayList (one of it's constructors).

For optimal memory efficiency, you could create a container class with an ArrayList of Arrays.

Something like:

class DynamicList
{
    public long BufferSize;
    public long CurrentIndex;

    ArrayList al = new ArrayList();

    public DynamicList(long bufferSize)
    {
        BufferSize = bufferSize;

        al.add(new long[BufferSize]);
    }

    public void add(long val)
    {
        long[] array;

        int arrayIndex = (int)(CurrentIndex / BufferSize);

        if (arrayIndex > al.size() - 1)
        {
            array = new long[BufferSize];
            al.add(array);
        }
        else
        {
            array = (long[])al.get(arrayIndex);
        }

        array[CurrentIndex % BufferSize] = val;
    }

    public void removeLast()
    {
        CurrentIndex--;
    }

    public long get(long index)
    {
        long[] array;

        int arrayIndex = (int)(index / BufferSize);

        if (arrayIndex < al.size())
        {
            array = (long[])al.get(arrayIndex);
        }
        else
        {
            // throw Exception
        }

        return array[index % BufferSize];
    }
}

(my java is rusty, so please bear with me...)

3 Comments

The array itself is large (millions), the element type is long. Also, there are several thousand such arrays.
The difference being that copying the small array with large elements only takes up the space for the object references, not the objects?
Exactly, hence my question, but since you have an array that takes a lot of memory, I would suggest a different approach when you have memory efficiency in mind.
1

Have a look at System.arraycopy.

2 Comments

This is not what the OP is asking ... A copy of the array is still made when growing.
That might not have been clear in the first revision of my post, I made some clarifications later.
1

AFAIK the only way of growing or reducing an array is doing a System.arraycopy

   /**
    * Removes the element at the specified position in this list.
    * Shifts any subsequent elements to the left (subtracts one from their
    * indices).
    *
    * @param index the index of the element to removed.
    * @return the element that was removed from the list.
    * @throws    IndexOutOfBoundsException if index out of range <tt>(index
    *     &lt; 0 || index &gt;= length)</tt>.
    */
    public static <T> T[] removeArrayIndex(T[] src, int index) {
        Object[] tmp = src.clone();

        int size = tmp.length;
        if ((index < 0) && (index >= size)) {
            throw new ArrayIndexOutOfBoundsException(index);
        }

        int numMoved = size - index - 1;
        if (numMoved > 0) {
            System.arraycopy(tmp, index + 1, tmp, index, numMoved);
        }
        tmp[--size] = null; // Let gc do its work

        return (T[]) Arrays.copyOf(tmp, size - 1);
    }

   /**
    * Inserts the element at the specified position in this list.
    * Shifts any subsequent elements to the rigth (adds one to their indices).
    *
    * @param index the index of the element to inserted.
    * @return the element that is inserted in the list.
    * @throws    IndexOutOfBoundsException if index out of range <tt>(index
    *     &lt; 0 || index &gt;= length)</tt>.
    */
    public static <T> T[] insertArrayIndex(T[] src, Object newData, int index) {
        Object[] tmp = null;
        if (src == null) {
            tmp = new Object[index+1];
        } else {
            tmp = new Object[src.length+1];

            int size = tmp.length;
            if ((index < 0) && (index >= size)) {
                throw new ArrayIndexOutOfBoundsException(index);
            }

            System.arraycopy(src, 0, tmp, 0, index);
            System.arraycopy(src, index, tmp, index+1, src.length-index);
        }

        tmp[index] = newData;

        return (T[]) Arrays.copyOf(tmp, tmp.length);
    }    

3 Comments

I should look this up before I make myself look silly (but time is an issue ATM), but don't you need a throws clause in the method declaration for the ArrayIndexOutOfBoundsException, or is that unchecked?
I would need to double check, but this code is copy&paste from an utility class that I've in production (but I'm not sure if this code is used anymore) ;)
ArrayIndexOutOfBoundsException is definitely unchecked.
1

Obviously, the important bit here is not if you concatenate the arrays or copy them over; what's more important is your array growing strategy. It's not hard to see that a very good way to grow an array is always doubling its size when it becomes full. This way, you will turn the cost of adding an element to O(1) as the actual growing stage will happen only relatively rarely.

1 Comment

Yes, that's one important bit, but I was aware of that one. My question arises because I have a really large array that may occupy a large enough portion of memory that I can not just create a temporary copy.
1

One way of doing this is having a linked list of array nodes. This is somewhat complex but the premise is this:

You have a linked list and each node within the list references an array. This way, your array can grow without ever copying. To grow you only need to add additional nodes at the end. Therefore the 'expensive' grow operation only occurs every M operations where M is the size of each node. Granted, this assumes that you always append to the end and you don't remove.

Insertion and removal in this structure is quite complicated, but if you can avoid them then that's perfect.

The only loss with this structure (ignoring insertion and deletion) is with the gets. The gets will be slightly longer; accessing the correct node requires accessing the correct node within the linked list and then fetching there. If there are a lot of accesses around the middle, this can be slow however there are tricks to speeding linked lists up.

Comments

1

Have you looked at GNU Trove for highly efficient java collections? Their collections store primatives directly for much better memory usage.

Comments

0

Heres a benchmark of time taken to add and remove elements from a collection/arraylist/vector

1 Comment

That's Java 1.2 Those figures are probably very out of date.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.