6

Say I wanted to create an array (NOT a list) of 1,000,000 twos in Python, like this:

array = [2, 2, 2, ...... , 2]

What would be a fast but simple way of doing it?

7
  • 1
    I know practically no Python, but could it be something like array = [2 for x in 1..1000000]? Commented Jul 9, 2010 at 15:52
  • This previous question might help - stackoverflow.com/questions/1859864/… Commented Jul 9, 2010 at 15:52
  • @mmyers: Your suggestion is not valid syntax; you possibly mean [2 for x in xrange(1000000)]; [2] * 1000000 would be faster and simpler; however these produce a list -- array and list mean different things in Python. Commented Jul 9, 2010 at 20:47
  • @John: mmyers said he knows practically no Python, so stop nitpicking :) Of course, I appreciate the suggestions. Commented Jul 9, 2010 at 20:55
  • @Vijay Dev: Please stop conflating "educating" and "nitpicking". If @mmyers were to ask a question, I'd be glad to supply references to manuals and tutorials. Who appreciates what suggestions?? Commented Jul 9, 2010 at 21:13

6 Answers 6

19

The currently-accepted answer is NOT the fastest way using array.array; at least it's not the slowest -- compare these:

[source: johncatfish (quoting chauncey), Bartek]
python -m timeit -s"import array" "arr = array.array('i', (2 for i in range(0,1000000)))"
10 loops, best of 3: 543 msec per loop

[source: g.d.d.c]
python -m timeit -s"import array" "arr = array.array('i', [2] * 1000000)"
10 loops, best of 3: 141 msec per loop

python -m timeit -s"import array" "arr = array.array('i', [2]) * 1000000"
100 loops, best of 3: 15.7 msec per loop

That's a ratio of about 9 to 1 ...
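
The same comparison can be run from a script instead of the command line. This is only a sketch using the standard timeit module: the helper name time_statement and the loop counts are my own choices rather than anything from the timings above, and the absolute numbers will vary with machine and Python version.

```python
import array
import timeit

N = 1000000

# The three statements timed above, keyed by a short label.
STATEMENTS = {
    "generator": "array.array('i', (2 for i in range(N)))",
    "list first": "array.array('i', [2] * N)",
    "array multiply": "array.array('i', [2]) * N",
}

def time_statement(stmt, number=3):
    """Return the best per-loop time in seconds for stmt (hypothetical helper)."""
    timer = timeit.Timer(stmt, globals={"array": array, "N": N})
    return min(timer.repeat(repeat=3, number=number)) / number

results = {name: time_statement(stmt) for name, stmt in STATEMENTS.items()}
for name, seconds in results.items():
    print("%-15s %8.2f msec per loop" % (name, seconds * 1000))
```

The globals argument to timeit.Timer requires Python 3.5 or later.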


2 Comments

+1 and I've updated my answer with the other syntax and a comment. Thank you for pointing it out.
slooow :) ... a hybrid of those two versions is better
9

Is this what you're after?

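import array

# slower: builds a million-element list first, then copies it into the array.
twosArr = array.array('i', [2] * 1000000)

# faster: builds a one-element array and repeats it.
twosArr = array.array('i', [2]) * 1000000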

You can get just a list with this:

twosList = [2] * 1000000

-- EDITED --

I updated this to reflect information in another answer. It would appear that you can increase the speed by a ratio of ~ 9 : 1 by adjusting the syntax slightly. Full credit belongs to @john-machin. I wasn't aware you could multiply an array object the same way you can a list.
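
To see that the list and the array really are different objects holding the same values, something like the following should do. It is a rough sketch; the names twos_list and twos_arr are just illustrative, and the sizes it prints depend on the platform.

```python
import array
import sys

twos_list = [2] * 1000000
twos_arr = array.array('i', [2]) * 1000000

# Same values, different types.
print(type(twos_list).__name__, type(twos_arr).__name__)
print(twos_arr.tolist() == twos_list)

# The array stores raw C ints (itemsize bytes per element);
# the list stores references to Python int objects.
print(twos_arr.itemsize, sys.getsizeof(twos_arr), sys.getsizeof(twos_list))
```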


5

A hybrid approach works fastest for me

$ python -m timeit -s"import array" "arr = array.array('i', [2]*100) * 10000"
100 loops, best of 3: 5.38 msec per loop

$ python -m timeit -s"import array" "arr = array.array('i', [2]) * 1000000"
10 loops, best of 3: 20.3 msec per loop
$ python -m timeit -s"import array" "arr = array.array('i', [2]*10) * 100000"
100 loops, best of 3: 6.69 msec per loop
$ python -m timeit -s"import array" "arr = array.array('i', [2]*100) * 10000"
100 loops, best of 3: 5.38 msec per loop
$ python -m timeit -s"import array" "arr = array.array('i', [2]*1000) * 1000"
100 loops, best of 3: 5.47 msec per loop
$ python -m timeit -s"import array" "arr = array.array('i', [2]*10000) * 100"
100 loops, best of 3: 6.13 msec per loop
$ python -m timeit -s"import array" "arr = array.array('i', [2]*100000) * 10"
10 loops, best of 3: 14.9 msec per loop
$ python -m timeit -s"import array" "arr = array.array('i', [2]*1000000)"
10 loops, best of 3: 77.7 msec per loop
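
The hybrid idea can be wrapped in a small helper that also copes with sizes that are not a multiple of the chunk. This is a sketch only: filled_array and its chunk parameter are hypothetical names, and the default of 100 simply mirrors the fastest timing above, which may not hold on other machines.

```python
import array

def filled_array(typecode, value, size, chunk=100):
    """Build an array of `size` copies of `value` by repeating a small seed array."""
    if size < 0 or chunk < 1:
        raise ValueError("size must be >= 0 and chunk must be >= 1")
    repeats, remainder = divmod(size, chunk)
    # Seed array built from a short list (the "hybrid" part).
    seed = array.array(typecode, [value] * chunk)
    result = seed * repeats
    # Top up with a partial chunk when size is not a multiple of chunk.
    result.extend(seed[:remainder])
    return result

arr = filled_array('i', 2, 1000000)
print(len(arr), arr[0], arr[-1])
```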


3

Using the timeit module you can more or less figure out what the fastest way of doing this is.

First off, putting that many numbers in a list will most likely kill your machine, as the whole list is stored in memory.

However, you can test the execution using something like so. It ran on my computer for a long time before I just gave up, but I'm on an older PC:

timeit.Timer('[2] * 1000000').timeit()
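
A likely reason for the long run: Timer.timeit() executes the statement 1,000,000 times by default, so the call above builds a million-element list a million times over. Passing an explicit number keeps it short. A minimal sketch, where the value 10 is an arbitrary choice:

```python
import timeit

loops = 10  # arbitrary; the default for timeit() is 1000000
total = timeit.Timer('[2] * 1000000').timeit(number=loops)
print("%.2f msec per loop" % (total / loops * 1000))
```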

The other option you can look into is the array module, which provides, as stated, efficient arrays of numeric values:

array.array('i', (2 for i in range(0, 1000000)))

I did not test the completion time of either, but I'm sure the array module, which is designed for sets of numbers, will be faster.

Edit: Even more fun, you could take a look at numpy which actually seems to have the fastest execution:

from numpy import *
array( [2 for i in range(0, 1000000)])

Even faster from the comments:

a = 2 * ones(1000000)

Awesome!
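
One thing to watch with the ones() approach is that it produces floats. If an integer array is wanted, numpy.full with an explicit dtype is one option. This is a sketch assuming NumPy is installed; the choice of int32 is mine, not something from the question.

```python
import numpy as np

# Integer array of twos, built directly with a chosen dtype.
a = np.full(1000000, 2, dtype=np.int32)

# ones() returns float64 by default, so this is an array of 2.0 values.
b = 2 * np.ones(1000000)

print(a.dtype, b.dtype)
```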

3 Comments

Numpy also has dedicated factory functions: a = 2 * ones(1000000)
@Philipp: That's awesome! This is why I love SO. Curiosity to answer a question leads to many learnings for myself. Cheers :-)
If you can't fit a million-element list or array into your machine's memory, it's dead already. Also, I don't understand "It ran on my computer for a long time" ... see my answer for (a) how to do simple timing using timeit at the command prompt (b) how small the measured times (milliseconds!) are (4-year-old laptop running Win XP SP2)
1
aList = [2 for x in range(1000000)]

or, based on chauncey's link:

anArray = array.array('i', (2 for i in range(0, 1000000)))


1

If the initial value doesn't have to be non-zero and if you have /dev/zero available on your platform, the following is about 4.7 times faster than the array('L',[0])*size solution:

myarray = array.array('L')
f = open('/dev/zero', 'rb')
myarray.fromfile(f, size)
f.close()

In the question "How to initialise an integer array.array object with zeros in Python" I'm looking for a better way.
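
Where /dev/zero is not available, a possible portable alternative on Python 3 is to initialise the array from a block of zero bytes. This is a sketch, not a measured result: I have not timed it against the fromfile version, and the value of size here is only an example.

```python
import array

size = 1000000  # example value; number of items wanted
itemsize = array.array('L').itemsize  # bytes per item on this platform

# bytes(n) is n zero bytes; array.array accepts a bytes initializer
# and interprets it as raw machine values.
myarray = array.array('L', bytes(size * itemsize))
print(len(myarray), myarray[0])
```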
