1

What is the most performant way to convert something like this

import numpy as np

problem = [[np.array([1,2,3]), np.array([4,5])],
           [np.array([6,7,8]), np.array([9,10])]]

into

desired = np.array([[1,2,3,4,5], 
                   [6,7,8,9,10]])

Unfortunately, the final number of columns and rows (and length of subarrays) is not known in advance, as the subarrays are read from a binary file, record by record.

5
  • So the number of elements in each row of the list would be the same, like 5 here? Commented Nov 23, 2016 at 9:58
  • Does bmat work? Commented Nov 23, 2016 at 11:06
  • The number for each row is the same, so no padding or anything else is required. Commented Nov 23, 2016 at 13:11
  • Then I guess the fastest approach would be a traditional loop: initialize the output array and use np.concatenate iteratively to assign each row, as listed in @Carles Mitjans's solution. Commented Nov 23, 2016 at 15:32
  • A similar question with answer by Warren: stackoverflow.com/questions/39128514/… Commented Nov 23, 2016 at 17:54
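
The preallocation idea from the comments above could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name `rows_to_array` is hypothetical, and it assumes every row joins to the same total length (as the asker confirmed) and that the first row's dtypes are representative of the whole list.

```python
import numpy as np

def rows_to_array(rows):
    """Hypothetical helper: join each row's subarrays into one 2-D array."""
    # Row width is the summed length of the first row's subarrays
    # (assumption: all rows have the same total length).
    n_cols = sum(len(a) for a in rows[0])
    # Assumption: the first row's dtypes are representative.
    dtype = np.result_type(*rows[0])
    out = np.empty((len(rows), n_cols), dtype=dtype)
    for i, row in enumerate(rows):
        # Join the row's subarrays and copy them into the preallocated row.
        out[i] = np.concatenate(row)
    return out

problem = [[np.array([1, 2, 3]), np.array([4, 5])],
           [np.array([6, 7, 8]), np.array([9, 10])]]
result = rows_to_array(problem)
```

Each row is still joined into a temporary array before being copied, so whether this beats the list-comprehension versions would need to be measured on realistic data.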

2 Answers

5

How about this:

problem = [[np.array([1,2,3]), np.array([4,5])],
           [np.array([6,7,8]), np.array([9,10])]]

print(np.array([np.concatenate(x) for x in problem]))

2

I think this:

print(np.array([np.hstack(i) for i in problem]))

Using your example, this runs in 0.00022s, whereas concatenate takes 0.00038s.

You can also use apply_along_axis, although this runs in 0.00024s:

print(np.apply_along_axis(np.hstack, 1, problem))
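
Timings on a two-row example are likely dominated by call overhead, so a comparison on larger input may be more telling. The following is a minimal benchmarking sketch, not the measurement used above: the data sizes, the random test data, and the function names `with_concatenate` and `with_hstack` are all assumptions made for illustration.

```python
import timeit
import numpy as np

# Hypothetical test data: 1000 rows, each made of two subarrays.
rng = np.random.default_rng(0)
problem = [[rng.random(300), rng.random(200)] for _ in range(1000)]

def with_concatenate(rows):
    return np.array([np.concatenate(r) for r in rows])

def with_hstack(rows):
    return np.array([np.hstack(r) for r in rows])

# Both variants should produce identical results.
assert np.array_equal(with_concatenate(problem), with_hstack(problem))

for func in (with_concatenate, with_hstack):
    # Best of 5 repeats, 10 calls each, to reduce timer noise.
    best = min(timeit.repeat(lambda: func(problem), number=10, repeat=5))
    print(f"{func.__name__}: {best / 10:.6f} s per call")
```

The absolute numbers depend on the machine and the NumPy version, so only the relative difference between the two variants is meaningful.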

6 Comments

Looks good, but I believe this takes two allocation steps? First one for each row, and then every row gets copied into the large array?
Look at bmat code - an hstack for each row, and vstack to join the rows. If the list was flattened, you could use one concatenate and then reshape. I don't think the time differences are significant.
np.concatenate seems to be faster than stacking or bmat: problem = [[np.array([.1]*5000)] * 5] * 10000; solution = np.array([np.concatenate(x) for x in problem])
Hstack uses concatenate.
Maybe, but there's a clear difference for me (Python 3.5, NumPy 1.11).
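
The "flatten, concatenate once, then reshape" idea from the comments could look something like this. It is a sketch under the assumption that every row has the same total length. The `np.block` call is not mentioned in the thread; it is a newer function (NumPy 1.13 and later) included here only as a possible alternative to bmat for this kind of nested list.

```python
import itertools
import numpy as np

problem = [[np.array([1, 2, 3]), np.array([4, 5])],
           [np.array([6, 7, 8]), np.array([9, 10])]]

# Flatten the nested list so a single concatenate call sees every subarray.
flat = list(itertools.chain.from_iterable(problem))
# One concatenate for all the data, then reshape to one row per record.
# Assumes every row has the same total length.
result = np.concatenate(flat).reshape(len(problem), -1)

# Alternative (assumption: NumPy >= 1.13): assemble the nested list directly.
blocked = np.block(problem)
```

The single-concatenate version avoids building a temporary array per row, though as noted above the time difference may not be significant.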