6

In the code below I build up data in a nested list. After the for loop, I would like to convert it into a multidimensional NumPy array as neatly as possible. However, when I call np.array() on it, only the outer list seems to be converted into an array. Worse, when I continue downward I end up with dataPoints of shape (100L,), that is, an array of lists where each list is my data (I wanted shape (100, 3)). I have also tried numpy.asanyarray() but can't work it out. I would really like a 3D array from my 3D list from the outset, if that is possible. If not, how can I turn the array of lists into a 2D array without iterating over it and converting each list?

Edit: I am also open to a better way of structuring the data from the outset if it makes processing easier. However, the data arrives over a serial port and its size is not known beforehand.

import numpy as np
import time

data = []
for _i in range(100):   #build some list of lists
    d = [np.random.rand(), np.random.rand(), np.random.rand()]
    data.append([d,time.clock()])

dataArray = np.array(data)  #now I have an array of lists of a list(of data) and a time
dataPoints = dataArray[:,0] #this is the data in an array of lists
  • You don't have a 3d nested list, you have a mix of lists and scalars. data is a list that contains objects that look like this: [[0.434,0.34,0.22],0.2]. That is a mixed object so numpy wouldn't know what to do with it. Commented Dec 6, 2012 at 15:43
  • This is true of the original data object, which is why I wasn't sure it was possible from there. However the dataPoints object is an array of lists of floats which I can't seem to get into a 2d array either. Commented Dec 6, 2012 at 15:48

2 Answers

8

dataPoints is not a 2d list. Convert it first into a 2d list and then it will work:

d=np.array(dataPoints.tolist())

Now d is (100,3) as you wanted.
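For reference, here is a minimal self-contained sketch of that round trip. The sample data is a made-up stand-in for the question's list, and dtype=object is passed explicitly because recent NumPy releases refuse to build mixed list/scalar arrays implicitly; the variable names are illustrative only.

```python
import numpy as np

# Hypothetical stand-in for the question's data: 100 rows of [triplet, timestamp].
rng = np.random.default_rng(0)
data = [[rng.random(3).tolist(), float(i)] for i in range(100)]

# Explicit dtype=object: each row holds a list and a scalar, so the result is
# a (100, 2) object array rather than a numeric 3D array.
data_array = np.array(data, dtype=object)

data_points = data_array[:, 0]            # shape (100,), each element is a list
points = np.array(data_points.tolist())   # shape (100, 3), dtype float64
times = data_array[:, 1].astype(float)    # shape (100,), dtype float64
```

The tolist() call hands NumPy a plain list of equal-length lists, which it can convert into a regular 2D float array.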


5 Comments

Yes, dataPoints is an array of lists. This does work, but is it the best approach from the for loop onward? I end up converting to an array (for the slicing ability), back to a list (to get the right shape), then back to an array.
Skip the array conversion and slicing. Append only d (leave time.clock() out, since you are slicing it out later). This will give you a list of lists which you can then convert into an array. Or, better yet, start with a numpy array in the first place and don't use lists.
I need the timestamp in other parts of the code, and I don't know the size at the time/the data trickles in over a serial port. But your solution is the best I've found so far.
@MattAnderson Why aren't you just appending the timestamp to the number triplet so you get a (100,4) array? What dimensions would you want your final array to be?
I guess because that would be too simple. Probably just because in my mind one was a time and one was data, I didn't consider that time in this case is just another piece of data. Thanks for your help.
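
The (100,4) layout suggested in the comments above could look roughly like the sketch below. It assumes the readings can be treated as three floats per sample; np.random.rand stands in for the serial port, and time.perf_counter() is used instead of time.clock(), which was removed in Python 3.8.

```python
import time
import numpy as np

rows = []
for _ in range(100):
    # Stand-in for one reading from the serial port (hypothetical data source).
    x, y, z = np.random.rand(3)
    # Store the timestamp as a fourth column next to the triplet.
    rows.append([x, y, z, time.perf_counter()])

samples = np.array(rows)   # shape (100, 4), dtype float64
points = samples[:, :3]    # the data triplets, shape (100, 3)
times = samples[:, 3]      # the timestamps, shape (100,)
```

Because the list only grows by appending, this still works when the number of samples is not known in advance; the single np.array() call happens once the data has been collected.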
-1

If a 2d array is what you want:

from itertools import chain
dataArray = np.array(list(chain(*data)),shape=(100,3))

I didn't work out the code so you may have to change the column/row ordering to get the shape to match.
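
Note that np.array() does not take a shape argument, and chaining over data directly would also mix the timestamps in with the triplets. A possible working variant of the same idea, offered as a sketch with made-up sample data rather than a tested fix for the original setup, flattens only the triplets and then reshapes:

```python
from itertools import chain
import numpy as np

# Hypothetical stand-in for the question's data: [triplet, timestamp] pairs.
data = [[[0.1 * i, 0.2 * i, 0.3 * i], float(i)] for i in range(100)]

# Flatten only the triplets (dropping the timestamps), then reshape into rows of 3.
flat = np.fromiter(chain.from_iterable(d for d, _t in data), dtype=float)
data_array = flat.reshape(-1, 3)   # shape (100, 3)
```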

2 Comments

If I understand this correctly, it just iterates through and calls np.array() on each list. I feel like there should be a better way; if it turns out there isn't, I will accept this.
docs.scipy.org/doc/numpy-1.10.0/reference/generated/… I am not sure where you are getting the shape param from.
