
I have an array of arrays as follows:

segments[:30]
array([array([ 131.2]), array([ 124.1]), 0.23679025210440158,
       array([ 133.65]), array([ 123.3]), 0.3221912760287523,
       array([ 116.7]), array([ 147.7]), 0.24318619072437286,
       array([ 102.3]), array([ 120.55]), 0.07436020392924547,
       array([ 130.25]), array([ 100.5625]), 0.029634355247253552,
       array([ 143.6]), array([ 132.4]), 0.5843092009425164,
       array([ 151.65]), array([ 131.6]), 0.4865431547164917,
       array([ 143.3]), array([ 152.05]), 0.2774583905003965,
       array([ 111.65]), array([ 125.]), 0.23880321211181582,
       array([ 123.1875]), array([ 79.5625]), 0.1562070251966361], dtype=object)

I would like to get rid of array([ 131.2]) and extract only the value 131.2.

My expected output is:

array([131.2, 124.1, 0.23679025210440158,
           133.65, 123.3, 0.3221912760287523,
           116.7,147.7, 0.24318619072437286,
           102.3, 120.55, 0.07436020392924547,....])

I have tried the following:

np.array(segments)

but it doesn't make any change to my data.

2 Answers

Method 1: list comprehension

One way would be to iterate through with a list comprehension, extracting the float whenever the value is an np.ndarray:

np.array([i[0] if isinstance(i, np.ndarray) else i for i in segments])

Which returns:

array([1.31200000e+02, 1.24100000e+02, 2.36790252e-01, 1.33650000e+02,
       1.23300000e+02, 3.22191276e-01, 1.16700000e+02, 1.47700000e+02,
       2.43186191e-01, 1.02300000e+02, 1.20550000e+02, 7.43602039e-02,
       1.30250000e+02, 1.00562500e+02, 2.96343552e-02, 1.43600000e+02,
       1.32400000e+02, 5.84309201e-01, 1.51650000e+02, 1.31600000e+02,
       4.86543155e-01, 1.43300000e+02, 1.52050000e+02, 2.77458391e-01,
       1.11650000e+02, 1.25000000e+02, 2.38803212e-01, 1.23187500e+02,
       7.95625000e+01, 1.56207025e-01])

This is a naive but straightforward way to do it, though it could be slow on a very large array.
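As a self-contained sketch, rebuilding only the first two triples of the question's data (the rest of the values are omitted here):

```python
import numpy as np

# Build a small object array shaped like the question's `segments`,
# using np.empty + assignment so NumPy doesn't try to broadcast the elements.
vals = [np.array([131.2]), np.array([124.1]), 0.23679025210440158,
        np.array([133.65]), np.array([123.3]), 0.3221912760287523]
segments = np.empty(len(vals), dtype=object)
segments[:] = vals

# Unwrap each length-1 ndarray to its scalar; leave plain floats alone.
flat = np.array([i[0] if isinstance(i, np.ndarray) else i for i in segments])
print(flat.dtype)  # float64
```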

Method 2: Reshaping

If your structure is always the same as your example, i.e. 2 arrays followed by a float, then you can reshape your array, extract the floats from 2 out of every 3 values, and then concatenate the data back together in the same order:

x = segments.reshape(-1, 3)                                # one (array, array, float) triple per row
f = np.concatenate(x[:, [0, 1]].flatten()).reshape(-1, 2)  # unwrap the two leading arrays
l = x[:, 2].reshape(-1, 1)                                 # pull out the trailing floats
np.concatenate((f, l), 1).flatten()                        # stitch back together in the original order

Which returns:

array([131.2, 124.1, 0.23679025210440158, 133.65, 123.3,
       0.3221912760287523, 116.7, 147.7, 0.24318619072437286, 102.3,
       120.55, 0.07436020392924547, 130.25, 100.5625,
       0.029634355247253552, 143.6, 132.4, 0.5843092009425164, 151.65,
       131.6, 0.4865431547164917, 143.3, 152.05, 0.2774583905003965,
       111.65, 125.0, 0.23880321211181582, 123.1875, 79.5625,
       0.1562070251966361], dtype=object)

Explanation

Just to aid visualizing what was happening here, you can look at the reshaped data I extracted before concatenating back together.

>>> x
array([[array([131.2]), array([124.1]), 0.23679025210440158],
       [array([133.65]), array([123.3]), 0.3221912760287523],
       [array([116.7]), array([147.7]), 0.24318619072437286],
       [array([102.3]), array([120.55]), 0.07436020392924547],
       [array([130.25]), array([100.5625]), 0.029634355247253552],
       [array([143.6]), array([132.4]), 0.5843092009425164],
       [array([151.65]), array([131.6]), 0.4865431547164917],
       [array([143.3]), array([152.05]), 0.2774583905003965],
       [array([111.65]), array([125.]), 0.23880321211181582],
       [array([123.1875]), array([79.5625]), 0.1562070251966361]],
      dtype=object)

>>> f
array([[131.2   , 124.1   ],
       [133.65  , 123.3   ],
       [116.7   , 147.7   ],
       [102.3   , 120.55  ],
       [130.25  , 100.5625],
       [143.6   , 132.4   ],
       [151.65  , 131.6   ],
       [143.3   , 152.05  ],
       [111.65  , 125.    ],
       [123.1875,  79.5625]])
>>> l
array([[0.23679025210440158],
       [0.3221912760287523],
       [0.24318619072437286],
       [0.07436020392924547],
       [0.029634355247253552],
       [0.5843092009425164],
       [0.4865431547164917],
       [0.2774583905003965],
       [0.23880321211181582],
       [0.1562070251966361]], dtype=object)
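Putting those steps together, here is an end-to-end sketch on just the first two triples of the question's data (the intermediate names match the x, f, and l shown above):

```python
import numpy as np

vals = [np.array([131.2]), np.array([124.1]), 0.23679025210440158,
        np.array([133.65]), np.array([123.3]), 0.3221912760287523]
segments = np.empty(len(vals), dtype=object)
segments[:] = vals

x = segments.reshape(-1, 3)                                # one (array, array, float) per row
f = np.concatenate(x[:, [0, 1]].flatten()).reshape(-1, 2)  # unwrap the two leading arrays
l = x[:, 2].reshape(-1, 1)                                 # the trailing floats
out = np.concatenate((f, l), 1).flatten()                  # restore the original order
print(out)
```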

2 Comments

Your first list comprehension is faster - at least for this sample.
That's quite likely, but I'm assuming for very large arrays, this would not be the case. Can't believe I overlooked hstack! For some reason I was under the impression it wouldn't work. +1 for your answer!

concatenate makes all the elements into arrays, but runs into a problem with their dimensions - some are 1d, some 0d:

In [109]: np.concatenate(arr)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-109-062a20dcc2f7> in <module>()
----> 1 np.concatenate(arr)

ValueError: all the input arrays must have same number of dimensions

hstack works because it first converts everything to 1d arrays with [atleast_1d(_m) for _m in tup]:

In [110]: np.hstack(arr)
Out[110]: 
array([1.31200000e+02, 1.24100000e+02, 2.36790252e-01, 1.33650000e+02,
       1.23300000e+02, 3.22191276e-01, 1.16700000e+02, 1.47700000e+02,
       2.43186191e-01, 1.02300000e+02, 1.20550000e+02, 7.43602039e-02,
       1.30250000e+02, 1.00562500e+02, 2.96343552e-02, 1.43600000e+02,
       1.32400000e+02, 5.84309201e-01, 1.51650000e+02, 1.31600000e+02,
       4.86543155e-01, 1.43300000e+02, 1.52050000e+02, 2.77458391e-01,
       1.11650000e+02, 1.25000000e+02, 2.38803212e-01, 1.23187500e+02,
       7.95625000e+01, 1.56207025e-01])

The result is numeric dtype (not object).

Processing an object array requires some sort of Python-level iteration - except for limited operations like reshape, which don't actually touch the elements. Iteration over an object array is slower than iteration over a list (but faster than Python-level iteration over a numeric array).
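A rough way to check that trade-off yourself - toy data, made up here, and the timings will vary with machine and array size:

```python
import timeit
import numpy as np

# Toy object array: 3000 elements in the same array/array/float pattern.
vals = [np.array([1.0]), np.array([2.0]), 0.5] * 1000
arr = np.empty(len(vals), dtype=object)
arr[:] = vals

comprehension = lambda: np.array([i[0] if isinstance(i, np.ndarray) else i for i in arr])
stack = lambda: np.hstack(arr)

print('comprehension:', timeit.timeit(comprehension, number=100))
print('hstack:       ', timeit.timeit(stack, number=100))
```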


In [113]: [np.atleast_1d(i) for i in arr]   # consistent dimensions
Out[113]: 
[array([131.2]),
 array([124.1]),
 array([0.23679025]),
 array([133.65]),
 array([123.3]),
 ...]

In [116]: [np.asarray(i) for i in arr]  # mixed dimensions
Out[116]: 
[array([131.2]),
 array([124.1]),
 array(0.23679025),
 array([133.65]),
 array([123.3]),...]

Internally, atleast_1d does some testing on the dimensions. It also accepts *args, so we can write:

In [123]: np.atleast_1d(*arr)
Out[123]: 
[array([131.2]),
 array([124.1]),
 array([0.23679025]),
 array([133.65]),
 array([123.3]),
 ...]

and hence

np.concatenate(np.atleast_1d(*arr))
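For completeness, a runnable sketch of that one-liner on a small toy array (values taken from the question's first triple):

```python
import numpy as np

vals = [np.array([131.2]), np.array([124.1]), 0.23679025210440158]
arr = np.empty(3, dtype=object)
arr[:] = vals

# atleast_1d(*arr) promotes the bare float to a length-1 array,
# so concatenate no longer sees mixed 0d/1d inputs.
out = np.concatenate(np.atleast_1d(*arr))
print(out, out.dtype)  # a plain float64 array
```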

Timing tests show that @sacul's 'naive' list comprehension is the fastest: np.array([i[0] if isinstance(i, np.ndarray) else i for i in segments])

