Build 2D numpy array from items in list of tuples

Question

Given a Python list of tuples such as:

test = [(1, 'string1', 47.9, -112.8, 6400.0),
        (2, 'string2', 29.7, -90.8, 11.0),
        (3, 'string3', 30.8, -99.1, 1644.0),
        (4, 'string4', 45.8, -110.9, 7500.0),
        (5, 'string5', 43.9, -69.8, 25.0)]

What is the most efficient way to build a 2D NumPy array from the 3rd and 4th items of each tuple?

Desired output is:

array([[47.9, 29.7, 30.8, 45.8, 43.9],
       [-112.8, -90.8, -99.1, -110.9, -69.8]])

a_guest · Accepted Answer · 2019-01-25 16:55:56Z

3

You can prepare the data outside numpy using a list comprehension which selects the 3rd and 4th item. Then you only need to transpose the resulting array:

np.array([x[2:4] for x in test]).T


akuiper · Accepted Answer · 2019-01-25 16:59:06Z

2

Transpose the list with zip, then pick out the 3rd and 4th columns with itertools.islice:

from itertools import islice
import numpy as np

np.array(list(islice(zip(*test), 2, 4)))
# array([[  47.9,   29.7,   30.8,   45.8,   43.9],
#        [-112.8,  -90.8,  -99.1, -110.9,  -69.8]])
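
To make the intermediate steps visible, here is a minimal sketch that splits the one-liner apart. The names columns and selected are only for illustration and do not come from the answer.

```python
from itertools import islice

import numpy as np

test = [(1, 'string1', 47.9, -112.8, 6400.0),
        (2, 'string2', 29.7, -90.8, 11.0),
        (3, 'string3', 30.8, -99.1, 1644.0),
        (4, 'string4', 45.8, -110.9, 7500.0),
        (5, 'string5', 43.9, -69.8, 25.0)]

# zip(*test) regroups the rows by position: one tuple per original column.
columns = list(zip(*test))

# islice(..., 2, 4) keeps only the tuples at positions 2 and 3
# (the 3rd and 4th items) without materialising the other columns.
selected = list(islice(zip(*test), 2, 4))

# Each selected tuple becomes one row, so no transpose is needed.
result = np.array(selected)
print(result.shape)
```

Because zip already yields the data column by column, the array comes out in the desired (2, N) layout directly.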


Dani Mesejo · Accepted Answer · 2019-01-25 17:15:11Z

1

You could transform the list of tuples directly into an array then use slicing and transposing to get the desired output:

import numpy as np

test = [(1, 'string1', 47.9, -112.8, 6400.0),
        (2, 'string2', 29.7, -90.8, 11.0),
        (3, 'string3', 30.8, -99.1, 1644.0),
        (4, 'string4', 45.8, -110.9, 7500.0),
        (5, 'string5', 43.9, -69.8, 25.0)]

arr = np.array(test, dtype=object)
result = arr[:, 2:4].T.astype(np.float64)  # float64 keeps the full precision of the Python floats
print(result)

Output

[[  47.9   29.7   30.8   45.8   43.9]
 [-112.8  -90.8  -99.1 -110.9  -69.8]]

Note that after arr = np.array(test, dtype=object), the slicing, transposing and type conversion are all done at the NumPy level.
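
The effect of dtype=object can be checked directly. The following is a small sketch, not part of the original answer; the names as_strings and as_objects are made up for illustration, and the string dtype noted for the default case is what I would expect from the NumPy versions I know rather than a guarantee.

```python
import numpy as np

test = [(1, 'string1', 47.9, -112.8, 6400.0),
        (2, 'string2', 29.7, -90.8, 11.0),
        (3, 'string3', 30.8, -99.1, 1644.0),
        (4, 'string4', 45.8, -110.9, 7500.0),
        (5, 'string5', 43.9, -69.8, 25.0)]

# Without an explicit dtype, NumPy looks for one common type for the mixed
# int/str/float values; here that is expected to be a unicode string type.
as_strings = np.array(test)
print(as_strings.dtype.kind)

# With dtype=object the original Python objects are stored untouched,
# so the floats never pass through a string representation.
as_objects = np.array(test, dtype=object)
print(as_objects.dtype)

# Slice the 3rd and 4th columns, transpose, then convert to a float array.
result = as_objects[:, 2:4].T.astype(np.float64)
print(result)
```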

edited Jan 25, 2019 at 17:15

answered Jan 25, 2019 at 16:54


4 Comments

pjw Over a year ago

Yes, this is likely the most efficient since it avoids list comprehension (everything is done within numpy).

javidcf Over a year ago

For this method, I would do arr = np.array(test, dtype=object). As it stands, float values are converted to string and then converted back to float, which may result in loss of precision.

hpaulj Over a year ago

All the action may be in np.array, but that doesn't mean it is faster. Reading the list, converting to string, and then to float takes time. Loading an object dtype saves time. But for this small sample, selecting the columns first with a list comprehension is faster.

pjw Over a year ago

My real use-case will have len(test) of about 20,000
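
The speed claims in this thread are easy to check locally. Below is a rough benchmark sketch, assuming synthetic rows shaped like the question's data and the length of about 20,000 mentioned above. The function names are invented for illustration, and timings will vary with machine and NumPy version, so no numbers are quoted here.

```python
import timeit
from itertools import islice

import numpy as np

# Synthetic stand-in for the real data (an assumption, not the asker's list).
n = 20_000
rows = [(i, f"string{i}", float(i), -float(i), 1.0) for i in range(n)]


def comprehension():
    # Select the two columns in Python, then transpose.
    return np.array([r[2:4] for r in rows]).T


def zip_islice():
    # Regroup by column with zip, keep positions 2 and 3.
    return np.array(list(islice(zip(*rows), 2, 4)))


def object_array():
    # Load everything as Python objects, slice inside NumPy.
    return np.array(rows, dtype=object)[:, 2:4].T.astype(float)


def string_array():
    # Default dtype: values go through a string type before becoming floats.
    return np.array(rows)[:, 2:4].T.astype(float)


candidates = {
    "comprehension": comprehension,
    "zip_islice": zip_islice,
    "object_array": object_array,
    "string_array": string_array,
}

# Run each candidate once to confirm they all build the same array.
results = {name: fn() for name, fn in candidates.items()}

# Time a few repetitions of each candidate.
timings = {name: timeit.timeit(fn, number=5) for name, fn in candidates.items()}
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:>14}: {seconds / 5 * 1000:.2f} ms per call")
```

Running this on data of the real size is more informative than timing the five-row sample, since fixed overheads dominate for very small inputs.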

OmG · Accepted Answer · 2019-01-25 16:54:20Z

1

The first list is:

first = [item[2] for item in test]

The second is:

second = [item[3] for item in test]

And the result is:

result = np.array([first, second])


iz_ · Accepted Answer · 2019-01-25 16:54:38Z

0

You can try this:

import numpy as np

test = [(1, 'string1', 47.9, -112.8, 6400.0), (2, 'string2', 29.7, -90.8, 11.0), (3, 'string3', 30.8, -99.1, 1644.0), (4, 'string4', 45.8, -110.9, 7500.0), (5, 'string5', 43.9, -69.8, 25.0)]

result = np.array([(item[2], item[3]) for item in test]).T  # indices 2 and 3 are the 3rd and 4th items
print(result)

# [[  47.9   29.7   30.8   45.8   43.9]
#  [-112.8  -90.8  -99.1 -110.9  -69.8]]
