How to combine multiple numpy arrays into a dictionary list

Question

I have the following list of column names:

column_names = ['id', 'temperature', 'price']

And three NumPy arrays as follows:

idArry = ([1,2,3,4,....])

tempArry = ([20.3,30.4,50.4,.....])

priceArry = ([1.2,3.5,2.3,.....])

I want to combine the above into a sequence of dictionaries as follows:

table_dict = ( {'id':1, 'temperature':20.3, 'price':1.2 },
               {'id':2, 'temperature':30.4, 'price':3.5},...)

I can build this with a for loop and append, but the arrays are large, at about 15000 rows each. Can someone show me how to use Python's zip, or another faster and more efficient way, to achieve this?

Do you need a list of dicts or a tuple of dicts? In your case you have a tuple.
– Mykola Zotko, Commented Feb 9, 2019 at 15:24

Mykola Zotko · Accepted Answer · 2019-02-09 16:29:50Z

3

You can use a list comprehension with zip():

[{'id': i, 'temperature': j, 'price': k} for i, j, k in zip(idArry, tempArry, priceArry)]
# [{'id': 1, 'temperature': 20.3, 'price': 1.2}, {'id': 2, 'temperature': 30.4, 'price': 3.5}]

If your ids are 1, 2, 3, ... and you store the dicts in a list, you don't need the ids in the dicts. That information is redundant, because the position in the list already gives the id (id = index + 1).

[{'temperature': i, 'price': j} for i, j in zip(tempArry, priceArry)]

You can also use a dict of dicts keyed by id. Looking up a row by id in a dict is a hash lookup, which is faster than searching a list for the matching id.

{i: {'temperature': j, 'price': k} for i, j, k in zip(idArry, tempArry, priceArry)}
# {1: {'temperature': 20.3, 'price': 1.2}, 2: {'temperature': 30.4, 'price': 3.5}}
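
To illustrate the lookup difference, here is a minimal sketch using the sample values from the question. The names `records`, `by_id`, `row_from_list` and `row_from_dict` are mine and only for illustration.

```python
import numpy as np

idArry = np.array([1, 2, 3])
tempArry = np.array([20.3, 30.4, 50.4])
priceArry = np.array([1.2, 3.5, 2.3])

# List of dicts: finding the row with id 2 means scanning the list.
records = [{'id': i, 'temperature': j, 'price': k}
           for i, j, k in zip(idArry, tempArry, priceArry)]
row_from_list = next(r for r in records if r['id'] == 2)

# Dict of dicts: the id is the key, so the row is fetched directly.
by_id = {i: {'temperature': j, 'price': k}
         for i, j, k in zip(idArry, tempArry, priceArry)}
row_from_dict = by_id[2]
```

For 15000 rows the scan is unlikely to matter for a single lookup, but it adds up if you look up many ids in a loop.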

edited Feb 9, 2019 at 16:29

answered Feb 9, 2019 at 15:08

Mykola Zotko


jwalton · Accepted Answer · 2019-02-09 15:28:46Z

1

I'd take a look at the pandas package. In particular, there is a pandas.DataFrame.to_dict method.

I expect this method to be reasonably fast for large arrays (though I'm willing to have the zip method proved more efficient).

In the example below I first construct a pandas dataframe from your arrays and then use the to_dict method.

import numpy as np
import pandas as pd

column_names = ['id', 'temperature', 'price']

idArry = np.array([1, 2, 3])
tempArry = np.array([20.3, 30.4, 50.4])
priceArry = np.array([1.2, 3.5, 2.3])

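# Build the frame column by column so each array keeps its own dtype
# (np.vstack would upcast the integer ids to float).
df = pd.DataFrame(dict(zip(column_names, [idArry, tempArry, priceArry])))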

table_dict = df.to_dict(orient='records')
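
Whether pandas or zip is faster is easy to measure rather than guess. Below is a rough timing sketch with synthetic data of about the size mentioned in the question. The random values and the helper names `with_zip` and `with_pandas` are assumptions for illustration, and the numbers will vary by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

n = 15_000  # roughly the size mentioned in the question
rng = np.random.default_rng(0)
idArry = np.arange(1, n + 1)
tempArry = rng.uniform(0, 50, n)
priceArry = rng.uniform(0, 10, n)


def with_zip():
    # Plain list comprehension over the zipped arrays.
    return [{'id': i, 'temperature': j, 'price': k}
            for i, j, k in zip(idArry, tempArry, priceArry)]


def with_pandas():
    # Build a DataFrame, then convert it to one dict per row.
    df = pd.DataFrame({'id': idArry, 'temperature': tempArry, 'price': priceArry})
    return df.to_dict(orient='records')


zip_seconds = timeit.timeit(with_zip, number=10)
pandas_seconds = timeit.timeit(with_pandas, number=10)
print(f"zip:    {zip_seconds:.3f} s for 10 runs")
print(f"pandas: {pandas_seconds:.3f} s for 10 runs")
```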

edited Feb 9, 2019 at 15:28

answered Feb 9, 2019 at 15:11

jwalton


Tdaw · Accepted Answer · 2019-02-09 15:54:43Z

1

This could work. enumerate is used to create a counter that starts at 0, and each value is then pulled out of tempArry and priceArry by index. Note that the ids are generated as i + 1, so this assumes they run consecutively from 1. The result is a generator expression rather than a list, which helps with memory (especially if your arrays are really large); wrap it in list() if you need an actual list.

new_dict = ({'id': i + 1, 'temperature': tempArry[i], 'price': priceArry[i]} for i, _ in enumerate(idArry))
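
One thing to keep in mind with a generator expression is that it can be consumed only once. A small sketch with the sample values from the question follows. It takes the ids from idArry via zip instead of computing them, and the names `rows`, `first`, `remaining` and `leftover` are mine.

```python
import numpy as np

idArry = np.array([1, 2, 3])
tempArry = np.array([20.3, 30.4, 50.4])
priceArry = np.array([1.2, 3.5, 2.3])

rows = ({'id': i, 'temperature': t, 'price': p}
        for i, t, p in zip(idArry, tempArry, priceArry))

first = next(rows)      # rows are built one at a time, on demand
remaining = list(rows)  # materialize whatever has not been consumed yet
leftover = list(rows)   # empty: the generator is now exhausted
```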

edited Feb 9, 2019 at 15:54

answered Feb 9, 2019 at 15:11

Tdaw


2 Comments

Mykola Zotko Over a year ago

The ids of OP start with 1 and your ids start with 0.

Tdaw Over a year ago

@MykolaZotko sorry, didn't notice that. It's fixed now so the IDs start at 1.

Jay · Accepted Answer · 2019-02-09 16:00:56Z

1

You can use a list comprehension that iterates over the indices of one of the arrays:

[{'id': idArry[i], 'temperature': tempArry[i], 'price': priceArry[i]} for i in range(len(idArry))]
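
A side note that applies to any of the element-wise approaches: indexing or iterating over a NumPy array yields NumPy scalars rather than plain Python numbers, and json.dumps rejects NumPy integer scalars. If the dicts are meant for JSON or similar, one option is to convert the arrays with .tolist() first. This is a sketch with the question's sample values, and `records` and `payload` are illustrative names.

```python
import json

import numpy as np

idArry = np.array([1, 2, 3])
tempArry = np.array([20.3, 30.4, 50.4])
priceArry = np.array([1.2, 3.5, 2.3])

# .tolist() converts each array to a list of native Python ints/floats.
records = [{'id': i, 'temperature': t, 'price': p}
           for i, t, p in zip(idArry.tolist(), tempArry.tolist(), priceArry.tolist())]

# Serializes cleanly because every value is a plain Python number.
payload = json.dumps(records)
```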

answered Feb 9, 2019 at 16:00

Jay


iGian · Accepted Answer · 2019-02-09 16:48:40Z

You could stack the arrays into a 2-D NumPy array (a matrix) and then convert it to a list of dictionaries as follows. Given your data (I changed the values just for the example):

import numpy as np

idArry = np.array([1,2,3,4])
tempArry = np.array([20,30,50,40])
priceArry = np.array([200,300,100,400])

Build the matrix:

table = np.array([idArry, tempArry, priceArry]).transpose()

Create the list of dictionaries (column_names is the list from the question):

dict_table = [ dict(zip(column_names, values)) for values in table ]
#=> [{'id': 1, 'temperature': 20, 'price': 200}, {'id': 2, 'temperature': 30, 'price': 300}, {'id': 3, 'temperature': 50, 'price': 100}, {'id': 4, 'temperature': 40, 'price': 400}]

I don't know the purpose, but maybe you can also work with the matrix directly, for example to filter rows by temperature:

temp_col = table[:,1]

table[temp_col >= 40]
# [[  3  50 100]
#  [  4  40 400]]
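
The filtering and the conversion can also be combined, so that only the matching rows become dictionaries. This is a minimal sketch using the same example values. The names `mask` and `hot_rows` are mine, and row.tolist() is used so the dict values are plain Python ints.

```python
import numpy as np

column_names = ['id', 'temperature', 'price']
table = np.array([[1, 20, 200],
                  [2, 30, 300],
                  [3, 50, 100],
                  [4, 40, 400]])

# Boolean mask over the temperature column (column index 1).
mask = table[:, 1] >= 40

# Convert only the selected rows to dictionaries.
hot_rows = [dict(zip(column_names, row.tolist())) for row in table[mask]]
```

Be aware that stacking arrays of different dtypes (integer ids with float temperatures, as in the question) gives a single common dtype for the whole matrix, so the ids would come out as floats.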

yatu · Accepted Answer · 2019-02-09 18:20:02Z

A way to do it would be as follows:

column_names = ['id', 'temperature', 'price']

idArry = ([1,2,3,4])
tempArry = ([20.3,30.4,50.4, 4])
priceArry = ([1.2,3.5,2.3, 4.5])

You could zip the elements of the different lists together:

l = list(zip(idArry, tempArry, priceArry))

print(l)
[(1, 20.3, 1.2), (2, 30.4, 3.5), (3, 50.4, 2.3), (4, 4, 4.5)]

Then build the inner dictionaries with a list comprehension that iterates over the rows in l (in Python 3, zip returns a one-shot iterator, which is why it is wrapped in list() above so that printing it does not exhaust it):

[dict(zip(column_names, row)) for row in l]

[{'id': 1, 'temperature': 20.3, 'price': 1.2},
 {'id': 2, 'temperature': 30.4, 'price': 3.5},
 {'id': 3, 'temperature': 50.4, 'price': 2.3},
 {'id': 4, 'temperature': 4, 'price': 4.5}]

The advantage of this method is that it uses only built-in functions and works for any number of columns.
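
The "any number of columns" point can be sketched as a small helper that takes the names and the columns as arguments. `arrays_to_records` is a hypothetical name, not something from the standard library, and the length check is an extra assumption on my part.

```python
def arrays_to_records(column_names, *columns):
    """Hypothetical helper: pair each row of values with the column names."""
    if len(column_names) != len(columns):
        raise ValueError("need exactly one column of values per column name")
    # zip(*columns) turns the columns into rows; dict(zip(...)) labels each row.
    return [dict(zip(column_names, row)) for row in zip(*columns)]


column_names = ['id', 'temperature', 'price']
idArry = [1, 2, 3, 4]
tempArry = [20.3, 30.4, 50.4, 4]
priceArry = [1.2, 3.5, 2.3, 4.5]

table_dict = arrays_to_records(column_names, idArry, tempArry, priceArry)
```

The same call works unchanged if a fourth name and a fourth array are added.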
