5

I have the following array:

column_names = ['id', 'temperature', 'price']

And three NumPy arrays as follows:

idArry = ([1,2,3,4,....])

tempArry = ([20.3,30.4,50.4,.....])

priceArry = ([1.2,3.5,2.3,.....])

I want to combine the above into a collection of dictionaries, one per row, as follows:

table_dict = ( {'id':1, 'temperature':20.3, 'price':1.2 },
               {'id':2, 'temperature':30.4, 'price':3.5},...)

I can use a for loop together with append to build this, but the data is large, at about 15000 rows. Can someone show me how to use Python's zip function, or another faster and more efficient approach, to achieve this?

1
  • 1
    Do you need a list of dicts or a tuple of dicts? In your case you have a tuple. Commented Feb 9, 2019 at 15:24

6 Answers

3

You can use a list comprehension and the zip() function:

[{'id': i, 'temperature': j, 'price': k} for i, j, k in zip(idArry, tempArry, priceArry)]
# [{'id': 1, 'temperature': 20.3, 'price': 1.2}, {'id': 2, 'temperature': 30.4, 'price': 3.5}]

If your ids are 1, 2, 3, ... and you use a list, you don’t need the ids in your dicts. The id is redundant information because it can be derived from the position in the list (id = index + 1).

[{'temperature': i, 'price': j} for i, j in zip(tempArry, priceArry)]

You can also use a dict of dicts. Looking up a record by id in a dict is a hash lookup (constant time on average), whereas finding an id in a list requires scanning the list.

{i: {'temperature': j, 'price': k} for i, j, k in zip(idArry, tempArry, priceArry)}
# {1: {'temperature': 20.3, 'price': 1.2}, 2: {'temperature': 30.4, 'price': 3.5}}
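
To illustrate the lookup difference between the two layouts, here is a minimal sketch. It assumes plain Python lists as stand-ins for the NumPy arrays and uses the sample values from the question; the names `records`, `by_id`, `from_list` and `from_dict` are only illustrative.

```python
# Plain lists stand in for the NumPy arrays (an assumption for this sketch).
idArry = [1, 2, 3]
tempArry = [20.3, 30.4, 50.4]
priceArry = [1.2, 3.5, 2.3]

# Layout 1: list of dicts, one dict per row.
records = [{'id': i, 'temperature': t, 'price': p}
           for i, t, p in zip(idArry, tempArry, priceArry)]

# Layout 2: dict of dicts, keyed by id.
by_id = {i: {'temperature': t, 'price': p}
         for i, t, p in zip(idArry, tempArry, priceArry)}

# List of dicts: finding id 3 means scanning rows until one matches.
from_list = next(r for r in records if r['id'] == 3)

# Dict of dicts: the id is the key, so this is a single hash lookup.
from_dict = by_id[3]
```

If the rows are only ever processed in order, the list layout is probably enough; the dict layout pays off when rows are fetched by id.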

1

I'd take a look at the functionality of the pandas package. In particular there is a pandas.DataFrame.to_dict method.

I'm confident that for large arrays this method should be pretty fast (though I'm willing to have the zip method proved more efficient).

In the example below I first construct a pandas dataframe from your arrays and then use the to_dict method.

import numpy as np
import pandas as pd

column_names = ['id', 'temperature', 'price']

idArry = np.array([1, 2, 3])
tempArry = np.array([20.3, 30.4, 50.4])
priceArry = np.array([1.2, 3.5, 2.3])

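# Build the frame column by column so that each column keeps its own dtype;
# stacking the arrays with np.vstack would upcast the integer ids to float.
df = pd.DataFrame(dict(zip(column_names, [idArry, tempArry, priceArry])))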

table_dict = df.to_dict(orient='records')

1

This could work. enumerate() provides a counter that starts at 0, which is used to index into tempArry and priceArry; the id is taken as i + 1, which assumes the ids are consecutive and start at 1. Note that this is a generator expression rather than a list, so the dicts are produced lazily, which helps with memory (especially if your lists are really large).

new_dict = ({'id': i + 1, 'temperature': tempArry[i], 'price': priceArry[i]} for i, _ in enumerate(idArry))
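
One consequence of the generator form is that it can only be consumed once. A small sketch of that behaviour, assuming plain Python lists with made-up sample values in place of the arrays (`first_two`, `rest` and `again` are illustrative names):

```python
from itertools import islice

# Plain lists stand in for the NumPy arrays (an assumption for this sketch).
idArry = [1, 2, 3, 4]
tempArry = [20.3, 30.4, 50.4, 4]
priceArry = [1.2, 3.5, 2.3, 4.5]

new_dict = ({'id': i + 1, 'temperature': tempArry[i], 'price': priceArry[i]}
            for i, _ in enumerate(idArry))

# Nothing has been built yet; rows are produced only when requested.
first_two = list(islice(new_dict, 2))

# The generator remembers its position, so this yields the remaining rows.
rest = list(new_dict)

# Once exhausted, it stays empty; wrap the expression in list() to reuse the rows.
again = list(new_dict)
```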

2 Comments

The ids of OP start with 1 and your ids start with 0.
@MykolaZotko sorry, didn't notice that. It's fixed now so that the IDs start at 1.
1

You can use a list comprehension that iterates over the indices of one of the arrays:

[{'id': idArry[i], 'temperature': tempArry[i], 'price': priceArry[i]} for i in range(len(idArry))]
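
Since the question is about speed, it may be worth measuring rather than guessing. The following is a rough timing sketch using the standard-library timeit module. It assumes plain Python lists filled with made-up values at roughly the question's 15000 rows; `by_index` and `by_zip` are hypothetical helper names, and the printed timings will differ from machine to machine.

```python
import timeit

n = 15000  # roughly the row count mentioned in the question
idArry = list(range(1, n + 1))
tempArry = [20.0 + (i % 30) for i in range(n)]  # made-up sample data
priceArry = [1.0 + (i % 5) for i in range(n)]   # made-up sample data

def by_index():
    # Index-based list comprehension, as in this answer.
    return [{'id': idArry[i], 'temperature': tempArry[i], 'price': priceArry[i]}
            for i in range(len(idArry))]

def by_zip():
    # zip-based list comprehension, for comparison.
    return [{'id': i, 'temperature': t, 'price': p}
            for i, t, p in zip(idArry, tempArry, priceArry)]

# Both approaches should build exactly the same rows.
assert by_index() == by_zip()

index_seconds = timeit.timeit(by_index, number=20)
zip_seconds = timeit.timeit(by_zip, number=20)
print(f"index: {index_seconds:.4f}s  zip: {zip_seconds:.4f}s")
```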

1

You could build a NumPy matrix and then convert it to a list of dictionaries as follows. Given your data (I changed the values just for the example):

import numpy as np

column_names = ['id', 'temperature', 'price']  # as defined in the question

idArry = np.array([1,2,3,4])
tempArry = np.array([20,30,50,40])
priceArry = np.array([200,300,100,400])

Build the matrix:

table = np.array([idArry, tempArry, priceArry]).transpose()

Create the dictionary:

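# tolist() converts each row to plain Python ints before building the dicts
dict_table = [dict(zip(column_names, values)) for values in table.tolist()]
#=> [{'id': 1, 'temperature': 20, 'price': 200}, {'id': 2, 'temperature': 30, 'price': 300}, {'id': 3, 'temperature': 50, 'price': 100}, {'id': 4, 'temperature': 40, 'price': 400}]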


I don't know your purpose, but you may also be able to work with the matrix directly, for example to filter rows by temperature:

temp_col = table[:,1]

table[temp_col >= 40]
# [[  3  50 100]
#  [  4  40 400]]

1

A way to do it would be as follows:

column_names = ['id', 'temperature', 'price']

idArry = ([1,2,3,4])
tempArry = ([20.3,30.4,50.4, 4])
priceArry = ([1.2,3.5,2.3, 4.5])

You could zip all the elements of the different lists. In Python 3, zip() returns a one-shot iterator, so materialize it as a list if you want to both print it and reuse it:

rows = list(zip(idArry, tempArry, priceArry))

print(rows)
[(1, 20.3, 1.2), (2, 30.4, 3.5), (3, 50.4, 2.3), (4, 4, 4.5)]

Then build the inner dictionaries with a list comprehension that iterates over the rows:

[dict(zip(column_names, row)) for row in rows]

[{'id': 1, 'temperature': 20.3, 'price': 1.2},
 {'id': 2, 'temperature': 30.4, 'price': 3.5},
 {'id': 3, 'temperature': 50.4, 'price': 2.3},
 {'id': 4, 'temperature': 4, 'price': 4.5}]

The advantage of this method is that it uses only built-in functions and that it works for an arbitrary number of column names.
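
The "arbitrary number of columns" point can be sketched by keeping the columns in one list and unpacking it into zip(). This assumes plain Python lists; `to_records` is a hypothetical helper name, and the extra `humidity` column is made up purely to show that nothing else has to change.

```python
def to_records(names, columns):
    # zip(*columns) pairs up the i-th value of every column into one row;
    # dict(zip(names, row)) then labels each value with its column name.
    return [dict(zip(names, row)) for row in zip(*columns)]

column_names = ['id', 'temperature', 'price']
columns = [
    [1, 2, 3, 4],
    [20.3, 30.4, 50.4, 4],
    [1.2, 3.5, 2.3, 4.5],
]
table = to_records(column_names, columns)

# Adding a column only means extending both lists (humidity is made-up data).
wider = to_records(column_names + ['humidity'], columns + [[40, 45, 50, 55]])
```

One thing to keep in mind is that zip() stops at the shortest input, so columns of unequal length are silently truncated.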
