
I have a list of tuples as follows:

[('x', {'y': '1,3', 'z': '2'}),
 ('y', {'a': '4'}),
 ('z', {'b': '2,3'})]

I need to convert this to a numpy array format as follows:

    x   y   z   a   b
x   0   1,3 2   0   0
y   1,3 0   0   4   0
z   2   0   0   0   2,3
a   0   4   0   0   0
b   0   0   2,3 0   0

To support this, I store the node names in a list, which gives each name a mapping index:

['x', 'y', 'z', 'a', 'b']

Given the indices, what is the most efficient way to create the numpy array from this structure?
Also, as new entries are appended to the original list of tuples, they need to be added to the index list and to the numpy array as appropriate.

Existing elements will never be edited.

Help is appreciated.

  • Is your comma a "decimal" comma, as in the German and French locales? Commented Jul 1, 2013 at 10:54
  • No, it's just a plain English separator. I may even make it a list, but the problem statement would remain the same: the intersection value becomes the matrix element/entry. Commented Jul 1, 2013 at 10:57

2 Answers


If you use the object dtype, you can build the array as shown below. Since the result has to be symmetric, it is easier to create a plain 2D array first and then build the structured array from it:

import numpy as np
o = ['x','y','z','a','b']
a = np.zeros((len(o),len(o)),dtype=object)
s  =[('x',{'y':'1,3','z':'2'}), ('y',{'a':'4'}), ('z',{'b':'2,3'})]
for vi in s:
    i = o.index(vi[0])
    for vj in vi[1].items():
        j = o.index(vj[0])
        a[i,j] = vj[1]
        a[j,i] = a[i,j]

# building the structured array
b = np.zeros((len(o),), dtype=[(i,object) for i in o])
for i,vi in enumerate(o):
    b[vi] = a[i,:]

# building a dictionary to access the values
d = dict(( (vi, dict(( (vj, a[i,j]) for j,vj in enumerate(o) ))) for i,vi in enumerate(o) ))
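
On the second part of the question (new entries arriving later): the loop above calls o.index(...) for every lookup, which scans the list each time, and the array has a fixed size. The following is only a minimal sketch of one way around both, not part of the code above. It assumes a plain dict is kept next to the name list for lookups, and that the matrix is reallocated whenever an unseen name appears. The function names build_matrix and add_entry are made up for illustration.

```python
import numpy as np

def build_matrix(entries, names):
    """Build a symmetric object matrix from (node, {neighbour: value}) tuples."""
    # name -> position, so each lookup is a dict access instead of list.index
    index = {name: i for i, name in enumerate(names)}
    mat = np.zeros((len(names), len(names)), dtype=object)
    for node, edges in entries:
        i = index[node]
        for other, value in edges.items():
            j = index[other]
            mat[i, j] = mat[j, i] = value
    return mat, index

def add_entry(mat, index, names, entry):
    """Merge one new tuple; returns the (possibly reallocated) matrix."""
    node, edges = entry
    # names mentioned by this entry that have no index yet, in first-seen order
    unseen = [n for n in dict.fromkeys([node, *edges]) if n not in index]
    for n in unseen:
        index[n] = len(names)
        names.append(n)
    if unseen:
        # allocate a larger zero matrix and copy the old block into its corner
        old = mat.shape[0]
        grown = np.zeros((len(names), len(names)), dtype=object)
        grown[:old, :old] = mat
        mat = grown
    i = index[node]
    for other, value in edges.items():
        j = index[other]
        mat[i, j] = mat[j, i] = value
    return mat

names = ['x', 'y', 'z', 'a', 'b']
data = [('x', {'y': '1,3', 'z': '2'}), ('y', {'a': '4'}), ('z', {'b': '2,3'})]
mat, index = build_matrix(data, names)
mat = add_entry(mat, index, names, ('c', {'x': '7'}))  # 'c' is a new node
```

Each growth step copies the whole matrix, so if many new nodes arrive at once it is probably cheaper to collect them and grow once.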

4 Comments

Thanks. I need help understanding one piece of syntax: b = np.zeros((len(o),), dtype=[(i,object) for i in o]). Wouldn't this be the same as b = np.zeros((len(o),len(o)), dtype=object)?
The first creates the structured array to receive the values, and the second creates a 2D array to receive them. I think the structured array is what you mean by using mapping indices. From a you access the values as a[i,j], and from b as b['x'][i].
Is there a way to create the structured array so that it is accessible as b['x']['y'], bypassing indices completely?
Then you should build a dictionary. I've updated the answer with this third option.

A more numpythonic version... The values are stored as strings. That can be changed, but you will probably need to better define the syntax of your input list of dicts:

import numpy as np
import operator as op
from functools import reduce  # reduce is not a builtin in Python 3

data = [('x', {'y' : '1,3', 'z' : '2'}),
        ('y', {'a' : '4'}),
        ('z', {'b' : '2,3'})]

keys = np.array(['x', 'y', 'z', 'a', 'b'])
keys_sort = np.argsort(keys)

# flatten the input into parallel arrays of row names, column names and values
# (dict views are wrapped in list() so they can be concatenated in Python 3)
rows = np.array(reduce(op.add, ([item[0]]*len(item[1]) for item in data)))
cols = np.array(reduce(op.add, (list(item[1].keys()) for item in data)))
vals = np.array(reduce(op.add, (list(item[1].values()) for item in data)))

# map every name to its position in keys
row_idx = keys_sort[np.searchsorted(keys, rows, sorter=keys_sort)]
col_idx = keys_sort[np.searchsorted(keys, cols, sorter=keys_sort)]

out_arr = np.empty((len(keys), len(keys)), dtype=vals.dtype)
out_arr[:] = '0'
out_arr[row_idx, col_idx] = vals
out_arr[col_idx, row_idx] = vals

>>> out_arr
array([['0', '1,3', '2', '0', '0'],
       ['1,3', '0', '0', '4', '0'],
       ['2', '0', '0', '0', '2,3'],
       ['0', '4', '0', '0', '0'],
       ['0', '0', '2,3', '0', '0']], dtype='<U3')
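
The asker mentions in the comments that the values could just as well be lists. As a rough sketch under that assumption only: if every value is a comma-separated list of integers (the question does not actually specify the syntax), the strings can be parsed into tuples and stored in an object array. The helper parse_value and the index dict are hypothetical names introduced here for illustration.

```python
import numpy as np

def parse_value(text):
    """Turn '1,3' into (1, 3); assumes comma-separated integers."""
    return tuple(int(part) for part in text.split(','))

data = [('x', {'y': '1,3', 'z': '2'}),
        ('y', {'a': '4'}),
        ('z', {'b': '2,3'})]
names = ['x', 'y', 'z', 'a', 'b']
index = {name: i for i, name in enumerate(names)}

# object dtype lets each cell hold a tuple of any length; empty cells stay 0
out = np.zeros((len(names), len(names)), dtype=object)
for node, edges in data:
    i = index[node]
    for other, text in edges.items():
        j = index[other]
        parsed = parse_value(text)
        # assign cell by cell so the tuple is stored as one object
        out[i, j] = parsed
        out[j, i] = parsed
```

With this layout out[0, 1] holds the tuple (1, 3) rather than the string '1,3', at the cost of losing vectorized assignment.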
