I'm having a problem with NumPy dtypes. Essentially, I'm trying to create a table that looks like the following (and then save it using rec2csv):

      name1   name2   name3 . . . 
name1  #       #      #
name2  #       #      #
name3  #       #      #
.
.
.

The matrix (the numerical array in the center) is already computed before I attempt to add the name tags. I've tried the following code:

    dt = dtype({'names' : tuple(blah), 'formats' : tuple(fmt)}) 
    ReadArray = array(tuplelist, dtype=dt)

where tuplelist is a list of rows (i.e. each row is [name1, #, #, #, ...]), blah is a list of strings (i.e. the names, blah = ['name1', 'name2', ...]), and fmt is the list of formats (i.e. fmt = [str, float, float, ...]).

The error I'm getting is the following:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "table_calc_try2.py", line 152, in table_calc_try2
        dt = dtype({'names' : tuple(blah), 'formats' : tuple(fmt)})
    TypeError: data type not understood

Can anyone help?

Thanks!

1 Answer

The following code might help:

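    import numpy as np

    # Structured dtype: a 10-byte string field and a 64-bit float field.
    dt = np.dtype([('name1', '|S10'), ('name2', '<f8')])
    tuplelist = [
        ('n1', 1.2),
        ('n2', 3.4),
    ]
    arr = np.array(tuplelist, dtype=dt)

    print(arr['name1'])
    # ['n1' 'n2']  (on Python 3, 'S' fields hold bytes: [b'n1' b'n2'])
    print(arr['name2'])
    # [1.2 3.4]  (older NumPy versions print [ 1.2  3.4])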

Your immediate problem was that np.dtype expects the format specifiers to be NumPy types, such as '|S10' or '<f8', and not Python types, such as str or float. Typing help(np.dtype) shows many examples of how dtypes can be specified; I've mentioned only a few.
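For a table like the one in the question, the dtype can also be built from a list of names rather than typed out by hand. The following is only a sketch, assuming Python 3 and a recent NumPy; the 'label' field name and the 10-character width of 'U10' are illustrative assumptions, not something taken from the question.

```python
import numpy as np

# Hypothetical column names; 'label' is an assumed name for the
# first column, which holds the row name.
names = ['label', 'name1', 'name2', 'name3']

# One NumPy format string per column: a 10-character unicode string
# for the label, then 64-bit floats for every remaining column.
formats = ['U10'] + ['f8'] * (len(names) - 1)

# The dict form takes parallel 'names' and 'formats' sequences.
dt = np.dtype({'names': names, 'formats': formats})

print(dt.names)
print(dt['label'], dt['name1'])
```

Using 'U10' instead of '|S10' keeps the labels as ordinary Python 3 strings rather than bytes; labels longer than the chosen width would be truncated, so the width should be picked to fit the data.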

Note that, when given a structured dtype, np.array expects a list of tuples. It's rather particular about that.

A list of lists raises TypeError: expected a readable buffer object.

A tuple of tuples or a tuple of lists raises ValueError: setting an array element with a sequence.
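If the rows start out as a list of labels plus an already-computed matrix, as in the question, one way to get a list of tuples is to convert each row explicitly before calling np.array. This is a minimal sketch under assumed inputs: the labels, the 2x2 matrix, and the 'label' field name are all made up for illustration.

```python
import numpy as np

# Assumed layout: a string label followed by one float per column.
dt = np.dtype([('label', 'U10'), ('name1', 'f8'), ('name2', 'f8')])

# Hypothetical inputs: row labels and an already-computed matrix.
labels = ['name1', 'name2']
matrix = np.array([[1.0, 0.5],
                   [0.5, 1.0]])

# Build one tuple per row: (label, value, value, ...).
rows = [(label,) + tuple(values) for label, values in zip(labels, matrix)]

# A list of tuples is the form np.array accepts for a structured dtype.
table = np.array(rows, dtype=dt)

print(table['label'])
print(table['name1'])
```

Each field can then be read back by name, just as with arr['name1'] in the answer's example.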

3 Comments

Just a note: a dict of the form the OP gave is a perfectly valid dtype, apart from the format entries, which should be NumPy types (e.g. np.float64 rather than float), as you mentioned. It doesn't have to be a list of tuples, and specifying a dict of {'names': ['f0', 'f1', ...], 'formats': [np.float64, np.int64, ...]} as a dtype is often a lot more convenient.
@Joe, can 'f0', 'f1' be indices? I typically have one column that is a date or a string, and the rest are floats, e.g. 'foo', 1, 2, 2, 44, 3 or 22, 2, 2, 2, 2, '3/2/2001'. What's the best dtype solution?
+1 for noting the exceptions raised when the input data is something other than a list of tuples; I've bumped into this too often.
