I have a Python list of lists that I want to convert to a NumPy array, and I have defined the dtype for that array. Some values might be None or ''. NumPy raises an error for those when the corresponding field's dtype is float or int. Is there a way to tell NumPy to assign 1 (or some other value specified per field) when the value is None or ''?

For example, the following code raises this error:

ValueError: could not convert string to float.

    import numpy as np
    dt = np.dtype([('a', np.int32),
                   ('b', np.float32),
                   ('c', np.int32),
                   ('d', np.float32),
                   ('e', np.int32),
                   ])
    npar = np.array(('667000', '0', '0', '', ''), dt)

The expected output for npar, with 0.0 assigned to d and 1 to e as default values, is:

    (667000, 0.0, 0, 0.0, 1) 

I have large multidimensional arrays to convert, so performance is an important consideration.

  • Please post a link to a sample of your data. Pandas seems to be the best way to solve your problem. Commented Jun 23, 2015 at 19:55
  • The sample data is an array of a few thousand rows like the one shown above, and the other statements are the same. I want to convert the 2D array at once. Commented Jun 23, 2015 at 20:23

2 Answers

This might work:

One liner:

    s = ('667000', '0', '0', '', '')
    defaults = {'d': 0, 'e': 1}  # fallback per field; any other empty field becomes 0
    npar = np.array(tuple(v if v not in ('', None) else defaults.get(n, 0) for n, v in zip(dt.names, s)), dt)

Or:

    import numpy as np
    dt = np.dtype([('a', np.int32),
                   ('b', np.float32),
                   ('c', np.int32),
                   ('d', np.float32),
                   ('e', np.int32),
                   ])
    s = ('667000', '0', '0', '', '')
    t = np.array(s)       # array of strings
    if not t[4]:          # field e is empty: use its default of 1
        t[4] = 1
    t[t == ''] = 0        # every other empty field defaults to 0
    npar = np.array(tuple(t), dt)
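
The same idea can be extended to many rows with per-field defaults. What follows is only a sketch and not part of the original answer: the helper name convert_row and the DEFAULTS mapping are made up for illustration (NumPy has no such option on np.array), and it assumes every field is an integer or a float, as in the question.

```python
import numpy as np

dt = np.dtype([('a', np.int32), ('b', np.float32), ('c', np.int32),
               ('d', np.float32), ('e', np.int32)])

# Hypothetical per-field fallbacks (not a NumPy feature); unlisted fields use 0.
DEFAULTS = {'d': 0.0, 'e': 1}

def convert_row(row, dtype, defaults):
    """Replace None or '' with the field's default, then cast to int or float."""
    out = []
    for name, value in zip(dtype.names, row):
        if value is None or value == '':
            value = defaults.get(name, 0)
        # Assumes only integer and floating-point fields, as in the question.
        cast = float if dtype[name].kind == 'f' else int
        out.append(cast(value))
    return tuple(out)

rows = [('667000', '0', '0', '', ''),
        ('668000', '0', '0', '3', None)]

# A list of tuples maps onto a 1-D structured array, one tuple per record.
npar = np.array([convert_row(r, dt, DEFAULTS) for r in rows], dtype=dt)
print(npar)
```

This still loops over every value in Python, so for very large inputs it may be worth measuring it against a parser-based approach before relying on it.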

2 Comments

thanks, but I was wondering if there is a pythonic way for specifying default values for array columns.
@user3161836 , okie, I've updated answer with a one-liner :)

The numpy.loadtxt function (numpy.lib.npyio.loadtxt) has a converters option.

Let data2.txt be:

    667000;0;0;;;
    668000;0;0;3;6;

After running

    u = np.loadtxt('data2.txt', dtype=dt, delimiter=';', converters={3: lambda s: np.float32(s or 0), 4: lambda s: np.int32(s or 1)})

u is:

    array([(667000, 0.0, 0, 0.0, 1), (668000, 0.0, 0, 3.0, 6)], dtype=...)

with the missing values substituted.
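
For trying this without a file on disk, here is a minimal self-contained sketch. It makes a few assumptions that are not in the answer above: the text is read from an io.StringIO buffer instead of data2.txt, the rows carry no trailing delimiter so that each has exactly five fields, and the converters use the built-in float and int rather than the NumPy scalar types.

```python
import io
import numpy as np

dt = np.dtype([('a', np.int32), ('b', np.float32), ('c', np.int32),
               ('d', np.float32), ('e', np.int32)])

# In-memory stand-in for data2.txt (assumed layout, no trailing ';').
text = "667000;0;0;;\n668000;0;0;3;6\n"

# Converters are keyed by column index. Each one receives the raw field
# and returns the value to store; an empty field is falsy, so `or`
# substitutes the default.
converters = {
    3: lambda s: float(s or 0),  # field d defaults to 0.0
    4: lambda s: int(s or 1),    # field e defaults to 1
}

u = np.loadtxt(io.StringIO(text), dtype=dt, delimiter=';',
               converters=converters)
print(u)
```

Only columns 3 and 4 have converters here, so an empty value in any other column would still raise an error.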
