I have a NumPy array whose elements are displayed in single quotes, and I want to convert its dtype to float.

array = 
 [['20150101' '0.12']
 ['20150102' '0.42']
 ['20150103' '0.12']
 ['20150104' '0.46']
 ['20150105' '0.14']
 ['20150106' '0.1']
 ['20150107' '0.27']
 ['20150108' '0.03']
 ['20150109' '0.16']
 ['20150110' '0.29']
 ['20150111' '0.32']
 ['20150112' '0.16']]

I tried:

values = array.item().split(' ')
new_array = np.asarray(values, dtype='float')

but I get ValueError: can only convert an array of size 1 to a Python scalar. I want the output to look like this (no single quotes):

new_array = 
     [[20150101 0.12]
     [20150102 0.42]
     [20150103 0.12]
     [20150104 0.46]
     [20150105 0.14]
     [20150106 0.10]
     [20150107 0.27]
     [20150108 0.03]
     [20150109 0.16]
     [20150110 0.29]
     [20150111 0.32]
     [20150112 0.16]]

Is there a NumPy function that lets me remove the single quotes?

  • new_array = array.astype(float)? Commented Feb 24, 2019 at 1:31
  • The quotes signify that the array dtype is a string, eg 'U10'. Commented Feb 24, 2019 at 1:53
  • Your desired version has a mix of integers and floats. Is that intentional? What single string were you trying to split? Why? Commented Feb 24, 2019 at 2:02

1 Answer

What you show is a 2d array with a string dtype, which I can recreate with:

In [420]: arr = np.array([['20150101', '0.12'], 
     ...:  ['20150102', '0.42'], 
     ...:  ['20150103', '0.12'], 
     ...:  ['20150104', '0.46']])                                               
In [421]: arr                                                                   
Out[421]: 
array([['20150101', '0.12'],     # the repr display
       ['20150102', '0.42'],
       ['20150103', '0.12'],
       ['20150104', '0.46']], dtype='<U8')
In [422]: print(arr)                        # the str display                                                
[['20150101' '0.12']
 ['20150102' '0.42']
 ['20150103' '0.12']
 ['20150104' '0.46']]

The quotes reflect the underlying nature of the array: its elements are strings. They aren't just an incidental part of the display that can be stripped off.

Conversion to a float dtype array:

In [423]: arr.astype(float)                                                     
Out[423]: 
array([[2.0150101e+07, 1.2000000e-01],
       [2.0150102e+07, 4.2000000e-01],
       [2.0150103e+07, 1.2000000e-01],
       [2.0150104e+07, 4.6000000e-01]])

NumPy displays the result in scientific notation because the two columns differ so widely in magnitude; the stored values are unaffected. The first column by itself displays as:

In [424]: _[:,0]                                                                
Out[424]: array([20150101., 20150102., 20150103., 20150104.])
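If the scientific notation is only a display nuisance, the print options can be adjusted without touching the data. The following is a minimal sketch, assuming NumPy 1.15 or later (for the np.printoptions context manager); the name floats is just illustrative.

```python
import numpy as np

# Same string-dtype array as in the answer above
arr = np.array([['20150101', '0.12'],
                ['20150102', '0.42'],
                ['20150103', '0.12'],
                ['20150104', '0.46']])

# Convert every element from string to float64
floats = arr.astype(float)

# suppress=True asks NumPy to prefer fixed-point over scientific notation
# when printing; it changes only the display, not the stored values
with np.printoptions(suppress=True):
    print(floats)
```

The array still holds a single float dtype, so the first column remains float rather than integer.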

I can get a mix of integer and float with:

In [426]: arr1 = np.zeros((4,), dtype='i,f')                                    
In [427]: arr1                                                                  
Out[427]: 
array([(0, 0.), (0, 0.), (0, 0.), (0, 0.)],
      dtype=[('f0', '<i4'), ('f1', '<f4')])
In [428]: arr1['f0'] = arr[:,0]                                                 
In [429]: arr1['f1'] = arr[:,1]                                                 
In [430]: arr1                                                                  
Out[430]: 
array([(20150101, 0.12), (20150102, 0.42), (20150103, 0.12),
       (20150104, 0.46)], dtype=[('f0', '<i4'), ('f1', '<f4')])

This is a 1d structured array. Notice the difference in notation, including the use of () around each record.
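A variation on the same idea, sketched with named fields and 64-bit types so the second column is not limited to float32 precision. The field names date and value are my own choice, not something from the question, and the explicit astype calls are there only to make the string-to-number step visible.

```python
import numpy as np

arr = np.array([['20150101', '0.12'],
                ['20150102', '0.42'],
                ['20150103', '0.12'],
                ['20150104', '0.46']])

# Hypothetical field names; 'i8' is int64 and 'f8' is float64
dt = np.dtype([('date', 'i8'), ('value', 'f8')])
rec = np.zeros(arr.shape[0], dtype=dt)

# Fill each field from the matching string column
rec['date'] = arr[:, 0].astype(np.int64)
rec['value'] = arr[:, 1].astype(np.float64)

print(rec)
print(rec['date'])   # integer column
print(rec['value'])  # float column
```

Each field can then be accessed by name, which is often easier to read than positional column indexing.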

2 Comments

It looks like the asker might be looking for the datetime64[D] type for his first column
@Eric, yes the numbers do look date-like. But the strings won't work as is. They'll need editing, e.g. '2015-01-01'.
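The editing mentioned in that comment could be sketched as follows, assuming every string in the first column is exactly eight digits in YYYYMMDD order. The names raw, iso and dates are illustrative only.

```python
import numpy as np

# First column of the question's array, as strings
raw = np.array(['20150101', '20150102', '20150103', '20150104'])

# Insert hyphens to get ISO 8601 dates, e.g. '20150101' -> '2015-01-01'
iso = [f"{s[:4]}-{s[4:6]}-{s[6:]}" for s in raw]

# Build a day-resolution datetime64 array from the ISO strings
dates = np.array(iso, dtype='datetime64[D]')

print(dates)
print(dates[1] - dates[0])  # difference between consecutive dates
```

With a datetime64 column, date arithmetic and comparisons work directly, which plain integers like 20150101 do not support.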
