Converting numpy ndarray with dtype <U30 into float

Question

I'm reading a list from a pandas DataFrame cell.

>># table is an existing pandas DataFrame
>>x = table.loc[table['person'] == 123, table.columns != 'xyz']['segment'][0]
>>print("X = ",x)

where 'person' and 'segment' are my column names and segment contains a list with floating values.

X =  [[39.414, 39.498000000000005]]

Now, when I try to convert this into a numpy array,

>>x = numpy.asarray(x)
>>x=x.astype(float)

I get the following error

ValueError: could not convert string to float: '[[39.414, 39.498000000000005]]'

I have tried parsing the string and tried to remove any "\n" or " " or any unnecessary quotes, but it does not work. Then I tried to find the dtype

>>print("Dtype = ", x.dtype)
Dtype =  <U30

I assume that we need to convert the U30 dtype into floats, but I am not sure how to do it. I am using numpy version 1.15.0.

All I want to do is parse the above list into a list of floating-point values.

Looks like you have a string representation of a list. Try using ast.literal_eval(x) first. Do it on the entire column to make this easier: df.segment = df.segment.apply(ast.literal_eval)
– user3483203, Commented Sep 11, 2018 at 16:22
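A minimal sketch of the column-wide conversion suggested in this comment. The DataFrame below is hypothetical: the column names come from the question, the second person id is made up, and the two strings are the examples quoted on this page.

```python
import ast

import numpy as np
import pandas as pd

# Hypothetical frame: each 'segment' cell holds the *string* form of a list
df = pd.DataFrame({
    "person": [123, 456],
    "segment": [
        "[[39.414, 39.498000000000005]]",
        "[[39.414, 39.498000000000005],[344.234234,442.23432]]",
    ],
})

# Parse every cell once, so later lookups return real nested lists
df["segment"] = df["segment"].apply(ast.literal_eval)

x = df.loc[df["person"] == 123, "segment"].iloc[0]
arr = np.array(x, dtype=float)
print(type(x).__name__, arr.shape)  # list (1, 2)
```

Using .iloc[0] takes the first matching row by position, which avoids relying on the row label being 0.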

user3483203 · Accepted Answer · 2018-09-11 16:28:27Z

3

The dtype should have tipped you off. U30 here stands for a unicode string of length 30 (which is what you'll see if you type len(x) on the original string).

What you have is the string representation of a list, not a list of strings, floats, etc.
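A quick way to confirm this, sketched with the string from the question: wrapping a str in np.asarray gives a zero-dimensional unicode array, and astype(float) then tries to parse the whole string as a single number.

```python
import numpy as np

x = '[[39.414, 39.498000000000005]]'  # a str, not a list

arr = np.asarray(x)
print(type(x).__name__, len(x))  # str 30
print(arr.dtype.kind, arr.shape)  # U ()

try:
    arr.astype(float)
except ValueError as exc:
    # The entire string is treated as one value, so the conversion fails
    print("ValueError:", exc)
```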

You need to use the ast library here:

import ast
import numpy as np

x = '[[39.414, 39.498000000000005]]'
x = ast.literal_eval(x)  # safely parse the string into a nested list
np.array(x, dtype=float)

array([[39.414, 39.498]])

jpp · Answer · 2018-09-11 16:33:10Z

2

For the specific format you see, consider np.fromstring. With string slicing you can also remove the unused dimension:

import numpy as np

x = '[[39.414, 39.498000000000005]]'

# x[2:-2] strips the outer '[[' and ']]' before parsing
res = np.fromstring(x[2:-2], sep=',')

# array([39.414, 39.498])
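To make "remove the unused dimension" concrete, here is a small comparison sketch using the same string. It only illustrates the shapes each approach returns; the variable names nested and flat are chosen for this example.

```python
import ast

import numpy as np

x = '[[39.414, 39.498000000000005]]'

# literal_eval keeps both bracket levels, so the result is 2-D
nested = np.array(ast.literal_eval(x), dtype=float)

# Slicing off the brackets first leaves a flat, comma-separated string
flat = np.fromstring(x[2:-2], sep=',')

print(nested.shape)  # (1, 2)
print(flat.shape)    # (2,)
print(np.allclose(nested[0], flat))  # True
```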

4 Comments

appsdownload Over a year ago

Hi! This works for the above example, but it does not work for multi-dimensional arrays like this: x = '[[39.414, 39.498000000000005],[344.234234,442.23432]]'. In that case, x = ast.literal_eval(x) followed by np.array(x, dtype=float) is more suitable.

jpp Over a year ago

@appsdownload, yes, hence the wording "For the specific format you see".

appsdownload Over a year ago

Oh. Thank you @jpp. Also can you help me understand which one would be more efficient if I have a specific format?

jpp Over a year ago

I'm not sure. That's a separate question, but look up the timeit module and you can test for yourself!
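One way to run the comparison suggested here, as a rough sketch with the timeit module. The helper names with_literal_eval and with_fromstring are made up for this example, and the timings will vary by machine and NumPy version, so no result is claimed.

```python
import ast
import timeit

import numpy as np

x = '[[39.414, 39.498000000000005]]'


def with_literal_eval():
    # General approach: parse the nested list, then build the array
    return np.array(ast.literal_eval(x), dtype=float)


def with_fromstring():
    # Format-specific approach: strip '[[' and ']]', then parse the numbers
    return np.fromstring(x[2:-2], sep=',')


n = 10_000  # number of calls per measurement
t_eval = timeit.timeit(with_literal_eval, number=n)
t_from = timeit.timeit(with_fromstring, number=n)

print(f"literal_eval: {t_eval / n * 1e6:.2f} us per call")
print(f"fromstring:   {t_from / n * 1e6:.2f} us per call")
```

Note that the two functions return different shapes, (1, 2) and (2,), so any speed difference only matters if the flat result is acceptable.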
