Replacing string by numpy array in a pandas DataFrame

Question

I have a csv file that looks like this:

A,B
34,"1.0, 2.0"
24,"3.0, 4.0"

I'm reading the file using pandas:

import numpy as np
import pandas as pd
df = pd.read_csv('file.csv')

What I need to do is to replace the strings by numpy arrays:

for index, row in df.iterrows():
    df['B'][index] = np.fromstring(df['B'][index], sep=',')

However, this triggers the warning "A value is trying to be set on a copy of a slice from a DataFrame" (SettingWithCopyWarning), even though the numpy arrays themselves are created correctly.

I need all values in B to be of type numpy.ndarray.

Edit: I tried replacing df with row in the code:

for index, row in df.iterrows():
    row['B'] = np.fromstring(row['B'], sep=',')

And no error is raised, but the type of the variables doesn't change and the DataFrame still contains strings.

When you use df['B'][index], you first create an intermediate object (df['B']), which is then updated at the given index. When that intermediate object is a copy rather than a view, the underlying DataFrame doesn't get updated. If you index only once, you shouldn't get that problem: df.loc[index, 'B'] = .... Read more in the documentation.
– Swier, Commented Apr 8, 2020 at 13:14
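
A minimal sketch of the single-indexing form the comment describes, assuming a small in-memory DataFrame shaped like the one in the question (the replacement string is a made-up value). It only writes a plain string with one .loc call; it does not attempt to store an array in a cell.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame read from file.csv.
df = pd.DataFrame({"A": [34, 24], "B": ["1.0, 2.0", "3.0, 4.0"]})

# One indexing step: the row label and the column name go into a single
# .loc call, so the assignment is applied to df itself rather than to an
# intermediate object returned by df["B"].
df.loc[0, "B"] = "9.0, 8.0"

print(df.loc[0, "B"])
```

For converting the whole column to arrays, the answers below replace the column in one step instead of writing cell by cell.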

jezrael · Accepted Answer · 2020-04-08 13:11:28Z

2

Use the converters parameter of read_csv to convert the column to numpy arrays while reading:

import pandas as pd
import numpy as np
from io import StringIO

temp='''A,B
34,"1.0, 2.0"
24,"3.0, 4.0"'''
# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp), converters={'B':lambda x: np.fromstring(x, sep=',')})

print(df)
    A           B
0  34  [1.0, 2.0]
1  24  [3.0, 4.0]
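
To confirm that the converter yields arrays rather than strings, one could inspect the cells of column B. The sketch below repeats the in-memory sample from above; the names csv_text, parse_floats and cell_types are illustrative only and not part of the original answer.

```python
from io import StringIO

import numpy as np
import pandas as pd

# Same sample data as above, kept in memory instead of a file.
csv_text = 'A,B\n34,"1.0, 2.0"\n24,"3.0, 4.0"'


def parse_floats(text):
    # Text-mode parsing: split on commas and convert each piece to float.
    return np.fromstring(text, sep=",")


df = pd.read_csv(StringIO(csv_text), converters={"B": parse_floats})

# Collect the Python type of every cell in column B.
cell_types = {type(value) for value in df["B"]}
print(cell_types)
```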

Bruno Mello · Accepted Answer · 2020-04-08 13:08:22Z

1

You can use apply to convert the existing column to that format:

df['B'] = df['B'].apply(lambda x: np.fromstring(x, sep=','))
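
A self-contained sketch of the same idea, assuming the DataFrame has already been loaded with B as strings (the inline DataFrame below is a stand-in for the file in the question):

```python
import numpy as np
import pandas as pd

# Stand-in for the DataFrame that read_csv would return.
df = pd.DataFrame({"A": [34, 24], "B": ["1.0, 2.0", "3.0, 4.0"]})

# Parse every string and replace the whole column in one assignment,
# instead of writing to individual cells inside a loop.
df["B"] = df["B"].apply(lambda x: np.fromstring(x, sep=","))

first = df["B"].iloc[0]
print(type(first), first)
```

Because the column is assigned as a whole, the chained-indexing warning from the question should not come into play here.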
