I have a csv file that looks like this:

A, B
34, "1.0, 2.0"
24, "3.0, 4.0"

I'm reading the file using pandas:

import numpy as np
import pandas as pd
df = pd.read_csv('file.csv')

What I need to do is to replace the strings by numpy arrays:

for index, row in df.iterrows():
    df['B'][index] = np.fromstring(df['B'][index], sep=',')

However, this raises the warning "A value is trying to be set on a copy of a slice from a DataFrame", even though the numpy arrays themselves are created correctly.

I need all values in B to be of type numpy.ndarray.

Edit: I tried replacing df with row in the code:

for index, row in df.iterrows():
    row['flux'] = np.fromstring(row['flux'][index][1:-1], sep=',')

No error is raised, but the types of the values don't change and the DataFrame still contains strings.

  • When you use df['B'][index], you first create an intermediate object (df['B']) and then update it at the given index. That intermediate object may be a copy, in which case the underlying DataFrame doesn't get updated. If you index only once, you shouldn't get that problem: df.loc[index, 'B'] = .... Read more in the documentation. Commented Apr 8, 2020 at 13:14
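
To illustrate the single-indexing idea from the comment above, here is a minimal sketch, assuming the two-row sample from the question (read from memory rather than from 'file.csv'). It uses df.at instead of df.loc because df.at always targets exactly one cell, whereas df.loc may try to spread an array across several cells. The cast to object dtype is my own assumption, added so that each cell can hold an array.

```python
import numpy as np
import pandas as pd
from io import StringIO

# Sample data from the question, read from memory instead of 'file.csv'
csv_text = 'A,B\n34,"1.0, 2.0"\n24,"3.0, 4.0"'
df = pd.read_csv(StringIO(csv_text))

# Assumption: force object dtype so each cell can hold a numpy array
df['B'] = df['B'].astype(object)

for index in df.index:
    # A single indexing step per cell, so the assignment reaches df itself
    df.at[index, 'B'] = np.fromstring(df.at[index, 'B'], sep=',')

# Collect the type of every cell in B for inspection
cell_types = [type(value) for value in df['B']]
```

After the loop, every entry of cell_types should be numpy.ndarray. For larger frames, a vectorised approach such as apply or converters is likely the better choice; the loop is only here to mirror the original code.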

2 Answers

Use the converters parameter of read_csv to convert the column to numpy arrays while reading:

import pandas as pd
import numpy as np
from io import StringIO

temp='''A,B
34,"1.0, 2.0"
24,"3.0, 4.0"'''
# after testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), converters={'B':lambda x: np.fromstring(x, sep=',')})

print(df)
    A           B
0  34  [1.0, 2.0]
1  24  [3.0, 4.0]

You can use apply to convert the column to that format after reading:

df['B'] = df['B'].apply(lambda x: np.fromstring(x, sep=','))
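
As a quick check, here is a self-contained sketch of the same idea, assuming the sample values from the question are already in a DataFrame. The to_array helper is a hypothetical name, used here in place of the lambda only so the parsing step is easier to read.

```python
import numpy as np
import pandas as pd

# Sample values from the question, built directly instead of read from a file
df = pd.DataFrame({'A': [34, 24], 'B': ['1.0, 2.0', '3.0, 4.0']})

def to_array(text):
    """Hypothetical helper: parse 'x, y, ...' into a 1-D float array."""
    return np.fromstring(text, sep=',')

# Replace every string in B with the array parsed from it
df['B'] = df['B'].apply(to_array)

# Verify that each cell now holds a numpy array
all_arrays = df['B'].map(lambda v: isinstance(v, np.ndarray)).all()
first_row = df.loc[0, 'B']
```

Because apply returns a whole new column that is assigned back in one step, there is no chained indexing involved, so the copy warning from the question should not come up.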
