
Assume the following data frame:

import pandas as pd
import numpy as np

vals = [1, 2, 3, 4, 5]

df = pd.DataFrame({'val': vals})
df['val'][[0, 3]] = np.nan

Gives:

    val
0   NaN
1   2.0
2   3.0
3   NaN
4   5.0

I need to be able to replace NaN values in the val column with a 2D numpy array of zeros. When I do the following:

z = np.zeros((10, 10))

df['val'][df['val'].isnull()] = z

The arrays are converted to scalars of value 0.0:

    val
0   0.0
1   2.0
2   3.0
3   0.0
4   5.0

I really need the array to be maintained (in this case, each NaN value, i.e. rows 0 and 3 of the original data frame, should be replaced with a 10x10 array of zeros). I've tried converting to object type first:

df = df.astype(object)
df['val'][df['val'].isnull()] = z

With no success. Why?

  • Will you please add a sample of your expected output? Commented Dec 23, 2021 at 1:12
  • It's pretty clear from the example, right? Commented Dec 23, 2021 at 1:25
  • So for 0 in val, do you want the 0th array of z, and for 3 in val, you want the 3rd item from z? Commented Dec 23, 2021 at 1:30
  • See my very simple answer below... Commented Dec 23, 2021 at 1:55

2 Answers


This is caused by the object data type. One workaround is fillna with a dict that maps each NaN index label to z:

df.val.fillna(dict(zip(df.index[df['val'].isnull()],[z]*df['val'].isnull().sum())),inplace=True)
df
                                                 val
0  [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...
1                                                2.0
2                                                3.0
3  [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...
4                                                5.0

5 Comments

Almost, but this replaces values with an array of shape (10,). I really need it to be replaced with z, which is shape (10,10) - a 2d array. Try df['val'][0].shape
@JmeCS check the update
Many thanks! This is gnarly. Really unclear to me why it drops the array structure to a scalar in the first place...
@JmeCS check my answer. It's simpler, and it might work for you.
@richardec I upvoted it but it doesn't work for my real-world problem unfortunately. Not clear why. It's too complicated to replicate here but your solution definitely works for the dummy problem.

You were really close. Change the dataframe's dtype to object and change = z to = [z]:

df = df.astype(object)
df.loc[df['val'].isna(), 'val'] = [z]

Output:

>>> df
                                                 val
0  [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...
1                                                2.0
2                                                3.0
3  [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,...
4                                                5.0
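If the list assignment does not carry over to a larger frame, a more explicit alternative is to build the column cell by cell. The following is only a sketch under assumptions: the frame is rebuilt here with float values so the block is self-contained, and the name filled is made up for illustration. It fills a 1-D object array one element at a time, so z is never broadcast, and gives each missing cell its own copy of z.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame (assumption: same layout as in the question).
df = pd.DataFrame({'val': [np.nan, 2.0, 3.0, np.nan, 5.0]})
z = np.zeros((10, 10))

# A 1-D object array can hold arbitrary Python objects, one per cell.
# 'filled' is a hypothetical helper name used only in this sketch.
filled = np.empty(len(df), dtype=object)
for i, v in enumerate(df['val']):
    # Missing cells get an independent copy of z; other values pass through.
    filled[i] = z.copy() if pd.isna(v) else v

# Replace the column with the object array.
df['val'] = filled

print(df['val'].iloc[0].shape)
```

Copying z per cell is a design choice: without .copy(), every filled cell would reference the same array, so an in-place change to one would show up in all of them.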
