
I have this pandas DataFrame:

import pandas
d = pandas.DataFrame([{"a": 1}, {"a": 3, "b": 2}])

and I'm trying to add a new column to it with non-null values only for certain rows, selected by their integer positions in the DataFrame. For example, adding a new column "c" only to the first row of d:

# array of row indices
indx = np.array([0])
d.ix[indx]["c"] = "foo"

which should set "foo" as the column "c" value for the first row, and NaN for all other rows. But this doesn't seem to change the DataFrame:

d.ix[np.array([0])]["c"] = "foo"
In [18]: d
Out[18]: 
   a   b
0  1 NaN
1  3   2

What am I doing wrong here? How can it be done? Thanks.

  • If this is anything like a numpy array, shouldn't this be homogeneous? Commented Mar 29, 2013 at 13:40
  • It's very possible to have a pandas df with a mixture of string values and NaN values. Commented Mar 29, 2013 at 13:44

1 Answer

In [11]: df = pd.DataFrame([{"a": 1}, {"a": 3, "b": 2}])

In [12]: df['c'] = np.array(['foo',np.nan])

In [13]: df
Out[13]: 
   a   b    c
0  1 NaN  foo
1  3   2  nan

Note that np.array(['foo', np.nan]) coerces the NaN to the string 'nan' (visible in the output above); passing a plain list ['foo', np.nan] keeps a true NaN. If you were assigning a numeric value, the following would work:

In [16]: df['c'] = np.nan

In [17]: df.ix[0,'c'] = 1

In [18]: df
Out[18]: 
   a   b   c
0  1 NaN   1
1  3   2 NaN
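For anyone on a current pandas version: .ix was deprecated and later removed, but the same single-call indexing works with .loc, and it handles strings as well, because setting a not-yet-existing column via .loc creates it with object dtype. A minimal sketch:

```python
import pandas as pd

df = pd.DataFrame([{"a": 1}, {"a": 3, "b": 2}])

# One .loc call with (row label, column label) avoids chained assignment.
# Since column "c" does not exist yet, pandas creates it ("setting with
# enlargement"), filling all other rows with NaN.
df.loc[0, "c"] = "foo"
```

The key point is the single `df.loc[row, col] = value` call: the question's `d.ix[indx]["c"] = "foo"` assigned into an intermediate copy, which is why the original DataFrame never changed.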

4 Comments

I knew it would work if I assigned an entire array to df['c'], but is there no way to assign just particular elements and have it infer the rest are NaN? It looks like I have to explicitly construct an array of size len(df) with NaNs and the non-NaN values...
df['c'].update(pd.Series(['foo'], index=[0])) should work, but this is a bug right now; if your assignments are numeric, you can just use my second example.
Thanks. They are not numeric, so I'll just stick to manually constructing the whole array.
OK... I fixed this bug in 0.11-dev in any event; see here: github.com/pydata/pandas/pull/3219. Thanks!
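There is also no need to construct the whole array by hand: assigning a Series to a column aligns on the index, so rows whose labels are missing from the Series get NaN automatically. A minimal sketch of this approach (column name "c" as in the question):

```python
import pandas as pd

df = pd.DataFrame([{"a": 1}, {"a": 3, "b": 2}])

# Column assignment from a Series is index-aligned: only index label 0
# receives "foo"; every other row is filled with NaN.
df["c"] = pd.Series(["foo"], index=[0])
```

This generalizes to any set of row labels and any values, string or numeric, without building a full-length array first.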
