I have the following DataFrame:

import numpy as np
import pandas as pd

Y = list(range(5))
Z = np.full(5, np.nan)
df = pd.DataFrame(dict(ColY = Y, ColZ = Z))
print(df)
   ColY  ColZ
0     0   NaN
1     1   NaN
2     2   NaN
3     3   NaN
4     4   NaN

And this dictionary:

Dict = {
    0 : 1,
    1 : 2,
    2 : 3,
    3 : 2,
    4 : 1
}

I would like to set ColZ to "ok" wherever the value that Dict maps ColY to is 2. The expected DataFrame is:

   ColY ColZ
0     0  NaN
1     1   ok
2     2  NaN
3     3   ok
4     4  NaN

I tried this script:

df['ColZ'] = df['ColZ'].apply(lambda x : "ok" if Dict[x['ColY']] == 2 else Dict[x['ColY']])

I get this error:

TypeError: 'float' object is not subscriptable

Do you know why?

    df['ColZ'] = df['ColY'].apply(lambda x : "ok" if Dict[x] == 2 else np.nan) Commented Jan 31, 2020 at 13:32

1 Answer

Use numpy.where with Series.map: map ColY through Dict to get a new Series, then compare it with 2 using Series.eq (==):

df['ColZ'] = np.where(df['ColY'].map(Dict).eq(2), 'ok', np.nan)
print(df)
   ColY ColZ
0     0  nan
1     1   ok
2     2  nan
3     3   ok
4     4  nan
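
Note that the output above prints nan in lowercase. np.where returns a single NumPy array with one dtype, so mixing 'ok' with np.nan typically produces strings, and the missing entries are then the text 'nan' rather than real missing values. If real NaN values are needed, one possible alternative is Series.mask. This is only a minimal sketch, assuming the same df and Dict as in the question; the name is_two is illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"ColY": list(range(5)), "ColZ": np.full(5, np.nan)})
Dict = {0: 1, 1: 2, 2: 3, 3: 2, 4: 1}

# Boolean Series: True where Dict maps ColY to 2
is_two = df["ColY"].map(Dict).eq(2)

# Cast to object first so the column can hold both NaN and strings,
# then replace only the rows where the condition is True
df["ColZ"] = df["ColZ"].astype(object).mask(is_two, "ok")
print(df)
print(df["ColZ"].isna().tolist())
```

Rows where the condition is False keep whatever ColZ already held, so isna() still reports them as missing.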

Detail:

print(df['ColY'].map(Dict))
0    1
1    2
2    3
3    2
4    1
Name: ColY, dtype: int64

Your solution works if you apply it to df['ColY'], so that x is a single value rather than a row, and use .get to return a default value, here np.nan, when a key has no match:

df['ColZ'] = df['ColY'].apply(lambda x : "ok" if Dict.get(x, np.nan) == 2 else np.nan)
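
As for why the original attempt raised the TypeError: Series.apply passes each element of the column to the function, not a row. With df['ColZ'].apply, x is therefore a single float (NaN here), and x['ColY'] tries to index that float. A plain-Python sketch of both points, with no pandas involved (the names x, message and labels are illustrative):

```python
Dict = {0: 1, 1: 2, 2: 3, 3: 2, 4: 1}

# Series.apply hands the function one element at a time;
# for ColZ in the question that element is a float NaN.
x = float("nan")
try:
    x["ColY"]  # same operation as in the failing lambda
    message = None
except TypeError as exc:
    message = str(exc)
print(message)

# Looking up the ColY values instead: each one is a key of Dict.
labels = ["ok" if Dict.get(y) == 2 else None for y in range(5)]
print(labels)

# .get returns the default for a missing key, where Dict[99] would raise KeyError.
print(Dict.get(99, float("nan")) == 2)
```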

EDIT: To keep the existing df['ColZ'] values where the condition is False, use:

Y = list(range(5))
Z = list('abcde')
df = pd.DataFrame(dict(ColY = Y, ColZ = Z))
print(df)
Dict = {
    0 : 1,
    1 : 2,
    2 : 3,
    3 : 2,
    4 : 1
}

df['ColZ1'] = np.where(df['ColY'].map(Dict).eq(2), 'ok', df['ColZ'])
df['ColZ2'] = df.apply(lambda x : "ok" if Dict.get(x['ColY'], np.nan) == 2
                                       else x['ColZ'], axis=1)
print(df)
   ColY ColZ ColZ1 ColZ2
0     0    a     a     a
1     1    b    ok    ok
2     2    c     c     c
3     3    d    ok    ok
4     4    e     e     e

2 Comments

Thanks @jezrael! How could I keep the value of ColZ rather than np.nan when the condition Dict.get(x, np.nan) == 2 is False? In this example ColZ has only NaN values, but in reality it has real values.
@Ewdlam - The answer was edited. The faster solution is the one with map; apply loops under the hood, so it is slow.