13

How can I apply a function element-wise to a pandas DataFrame and pass a column-wise calculated value (e.g. quantile of column)? For example, what if I want to replace all elements in a DataFrame (with NaN) where the value is lower than the 80th percentile of the column?

import numpy as np

def _deletevalues(x, quantile):
    # Replace a single value with NaN when it falls below the threshold
    if x < quantile:
        return np.nan
    else:
        return x

df.applymap(lambda x: _deletevalues(x, x.quantile(0.8)))

Using applymap gives access to each value only individually, so this (of course) raises AttributeError: 'float' object has no attribute 'quantile'.

Thank you in advance.

1
  • 2
    Replace x.quantile with df.quantile. Commented May 10, 2017 at 14:16

2 Answers

19

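Use DataFrame.mask. Note that quantile() defaults to q=0.5 (the median), which is what the output here uses; pass 0.8 to get the 80th percentile:

df = df.mask(df < df.quantile())
print(df)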
     a    b    c
0  NaN  7.0  NaN
1  NaN  NaN  6.0
2  NaN  NaN  5.0
3  8.0  NaN  NaN
4  7.0  3.0  5.0
5  6.0  7.0  NaN
6  NaN  NaN  NaN
7  8.0  4.0  NaN
8  NaN  NaN  6.0
9  7.0  7.0  6.0
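Since the question asks for the 80th percentile rather than the median, here is a minimal sketch of the same mask approach with q=0.8. The sample data is copied from the other answer in this thread, and the names thresholds and masked are only illustrative.

```python
import pandas as pd

# Sample data copied from the other answer in this thread
df = pd.DataFrame({
    "a": [1, 1, 3, 8, 7, 6, 0, 8, 5, 7],
    "b": [7, 2, 0, 2, 3, 7, 2, 4, 0, 7],
    "c": [3, 6, 5, 1, 5, 2, 1, 1, 6, 6],
})

# One threshold per column: a Series indexed by column name
thresholds = df.quantile(0.8)

# Values strictly below their column's threshold become NaN
masked = df.mask(df < thresholds)
print(masked)
```

Values equal to the threshold are kept, because the comparison is a strict less-than.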

6
In [139]: df
Out[139]:
   a  b  c
0  1  7  3
1  1  2  6
2  3  0  5
3  8  2  1
4  7  3  5
5  6  7  2
6  0  2  1
7  8  4  1
8  5  0  6
9  7  7  6

for all columns:

In [145]: df.apply(lambda x: np.where(x < x.quantile(),np.nan,x))
Out[145]:
     a    b    c
0  NaN  7.0  NaN
1  NaN  NaN  6.0
2  NaN  NaN  5.0
3  8.0  NaN  NaN
4  7.0  3.0  5.0
5  6.0  7.0  NaN
6  NaN  NaN  NaN
7  8.0  4.0  NaN
8  NaN  NaN  6.0
9  7.0  7.0  6.0
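If a named helper like the one in the question is preferred, one possible sketch is to make it operate on a whole column and let apply call it once per column. The helper name delete_below_quantile and the parameter q are hypothetical, not part of the original posts; Series.where keeps values where the condition holds and fills NaN elsewhere.

```python
import pandas as pd

def delete_below_quantile(column: pd.Series, q: float) -> pd.Series:
    """Return the column with values below its own q-quantile set to NaN."""
    threshold = column.quantile(q)
    # Series.where keeps entries where the condition is True
    return column.where(column >= threshold)

# Same sample data as in this answer
df = pd.DataFrame({
    "a": [1, 1, 3, 8, 7, 6, 0, 8, 5, 7],
    "b": [7, 2, 0, 2, 3, 7, 2, 4, 0, 7],
    "c": [3, 6, 5, 1, 5, 2, 1, 1, 6, 6],
})

# apply passes each column as a Series; extra keyword arguments go to the helper
result = df.apply(delete_below_quantile, q=0.8)
print(result)
```

This keeps the quantile calculation column-wise, which is what the element-wise applymap call in the question could not do.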

or

In [149]: df[df < df.quantile()] = np.nan

In [150]: df
Out[150]:
     a    b    c
0  NaN  7.0  NaN
1  NaN  NaN  6.0
2  NaN  NaN  5.0
3  8.0  NaN  NaN
4  7.0  3.0  5.0
5  6.0  7.0  NaN
6  NaN  NaN  NaN
7  8.0  4.0  NaN
8  NaN  NaN  6.0
9  7.0  7.0  6.0
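To see why the boolean-indexing version works, it may help to look at the intermediate objects. The sketch below assumes the same sample data; thresholds, below and result are illustrative names. It works on a float copy so the original frame is left untouched and no integer column has to be upcast during assignment.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1, 1, 3, 8, 7, 6, 0, 8, 5, 7],
    "b": [7, 2, 0, 2, 3, 7, 2, 4, 0, 7],
    "c": [3, 6, 5, 1, 5, 2, 1, 1, 6, 6],
})

# quantile() with no argument returns the median of each column,
# as a Series indexed by column name
thresholds = df.quantile()

# Comparing a DataFrame with a Series aligns the Series index
# with the DataFrame columns, so each column gets its own threshold
below = df < thresholds

# Assign on a float copy instead of mutating df in place
result = df.astype(float)
result[below] = np.nan
print(result)
```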
