I'm trying to generate a mask for broadcasting into dataframes: a boolean series that indicates whether a given row is between two values. This is easy to do for a single logical statement, say the last five rows of a dataframe:

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(10,1))
mask = (df.index.values>4)
df.loc[mask,'column'] = range(0,5)

But how does one do the same thing with a compound condition? For example, instead of the last five rows of the dataframe, can I address rows 2 through 6? Trying to use an AND statement for the mask fails, and I can't use between on dataframe index values.

1 Answer

I think a mask is mainly useful when the index contains duplicated values.

Since between works only on a Series, convert the index first with to_series or the Series constructor.

mask = df.index.to_series().between(2,6)
#mask = pd.Series(df.index, index=df.index).between(2,6)
print (mask)
0    False
1    False
2     True
3     True
4     True
5     True
6     True
7    False
8    False
9    False
dtype: bool

mask = df.index.to_series().between(2,6).values
print (mask)
[False False  True  True  True  True  True False False False]

Or chain conditions with &:

mask = (df.index >= 2) & (df.index <= 6)
print (mask)
[False False  True  True  True  True  True False False False]
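The question mentions that an AND statement fails. A minimal sketch of the likely cause, assuming the attempt used Python's `and` keyword: `and` asks for a single truth value of the whole array, which is ambiguous, while `&` combines the two comparisons element by element. The parentheses around each comparison are needed because `&` binds more tightly than `>=` and `<=`. The names `error_message` and `selected` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 1))

# `and` needs one True/False for the whole array, so it raises ValueError.
error_message = None
try:
    bad_mask = (df.index >= 2) and (df.index <= 6)
except ValueError as exc:
    error_message = str(exc)

# `&` works element-wise; keep each comparison in its own parentheses.
mask = (df.index >= 2) & (df.index <= 6)

# The boolean array can be passed straight to loc.
selected = df.loc[mask]
print(error_message)
print(selected.index.tolist())
```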

But if the index is unique and monotonic, it may be better to use loc, whose label slices include both endpoints:

df.loc[2:6, 0] = range(5)
print (df)
          0
0  0.642933
1  0.912846
2  0.000000
3  1.000000
4  2.000000
5  3.000000
6  4.000000
7  0.504830
8  0.000422
9  0.029358
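For the duplicated-index case mentioned at the start of this answer, here is a small sketch of the mask approach. The frame, its `value` column, and the index labels are made up for illustration; the index is deliberately unsorted and repeats the label 2, a situation where label slicing with loc is not something I would rely on.

```python
import pandas as pd

# Hypothetical frame: unsorted index with a repeated label (2).
df = pd.DataFrame(
    {"value": [10, 20, 30, 40, 50, 60, 70]},
    index=[0, 2, 5, 2, 7, 6, 3],
)

# The mask is computed per row, so order and duplicates do not matter.
mask = (df.index >= 2) & (df.index <= 6)

# Overwrite every row whose label lies in [2, 6].
df.loc[mask, "value"] = -1
print(df)
```

Because the mask tests each label independently, both rows labelled 2 are selected along with 5, 6 and 3, while 0 and 7 are left alone.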