
I have a dataframe df with 3 columns: time (timestamp), id (str) and red (boolean). I want to add another boolean column that, for each row, checks whether this row or either of the chronologically next two rows with the same id is red. (If there are fewer than two rows of the same id after this row, we only consider the rows that exist.)

What is an elegant way to do this? My approach was not elegant: I sorted by time, created an empty list called new_col and filled it in a loop over all rows of df by:

(for row_number in xrange(len(df)-2)...)

using iloc, and then assigned df['col'] = new_col. This is slow and not very readable.

2 Answers


Assuming you first sort by timestamp, you could group by id, and for each group, shift the values of red once and twice, and find the logical or of the result:

 df['col'] = df.groupby('id')['red'].transform(lambda g: g | g.shift(-1, fill_value=False) | g.shift(-2, fill_value=False))

For example:

In [100]: df = pd.DataFrame({'red': [True, True, True, False, False, True, True, True], 'id': [0] * 6 + [1] * 2})

In [101]: df.groupby('id')['red'].transform(lambda g: g | g.shift(-1, fill_value=False) | g.shift(-2, fill_value=False))
Out[101]: 
0    True
1    True
2    True
3    True
4    True
5    True
6    True
7    True
Name: red, dtype: bool
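
The example above has no time column and its ids are already contiguous, so here is a minimal sketch of the full recipe (sort, group by id, shift, OR). The sample rows and the helper name red_within_next_two are made up for illustration; fill_value=False treats a missing "next" row as not red. The last row of id 'a' stays False even though the following row (id 'b') is red, which shows that the check does not leak across ids.

```python
import pandas as pd

# Hypothetical sample data: two ids, rows deliberately out of time order.
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2016-09-03", "2016-09-01", "2016-09-02",
        "2016-09-02", "2016-09-01", "2016-09-04",
    ]),
    "id": ["a", "a", "b", "a", "b", "a"],
    "red": [True, False, False, False, True, False],
})

# Sort so that "next row" means "chronologically next row of the same id".
df = df.sort_values(["id", "time"]).reset_index(drop=True)


def red_within_next_two(g):
    """True where the row itself or one of the next two rows in g is red."""
    # Rows shifted in from past the end of the group count as not red.
    return g | g.shift(-1, fill_value=False) | g.shift(-2, fill_value=False)


# transform returns a result aligned with df's index, so it can be assigned directly.
df["col"] = df.groupby("id")["red"].transform(red_within_next_two)
print(df)
```

Sorting by time alone should work as well, since groupby keeps the row order within each group; sorting by id first just makes the printed result easier to read.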

1 Comment

How do I incorporate the id part of the question?

I agree with Ami, with one caveat: I believe you only want to check the subsequent rows for red, so I would drop the first operand of the OR inside the groupby:

# df1 (original df)
#   id    red        time
# 0  1   True  2016-09-01
# 1  1   True  2016-09-02
# 2  1   True  2016-09-03
# 3  2   True  2016-09-02
# 4  3  False  2016-09-03
# 5  4  False  2016-09-04
# 6  5  False  2016-09-05

# For each id, OR together the next row and the row after it.
# fill_value=False treats rows past the end of a group as not red.
df1['next red'] = df1.groupby('id')['red'].transform(
    lambda g: g.shift(-1, fill_value=False) | g.shift(-2, fill_value=False))
df1

OUTPUT:

  id    red        time next red
0  1   True  2016-09-01     True
1  1   True  2016-09-02     True
2  1   True  2016-09-03    False
3  2   True  2016-09-02    False
4  3  False  2016-09-03    False
5  4  False  2016-09-04    False
6  5  False  2016-09-05    False
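
If the lambda feels heavy, the same idea can probably be written without apply/transform by shifting on the groupby object directly. This is only a sketch: it rebuilds df1 from the table above and assumes a reasonably recent pandas where the groupby shift accepts fill_value. The column name 'red or next red' is invented here to show the other reading of the question (current row included) side by side.

```python
import pandas as pd

# df1 rebuilt from the table in this answer.
df1 = pd.DataFrame({
    "id": [1, 1, 1, 2, 3, 4, 5],
    "red": [True, True, True, True, False, False, False],
    "time": pd.to_datetime([
        "2016-09-01", "2016-09-02", "2016-09-03", "2016-09-02",
        "2016-09-03", "2016-09-04", "2016-09-05",
    ]),
})

by_id = df1.groupby("id")["red"]
# Shift within each id; positions past the end of a group become False.
next1 = by_id.shift(-1, fill_value=False)
next2 = by_id.shift(-2, fill_value=False)

# Reading 1: only the next two rows count.
df1["next red"] = next1 | next2
# Reading 2: the current row counts as well.
df1["red or next red"] = df1["red"] | next1 | next2
print(df1)
```

Both shifted Series keep df1's index, so the OR lines up row by row and no join is needed.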

1 Comment

I'm upvoting this answer because, after rereading the question, I think your interpretation may well be the correct one.
