1

I have a pandas dataframe like below.

   id  A  B  C
0   1  1  1  1
1   1  5  7  2
2   2  6  9  3
3   3  1  5  4
4   3  4  6  2

After evaluating conditions,

   id  A  B  C  a_greater_than_b  b_greater_than_c  c_greater_than_a
0   1  1  1  1             False             False             False
1   1  5  7  2             False              True             False
2   2  6  9  3             False              True             False
3   3  1  5  4             False              True              True
4   3  4  6  2             False              True             False

And after evaluating conditions, want to aggregate the results per id.

id  a_greater_than_b  b_greater_than_c  c_greater_than_a
1   False             False             False
2   False             True              False
3   False             True              False

2 Answers 2

3

Group by id and aggregate the condition columns using all

agg_cols = ['a_greater_than_b', 'b_greater_than_c',  'c_greater_than_a']
res = df.groupby('id')[agg_cols].all()

Output:

>>> df

   id  A  B  C  a_greater_than_b  b_greater_than_c  c_greater_than_a
0   1  1  1  1             False             False             False
1   1  5  7  2             False              True             False
2   2  6  9  3             False              True             False
3   3  1  5  4             False              True              True
4   3  4  6  2             False              True             False

>>> res

    a_greater_than_b  b_greater_than_c  c_greater_than_a
id                                                      
1              False             False             False
2              False              True             False
3              False              True             False
Sign up to request clarification or add additional context in comments.

Comments

0

use group by on id and all as the aggregate function

txt="""
1  1  1  1
1  5  7  2
2  6  9  3
3  1  5  4
3  4  6  2
"""

df = pd.DataFrame(columns=['id', 'A', 'B', 'C'])
for line in txt.split('\n'):
    if line.strip():
        df.loc[len(df)] = line.split()
            
print(df)

df['a_greater_than_b']=df['A']>df['B']
df['b_greater_than_c']=df['B']>df['C']
df['c_greater_than_a']=df['C']>df['A']
grouped=df.groupby('id').agg({'a_greater_than_b': 'all', 'b_greater_than_c': 'all', 'c_greater_than_a': 'all'})
print(grouped)

output:

a_greater_than_b  b_greater_than_c  c_greater_than_a
id                                                      
1              False             False             False
2              False              True             False
3              False              True             False

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.