I have a pandas DataFrame like the one below.
import pandas as pd

# Sample data: a group label ('flag') and two integer columns
data = [['A',1,3],
        ['A',1,2],
        ['A',0,1],
        ['A',0,1],
        ['A',5,6],
        ['B',0,5],
        ['B',1,9],
        ['B',1,2],
        ['B',1,1]]
df = pd.DataFrame(data, columns = ['flag', 'A', 'B'])
df
Now I need to create a new column called 'C' based on the following conditions:
1) For the first row of each 'flag' group, 'C' = 'A'.
2) Otherwise, if 'A' >= the previous row's 'C', then 'C' = 'A'; else 'C' = the previous row's 'C'.
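To make the rule concrete, here is a minimal sketch of the row-by-row logic, written as a plain loop. It assumes that rows belonging to the same 'flag' are contiguous, as in the sample data; the names `rows`, `c_values`, `prev_flag` and `prev_c` are only illustrative.

```python
import pandas as pd

rows = [['A', 1, 3], ['A', 1, 2], ['A', 0, 1], ['A', 0, 1], ['A', 5, 6],
        ['B', 0, 5], ['B', 1, 9], ['B', 1, 2], ['B', 1, 1]]
df = pd.DataFrame(rows, columns=['flag', 'A', 'B'])

c_values = []
prev_flag, prev_c = None, None
for _, row in df.iterrows():
    if row['flag'] != prev_flag:
        # Rule 1: first row of a new 'flag' group takes the value of 'A'
        c = row['A']
    elif row['A'] >= prev_c:
        # Rule 2: 'A' is at least the previous 'C', so 'C' becomes 'A'
        c = row['A']
    else:
        # Rule 2 (else branch): carry the previous 'C' forward
        c = prev_c
    c_values.append(c)
    prev_flag, prev_c = row['flag'], c

df['C'] = c_values
print(df)
```

This is the slow version: it visits every row in Python, which is what I want to avoid on the full dataset.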
Below is my expected output:
  flag  A  B  C
0    A  1  3  1
1    A  1  2  1
2    A  0  1  1
3    A  0  1  1
4    A  5  6  5
5    B  0  5  0
6    B  1  9  1
7    B  1  2  1
8    B  1  1  1
I can do it using iterrows, but I need an efficient, vectorized way of doing this because my dataset is huge.
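One observation that may help: taken together, the two rules appear to describe a running maximum of 'A' within each 'flag' group, since 'C' only changes when 'A' reaches or exceeds the largest value seen so far in the group. If that reading is right, a possible vectorized sketch uses pandas' group-wise `cummax`. This is offered as a candidate, not a confirmed solution, and `rows` is just an illustrative name.

```python
import pandas as pd

rows = [['A', 1, 3], ['A', 1, 2], ['A', 0, 1], ['A', 0, 1], ['A', 5, 6],
        ['B', 0, 5], ['B', 1, 9], ['B', 1, 2], ['B', 1, 1]]
df = pd.DataFrame(rows, columns=['flag', 'A', 'B'])

# Running maximum of 'A', restarted at the first row of each 'flag' group.
# The result is aligned to the original index, so it can be assigned directly.
df['C'] = df.groupby('flag')['A'].cummax()
print(df)
```

On the sample data this produces 1, 1, 1, 1, 5 for group A and 0, 1, 1, 1 for group B, matching the expected output above. Unlike a loop that compares against the previous row's flag, the groupby version does not require rows of the same group to be adjacent.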