I have the following dataframe:
import pandas as pd
df = pd.DataFrame({"A":['a', 's', 'd', 'f', 'g', 'h', 'j', 'k', 'l'], "M":[11,4,9,2,2,5,5,6,6]})
My goal is to remove every row whose value in column M differs from the values in its neighbouring rows.
Therefore rows 0, 1 and 2 should be removed, because their M values differ from their neighbours (11 != 4, 4 != 9 and 9 != 2). However, if two consecutive rows have the same value, they must be kept: rows 3 and 4 must be kept because they both have the value 2. The same reasoning applies to rows 5 and 6, which both have the value 5.
I was able to reach my goal by using the following lines of code:
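l = []
for i, row in df.iterrows():
    try:
        # flag row i when its M differs from both the next and the previous row
        if df["M"].iloc[i] != df["M"].iloc[i+1] and df["M"].iloc[i] != df["M"].iloc[i-1]:
            l.append(i)
    except IndexError:
        # iloc[i+1] fails on the last row, so that row is never flagged
        pass
df = df.drop(df.index[l]).reset_index(drop=True)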
Can you suggest a smarter and better way to achieve my goal, maybe by using a built-in pandas function?
Here is what the dataframe should look like:
Before:
   A   M
0  a  11 <----Must be removed
1  s   4 <----Must be removed
2  d   9 <----Must be removed
3  f   2
4  g   2
5  h   5
6  j   5
7  k   6
8  l   6
After:
   A  M
0  f  2
1  g  2
2  h  5
3  j  5
4  k  6
5  l  6
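
For reference, the rule described above (keep a row only if its M equals the M of the row before or after it) could probably be expressed without a loop by comparing the column with shifted copies of itself. This is only a minimal sketch of that idea, not necessarily the best approach; the names `m`, `has_equal_neighbor` and `result` are made up for illustration. Unlike the loop, it treats the first and last rows as having a single neighbour instead of wrapping around or skipping them.

```python
import pandas as pd

df = pd.DataFrame({"A": ['a', 's', 'd', 'f', 'g', 'h', 'j', 'k', 'l'],
                   "M": [11, 4, 9, 2, 2, 5, 5, 6, 6]})

m = df["M"]

# m.shift() moves values down one row, m.shift(-1) moves them up one row.
# The gaps at the edges are filled with NaN, which never compares equal,
# so the first and last rows are only checked against their one real neighbour.
has_equal_neighbor = m.eq(m.shift()) | m.eq(m.shift(-1))

# Keep only rows that match at least one neighbour, then renumber the index.
result = df[has_equal_neighbor].reset_index(drop=True)
print(result)
```

On the sample data this keeps the rows with A equal to f, g, h, j, k and l, which matches the "After" table.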