I have a DataFrame built from the string A below, and I want to find the rows whose first 3 columns have the same values as an earlier row.
import pandas as pd
import io
import numpy as np
import datetime
A= """
c0 c1 c2 c3 c4
0 1 a d 3 4
1 1 a c 0 0
2 1 a d 3 1
3 1 b c 0 0
4 2 b d 8 5
5 2 b d 3 3
"""
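# Whitespace-separated input; the unnamed leading column becomes the index
df = pd.read_csv(io.StringIO(A), sep=r'\s+')
# Keep only the three columns that should be compared
df2 = df[['c0', 'c1', 'c2']]
for i in range(len(df2) - 1):
    row1 = df2.iloc[i]      # irow() is gone from current pandas; iloc replaces it
    row2 = df2.iloc[i + 1]
    # True only when all three columns match the following row
    same = (row1 == row2).all()
    if same:
        print(i + 1)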
I want it to print 2, 5, since row 2 repeats row 0 and row 5 repeats row 4 in those columns.
The loop above does not work, and even if it did, it would only compare rows that follow each other, so it would miss matching rows that are not adjacent.
Alternatively, I tried np.unique(df2) to see whether the number of columns differs from that of df2, which didn't work either.
Any help is appreciated.
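For reference, here is a minimal sketch of one way this could be approached, assuming pandas' `DataFrame.duplicated` with its `subset` argument matches the goal described above (flagging rows whose first three columns repeat an earlier row). The frame is rebuilt inline so the snippet runs on its own, and the names `key_cols` and `repeat_idx` are illustrative, not part of the original code.

```python
import pandas as pd

# Same data as in the question, built directly instead of parsed from text
df = pd.DataFrame({
    "c0": [1, 1, 1, 1, 2, 2],
    "c1": ["a", "a", "a", "b", "b", "b"],
    "c2": ["d", "c", "d", "c", "d", "d"],
    "c3": [3, 0, 3, 0, 8, 3],
    "c4": [4, 0, 1, 0, 5, 3],
})

# Columns that define "the same row" (illustrative name)
key_cols = ["c0", "c1", "c2"]

# True for every row whose key columns already appeared in an earlier row;
# keep="first" leaves the first occurrence unflagged
mask = df.duplicated(subset=key_cols, keep="first")

# Index labels of the repeated rows
repeat_idx = df.index[mask].tolist()
print(repeat_idx)  # [2, 5]
```

Unlike the adjacent-row loop, this compares each row against all earlier rows, so it also catches the pair 0 and 2. Passing `keep=False` instead would flag every member of a repeated group, including rows 0 and 4.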