Using Python and pandas, I want to find all columns whose value is the same in every row of a data frame and move them to another data frame. For example, I might have:
cats, tigers, 3.5, 1, cars, 2, 5
cats, tigers, 3.5, 6, 7.2, 22.6, 5
cats, tigers, 3.5, test, 2.6, 99, 52.3
I want the repeated values (cats, tigers, 3.5) in one data frame:
cats, tigers, 3.5
and the remaining columns in another data frame:
1, cars, 2, 5
6, 7.2, 22.6, 5
test, 2.6, 99, 52.3
The code should check every column for repeated values and remove only those columns in which the same value occurs in every row.
- In some cases, none of the columns have repeats.
- Sometimes more than just the first three columns have repeats. It should check all of the columns, because repeats can occur in any column.
How could I do this?
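
One possible approach, as a rough sketch rather than a definitive answer: count the distinct values in each column with `DataFrame.nunique` and split on whether that count is 1. The function name `split_constant_columns` is made up for illustration, the example frame uses default integer column labels, and treating NaN as an ordinary value (`dropna=False`) is an assumption about the desired behaviour.

```python
import pandas as pd


def split_constant_columns(df):
    """Return (constant, rest): columns with one repeated value, and all others.

    Hypothetical helper for illustration only.
    """
    # Number of distinct values per column; NaN is counted as a value (assumption).
    is_constant = df.nunique(dropna=False) == 1
    # Keep a single row for the constant columns, since every row is identical.
    constant = df.loc[:, is_constant].iloc[:1]
    # Everything else keeps all of its rows.
    rest = df.loc[:, ~is_constant]
    return constant, rest


# The example data from the question, with default column labels 0..6.
df = pd.DataFrame([
    ["cats", "tigers", 3.5, 1, "cars", 2, 5],
    ["cats", "tigers", 3.5, 6, 7.2, 22.6, 5],
    ["cats", "tigers", 3.5, "test", 2.6, 99, 52.3],
])

constant, rest = split_constant_columns(df)
print(constant)
print(rest)
```

Because every column is checked, it does not matter where the repeated columns sit. If no column repeats, `constant` simply has zero columns and `rest` is the whole frame. Note that a frame with a single row would count every column as constant under this rule.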