I have a .csv file with multiple duplicate data entries.
Example of entries when viewed in notepad:
"Tom 1234"
"Andrew 4321"
I would like to extract the duplicate entries into another .csv file, along with their line numbers. The expected output would look something like this:
Using
import pandas as pd
df = pd.read_csv('sample_dup.csv')
df[df.duplicated(subset=None, keep=False)].to_csv('dups.csv')
I managed to export this,
But my expected result is supposed to be this,
This is the data file in question
Why does the first entry keep appearing at the top of the list, and why is the numbering incorrect?
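For reference, here is a minimal sketch of one possible cause, assuming the file has no header row. By default `pd.read_csv` treats the first line as column names, so the first entry would end up as the header of `dups.csv`, and the 0-based index would no longer match the line numbers shown in a text editor. The inline sample data and the column name `entry` are made up for illustration and are not taken from the actual file.

```python
import io
import pandas as pd

# Hypothetical stand-in for sample_dup.csv: no header row, one quoted field per line.
sample = io.StringIO(
    '"Tom 1234"\n"Andrew 4321"\n"Tom 1234"\n"Sam 1111"\n"Andrew 4321"\n'
)

# header=None stops pandas from consuming the first line as column names;
# "entry" is an illustrative column name, not one from the original file.
df = pd.read_csv(sample, header=None, names=["entry"])

# Shift the 0-based index so it matches 1-based line numbers in a text editor.
df.index = df.index + 1
df.index.name = "line"

# keep=False marks every occurrence of a duplicated row, not only the repeats.
dups = df[df.duplicated(keep=False)]
print(dups)

# To write the result out, something like dups.to_csv("dups.csv") should work.
```

If the real file does have a header row, only the index shift would be needed, and the offset would be 2 rather than 1.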



