I have a CSV file with duplicates that occur only in the column named "file". I wrote the following lines:

import pandas as pd

df = pd.read_csv(path_to_file, encoding='utf-8', sep=',')
df.drop_duplicates(subset="Fichier", keep='first', inplace=True)

But it doesn't work. I even tried to do it in Excel, but that doesn't work either.

Many thanks in advance!!

1 Answer

You can try this; it works for me:

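import pandas as pd

# In my case, the CSV is loaded into a DataFrame called metadata
metadata = pd.read_csv('CSV/data_full.csv', low_memory=False)

# Series of row labels indexed by 'Fichier'; note that drop_duplicates()
# here compares the Series values (the row labels), not the 'Fichier' index
myresult = pd.Series(metadata.index, index=metadata['Fichier']).drop_duplicates()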
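For comparison, here is a minimal sketch of de-duplicating directly on the Fichier column. The sample data, the `valeur` column, and the output file name are hypothetical. It also assumes, which the thread does not confirm, that some "duplicates" differ only by surrounding whitespace or letter case; in that situation `drop_duplicates` treats them as distinct values, so a normalised key may be needed.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the real CSV file
csv_text = "Fichier,valeur\na.txt,1\nb.txt,2\na.txt ,3\nA.txt,4\nb.txt,5\n"
df = pd.read_csv(io.StringIO(csv_text))

# Exact-match de-duplication: "a.txt ", "A.txt" and "a.txt" count as different
deduped = df.drop_duplicates(subset="Fichier", keep="first")

# Assumed normalisation: ignore surrounding whitespace and letter case
key = df["Fichier"].str.strip().str.lower()
cleaned = df[~key.duplicated(keep="first")]

# The source file is never modified by pandas; write the result out explicitly.
# cleaned.to_csv("deduplicated.csv", index=False)  # hypothetical output path
```

Whether normalisation is appropriate depends on the data. If the values really are identical, the original `drop_duplicates(subset="Fichier", ...)` call should already remove them, and the remaining step is saving the DataFrame back to disk with `to_csv`.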

5 Comments

Thank you, but I don't understand what metadata is. I have a CSV file.
Of course, my bad. I will edit my post.
Thank you, but it doesn't work, and I don't know why.
Is your column named file or Fichier?
The name of my column is Fichier.
