Python pandas - impossible to delete duplicates

Question

I have a CSV file with duplicates that occur only in the column named "file". I wrote the following lines:

df = pd.read_csv(path_to_file, encoding='utf-8', sep=',')
df.drop_duplicates(subset="Fichier",keep='first',inplace=True)

But it doesn't work. I even tried to do it in Excel, but that doesn't work either.

Many thanks in advance!!

You can visit on enter link description here

– Hrushi, Commented Jun 16, 2022 at 9:41

Louis Chabert · Accepted Answer · 2022-06-16 11:14:30Z

You can try this; it works for me:

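import pandas as pd

# In my case
metadata = pd.read_csv('CSV/data_full.csv', low_memory=False)

# Series of row labels indexed by the 'Fichier' column.
# Note: drop_duplicates() compares the Series values (the row labels), not the 'Fichier' index.
myresult = pd.Series(metadata.index, index=metadata['Fichier']).drop_duplicates()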
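One possible reason `drop_duplicates` appears to do nothing is that the "duplicates" are not exactly equal strings, for example because of trailing spaces or different letter case. That is an assumption, since the question does not show the data. Below is a minimal sketch under that assumption; the sample rows, the `valeur` column and the `key` variable are hypothetical and only there for illustration.

```python
import io
import pandas as pd

# Hypothetical sample standing in for the CSV file from the question.
csv_text = "Fichier,valeur\na.txt,1\na.txt ,2\nA.txt,3\nb.txt,4\n"
df = pd.read_csv(io.StringIO(csv_text), sep=",")

# Exact-match de-duplication, as in the question. Nothing is dropped here,
# because "a.txt", "a.txt " and "A.txt" are three different strings.
exact = df.drop_duplicates(subset="Fichier", keep="first")

# Build a normalised key (whitespace stripped, lower-cased) and keep only
# the first row for each key value.
key = df["Fichier"].astype(str).str.strip().str.lower()
deduped = df.loc[~key.duplicated(keep="first")].reset_index(drop=True)
```

If the values really are identical, the exact-match call from the question should already be enough, and the normalisation step is unnecessary.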

edited Jun 16, 2022 at 11:14

answered Jun 16, 2022 at 9:39


5 Comments

Balkhrod Over a year ago

Thank you, but I don't understand what metadata is. I have a CSV file.

Louis Chabert Over a year ago

For sure, my bad, I will edit my post.

Balkhrod Over a year ago

Thank you, but it doesn't work, and I don't know why.

Louis Chabert Over a year ago

Is your column name file or Fichier?

Balkhrod Over a year ago

The name of my column is Fichier.
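The column-name question raised in the comments can be checked directly in pandas. Here is a rough sketch, assuming a small hypothetical sample in place of the real file (the `valeur` column and the variable names are illustrative only):

```python
import io
import pandas as pd

# Hypothetical sample standing in for the real CSV file.
csv_text = "Fichier,valeur\na.txt,1\na.txt,2\nb.txt,3\n"
df = pd.read_csv(io.StringIO(csv_text), sep=",")

# Column names exactly as pandas read them; repr() exposes stray spaces
# or other invisible characters in a header.
column_names = [repr(c) for c in df.columns]

# Number of rows that repeat an earlier "Fichier" value.
n_duplicates = int(df.duplicated(subset="Fichier", keep="first").sum())

# drop_duplicates returns a new DataFrame; the file on disk is untouched.
deduped = df.drop_duplicates(subset="Fichier", keep="first")
```

Note that de-duplicating the DataFrame does not change the CSV on disk. The result only reaches the file if it is written back, for example with `DataFrame.to_csv`.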
