Let's say I have a DataFrame (sorted by some priority criterion) with a "name" column. A few names are duplicated, and I want to append a simple indicator to the duplicates.
E.g.,
'jones a'
...
'jones a' # this should become 'jones a2'
To get the subset of duplicates, I could do
df.loc[df.duplicated(subset=['name'], take_last=True), 'name']
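As a quick check of which rows that mask selects, here is a small sketch on made-up data (the three names below are illustrative, not from my real DataFrame). It assumes a newer pandas version, where, as far as I know, the take_last=True argument was replaced by keep='last'. Note that keep='last' flags the earlier occurrences, while the default keep='first' flags the later ones.

```python
import pandas as pd

# Made-up example data, only for illustration.
df = pd.DataFrame({"name": ["jones a", "smith b", "jones a"]})

# keep="first" (the default) flags every occurrence after the first one.
mask_later = df.duplicated(subset=["name"], keep="first")

# keep="last" flags every occurrence before the last one; in newer pandas
# this is the spelling that corresponds to take_last=True.
mask_earlier = df.duplicated(subset=["name"], keep="last")

print(mask_later.tolist())    # [False, False, True]
print(mask_earlier.tolist())  # [True, False, False]
```

So if the goal is to rename the second 'jones a' rather than the first, the default keep='first' mask may be the one that is actually needed.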
However, I think the apply function does not allow for inplace modification, right? So what I basically ended up doing is:
df.loc[df.duplicated(subset=['name'], take_last=True), 'name'] = \
df.loc[df.duplicated(subset=['name'], take_last=True), 'name'].apply(lambda x: x+'2')
But my feeling is that there might be a better way. Any ideas or tips? I would really appreciate your feedback!
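One possible simplification, sketched under a few assumptions: a newer pandas version, made-up example names, and the default keep='first' so that the later occurrences are the ones renamed. Plain string concatenation on the selected rows avoids apply entirely, and groupby(...).cumcount() is one way to number the repeats if a name can appear more than twice.

```python
import pandas as pd

# Made-up example data, only for illustration.
df = pd.DataFrame({"name": ["jones a", "smith b", "jones a"]})

# Simple case: append "2" to every repeated name (all but the first occurrence).
mask = df.duplicated(subset=["name"])
df.loc[mask, "name"] += "2"
print(df["name"].tolist())  # ['jones a', 'smith b', 'jones a2']

# Variant for names that appear more than twice: number each repeat.
df2 = pd.DataFrame({"name": ["jones a", "smith b", "jones a", "jones a"]})

# 0 for the first occurrence of a name, 1 for the second, and so on.
occurrence = df2.groupby("name").cumcount()
is_repeat = occurrence > 0

# Append the occurrence number (starting at 2) to the repeated names only.
suffix = (occurrence[is_repeat] + 1).astype(str)
df2.loc[is_repeat, "name"] = df2.loc[is_repeat, "name"] + suffix
print(df2["name"].tolist())  # ['jones a', 'smith b', 'jones a2', 'jones a3']
```

Both versions compute the mask once and assign through .loc, so the duplicated(...) call does not have to be repeated on each side of the assignment.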
= with df.name.duplicated(take_last=True).apply...