I have a dataframe like this:
name food
mike pizza
mike cookie
mike banana
mary apple
mary pear
jane broccoli
I want to add a sequential integer column whose value is unique per name, like this:
id name food
1 mike pizza
1 mike cookie
1 mike banana
2 mary apple
2 mary pear
3 jane broccoli
Is there an elegant pandas one- (or two-) liner to do that? I'm new to pandas and suspect there's a way to do it using some combination of groupby and lambda, but I'm not making any progress.
df["name"].astype("category").cat.codes

df.groupby('name', sort=False).ngroup() + 1 is likely what you want. It's unique per name, and the counter follows the order in which each name first appears in the DataFrame, not any lexicographical sorting.
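For comparison, here is a minimal sketch of both suggestions run on the sample data from the question. The variable name `codes` and the use of `df.insert` to put `id` first are illustrative choices, not something the answers specify.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "name": ["mike", "mike", "mike", "mary", "mary", "jane"],
    "food": ["pizza", "cookie", "banana", "apple", "pear", "broccoli"],
})

# ngroup() numbers the groups from 0; with sort=False the numbering follows
# the order in which each name first appears, so add 1 to start at 1.
df.insert(0, "id", df.groupby("name", sort=False).ngroup() + 1)
print(df)

# Category codes are also unique per name, but they are 0-based and follow
# the sorted category order (jane, mary, mike), not the order of appearance.
codes = df["name"].astype("category").cat.codes
print(codes.tolist())
```

On this data the `ngroup` version reproduces the `id` column asked for in the question (1 for mike, 2 for mary, 3 for jane), while the category codes come out as 2 for mike, 1 for mary and 0 for jane, so they would need renumbering if the order of appearance matters.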