I have a CSV file structured like this:
As you can see, many rows are repeated (they represent the same entity), with the 'category' attribute being the only difference between them. I would like to merge those rows and collect all the categories into a single value.
For example the attribute 'category' for Walmart should be: "Retail, Dowjones, SuperMarketChains".
Edit:
I would like the output table to be structured like this:
Edit 2:
What worked for me was:
df4.groupby(["ID azienda","Name","Company code", "Marketcap", "Share price", "Earnings", "Revenue", "Shares", "Employees"]
)['Category'].agg(list).reset_index()
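Note that `agg(list)` produces a Python list per group, whereas the target described above is a single comma-separated string. A minimal sketch of that variant follows, assuming every `Category` value is a string; the sample rows, the reduced column set, and the names `group_cols` and `merged` are made up for illustration and are not from my real data.

```python
import pandas as pd

# Hypothetical sample data; only a few of the real columns are reproduced here.
df = pd.DataFrame({
    "Name": ["Walmart", "Walmart", "Walmart", "Apple"],
    "Company code": ["WMT", "WMT", "WMT", "AAPL"],
    "Category": ["Retail", "Dowjones", "SuperMarketChains", "Tech"],
})

# Every column that is identical across the repeated rows goes into the key.
group_cols = ["Name", "Company code"]

merged = (
    df.groupby(group_cols, sort=False)["Category"]
      .agg(", ".join)   # join each group's categories into one string
      .reset_index()
)
print(merged)
```

One thing to watch: by default `groupby` drops rows whose key columns contain NaN, so if columns such as "Earnings" can be missing, passing `dropna=False` (pandas 1.1 or later) may be needed to keep those companies.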


pivot
df.groupby(grp_by_cols)['Category'].agg(list).reset_index()? where grp_by_cols is a list of column names: ['ID', 'Name', 'Company code', . . . ]
Or you can groupby the id column, transform and drop duplicates.
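The "groupby the id column, transform and drop duplicates" route could look roughly like the sketch below. It assumes an `ID` column uniquely identifies a company and that all other columns are identical within an ID; the sample rows and the name `out` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data with an ID column identifying each company.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2],
    "Name": ["Walmart", "Walmart", "Walmart", "Apple"],
    "Category": ["Retail", "Dowjones", "SuperMarketChains", "Tech"],
})

out = df.copy()

# transform broadcasts the joined string back onto every row of its group,
# so all other columns are kept without listing them in the groupby key.
out["Category"] = out.groupby("ID")["Category"].transform(lambda s: ", ".join(s))

# Rows within an ID are now identical, so keep one row per ID.
out = out.drop_duplicates(subset="ID").reset_index(drop=True)
print(out)
```

Compared with grouping on every column, this avoids spelling out the full column list, at the cost of relying on the ID alone to define a group.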