I'm working with a dataset available here: https://www.kaggle.com/datasets/lehaknarnauli/spotify-datasets?select=artists.csv. I want to extract the first element of each array in the genres column. For example, given ['pop', 'rock'], I'd like to extract 'pop'. I tried several approaches, but none of them work and I don't know why.
Here is my code:
import pandas as pd
df = pd.read_csv('artists.csv')
# approach 1
df['top_genre'] = df['genres'].str[0]
# Error: 'str' object has no attribute 'str'
# approach 2
df = df.assign(top_genre = lambda x: x['genres'].str[0])
# The result is a single bracket '[' in each row. It seems that index 0 refers to the first character of a string, not the first array element.
# approach 3
df['top_genre'] = df['genres'].apply(lambda x: '[]' if not x else x[0])
# The result is a single bracket '[' in each row. It seems that index 0 refers to the first character of a string, not the first array element.
Why don't these approaches work, and how can I make this work?
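
For reference, here is a minimal sketch of one possible direction. It assumes the genres column is read from the CSV as text such as "['pop', 'rock']" rather than as real Python lists, which would explain the '[' results above. The small sample frame and the helper name first_genre are hypothetical stand-ins, not part of the dataset or of pandas.

```python
import ast

import pandas as pd

# Hypothetical stand-in for artists.csv: each genres value is a string
# that merely looks like a list (an assumption about how the CSV is read).
df = pd.DataFrame({"genres": ["['pop', 'rock']", "[]", "['jazz']"]})


def first_genre(raw):
    """Parse the list-like string and return its first item, or None if empty."""
    items = ast.literal_eval(raw)  # safely turns "['pop', 'rock']" into a list
    return items[0] if items else None


df["top_genre"] = df["genres"].apply(first_genre)
print(df)
```

If that assumption holds, indexing with [0] on the raw column picks the first character of each string, while parsing first gives access to the actual list elements.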