I'm trying to create one new column per fish species, with the count as its integer value, while keeping the original index so I can join the result back onto the dataframe afterwards.
import pandas as pd
df = pd.read_csv("fishCounts.csv",index_col=0)
countsdf = df[["Fish Count"]].copy()
countsdf.head()
Fish Count
0 38 Sand Bass, 16 Sculpin, 10 Blacksmith
1 138 Sculpin, 28 Sand Bass
2 150 Sculpin Released, 102 Sculpin, 40 Sanddab
3 156 Sculpin, 29 Sand Bass, 5 Black Croaker, 3 ...
4 161 Sculpin
countsdf.columns = ["fish"]
# Split each row's string into a list of "<count> <species>" items
countsdf["fish"] = countsdf["fish"].str.split(", ")
countsdf.head()
fish
0 [38 Sand Bass, 16 Sculpin, 10 Blacksmith]
1 [138 Sculpin, 28 Sand Bass]
2 [150 Sculpin Released, 102 Sculpin, 40 Sanddab]
3 [156 Sculpin, 29 Sand Bass, 5 Black Croaker, 3...
4 [161 Sculpin]
This is where I'm stuck. Should I iterate through the dataframe rows? Build a list of dictionaries? Or could I have imported the data differently to make this easier?
Edit: This is what I'm trying to get to.
   Sand Bass  Sculpin  Blacksmith  Sculpin Released  Sanddab  Black Croaker
0         38       16          10
1         28      138
2                 102                           150       40
3         29      156                                                    5
4                 161
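For reference, here is a minimal sketch of the list-of-dictionaries idea. It assumes every cell is a non-empty string of comma-separated "<count> <species>" items. The sample frame and the helper `parse_counts` are made up for illustration, and it works on the unsplit strings, so it would take the place of the `str.split` step rather than follow it.

```python
import pandas as pd

# Hypothetical stand-in for the "Fish Count" column of fishCounts.csv
countsdf = pd.DataFrame(
    {
        "fish": [
            "38 Sand Bass, 16 Sculpin, 10 Blacksmith",
            "138 Sculpin, 28 Sand Bass",
            "150 Sculpin Released, 102 Sculpin, 40 Sanddab",
            "161 Sculpin",
        ]
    }
)


def parse_counts(cell):
    """Turn '38 Sand Bass, 16 Sculpin' into {'Sand Bass': 38, 'Sculpin': 16}."""
    counts = {}
    for item in cell.split(", "):
        # Split on the first space only, so multi-word species names stay intact
        number, species = item.split(" ", 1)
        counts[species] = int(number)
    return counts


# One dictionary per row; species missing from a row simply have no key
records = [parse_counts(cell) for cell in countsdf["fish"]]

# Reuse the original index so the result can be joined back later
wide = pd.DataFrame(records, index=countsdf.index)

# Missing species come out as NaN (float); the nullable Int64 dtype keeps
# the counts as integers and shows the gaps as <NA>
wide = wide.astype("Int64")
print(wide)
```

Because `wide` reuses the index of `countsdf`, something like `df.join(wide)` should line the counts up with the original rows.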