This is my first question on StackOverflow, so I apologize if the formatting isn't perfect.
I've concatenated multiple dataframes, and now I can't figure out how to create a new column, df["Population"], based on values from other columns: df["2013 Pop"], df["2014 Pop"], etc. For example, if the event occurred in 2014, meaning df["Year"] == 2014, I want to take the population from the df["2014 Pop"] column and put it into the new df["Population"] column. I know I'm explaining this badly; I'm just frustrated over something I feel I should be able to do easily. Here's a summary of the dataframe and what I've tried so far.
d = {
    "Year" : [2013,2014,2015...],
    "State" : ["Louisiana", "Texas", "California"... ],
    "City" : ["New Orleans", "Dallas", "Sacramento"...],
    "Number Killed" : [4,6,2,4],
    "Safety Grade" : ["A", "B", "C", "D"...],
    "2013 Pop" : [421329, 232321, 2454543....],
    "2014 Pop" : [454545, 655654, 3421342....],
    "2015 Pop" : [142314, 454355, 4324323....],
    "Incident Date(datetime dtype)" : [12-29-2014, 3-12-2017...]
}
df = pd.DataFrame(d)
I've tried map, loc, and apply, and I just can't find a solution. I think I'm on the right track with defining a function with conditionals, but it raises an error.
def categorise(row):
    if row["Year"] == 2014:
        return df["2014 Pop"]
    elif row["Year"] == 2015:
        return df["2015 Pop"]
    elif row["Year"] == 2016:
        return df["2016 Pop"]
    elif row["Year"] == 2017:
        return df["2017 Pop"]
    else:
        return "NONE"
When I try this:
df["Population"] = df.apply(lambda row : categorise(row), axis = 1)
I get a ValueError: "Wrong number of items passed 3609, placement implies 1" (3609 is the length of the df).
Does anyone have a suggestion for how to create the df["Population"] column based on my poorly worded question?
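For concreteness, here is a minimal sketch of the result I'm after, on a toy frame. All values are made up, and `population_for_year` is a hypothetical helper name. It reads the value from the row itself rather than from df, on the assumption that returning a whole column such as df["2014 Pop"] for every row is what triggers the error above; treat it as a sketch, not a confirmed fix.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real frame; every value here is made up for illustration.
df = pd.DataFrame({
    "Year": [2013, 2014, 2015, 2016],
    "City": ["New Orleans", "Dallas", "Sacramento", "Dallas"],
    "2013 Pop": [100, 200, 300, 200],
    "2014 Pop": [110, 210, 310, 210],
    "2015 Pop": [120, 220, 320, 220],
})

def population_for_year(row):
    # Build the column name from the row's own year, e.g. "2014 Pop".
    col = f"{int(row['Year'])} Pop"
    # Return a single value from this row (not a whole df column);
    # fall back to NaN when there is no population column for that year.
    return row[col] if col in row.index else np.nan

df["Population"] = df.apply(population_for_year, axis=1)
```

On the toy frame, the 2016 row has no matching "2016 Pop" column, so it gets NaN instead of the string "NONE", which keeps the new column numeric.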