I'm currently using R for data science and I'm learning Python and Pandas to expand my toolkit. I'd like to create a new column of lists within a Pandas dataframe using existing column names and values.
For the following Pandas dataframe:
test1 test2 test3
1 0 1 1
2 0 1 0
3 1 1 1
4 1 0 0
5 0 0 0
A new column would contain a list for each row that takes the column name wherever there's a '1' value, strips the 'test' prefix, and concatenates the list using a '-' separator.
test1 test2 test3 combo
0 0 1 1 2-3
1 0 1 0 2
2 1 1 1 1-2-3
3 1 0 0 1
4 0 0 0
I can create the column in R and data.table using the following code:
df [, combo := apply (df == 1, 1, function(x) {
paste (gsub("test", "", names(which(x))), collapse = "-")
}
)]
This is the closest I've come in Pandas:
def test(x):
paste(loc[x])
df['combo'] = df.apply(test, df == 1, axis = 1)
TypeError: apply() got multiple values for argument 'axis'
Am I on the right path?