I am analyzing a data set that is similar in shape to the following example. I have two different types of data (abc data and xyz data):
   abc1  abc2  abc3  xyz1  xyz2  xyz3
0     1     2     2     2     1     2
1     2     1     1     2     1     1
2     2     2     1     2     2     2
3     1     2     1     1     1     1
4     1     1     2     1     2     1
I want to create a function that adds a category column for each abc column present in the dataframe (called df3 below). Using lists of column names and a category mapping dictionary, I was able to get my desired result:
abc_columns = ['abc1', 'abc2', 'abc3']
xyz_columns = ['xyz1', 'xyz2', 'xyz3']
abc_category_columns = ['abc1_category', 'abc2_category', 'abc3_category']
categories = {1: 'Good', 2: 'Bad', 3: 'Ugly'}
# Map each abc column to its category label and store it in the matching *_category column
for i in range(len(abc_category_columns)):
    df3[abc_category_columns[i]] = df3[abc_columns[i]].map(categories)
print(df3)
The end result:
   abc1  abc2  abc3  xyz1  xyz2  xyz3 abc1_category abc2_category abc3_category
0     1     2     2     2     1     2          Good           Bad           Bad
1     2     1     1     2     1     1           Bad          Good          Good
2     2     2     1     2     2     2           Bad           Bad          Good
3     1     2     1     1     1     1          Good           Bad          Good
4     1     1     2     1     2     1          Good          Good           Bad
While the for loop works fine, I feel like I should be using a Python lambda function instead, but I can't figure out how.
Is there a more efficient way to map a dynamic number of abc-type columns?
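For reference, here is a self-contained sketch of the kind of function I have in mind. It rebuilds the example dataframe from the table above and finds the abc columns by name prefix instead of a hard-coded list. The helper name add_category_columns and its parameters are made up for illustration, and it assumes all column names are strings and that matching on the prefix is a reliable way to find the abc columns.

```python
import pandas as pd

# Sample data copied from the example table in the question.
df3 = pd.DataFrame({
    'abc1': [1, 2, 2, 1, 1],
    'abc2': [2, 1, 2, 2, 1],
    'abc3': [2, 1, 1, 1, 2],
    'xyz1': [2, 2, 2, 1, 1],
    'xyz2': [1, 1, 2, 1, 2],
    'xyz3': [2, 1, 2, 1, 1],
})
categories = {1: 'Good', 2: 'Bad', 3: 'Ugly'}


def add_category_columns(df, prefix, mapping):
    """Hypothetical helper: return a copy of df with one '<col>_category'
    column for every column whose name starts with `prefix`."""
    # Assumption: column names are strings, so str.startswith is safe here.
    source_columns = [c for c in df.columns if c.startswith(prefix)]
    # Apply the mapping to each selected column (one Series at a time).
    mapped = df[source_columns].apply(lambda col: col.map(mapping))
    # Rename abc1 -> abc1_category etc. and attach to the original columns.
    return df.join(mapped.add_suffix('_category'))


result = add_category_columns(df3, 'abc', categories)
print(result)
```

This should give the same nine columns as the loop version, without having to maintain the abc_columns and abc_category_columns lists by hand. It is not necessarily faster, since map is still called once per column.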