I have been tasked with filtering tags from a contact list in order to form calling lists.
The original CSV has a column labeled "Tags" that held no more than 6 values per contact, so I split it into 6 separate columns. There are 75 unique tags across the 6 columns, but a given tag is not tied to a specific column; the order in which tags appear across the columns is random.
However, the person I'm working with wants each contact placed into a single larger grouping while the original tags are preserved. So I decided to create a 7th tag column whose value depends on the individual tags in the other 6 columns. He doesn't much care that it's an exact match to the columns, only that each person with a tag ends up in a single list for calling.
I have been provided with what is basically a set of key-value pairs for the tags, so I know which calling list each tag belongs in.
Normally I would simply run a replace with the key-value pairs to reduce the tags and go from there, but I have to preserve the original tags. I've also worked with numeric columns, where I can bin on something such as age or income bracket. But I'm at a loss as to how to string-match against other columns in the same row. Please let me know if I should be searching for different terms; anything helps.
# the key-value pairs (tag -> calling list), written as a Python dict
tag_to_list = {
    'work': 'list1',
    'hobby': 'list2',
    'family': 'list3',
    'conference': 'list4',
    'extended family': 'list3',
    'high school': 'list5',
    'college': 'list5',
}
# sample dataframe
import pandas as pd

data = [[1, 'family', 'extended family', '', '', '', ''],
        [2, 'college', 'hobby', '', '', '', ''],
        [3, 'college', 'family', 'work', '', '', ''],
        [4, 'conference', '', '', '', '', ''],
        [5, 'hobby', '', '', '', '', ''],
        [6, 'college', '', '', '', '', ''],
        [7, 'college', 'work', 'family', 'high school', 'conference', 'hobby']]
df = pd.DataFrame(data, columns=['contactID', 'tag1', 'tag2', 'tag3', 'tag4', 'tag5', 'tag6'])
df
Here's the sort of output that I'm trying to get:
contactID  tag1        tag2             tag3    tag4         tag5        tag6   call_list
001        family      extended family                                          list3
002        college     hobby                                                    list2
003        college     family           work                                    list1
004        conference                                                           list4
005        hobby                                                                list2
006        college                                                              list5
007        college     work             family  high school  conference  hobby  list2
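
The question does not say which list should win when a contact's tags point to different lists. As a minimal sketch of one possible approach, the code below assumes that the last non-empty tag in the row decides the list, a rule that happens to reproduce the sample output above. The names `tag_to_list`, `tag_cols`, and `pick_list` are illustrative, not part of the original data.

```python
import pandas as pd

# Tag -> calling list lookup, copied from the key-value pairs in the question.
tag_to_list = {
    'work': 'list1',
    'hobby': 'list2',
    'family': 'list3',
    'conference': 'list4',
    'extended family': 'list3',
    'high school': 'list5',
    'college': 'list5',
}

# Sample data from the question.
data = [[1, 'family', 'extended family', '', '', '', ''],
        [2, 'college', 'hobby', '', '', '', ''],
        [3, 'college', 'family', 'work', '', '', ''],
        [4, 'conference', '', '', '', '', ''],
        [5, 'hobby', '', '', '', '', ''],
        [6, 'college', '', '', '', '', ''],
        [7, 'college', 'work', 'family', 'high school', 'conference', 'hobby']]
df = pd.DataFrame(data, columns=['contactID', 'tag1', 'tag2', 'tag3',
                                 'tag4', 'tag5', 'tag6'])

tag_cols = ['tag1', 'tag2', 'tag3', 'tag4', 'tag5', 'tag6']


def pick_list(row):
    """Return the calling list for one row of tag columns.

    Assumption: the last tag that has an entry in tag_to_list wins.
    Empty strings and unknown tags are skipped; rows with no known
    tag get None.
    """
    lists = [tag_to_list[tag] for tag in row if tag in tag_to_list]
    return lists[-1] if lists else None


# The original tag columns are left untouched; only a new column is added.
df['call_list'] = df[tag_cols].apply(pick_list, axis=1)
print(df)
```

If a different rule is wanted (for example, the first tag wins, or a fixed priority among the lists), only the last line of `pick_list` should need to change, since the row-wise lookup stays the same.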