I have a Python dictionary as follows:
ref_dict = {
    "Company1": ["C1_Dev1", "C1_Dev2", "C1_Dev3", "C1_Dev4", "C1_Dev5"],
    "Company2": ["C2_Dev1", "C2_Dev2", "C2_Dev3", "C2_Dev4", "C2_Dev5"],
    "Company3": ["C3_Dev1", "C3_Dev2", "C3_Dev3", "C3_Dev4", "C3_Dev5"],
}
I have a pandas DataFrame called df, one of whose columns looks like this:
DESC_DETAIL
0 Probably task Company2 C2_Dev5
1 File system C3_Dev1
2 Weather subcutaneous Company2
3 Company1 Travesty C1_Dev3
4 Does not match anything
...........
My goal is to add two extra columns to this data frame, named COMPANY and DEVICE. The value in each row of the COMPANY column should be the company key from the dictionary if either that key or one of its devices appears in the DESC_DETAIL column. The value in the DEVICE column should simply be the device string found in the DESC_DETAIL column. If no match is found, the corresponding cell should be empty (NaN). Hence the final output should look like this:
   DESC_DETAIL                     COMPANY   DEVICE
0  Probably task Company2 C2_Dev5  Company2  C2_Dev5
1  File system C3_Dev1             Company3  C3_Dev1
2  Weather subcutaneous Company2   Company2  NaN
3  Company1 Travesty C1_Dev3       Company1  C1_Dev3
4  Does not match anything         NaN       NaN
My attempt:
for key, value in ref_dict.items():
    df['COMPANY'] = df.apply(lambda row: key if row['DESC_DETAIL'].isin(key) else Nan, axis=1)
This is obviously wrong and does not work. How do I make it work?
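The attempt fails for three separate reasons: row['DESC_DETAIL'] is a plain str, which has no .isin method (that belongs to Series and DataFrame); Nan is not a defined name; and each pass of the loop overwrites the whole COMPANY column, so only the last key would survive. One possible approach, shown here as a minimal sketch rather than the definitive solution, is to extract the names with a regular expression and fall back to a device-to-company lookup. It assumes that company and device names appear as whole words, that at most the first match per row matters, and that a company named in the text takes priority over one inferred from a device. The names device_to_company and build_pattern are hypothetical helpers introduced for illustration.

```python
import re

import pandas as pd

ref_dict = {
    "Company1": ["C1_Dev1", "C1_Dev2", "C1_Dev3", "C1_Dev4", "C1_Dev5"],
    "Company2": ["C2_Dev1", "C2_Dev2", "C2_Dev3", "C2_Dev4", "C2_Dev5"],
    "Company3": ["C3_Dev1", "C3_Dev2", "C3_Dev3", "C3_Dev4", "C3_Dev5"],
}

# Sample frame rebuilt from the rows shown in the question.
df = pd.DataFrame({
    "DESC_DETAIL": [
        "Probably task Company2 C2_Dev5",
        "File system C3_Dev1",
        "Weather subcutaneous Company2",
        "Company1 Travesty C1_Dev3",
        "Does not match anything",
    ]
})

# Reverse lookup (hypothetical helper name): device -> owning company.
device_to_company = {
    device: company
    for company, devices in ref_dict.items()
    for device in devices
}


def build_pattern(words):
    """Build a regex with one capture group matching any of `words` as a whole word."""
    return r"\b(" + "|".join(re.escape(word) for word in words) + r")\b"


# First device name found in each row, NaN where there is none.
df["DEVICE"] = df["DESC_DETAIL"].str.extract(
    build_pattern(device_to_company), expand=False
)

# First company name found in each row, NaN where there is none.
company_in_text = df["DESC_DETAIL"].str.extract(
    build_pattern(ref_dict), expand=False
)

# Prefer the company named in the text; otherwise infer it from the device.
df["COMPANY"] = company_in_text.fillna(df["DEVICE"].map(device_to_company))

# Match the column order of the desired output.
df = df[["DESC_DETAIL", "COMPANY", "DEVICE"]]
print(df)
```

Building the reverse dictionary once avoids looping over ref_dict for every row, and str.extract leaves NaN in rows without a match, which gives the empty cells asked for.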