I created a dataframe df where I have a column with the following values:
category
20150115_Holiday_HK_Misc
20150115_Holiday_SG_Misc
20140116_DE_ProductFocus
20140116_UK_ProductFocus
I want to create three new columns, A, B and C:
category | A | B | C
20150115_Holiday_HK_Misc | 20150115_Holiday_Misc | HK | Holiday_Misc
20150115_Holiday_SG_Misc | 20150115_Holiday_Misc | SG | Holiday_Misc
20140116_DE_ProductFocus | 20140116_ProductFocus | DE | ProductFocus
20140116_UK_ProductFocus | 20140116_ProductFocus | UK | ProductFocus
In column A, I want to remove the country code (e.g. "_HK"). I think I need to code this manually, which is fine, since I have the list of all country codes.
Column B is that same country code.
Column C is column A without the date at the beginning.
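To make the three rules concrete, here is a minimal sketch of one way they could be expressed with a regular expression. It assumes the country codes always appear as a separate "_XX" token and that the date is always eight leading digits; the names country_codes and pattern are only illustrative.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "category": [
        "20150115_Holiday_HK_Misc",
        "20150115_Holiday_SG_Misc",
        "20140116_DE_ProductFocus",
        "20140116_UK_ProductFocus",
    ]
})

# Assumption: the full list of country codes is known in advance.
country_codes = ["HK", "SG", "DE", "UK"]

# Match "_<code>" only when it is followed by "_" or the end of the string,
# so that e.g. "_DE" inside a longer token is not picked up.
pattern = r"_(" + "|".join(map(re.escape, country_codes)) + r")(?=_|$)"

# B: the country code itself, with a default when no code is found.
df["B"] = df["category"].str.extract(pattern, expand=False).fillna("Not Specified")

# A: the category with the "_<code>" token removed.
df["A"] = df["category"].str.replace(pattern, "", regex=True)

# C: column A without the leading eight-digit date and its underscore.
df["C"] = df["A"].str.replace(r"^\d{8}_", "", regex=True)

print(df[["category", "A", "B", "C"]])
```

Rows without any known country code would keep their category unchanged in A and get "Not Specified" in B.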
I am trying something like this, but not getting far.
df['B'] = np.where(df['category'].str.contains("HK"), 'HK', 'Not Specified')
Thank you
Perhaps with .split(), for example.
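A split-based version might look roughly like the sketch below. It assumes the parts are always separated by underscores and that the first part is always the date; split_category and country_codes are hypothetical names used only for illustration.

```python
import pandas as pd

# Assumption: the full set of country codes is known in advance.
country_codes = {"HK", "SG", "DE", "UK"}

def split_category(value):
    """Return (A, B, C) for one category string."""
    parts = value.split("_")
    # First token that is a known country code, or a default if none is found.
    code = next((p for p in parts if p in country_codes), "Not Specified")
    # Everything except the country code token.
    rest = [p for p in parts if p != code]
    a = "_".join(rest)
    # Drop the first token, assumed to be the date.
    c = "_".join(rest[1:])
    return a, code, c

df = pd.DataFrame({
    "category": [
        "20150115_Holiday_HK_Misc",
        "20150115_Holiday_SG_Misc",
        "20140116_DE_ProductFocus",
        "20140116_UK_ProductFocus",
    ]
})

# Build the three new columns row by row and attach them to df.
new_cols = pd.DataFrame(
    [split_category(v) for v in df["category"]],
    columns=["A", "B", "C"],
    index=df.index,
)
df = df.join(new_cols)

print(df)
```

This is a plain Python loop over the rows, so it is likely slower than a vectorised string method on large frames, but it keeps the logic for each column easy to read.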