I want to replace the values in the column below with either 'ASUS' or 'ACER' (in caps): if a value contains the word 'acer' (ignoring case), replace it with 'ACER', and if it contains 'asus', replace it with 'ASUS'. I based my code on the example from the Pandas documentation shown in the screenshot below. I applied the regex replacement, but it doesn't seem to work: the output is unchanged. My code:
dfx = pd.DataFrame({'Brands':['asus', 'ASUS ZEN', 'Acer','ACER Swift']})
dfx = dfx.replace([{'Brands': r'^asus.$'}, {'Brands': 'ASUS'}, {'Brands': r'^acer.$'}, {'Brands': 'ACER'}], regex=True)
dfx['Brands'].unique()
Output in Jupyter notebook:
array(['asus', 'ASUS ZEN', 'Acer', 'ACER Swift'], dtype=object)
Pandas documentation example used:
Any help with a little explanation is very much appreciated.
ACCEPTED SOLUTION(S):
import re  # needed for the re.sub alternative
import pandas as pd
dfx = pd.DataFrame({'Brands':['asus', 'ASUS ZEN', 'Acer','ACER Swift']})
dfx['Brands'] = dfx['Brands'].str.lower().str.replace('.*asus.*', 'ASUS', regex=True).str.replace('.*acer.*', 'ACER', regex=True)
OR
dfx['Brands'] = dfx.Brands.apply(lambda x: re.sub(r".*(asus|acer).*", lambda m: m.group(1).upper(), x, flags=re.IGNORECASE))
dfx['Brands'].unique()
Output:
array(['ASUS', 'ACER'], dtype=object)
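As for why the original attempt changed nothing, two things seem to be going on. First, the pattern `^asus.$` is case-sensitive and requires exactly one character after `asus`, so it matches none of the sample values. Second, the four dicts were wrapped in a single list, so pandas receives them as one `to_replace` argument rather than as pattern/replacement pairs. Below is a minimal sketch of the same idea using `DataFrame.replace` in the `to_replace`/`value` dict form from the documentation; the `(?i)` inline flag and the chained second call are my additions, not part of the accepted answer.

```python
import re
import pandas as pd

brands = ['asus', 'ASUS ZEN', 'Acer', 'ACER Swift']

# '^asus.$' needs exactly one character after a lowercase 'asus',
# so none of the sample values match it.
matches_original = [b for b in brands if re.search(r'^asus.$', b)]

dfx = pd.DataFrame({'Brands': brands})

# Call shape: to_replace={column: pattern}, value={column: replacement}.
# (?i) makes the match case-insensitive; .* on both sides consumes the
# whole value, so the entire string is replaced, not just the brand word.
dfx = (
    dfx.replace({'Brands': r'(?i)^.*asus.*$'}, {'Brands': 'ASUS'}, regex=True)
       .replace({'Brands': r'(?i)^.*acer.*$'}, {'Brands': 'ACER'}, regex=True)
)

result = dfx['Brands'].unique().tolist()
print(matches_original, result)
```

Here `matches_original` comes back empty, which is consistent with the unchanged output in the question, while `result` contains only the two upper-case brand names.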
