I want to replace the values in the column below with either 'ASUS' or 'ACER' (in caps): as long as a value contains the word 'acer' (ignoring case), replace the whole value with 'ACER', and replace any value containing 'asus' with 'ASUS'. I followed the example from the Pandas documentation shown in the screenshot below. I applied the regex replacement, but it doesn't seem to work: the output is unchanged. My code:

dfx = pd.DataFrame({'Brands':['asus', 'ASUS ZEN', 'Acer','ACER Swift']})
dfx = dfx.replace([{'Brands': r'^asus.$'}, {'Brands': 'ASUS'}, {'Brands': r'^acer.$'}, {'Brands': 'ACER'}], regex=True)
dfx['Brands'].unique()

Output in Jupyter notebook:

array(['asus', 'ASUS ZEN', 'Acer', 'ACER Swift'], dtype=object)

Pandas documentation example used: [screenshot: "Pandas Example"; link: "Pandas Link Here"]

Any help with a little explanation is very much appreciated.

ACCEPTED SOLUTION(S):

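import re
import pandas as pd

dfx = pd.DataFrame({'Brands':['asus', 'ASUS ZEN', 'Acer','ACER Swift']})

# Option 1: lowercase every value, then replace any value containing the brand name
dfx['Brands'] = dfx['Brands'].str.lower().str.replace('.*asus.*', 'ASUS', regex=True).str.replace('.*acer.*', 'ACER', regex=True)

# Option 2 (use instead of Option 1): match case-insensitively and uppercase the captured brand
dfx['Brands'] = dfx.Brands.apply(lambda x: re.sub(r".*(asus|acer).*", lambda m: m.group(1).upper(), x, flags=re.IGNORECASE))

dfx['Brands'].unique()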

Output:

array(['ASUS', 'ACER'], dtype=object)

  • Can you be more specific with the conditions that you are trying to meet? Commented Apr 19, 2021 at 7:52
  • The condition is that as long as there is 'acer' in the value, just replace it with 'ACER'; the same goes for 'asus' --> 'ASUS'. Commented Apr 19, 2021 at 8:15

2 Answers

dfx.Brands.apply(lambda x: re.sub(r".*(asus|acer).*", lambda m: m.group(1).upper(), x, flags=re.IGNORECASE))
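
To make the one-liner above easier to follow, here is a minimal sketch of the same idea using only the standard `re` module. The names `BRAND_PATTERN` and `normalize_brand`, and the extra 'Lenovo' value, are made up for illustration and are not part of the original answer. The pattern `.*(asus|acer).*` consumes the entire string, group 1 captures the brand name, and the replacement function returns that captured group in upper case; a value that contains neither brand is left unchanged.

```python
import re

# Hypothetical helper names, for illustration only.
# The pattern matches the whole value and captures the brand in group 1.
BRAND_PATTERN = re.compile(r".*(asus|acer).*", flags=re.IGNORECASE)

def normalize_brand(value):
    # The replacement is a function: it receives the match object
    # and returns the captured brand name in upper case.
    return BRAND_PATTERN.sub(lambda m: m.group(1).upper(), value)

# 'Lenovo' is an extra sample value (not in the question) showing the no-match case.
samples = ['asus', 'ASUS ZEN', 'Acer', 'ACER Swift', 'Lenovo']
normalized = [normalize_brand(s) for s in samples]
print(normalized)  # ['ASUS', 'ASUS', 'ACER', 'ACER', 'Lenovo']
```

With a DataFrame, the same function could be passed to `dfx['Brands'].apply(normalize_brand)`.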
4 Comments

Hi, the output I got was not what I wanted: array(['ASUS', 'ASUS ZEN', 'ACER', 'ACER Swift'], dtype=object). Perhaps I wasn't being clear enough. How do I achieve just 'ASUS' and 'ACER' ?
@snow uh, ok, I understood you wanted to uppercase just the brand
@snow edited, now it should give you the expected output
tried it! this works too. Output is correct. thanks !
Please try

dfx['Brands'] =  dfx['Brands'].str.lower().str.replace('.*asus.*', 'ASUS', regex=True).str.replace('.*acer.*', 'ACER', regex=True)
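
A possible variation on the line above, offered as a sketch rather than the answerer's method: `Series.str.replace` accepts `case=False`, so the `.str.lower()` step can be dropped. The practical difference is that values matching neither brand keep their original casing instead of being lowercased. The 'Lenovo' entry is an extra sample value added here for illustration, and `regex=True` is passed explicitly because it is not the default in pandas 2.0 and later.

```python
import pandas as pd

# 'Lenovo' is a hypothetical extra value, added to show the no-match case.
dfx = pd.DataFrame({'Brands': ['asus', 'ASUS ZEN', 'Acer', 'ACER Swift', 'Lenovo']})

# case=False makes the regex match case-insensitively,
# so no .str.lower() pass is needed beforehand.
cleaned = (
    dfx['Brands']
    .str.replace('.*asus.*', 'ASUS', case=False, regex=True)
    .str.replace('.*acer.*', 'ACER', case=False, regex=True)
)

result = cleaned.tolist()
print(result)  # ['ASUS', 'ASUS', 'ACER', 'ACER', 'Lenovo']
```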

4 Comments

The code above helps to give the output I want but how do I approach it with regex?
The pattern passed to .str.replace() is a regular expression. By default, regex=True in .str.replace(): pandas.pydata.org/docs/reference/api/…
oh! This is new to me. Thanks ! By any chance do you know why the example approach in Pandas did not work for me?
If you want to state it explicitly, you can pass regex=True. Pandas says the default value of regex will become False in a future version. You may also see the edited solution.
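
On the question of why the documentation-style call did not work, two things can be said with some confidence. First, the patterns `^asus.$` and `^acer.$` are case-sensitive and require exactly one character after the brand name, so none of the four values can match them. Second, the documentation example passes the pattern dict and the replacement dict as two separate arguments (`to_replace` and `value`), not as one list of four dicts. The sketch below follows that two-argument form; the `(?i)` inline flag and the `^.*...*$` patterns are choices made here for illustration, not something taken from the thread.

```python
import pandas as pd

dfx = pd.DataFrame({'Brands': ['asus', 'ASUS ZEN', 'Acer', 'ACER Swift']})

# First argument: {column: pattern}; second argument: {column: replacement}.
# (?i) makes the pattern case-insensitive; ^.*asus.*$ matches any value containing 'asus'.
dfx = dfx.replace({'Brands': r'(?i)^.*asus.*$'}, {'Brands': 'ASUS'}, regex=True)
dfx = dfx.replace({'Brands': r'(?i)^.*acer.*$'}, {'Brands': 'ACER'}, regex=True)

brands = dfx['Brands'].tolist()
print(brands)  # ['ASUS', 'ASUS', 'ACER', 'ACER']
```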
