Regex to remove specific parts of a string in a column dataframe python

Question

I'm working with a dataframe that contains addresses, and I want to delete a specific part of each string. For example:

I want to delete everything from the words "REFERENCE:" or "reference:" to the end of the string. I also want to create a new column holding the result (without the word REFERENCE:/reference: and the text that follows it). Could you help me do this with a regex? I want the new column to look something like this:

You should put the code you have and the outputs in text so we can easily work on them.
– malisit, Commented Sep 23, 2020 at 1:30

gold_cy · Accepted Answer · 2020-09-23 01:57:57Z

1

You can use some regex to obtain the desired results.

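import pandas as pd

df = pd.DataFrame({"address": ["Street Pases de la Reforma #200 REFERENCE: Green house", "Street Carranza #300 12 & 13 REFERENCE: There is a tree"]})

# Keep everything that comes before the literal text "REFERENCE" in each address.
df.address.str.findall(r".+?(?=REFERENCE)").explode()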

0    Street Pases de la Reforma #200 
1       Street Carranza #300 12 & 13

Explanation of the regex pattern:

. matches any character (except for line terminators)
+? Quantifier: matches between one and unlimited times, as few times as possible, expanding as needed (lazy)
(?=REFERENCE) Positive lookahead: asserts that the literal text REFERENCE comes next, without including it in the match
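
To see how the pieces above behave, here is a minimal sketch using only the standard `re` module. The lowercase sample string is an assumption added for illustration (the question mentions both "REFERENCE:" and "reference:"). It suggests that the pattern is case-sensitive unless `re.IGNORECASE` is passed, and that the match keeps the space before the keyword because the lookahead does not consume anything.

```python
import re

pattern = r".+?(?=REFERENCE)"

upper = "Street Pases de la Reforma #200 REFERENCE: Green house"
# Assumed lowercase variant, not part of the answer's sample data.
lower = "Street Carranza #300 12 & 13 reference: There is a tree"

# The lazy ".+?" stops at the first position where the lookahead succeeds.
upper_match = re.findall(pattern, upper)

# Without IGNORECASE the lowercase keyword is not found at all.
lower_miss = re.findall(pattern, lower)

# With IGNORECASE the same pattern also stops at "reference".
lower_match = re.findall(pattern, lower, flags=re.IGNORECASE)

print(upper_match)  # ['Street Pases de la Reforma #200 ']
print(lower_miss)   # []
print(lower_match)  # ['Street Carranza #300 12 & 13 ']
```

If the trailing space is unwanted, calling `.strip()` on each result (or `.str.strip()` on a pandas column) removes it.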

Yusnel Rojas García · Accepted Answer · 2020-09-23 01:59:23Z

1

The regex should look like this:

import re

discard_re = re.compile('(reference:.*)', re.IGNORECASE | re.MULTILINE)

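then you can add the new column (the column is called address in the sample dataframe above; adjust the name to match your own data):

df['address_new'] = df['address'].map(lambda x: discard_re.sub('', x))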
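
As an alternative sketch, the same substitution can probably be done without `map` by using pandas' vectorized `Series.str.replace`. The sample dataframe, the third row without a reference part, and the extra `\s*` (which also drops the whitespace before the keyword) are assumptions made for illustration, not part of the answer above.

```python
import re

import pandas as pd

# Hypothetical sample data; the last row has no reference part.
df = pd.DataFrame({"address": [
    "Street Pases de la Reforma #200 REFERENCE: Green house",
    "Street Carranza #300 12 & 13 reference: There is a tree",
    "Street Juarez #10",
]})

# Remove optional whitespace, the keyword in any letter case,
# and everything after it up to the end of the line.
df["address_new"] = df["address"].str.replace(
    r"\s*reference:.*", "", flags=re.IGNORECASE, regex=True
)

print(df["address_new"].tolist())
```

Rows that do not contain the keyword are left unchanged, since the pattern simply does not match them.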
