I have the following strings

s1 = 'XXX-2 I LOVE : XXX XXX'
s2 = 'FOOD : XXX'
s3 = 'XXX-FOOD : XXX'

I would like the following

s1 = '2 I LOVE'
s2 = 'FOOD'
s3 = 'FOOD'

s2 has only one delimiter (:), while s1 and s3 have two (- and :).

I would like to keep everything between the two delimiters - and :, so I use \-(.*?)\: but that gives me nothing for s2, which has no -.

If I use \w+\-?(.*?)\: instead, I get everything before the -.

I am terrible at regex. If someone could help me with this one and point me to a good resource for understanding regex, I would really appreciate it.

  • s2 doesn't have two delimiters to use? Commented Jun 21, 2018 at 23:20
  • exactly why I am stuck Commented Jun 21, 2018 at 23:20
  • s2 doesn't meet your description of the original regex, so it's not clear what you want your regex to save. Additional explanations and/or examples are necessary. Commented Jun 21, 2018 at 23:22
  • perhaps the changes help Commented Jun 21, 2018 at 23:25

3 Answers

The regex you need is: (?:\w+-)?(.*?):

(?:\w+-)? checks for an initial sequence of word characters (\w+) followed by a hyphen. Because that part is in parentheses, the question mark after it makes the whole group optional: either \w+ and - are both present at the start of the string, or neither is. The ?: just tells Python that the parentheses are there for grouping only, not because you want the matched text captured and stored.

(.*?) - This matches the part we actually want and stores it in capture group 1. So if you have m = re.match(r"(?:\w+-)?(.*?):", 'XXX-2 I LOVE : XXX XXX'), then m.group(1) will contain '2 I LOVE ' (including the space before the colon; call .strip() on it to get 2 I LOVE).

(Note that neither - nor : needs a backslash escape in regex in general (- needs escaping only inside [] character classes), so you can write them without escaping.)
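As a quick check, here is a minimal sketch that runs this pattern over the three strings from the question. The helper name extract is made up for illustration, and the .strip() call is an addition to drop the space that sits before the colon.

```python
import re

# Optional "word-" prefix (non-capturing), then lazily capture up to the first colon.
PATTERN = re.compile(r"(?:\w+-)?(.*?):")

def extract(s):
    """Return the text before the first colon, without an optional 'word-' prefix.

    'extract' is a hypothetical helper name, not part of the re module.
    """
    m = PATTERN.match(s)
    # The captured text keeps the space before the colon, so strip it.
    return m.group(1).strip() if m else None

for s in ['XXX-2 I LOVE : XXX XXX', 'FOOD : XXX', 'XXX-FOOD : XXX']:
    print(repr(extract(s)))
```

If a string contains no colon at all, re.match returns None and the helper passes that through instead of raising.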

You might find tools like RegExr useful for exploring and understanding regexes.

The following regex should work for your example

(?:[^-]+-)?([^:]+):.*
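A rough sketch of how this pattern could be used from Python, next to a \w+-based variant for comparison. The helper name extract_field and the last sample string ('X.X-FOOD : XXX', a prefix containing a non-word character) are assumptions added for illustration and are not from the question.

```python
import re

# Pattern from this answer: optional prefix of non-hyphen characters ending in "-",
# then capture everything up to the first colon.
NEGATED = re.compile(r"(?:[^-]+-)?([^:]+):.*")
# Variant whose optional prefix only allows word characters, for comparison.
WORD = re.compile(r"(?:\w+-)?(.*?):")

def extract_field(s, pattern=NEGATED):
    """Hypothetical helper: return capture group 1, stripped, or None if no match."""
    m = pattern.match(s)
    return m.group(1).strip() if m else None

samples = [
    'XXX-2 I LOVE : XXX XXX',
    'FOOD : XXX',
    'XXX-FOOD : XXX',
    'X.X-FOOD : XXX',  # hypothetical input: non-word character before the hyphen
]
for s in samples:
    print(repr(s), '->', repr(extract_field(s)), '|', repr(extract_field(s, WORD)))
```

Both patterns agree on the three strings from the question. On the hypothetical fourth string the [^-]+ version still returns FOOD, while the \w+ version cannot consume the X.X prefix and returns X.X-FOOD.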

5 Comments

  • how dude just how??? Thank you so much. Is this just from experience or do you mind sharing a good resource?
  • You're most welcome. Regex looks scary at the beginning, but it gets easy once you get used to it. I just wrote that, but a good tool to test your regex is regex101.com
  • thank you thank you thank you!!! It looks so confusing at first but I want to get much better at it.
  • Just a note: @Sundar's answer has more explanation of the regex, and using \w+ instead of [^-]+ will work for your example, but will not work if you have a non-word character before the -
  • yes I was playing around with the link you provided me and ran into that problem.

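This removes every X and every character that is not a letter, digit, or whitespace. We then use strip to remove the leftover trailing whitespace.

import re

s1 = re.sub(r'[^a-zA-Z0-9\s]+|X','',s1).strip()
s2 = re.sub(r'[^a-zA-Z0-9\s]+|X','',s2).strip()
s3 = re.sub(r'[^a-zA-Z0-9\s]+|X','',s3).strip()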

2 I LOVE
FOOD
FOOD
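The same substitution can be wrapped in a small function to see where it stops working. This is only a sketch: the name clean and the two extra inputs ('XXX-EXTRA : XXX' and 'FOOD : BAR') are invented for illustration.

```python
import re

def clean(s):
    """Delete every 'X' and every run of characters that are not letters,
    digits, or whitespace, then strip surrounding whitespace.

    'clean' is a hypothetical helper name used only for this sketch.
    """
    return re.sub(r'[^a-zA-Z0-9\s]+|X', '', s).strip()

for s in ['XXX-2 I LOVE : XXX XXX', 'FOOD : XXX', 'XXX-FOOD : XXX']:
    print(repr(clean(s)))

# Hypothetical inputs that show the limits of deleting characters outright:
print(repr(clean('XXX-EXTRA : XXX')))  # the X inside EXTRA is deleted too
print(repr(clean('FOOD : BAR')))       # text after the colon is kept
```

Because the substitution deletes characters instead of locating the delimiters, it only gives the desired result while the placeholder text consists solely of X and the wanted text contains no X.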

1 Comment

  • I was going to resort to this since I was losing hope. Thank you!
