3

How do I extract the following substring from a string using re in Python?

string1 = "fgdshdfgsLooking: 3j #123"
substring = "Looking: 3j #123"

string2 = "Looking: avb456j #13fgfddg"
substring = "Looking: avb456j #13"

I tried:

re.search(r'Looking: (.*#\d+)$', string1)
1
  • Is the word "Looking" always present? Commented Jan 8, 2022 at 7:22

4 Answers

2

Your regex is mostly correct; you just need to remove the end-of-line anchor $. In cases like string2, the match does not end at the end of the string, because extra text follows it.

import re

string1 = 'fgdshdfgsLooking: 3j #123'
string2 = 'Looking: avb456j #13fgfddg'

pattern = r'Looking: (.*?#\d+)'

match1 = re.search(pattern, string1)
match2 = re.search(pattern, string2)

print('String1:', string1, '|| Substring1:', match1.group(0))
print('String2:', string2, '|| Substring2:', match2.group(0))

Output:

String1: fgdshdfgsLooking: 3j #123 || Substring1: Looking: 3j #123
String2: Looking: avb456j #13fgfddg || Substring2: Looking: avb456j #13

This should work. I've also matched everything before # lazily by using ?, which matches as few characters as possible and expands only as needed. This avoids matching everything up to a second #, in case another # followed by digits appears further down the string.
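To make the greedy versus lazy difference concrete, here is a minimal sketch. The input string is a hypothetical one (not from the question), chosen so that a second # followed by digits appears later in the text:

```python
import re

# Hypothetical input: a second "#<digits>" appears after the part we want.
text = "Looking: 3j #123 and later #456"

# Greedy .* runs to the LAST "#<digits>" it can find.
greedy = re.search(r'Looking: (.*#\d+)', text)

# Lazy .*? stops at the FIRST "#<digits>".
lazy = re.search(r'Looking: (.*?#\d+)', text)

print(greedy.group(0))  # Looking: 3j #123 and later #456
print(lazy.group(0))    # Looking: 3j #123
```

For the two strings in the question both patterns give the same result, since each contains only one #; the lazy form only matters once a second one can occur.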


2 Comments

Exactly what I need. Thank you! This solved my third situation as well.
@jdfhf Glad it helped. Make sure to mark it as the accepted answer if you feel it answers your question, so that others know the question has been answered and where to look for the solution.
2

You need to remove the $ from the regex:

 re.search(r'Looking: (.*#\d+)', string1)

The full match, group(0), already includes Looking. If you also want Looking inside a capture group, you'll have to wrap the whole pattern in parens:

 re.search(r'(Looking: (.*#\d+))', string1)
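As a rough illustration of what each group holds for the two patterns above, using string1 from the question (the names m and m2 are just for this sketch):

```python
import re

string1 = "fgdshdfgsLooking: 3j #123"

# Without the outer parens: group(0) is the whole match, group(1) the inner part.
m = re.search(r'Looking: (.*#\d+)', string1)
print(m.group(0))  # Looking: 3j #123
print(m.group(1))  # 3j #123

# With the outer parens: group(1) now includes "Looking: " as well.
m2 = re.search(r'(Looking: (.*#\d+))', string1)
print(m2.group(1))  # Looking: 3j #123
print(m2.group(2))  # 3j #123
```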

1 Comment

This works great, but I forgot to add one more scenario to the description: String3 = "Looking: avb456j #13Looking:hgf55j #14". In this situation, how do I get both substring1 = "Looking: avb456j #13" and substring2 = "Looking:hgf55j #14"?
1

Try:

re.search(r'Looking: (.)*#(\d)+', string1)

  1. It matches the literal text "Looking: "
  2. Then it looks for zero or more of any character
  3. Then a "#"
  4. And finally one or more digits
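The four steps above can be sketched with re.VERBOSE, which allows each part of the pattern to carry a comment. This is only an annotated restatement of the same pattern, assuming string1 from the question. Because the quantifiers sit outside the parentheses, each group keeps only its last repetition, so group(0) is the value to read for the full substring:

```python
import re

string1 = "fgdshdfgsLooking: 3j #123"

pattern = re.compile(
    r"""
    Looking:[ ]   # 1. the literal text "Looking: " (space in a class, since VERBOSE ignores bare whitespace)
    (.)*          # 2. zero or more of any character
    \#            # 3. a literal "#", escaped because "#" starts a comment in VERBOSE mode
    (\d)+         # 4. one or more digits
    """,
    re.VERBOSE,
)

m = pattern.search(string1)
print(m.group(0))  # Looking: 3j #123
print(m.group(2))  # 3  (only the last digit matched by the repeated group)
```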


1

Try this:

re.search(r"[A-Z]\w+:\s?\w+\s#\d+", string1)

2 Comments

This works great, but I forgot to add one more scenario to the description: String3 = "Looking: avb456j #13Looking:hgf55j #14". In this situation, how do I get both substring1 = "Looking: avb456j #13" and substring2 = "Looking:hgf55j #14"?
Then this updated code should do it.
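The thread does not show how the pattern was applied to String3. One possible way, sketched here as an assumption rather than as the answerer's actual code, is to pass this answer's pattern to re.findall, which returns every non-overlapping match:

```python
import re

string3 = "Looking: avb456j #13Looking:hgf55j #14"

# The optional \s? lets the pattern match both "Looking: ..." and "Looking:...".
pattern = r"[A-Z]\w+:\s?\w+\s#\d+"

# The pattern has no capture groups, so findall returns the full matches.
matches = re.findall(pattern, string3)
print(matches)  # ['Looking: avb456j #13', 'Looking:hgf55j #14']
```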
