1

I would like to retrieve everything before a specific string in a text file using regex in Python. For example, given the line

text = 'What I want is a red car'

I would like to retrieve everything that is before "a red car", that is:

result = 'What I want is'

The whole string "a red car" is important, not only "red" or "car" separately!

Thanks in advance!

6
  • Have you tried anything yourself? What did you try? Commented Aug 2, 2017 at 13:49
  • Actually, this is something you could easily google yourself... Commented Aug 2, 2017 at 13:50
  • duplicate of stackoverflow.com/questions/12572362/…? Commented Aug 2, 2017 at 13:50
  • Yes, I tried: found = re.search('(.+?)A red car', text). It works if I put only "car" or "red", but not when I use the entire string. Commented Aug 2, 2017 at 13:51
  • The whole string "a red car" is important, but I would like to retrieve everything that is before "a red car". Commented Aug 2, 2017 at 13:52

3 Answers

4

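If you need to use a regex for this:

import re

# The pattern includes the space before "a red car", so the group has no trailing space.
regex = re.compile('(?P<before_red_car>.+) a red car')
regex.search("What I want is a red car").group('before_red_car')  # 'What I want is'

If you don't want to name your group:

# No space in the pattern here, so the captured group keeps the trailing space.
regex = re.compile('(.+)a red car')
regex.search("What I want is a red car").group(1)  # 'What I want is '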

If you need to catch everything including newlines, add the re.DOTALL flag.
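
As a minimal sketch of the re.DOTALL point, assuming a hypothetical multi-line input (the names `multiline_text` and `marker` are illustrative and not from the question). It also uses `re.escape`, which is not needed for "a red car" but is a reasonable precaution if the search string could contain regex metacharacters:

```python
import re

# Hypothetical multi-line input; the marker sits on the last line.
multiline_text = "What I want,\nafter much thought,\nis a red car"
marker = "a red car"

# re.escape guards against regex metacharacters in the marker.
# The non-greedy (.*?) stops at the first occurrence of the marker.
pattern = re.compile("(.*?)" + re.escape(marker), re.DOTALL)

match = pattern.search(multiline_text)
before = match.group(1) if match else None

# Without re.DOTALL, "." does not match "\n", so the capture
# can only begin on the same line as the marker.
single_line = re.search("(.*?)" + re.escape(marker), multiline_text)
before_single_line = single_line.group(1) if single_line else None

print(repr(before))
print(repr(before_single_line))
```

With the flag, the capture spans all three lines up to the marker; without it, only the text on the marker's own line is captured.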

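However, doing

text = 'What I want is a red car'
text.split('a red car')[0]  # 'What I want is ' (add .rstrip() to drop the trailing space)

Or even:

text = 'What I want is a red car'
text.replace('a red car', '')  # only equivalent when nothing follows 'a red car'

Both work too, and are arguably easier to understand. They are also about twice as fast as the regex: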

timeit.timeit(lambda: text.split('a red car')[0])
0.5350678942020507

timeit.timeit(lambda: text.replace('a red car', ''))
0.5115460171814448

timeit.timeit(lambda: regex.search("What i want is a red car").group(1))
1.123993800741033 

# Without re.compile()
timeit.timeit(lambda: re.search('(.+)a red car', text).group(1))
1.94518623436079
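
The absolute numbers above depend on the machine and Python version. A sketch for reproducing the comparison, assuming the same `text` and patterns as in this answer (the function names and the `number=100_000` setting are my own choices, not part of the original measurement):

```python
import re
import timeit

text = 'What I want is a red car'
compiled = re.compile('(.+)a red car')


def with_split():
    return text.split('a red car')[0]


def with_replace():
    return text.replace('a red car', '')


def with_compiled_regex():
    return compiled.search(text).group(1)


def with_uncompiled_regex():
    return re.search('(.+)a red car', text).group(1)


candidates = [with_split, with_replace, with_compiled_regex, with_uncompiled_regex]

# Check that all four approaches agree on this input before timing them.
results = {func.__name__: func() for func in candidates}

# number=100_000 keeps the run short; timeit's default is 1,000,000 calls.
timings = {func.__name__: timeit.timeit(func, number=100_000) for func in candidates}

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f}s")
```

Only the relative ordering is meaningful; expect the figures themselves to differ from those quoted above.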

1

You can try this:

strIn = 'What I want is a red car'
searchStr = 'a red car'
print(strIn[:strIn.find(searchStr)])
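
One caveat with `find`: it returns -1 when the search string is absent, so the slice then silently drops the last character instead of signalling a miss. A minimal sketch of a guard using `str.partition` (the helper name `text_before` is hypothetical, and returning `None` for a missing marker is an assumption about the desired behaviour):

```python
def text_before(text, marker):
    """Return the part of text before the first marker, or None if absent."""
    before, found, _ = text.partition(marker)
    # partition returns an empty separator when the marker is not found.
    return before if found else None


print(repr(text_before('What I want is a red car', 'a red car')))
print(repr(text_before('What I want is a blue car', 'a red car')))
```

As with the slice, the result keeps the space before the marker; call `.rstrip()` on it if that space is unwanted.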

-2

This might help

text = 'What I want is a red car'
print(text[0:13])

4 Comments

And what happens if the text is then changed to 'What you want is a red car'?
You mean that text = 'What you want is a red car'?
Exactly that; you will end up returning 'What you want'... My point is that this solution is not dynamic or flexible. Also, I just tested the above and it returns 'What I want i', which doesn't seem to be what the OP wants.
Yeah, you're right. But the question doesn't mention any situation like that, and the OP is declaring the string text himself. That's why I came up with this solution.
