I have an output as below:

output = test:"<no-cjar> mxy 3044 jpg fasst_st

I want to read the output and store only the value that comes after 'mxy' and before 'jpg'. The value is always an integer.

I used something like:

value = re.findall(r"mxy \d+", output)

which I think will return 'mxy 3044'. I am not sure this is the best way, as I would probably have to split the result again.

Is this right, and is there another way I can do the same? Thanks in advance.

1
  • Why not use a lookbehind: (?<=mxy )\d+ Commented Feb 12, 2015 at 4:27
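
The lookbehind suggested in this comment can be tried as follows. This is only a minimal sketch using the sample string from the question; the variable name `lookbehind_matches` is made up for illustration. Note that Python's `re` module only accepts fixed-width lookbehinds, so `(?<=mxy )` works but `(?<=mxy\s*)` would raise an error.

```python
import re

output = 'test:"<no-cjar> mxy 3044 jpg fasst_st'

# (?<=mxy ) asserts that "mxy " comes right before the digits,
# without making it part of the returned match.
lookbehind_matches = re.findall(r"(?<=mxy )\d+", output)
print(lookbehind_matches)  # ['3044']
```

Because the lookbehind is not part of the match, no second split is needed.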

2 Answers

1

Use a capturing group.

value = re.findall(r"mxy\s*(.*?)\s*jpg", output)

\s* matches zero or more whitespace characters. Alternatively, since the value is always an integer, use r"mxy\s*(\d+)\s*jpg".

Example:

>>> re.findall(r"mxy\s*(.*?)\s*jpg", 'test:"<no-cjar> mxy 3044 jpg fasst_st')
['3044']
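
Since only a single integer is wanted, one possible variation is to use `re.search` with the `\d+` form of the pattern and convert the captured group directly. This is a sketch, not part of the original answer; the helper name `extract_mxy_value` is hypothetical.

```python
import re

def extract_mxy_value(text):
    """Return the integer between 'mxy' and 'jpg', or None if it is absent."""
    match = re.search(r"mxy\s*(\d+)\s*jpg", text)
    # group(1) is the text captured by (\d+)
    return int(match.group(1)) if match else None

value = extract_mxy_value('test:"<no-cjar> mxy 3044 jpg fasst_st')
print(value)  # 3044
```

Returning None when nothing matches avoids indexing into an empty list, which is what `re.findall(...)[0]` would do on unexpected input.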

1 Comment

If I use "mxy\s*(.*)\s*jpg", it returns ['3044 '] with a space after the number. So a space counts as a character, and hence the dot matches it. Does the '?' then restrict the match to the type of character it matched before, an integer in this case? Thanks in advance.
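
On the question in this comment: the '?' after a quantifier does not restrict the character type. It makes the quantifier lazy, so `.*?` matches as few characters as possible and leaves the trailing space for the following `\s*`. The sketch below compares the two patterns on the sample string; the variable names are illustrative only. To restrict the match to digits, `\d+` is the tool, not '?'.

```python
import re

output = 'test:"<no-cjar> mxy 3044 jpg fasst_st'

# Greedy: .* grabs as much as it can, including the space before "jpg".
greedy = re.findall(r"mxy\s*(.*)\s*jpg", output)

# Lazy: .*? stops as early as possible, so \s* consumes the space instead.
lazy = re.findall(r"mxy\s*(.*?)\s*jpg", output)

print(greedy)  # ['3044 ']
print(lazy)    # ['3044']
```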
0

Alternative approach, assuming that the format of your string does not change, i.e. it has mxy followed by a space and a 4-digit number.

output = "<no-cjar> mxy 3044 jpg fasst_st"
n = int(output.split('mxy')[1][0:6])
print(n) # gives 3044

Just a different way of doing the same thing, but without regex.
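
If the number might not always be 4 digits, a variant of the same idea could split on whitespace instead of slicing a fixed width. This is a sketch under the assumption that the value is the first whitespace-separated token after "mxy"; the name `after_mxy` is made up for illustration.

```python
output = 'test:"<no-cjar> mxy 3044 jpg fasst_st'

# Everything after the first "mxy", then its first whitespace-separated token.
after_mxy = output.split("mxy", 1)[1]
n = int(after_mxy.split()[0])
print(n)  # 3044
```

This raises IndexError if "mxy" is missing and ValueError if the token is not an integer, so input that may be malformed would need a try/except around it.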

2 Comments

Thanks. I think this will return a space before the number in the match.
Nope, sorry. It converts to int. Thanks.
