In the string below, I need the values of Version:, Build Number:, and perforce_url:. Currently, I'm extracting each of these matches separately. I'd like to simplify my code to get all of them in a single line.

x = '''Version: 2.2.4125
Build Number: 125
Project Name: xyz.master
Git Url: git+ssh://[email protected]:123/ab/dashboard
Git Branch: origin/master
Git Built Data: qw123ed45rfgt689090gjlllb
perforce_url:
  //projects/f5/dashboard/1.3/xyz/portal/
artifacts:
   "..//www/":     www/ '''

I have used re.search to extract the values of Version:, Build Number:, and perforce_url: separately. However, I'd like to simplify this and get it done in a single line.

import re

# re.search scans the whole string; re.match only matches at the very start,
# so it would find Version: but not the other two keys.
matchObj = re.search(r'Version:\s*(.*)\n', x)
if matchObj:
    print(matchObj.group(1))

matchObj = re.search(r'perforce_url:\s*(.*)\n', x)
if matchObj:
    print(matchObj.group(1))

matchObj = re.search(r'Build Number:\s*(.*)\n', x)
if matchObj:
    print(matchObj.group(1))

I tried the following pattern:

Version:\s*(.*)\n|perforce_url:\s*(.*)\n.

But it did NOT work. I want to create a list and append the matches to it using

list = []
list.append()

Expected result :

['2.2.4125', '//projects/f5/dashboard/1.3/xyz/portal/' , '125']

Actual result

2.2.4125

//projects/f5/dashboard/1.3/xyz/portal/

125

2 Answers

You could put Version and Build Number after each other to get those values in a capturing group.

For the perforce_url you could use a repeating pattern with a negative lookahead, (?:\n(?!perforce).*)*, to match the following lines as long as they don't start with perforce_url.

When a line does start with it, match that line and take its value using a capturing group:

Version:\s*(.*)\nBuild Number:\s*(.*)(?:\n(?!perforce).*)*\nperforce_url:\s*(.*)

For example:

import re

regex = r"Version:\s*(.*)\nBuild Number:\s*(.*)(?:\n(?!perforce).*)*\nperforce_url:\s*(.*)"
x = ("Version: 2.2.4125\n"
            "Build Number: 125\n"
            "Project Name: xyz.master\n"
            "Git Url: git+ssh://[email protected]:123/ab/dashboard\n"
            "Git Branch: origin/master\n"
            "Git Built Data: qw123ed45rfgt689090gjlllb\n"
            "perforce_url:\n"
            "  //projects/f5/dashboard/1.3/xyz/portal/\n"
            "artifacts:\n"
            "   \"..//www/\":     www/ ")

print(re.findall(regex, x))

Result

[('2.2.4125', '125', '//projects/f5/dashboard/1.3/xyz/portal/')]
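
If a flat list in the order shown in the question is wanted instead of a list holding one tuple, the groups of a single match can be unpacked and reordered. The following is a minimal sketch under that assumption; `extract_fields` is a hypothetical helper name, not something from the question, and the sample text is the string from the question.

```python
import re

REGEX = r"Version:\s*(.*)\nBuild Number:\s*(.*)(?:\n(?!perforce).*)*\nperforce_url:\s*(.*)"


def extract_fields(text):
    """Return [version, perforce_url, build_number], or [] when the pattern does not match."""
    match = re.search(REGEX, text)
    if match is None:
        return []
    # Groups come back in pattern order: Version, Build Number, perforce_url.
    version, build_number, perforce_url = match.groups()
    # Reorder to the order used in the question's expected result.
    return [version, perforce_url, build_number]


x = ("Version: 2.2.4125\n"
     "Build Number: 125\n"
     "Project Name: xyz.master\n"
     "Git Url: git+ssh://[email protected]:123/ab/dashboard\n"
     "Git Branch: origin/master\n"
     "Git Built Data: qw123ed45rfgt689090gjlllb\n"
     "perforce_url:\n"
     "  //projects/f5/dashboard/1.3/xyz/portal/\n"
     "artifacts:\n"
     "   \"..//www/\":     www/ ")

print(extract_fields(x))
# ['2.2.4125', '//projects/f5/dashboard/1.3/xyz/portal/', '125']
```

Returning an empty list on a failed match is a design choice for this sketch; raising an exception would be equally reasonable if a missing key should be treated as an error.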

Based on @The fourth bird's answer, but with a slight twist. By using alternation (|), you can avoid needing the non-capturing group that skips the lines between "Build Number" and "perforce_url". That way the regex only contains what you explicitly want to target.

r"Version:\s*(.*)\n|Build Number:\s*(.*)\n|perforce_url:\s*(.*)\n"

Edit: realized non-capture groups around "Version", "Build" etc. were unnecessary
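
With an alternation like this, each match fills only one of the three groups, so the results need to be flattened to get a plain list. A minimal sketch of one way to do that with re.finditer follows; `collect_values` is a hypothetical helper name, and the sample text is the string from the question. The values come back in the order they appear in the text, which differs from the order in the question's expected result.

```python
import re

PATTERN = r"Version:\s*(.*)\n|Build Number:\s*(.*)\n|perforce_url:\s*(.*)\n"


def collect_values(text):
    """Collect the captured value from each alternative, in order of appearance."""
    values = []
    for match in re.finditer(PATTERN, text):
        # Only the group belonging to the alternative that matched is set;
        # the other groups are None and are skipped.
        values.extend(group for group in match.groups() if group is not None)
    return values


x = ("Version: 2.2.4125\n"
     "Build Number: 125\n"
     "Project Name: xyz.master\n"
     "Git Url: git+ssh://[email protected]:123/ab/dashboard\n"
     "Git Branch: origin/master\n"
     "Git Built Data: qw123ed45rfgt689090gjlllb\n"
     "perforce_url:\n"
     "  //projects/f5/dashboard/1.3/xyz/portal/\n"
     "artifacts:\n"
     "   \"..//www/\":     www/ ")

print(collect_values(x))
# ['2.2.4125', '125', '//projects/f5/dashboard/1.3/xyz/portal/']
```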
