I have a log file with the content below:

commit da83ddfdfb36f0c48ab2137efaa8c81a6bb41993
Author: ”abc <[email protected]>
Commit: ”abc <[email protected]>
..
..

I am trying to build a regular expression to match it, as below:

TEST_COMMIT = 'commit\ (?P<commit>[a-f0-9]+)\n(?P<author>Author.*)\n'
RE_COMMIT = re.compile(TEST_COMMIT, re.MULTILINE | re.VERBOSE)

This matches fine on regex101 (https://regex101.com/) but does not work in my code.

I want to get the commit ID and the author info as separate named groups, so:

commit group should be: `da83ddfdfb36f0c48ab2137efaa8c81a6bb41993`
author group should be: `Author: ”abc <[email protected]>`

My Python version is 2.7.12.

Any comments on what I am doing wrong?

1 Answer

Finally, I have been able to resolve this issue.

The problem was that the log file uses carriage return + line feed (\r\n) as its line ending, not a bare \n.
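One way to confirm this kind of problem is to look at the raw characters instead of the printed text. The snippet below is a minimal sketch: `sample` is a hypothetical stand-in for the real file contents, and `line_ending_counts` is an illustrative helper, not part of any library.

```python
# Hypothetical sample standing in for the real log file contents.
sample = "commit da83ddfdfb36f0c48ab2137efaa8c81a6bb41993\r\nAuthor: abc\r\n"


def line_ending_counts(text):
    """Count CRLF and bare LF line endings in text."""
    crlf = text.count("\r\n")
    lf_only = text.count("\n") - crlf
    return {"crlf": crlf, "lf_only": lf_only}


# repr() shows \r and \n explicitly instead of rendering them as line breaks.
print(repr(sample[:60]))
print(line_ending_counts(sample))
```

If the `crlf` count is non-zero, a pattern that expects only `\n` between lines will not match as intended.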

Once the regex is changed to include \r\n, it captures the groups correctly. This code works:

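import re

# Raw string: \r\n stays a regex escape, so re.VERBOSE does not ignore it.
TEST_COMMIT = r'''
commit\ (?P<commit>[a-f0-9]+)\r\n
(?P<author>Author.*)\r\n
(?P<committer>Commit.*)\r\n
(?P<message>.*)\r\n
'''
RE_COMMIT = re.compile(TEST_COMMIT, re.MULTILINE | re.VERBOSE)

# data holds the full contents of the log file
commits = RE_COMMIT.finditer(data)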
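
If the same script may also see logs with plain \n line endings, the pattern can accept both. The following is a sketch under assumptions, not the code from the original script: `COMMIT_PATTERN`, `RE_COMMIT_ANY_EOL` and `sample` are illustrative names, and the e-mail addresses are placeholders. With re.VERBOSE, a literal newline produced by '\n' in a non-raw string is treated as ignorable whitespace, so the sketch keeps the raw string.

```python
import re

# \r?\n accepts both Windows (CRLF) and Unix (LF) line endings.
# The raw string keeps \r and \n as regex escapes, so re.VERBOSE
# does not discard them as literal whitespace.
COMMIT_PATTERN = r'''
commit\ (?P<commit>[a-f0-9]+)\r?\n
(?P<author>Author[^\r\n]*)\r?\n
(?P<committer>Commit[^\r\n]*)\r?\n
'''
RE_COMMIT_ANY_EOL = re.compile(COMMIT_PATTERN, re.VERBOSE)

# Hypothetical sample data; the e-mail addresses are placeholders.
sample = (
    "commit da83ddfdfb36f0c48ab2137efaa8c81a6bb41993\r\n"
    "Author: abc <abc@example.com>\r\n"
    "Commit: abc <abc@example.com>\r\n"
)

for match in RE_COMMIT_ANY_EOL.finditer(sample):
    print(match.group("commit"))
    print(match.group("author"))
```

Using `[^\r\n]*` instead of `.*` keeps a trailing \r out of the captured groups without relying on backtracking.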