1

I have a string, and I want to use a regex to extract the characters between two known patterns: "Cp_6%3A", then some characters, then either "&" (possibly followed by more characters) or the end of the string.

My code looks like this:

def extract_id_from_ref(ref):
  id = re.search("Cp\_6\%3A(.*?)(\& | $)", ref)
  print(id)

But this isn't producing anything. Any ideas?

Thanks in advance

  • Reference the match group ... Commented Jul 6, 2017 at 16:24
  • i.e. do id.group(0) or whatever item you want. See here Commented Jul 6, 2017 at 16:25

2 Answers

1

Note that (\& | $) matches either the & char and a space after it, or a space and end of string (the spaces are meaningful here!).
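To see the effect of those spaces, here is a small sketch that runs the question's pattern (with the unneeded escapes dropped) against a few sample strings. The sample values are made up for illustration and do not come from the question.

```python
import re

# The question's pattern, spaces included (escapes on _, % and & dropped; they change nothing).
spaced = re.compile(r"Cp_6%3A(.*?)(& | $)")

# Hypothetical inputs, not taken from the question.
typical = "ref=Cp_6%3Aabc123&page=2"
amp_then_space = "ref=Cp_6%3Aabc123& page=2"
trailing_space = "ref=Cp_6%3Aabc123 "

no_match = spaced.search(typical)           # no "& " and no space before the end
with_amp = spaced.search(amp_then_space)    # "& " is present
with_space = spaced.search(trailing_space)  # a space followed by the end of the string

print(no_match)
print(with_amp.group(1), with_space.group(1))
```

Only the two inputs that happen to contain a literal space in the right place match, which is why the original call appears to produce nothing on an ordinary string.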

Use a negated character class [^&]* (zero or more chars other than &) to simplify the regex (no need for an alternation group or lazy dot matching pattern) and then access .group(1):

import re

def extract_id_from_ref(ref):
    # Capture everything after the marker, up to the next & or the end of the string
    m = re.search(r"Cp_6%3A([^&]*)", ref)
    if m:
        print(m.group(1))

Note that neither _ nor % is a special regex metacharacter, so neither has to be escaped.
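If the extracted value is needed by the caller rather than just printed, a variant that returns it may be more convenient. This is a minimal sketch: the name extract_id, the constant MARKER_RE, and the sample strings are assumptions made for illustration.

```python
import re

# Negated character class: stop at the first & or at the end of the string.
MARKER_RE = re.compile(r"Cp_6%3A([^&]*)")

def extract_id(ref):
    """Return the text after Cp_6%3A up to the next & (or end of string), else None."""
    m = MARKER_RE.search(ref)
    return m.group(1) if m else None

# Hypothetical inputs covering both cases from the question, plus a miss.
with_suffix = extract_id("ref=Cp_6%3Aabc123&page=2")
at_end = extract_id("ref=Cp_6%3Aabc123")
missing = extract_id("ref=other")

print(with_suffix, at_end, missing)
```

Returning None when the marker is absent lets the caller decide how to handle a string without an id.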

See the regex demo.

0

The problem is that spaces in a regex pattern are also taken into account. Furthermore, to put a literal backslash into the string, you either have to write \\ (two backslashes) or use a raw string.

So you should write:

r"Cp_6\%3A(.*?)(?:\&|$)"

If you then match with:

import re

def extract_id_from_ref(ref):
    m = re.search(r"Cp_6\%3A(.*?)(?:\&|$)", ref)
    if m:
        print(m.group(1))  # print the captured text, not the match object

It should work.
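If the spaces were there only to make the pattern easier to read, Python's re.VERBOSE flag is one way to keep them: with it, unescaped whitespace in the pattern is ignored and # starts a comment. A minimal sketch, assuming the same marker and made-up sample strings:

```python
import re

# With re.VERBOSE, whitespace outside character classes is not part of the pattern.
verbose = re.compile(
    r"""
    Cp_6%3A      # literal marker
    (.*?)        # the id, matched lazily
    (?: & | $ )  # stop at & or at the end of the string
    """,
    re.VERBOSE,
)

# Hypothetical inputs, not taken from the question.
before_amp = verbose.search("ref=Cp_6%3Aabc123&page=2").group(1)
at_end = verbose.search("ref=Cp_6%3Aabc123").group(1)

print(before_amp, at_end)
```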
