Using a bytes regular expression works fine:

In [48]: regexp_1 = re.compile(b"\xab.{3}")
In [49]: regexp_1.fullmatch(b"\xab\x66\x77\x88")
Out[49]: <re.Match object; span=(0, 4), match=b'\xabfw\x88'> # <----- good !

When I try to build the bytes pattern with string formatting, as described in this post, it fails:

In [50]: byte = b"\xab"
In [51]: regexp_2 = re.compile(f"{byte}.{3}".encode())
In [52]: regexp_2.fullmatch(b"\xab\x66\x77\x88")
In [53]: # nothing found ... why ?

1 Answer

This happens because an f-string converts each interpolated object to a string, and the string form of a bytes object is its repr rather than its contents:

>>> str(byte)
"b'\\xab'"

So when you interpolate it in an f-string, the b'...' wrapper becomes part of the text, and it stays there when the result is encoded again:

>>> f"{byte}.{3}"
"b'\\xab'.3"
>>> f"{byte}.{3}".encode()
b"b'\\xab'.3"

On top of that, {3} is evaluated as the expression 3, so the quantifier is lost. You can prevent that with doubled braces ({{3}}), but that is not the main problem here.
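To see both problems side by side, here is a small sketch comparing the patterns the f-string actually produces with the intended one. The names `naive`, `braces_fixed`, and `intended` are illustrative and not from the question:

```python
import re

byte = b"\xab"
data = b"\xab\x66\x77\x88"

# {3} is evaluated as the integer 3, and {byte} is replaced by str(byte).
naive = f"{byte}.{3}".encode()

# Doubling the braces keeps the literal {3}, but {byte} is still str(byte).
braces_fixed = f"{byte}.{{3}}".encode()

# The pattern the question was trying to build.
intended = b"\xab.{3}"

print(naive)         # b"b'\\xab'.3"
print(braces_fixed)  # b"b'\\xab'.{3}"
print(re.fullmatch(braces_fixed, data))  # None
print(re.fullmatch(intended, data))      # a match object
```

Even with the braces escaped, the pattern still begins with the literal characters b and ', so it cannot match data that starts with the byte 0xab.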

I recommend concatenating the bytes objects instead:

import re

byte = b"\xab"
regexp = re.compile(byte + b'.{3}')

# <re.Match object; span=(0, 4), match=b'\xabfw\x88'>
regexp.fullmatch(b"\xab\x66\x77\x88")
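
If the prefix is not known in advance, plain concatenation can misbehave when the byte happens to be a regex metacharacter such as b"." or b"*". A minimal sketch, assuming a hypothetical helper `prefix_pattern` that is not part of the original answer, escapes the prefix with `re.escape` (which accepts bytes) and uses printf-style bytes formatting as the closest bytes equivalent of an f-string:

```python
import re

def prefix_pattern(prefix: bytes, count: int = 3):
    """Hypothetical helper: match `prefix` literally, then `count` arbitrary bytes."""
    # re.escape keeps metacharacters such as b"." or b"*" literal.
    # %b and %d are printf-style bytes formatting (Python 3.5+);
    # braces need no escaping here, unlike in an f-string.
    return re.compile(b"%b.{%d}" % (re.escape(prefix), count))

data = b"\xab\x66\x77\x88"

print(prefix_pattern(b"\xab").fullmatch(data))  # a match object
print(prefix_pattern(b".").fullmatch(data))     # None: b"." is matched literally
```

As in the original pattern, `.` does not match b"\n" unless `re.DOTALL` is passed; the sketch leaves that behavior unchanged.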