I have a special use case which I do not yet know how to cover. I want to dissect a string based on field_name/field_length. For that I define a regex like this:
'(?P<%s>.{%d})' % (field_name, field_length)
And this is repeated for all fields.
I have also a regex to remove spaces to the right of each field:
self.re_remove_spaces = re.compile(' *$')
This way I can get each field like this:
def dissect(self, str):
data = { }
m = self.compiled.search(str)
for field_name in self.fields:
value = m.group_name(field_name)
value = re.sub(self.re_remove_spaces, '', value)
data[field_name] = value
return data
I have to perform this processing for millions of strings, so it must be efficient.
What annoys me is that I would prefer to perform the dissection + space removal in a single step, using compiled.sub instead of compiled.search, but I do not know how to do this.
Specifically, my question is:
How do I perform regex substitution combining it with named groups in Python regexes?
'(?<%s>.%d)' % ..., which would become something like'(?<name>.12)') is not a valid Python regex.