I have the following string:

schema(field1, field2, field3, field4 ... fieldn)

I need to transform this string into an object whose name attribute is schema and whose field names are stored in a second attribute as a list.

How do I do this in Python with a regular expression?

3 Answers

Are you looking for something like this?

>>> import re
>>> s = 'schema(field1, field2, field3, field4, field5)'
>>> name, _, fields = s[:-1].partition('(')
>>> fields = fields.split(', ')
>>> if not all(re.match(r'[a-z]+\d+$', i) for i in fields):
...     print('bad input')
...
>>> sch = type(name, (object,), {'attr': fields})
>>> sch
<class '__main__.schema'>
>>> sch.attr
['field1', 'field2', 'field3', 'field4', 'field5']
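
If the input has to be validated rather than just reported, the same partition-based idea could be wrapped in a function that raises on malformed strings. This is only a sketch: parse_schema, FIELD_RE and the identifier rule it encodes are assumptions for illustration, and it returns a types.SimpleNamespace instead of a class built with type().

```python
import re
from types import SimpleNamespace

# Hypothetical rule: a letter or underscore, then word characters.
FIELD_RE = re.compile(r'[A-Za-z_]\w*')


def parse_schema(s):
    """Split 'name(f1, f2, ...)' into a name and a list of fields."""
    name, paren, rest = s.partition('(')
    # Require a valid name, an opening paren, and a closing paren at the end.
    if not FIELD_RE.fullmatch(name) or not paren or not rest.endswith(')'):
        raise ValueError('bad input: %r' % s)
    fields = rest[:-1].split(', ')
    if not all(FIELD_RE.fullmatch(f) for f in fields):
        raise ValueError('bad field in: %r' % s)
    return SimpleNamespace(name=name, fields=fields)


obj = parse_schema('schema(field1, field2, field3)')
print(obj.name, obj.fields)
```

Unlike s[:-1], which drops the last character whatever it is, this version checks that the string really ends with a closing parenthesis.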

7 Comments

Thanks but I am looking for a solution that, in the process, also allows me to validate that the string is in the format specified above.
@WoLpH: partition is faster.
+1 for using type() to create a class on the fly, never seen it used quite like that before :)
Your idea is good, but you might want to compile the regex before using it, to speed up the matching.
@Elenaher: they're cached internally.

Regular expressions for things like that probably need tests:

import unittest

import re

# Verbose regular expression!  http://docs.python.org/library/re.html#re.X
p = r"""

(?P<name>[^(]+)         # Match the pre-open-paren name.
\(                      # Open paren
(?P<fields>             # Comma-separated fields
    (?:
        [a-zA-Z0-9_-]+
        (?:,\ )         # Subsequent fields must be separated by a comma and a space
    )*
    [a-zA-Z0-9_-]+       # At least one field. No trailing comma or space allowed.
)

\)                      # Close-paren
"""

# Compiled for speed!
cp = re.compile(p, re.VERBOSE)

class Foo(object):
    pass


def validateAndBuild(s):
    """Validate a string and return a built object.
    """
    # fullmatch (Python 3.4+) rejects extra text before or after the schema.
    match = cp.fullmatch(s)
    if match is None:
        raise ValueError('Bad schema: %s' % s)

    schema = match.groupdict()
    foo = Foo()
    foo.name = schema['name']
    foo.fields = schema['fields'].split(', ')

    return foo



class ValidationTest(unittest.TestCase):
    def testValidString(self):
        s = "schema(field1, field2, field3, field4, fieldn)"

        obj = validateAndBuild(s)

        self.assertEqual(obj.name, 'schema')

        self.assertEqual(obj.fields, ['field1', 'field2', 'field3', 'field4', 'fieldn'])

    invalid = [
        'schema field1 field2',
        'schema(field1',
        'schema(field1 field2)',
        ]

    def testInvalidString(self):
        for s in self.invalid:
            self.assertRaises(ValueError, validateAndBuild, s)


if __name__ == '__main__':
    unittest.main()

7 Comments

how's that any different from my answer? except having all redundant testing code and an ugly regex?
@David, how do I change the regex to make the space between the fields optional?
On line 13, change \ ) to \ ?). This makes the escaped space optional. (See the section "Quantifiers" at <regular-expressions.info/reference.html>.)
Because regexes are supposed to look like Perl (incomprehensible)
I personally like to be able to comprehend my regex months later.
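
One thing to keep in mind with the optional-space change discussed in these comments: validateAndBuild splits on the literal ', ', so a string like 'schema(a,b)' would then validate but come back as a single field. A possible adjustment is to split with a regular expression too. The sketch below uses hypothetical names (SCHEMA_RE, parse) and \s* in place of the escaped space, so it is a variant of the answer's pattern rather than the original.

```python
import re

# Assumed variant of the pattern above: whitespace after a comma is optional.
SCHEMA_RE = re.compile(r"""
    (?P<name>[^(]+)                 # name before the open paren
    \(
    (?P<fields>
        (?:[a-zA-Z0-9_-]+,\s*)*     # zero or more fields followed by a comma
        [a-zA-Z0-9_-]+              # final field, no trailing comma
    )
    \)
""", re.VERBOSE)


def parse(s):
    match = SCHEMA_RE.fullmatch(s)
    if match is None:
        raise ValueError('Bad schema: %s' % s)
    # Split on a comma plus any following whitespace, mirroring the pattern.
    return match.group('name'), re.split(r',\s*', match.group('fields'))


print(parse('schema(field1,field2, field3)'))
```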

You could use something like the following. It works in two rounds because Python's re module keeps only the last match of a repeated capture group (thanks to SilentGhost for pointing this out):

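import re

# Raw strings keep the backslashes intact for the regex engine.
# Note: these patterns accept lowercase letters and commas only.
pattern = re.compile(r"^([a-z]+)\(([a-z,]*)\)$")

ret = pattern.match(s)

if ret is None:
    ...
else:
    f = ret.groups()
    name = f[0]
    args = f[1]

    arg_pattern = re.compile(r"^([a-z]+)(,[a-z]+)*$")

    ret2 = arg_pattern.match(args)

    # same checking as above
    if ret2 is None:
        ...
    else:
        # Caveat: the repeated group keeps only its last match, so this
        # yields the first field and the last ",field" only.
        args_f = ret2.groups()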
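
Since groups() cannot return every repetition of a repeated group, one workaround (in the spirit of the split-and-check suggestion in the comments) is to validate the overall shape with one pattern and then split the argument text. A minimal sketch; split_schema, OUTER, FIELD and the accepted character sets are assumptions for illustration:

```python
import re

# Outer shape: lowercase name, then anything without parens inside ( ).
OUTER = re.compile(r'^([a-z]+)\(([^()]*)\)$')
# Each field: lowercase letters, optionally followed by digits (assumed rule).
FIELD = re.compile(r'^[a-z]+[0-9]*$')


def split_schema(s):
    ret = OUTER.match(s)
    if ret is None:
        raise ValueError('not of the form name(...)')
    name, args = ret.groups()
    # Split on commas and ignore whitespace around each field.
    fields = [a.strip() for a in args.split(',')]
    if not all(FIELD.match(a) for a in fields):
        raise ValueError('bad field list: %r' % args)
    return name, fields


print(split_schema('schema(field1, field2, field3, field4)'))
```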

3 Comments

it only works with two arguments, Python re doesn't support nested captures
Does it work for fields > 2? I tried with four fields and print fields prints schema, first and last. Error?
you could fix it by splitting input string and checking each element independently (see my answer).
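
The "first and last" result described in these comments comes from how re treats a repeated capturing group: the group is overwritten on each repetition, so only the final one is reported. A small check of that behaviour, with re.findall shown as one possible way to collect every item:

```python
import re

m = re.match(r'^([a-z]+)(,[a-z]+)*$', 'a,b,c,d')
print(m.groups())   # ('a', ',d'): group 2 holds only the last repetition

# findall returns every non-overlapping match instead.
print(re.findall(r'[a-z]+', 'a,b,c,d'))   # ['a', 'b', 'c', 'd']
```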
