I'm trying to search a file for expressions such as A*B.

A and B can consist of any characters from [A-Z], [a-z] and [0-9], and may include < > ( ) [ ] _ . etc., but not commas, semicolons, whitespace, newlines or any arithmetic operator (+ - \ *). These are the 8 delimiters. There can also be spaces between A, * and B. In addition, the number of opening brackets must equal the number of closing brackets in both A and B.

I unsuccessfully tried something like this (not taking into account operators inside A and B):

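import re

# Attempted pattern: a separator, anything, a literal '*', anything, a separator
p = re.compile(r"( |,|;)(.*)[*](.*)( |,|;|\n)")

with open("test", "r") as fp:
    for line in fp:
        m = p.match(line)  # match() only tries at the start of the line
        if m:
            print('Match found ', m.group())
        else:
            print('No match')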

Example 1:

(A1 * B1.list(), C * D * E) should give 3 matches:

  1. A1 * B1.list()
  2. C * D
  3. D * E

An extension to the problem statement could be that commas, semicolons, whitespace, newlines and arithmetic operators (+ - \ *) are allowed in A and B if they appear inside brackets:

Example 2:

(A * B.max(C * D, E)) should give 2 matches:

  1. A * B.max(C * D, E)
  2. C * D

I'm new to regular expressions and curious to find a solution to this.

  • Could you furnish some examples, please? Commented Aug 24, 2015 at 13:31
  • Use search; match only tries to match from the beginning. Commented Aug 24, 2015 at 13:31
  • You probably want to search for one or more non-separator characters, followed by one or more separators, followed by some non-separators again. Check out the ^. Commented Aug 24, 2015 at 13:35
  • Regular expressions are not a good tool for this particular task. Consider creating a simple parser. Commented Aug 24, 2015 at 13:58
  • This regex is too clumsy, but working for 1 nested level. Commented Aug 24, 2015 at 14:14
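
For the flat case only (no delimiters inside brackets, no bracket counting), the suggestions in the comments can be sketched roughly as follows. This is just one possible reading of them, not code posted by any commenter: the names OPERAND, FLAT_PRODUCT and find_flat_products are made up for illustration, and the pattern assumes the 8 delimiters listed in the question. The lookahead is there so that overlapping matches such as C * D and D * E are both reported.

```python
import re

# One operand: a run of characters that are none of the 8 delimiters
# (whitespace incl. newline, comma, semicolon, + - \ *).
OPERAND = r"[^\s,;+\-\\*]+"

# The lookbehind only lets an attempt start where an operand starts.
# The lookahead captures "A * B" without consuming it, so the right
# operand of one match can also be the left operand of the next.
FLAT_PRODUCT = re.compile(
    r"(?<![^\s,;+\-\\*])(?=(" + OPERAND + r"\s*\*\s*" + OPERAND + r"))"
)


def find_flat_products(text):
    """Return every 'A * B' found anywhere in text (hypothetical helper)."""
    # findall() scans the whole string, unlike match(), which only
    # tries at position 0.
    return FLAT_PRODUCT.findall(text)


print(find_flat_products("A1 * B1.list(), C * D * E"))
# ['A1 * B1.list()', 'C * D', 'D * E']
```

Note that this does not check bracket balance: with the outer parentheses of Example 1 left in place, the leading ( is absorbed into the first operand and the trailing ) into the last one.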

1 Answer

Regular expressions have limits, and the border between regular expressions and text parsing can be thin. IMO, using a parser is a more robust solution in your case.

The examples in the question suggest recursive patterns: an operand can itself contain another A*B inside its brackets. A parser is again superior to any regex flavor in this area.

Have a look at this proposed solution: Equation parsing in Python.
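
To give an idea of what a hand-written scanner for this could look like, here is a minimal sketch. It is not the approach from the linked answer, and the names (find_products, _scan_operand) are invented for this example. It assumes that only ( ) and [ ] count as brackets (< and > are treated as ordinary characters), and that delimiters are allowed inside balanced brackets, as in the extension described in the question. For every *, it walks outwards in both directions while keeping track of the bracket depth.

```python
OPEN, CLOSE = "([", ")]"      # assumption: < and > are ordinary characters
DELIMITERS = ",;+-\\*"        # plus any whitespace, checked separately


def _scan_operand(text, i, step):
    """Walk from index i in direction step (+1 or -1) over one operand.

    Returns the index just past the operand in that direction, or None
    if there is no operand or its brackets are unbalanced.
    """
    # Walking leftwards, a closing bracket is the one that opens a group.
    opening, closing = (OPEN, CLOSE) if step == 1 else (CLOSE, OPEN)
    while 0 <= i < len(text) and text[i].isspace():
        i += step                      # spaces between '*' and the operand
    start, depth = i, 0
    while 0 <= i < len(text):
        ch = text[i]
        if ch in opening:
            depth += 1
        elif ch in closing:
            if depth == 0:
                break                  # bracket of an enclosing expression
            depth -= 1
        elif depth == 0 and (ch.isspace() or ch in DELIMITERS):
            break                      # delimiter outside any brackets
        i += step
    if depth != 0 or i == start:
        return None
    return i


def find_products(text):
    """Return every 'A * B' in text, including those nested in brackets."""
    found = []
    for pos, ch in enumerate(text):
        if ch != "*":
            continue
        left = _scan_operand(text, pos - 1, -1)
        right = _scan_operand(text, pos + 1, +1)
        if left is not None and right is not None:
            found.append(text[left + 1:right])
    return found


print(find_products("(A1 * B1.list(), C * D * E)"))
# ['A1 * B1.list()', 'C * D', 'D * E']
print(find_products("(A * B.max(C * D, E))"))
# ['A * B.max(C * D, E)', 'C * D']
```

Because the depth counter does the work that a regex would need recursion for, the nested C * D in Example 2 is found by the same loop as the outer expression.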
