
I am trying to implement a program that takes a file, finds all the regex matches in it, and concatenates the specific matches I want into a single string, which is then written to a file.

import re
import sys

f = open ('input/' + sys.argv[1], "r")
fd = f.read()
s = ''

pattern = re.compile(r'(?:(&#\d*|>))(.*?)(?=(&#\d*|<))')

for e in re.findall(pattern, fd, re.S)
        s += e[1]

f.close()
o = open ( 'output' + sys.argv[1], 'w', 0)
o.write(s)
o.close()

However, when I try to run this, I get the following error:

 File "./regex.py", line 8
    for e in re.findall(pattern, fd, re.S)

If

  • The error message is missing from your question. Commented Jul 30, 2015 at 21:00

2 Answers


You forgot a colon at the end of that line.

for e in re.findall(pattern, fd, re.S):

You seem to have chopped off the type of the error (SyntaxError, I imagine), but that information is very helpful. Seeing SyntaxError rather than some other type tells you that the error has nothing to do with regexes: Python rejected the file while parsing it, before any of the code ran.
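As a rough illustration of that point, the sketch below compiles a small loop with and without the trailing colon and reports the exception type. The snippet strings and the helper name syntax_error_type are made up for this example; only the built-in compile() is used, and no regex code is involved.

```python
# Hypothetical snippets: the same loop header without and with its colon.
broken_source = "for e in [1, 2, 3]\n    print(e)\n"
fixed_source = "for e in [1, 2, 3]:\n    print(e)\n"


def syntax_error_type(source):
    """Return the exception class name raised while compiling, or None."""
    try:
        # compile() only parses the source; nothing in it is executed.
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return type(exc).__name__
    return None


error_name = syntax_error_type(broken_source)
fixed_name = syntax_error_type(fixed_source)
print(error_name)
print(fixed_name)
```

Because the failure happens at compile time, the body of the loop (and anything regex-related in it) is never reached.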



Not directly related to the original question (it was indeed a missing colon), but I suggest taking a different approach to string concatenation. Repeatedly appending new chunks can create a new string each time (because strings are immutable). A better way is to create an accumulator list, append each matched string to it, and then join the pieces into a single string with ''.join(my_list_with_matches).
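A minimal sketch of the accumulator approach, reusing the pattern from the question. The sample string is a made-up stand-in for the file contents, and the name matches is only illustrative. One assumption worth flagging: re.S is passed to re.compile here, because the module-level re functions raise a ValueError when given a flags argument together with an already compiled pattern.

```python
import re

# Pattern from the question; the flag is applied at compile time.
pattern = re.compile(r'(?:(&#\d*|>))(.*?)(?=(&#\d*|<))', re.S)

# Hypothetical input standing in for the contents of the file.
sample = "<p>Hello</p><b>World</b>"

# Collect the wanted group from each match in a list.
matches = []
for e in pattern.findall(sample):
    matches.append(e[1])  # e is a tuple of the three groups

# Build the final string once, instead of growing it on every iteration.
s = ''.join(matches)
print(s)
```

The same idea applies to the original script: fill the list inside the loop, then call join once before writing to the output file.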
