I was wondering how I can print all the text between a \begin statement and an \end statement. Also, how can I keep from printing certain words located between these two statements? This is my code now.

content=open("file", "r")
print content
content.read()

while len(content.split(start,1)) > 1:
    start=("\begin")
    end=("\end")
    s=content
    print find_between( s, "\begin", "\end" )


def find_between( s, first, last ):
    try:
        start = s.index( first ) + len( first )
        end = s.index( last, start )
        return s[start:end]
    except ValueError:
        return ""



print find_between( s, "\begin", "\end" )

  • Are you trying to process LaTeX files? Commented Sep 27, 2013 at 22:09
  • What's the problem with your current code? Commented Sep 27, 2013 at 22:29

3 Answers

1

This example presumes you don't mind losing the data on the \begin and \end lines themselves. It prints every line that falls between a \begin line and an \end line, for every such block in the file.

f = open("file", "r")

content = f.readlines()

f.close()

start = "\\begin"
end = "\\end"

print "Start ==", start, "End ==", end

printlines = False

for line in content:

    if start in line:
        printlines = True
        continue

    if end in line:
        printlines = False
        continue

    if printlines:
        print line

Input file -

test
\begin do re me fa
so la te do.


do te la so \end fa me re do

Output -

Start == \begin End == \end
so la te do.
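
If you do want to keep the text that shares a line with \begin or \end, here is a minimal sketch of one way to do it. The function name text_between and its parameters are made up for illustration, and it assumes the markers are plain substrings rather than full LaTeX environments. Text from several blocks is simply joined together with no separator.

```python
def text_between(lines, start="\\begin", end="\\end"):
    """Collect the text between start and end markers, keeping partial lines."""
    pieces = []
    inside = False
    for line in lines:
        pos = 0
        while True:
            if not inside:
                # Look for the next start marker on this line.
                i = line.find(start, pos)
                if i == -1:
                    break
                pos = i + len(start)
                inside = True
            else:
                # Look for the end marker; keep everything before it.
                j = line.find(end, pos)
                if j == -1:
                    pieces.append(line[pos:])
                    break
                pieces.append(line[pos:j])
                pos = j + len(end)
                inside = False
    return "".join(pieces)
```

With the input file above, this should return " do re me fa", the lines that follow, and "do te la so " from the \end line, instead of dropping both marker lines.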

0

Assuming there is only one "\begin" to "\end" block in the file:

f = open('file', 'r')

between = ''
in_statement = False

for line in f:
    if '\\begin' in line:  # escape the backslash: '\b' on its own is a backspace character
        in_statement = True
    if in_statement:
        between += line
    if '\\end' in line:
        in_statement = False
        break

print between
f.close()
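
If the file might contain more than one block, the same flag idea can be turned into a generator. This is only a sketch: iter_blocks is a hypothetical name, and like the code above it keeps the marker lines whole and assumes blocks are not nested.

```python
def iter_blocks(lines, start="\\begin", end="\\end"):
    """Yield each run of lines from a start-marker line to an end-marker line."""
    block = None  # None means we are currently outside a block
    for line in lines:
        if block is None and start in line:
            block = []
        if block is not None:
            block.append(line)
            if end in line:
                yield "".join(block)
                block = None
```

You could then loop over it with something like `for block in iter_blocks(open('file'))` and handle each block separately.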

0

Regex is good for this sort of thing.

In [152]: import re
In [153]: s = 'this is some \\begin string that i need to check \\end some more\\begin and another \\end stuff after'
In [167]: re.findall(r'\\begin(.*?)\\end', s)
[' string that i need to check ',
 ' and another ']

The regex:

Use a raw string (r'...') so that Python leaves the backslashes alone and passes them straight to the regex parser. Backslash also means 'special' inside a regex, so you need \\ in the pattern to match one literal backslash; \\begin and \\end therefore match the literal text \begin and \end. In (.*?), the dot matches any character except a newline, and * means match 0 or more repetitions. The ? turns off greedy behaviour; otherwise the pattern would match everything between the FIRST \begin and the LAST \end, instead of each pair in between.

findall then gives you a list with the captured text from every match.
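
Because the dot does not match newlines by default, the pattern as written only finds blocks that start and end on the same line. Here is a rough sketch that adds re.DOTALL for multi-line blocks and also drops unwanted words, which was the second part of the question. The name blocks_without and the banned argument are made up for this example. It compares whitespace-separated tokens exactly (so "do." is not the same word as "do") and collapses the whitespace inside each block.

```python
import re

# DOTALL lets "." match newlines, so a block may span several lines.
BLOCK = re.compile(r"\\begin(.*?)\\end", re.DOTALL)

def blocks_without(text, banned=()):
    """Return the body of each block with the banned words removed."""
    results = []
    for body in BLOCK.findall(text):
        # Split on whitespace and keep only the words that are not banned.
        kept = [word for word in body.split() if word not in banned]
        results.append(" ".join(kept))
    return results
```

For a file you would pass in the whole contents, e.g. the result of f.read(), rather than going line by line.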
