I am trying to extract LaTeX code from an HTML string that looks like this:

string = " your answer is wrong! Solution: based on \((\vec{n_E},\vec{g})= 0 \) and \(d(g,E)=0\) beeing ... "

I want to replace every LaTeX fragment with the output of a function that takes the fragment as its argument. (Because I cannot get the pattern right yet, the function extract just returns an empty string for the moment.)

I tried:

latex_end = "\)"
latex_start = "\("    
string = re.sub(r'{}.*?{}'.format(latex_start, latex_end), extract, string)

Result:

your answer is wrong! Solution: based on \= 0 \) and \=0\) beeing ...

Expected:

your answer is wrong! Solution: based on and beeing ...

Any idea why it does not find the pattern? Is there a way to implement it?

2 Answers

1

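You should define string as a raw string literal, since otherwise \v is interpreted as an escape sequence (a vertical tab). The pattern also has to match the literal backslash in front of each parenthesis, which is written as \\ inside the regex.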

import re

string = r" your answer is wrong! Solution: based on \((\vec{n_E},\vec{g})= 0 \) and \(d(g,E)=0\) beeing ... "


string = re.sub(r'\\\(.*?\\\)', '', string)
print(string)

Prints:

 your answer is wrong! Solution: based on  and  beeing ...

If you need to have variables for the start and end:

latex_end = r"\\\)"
latex_start = r"\\\("    
string = re.sub(r'{}.*?{}'.format(latex_start, latex_end), '', string)
print(string)
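
If the delimiters are kept as plain text, re.escape can build the pattern so the backslashes do not have to be escaped by hand, and a capture group lets the replacement function see only the LaTeX between the delimiters. A minimal sketch, assuming a hypothetical extract_latex helper that stands in for the question's extract function (note that re.sub passes a match object to the callable, not a string); the variable names text, pattern and result are also illustrative:

```python
import re

text = r" your answer is wrong! Solution: based on \((\vec{n_E},\vec{g})= 0 \) and \(d(g,E)=0\) beeing ... "

# Plain-text delimiters; re.escape adds the regex escaping for us.
latex_start = r"\("
latex_end = r"\)"
pattern = re.compile(re.escape(latex_start) + r"(.*?)" + re.escape(latex_end))


def extract_latex(match):
    """Hypothetical stand-in for extract: receives a match object."""
    latex_code = match.group(1)  # text between \( and \)
    # Wrap the fragment in brackets just to make the replacement visible.
    return "[{}]".format(latex_code.strip())


result = pattern.sub(extract_latex, text)
print(result)
```

Returning an empty string from extract_latex instead would reproduce the plain removal shown above.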

1

This happens because the backslash is an escape character both in Python string literals and in regular expressions, which makes cases like this tricky. Here are two quick ways to make it work:

import re

extract = lambda a: ""

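# Using no raw strings: every literal backslash is doubled by hand
string = " your answer is wrong! Solution: based on \\((\\vec{n_E},\\vec{g})= 0 \\) and \\(d(g,E)=0\\) beeing ... "
# Each bound is the regex \\\( or \\\): a literal backslash followed by a literal parenthesis
latex_bounds = ("\\\\\\(", "\\\\\\)")
print(re.sub('{}.*?{}'.format(*latex_bounds), extract, string))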

# Using raw strings throughout (backslashes in the literals are kept as-is)
string = r" your answer is wrong! Solution: based on \((\vec{n_E},\vec{g})= 0 \) and \(d(g,E)=0\) beeing ... "
latex_bounds = (r"\\\(", r"\\\)")
print(re.sub(r'{}.*?{}'.format(*latex_bounds), extract, string))
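
The effect of the string prefix can be checked directly. A small sketch (the names plain and raw are illustrative, not from the code above) showing that a normal literal turns \v into a single vertical-tab character, while a raw literal keeps the backslash and the v:

```python
plain = "\vec{g}"   # \v is the vertical-tab escape, so the backslash is gone
raw = r"\vec{g}"    # raw literal: backslash and 'v' are both kept

print(repr(plain))           # shows the vertical tab as \x0b
print(repr(raw))             # shows the backslash doubled, as repr does
print(len(plain), len(raw))  # the raw version is one character longer
```

This is why a pattern that looks for a backslash can behave differently depending on how the input string was written in the source.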
