I need to extract a JavaScript variable containing multiline JSON from a remote page using a Python 2.7 script. I want to use a regex to do this, but my pattern does not return anything.

What am I doing wrong?

Here's my code:

import re
import urllib2

request = urllib2.Request("http://somesite.com/affiliates/")
result = urllib2.urlopen(request)
affiliates = re.findall('#var affiliates = (.*?);\s*$#m', result.read())
print affiliates

1 Answer

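If you look at the docs for re.findall(pattern, string, flags=0), you'll see that flags are passed as a separate argument. Python patterns do not take the #...#m delimiter-and-modifier wrapper used in PHP, so the # characters and the trailing m are matched literally and nothing is found. Because the JSON spans several lines, you also need re.S so that . matches newlines:

affiliates = re.findall(r'var affiliates = (.*?);\s*$', result.read(), re.M | re.S)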

You might also want to allow for sloppy whitespace in the JavaScript, for example by using \s+ after var and \s* around the = so that var affiliates={...} still matches.
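
As a rough sketch of how the pieces could fit together, the snippet below pulls the variable out of a page and parses it with json. SAMPLE_PAGE, AFFILIATES_RE and extract_affiliates are hypothetical names, and the sample text only stands in for result.read(); the real page may look different. It also assumes the value is valid JSON and that no line inside it ends with a semicolon, since the lazy match stops at the first ; that closes a line.

```python
import json
import re

# Hypothetical page content standing in for result.read(); the real page may differ.
SAMPLE_PAGE = """<script>
var affiliates   =  {
    "partners": [
        {"name": "alpha", "id": 1},
        {"name": "beta", "id": 2}
    ]
};
var other = 1;
</script>"""

# re.M lets $ match at the end of each line; re.S lets . match newlines.
# \s+ and \s* tolerate irregular whitespace around the variable name and the =.
AFFILIATES_RE = re.compile(r'var\s+affiliates\s*=\s*(.*?);\s*$', re.M | re.S)


def extract_affiliates(page):
    """Return the parsed JSON assigned to `affiliates`, or None if not found."""
    match = AFFILIATES_RE.search(page)
    if match is None:
        return None
    return json.loads(match.group(1))


affiliates = extract_affiliates(SAMPLE_PAGE)
```

In the original script you would call extract_affiliates(result.read()) instead. Using search rather than findall returns only the first assignment, which is usually what you want for a single variable.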
