I'm using the scrape.py library to scrape a website (the library and its documentation can be found at http://zesty.ca/scrape/).

There is a button on the page that I want the session to press, but I don't understand exactly how to use the submit function. As I understand it, I am supposed to give it a Region object for a form. The button itself is an HTML input element. I tried passing both the form and the input, and I get the same error every time.

My code (on Google App Engine):

s.go(url)
form = s.doc.first(name="form1")
s.submit(region=form)

or

s.go(url)
input = s.doc.first(tagname="input", id="blabla")
s.submit(region=input)

and the error:

ERROR    2011-05-01 23:37:18,673 __init__.py:427] sequence item 0: expected string, NoneType found
Traceback (most recent call last):
  File "\appengine\ext\webapp\__init__.py", line 636, in __call__
    handler.post(*groups)
  File "main.py", line 135, in post
    s.submit(region=form)
  File "scrape.py", line 342, in submit
    return self.go(url, p, redirects)
  File "scrape.py", line 288, in go
    self.cookiejar)
  File "scrape.py", line 176, in fetch
    data = urlencode(data)
  File "scrape.py", line 409, in urlencode
    for key, value in params.items()]
  File "scrape.py", line 405, in urlquote
    return ''.join(map(urlquoted.get, text))
TypeError: sequence item 0: expected string, NoneType found

2 Answers

I know this question is a year old, but I am currently using scrape.py and know the answer, so I am adding it for those who come after.

The problem is in the call to submit.

Instead of s.submit(region=form), it should be s.submit(form).

The reason is that the variable form already holds a Region object (it prints as something like <Region 1254:1250>), so you don't need to tell scrape.py that it is a region; submit expects one as its first argument.

So it probably has nothing to do with JavaScript.
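
For what it's worth, the last frame of the traceback fails while URL-encoding the form parameters, which happens before any request is sent. The sketch below reproduces that kind of failure without scrape.py. The names quoted, quote and try_quote and the contents of the table are hypothetical stand-ins, not the library's code; I am only assuming that the real lookup table does not cover every possible character. It is written for Python 3, where the message reads "expected str instance" rather than "expected string".

```python
import string

# Hypothetical stand-in for the lookup table on the last traceback line.
# ASCII characters are percent-encoded, except a few safe ones.
quoted = {chr(i): '%%%02X' % i for i in range(128)}
for c in string.ascii_letters + string.digits + '-_.':
    quoted[c] = c


def quote(text):
    # dict.get returns None for any character missing from the table,
    # and str.join refuses to join None.
    return ''.join(map(quoted.get, text))


def try_quote(text):
    # Return the quoted text, or the error message if quoting fails.
    try:
        return quote(text)
    except TypeError as exc:
        return 'TypeError: %s' % exc


print(try_quote('a b'))
print(try_quote(u'\u05e9alom'))
```

The first call succeeds; the second fails on item 0 because the first character is not in the table. If the same thing is happening in your case, it would be worth checking the values of the form's fields for anything unusual.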

My assumption is that the button and the form were driven by JavaScript, so scrape.py probably couldn't work with them. In that case you need a library that supports JavaScript, such as Selenium or Windmill.
