
I'm making a program that uses Google to search, but I can't because of an HTTP Error 403. Is there any way around it? I'm using mechanize to browse. Here is my code:

from mechanize import Browser

inp = raw_input("Enter Word: ")
Word = inp

SEARCH_PAGE = "https://www.google.com/"

browser = Browser()
browser.open( SEARCH_PAGE )       # load the Google homepage
browser.select_form( nr=0 )       # select the first form on the page (the search box)

browser['q'] = Word               # fill in the query field
browser.submit()                  # this is the line that raises HTTP Error 403

Here is the error message:

Traceback (most recent call last):
File "C:\Python27\Project\Auth2.py", line 16, in <module>
browser.submit()
File "C:\Python27\lib\site-packages\mechanize\_mechanize.py", line 541, in submit
return self.open(self.click(*args, **kwds))
File "C:\Python27\lib\site-packages\mechanize\_mechanize.py", line 203, in open
return self._mech_open(url, data, timeout=timeout)
File "C:\Python27\lib\site-packages\mechanize\_mechanize.py", line 255, in _mech_open
raise response
httperror_seek_wrapper: HTTP Error 403: request disallowed by robots.txt

Please help, and thank you.

  • You're going to end up getting banned temporarily by Google if you do this too many times. Using Google search programmatically is a paid service, provided by the Custom Search API (100 free queries per day for development). Commented Apr 18, 2013 at 22:48
  • This problem looks awfully similar to urllib2.HTTPError: HTTP Error 403: Forbidden. Commented Nov 6, 2017 at 16:32

2 Answers


You can tell Mechanize to ignore the robots.txt file:

browser.set_handle_robots(False)

2 Comments

Now I'm getting this: httperror_seek_wrapper: HTTP Error 403: Forbidden
@ChristianCareaga: You have to change your user agent: views.scraperwiki.com/run/python_mechanize_cheat_sheet?
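Putting the answer and the user-agent comment together, here is a minimal sketch of the original script with both workarounds applied. The Firefox-style User-agent string is just an illustrative placeholder, not something mechanize requires:

from mechanize import Browser

browser = Browser()
browser.set_handle_robots(False)     # stop honouring robots.txt
browser.addheaders = [("User-agent",
                       "Mozilla/5.0 (Windows NT 6.1; rv:20.0) "
                       "Gecko/20100101 Firefox/20.0")]   # replace mechanize's default UA

word = raw_input("Enter Word: ")

browser.open("https://www.google.com/")
browser.select_form(nr=0)            # the first form is the search box
browser["q"] = word
response = browser.submit()
print response.geturl()              # URL of the results page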

Mechanize tries to respect the crawling limitations announced by the site's /robots.txt file. Here, Google does not want crawlers to index its search pages.

You can ignore this limitation:

browser.set_handle_robots(False)

as stated in Web Crawler - Ignore Robots.txt file?

Also, I would recommend using Google's Custom Search API instead, which exposes a proper API with easily parseable results.
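For completeness, a minimal sketch of such a Custom Search API call, assuming you have created an API key and a search engine ID in the Google API Console (the API_KEY and CX values below are hypothetical placeholders you would fill in):

import json
import urllib
import urllib2

# Hypothetical credentials -- obtain these from the Google API Console.
API_KEY = "your-api-key"
CX = "your-custom-search-engine-id"

word = raw_input("Enter Word: ")

# Build the Custom Search JSON API request.
params = urllib.urlencode({"key": API_KEY, "cx": CX, "q": word})
url = "https://www.googleapis.com/customsearch/v1?" + params

results = json.load(urllib2.urlopen(url))
for item in results.get("items", []):
    print item["title"], "-", item["link"]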

