Python error - redirect trying to parse webpage

Question

from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen("http://www.animeplus.tv/anime-show-list/")
content = html.read()
soup = BeautifulSoup(content, "html.parser")
print(soup.prettify())

The script works fine with other webpages, but when I run it against my target website I get:

<meta .$_server["request_uri"]."'"="" content="0;URL='" http-equiv="refresh"/>

I do not really understand this HTML.

I assume it's some sort of redirect or way to prevent web scraping.

Is there a way for Python to access the page after the redirect, so that I get the same source code a browser would show?

Thank you!

Getting the page via curl also returns the same response. I tried following redirects and changing the user agent, but no luck :(
– Jess, Commented Jun 28, 2014 at 4:11

alecxe · Accepted Answer · 2014-06-28 03:30:56Z

The trick here is that the page redirects to itself and sets a cookie on the first response. That cookie is important: without it, you do not get the HTML you see in the browser.

Here's the solution using requests - opening up the same page in the same session:

import requests
from bs4 import BeautifulSoup

url = "http://www.animeplus.tv/anime-show-list/"
session = requests.Session()  # a session keeps cookies between requests
session.get(url)  # first request: the server sets the cookie
response = session.get(url)  # second request: the cookie is sent back
soup = BeautifulSoup(response.content, "html.parser")
print(soup.title.text)  # prints: "Watch Anime | Anime Online | Free Anime | English Anime | Watch Anime Online - AnimePlus.tv"
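
If you would rather stay with urllib, as in the question, the same idea can probably be expressed with the standard library's http.cookiejar. The following is only a sketch under assumptions: the local test server, the handler class, and the cookie name `seen` are hypothetical stand-ins for a site that serves a meta refresh page until a cookie is present. I have not verified that the real site behaves exactly this way.

```python
import threading
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import HTTPCookieProcessor, ProxyHandler, build_opener

REFRESH_PAGE = b'<html><head><meta http-equiv="refresh" content="0"></head></html>'
REAL_PAGE = b"<html><head><title>Real page</title></head></html>"


class CookieGateHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a site that gates its content behind a cookie."""

    def do_GET(self):
        has_cookie = "seen=1" in (self.headers.get("Cookie") or "")
        body = REAL_PAGE if has_cookie else REFRESH_PAGE
        self.send_response(200)
        if not has_cookie:
            # First visit: hand out the cookie together with the refresh page.
            self.send_header("Set-Cookie", "seen=1; Path=/")
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch_twice(url, opener=None):
    """Request url twice through one cookie-aware opener; return both bodies."""
    if opener is None:
        opener = build_opener(HTTPCookieProcessor(CookieJar()))
    with opener.open(url) as first:
        first_body = first.read()
    with opener.open(url) as second:  # the cookie jar replays the cookie here
        second_body = second.read()
    return first_body, second_body


def run_demo():
    """Start the stand-in server on a free local port and fetch it twice."""
    server = HTTPServer(("127.0.0.1", 0), CookieGateHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # ProxyHandler({}) ignores proxy environment variables for the local demo.
        opener = build_opener(ProxyHandler({}), HTTPCookieProcessor(CookieJar()))
        return fetch_twice("http://127.0.0.1:%d/" % server.server_port, opener)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    first, second = run_demo()
    print(first)
    print(second)
```

Against a real site you would call `fetch_twice(url)` and pass the second body to BeautifulSoup. The point is the same as with the requests session: both requests must share one cookie store.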

Alternatively, you can use mechanize, but at the time of writing it doesn't support Python 3. Here's how it works:

>>> import mechanize
>>> browser = mechanize.Browser()
>>> browser.open('http://www.animeplus.tv/anime-show-list/')
>>> print browser.response().read()
<!DOCTYPE html>
<html>
<head>
  <title>Watch Anime | Anime Online | Free Anime | English Anime | Watch Anime Online - AnimePlus.tv</title> 
...
