I've tried a million different ways to parse out the Zestimate, but have yet to succeed.

Here's the HTML with the Zestimate info:

<span>
  <span tabindex="0" role="button">
    <span class="sc-bGbJRg iiEDXU ds-dashed-underline">
      Zestimate
    <sup>®</sup>
    </span>
  </span>
  :&nbsp;
  <span>$331,425</span>
</span>

Honestly, I thought this would get me close, but I get an empty list:

link = 'https://www.zillow.com/homedetails/1404-Clearwing-Cir-Georgetown-TX-78626/121721750_zpid/'
searched_word = '<span class="sc-bGbJRg iiEDXU ds-dashed-underline">Zestimate<sup>®</sup></span>'
test_page = requests.Session().get(link, headers=req_headers)
test_soup = BeautifulSoup(test_page.content, 'lxml')
results = test_soup('span',string='searched_word')
print(results)

1 Answer

To get the correct HTML from the site, add a User-Agent header to the request.

For example:

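import requests
from bs4 import BeautifulSoup


url = 'https://www.zillow.com/homedetails/1404-Clearwing-Cir-Georgetown-TX-78626/121721750_zpid/'
# Send a browser-like User-Agent; without it the site may not return the real page.
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0'}
soup = BeautifulSoup(requests.get(url, headers=headers).content, 'html.parser')

# Find the <h4> containing "Home value", then read the first <p> that follows it.
# Note: newer soupsieve releases spell this selector ':-soup-contains("Home value")'.
home_value = soup.select_one('h4:contains("Home value")').find_next('p').get_text(strip=True)
print(home_value)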

Prints:

$331,425
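
As a side note on why the attempt in the question returned an empty list: `string='searched_word'` passes the literal text `searched_word` rather than the variable, and `string=` is compared against a tag's text, not its markup. As far as I can tell it also will not match a `<span>` that holds both text and a `<sup>` child, because that tag's `.string` is `None`. A minimal offline sketch, assuming the fragment pasted in the question is what the page returns (the live markup may differ, and `find_zestimate` is just an illustrative name): search for the text node that mentions "Zestimate", then step forward to the next `<span>`.

```python
import re
from bs4 import BeautifulSoup

# HTML fragment copied from the question; nothing is fetched from the live site.
html = """
<span>
  <span tabindex="0" role="button">
    <span class="sc-bGbJRg iiEDXU ds-dashed-underline">
      Zestimate
    <sup>®</sup>
    </span>
  </span>
  :&nbsp;
  <span>$331,425</span>
</span>
"""


def find_zestimate(markup):
    """Return the text of the first <span> after the 'Zestimate' label, or None."""
    soup = BeautifulSoup(markup, "html.parser")
    # Match the text node itself instead of relying on generated class names.
    label = soup.find(string=re.compile("Zestimate"))
    if label is None:
        return None
    # The value sits in the next <span> that opens after the label text.
    value = label.find_next("span")
    return value.get_text(strip=True) if value else None


print(find_zestimate(html))
```

Matching on the visible label avoids depending on class names like `sc-bGbJRg`, which look auto-generated and may change between page builds.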

6 Comments

I keep getting "AttributeError: 'NoneType' object has no attribute 'find_next'"
@max Try print(soup) and verify that you aren't getting a captcha page.
@max Sounds like you may be getting a captcha. Modify @AndrejKesely's code so that headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0', 'content-type': 'text/html; charset=UTF-8'} and see if that works for you.
@AndrejKesely Ah dang, I didn't take the time to read the contents of that soup variable. It was a captcha page. I added a different header and it worked: req_headers = { 'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8', 'accept-encoding': 'gzip, deflate, br', 'accept-language': 'en-US,en;q=0.8', 'upgrade-insecure-requests': '1', 'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36' }
@AndrejKesely Thanks so much. You have no idea how much time I spent trying to figure out both of those issues. I kept getting NoneType and didn't realize why. At this point I'm over this whole thing, but anyway, thanks again.
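
For anyone hitting the same `'NoneType' object has no attribute 'find_next'` error discussed in the comments above: it usually means the heading was not in the HTML that came back (for example, a captcha page). A minimal sketch of a guarded lookup, assuming the "Home value" heading and following `<p>` from the answer; `extract_home_value` and both sample pages are hypothetical stand-ins, not real Zillow responses.

```python
from bs4 import BeautifulSoup


def extract_home_value(html):
    """Return the text of the <p> after the 'Home value' <h4>, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    # Look for the heading by its visible text; stay None if no <h4> matches.
    heading = None
    for h4 in soup.find_all("h4"):
        if "Home value" in h4.get_text():
            heading = h4
            break
    if heading is None:
        # Likely a captcha or otherwise unexpected page: report it, don't crash.
        return None
    value = heading.find_next("p")
    return value.get_text(strip=True) if value else None


# Hypothetical stand-ins for a normal response and a blocked one.
normal_page = "<h4>Home value</h4><p> $331,425 </p>"
blocked_page = "<html><body><h1>Please verify you're a human</h1></body></html>"

print(extract_home_value(normal_page))
print(extract_home_value(blocked_page))
```

Returning `None` lets the caller decide whether to retry with different headers or stop, instead of failing in the middle of a chained call.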