1

I'll preface this by saying that I've seen similar questions, but none of the solutions worked for me.

I'm looking for a specific class in my HTML page, but I always get None returned. Here are my attempts. I'm looking for the player tags with their names, e.g. 'Chase Young'.

from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
import requests

url = "https://www.nfl.com/draft/tracker/prospects/allPositions?college=allColleges&page=1&status=ALL&year=2020"

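# Fetch the page first: BeautifulSoup needs the response body, not the URL string
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')
match = soup.find('div', class_='css-gu7inl')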
print(match)
# Prints None

I tried another method to find the match, still returned None:

match = soup.find("div", {"class": "css-gu7inl"})  # match is still None

It appears that the HTML returned does not contain the whole webpage, so I tried using Selenium, as I've seen recommended in similar posts, and still got nothing:

driver = webdriver.Chrome("chromedriver")
driver.get(url)
soup = BeautifulSoup(driver.page_source, 'lxml')
items=soup.select(".css-gu7inl")
print(items) # Empty list

What am I doing wrong here?

3
  • I checked the Selenium approach and it does give the result you are looking for. What problem are you facing on your end? Commented Mar 25, 2020 at 16:44
  • I'm not sure why it works for you. I re-ran the code and I still get an empty list. Commented Mar 25, 2020 at 16:49
  • I guess the answer below should work. Basically, you need to wait until the browser has loaded all the content, and then parse the HTML to extract it. Maybe the page is still loading when your code reads it, and that's why you don't get a result for your div. Commented Mar 25, 2020 at 16:53

2 Answers

2

The data is rendered by JavaScript, so use WebDriverWait() to wait until the elements are visible, using visibility_of_all_elements_located().

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup

url='https://www.nfl.com/draft/tracker/prospects/allPositions?college=allColleges&page=1&status=ALL&year=2020'
driver = webdriver.Chrome()
driver.get(url)
WebDriverWait(driver,20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR,'.css-gu7inl')))
soup = BeautifulSoup(driver.page_source, 'lxml')
items=soup.select(".css-gu7inl")
Players=[item.select_one('a.css-1fwlqa').text for item in items]
print(Players) 

Output:

['chase young', 'jeff okudah', 'derrick brown', 'isaiah simmons', 'joe burrow', "k'lavon chaisson", 'jedrick wills', 'tua tagovailoa', 'ceedee lamb', 'jerry jeudy', "d'andre swift", 'c.j. henderson', 'mekhi becton', 'mekhi becton', 'patrick queen', 'henry ruggs iii', 'henry ruggs iii', 'javon kinlaw', 'laviska shenault jr.', 'yetur gross-matos']
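
For intuition on why the wait matters: an explicit wait is, conceptually, a polling loop that re-checks a condition until it holds or a timeout expires. The sketch below is not Selenium's implementation; `wait_until` and `FakePage` are hypothetical names, and the fake page only simulates content that shows up after a few checks, the way JavaScript-rendered elements appear some time after the initial load.

```python
import time


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if the timeout elapses first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakePage:
    """Hypothetical stand-in for a page whose content is rendered late."""

    def __init__(self, ready_after_checks):
        self.checks = 0
        self.ready_after_checks = ready_after_checks

    def find_players(self):
        self.checks += 1
        if self.checks < self.ready_after_checks:
            return []  # what parsing the not-yet-rendered page would see
        return ["chase young", "jeff okudah"]


page = FakePage(ready_after_checks=3)

# Reading the page once, immediately, would give [] (the question's symptom).
# Polling keeps checking until the elements are there.
players = wait_until(page.find_players, timeout=5.0, poll_interval=0.01)
print(players)
```

Reading `driver.page_source` right after `driver.get(url)` corresponds to calling `find_players()` only once; the `WebDriverWait(...).until(...)` line plays the role of `wait_until` here.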

2 Comments

That worked! Could you explain what your code did that mine didn't? Is it just the fact that when I was importing the page, it had not been fully rendered?
@StevenCunden: Yes, you are spot on. You need to use an explicit wait so the elements have loaded before you parse the page.
0

The first piece of code lets you see the response from the server, which contains the HTML the server sent. Analyze that response (the HTML from the server) with a second piece of code and extract the class you want.

==================================================

import requests #CODE1
from requests_toolbelt.utils import dump

resp = requests.get('http://kanoon.ir/')
data = dump.dump_all(resp)
print(data.decode('utf-8')) 

===================================================

Output of the code (the request headers, followed by the HTML):

< GET / HTTP/1.1

< Host: kanoon.ir

< User-Agent: python-requests/2.23.0

< Accept-Encoding: gzip, deflate

< Accept: */*

< Connection: keep-alive

< 
     ...

===================================================

The code you write for the second part (analyzing the HTML and extracting the class) is up to you.
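
One possible shape for that second part, as a minimal sketch: count how many elements in an HTML string carry a given class. It uses only the standard library's `html.parser`; `ClassCounter` and `count_class` are hypothetical helpers, and both HTML strings are made-up samples, not the real site's markup. A count of 0 on the server's HTML means the class is added later by JavaScript, so `requests` alone will never see it.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains target_class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if self.target_class in classes:
            self.count += 1


def count_class(html, target_class):
    parser = ClassCounter(target_class)
    parser.feed(html)
    parser.close()
    return parser.count


# Made-up sample: what a server might send before any script has run
server_html = '<html><body><div id="root"></div><script src="app.js"></script></body></html>'

# Made-up sample: the same page after scripts have filled it in
rendered_html = (
    '<html><body><div id="root">'
    '<div class="css-gu7inl"><a class="css-1fwlqa">chase young</a></div>'
    '</div></body></html>'
)

print(count_class(server_html, "css-gu7inl"))    # 0
print(count_class(rendered_html, "css-gu7inl"))  # 1
```

In practice the string to check would be `resp.text` from the first piece of code rather than a hard-coded sample.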
