I'm writing a script that scans through a set of links. On each linked page, the script searches a table for a specific row. Once the row is found, it adds that row's rank to the variable total_rank, which is the sum of the ranks found across all the pages. The rank is equal to the row number.

The code looks like this and is outputting zero:

import requests
from bs4 import BeautifulSoup
import time

url_to_scrape = 'https://www.teamrankings.com/ncb/stats/'
r = requests.get(url_to_scrape)
soup = BeautifulSoup(r.text, "html.parser")

stat_links = []

for a in soup.select(".chooser-list ul"):
    list_entry = a.findAll('li')
    relative_link = list_entry[0].find('a')['href']
    link = "https://www.teamrankings.com" + relative_link
    stat_links.append(link)

total_rank = 0

for link in stat_links:
    r = requests.get(link)
    soup = BeautifulSoup(r.text, "html.parser")

    team_rows = soup.select(".tr-table.datatable.scrollable.dataTable.no-footer table")

    for row in team_rows:
        if row.findAll('td')[1].text.strip() == 'Oklahoma':
            rank = row.findAll('td')[0].text.strip()
            total_rank = total_rank + rank

    # time.sleep(1)

print total_rank

While debugging, I found that team_rows is empty after the select() call. I've also tried different selectors, for example soup.select(".scroll-wrapper div") and soup.select("#DataTables_Table_0_wrapper div"), but all of them return nothing.

  • I don't think that string = str(a) is what you want. It returns a text representation of an element. Commented Jan 7, 2016 at 21:32
  • @mic4ael Am I wrong that .get takes a string as an input? Or is that what you're saying? Commented Jan 7, 2016 at 21:38

2 Answers

The selector

".tr-table datatable scrollable dataTable no-footer tr"

selects a <tr> element anywhere under a <no-footer> element, anywhere under a <dataTable> element, and so on, because a space in a CSS selector means "descendant of".

I think "datatable", "scrollable", "dataTable" and "no-footer" are really classes on the same element as .tr-table. In that case, they should be chained onto the first class with periods and no spaces. So I believe the correct selector is:

".tr-table.datatable.scrollable.dataTable.no-footer tr"

UPDATE: the new selector looks like this:

".tr-table.datatable.scrollable.dataTable.no-footer table"

The problem here is that the first part, .tr-table.datatable..., already refers to the table itself, so appending " table" looks for a second table nested inside it, and there is none. Assuming you're trying to get the rows of this table:

<table class="tr-table datatable scrollable dataTable no-footer" id="DataTables_Table_0" role="grid">

The proper selector remains the one I originally suggested.
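
To make the difference between the three selectors concrete, here is a minimal sketch run against a hand-written HTML fragment. The fragment is an assumption modelled on the table tag quoted above, not the real page, and the row contents are invented.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment modelled on the <table> tag quoted above;
# the two rows are made up for illustration.
html = """
<table class="tr-table datatable scrollable dataTable no-footer" id="DataTables_Table_0" role="grid">
  <tr><td>1</td><td>Team A</td></tr>
  <tr><td>2</td><td>Team B</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# Spaces mean "descendant": this asks for <datatable>, <scrollable>, ...
# elements nested inside .tr-table, and no such elements exist.
spaced = soup.select(".tr-table datatable scrollable dataTable no-footer tr")

# Chained classes must all sit on the same element; " tr" then finds its rows.
chained_rows = soup.select(".tr-table.datatable.scrollable.dataTable.no-footer tr")

# A trailing " table" asks for a <table> nested inside the matched table.
nested_table = soup.select(".tr-table.datatable.scrollable.dataTable.no-footer table")

print(len(spaced), len(chained_rows), len(nested_table))
```

On this fragment only the chained selector ending in " tr" finds any rows. Whether it also matches on the live page depends on those classes being present in the HTML that requests receives.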


3 Comments

I think you're correct, but that didn't fix the underlying problem
Please check the post: I have updated the code and found a new place where I think the error is.
@audiodude good explanation. Posted a more practical-focused answer, check it out.

@audiodude's answer is correct, though the suggested selector does not work for me.

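You don't need to match every single class on the table element. Some of those classes (dataTable, no-footer) look like ones the DataTables JavaScript plugin adds in the browser, in which case they would not be present in the HTML that requests downloads. Here is the working selector: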

team_rows = soup.select("table.datatable tr")

Also, if you need to find Oklahoma inside the table, you don't have to iterate over every row and cell. Just search directly for the specific cell and get the previous cell, which contains the rank:

rank = soup.find("td", {"data-sort": "Oklahoma"}).find_previous_sibling("td").get_text()
total_rank += int(rank)  # it is important to convert the row number to int
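
If some pages might not list the team at all, soup.find returns None and the chained call above would raise an AttributeError. A minimal sketch of a guarded version follows. The helper name find_rank and both HTML fragments are hypothetical, and the fragments assume the team cell carries a data-sort attribute as in the lookup above.

```python
from bs4 import BeautifulSoup

# Hypothetical pages: the first lists Oklahoma, the second does not.
page_with_team = """
<table class="datatable">
  <tr><td>1</td><td data-sort="Kansas">Kansas</td></tr>
  <tr><td>2</td><td data-sort="Oklahoma">Oklahoma</td></tr>
</table>
"""
page_without_team = """
<table class="datatable">
  <tr><td>1</td><td data-sort="Kansas">Kansas</td></tr>
</table>
"""


def find_rank(soup, team):
    """Return the team's rank as an int, or None if the team is not listed."""
    cell = soup.find("td", {"data-sort": team})
    if cell is None:
        return None
    rank_cell = cell.find_previous_sibling("td")
    if rank_cell is None:
        return None
    # The cell text is a string, so convert it before doing arithmetic.
    return int(rank_cell.get_text(strip=True))


total_rank = 0
for html in (page_with_team, page_without_team):
    rank = find_rank(BeautifulSoup(html, "html.parser"), "Oklahoma")
    if rank is not None:
        total_rank += rank

print(total_rank)
```

Pages without the team are skipped here rather than counted as zero or treated as an error. That is a design choice to adjust to your needs.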

Also note that you are extracting more stats links than you should. It looks like the Player Stats links should not be followed, since you are focused specifically on the Team Stats. Here is one way to get the Team Stats links only:

links_list = soup.find("h2", text="Team Stats").find_next_sibling("ul")
stat_links = ["https://www.teamrankings.com" + a["href"] 
              for a in links_list.select("ul.expand-content li a[href]")]
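
The same idea can be tried offline against a small fragment. This is only a sketch: the markup and both href paths below are invented and the real page may be structured differently. It uses string=, the newer spelling of the text= argument (available since Beautiful Soup 4.4.0), and urljoin in place of string concatenation.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://www.teamrankings.com"

# Hypothetical fragment; the href paths are placeholders, not real URLs.
html = """
<div class="chooser-list">
  <h2>Team Stats</h2>
  <ul>
    <li><ul class="expand-content">
      <li><a href="/ncb/stat/example-team-stat">Example team stat</a></li>
    </ul></li>
  </ul>
  <h2>Player Stats</h2>
  <ul>
    <li><ul class="expand-content">
      <li><a href="/ncb/player-stat/example-player-stat">Example player stat</a></li>
    </ul></li>
  </ul>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Locate the heading by its exact text, then take the <ul> that follows it.
heading = soup.find("h2", string="Team Stats")
links_list = heading.find_next_sibling("ul")

# Only anchors under the Team Stats list are collected.
stat_links = [urljoin(BASE_URL, a["href"])
              for a in links_list.select("ul.expand-content li a[href]")]

print(stat_links)
```

urljoin also copes with hrefs that are already absolute, which plain concatenation would mangle.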
