
I'm new to web scraping and I'm trying to scrape the table from this website: https://www.eloratings.net/2016_European_Championship

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = 'https://www.eloratings.net/2016_European_Championship'
r = requests.get(url).text
soup = BeautifulSoup(r, "html.parser")
df = pd.read_html(str(soup.find_all('table')))

I get the "No tables found" error.

If I try to use an index to find the table:

df = pd.read_html(str(soup.find_all('table')[0]))

I get "List index out of range".

I have also tried using the json package and the Helium/Selenium webdrivers, but I cannot make anything work.

  • It's a JS table. requests only gets the HTML response without running the JS; it's not a browser, so your table is not loaded. You need to use something like Scrapy or tkinter to fetch the HTML after the JS code runs. Commented May 18, 2021 at 14:34
  • @MohammedJanatiIdrissi No, you don't need tkinter or Selenium. Commented May 18, 2021 at 14:36
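The comment above is easy to verify: on a JS-rendered page, the HTML that requests receives contains no <table> element at all, because the table markup is injected later by JavaScript. A minimal sketch of that situation (the skeleton HTML below is hypothetical, mimicking what the static response looks like):

```python
from bs4 import BeautifulSoup

# Hypothetical static HTML skeleton, as a JS-driven site would serve it:
# an empty container div plus a script tag, and no <table> anywhere.
static_html = """
<html>
  <head><script src="app.js"></script></head>
  <body><div id="maintable_wrapper"></div></body>
</html>
"""

soup = BeautifulSoup(static_html, "html.parser")
tables = soup.find_all("table")
print(len(tables))  # 0
```

An empty list here is exactly why `pd.read_html` raises "No tables found" and why indexing into the list raises "List index out of range".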

1 Answer


The page builds its table with JavaScript, so the data is not in the static HTML. Instead, use the site's .tsv endpoint to grab the raw data directly. Dump that to a file and then read it with pandas.

Here's how:

import time

import pandas as pd
import requests

# The timestamp query string is a cache-buster, matching what the page's
# own JavaScript appends when it fetches the data
url = f"https://www.eloratings.net/2016_European_Championship.tsv?={int(time.time())}"
table = requests.get(url).content

# Dump the raw TSV bytes to disk, then parse the file with pandas
with open("table_data.tsv", "wb") as f:
    f.write(table)

df = pd.read_csv("table_data.tsv", sep="\t")
print(df)

Output:

     1   3  DE  2016  1.1  2223   8  ...  532  201  185  2053  1090   −1   −19
0    2   4  FR  1983    1  2137  17  ...  389  253  168  1426  1171   +3    +8
1    3   6  PT  1959    2  2020  20  ...  269  171  133   935   685   +6   +50
2    4   8  IT  1950    1  2132   8  ...  412  157  216  1348   790   +7   +75
3    5   9  ES  1940    1  2165   7  ...  385  129  148  1295   605   −4   −53
4    6  11  EN  1913    1  2212   4  ...  595  191  234  2403  1012   −3   −58
5    7  14  BE  1891    4  1959  24  ...  311  282  160  1293  1250   −4   −32
6    8  16  HR  1849    5  2006  12  ...  148   55   76   493   272   +2   +31
7    9  18  PL  1824    2  2082  30  ...  348  262  200  1376  1105   +6   +58
8   10  19  TR  1816   10  1900  42  ...  209  215  125   739   802   −2   −13
9   11  20  CH  1803    9  1917  28  ...  263  341  168  1113  1336   +2   +31
10  12  23  WA  1779    3  1906  22  ...  196  302  134   790  1067  +24  +124
11  13  26  SK  1759   17  1774  39  ...  105   95   64   375   345   −2    −7
12  13  26  IE  1759    4  1918  22  ...  227  250  161   862  1050   +4    +1
13  15  31  IS  1742   27  1754  83  ...  123  208   77   485   695  +14   +76
14  16  32  UA  1739   15  1847  36  ...  103   64   65   319   228  −16   −98
15  17  33  SE  1735    2  2014  16  ...  492  294  217  2039  1341   −7   −29
16  18  37  HU  1723    1  2231  18  ...  445  286  200  1949  1397   +5   +27
17  19  38  RO  1719    5  1945  26  ...  310  212  172  1143   889   −9   −40
18  20  39  CZ  1718    1  2038  12  ...  371  226  172  1432   958   −8   −37
19  21  40  AT  1713    1  2067  20  ...  311  282  163  1365  1209  −19   −64
20  22  43  RU  1694    1  2080  22  ...  358  147  178  1203   661  −15   −66
21  23  51  EI  1642   14  1850  38  ...  138  249  131   536   812   +2   +18
22  24  53  AL  1634   33  1634  75  ...   76  165   66   274   485   +1   +17

[23 rows x 33 columns]
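The intermediate file is not strictly necessary: pandas can parse the TSV text straight from memory via io.StringIO. The pattern, shown on a small inline sample (the rows are taken from the output above, truncated to four columns for illustration):

```python
import io

import pandas as pd

# A few rows in the same shape as the endpoint's response
tsv_text = "1\t3\tDE\t2016\n2\t4\tFR\t1983\n3\t6\tPT\t1959\n"

# header=None keeps the first data row as data instead of column labels
df = pd.read_csv(io.StringIO(tsv_text), sep="\t", header=None)
print(df.shape)  # (3, 4)
```

With the live endpoint, the same idea is `pd.read_csv(io.StringIO(requests.get(url).text), sep="\t", header=None)`.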

1 Comment

Thanks, this worked, except it needed header=None in the read_csv().
