Issues Scraping the Web with Python

Question

I want to scrape the web with Python and I am running into some problems. Here is my code:

from urllib import request
from bs4 import BeautifulSoup

pageURL="https://gamesnacks.com/embed/games/omnomrun"
rawPage=request.urlopen(pageURL)

soup=BeautifulSoup(rawPage, "html5lib")

content=soup.article

linksList=[]


for link in content.find_all('a'):
    url=link.get("href")
    img=link.get("src")
    text=link.span.text

linksList.append({"url":"url","img":"img","text":"text"})

try:
    url=link.get("href")
    img=link.get("src")
    text=link.span.text
    linksList.append({"url":"url","img":"img","text":"text"})
except AttributeError:
    pass

import json

with open("links.json","w",encoding="utf-8") as links_file:
    json.dump(linksList,links_file,ensure_ascii=False)

print("the work is done")

It raises an error at the line for link in content.find_all('a'):

I have already tried some online help but it didn't work out.

linksList.append({"url":"url","img":"img","text":"text"}) seems suspicious to me, BTW.
– αԋɱҽԃ αмєяιcαη, Commented Jun 28, 2021 at 17:38
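To illustrate what the comment is pointing at: quoting the names on the right-hand side stores the literal strings "url", "img" and "text" rather than the values held in those variables. A minimal sketch, using made-up placeholder values (the example.com addresses are assumptions, not taken from the question):

```python
# Placeholder values standing in for what the loop would extract (hypothetical).
url = "https://example.com/page"
img = "https://example.com/pic.png"
text = "Example"

# Quoted values: every entry stores the same literal strings.
quoted = {"url": "url", "img": "img", "text": "text"}

# Unquoted values: the entry stores the contents of the variables.
unquoted = {"url": url, "img": img, "text": text}

print(quoted)
print(unquoted)
```

With the quoted form, links.json would end up containing identical entries no matter what the page holds.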

Peter Badida · Accepted Answer · 2021-06-28 17:09:54Z


You define content as soup.article, but soup.article is None here, so you get this error:

Traceback (most recent call last):
  File "main.py", line 14, in <module>
    for link in content.find_all('a'):
AttributeError: 'NoneType' object has no attribute 'find_all'

None is not a BeautifulSoup object, so it has none of its methods, such as find_all().
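The behaviour can be reproduced without touching the network. The sketch below parses a small inline HTML string (an assumption made up for illustration, not the real page) with the built-in "html.parser" backend instead of html5lib, and checks for None before calling find_all():

```python
from bs4 import BeautifulSoup

# Hypothetical markup with no <article> tag in it.
html = "<html><body><div><a href='/a'><span>A</span></a></div></body></html>"
soup = BeautifulSoup(html, "html.parser")

# Attribute access returns the first matching tag, or None if there is none.
content = soup.article
print(content is None)

# Guard before calling any tag method on the result.
if content is None:
    links = []
else:
    links = content.find_all("a")

print(links)
```

Calling content.find_all("a") without the guard would raise the same AttributeError shown in the traceback.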

You need to find a different element to retrieve the content from, whatever that should be.

Try soup.find_all("article") and iterate over the result, in case the page contains multiple article tags. However, having visited the website and checked its source, I don't see any <article> tag anywhere. That would explain why soup.article is None, and it means find_all("article") would most likely not return anything useful either.
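A rough sketch of the find_all("article") approach, run against inline HTML strings that are invented for illustration (the real page's structure is not known here). The helper name collect_links is hypothetical, and "html.parser" is used so that the example does not depend on html5lib:

```python
from bs4 import BeautifulSoup


def collect_links(html):
    """Collect href and span text for every <a> inside any <article> tag."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # find_all() returns an empty list when nothing matches, so the loop
    # simply does not run instead of raising AttributeError.
    for article in soup.find_all("article"):
        for link in article.find_all("a"):
            span = link.find("span")  # may be None if the link has no <span>
            links.append({
                "url": link.get("href"),
                "text": span.get_text(strip=True) if span else None,
            })
    return links


# Hypothetical inputs: one with an <article>, one without.
with_article = "<article><a href='/one'><span>One</span></a><a href='/two'>Two</a></article>"
without_article = "<div><a href='/one'><span>One</span></a></div>"

print(collect_links(with_article))
print(collect_links(without_article))
```

For a page with no <article> tag at all, this returns an empty list, which is consistent with the point above that find_all("article") will not yield anything useful for this site. The image source is left out because src normally belongs to an <img> tag rather than to the <a> itself.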

edited Jun 28, 2021 at 17:09

answered Jun 28, 2021 at 17:03

