When I run my program:

import requests
from bs4 import BeautifulSoup
class Data():
    def __init__(self):
        self.tags = open("tags.txt", "r")
        self.tag = "everything"
        self.url = "https://www.amazon.com/" + self.tag
        self.headers_param = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36 OPR/74.0.3911.160"}
        self.request = None
        self.soup = None
    def get_data(self):
        for i in range(6):
            self.tag = self.tags.readline()
            self.request = requests.get(self.url, headers=self.headers_param)
            self.soup = BeautifulSoup(self.request.content, "lxml")
            self.finder = self.soup.find("div", {"class":"content"}).text
            print(self.tag)
            print(self.url)
            print(self.finder)
            i += 1
data = Data()
data.get_data()

I'm getting this error:

Traceback (most recent call last):
  File "D:/Projects/neo_bot/main.py", line 25, in <module>
    data.get_data()
  File "D:/Projects/neo_bot/main.py", line 19, in get_data
    self.finder = self.soup.find("div", {"class":"content"}).text
AttributeError: 'NoneType' object has no attribute 'text'

If I remove .text it works fine, but I need the text. I can't solve this issue. Please help!

  • It is dangerous to simply assume that a method named .find() in fact finds what you are looking for. It might not be there. Either check the return value or wrap it in a try block. Commented Feb 27, 2021 at 23:10
  • self.url is not dynamic, it is a string that is defined once in __init__(). You're requesting amazon.com/everything, which is a 404 page that does not contain a div with the class "content". So soup.find() returns None, which is not something that has a .text attribute. Commented Feb 27, 2021 at 23:17

1 Answer

As the comments above point out, you have two problems:

  • get_data() never updates the URL, so every request goes to https://www.amazon.com/everything, a page on which your selector matches nothing.
  • .find() returns None when no element is found, so calling .text on the result raises AttributeError. Check the result before using it.
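
The second problem can be reproduced without any network request. The following is only a minimal sketch: the two HTML strings are made up for illustration, the helper name text_or_default is hypothetical, and it uses the standard-library "html.parser" backend rather than lxml so that it runs with Beautiful Soup alone.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippets, standing in for a real page and a 404 page.
PAGE_WITH_CONTENT = '<html><body><div class="content">Hello</div></body></html>'
PAGE_WITHOUT_CONTENT = '<html><body><p>Not found</p></body></html>'


def text_or_default(html, default="no text"):
    """Return the text of the first div.content, or default if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    element = soup.find("div", {"class": "content"})
    if element is None:  # find() returns None when nothing matches
        return default
    return element.text


print(text_or_default(PAGE_WITH_CONTENT))
print(text_or_default(PAGE_WITHOUT_CONTENT))
```

The first call finds the div and returns its text; the second falls back to the default instead of raising AttributeError.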

Fixed code:

import requests
from bs4 import BeautifulSoup

class Data():
    def __init__(self):
        self.tags = open("tags.txt", "r")
        self.tag = "everything"
        self.url = "https://www.amazon.com/" + self.tag
        self.headers_param = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36 OPR/74.0.3911.160"}
        self.request = None
        self.soup = None
    def update_url(self):
        # readline() keeps the trailing newline, so strip it before building the URL
        self.tag = self.tags.readline().strip()
        self.url = "https://www.amazon.com/" + self.tag
    def get_data(self):
        for i in range(6):
            self.update_url() # <=== update the url
            self.request = requests.get(self.url, headers=self.headers_param)
            self.soup = BeautifulSoup(self.request.content, "lxml")
            self.finder = self.soup.find("div", {"class":"content"})
            if self.finder is None: # <== check for NoneType
                print('element not found')
                self.finder = 'no text'
            else:
                self.finder = self.finder.text
            print(self.tag)
            print(self.url)
            print(self.finder)
        self.tags.close() # close the tags file once all lines are processed

data = Data()
data.get_data()
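
One detail to watch when building the URL from the file: readline() returns each line with its trailing newline, and returns an empty string once the file is exhausted, so a tags.txt with fewer than six lines ends up requesting the bare base URL. As a rough sketch of one way around this (the function name build_urls and the sample tags are hypothetical, not part of the original code), the URLs can be built up front from the stripped, non-empty lines:

```python
BASE_URL = "https://www.amazon.com/"


def build_urls(lines, base_url=BASE_URL):
    """Build one URL per non-empty tag line, stripping whitespace and newlines."""
    urls = []
    for line in lines:
        tag = line.strip()  # lines read from a file keep their trailing "\n"
        if not tag:  # skip blank lines
            continue
        urls.append(base_url + tag)
    return urls


# Hypothetical file contents; with a real file, pass the open file object instead.
sample_lines = ["books\n", "\n", "toys\n"]
print(build_urls(sample_lines))
```

With a real file, something like `with open("tags.txt") as f: urls = build_urls(f)` would work the same way and also closes the file automatically.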