Python Web Scraping Error without Any Warning

Question

I am trying to scrape some text from a webpage and save it to text files using the following code (the links are read from a text file called links.txt):

import requests
import csv
import random
import string
import re

from bs4 import BeautifulSoup

#Create random string of specific length
def randStr(chars = string.ascii_uppercase + string.digits, N=10):
    return ''.join(random.choice(chars) for _ in range(N))
    
with open("links.txt", "r") as a_file:
  for line in a_file:
    stripped_line = line.strip()
    endpoint = stripped_line
    response = requests.get(endpoint)
    data = response.text
    soup = BeautifulSoup(data, "html.parser")
    for pictags in soup.find_all('col-md-2'):
        lastfilename = randStr()
        file = open(lastfilename + ".txt", "w")
        file.write(pictags.txt)
        file.close()
        print(stripped_line)

The webpage contains the following element:

<div class="col-md-2">

The problem is that after running the code nothing happens, and I do not receive any error.

What are you trying to scrape from that page? Could you explain?
– Ram, Commented Aug 26, 2021 at 7:36

Andrej Kesely · Accepted Answer · 2021-08-26 08:01:08Z

2

To get all keyword text from the page into a file, you can do:

import requests
from bs4 import BeautifulSoup

url = "http://www.mykeyworder.com/keywords?tags=dog&exclude=&language=en"

soup = BeautifulSoup(requests.get(url).content, "html.parser")

with open("data.txt", "w") as f_out:
    for inp in soup.select('input[type="checkbox"]'):
        print(inp["value"], file=f_out)

This creates data.txt with content:

dog
animal
canine
pet
cute
puppy
happy
young
adorable

...and so on.
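To see what that selector picks up without making a network request, here is a minimal sketch that runs the same `select` call on a small inline HTML string. The markup is made up for illustration and is only assumed to resemble the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page's structure may differ.
html = (
    '<input type="checkbox" value="dog">'
    '<input type="text" value="skip me">'
    '<input type="checkbox" value="animal">'
)
soup = BeautifulSoup(html, "html.parser")

# The CSS attribute selector keeps only <input> elements whose type is "checkbox".
values = [inp["value"] for inp in soup.select('input[type="checkbox"]')]
print(values)
```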

answered Aug 26, 2021 at 8:01


mibu · Accepted Answer · 2021-08-26 05:58:15Z

0

According to the BeautifulSoup documentation, your line for pictags in soup.find_all('col-md-2') searches for elements whose tag name is 'col-md-2', not for elements with that class name. In other words, your code looks for elements like <col-md-2></col-md-2>. Since no such elements exist, the loop body never runs, which is why nothing happens and no error is raised.

To fix it, search by class instead: for pictags in soup.find_all(class_='col-md-2')
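As a minimal sketch of the difference, the snippet below parses a small inline HTML string (made up for illustration, not the asker's page) and compares the two calls.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the real page.
html = '<div class="col-md-2">first</div><div class="col-md-2">second</div>'
soup = BeautifulSoup(html, "html.parser")

# Looks for <col-md-2> tags, which do not exist, so the result is empty.
by_tag_name = soup.find_all("col-md-2")

# Looks for elements whose class attribute includes "col-md-2".
by_class = soup.find_all(class_="col-md-2")

print(len(by_tag_name), len(by_class))
```

An empty result is not an error, so a for loop over it simply does nothing.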

edited Aug 26, 2021 at 5:58

answered Aug 26, 2021 at 5:53


2 Comments

Katherine Elizabeth Kath Over a year ago

Thanks, I tried your recommendation and got this error: "file.write(pictags.txt) TypeError: write() argument must be str, not Tag". Sorry to bother you; any suggestion is really appreciated.

mibu Over a year ago

@KatherineElizabethKath : If you are trying to get text content from the retrieved HTML tag, you can try file.write(pictags.text)
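The attribute name matters here. A rough sketch, using made-up HTML, of how `.txt` and `.text` behave on a Tag:

```python
from bs4 import BeautifulSoup

# Hypothetical element for illustration.
html = '<div class="col-md-2"><b>dog</b> animal</div>'
tag = BeautifulSoup(html, "html.parser").find(class_="col-md-2")

# An unknown attribute name is treated as a child-tag lookup,
# so tag.txt searches for a <txt> child and finds none.
missing = tag.txt

# .text (equivalent to get_text()) returns the text content as a str,
# which is what file.write() expects.
content = tag.text
```

In this sketch `missing` is None rather than a string, so passing it to `file.write()` would also fail; `content` is a plain str.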

Mhd O. · Accepted Answer · 2021-08-26 06:24:17Z

0

You can match elements by their attributes: pass a dictionary to the attrs parameter of find_all with the desired attributes of the elements you're looking for.

pictags = soup.find_all(attrs={'class':'col-md-2'})

This finds all elements with class 'col-md-2'.
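A small sketch, again on made-up HTML, suggesting that the attrs dictionary and the class_ keyword select the same elements, including elements that carry additional classes:

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration.
html = """
<div class="col-md-2 extra">one</div>
<div class="col-md-2">two</div>
<div class="col-md-4">three</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Both forms match any element whose class list contains "col-md-2".
via_attrs = soup.find_all(attrs={"class": "col-md-2"})
via_keyword = soup.find_all(class_="col-md-2")

texts = [tag.get_text() for tag in via_attrs]
print(texts)
```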

edited Aug 26, 2021 at 6:24

answered Aug 26, 2021 at 5:59

