How do I take a URL as user input in Python?

Question

I want to take a website URL and the maximum number of pages to crawl as user input, then crawl the site, but I can't find a solution. Here's my code:

import requests
from bs4 import BeautifulSoup


url1 = input("Enter the URL you want to crawl: ")
max_pages1 = int(input("Enter the number of pages you want to crawl: "))


def web_crawler(max_pages, url):
    # Pages are requested as url + "1", url + "2", ..., up to max_pages.
    page = 1
    while page <= max_pages:
        url4 = str(url) + str(page)
        url_get = requests.get(url4)
        plain_text = url_get.text
        soup = BeautifulSoup(plain_text, "html.parser")
        # Only anchors carrying rel="bookmark" are printed.
        for a in soup.find_all('a', {'rel': 'bookmark'}):
            href = a.get('href')
            title = a.string
            #print(title)
            print(href)
            #info_about_web_pages(href)
        page += 1


def info_about_web_pages(url):
    # Collect the href of every anchor on the page into a set and print it.
    url_get = requests.get(url)
    plain_text = url_get.text
    soup = BeautifulSoup(plain_text, "html.parser")
    links = set()
    for about in soup.find_all('a'):
        href = about.get('href')
        links.add(href)

    print(links)


web_crawler(max_pages1, url1)

It prints nothing.

Do you have an example of the URL you are trying this with? Are you sure its source code contains an anchor with the attribute 'rel': 'bookmark'?
– B.Adler, Commented Feb 19, 2017 at 17:38

B.Adler · Accepted Answer · 2017-02-19 17:40:52Z


If the HTML source contains no anchor with the attributes you are searching for, this will always print nothing. Try printing soup.prettify() and check whether the tag you are looking for exists at all. More often than not, when my code doesn't print the values I expect, it's because the element does not have the attributes I am looking for.
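A quick way to run that check without reading the whole prettified page is to compare how many anchors exist in total with how many carry rel="bookmark". The following is only a sketch: SAMPLE_HTML is made-up markup standing in for url_get.text, and summarize_anchors is a hypothetical helper name, not something from the question's code.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for url_get.text; a real page will differ.
SAMPLE_HTML = """
<html><body>
  <a href="/post/1" rel="bookmark">First post</a>
  <a href="/about">About</a>
  <a href="/post/2" rel="bookmark">Second post</a>
</body></html>
"""


def summarize_anchors(html):
    """Return (all_hrefs, bookmark_hrefs) for the given HTML text."""
    soup = BeautifulSoup(html, "html.parser")
    # Every anchor on the page, regardless of attributes.
    all_hrefs = [a.get("href") for a in soup.find_all("a")]
    # Only the anchors the crawler in the question would print.
    bookmark_hrefs = [a.get("href") for a in soup.find_all("a", {"rel": "bookmark"})]
    return all_hrefs, bookmark_hrefs


all_hrefs, bookmark_hrefs = summarize_anchors(SAMPLE_HTML)
print(len(all_hrefs), "anchors in total")
print(len(bookmark_hrefs), "anchors with rel=bookmark")
```

If the second count is zero for the real page while the first is not, the filter is the reason nothing is printed, and the attributes passed to find_all need to match what the page actually uses.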


1 Comment

B.Adler Over a year ago

On the line after soup = BeautifulSoup(plain_text,"html.parser" ), add print(str(soup.prettify())).
