Hello, I am trying to write a Python function that saves a list of URLs to a .txt file.

Example: visit http://forum.domain.com/ and save every URL containing viewtopic.php?t= to a .txt file, for instance:

http://forum.domain.com/viewtopic.php?t=1333
http://forum.domain.com/viewtopic.php?t=2333

I am using the code below, but it does not save anything. I am very new to Python; can someone help me write this?

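import urllib.request

# Assumption: `opener` was not defined in the original snippet; a default Python 3 opener is built here.
opener = urllib.request.build_opener()
web_obj = opener.open('http://forum.domain.com/')
data = web_obj.read()  # raw bytes of the downloaded page

# Mode 'r' opens urllist.txt for reading only, so nothing is saved here.
fl_url_list = open('urllist.txt', 'r')
url_arr = fl_url_list.readlines()
fl_url_list.close()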

1 Answer

This is far from trivial and has quite a few corner cases (I assume the page you are referring to is a web page).

To give you a few pointers, you need to:

  • download the web page: you are already doing that (the result is in data)
  • extract the URLs: this is the hard part. Most probably you will want to use an HTML parser, extract the <a> tags, fetch each href attribute and put it into a list, then filter that list to keep only the URLs in the format you want (say, those containing viewtopic). Let's say the result ends up in urlList
  • then open a file for writing text (mode wt, not r)
  • write the content: f.write('\n'.join(urlList))
  • close the file
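
The steps above could be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: it assumes Python 3 and uses the standard library's html.parser as one possible HTML parser. The names LinkCollector, extract_topic_urls, save_urls and sample_html are hypothetical, and sample_html merely stands in for the downloaded page (data, decoded to text).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_topic_urls(html, base_url, marker="viewtopic.php?t="):
    """Return absolute, de-duplicated URLs whose address contains `marker`."""
    parser = LinkCollector()
    parser.feed(html)
    url_list = []
    for href in parser.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        if marker in absolute and absolute not in url_list:
            url_list.append(absolute)
    return url_list


def save_urls(url_list, path):
    """Write one URL per line; 'wt' opens the file for writing text."""
    with open(path, "wt") as f:  # the with-block closes the file
        f.write("\n".join(url_list))


# Hypothetical stand-in for the page content downloaded earlier.
sample_html = """
<a href="viewtopic.php?t=1333">Topic A</a>
<a href="/viewtopic.php?t=2333">Topic B</a>
<a href="index.php">Home</a>
<a href="http://forum.domain.com/viewtopic.php?t=1333">Topic A again</a>
"""

urlList = extract_topic_urls(sample_html, "http://forum.domain.com/")
save_urls(urlList, "urllist.txt")
```

On a real forum the links may carry extra query parameters (a session id, or a forum id placed before t=), so the simple substring filter used here would probably need adjusting to the actual markup.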

I advise you to try following these steps and to ask specific questions when you get stuck on a particular issue.
