
I have a list of links (stored in a links.txt file).

This code can save the result of one link, but I do not know how to make it download the source code of ALL the links inside links.txt and SAVE THEM AS ONE SINGLE text file for the next step of processing:

import urllib.request

# downloads the page source of one link and saves it to result.txt
urllib.request.urlretrieve("https://www.ebay.com/sch/i.html?_from=R40&_nkw=abc&_sacat=0&_pgn=1", "result.txt")

Example links from links.txt:

https://www.ebay.com/sch/i.html?_from=R40&_nkw=abc&_sacat=0&_pgn=1
https://www.ebay.com/sch/i.html?_from=R40&_nkw=abc&_sacat=0&_pgn=2
https://www.ebay.com/sch/i.html?_from=R40&_nkw=abc&_sacat=0&_pgn=3
https://www.ebay.com/sch/i.html?_from=R40&_nkw=abc&_sacat=0&_pgn=4
https://www.ebay.com/sch/i.html?_from=R40&_nkw=abc&_sacat=0&_pgn=5
https://www.ebay.com/sch/i.html?_from=R40&_nkw=abc&_sacat=0&_pgn=6
https://www.ebay.com/sch/i.html?_from=R40&_nkw=abc&_sacat=0&_pgn=7
....
  • Assuming you aren't trying to save as *.html, you can use a dict and serialize it using the json module (a sketch follows after these comments).
  • Thank you for the prompt reply dear @DelphiX, sadly I do not have much knowledge of Python.
  • Did you try writing a for loop?
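
A minimal sketch of the dict-plus-json idea from the first comment, using urllib from the question; the output file name pages.json is an assumption:

import json
import urllib.request

pages = {}
with open('links.txt', 'r') as f:
    for link in f:
        link = link.strip()  # drop the trailing newline
        if not link:
            continue         # skip blank lines
        with urllib.request.urlopen(link) as resp:
            pages[link] = resp.read().decode('utf-8')

# serialize the {link: html} dict into one JSON file
with open('pages.json', 'w') as out:
    json.dump(pages, out)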

2 Answers


urllib

import urllib.request

with open('links.txt', 'r') as f:
    links = f.readlines()

for link in links:
    with urllib.request.urlopen(link.strip()) as page:  # strip the trailing newline
        # get html text
        html = page.read().decode('utf-8')

    # append html to file ('a' adds to it; 'w+' would overwrite it each time)
    with open('result.txt', 'a') as out:
        out.write(html)

requests

You could also use the requests library, which I find much more readable:

pip install requests

import requests

with open('links.txt', 'r') as f:
    links = f.readlines()

for link in links:
    response = requests.get(link.strip())
    html = response.text

    # append html to file ('a' adds to it; 'w+' would overwrite it each time)
    with open('result.txt', 'a') as out:
        out.write(html)

Use a loop for page navigation

Use a for loop to generate the page links, since the only thing that changes is the page number.

links = [
  f'https://www.ebay.com/sch/i.html?_from=R40&_nkw=abc&_sacat=0&_pgn={n}'
  for n in range(1, 10) # [1, 2, 3, 4, 5, 6, 7, 8, 9]
]

or generate them as you go along:

for n in range(1, 10):
  link = f'https://www.ebay.com/sch/i.html?_from=R40&_nkw=abc&_sacat=0&_pgn={n}'

  [...]
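
Putting the two pieces together, a minimal sketch that generates the links and writes every page into one file (the page range 1-9 is taken from the example links, and requests is used as above):

import requests

with open('result.txt', 'w') as out:
    for n in range(1, 10):
        link = f'https://www.ebay.com/sch/i.html?_from=R40&_nkw=abc&_sacat=0&_pgn={n}'
        response = requests.get(link)
        out.write(response.text)  # each page is appended to the open file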

Comments

looking good and import urllib.request works without error, but it still gives the result of 1 single link instead of all the links inside links.txt
import requests gives an error, maybe it is safer to use urllib.request. The error with import requests is: line 8, in <module> f.write(html) TypeError: write() argument must be str, not bytes (see the note after these comments)
requests is a module, you have to install it using pip
requests is good in that we won't have to worry about converting from utf-8, as the module does it for you
yes, the requests module is already installed: Requirement already satisfied: certifi>=2017.4.17 in c:\users\a-data\appdata\local\programs\python\python38-32\lib\site-packages (from requests) (2020.4.5.1)
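
That TypeError happens when bytes are written to a file opened in text mode. With requests, response.text is a str while response.content is raw bytes, so writing response.text avoids the error. A minimal illustration (the file name page.html is an assumption):

import requests

response = requests.get('https://www.ebay.com/sch/i.html?_from=R40&_nkw=abc&_sacat=0&_pgn=1')

with open('page.html', 'w') as f:
    f.write(response.text)        # str: fine in text mode
    # f.write(response.content)   # bytes: raises TypeError in text mode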

Actually, it's usually better to use the requests lib, so you should start by installing it:

pip install requests

Then I'd propose reading links.txt line by line, downloading all the data you need, and storing it in the file output.txt:

import requests

data = []
# collect the html from every link in the file
with open('links.txt', 'r') as links:
    for link in links:
        response = requests.get(link.strip())  # strip the trailing newline
        data.append(response.text)

# put everything collected into a single file
with open('output.txt', 'w+') as output:
    for chunk in data:
        print(chunk, file=output)
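
If the pages are large, a variant that writes each page as soon as it is fetched avoids holding all the html in memory at once (same file names assumed):

import requests

with open('links.txt', 'r') as links, open('output.txt', 'w') as output:
    for link in links:
        response = requests.get(link.strip())
        print(response.text, file=output)  # write each page immediately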

Comments

sadly it gives an error: line 8, in <module> data.appent(response.text) AttributeError: 'list' object has no attribute 'appent'
It should be data.append(response.text)
@user13602012 I fixed the typo
