I'm trying to read existing URLs from a text file and save the HTML of each one into a single txt file:

for line in open('C:\Users\me\Desktop\URLS-HERE.txt'):
 if line.startswith('http') and line.endswith('html\n') :
    fichier = open("C:\Users\me\Desktop\other.txt", "a")
    allhtml = urllib.urlopen(line)
    fichier.write(allhtml)
    fichier.close()

But I get the following error:

TypeError: expected a character buffer object

2 Answers

The value returned by urllib.urlopen() is a file-like object. Once you have opened it, you should read its contents with the read() method, as shown in the following snippet:

import urllib

for line in open(r'C:\Users\me\Desktop\URLS-HERE.txt'):
   if line.startswith('http') and line.endswith('html\n'):
      fichier = open(r"C:\Users\me\Desktop\other.txt", "a")
      allhtml = urllib.urlopen(line)
      # write the page contents, not the file-like object itself
      fichier.write(allhtml.read())
      fichier.close()
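
To see why the original write() call fails, here is a small sketch that uses io.StringIO as a stand-in for the object urlopen returns. The stand-in and the temporary file are assumptions made so the example runs without a network connection. It is written for Python 3, where the TypeError message is worded differently from the Python 2 one in the question.

```python
import io
import os
import tempfile

# Stand-in for the file-like object returned by urlopen (assumption: no network).
response = io.StringIO("<html>hello</html>")

# Temporary output file instead of the Desktop path from the question.
path = os.path.join(tempfile.mkdtemp(), "other.txt")

with open(path, "a") as fichier:
    try:
        fichier.write(response)       # passing the object itself raises TypeError
        wrote_object = True
    except TypeError:
        wrote_object = False
    fichier.write(response.read())    # passing its contents works

with open(path) as f:
    saved = f.read()
```

Only the second write() succeeds, so the file ends up holding the HTML text and nothing else.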

Hope this helps!

The problem here is that urlopen returns a file-like object, not the HTML itself. You have to call read() on that object to retrieve the HTML.

import urllib2

for line in open(r"C:\Users\me\Desktop\URLS-HERE.txt"):
    if line.startswith('http') and line.endswith('html\n'):
        fichier = open(r"C:\Users\me\Desktop\other.txt", "a")
        allhtml = urllib2.urlopen(line)
        # read() returns the page contents as a string
        fichier.write(allhtml.read())
        fichier.close()

Please note that the urllib.urlopen function has been marked as deprecated since Python 2.6. It's recommended to use urllib2.urlopen instead.
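
For readers on Python 3, where urlopen lives in urllib.request, the same loop could look roughly like the sketch below. The function name save_pages and its fetch parameter are hypothetical, introduced only so the download step can be swapped out for testing. Note one deliberate difference from the code above: each line is stripped before the check, so a final URL without a trailing newline also matches.

```python
from urllib.request import urlopen  # Python 3 location of urlopen


def save_pages(url_file, out_file, fetch=None):
    """Append the HTML of every matching URL in url_file to out_file.

    fetch is a hypothetical hook: a callable taking a URL and returning bytes.
    Returns the number of pages written.
    """
    if fetch is None:
        def fetch(url):
            # urlopen returns a file-like object; read() gives its bytes
            with urlopen(url) as response:
                return response.read()

    count = 0
    # "ab" because read() returns bytes in Python 3
    with open(url_file) as urls, open(out_file, "ab") as fichier:
        for line in urls:
            url = line.strip()  # drop the trailing newline before using the URL
            if url.startswith("http") and url.endswith("html"):
                fichier.write(fetch(url))
                count += 1
    return count
```

Called as save_pages(r"C:\Users\me\Desktop\URLS-HERE.txt", r"C:\Users\me\Desktop\other.txt"), it would download each listed page. The with statement closes both files even if a download fails.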

Additionally, be careful when writing Windows paths in your code. You should either escape each backslash:

"C:\\Users\\me\\Desktop\\other.txt"

or put the r prefix before the string. When an 'r' or 'R' prefix is present, a character following a backslash is included in the string without change.

r"C:\Users\me\Desktop\other.txt"
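
A quick way to check that the two spellings are equivalent, and to see what goes wrong without them, is the sketch below. The path in risky is a made-up example chosen because it contains \t and \n. In Python 3 an unescaped "C:\Users..." literal is rejected outright, because \U starts a Unicode escape.

```python
escaped = "C:\\Users\\me\\Desktop\\other.txt"
raw = r"C:\Users\me\Desktop\other.txt"

# Both literals produce exactly the same string.
same = escaped == raw

# Hypothetical path: without escaping, \t becomes a tab and \n a newline.
risky = "C:\temp\new.txt"
has_tab = "\t" in risky
has_newline = "\n" in risky
```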
