python read file from a web URL

Question

I am currently trying to read a txt file from a website.

My script so far is:

webFile = urllib.urlopen(currURL)

This way, I can work with the file. However, when I try to store the file (in webFile), I only get a link to the socket. Another solution I tried was to use read():

webFile = urllib.urlopen(currURL).read()

However, this seems to remove the formatting (\n, \t, etc.).

If I open the file like this:

webFile = urllib.urlopen(currURL)

I can read it line by line:

for line in webFile:
    print line

This should result in:

"this" 
"is" 
"a"
"textfile"

But I get:

't'
'h'
'i'
...

I wish to get the file on my computer, but maintain the format at the same time.

stackoverflow.com/questions/22676/…. Just take webFile and write it to a file.
– postelrich, Commented Oct 6, 2015 at 13:56
Is there no way of doing it without having to first write it to a local file?
– mat, Commented Oct 6, 2015 at 13:59

Pasqual Guerrero · Accepted Answer · 2015-10-06 14:02:37Z

8

Use readlines() to get the response as a list of whole lines:

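import urllib  # Python 2; on Python 3, urlopen lives in urllib.request

response = urllib.urlopen(currURL)
lines = response.readlines()  # list of lines, each keeping its trailing newline
for line in lines:
    print line,  # trailing comma avoids printing a second newline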

That said, I strongly recommend using the requests library: http://docs.python-requests.org/en/latest/

Noxeus · Accepted Answer · 2015-10-06 14:00:42Z

2

This happens because you are iterating over a string, which yields one character at a time.
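To make the difference concrete, here is a small sketch in Python 3 syntax. The `text` value is a made-up stand-in for what read() returns, not the asker's actual file. It shows that read() keeps the newlines and tabs, and that only the way you loop over the result changes what you see:

```python
# Stand-in for the result of read(): one string, newlines and tabs intact.
text = "this\nis\na\ttextfile\n"

# Iterating over a string yields single characters.
chars = [ch for ch in text]

# splitlines() yields one entry per line, without the trailing newline.
lines = text.splitlines()

print(chars[:4])  # ['t', 'h', 'i', 's']
print(lines)      # ['this', 'is', 'a\ttextfile']
```

So the formatting is not lost by read(); it just is not visible when the string is printed character by character.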

Why not save the whole file at once?

import urllib  # Python 2
webf = urllib.urlopen('http://stackoverflow.com/questions/32971752/python-read-file-from-web-site-url')
txt = webf.read()

# 'w' is enough for writing; the with block closes the file
with open('destination.txt', 'w') as f:
    f.write(txt)

If you really want to loop over the file line by line, use txt = webf.readlines() and iterate over that.

2 Comments

Raimundo Baravaglio Over a year ago

module 'urllib' has no attribute 'urlopen'

Noxeus Over a year ago

I think I wrote this in Python version 2. See here: stackoverflow.com/questions/25863101/…
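For readers on Python 3, where urlopen moved to urllib.request, a minimal sketch of the same idea might look like the following. It also keeps everything in memory, with no local file involved. The helper name `read_lines`, the UTF-8 default, and the `data:` demo URL are assumptions made for illustration (the `data:` URL only keeps the example offline); none of them come from the answers above.

```python
import urllib.request


def read_lines(url, encoding="utf-8"):
    """Fetch url and return its text as a list of lines (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        raw = response.read()  # bytes in Python 3
    # Assumption: the remote file is encoded as `encoding`.
    return raw.decode(encoding).splitlines()


# A data: URL stands in for a real http(s) address so the demo runs offline.
demo_url = "data:text/plain;charset=utf-8,this%0Ais%0Aa%0Atextfile"
for line in read_lines(demo_url):
    print(line)
```

In a real script you would pass the actual address (currURL in the question) instead of `demo_url`.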

Phil Sheard · Accepted Answer · 2015-10-06 14:02:15Z

0

If you're just trying to save a remote file to your local server as part of a Python script, you could use the PycURL library to download and save it without parsing it. More info here: http://pycurl.sourceforge.net

Alternatively, if you want to read and then write the output, I think you've just got the methods out of sequence. Try the following:

import urllib  # Python 2

# Assign the open response to a variable
webFile = urllib.urlopen(currURL)

# Read the file contents into a variable
file_contents = webFile.read()
print(file_contents)
# Output: This will be the file contents

# Then write to a new local file
f = open('local file.txt', 'w')
f.write(file_contents)
f.close()

If neither applies, please update the question to clarify.
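If the goal is only to save the remote file unchanged, a sketch using just the Python 3 standard library (instead of PycURL) could copy the raw bytes straight to disk. The helper name `save_url`, the temp-directory path, and the `data:` demo URL are assumptions for illustration, not part of the original answer.

```python
import os
import shutil
import tempfile
import urllib.request


def save_url(url, destination):
    """Copy the response body to destination unchanged (hypothetical helper)."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        # Copies in chunks and never decodes, so \n and \t are kept as-is.
        shutil.copyfileobj(response, out)


# A data: URL stands in for a real http(s) address so the demo runs offline.
demo_url = "data:text/plain,this%0Ais%0Aa%09textfile%0A"
target = os.path.join(tempfile.gettempdir(), "destination.txt")
save_url(demo_url, target)
```

Writing in binary mode avoids any guesswork about the text encoding; the file on disk is byte-for-byte what the server sent.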

Udith Indrakantha · Accepted Answer · 2021-06-20 09:57:35Z

0

You can directly download the file and save it using a name that you prefer. After that, you can read the file and later you can delete it if you don't need the file anymore.

!pip install wget

import wget 
url = "https://raw.githubusercontent.com/apache/commons-validator/master/src/example/org/apache/commons/validator/example/ValidateExample.java" 
wget.download(url, 'myFile.java')
