0

I'm sure everyone will groan and tell me to look at the documentation (which I have), but I just don't understand how to achieve the same result in Python as the following:

curl -s http://www.maxmind.com/app/locate_my_ip | awk '/align="center">/{getline;print}'

All I have in python3 so far is:

import urllib.request

f = urllib.request.urlopen('http://www.maxmind.com/app/locate_my_ip')

for lines in f.readlines():
    print(lines)

f.close()

Seriously, any suggestions are welcome (please don't tell me to read http://docs.python.org/release/3.0.1/library/html.parser.html, as I have been learning Python for 1 day and get easily confused). A simple example would be amazing!

4
  • You may prefer this site for getting your IP: you don't need to go through the HTML to find it. Commented Jan 15, 2012 at 18:16
  • Your posted code is wrong because you have lost the indentation (the line print(lines) should be indented). Commented Jan 15, 2012 at 18:17
  • I know, it kept disappearing when I set it as code when posting. It is correct in the file. Commented Jan 15, 2012 at 18:18
  • The code that I run also gets the geographic location etc. (albeit only a general one). Commented Jan 15, 2012 at 18:19

3 Answers

4

This is based off of larsmans's answer, above.

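import urllib.request

f = urllib.request.urlopen('http://www.maxmind.com/app/locate_my_ip')
for line in f:
    # The response yields bytes, so match against a bytes literal.
    if b'align="center">' in line:
        # Print the line after the match, as awk's getline does.
        print(next(f).decode().rstrip())
f.close()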

Explanation:

for line in f iterates over the lines in the file-like object, f. Python lets you iterate over the lines in a file just as you would over the items in a list.

if b'align="center">' in line looks for the byte sequence 'align="center">' in the current line. The b prefix marks the literal as a buffer of bytes rather than a string. It appears that urllib.request.urlopen returns the results as binary data rather than unicode strings, and an unadorned 'align="center">' would be interpreted as a unicode string. (That was the source of the TypeError above.)

next(f) takes the next line of the file, because your original awk script printed the line after 'align="center">' rather than the current line. The decode method (strings have methods in Python) takes the binary data and converts it to a printable unicode object. The rstrip() method strips any trailing whitespace (namely, the newline at the end of each line).
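
The same pattern can be tried offline. Here is a minimal sketch, assuming a made-up HTML fragment held in an io.BytesIO object that stands in for the urlopen response. The names SAMPLE_HTML and line_after_match, and the sample address, are hypothetical and not part of the original code; the real page's markup may differ.

```python
import io

# Hypothetical sample standing in for the page; the real markup may differ.
SAMPLE_HTML = (
    b'<table>\n'
    b'<td align="center">\n'
    b'203.0.113.7\n'
    b'</td>\n'
    b'</table>\n'
)

def line_after_match(stream, marker):
    """Return the decoded line that follows the first line containing marker."""
    for line in stream:
        if marker in line:
            # next() advances the same iterator the for loop is using;
            # the default b'' avoids StopIteration if the match is the last line.
            return next(stream, b'').decode().rstrip()
    return None

print(line_after_match(io.BytesIO(SAMPLE_HTML), b'align="center">'))
```

Passing a default to next() is a small safety net: without it, a match on the very last line would raise StopIteration.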

3
# no need for .readlines here
for ln in f:
    if 'align="center">' in ln:
        print(ln)

But be sure to read the Python tutorial.

4 Comments

TypeError: Type str doesn't support the buffer API
File "ip.py", line 7, in <module> if 'align="center">' in ln: TypeError: Type str doesn't support the buffer API
@user969617, change 'align="center">' to b'align="center">'.
@Rob yeah, I noticed that on the suggestion by mlefavor.
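
The bytes/str mismatch behind that TypeError can be reproduced without any network access. A small sketch, assuming a hard-coded bytes line in place of one read from urlopen (the line content is hypothetical, and the exact error message varies between Python 3 versions):

```python
ln = b'<td align="center">\n'  # hypothetical line, as bytes like urlopen yields

try:
    'align="center">' in ln    # str pattern tested against bytes
    raised = False
except TypeError:
    raised = True

print(raised)                            # whether the str test was rejected
print(b'align="center">' in ln)          # bytes pattern against bytes
print('align="center">' in ln.decode())  # str pattern against decoded text
```

Either fix works: prefix the pattern with b, or decode each line to a string before testing it.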
0

I would probably use regular expressions to get the ip itself:

import re
import urllib.request

f = urllib.request.urlopen('http://www.maxmind.com/app/locate_my_ip')
html_text = f.read()
print(re.findall(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}', html_text)[0])

This prints the first string of the form: 1-3 digits, period, 1-3 digits, and so on for four groups.

If you were looking for the whole line, you could simply extend the pattern passed to findall() to take care of that (see the Python docs for re for more details). By the way, the r in front of the pattern makes it a raw string, so you don't need to escape Python escape characters inside it (but you still need to escape regex special characters).

Hope that helps

3 Comments

Your code gives me: TypeError: can't use a string pattern on a bytes-like object
That's another symptom of the unicode/bytes problem. You'd need html_text=f.read().decode().
Interesting, is this an effect of Python 2.7 vs Python 3? I ran the code (on Python 2.7) and it worked. Thanks mlefavor for pointing to a solution.
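
Putting the decode() fix together with the regular expression, a sketch that runs under Python 3 might look like the following. It assumes a made-up bytes payload in place of f.read() and assumes UTF-8 as the encoding. The names IP_PATTERN and first_ip, and the sample address, are hypothetical.

```python
import re

# Dotted-quad shape only; this does not check that each number is <= 255.
IP_PATTERN = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')

def first_ip(raw_bytes, encoding='utf-8'):
    """Decode the response body and return the first IP-like string, or None."""
    # Assumed encoding; undecodable bytes are replaced rather than raising.
    text = raw_bytes.decode(encoding, errors='replace')
    match = IP_PATTERN.search(text)
    return match.group(0) if match else None

sample = b'<td align="center">\n198.51.100.23\n</td>\n'  # stand-in for f.read()
print(first_ip(sample))
```

Using search() rather than findall()[0] avoids an IndexError when the page contains no match.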
