Python: why does the following XPath return an empty list?

Question

I am trying to extract some text and links from instapaper.com. So I am using the following code to get the job done:

>>> import lxml.html as lh
>>> doc = lh.parse("http://www.instapaper.com/u/folder/1227370/programming")
>>> text = doc.xpath(".//*[@id='bookmark_list']/*/div[3]/a/text()")
>>> len(text)
0
>>> text
[]

As you can see, it returns an empty list, which means it cannot find any text matching the XPath above.

However, when I use the same XPath expression in Firebug/FirePath, it works fine.

[Image: Firebug/FirePath screenshot of the XPath matches]

As the image above shows, it finds 40 matching nodes.

So my question is: why does this XPath expression not work with Python/lxml?

As requested: the Instapaper page source.

Try removing the first period character.

Commented Aug 6, 2012 at 10:13 — Niels Bom

score 5 · Accepted Answer · 2012-08-06 10:23:43Z


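The HTML that lxml downloads contains no element with the ID bookmark_list, so the XPath has nothing to match. You probably have to be logged in to see the bookmark list: lh.parse() fetches the URL without your browser's session, so it does not receive the same page you inspect in Firebug.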
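A quick way to check this kind of problem, as a minimal sketch: before debugging the XPath itself, test whether the container element exists at all in the HTML that lxml sees. The two HTML strings below are made-up stand-ins for a logged-in and a logged-out response, not Instapaper's actual markup, and diagnose() is a hypothetical helper name.

```python
import lxml.html as lh

# Hypothetical stand-ins for the two responses; NOT Instapaper's real markup.
LOGGED_IN = """
<html><body>
  <div id="bookmark_list">
    <div class="item">
      <div>star</div><div>meta</div>
      <div><a href="http://example.com/a">First article</a></div>
    </div>
    <div class="item">
      <div>star</div><div>meta</div>
      <div><a href="http://example.com/b">Second article</a></div>
    </div>
  </div>
</body></html>
"""
LOGGED_OUT = (
    "<html><body><form id='login'>"
    "<input name='username'></form></body></html>"
)

# The expression from the question.
XPATH = ".//*[@id='bookmark_list']/*/div[3]/a/text()"


def diagnose(html):
    """Return (container_found, matched_texts) for an HTML string."""
    root = lh.fromstring(html)
    # First check the anchor of the expression on its own.
    container_found = bool(root.xpath("//*[@id='bookmark_list']"))
    return container_found, root.xpath(XPATH)


for name, html in [("logged in", LOGGED_IN), ("logged out", LOGGED_OUT)]:
    found, texts = diagnose(html)
    print(name, "-> container found:", found, "| matches:", len(texts))
```

If the container is missing, the expression is not at fault; the document being parsed is simply not the one shown in the browser.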

Edit

Parsing the real HTML (the page source you posted) works:

import lxml.html as lh

# Parse the page source posted in the question instead of the live URL
doc = lh.parse("http://pastebin.com/raw.php?i=1WpFAfCt")
text = doc.xpath("//*[@id='bookmark_list']/*/div[3]/a/text()")
len(text)  # => 40
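Two follow-up points can be sketched on a small sample. First, the leading period is not what made the difference: when evaluated from the document root, the expression with and without it selects the same nodes here. Second, since the goal was to get both text and links, selecting the a elements themselves gives access to the href attribute as well. The SAMPLE markup below is invented for illustration and only assumes the same nesting as the XPath in the question.

```python
import lxml.html as lh

# Hypothetical markup for illustration only; not Instapaper's real page.
SAMPLE = """
<html><body>
  <div id="bookmark_list">
    <div class="item">
      <div>star</div><div>meta</div>
      <div><a href="http://example.com/a">First article</a></div>
    </div>
    <div class="item">
      <div>star</div><div>meta</div>
      <div><a href="http://example.com/b">Second article</a></div>
    </div>
  </div>
</body></html>
"""

root = lh.fromstring(SAMPLE)

# With and without the leading period, evaluated from the root element.
with_dot = root.xpath(".//*[@id='bookmark_list']/*/div[3]/a/text()")
without_dot = root.xpath("//*[@id='bookmark_list']/*/div[3]/a/text()")
same_result = with_dot == without_dot

# Select the <a> elements to read both the link text and the href.
anchors = root.xpath("//*[@id='bookmark_list']/*/div[3]/a")
pairs = [(a.text_content().strip(), a.get("href")) for a in anchors]

print(same_result)
for title, url in pairs:
    print(title, url)
```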

edited Aug 6, 2012 at 10:23

answered Aug 6, 2012 at 10:15

user647772


1 Comment

ronnie Over a year ago

Nice catch. Yes, I am logged in.
