Weird python error when using lxml and xpath

Question

I'm using Python to write a crawler. I need to parse HTML, so I imported lxml, but I get a weird error:

<type 'dict'>
{'xpath': '//ul[@id="i-detail"]/li[1]', 'name': u'\u6807\u9898'}

<type 'dict'>
{'xpath': '//ul[@id="i-detail"]/li[1]', 'name': u'\u6807\u9898'}

<type 'dict'>   
{'xpath': '//ul[@id="i-detail"]/li[1]', 'name': u'\u6807\u9898'}
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 522, in __bootstrap_inner
    self.run()
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 477, in run
    self.__target(*self.__args, **self.__kwargs)
  File "fetcher.py", line 78, in run
    self.extractContent(html)
  File "fetcher.py", line 151, in extractContent
    m = tree.xpath(c['xpath'])
AttributeError: 'NoneType' object has no attribute 'xpath'

<type 'dict'>
{'xpath': '//ul[@id="i-detail"]/li[1]', 'name': u'\u6807\u9898'}

Here's a piece of my code:

for c in self.contents:
  print type(c)
  print c
  m = tree.xpath(c['xpath'])

Please help me with these two questions:

Why is the type dict when the error says NoneType?
I'm trying to match something in the "tree", but it doesn't work. (The website is encoded in GBK; could the encoding cause this kind of problem?)

Community · Accepted Answer · 2020-06-20 09:12:55Z

1

You are getting an AttributeError, which means that tree has no xpath attribute because it is None. It does not mean that c has no 'xpath' key; that would be a KeyError instead.
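To make the distinction concrete, here is a minimal sketch that triggers both errors side by side. The describe_failure helper and the 'missing' key are made up for illustration, and tree is set to None by hand to stand in for whatever happened in the real code.

```python
c = {'xpath': '//ul[@id="i-detail"]/li[1]'}
tree = None  # assumption: stand-in for a tree that was never built


def describe_failure(fn):
    """Call fn and return the name of the exception it raises, if any."""
    try:
        fn()
    except (AttributeError, KeyError) as exc:
        return type(exc).__name__
    return "no error"


# Attribute lookup on None fails with AttributeError.
attr_result = describe_failure(lambda: tree.xpath(c['xpath']))
# A missing dictionary key fails with KeyError.
key_result = describe_failure(lambda: c['missing'])

print(attr_result)
print(key_result)
```

The first line printed is AttributeError and the second is KeyError, which is why the traceback points at tree rather than at c.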

Clearly we are missing some code here, where tree is set to None.
You are not printing the result of your tree.xpath() calls, so there is nothing in your code (as shared with us here) that prints m. The tree.xpath() calls could be working fine for all we know.

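Reading between the lines and speculating a little, tree is being rebound somewhere, for example by assigning the result of some call back to tree. Note that lxml's xpath() returns an empty list, not None, when an element-selecting expression matches nothing, so the None more likely comes from a different step, such as a parse or fetch that produced nothing. Either way, the next time into the loop you have None instead of an element, so the xpath() call fails with an AttributeError.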
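Whatever the exact cause turns out to be, a check before the loop makes the failure much easier to read. This is only a sketch under assumptions: extract_content here is a free function with a made-up signature (the original is a method we haven't seen), and StubTree is a hypothetical stand-in so the example runs without lxml.

```python
class StubTree(object):
    """Hypothetical stand-in for a parsed document; not part of lxml."""

    def xpath(self, expression):
        # The stub ignores the expression and returns an empty list,
        # as if nothing matched.
        return []


def extract_content(tree, contents):
    """Run every configured XPath, failing early if there is no tree."""
    if tree is None:
        raise ValueError("no parsed document: tree is None")
    results = {}
    for c in contents:
        results[c['name']] = tree.xpath(c['xpath'])
    return results


contents = [{'xpath': '//ul[@id="i-detail"]/li[1]', 'name': u'\u6807\u9898'}]

# With a tree object the loop runs normally.
ok = extract_content(StubTree(), contents)

# With None the guard reports the real problem up front.
try:
    extract_content(None, contents)
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```

Raising early is one option; logging the URL and skipping the page may suit a crawler better, depending on how failures should be handled.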

answered Jul 11, 2012 at 7:18

Martijn Pieters

BrenBarn · Accepted Answer · 2012-07-11 07:14:43Z

0

For your first question, the error is telling you that tree is None, since that's what you're trying to read the xpath attribute of. But you are printing the type of c, not tree.
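A quick way to confirm this is to print the type of tree next to the type of c. A small sketch, where describe is a made-up helper and tree is set to None by hand to mimic the state at the point of failure:

```python
def describe(name, value):
    """Return a one-line summary of a variable's type for debug output."""
    return "%s is %s" % (name, type(value).__name__)


c = {'xpath': '//ul[@id="i-detail"]/li[1]'}
tree = None  # assumption: the state of tree when the error occurs

lines = [describe('c', c), describe('tree', tree)]
for line in lines:
    print(line)
```

This prints "c is dict" followed by "tree is NoneType", matching both the debug output and the traceback in the question.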

I can't understand what you're asking with your second question.

answered Jul 11, 2012 at 7:14

BrenBarn
