Python XPath scraping says list has no text attribute

Question

I am using some code to scrape a PDF and build a dictionary from it. My code works when I access each text block individually, e.g.:

x = scraperwiki.pdftoxml(u.read())
r = lxml.etree.fromstring(x)
s = r.xpath('//page[@number="142"]/text[@left = "134"]')
print s[8].text

Individual indexes such as s[0] and s[1] all seem to work, but when I try the same with a slice:

x = scraperwiki.pdftoxml(u.read())
r = lxml.etree.fromstring(x)
s = r.xpath('//page[@number="142"]/text[@left = "134"]')
print s[0:8].text

I get this error: AttributeError: 'list' object has no attribute 'text'

Can anyone tell me what's wrong?

You could just add /text() to the end of your XPath expression if all you care about is the text from each node.
(roippi, commented Aug 30, 2014 at 14:28)
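
A minimal sketch of what that comment suggests. The XML below is a made-up stand-in for whatever scraperwiki.pdftoxml() returns for the real PDF, and the names SAMPLE_XML, root and texts are only for illustration. Note that /text() selects only the text nodes directly under each matched element, so text wrapped in a nested tag such as <b> would not be picked up by this expression.

```python
from lxml import etree

# Hypothetical sample data, not the real PDF output.
SAMPLE_XML = b"""
<pdf2xml>
  <page number="142">
    <text left="134">Alpha</text>
    <text left="134">Beta</text>
    <text left="200">Other column</text>
    <text left="134">Gamma</text>
  </page>
</pdf2xml>
"""

root = etree.fromstring(SAMPLE_XML)

# Ending the expression with /text() makes xpath() return strings
# rather than elements, so no .text lookup is needed afterwards.
texts = root.xpath('//page[@number="142"]/text[@left="134"]/text()')

# The result is still a list, but a list of strings can be sliced
# and printed directly.
print(texts[0:2])
```

Slicing works here because the list already holds the strings themselves.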

falsetru · Accepted Answer · 2014-08-30 14:06:31Z

text is an attribute of each element, not of the list.
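
To see the difference concretely, here is a small sketch. The XML is invented sample data standing in for the pdftoxml output, and the names SAMPLE_XML, root, elems, one and several are only for illustration. It uses print() calls rather than the print statement from the question.

```python
from lxml import etree

# Hypothetical sample data, not the real PDF output.
SAMPLE_XML = b"""
<pdf2xml>
  <page number="142">
    <text left="134">Alpha</text>
    <text left="134">Beta</text>
    <text left="134">Gamma</text>
  </page>
</pdf2xml>
"""

root = etree.fromstring(SAMPLE_XML)
elems = root.xpath('//page[@number="142"]/text[@left="134"]')

one = elems[0]        # indexing returns a single element
several = elems[0:2]  # slicing returns a plain Python list

print(hasattr(one, "text"))       # the element has a .text attribute
print(isinstance(several, list))  # the slice is a list...
print(hasattr(several, "text"))   # ...and a list has no .text attribute
```

So s[8].text asks one element for its text, while s[0:8].text asks a list for an attribute it does not have, which is what the AttributeError reports.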

Iterate over the elements:

x = scraperwiki.pdftoxml(u.read())
r = lxml.etree.fromstring(x)
s = r.xpath('//page[@number="142"]/text[@left = "134"]')
for elem in s[:8]:
    print elem.text

Or use a list comprehension:

x = scraperwiki.pdftoxml(u.read())
r = lxml.etree.fromstring(x)
s = r.xpath('//page[@number="142"]/text[@left = "134"]')
print [elem.text for elem in s[:8]]
