1

I'm looking for a fast function that returns all the style properties of an lxml element, taking into account the CSS stylesheet, the element's style attribute, and inheritance.

For example :

html :

<body>
  <p>A</p>
  <p id='b'>B</p>
  <p style='color:blue'>B</p>
</body>

css :

body {color:red;font-size:12px}
p#b {color:pink;}

python :

elements = document.xpath('//p')
print(get_style(elements[0]))
>{color:red,font-size:12px}
print(get_style(elements[1]))
>{color:pink,font-size:12px}
print(get_style(elements[2]))
>{color:blue,font-size:12px}

Thanks

3
  • Since what you want is not XML parsing but HTML/CSS interpretation, this is not covered by lxml. Commented Mar 1, 2012 at 16:24
  • Sorry, but the only thing that does what you want is a browser. There's no way of resolving CSS rules without implementing a big mess of HTML, CSS and DOM specs. What a mess, eh? Commented Mar 1, 2012 at 16:24
  • Yes, I know, but I want a function that can do it. Commented Mar 1, 2012 at 16:25

1 Answer

2

You can do this with a combination of lxml and cssutils. This cssutils utility module should be able to do what you're asking. Install cssutils along with that module, then run the following code:

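# getDocument and getView come from the cssutils utility module mentioned above.
from style import getDocument, getView

html = """<body>
    <p>A</p>
    <p id='b'>B</p>
    <p style='color:blue'>B</p>
</body>"""

css = """body {color:red;font-size:12px}
p {color:yellow;}
p#b {color:green;}"""


def get_style(element, view):
    """Return [[inline, stylesheet], <result for the parent>], or None past the root."""
    if element is None:
        return None
    # Inline declarations from the element's own style attribute.
    inline_style = [value for name, value in element.items() if name == 'style']
    # Stylesheet declarations that the view matched to this element, if any.
    outside_style = []
    if element in view:
        outside_style = view[element].getCssText()
    r = [[inline_style, outside_style]]
    # Recurse so the caller can see what each ancestor declares.
    r.append(get_style(element.getparent(), view))
    return r

document = getDocument(html)
view = getView(document, css)

elements = document.xpath('//p')
print(get_style(elements[0], view))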
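
The recursion above reports what each ancestor declares but leaves the merging to the caller, so inherited values such as font-size never end up in the paragraph's own result. Here is a minimal sketch of that merge step. It is not part of cssutils or lxml, and the names INHERITED, parse_declarations, computed_style, parents and sheet_styles are hypothetical. To stay dependency-free it uses the standard library's xml.etree.ElementTree with a parent map (with lxml, element.getparent() could play the same role), and sheet_styles is written by hand as a stand-in for whatever selector matching you use. It also assumes the inline style always wins and ignores specificity and !important.

```python
import xml.etree.ElementTree as ET

# Assumption: a small, incomplete subset of the properties CSS inherits by default.
INHERITED = {"color", "font-size", "font-family", "line-height"}


def parse_declarations(text):
    """Parse a string such as 'color:blue; font-size:12px' into a dict."""
    result = {}
    for declaration in (text or "").split(";"):
        name, sep, value = declaration.partition(":")
        if sep and name.strip():
            result[name.strip().lower()] = value.strip()
    return result


def computed_style(element, parents, sheet_styles):
    """Merge inherited, stylesheet and inline declarations for one element.

    parents maps each element to its parent; sheet_styles maps an element
    to the dict of declarations the stylesheet assigns to it.
    """
    style = {}
    parent = parents.get(element)
    if parent is not None:
        # Start from the parent's computed values, keeping inheritable ones only.
        inherited = computed_style(parent, parents, sheet_styles)
        style = {k: v for k, v in inherited.items() if k in INHERITED}
    # Stylesheet declarations override inherited values.
    style.update(sheet_styles.get(element, {}))
    # Inline declarations override stylesheet declarations (simplification).
    style.update(parse_declarations(element.get("style")))
    return style


root = ET.fromstring(
    "<body><p>A</p><p id='b'>B</p><p style='color:blue'>B</p></body>"
)
parents = {child: parent for parent in root.iter() for child in parent}
paragraphs = root.findall("p")

# Hand-written stand-in for selector matching: body {...} and p#b {...}.
sheet_styles = {
    root: {"color": "red", "font-size": "12px"},
    paragraphs[1]: {"color": "pink"},
}

styles = [computed_style(p, parents, sheet_styles) for p in paragraphs]
for style in styles:
    print(style)
```

For the three paragraphs this gives red, pink and blue for color, each with the inherited font-size of 12px, which matches the output asked for in the question. Each call walks up to the root again, so for a whole page it would probably be worth caching the result per element.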

4 Comments

Not really; it does not handle inheritance. get_style(elements[0]) -> {}
print(get_style(elements[0], view)) -> {color:yellow;}, but I don't get font-size because inheritance doesn't work.
I updated my example; it's more correct now. In this case get_style(elements[0], view) -> {}, whereas the color should be red.
Thanks, it's interesting, but I don't think it's scalable enough to quickly collect all the CSS properties of a web page.
