10

I have HTML parsed into an org.w3c.dom.Document. I need to check every element's style attribute, parse it, change some CSS properties, and put the modified style definition back into the attribute.

Is there a standard way to parse a style attribute? How can I use the classes and interfaces from the org.w3c.dom.css package?

I need a Java solution.

+1 for not suggesting a regex. That is what 9 out of 10 newbies ask for first, and as we all know, that can't be done. Commented Nov 23, 2010 at 13:19

3 Answers

3

If you want a way to do this without any dependencies you can use the javax.swing.text.html package classes to get you most of the way there:

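import javax.swing.text.AttributeSet;
import javax.swing.text.html.CSS;
import javax.swing.text.html.StyleSheet;

// Parse an inline declaration list; the margin shorthand is expanded per side.
StyleSheet styleSheet = new StyleSheet();
AttributeSet dec = styleSheet.getDeclaration("margin:2px;padding:3px");
Object marginLeft = dec.getAttribute(CSS.Attribute.MARGIN_LEFT);
String marginLeftString = marginLeft.toString(); // "2px"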

getAttribute returns a CssValue object whose class is unfortunately not public, hence the conversion to a String. Also, it won't handle em units. It is fairly smart about expanding shorthand styles, though. Not ideal, but it avoids dependencies.
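To get closer to the key-value view the question asks for, the same declaration can be enumerated instead of queried one attribute at a time. The following is only a sketch: the class name SwingStyleDemo and the method toMap are made up for illustration, and it assumes the Swing parser keeps only properties it knows about (as far as I can tell, anything missing from CSS.Attribute, such as vendor-prefixed properties, is silently dropped).

```java
import java.util.Enumeration;
import java.util.Map;
import java.util.TreeMap;
import javax.swing.text.AttributeSet;
import javax.swing.text.html.StyleSheet;

// Hypothetical helper class; the name is not part of any library.
final class SwingStyleDemo {

    // Parse an inline declaration list and copy every property the
    // Swing parser recognized into a sorted name -> value map.
    static Map<String, String> toMap(String style) {
        AttributeSet declaration = new StyleSheet().getDeclaration(style);
        Map<String, String> properties = new TreeMap<>();
        Enumeration<?> names = declaration.getAttributeNames();
        while (names.hasMoreElements()) {
            Object name = names.nextElement();
            // Keys and values are converted with toString() because the
            // value class is not public.
            properties.put(name.toString(), declaration.getAttribute(name).toString());
        }
        return properties;
    }

    public static void main(String[] args) {
        // Shorthands such as margin show up as per-side entries (margin-left, ...).
        System.out.println(toMap("margin:2px;padding:3px"));
    }
}
```

Because shorthands are expanded and unknown properties are lost, this suits reading styles better than round-tripping them back into the attribute.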


1 Comment

Thanks for the solution, but I'd like to see something more general that also supports non-standard CSS properties (e.g. moz_xxx). In other words, a generic parser that only parses the style and does not implement it.
1

First, I would check out the classes in the javax.xml packages. The javax.xml.parsers package contains parsers for two styles of parsing: SAXParser and DocumentBuilder. It sounds like you want the DocumentBuilder to create a DOM. You can either traverse the DOM manually (slow and painful), or you can use the XPath standard to look up elements in the DOM. Java support for that is in javax.xml.xpath.

import javax.xml.xpath.*;

XPathExpression expr = XPathFactory.newInstance().newXPath().compile("//@style");
Object results = expr.evaluate(dom, XPathConstants.NODESET);

It's your responsibility to cast the results to a NodeList and iterate over it properly, but it's the most direct way to get at what you want. Check out Java's DOM API for more information about reading and changing values.
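As a rough sketch of that cast-and-iterate step, the helper below visits every style attribute and writes a rewritten value back. The class StyleAttributeWalker, its method names, and the sample markup are invented for illustration; only the javax.xml and org.w3c.dom calls are standard API. Checked exceptions are wrapped so the helpers are easy to call from a snippet.

```java
import java.io.StringReader;
import java.util.function.UnaryOperator;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; names are illustrative only.
final class StyleAttributeWalker {

    // Applies 'rewrite' to the value of every style attribute in the
    // document and returns how many attributes were visited.
    static int rewriteStyles(Document dom, UnaryOperator<String> rewrite) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList styles =
                    (NodeList) xpath.evaluate("//@style", dom, XPathConstants.NODESET);
            for (int i = 0; i < styles.getLength(); i++) {
                Attr style = (Attr) styles.item(i);
                style.setValue(rewrite.apply(style.getValue()));
            }
            return styles.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    // Demo-only helper: builds a DOM from a well-formed XML string.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample markup", e);
        }
    }

    public static void main(String[] args) {
        Document dom = parse("<div style=\"color:red\"><p style=\"margin:0\">x</p></div>");
        int visited = rewriteStyles(dom, css -> css + ";display:none");
        System.out.println(visited + " style attributes rewritten");
    }
}
```

The rewrite function is where the actual CSS parsing and modification would plug in.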

I don't believe there is any support for a CSS parser built into Java, but you can look at these projects:

These may help you with your goals. NOTE: the Batik CSS parser is part of the larger Apache Batik project (http://xmlgraphics.apache.org/batik/index.html), which may include more than you need, but it has a corporate-friendly license.

3 Comments

The HTML is already parsed, and I know how to collect the style attributes. Now I have to parse the content of those attributes, i.e. convert a string of CSS declarations into a collection of key-value pairs or something similar.
Did you look at the CSS parser projects I pointed you to? There are no javax.* packages for parsing CSS. The poor man's approach would be regex which will work fine for CSS--but that's not what you wanted.
Thanks for the library links. css.sac is intended for parsing CSS stylesheets. cssparser has no documentation at all, not even a simple how-to. Batik seems too complex for my task.
0

I'm not sure I completely understand your requirements, but basically, you'll have to:

  1. Read the stylesheet(s) and extract the CSS rules.
  2. Read the HTML page(s) and find the attributes.
  3. Substitute the new CSS properties for the old CSS properties.
  4. Write the HTML page(s).

It looks like you would use the CSSStyleSheet interface to extract the CSS rules from the stylesheet(s).
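When the CSS lives in a style attribute rather than a stylesheet, the substitute step can be sketched with a small hand-written splitter. This is not a standards-complete CSS parser and not taken from any library: the class InlineStyle and its methods are hypothetical, it ignores CSS comments, and it keeps property names exactly as written. It does avoid splitting on semicolons and colons inside quotes or parentheses, and it keeps non-standard properties.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; a simplified sketch, not a full CSS parser.
final class InlineStyle {

    // Splits "name: value; name: value" into an ordered map. Semicolons
    // inside quotes or parentheses, e.g. url(data:image/png;base64,...),
    // do not end a declaration.
    static Map<String, String> parse(String style) {
        Map<String, String> properties = new LinkedHashMap<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;   // nesting level of parentheses
        char quote = 0;  // active quote character, or 0 outside a string
        for (int i = 0; i < style.length(); i++) {
            char c = style.charAt(i);
            if (quote != 0) {
                if (c == '\\' && i + 1 < style.length()) {
                    current.append(c).append(style.charAt(++i)); // keep escaped char
                    continue;
                }
                if (c == quote) quote = 0;
            } else if (c == '"' || c == '\'') {
                quote = c;
            } else if (c == '(') {
                depth++;
            } else if (c == ')' && depth > 0) {
                depth--;
            } else if (c == ';' && depth == 0) {
                addDeclaration(current.toString(), properties);
                current.setLength(0);
                continue;
            }
            current.append(c);
        }
        addDeclaration(current.toString(), properties); // last one may lack ';'
        return properties;
    }

    // Splits one declaration at its first colon; malformed or empty ones are skipped.
    private static void addDeclaration(String declaration, Map<String, String> properties) {
        int colon = declaration.indexOf(':');
        if (colon <= 0) return;
        String name = declaration.substring(0, colon).trim();
        String value = declaration.substring(colon + 1).trim();
        if (!name.isEmpty() && !value.isEmpty()) properties.put(name, value);
    }

    // Joins the map back into a string suitable for a style attribute.
    static String serialize(Map<String, String> properties) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> e : properties.entrySet()) {
            if (out.length() > 0) out.append("; ");
            out.append(e.getKey()).append(": ").append(e.getValue());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> style = parse("color: red; -moz-box-sizing:border-box");
        style.put("color", "blue"); // substitute the new value for the old one
        System.out.println(serialize(style));
    }
}
```

A LinkedHashMap is used so the declarations keep their original order when written back; a later duplicate of a property overwrites the earlier value.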

1 Comment

No, I have a style attribute value in a string, and I have to parse it into key-value pairs according to CSS standards.
