2

I have this Element:

<td id="color" align="center">
Z 29.02-23.05 someText,
<br> 
some.Text2 <a href="man.php?id=111">J. Smith</a> (l.)&nbsp;
</td>

How do I get the text after the <br> tag, so that the result looks like some.Text2 J. Smith? I tried to find the answer in the documentation, but ...

Update

If I use

System.out.println(element.select("a").text());

I get only J. Smith. Unfortunately, I don't know how to parse tags like <br>.

2
  • @Rao, thanks. I don't know XPath, so I will go and read up on it. Commented Feb 16, 2016 at 15:52
  • @LordAnomander exactly! I forgot about split... That is nice too. Thanks! Please post it as an answer. Commented Feb 16, 2016 at 15:59

2 Answers

3

Node.childNodes() could save your life:

package com.github.davidepastore.stackoverflow35436825;

import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

/**
 * Stackoverflow 35436825
 *
 */
public class App 
{
    public static void main( String[] args )
    {
        String html = "<html><body><table><tr><td id=\"color\" align=\"center\">" +
                        "Z 29.02-23.05 someText," +
                        "<br>" +
                        "some.Text2 <a href=\"man.php?id=111\">J. Smith</a> (l.)&nbsp;" +
                        "</td></tr></table></body></html>";
        Document doc = Jsoup.parse( html );
        Element td = doc.getElementById( "color" );
        String text = getText( td );
        System.out.println("Text: " + text);
    }

    /**
     * Get the custom text from the given {@link Element}.
     * @param element The {@link Element} from which get the custom text.
     * @return Returns the custom text.
     */
    private static String getText(Element element) {
        StringBuilder working = new StringBuilder();
        List<Node> childNodes = element.childNodes();
        boolean brFound = false;
        for (Node child : childNodes) {
            if (child instanceof TextNode) {
                // Text directly inside the element: keep it only once the <br> has been seen.
                if (brFound) {
                    working.append(((TextNode) child).text());
                }
            } else if (child instanceof Element) {
                Element childElement = (Element) child;
                if (brFound) {
                    // Nested elements such as <a> contribute their text content.
                    working.append(childElement.text());
                }
                if (childElement.tagName().equals("br")) {
                    brFound = true;
                }
            }
        }
        return working.toString();
    }
}

The output will be:

Text: some.Text2 J. Smith (l.) 
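
A shorter variant of the same idea, as a sketch only: find the <br> and walk its following siblings with Node.nextSibling(). This assumes a jsoup version that has Element.selectFirst (on older versions, select("br").first() should do the same job). The class and method names (AfterBr, textAfterFirstBr) are made up for this example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class AfterBr {

    /** Collects the text of every node that follows the first br inside the element. */
    static String textAfterFirstBr(Element element) {
        Element br = element.selectFirst("br");
        if (br == null) {
            return ""; // no <br> inside this element
        }
        StringBuilder text = new StringBuilder();
        // Walk the siblings that come after the <br>, in document order.
        for (Node node = br.nextSibling(); node != null; node = node.nextSibling()) {
            if (node instanceof TextNode) {
                text.append(((TextNode) node).text());
            } else if (node instanceof Element) {
                text.append(((Element) node).text());
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String html = "<table><tr><td id=\"color\" align=\"center\">"
                + "Z 29.02-23.05 someText,<br>"
                + "some.Text2 <a href=\"man.php?id=111\">J. Smith</a> (l.)&nbsp;"
                + "</td></tr></table>";
        Element td = Jsoup.parse(html).getElementById("color");
        // The result starts with "some.Text2 J. Smith (l.)"; how the trailing
        // &nbsp; is rendered may depend on the jsoup version.
        System.out.println(textAfterFirstBr(td));
    }
}
```

Note that selectFirst("br") matches the first <br> anywhere below the element, so this sketch assumes the <br> is a direct child, as it is in the question's markup.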

1 Comment

Oh!! Thank you so much! I will study Nodes in more detail, it's cool.
1

As far as I know, you can only get the text between two tags, which is not possible with a single <br/> tag in your document.

The only option I can think of is to split() the element's HTML and take the second part. The split has to run on html(), because text() has already dropped the tags:

String partAfterBr = element.html().split("<br\\s*/?>", 2)[1]; // matches <br>, <br/> and <br />
Document relevantPart = Jsoup.parse(partAfterBr);
// do whatever you want with the Document in order to get the necessary parts
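
For reference, a self-contained sketch of the split idea, assuming jsoup is on the classpath. The class and method names (SplitAfterBr, textAfterBr) are made up for illustration. It works on the element's inner HTML, since the plain text contains no tags to split on, and it guards against a missing <br>.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public class SplitAfterBr {

    /** Returns the text that follows the first br in the element's markup, or "" if there is none. */
    static String textAfterBr(Element element) {
        // html() keeps the tags, so the <br> is still there to split on.
        // The pattern accepts <br>, <br/> and <br />; limit 2 keeps everything after the first match together.
        String[] parts = element.html().split("<br\\s*/?>", 2);
        if (parts.length < 2) {
            return ""; // no <br> found
        }
        // Re-parse the remaining fragment so that tags and entities are handled by jsoup.
        return Jsoup.parse(parts[1]).text();
    }

    public static void main(String[] args) {
        String html = "<table><tr><td id=\"color\" align=\"center\">"
                + "Z 29.02-23.05 someText,<br>"
                + "some.Text2 <a href=\"man.php?id=111\">J. Smith</a> (l.)&nbsp;"
                + "</td></tr></table>";
        Element td = Jsoup.parse(html).getElementById("color");
        // The result starts with "some.Text2 J. Smith (l.)".
        System.out.println(textAfterBr(td));
    }
}
```

Splitting markup as a string is fragile compared with walking the node tree, so this is best kept for simple, predictable fragments like the one in the question.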
