I've been looking through the jsoup documentation, but all I managed to do was extract things like titles from a URL. What I need is the full absolute URL from a specific div, so that I can store it and use it later.

<div class="link-block container">
                <a href="/what-to-do/11636002" rel="nofollow" 
                        title="unique abilities" class="just-link">
                </a>
</div>

I tried String absHref = link.attr("abs:href"), but it gave me the "title" part of the code. What am I doing wrong? Please give me some advice.

4 Comments
  • Show us your code implementation. Commented May 8, 2015 at 23:54
  • To get an absolute URL from part of it, you need to use a regex: stackoverflow.com/questions/29326901/… Commented May 9, 2015 at 4:32
  • I found a fairly simple way: URL baseUrl = new URL("my base url"); URL url = new URL(baseUrl, "/what-to-do/11636002"); and it works fine, because I end up with an absolute link. Now I just need to know how to extract the "/what-to-do/11636002" part, for example with jsoup. Commented May 9, 2015 at 6:57
  • If some answer worked for you then you should accept it. Else, if you have later found out a better solution to the problem, you can answer your own question and accept that. Commented Nov 19, 2015 at 5:00
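
The base-plus-relative resolution described in the comment about new URL(baseUrl, ...) can be sketched with the standard library alone. This version swaps in java.net.URI.resolve instead of the two-argument URL constructor; the class name ResolveLink, the helper resolve, and the base "http://your.baseurl/some/page" are placeholders chosen for illustration, not anything from the asker's project.

```java
import java.net.URI;

public class ResolveLink {

    // Resolves a (possibly relative) href against a base URL.
    // Both names here are hypothetical helpers for this sketch.
    static String resolve(String baseUrl, String href) {
        return URI.create(baseUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        // Placeholder base URL; a path starting with "/" replaces the base path.
        String abs = resolve("http://your.baseurl/some/page", "/what-to-do/11636002");
        System.out.println(abs); // http://your.baseurl/what-to-do/11636002
    }
}
```

This only covers turning a relative href into an absolute one; pulling the href out of the HTML is still a job for jsoup.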

1 Answer

You can do it like this:

String myHtml = "<div class=\"link-block container\">\n"
                + "  <a href=\"/what-to-do/11636002\" rel=\"nofollow\" title=\"unique abilities\" class=\"just-link\">\n"
                + "  </a>\n"
                + "</div>";

Document doc = Jsoup.parseBodyFragment(myHtml, "http://your.baseurl");
Element e = doc.select("a").first();

System.out.println(e.attr("abs:href"));

Prints:

http://your.baseurl/what-to-do/11636002

To get all similar a elements, do:

Elements links = doc.select("a[href*=/what-to-do/]");
for (Element link : links) {
    // abs:href resolves the relative href against the document's base URI
    System.out.println(link.attr("abs:href"));
}

This selects every a element whose href contains "/what-to-do/".
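
If the page contains many links and only the one inside the link-block div is wanted, the selector can be scoped to that div. The following is a minimal sketch under a few assumptions: the page is already available as a String, "http://your.baseurl" stands in for the real site address, and the class name LinkBlockExtractor and the method extractAbsoluteLinks are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class LinkBlockExtractor {

    // Collects the absolute URL of each a.just-link found inside a div.link-block.
    // The base URI is needed so that relative hrefs can be resolved.
    static List<String> extractAbsoluteLinks(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> urls = new ArrayList<>();
        for (Element link : doc.select("div.link-block a.just-link[href]")) {
            urls.add(link.absUrl("href"));
        }
        return urls;
    }

    public static void main(String[] args) {
        // A small stand-in for a whole page with more than one link.
        String html = "<html><body>"
                + "<a href=\"/other/1\">unrelated</a>"
                + "<div class=\"link-block container\">"
                + "<a href=\"/what-to-do/11636002\" rel=\"nofollow\""
                + " title=\"unique abilities\" class=\"just-link\"></a>"
                + "</div></body></html>";

        for (String url : extractAbsoluteLinks(html, "http://your.baseurl")) {
            System.out.println(url);
        }
    }
}
```

When only the raw "/what-to-do/11636002" part is needed, link.attr("href") returns the attribute value as written in the HTML, without resolving it.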


4 Comments

The problem is that I don't know how to get exactly this part into my variable (for example myHtml). That was my question.
@edinson Where do you want to get it from? If you have myHtml as a String, then you should parse it as in my answer. If it's from a URL, use Jsoup.connect(yourUrl).get(). Or do you mean something else? It's not quite clear to me.
I have a whole HTML site, and from the whole site's code I need to extract the "/what-to-do/11636002" part. So it is not the only URL in the code.
@edinson, just select all the a elements you want from the page. I've updated my answer.
