I'm very new to HTML parsing with Java. I previously used Jsoup to parse simple, static HTML, but I now need to parse a web page that has dynamic elements. Below is the code I tried; it could not find the elements because they are added after the page has loaded. The page in question uses Google Maps with markers on it, and I'm attempting to scrape the images of these markers.

    public static void main(String[] args) {
        Document doc;
        try {
            // Jsoup fetches only the static HTML; scripts on the page are not executed
            doc = Jsoup.connect("https://pokevision.com")
                    .userAgent(
                            "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106 Safari/537.36")
                    .get();
        } catch (IOException e) {
            e.printStackTrace();
            return; // nothing to parse if the request failed
        }

        // Select <img> elements whose src contains an image file extension
        Elements images = doc.select("img[src~=(?i)\\.(png|jpe?g|gif)]");

        for (Element image : images) {
            System.out.println("src : " + image.attr("src"));
        }
    }

Since this apparently isn't possible with Jsoup alone, what other libraries can I use to find the image sources? Example of an element I am attempting to select:

1 Answer

The problem you are facing is that Jsoup retrieves the static source code, exactly as it would be delivered to a browser, and does not execute any scripts. What you want is the DOM after the JavaScript has run. For this, you can use HtmlUnit to get the rendered page and then pass its content to Jsoup for parsing.

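    // Capture the rendered page; try-with-resources closes the WebClient when done
    try (WebClient webClient = new WebClient()) {
        HtmlPage myPage = webClient.getPage("https://pokevision.com");

        // Assumption: the markers are added asynchronously, so give background
        // scripts up to 10 seconds to finish (adjust the timeout as needed)
        webClient.waitForBackgroundJavaScript(10_000);

        // Convert to a Jsoup DOM
        Document doc = Jsoup.parse(myPage.asXml());

        // Extract data using Jsoup selectors
        Elements images = doc.select("img[src~=(?i)\\.(png|jpe?g|gif)]");
        for (Element image : images) {
            System.out.println("src : " + image.attr("src"));
        }
    }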
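
If the selector still returns fewer or more images than you expect, it can help to check the regular expression on its own. The sketch below uses only `java.util.regex` and assumes, as far as I know is the case, that Jsoup applies the `~=` pattern with "find" semantics, i.e. it may match anywhere in the attribute value. The class name, helper methods, and sample paths are hypothetical and only there for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class ImageSrcPattern {
    // Same expression as in the selector, written as a Java string literal
    static final Pattern IMAGE_SRC = Pattern.compile("(?i)\\.(png|jpe?g|gif)");

    // Hypothetical helper: true if the pattern occurs anywhere in the src value
    static boolean looksLikeImage(String src) {
        return IMAGE_SRC.matcher(src).find();
    }

    // Hypothetical helper: keep only the src values that match the pattern
    static List<String> keepImages(List<String> srcs) {
        return srcs.stream()
                .filter(ImageSrcPattern::looksLikeImage)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample values, not taken from the real page
        List<String> samples = List.of(
                "/img/marker.PNG",
                "/icons/25.jpeg?v=2",
                "/static/logo.svg",
                "/files/photo.gift");

        for (String src : keepImages(samples)) {
            System.out.println("src : " + src);
        }
    }
}
```

With these samples, the `.PNG` and `.jpeg?v=2` paths are kept thanks to `(?i)` and the unanchored match, the `.svg` path is dropped, and `photo.gift` is also kept because `.gif` occurs inside it. If that last case matters to you, a stricter pattern such as one ending in `(\?|$)` is an option.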