
I am trying to fetch a dynamic page from a URL in Java. I did this with Selenium, but it is slow because starting the Selenium driver takes a long time. That is why I switched to HtmlUnit, which is a GUI-less browser. However, my HtmlUnit implementation throws an exception.

Questions:

  1. How can I correct my HtmlUnit implementation?
  2. Is the page produced by Selenium similar to the page produced by HtmlUnit? (Are both dynamic or not?)

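My Selenium code:

import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

public class SeleniumExample {

    public static void main(String[] args) throws IOException {
        // Selenium: start Firefox, load the page, and read the rendered source
        WebDriver driver = new FirefoxDriver();
        driver.get("ANY URL HERE");
        String html_content = driver.getPageSource();
        driver.quit(); // ends the session and closes the browser

        // Jsoup builds a DOM by parsing the HTML content
        Document doc = Jsoup.parse(html_content);

        // OPERATIONS USING DOM TREE
    }
}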

HtmlUnit code:

package XXX.YYY.ZZZ.Template_Matching;

import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class HtmlUnit {

    public static void main(String[] args) throws Exception {
        WebClient webClient = new WebClient();
        // Loads the page and runs its JavaScript
        HtmlPage currentPage = webClient.getPage("http://www.jabong.com/women/clothing/womens-tops/?source=women-leftnav");
        // asText() returns the visible text of the page, not its markup
        String textSource = currentPage.asText();
        System.out.println(textSource);
    }
}

It throws an exception:

[Image: screenshot of the stack trace, not available]

1 Answer

1: How can I correct my HtmlUnit implementation?

Looking at the stack trace, it seems to say that the JavaScript engine executed some JavaScript that tried to access an attribute of a JavaScript "undefined" value. If that is correct, it is a bug in the JavaScript you are testing, not in the HtmlUnit code.

2: Is the page produced by Selenium similar to the page produced by HtmlUnit?

That does not make sense. Neither Selenium nor HtmlUnit "produces" a page. The page is produced by the server code you are testing.

If you are asking whether HtmlUnit is capable of dealing with pages that have embedded JavaScript ... there is clear evidence in the stack trace that it is trying to execute the JavaScript.
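
If the goal is only to retrieve the rendered markup, and fixing the page's script is not an option, one possible workaround is to tell HtmlUnit not to abort when a script fails. The following is a minimal sketch rather than a definitive fix. It assumes HtmlUnit 2.x (the `com.gargoylesoftware.htmlunit` package; 3.x uses `org.htmlunit` instead). The class name `LenientFetch`, the helper names, and the 5-second wait are hypothetical choices for illustration.

```java
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

class LenientFetch {

    // Hypothetical helper: builds a client that logs script errors
    // instead of throwing an exception for them.
    static WebClient newLenientClient() {
        WebClient webClient = new WebClient();
        webClient.getOptions().setThrowExceptionOnScriptError(false);
        // Also tolerate 4xx/5xx responses for sub-resources and the page itself.
        webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
        // CSS processing is not needed just to obtain the markup.
        webClient.getOptions().setCssEnabled(false);
        return webClient;
    }

    // Hypothetical helper: returns the DOM, serialized as XML, after scripts ran.
    static String fetchRenderedHtml(String url) throws Exception {
        try (WebClient webClient = newLenientClient()) {
            HtmlPage page = webClient.getPage(url);
            // Give background JavaScript (timers, AJAX) up to 5 seconds; assumed value.
            webClient.waitForBackgroundJavaScript(5_000);
            return page.asXml();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: LenientFetch <url>");
            return;
        }
        System.out.println(fetchRenderedHtml(args[0]));
    }
}
```

The returned string can be passed to `Jsoup.parse` in the same way as the Selenium page source. Suppressing the exception does not repair the failing script, so whatever content that script was meant to generate may still be missing from the result.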


3 Comments

My second question is now clear: HtmlUnit is able to get the dynamic source of any URL. But regarding my first question, how can I resolve this? My task is to get the dynamic page of any URL on the web as a String. Please help me do that, or tell me any other method that does the same.
Please help me with this problem.
Like I said, I think the stack trace is telling you there is a bug in the web page that your server is delivering. You resolve it by finding out WHY the JavaScript is trying to get an attribute from 'undefined'.
