I have tried almost all the methods mentioned on Stack Overflow, but none of them worked.

I'm trying to scrape the following page using HtmlUnit: http://www.nseindia.com/corporates/offerdocument/past_issue_document.htm

Only an empty page is returned, which is probably caused by a JavaScript issue. I tried the following tricks in HtmlUnit: waitForBackgroundJavaScript, refresh, redirect, sleep, enabling JavaScript, click(true, true, true), etc. None of them worked.

Any suggestions?

My code:

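import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

String url = "http://www.nseindia.com/corporates/offerdocument/past_issue_document.htm";
WebClient webClient = new WebClient(BrowserVersion.INTERNET_EXPLORER_8);
webClient.setJavaScriptEnabled(true);
HtmlPage page = (HtmlPage) webClient.getPage(url);
// Wait for background JavaScript tasks scheduled to start within the next 5 seconds to finish.
webClient.waitForBackgroundJavaScriptStartingBefore(5000);
System.out.println(page.asXml());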

Thanks a lot!

  • Links to the other methods you've tried would be useful. Commented Dec 18, 2012 at 18:16
  • If the page uses AJAX, you may need webClient.setAjaxController(new NicelyResynchronizingAjaxController());. This makes AJAX calls issued by the page run synchronously, so each call blocks until it completes. Commented Dec 20, 2012 at 2:54

1 Answer

I once had a similar issue. I worked around it by using a Firefox developer plugin that logs all the requests the page's JavaScript makes. Then I emulated those requests directly from HtmlUnit: just grep the requests from the request log, paste them, and inject the session ID and other miscellaneous parameters, which are usually easy to identify. This is especially useful when dealing with sites that use a lot of AJAX.
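As a rough illustration of replaying a logged request, here is a minimal sketch. Note that it swaps in the JDK's own java.net.http client (Java 11+) instead of HtmlUnit so that it runs without extra dependencies. The class name ReplayAjaxRequest, the helper buildRequest, the header values, and the cookie format are all assumptions for illustration; the real data URL and session parameters have to be copied from your own request log.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper: replays one AJAX request captured in the browser's request log.
class ReplayAjaxRequest {

    // Builds a GET request that looks like the page's own AJAX call.
    // dataUrl:       the request URL copied from the log (assumption: a plain GET).
    // sessionCookie: the Cookie header value from the log, e.g. "JSESSIONID=...";
    //                pass null or "" if the site needs no session.
    static HttpRequest buildRequest(String dataUrl, String sessionCookie) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(dataUrl))
                .timeout(Duration.ofSeconds(30))
                // Many servers check this header to tell AJAX calls from page loads.
                .header("X-Requested-With", "XMLHttpRequest")
                .header("Accept", "application/json, text/javascript, */*")
                .GET();
        if (sessionCookie != null && !sessionCookie.isEmpty()) {
            builder.header("Cookie", sessionCookie);
        }
        return builder.build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: java ReplayAjaxRequest <data-url> [cookie]");
            return;
        }
        String cookie = args.length > 1 ? args[1] : null;
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<String> response =
                client.send(buildRequest(args[0], cookie), HttpResponse.BodyHandlers.ofString());
        // Print the status and raw body (often JSON) for further processing.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

Keeping the request construction separate from the network call makes it easy to compare the headers you send against the ones in the log before hitting the server.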

1 Comment

In the end, I gave up on accessing the page directly. Instead, I hijacked the JSON response from their server and worked on that. This India SE website really sucks! Always down!
