Final redirected URL using Selenium Htmlunit Java

Question

I have a question that I think could be solved with Selenium. I have a set of URLs like the one below.

http://www.sears.com/search=little tikes&Little Tikes?filter=Brand&keywordSearch=false&vName=Toys+%26+Games&catalogId=12605&catPrediction=false&previousSort=ORIGINAL_SORT_ORDER&viewItems=50&storeId=10153&adCell=W3

If you paste the URL as-is into a browser (Firefox, for example), it ends up redirecting to another URL, which you can verify in the address bar. I need to get the redirected URL regardless of whether the redirect came from JavaScript code or not. Is it possible to do this with the Selenium framework?

I have already tried using HtmlUnit for this, but I get the following JavaScript execution error. Please help!

com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot call method "indexOf" of null (script in http://www.sears.com/search=little%20tikes&Little%20Tikes?filter=Brand&keywordSearch=false&catalogId=12605&adCell=W3&catPrediction=false&previousSort=ORIGINAL_SORT_ORDER&viewItems=50&storeId=10153&levels=Toys+%26+Games from (6942, 33) to (6974, 14)#6966)
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:669) ~[htmlunit-2.12.jar:2.12]
    at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:601) ~[htmlunit-core-js-2.12.jar:?]
    at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:507) ~[htmlunit-core-js-2.12.jar:?]
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:555) ~[htmlunit-2.12.jar:2.12]
    at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.execute(JavaScriptEngine.java:530) ~[htmlunit-2.12.jar:2.12]
    at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScriptIfPossible(HtmlPage.java:979) ~[htmlunit-2.12.jar:2.12]
    at com.gargoylesoftware.htmlunit.html.HtmlScript.executeInlineScriptIfNeeded(HtmlScript.java:337) ~[htmlunit-2.12.jar:2.12]
    at com.gargoylesoftware.htmlunit.html.HtmlScript.executeScriptIfNeeded(HtmlScript.java:415) ~[htmlunit-2.12.jar:2.12]
    at com.gargoylesoftware.htmlunit.html.HtmlScript$3.execute(HtmlScript.java:266) ~[htmlunit-2.12.jar:2.12]
    at com.gargoylesoftware.htmlunit.html.HtmlScript.onAllChildrenAddedToPage(HtmlScript.java:276) ~[htmlunit-2.12.jar:2.12]
    at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:676) ~[htmlunit-2.12.jar:2.12]
    at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
    at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.endElement(HTMLParser.java:635) ~[htmlunit-2.12.jar:2.12]
    at org.cyberneko.html.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1170) ~[nekohtml-1.9.18.jar:1.9.18]
    at org.cyberneko.html.HTMLTagBalancer.endElement(HTMLTagBalancer.java:1072) ~[nekohtml-1.9.18.jar:1.9.18]
    at org.cyberneko.html.filters.DefaultFilter.endElement(DefaultFilter.java:206) ~[nekohtml-1.9.18.jar:?]
    at org.cyberneko.html.filters.NamespaceBinder.endElement(NamespaceBinder.java:330) ~[nekohtml-1.9.18.jar:?]
    at org.cyberneko.html.HTMLScanner$ContentScanner.scanEndElement(HTMLScanner.java:3074) ~[nekohtml-1.9.18.jar:1.9.18]
    at org.cyberneko.html.HTMLScanner$ContentScanner.scan(HTMLScanner.java:2041) ~[nekohtml-1.9.18.jar:1.9.18]
    at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:918) ~[nekohtml-1.9.18.jar:1.9.18]
    at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:499) ~[nekohtml-1.9.18.jar:1.9.18]
    at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:452) ~[nekohtml-1.9.18.jar:1.9.18]
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source) ~[xercesImpl-2.10.0.jar:?]
    at com.gargoylesoftware.htmlunit.html.HTMLParser$HtmlUnitDOMBuilder.parse(HTMLParser.java:892) ~[htmlunit-2.12.jar:2.12]
    at com.gargoylesoftware.htmlunit.html.HTMLParser.parse(HTMLParser.java:241) ~[htmlunit-2.12.jar:2.12]
    at com.gargoylesoftware.htmlunit.html.HTMLParser.parseHtml(HTMLParser.java:187) ~[htmlunit-2.12.jar:2.12]
    at com.gargoylesoftware.htmlunit.DefaultPageCreator.createHtmlPage(DefaultPageCreator.java:268) ~[htmlunit-2.12.jar:2.12]
    at com.gargoylesoftware.htmlunit.DefaultPageCreator.createPage(DefaultPageCreator.java:156) ~[htmlunit-2.12.jar:2.12]
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:434) ~[htmlunit-2.12.jar:2.12]
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:309) ~[htmlunit-2.12.jar:2.12]
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:374) ~[htmlunit-2.12.jar:2.12]
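
The sample URL contains literal spaces, while the trace above shows it being requested with the spaces written as %20. A minimal sketch of doing that normalization explicitly before handing the URL to any client follows. The class and method names are hypothetical, and it assumes the rest of the URL (for example `%26` and `+`) is already encoded the way the site expects, so only spaces are touched.

```java
// Hypothetical helper, not part of Selenium or HtmlUnit.
final class UrlSpaces {

    // Replaces literal spaces with %20 and leaves every other character alone,
    // so sequences that are already percent-encoded are not encoded twice.
    static String encodeSpaces(String rawUrl) {
        return rawUrl.trim().replace(" ", "%20");
    }

    public static void main(String[] args) {
        String raw = "http://www.sears.com/search=little tikes&Little Tikes?filter=Brand";
        System.out.println(encodeSpaces(raw));
    }
}
```

This does not address the script error itself; it only removes one source of difference between what is pasted into a browser and what the program requests.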

This question could be a duplicate of: stackoverflow.com/questions/20315330/…
– Mosty Mostacho, Commented Dec 31, 2013 at 16:15
It does not seem to be a duplicate: the use case is different (click vs. get), and the exception stack trace is completely different as well.
– user1965449, Commented Dec 31, 2013 at 16:44

A Paul · Accepted Answer · 2014-01-01 03:23:17Z

1

This should be easy, if I have understood your question. The steps are:

1. Get a FirefoxDriver object.
2. Call driver.get("http://www.sears.com/search=little tikes&Little Tikes?filter=Brand&keywordSearch=false&vName=Toys+%26+Games&catalogId=12605&catPrediction=false&previousSort=ORIGINAL_SORT_ORDER&viewItems=50&storeId=10153&adCell=W3");
   This will open the URL in Firefox. Once it is open, the browser is forwarded to the actual URL (this is my understanding from your description).
3. Then call driver.getCurrentUrl(). This will give you the redirected URL.

Let me know if this works for you :)

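UPDATE:

        import com.gargoylesoftware.htmlunit.BrowserVersion;
        import com.gargoylesoftware.htmlunit.WebClient;
        import com.gargoylesoftware.htmlunit.WebResponse;
        import com.gargoylesoftware.htmlunit.html.HtmlPage;

        WebClient webClient = new WebClient(BrowserVersion.INTERNET_EXPLORER_9);
        webClient.getOptions().setJavaScriptEnabled(true);
        webClient.getOptions().setRedirectEnabled(true);
        // Do not abort the page load when a script on the page fails.
        webClient.getOptions().setThrowExceptionOnScriptError(false);
        webClient.getOptions().setCssEnabled(true);
        HtmlPage page = (HtmlPage) webClient.getPage("http://www.sears.com/search=little tikes&Little Tikes?filter=Brand&keywordSearch=false&vName=Toys+%26+Games&catalogId=12605&catPrediction=false&previousSort=ORIGINAL_SORT_ORDER&viewItems=50&storeId=10153&adCell=W3");
        WebResponse response = page.getWebResponse();
        // Page source, in case it is needed for further parsing.
        String content = response.getContentAsString();
        // URL of the page that was finally loaded, after any redirects.
        System.out.println(page.getUrl());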
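
If a given redirect turns out to be a plain HTTP 3xx response rather than JavaScript, no browser engine is needed for it. The sketch below is a different technique from the Selenium and HtmlUnit code above: it follows `Location` headers by hand with the JDK's `HttpURLConnection`. It will not see redirects performed by scripts, and the class name, method names, and hop limit are hypothetical. It also assumes the URL has no literal spaces left in it.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper: follows HTTP-level redirects only, not JavaScript ones.
final class HttpRedirectFollower {

    // Resolves a Location header value (absolute or relative) against the URL that returned it.
    static String resolveLocation(String currentUrl, String location) {
        return URI.create(currentUrl).resolve(location).toString();
    }

    // True for the status codes that carry a Location header to follow.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303 || status == 307 || status == 308;
    }

    // Requests each URL without automatic redirect handling and returns the last URL reached.
    static String finalUrl(String startUrl, int maxHops) throws IOException {
        String current = startUrl;
        for (int hop = 0; hop < maxHops; hop++) {
            HttpURLConnection conn = (HttpURLConnection) URI.create(current).toURL().openConnection();
            conn.setInstanceFollowRedirects(false);
            try {
                int status = conn.getResponseCode();
                String location = conn.getHeaderField("Location");
                if (!isRedirect(status) || location == null) {
                    return current;
                }
                current = resolveLocation(current, location);
            } finally {
                conn.disconnect();
            }
        }
        return current; // hop limit reached; the chain may continue beyond this URL
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            System.out.println(finalUrl(args[0], 10));
        }
    }
}
```

If the URL printed by this helper equals the starting URL while a real browser still ends up elsewhere, the redirect is most likely done in JavaScript, and one of the browser-based approaches is required.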

edited Jan 1, 2014 at 3:23

answered Dec 31, 2013 at 7:37

A Paul


9 Comments

user1965449 Over a year ago

Thanks, ABP. I will try this once I rule out HtmlUnit. But when you say "This will open the URL in Firefox", will it open an actual browser window, or is the browser mimicked inside the Java program? Thanks.

A Paul Over a year ago

It will open a Firefox browser instance.

user1965449 Over a year ago

How can this be done in a simulated fashion, meaning the Java program simulates a web browser instead of driving an actual browser? Thanks.

A Paul Over a year ago

If you do not want to open Firefox and want the Java program to load the HTML itself, then you have to use HtmlUnitDriver. But HtmlUnitDriver has issues with JavaScript; I have faced JavaScript issues that I was not able to solve, even after trying everything. Also, please check the UPDATE in my post. I added another code snippet that might work for you.

user1965449 Over a year ago

Thanks! I am doing exactly the same thing, but using FIREFOX_17 instead. It works, but it is far too slow, so much so that I cannot even debug in Eclipse. I need to parse many URLs and need to scale, and I am not sure what options I have.
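
On the scaling concern, one common option is to resolve the URLs in parallel on a fixed thread pool. The sketch below is only an outline under assumptions: `ParallelUrlResolver` and `resolveAll` are hypothetical names, and the `resolver` function stands in for whatever per-URL lookup is used (HtmlUnit, Selenium, or plain HTTP). It assumes the resolver is safe to call from several threads at once, for example by creating its own client per call.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper: runs a per-URL resolver on a fixed-size thread pool.
final class ParallelUrlResolver {

    // Returns a map from each input URL to the URL the resolver reported for it,
    // in input order. Duplicate input URLs collapse into a single entry.
    static Map<String, String> resolveAll(List<String> urls,
                                          Function<String, String> resolver,
                                          int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<String>> pending = new LinkedHashMap<>();
            for (String url : urls) {
                Callable<String> task = () -> resolver.apply(url);
                pending.put(url, pool.submit(task));
            }
            Map<String, String> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<String>> entry : pending.entrySet()) {
                results.put(entry.getKey(), entry.getValue().get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while resolving URLs", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Resolver failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stub resolver for illustration; a real one would load the page.
        Function<String, String> stub = url -> url + "#resolved";
        System.out.println(resolveAll(List.of("http://example.com/a", "http://example.com/b"), stub, 2));
    }
}
```

The pool size is the main tuning knob: each concurrent page load with JavaScript enabled is heavy on memory and CPU, so a small number of threads is a reasonable starting point.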


Anuragh27crony · Accepted Answer · 2013-12-31 10:27:16Z

1

If you are using HtmlUnitDriver, please enable JavaScript (it is off by default) as shown below.

Moreover, HtmlUnit uses Rhino as its JavaScript engine, which differs from the JS engines of mainstream browsers.

HtmlUnitDriver Browser_Session = new HtmlUnitDriver();
Browser_Session.setJavascriptEnabled(true);

or

HtmlUnitDriver Browser_Session = new HtmlUnitDriver(true);

The steps below should fetch the redirected URL.

Browser_Session.navigate().to("URL");
Browser_Session.getCurrentUrl(); //This fetches the current re-directed URL.

Hope this helps

answered Dec 31, 2013 at 10:27

Anuragh27crony


1 Comment

user1965449 Over a year ago

Thanks! I was using the HtmlUnit WebClient class with JavaScript enabled, and apparently the Rhino JS engine was throwing this exception. Are you saying that if I use HtmlUnitDriver instead, it will not cause this JavaScript execution exception because it does not use the Rhino engine? Thanks.
