0

I want to access forms on HTML pages from Java without involving a real browser.

At present I am doing it with HtmlUnit, but it takes a bit longer to load each page. When accessing millions of pages, that extra time per page matters a lot.

Are there any other methods for doing this?

1
  • What exactly are you asking? I thought I understood it, but based on the other answers, perhaps I didn't. Commented Jan 5, 2010 at 19:04

3 Answers

2

I've used something similar called HttpUnit before, but I have no idea how it compares performance-wise.

If you have millions of pages to process, I would recommend throwing more threads at it. This is just a guess, but if you scale up to multiple threads, you will probably run out of bandwidth before you run out of CPU power, in which case it won't matter how much faster the library could be.
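A minimal sketch of the multi-threaded idea, assuming a fixed-size thread pool is acceptable. The class name `ParallelFetcher`, the method `fetchAll`, and the `fetcher` parameter are hypothetical names for illustration; `fetcher` stands in for whatever actually loads one page (HtmlUnit, HttpUnit, or a plain `URLConnection`).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper: spreads page fetches over a fixed-size thread pool.
class ParallelFetcher {

    // 'fetcher' is an assumption: any function that loads one page and returns its content.
    static Map<String, String> fetchAll(List<String> urls, int threads,
                                        Function<String, String> fetcher) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per URL; the pool runs at most 'threads' of them at a time.
            List<Future<String>> futures = new ArrayList<>();
            for (String url : urls) {
                futures.add(pool.submit(() -> fetcher.apply(url)));
            }
            // Collect the results in the original URL order.
            Map<String, String> results = new LinkedHashMap<>();
            for (int i = 0; i < urls.size(); i++) {
                results.put(urls.get(i), futures.get(i).get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

For a truly large crawl you would likely feed URLs in batches instead of holding every result in memory, and tune the pool size until bandwidth, not CPU, is the limit.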


0

Accessing a web page using a browser, even HtmlUnit, is going to be slow. A better method is to test the layer just below the web interface, so that you don't need to access millions of pages -- instead you test enough to make sure that the web interface is using the lower layer correctly.


0

Most of the interaction in a browser comes down to an HTTP GET or an HTTP POST. You need to figure out exactly which operation you need, and then you can construct the URL and/or form data. Then you can use something like this:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.OutputStreamWriter;
    import java.net.URL;
    import java.net.URLConnection;
    import java.net.URLEncoder;

    try {
        // Construct the URL-encoded form data
        String data = URLEncoder.encode("key1", "UTF-8") + "=" + URLEncoder.encode("value1", "UTF-8");
        data += "&" + URLEncoder.encode("key2", "UTF-8") + "=" + URLEncoder.encode("value2", "UTF-8");

        // Send the data (setDoOutput(true) turns the request into a POST)
        URL url = new URL("http://hostname:80/cgi");
        URLConnection conn = url.openConnection();
        conn.setDoOutput(true);
        OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream());
        wr.write(data);
        wr.flush();

        // Get the response
        BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        String line;
        while ((line = rd.readLine()) != null) {
            // Process line...
        }
        wr.close();
        rd.close();
    } catch (Exception e) {
        // Report the failure instead of silently swallowing it
        e.printStackTrace();
    }
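When a form has more than a couple of fields, concatenating the pairs by hand gets error-prone. Here is a small sketch that builds the same kind of URL-encoded body from a map of fields. The class name `FormEncoder` and its `encode` method are hypothetical, not part of any library.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

// Hypothetical helper: turns form fields into an
// application/x-www-form-urlencoded request body.
class FormEncoder {

    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        try {
            for (Map.Entry<String, String> field : fields.entrySet()) {
                if (body.length() > 0) {
                    body.append('&'); // separator between key=value pairs
                }
                body.append(URLEncoder.encode(field.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(field.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
        return body.toString();
    }
}
```

The returned string can be passed to `wr.write(...)` in place of the hand-built `data` variable. Using a `LinkedHashMap` for the fields keeps them in insertion order, in case the server is sensitive to that.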
