The code below fetches the source of a given URL without any errors. What I am looking for is a way to format the source I receive.

Until now I have done this by hand: I go to http://www.freeformatter.com/html-formatter.html, paste in the source, and format it with the "3 spaces per indent" option. How can I get my Java code to do the same formatting for me?

I want the source formatted because I have another script that reads it line by line, saves the data it needs, and ignores the rest.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

// Downloads the page at the given URL and returns its source as one string.
private static String getUrlSource(String url) throws IOException {
    URLConnection connection = new URL(url).openConnection();
    StringBuilder source = new StringBuilder();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader in = new BufferedReader(
            new InputStreamReader(connection.getInputStream(), "UTF-8"))) {
        String inputLine;
        while ((inputLine = in.readLine()) != null) {
            source.append(inputLine).append("\n");
        }
    }
    return source.toString();
}

public static void main(String[] args) {
    System.out.println("Hello");

    String url = "http://www.bctransit.com/regions/cfv/schedules/schedule.cfm?p=day.text&route=1%3A0&day=1&";

    try {
        String value = getUrlSource(url);
        System.out.println(value);
    } catch (IOException e) {
        e.printStackTrace();
    }
}

1 Answer

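If you are scraping a web page, I suggest using a real HTML parser instead. Reading the source line by line depends on the exact layout of the markup, so it is bound to fail sooner or later, for example when the site changes its whitespace or line breaks.

I would recommend having a look at jsoup. While I have never used it, I have had great results with its Python counterpart, Beautiful Soup.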

Using a library such as jsoup will get you a nice object model to traverse instead of relying on string manipulation.
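As a rough illustration of that object model, here is a minimal sketch. It assumes jsoup is on the classpath; the class name `CellExtractor`, the method `cellTexts`, and the sample table are made up for this example and are not taken from the page in the question.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class CellExtractor {

    // Hypothetical helper: returns the text of every <td> cell in the HTML.
    static List<String> cellTexts(String html) {
        // Parse the raw HTML string into a document tree
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        // select() takes a CSS selector and returns the matching elements
        for (Element cell : doc.select("td")) {
            texts.add(cell.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        // Made-up sample markup standing in for the downloaded page source
        String sample = "<table><tr><td>7:15</td><td>7:45</td></tr></table>";
        System.out.println(cellTexts(sample));
    }
}
```

The string passed to `cellTexts` could just as well be the value returned by `getUrlSource` in the question. Which selector actually fits the schedule page would have to be worked out by inspecting its markup.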

As a bonus, jsoup will actually format the HTML string for you, should you want that anyway.
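If the formatted string is still what you want, something along these lines should come close to the 3-space option on the formatter site. This is a sketch under the assumption that jsoup is on the classpath; `HtmlFormatter` and `format` are names invented for the example, and jsoup's output will not necessarily match that site's output exactly.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class HtmlFormatter {

    // Hypothetical helper: re-serializes HTML with the given indent width.
    static String format(String html, int indent) {
        Document doc = Jsoup.parse(html);
        // Pretty printing is jsoup's default; it is set explicitly here for clarity
        doc.outputSettings().prettyPrint(true).indentAmount(indent);
        return doc.outerHtml();
    }

    public static void main(String[] args) {
        // Made-up sample markup standing in for the downloaded page source
        String raw = "<html><body><table><tr><td>7:15</td><td>7:45</td></tr></table></body></html>";
        System.out.println(format(raw, 3));
    }
}
```

Note that jsoup normalizes the document while parsing, so the result may contain elements that were only implied in the input, such as `<head>` or `<tbody>`.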
