I am using Jsoup with Java to parse an HTML file. How can I extract just the line that says "Hourly Rate: 23,016 orders"? I am parsing a lot of files, so the number next to "Hourly Rate" will change.

<html>
<head>
<title>Testing</title>
</head>
<body>
<p class=MsoNormal align=center style='background:#DEDEDF'>
<span style='font-size:18.0pt'><b>Testing</b></span></p>
Hourly Rate: 23,016 orders<br>
<table border=0 cellpadding=0>
<tr valign=top>
<td>

Thanks

2 Answers

I just added this code:

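// ownText() returns only the text that belongs directly to <body>,
// not the text inside its child elements.
String hourlyRate = doc.body().ownText();
// String text = doc.body().text();  // would also include text from child elements

System.out.println(hourlyRate);

This printed out: Hourly Rate: 23,016 orders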

Grab the element with the MsoNormal class, then use a regular expression to look for the number, e.g.:

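import java.util.regex.Matcher;
import java.util.regex.Pattern;

Document doc = Jsoup.parse(htmlString);
Element msoNormal = doc.getElementsByClass("MsoNormal").first();
if (msoNormal != null) {
  // Note: in the sample HTML the rate line sits outside the MsoNormal <p>, directly
  // under <body>, so there the text to match is doc.body().ownText() instead.
  // Match a number with a thousands separator, e.g. "23,016"
  Pattern p = Pattern.compile("[0-9]+,[0-9]+");
  Matcher m = p.matcher(msoNormal.text());
  if (m.find()) {
    System.out.println(m.group());
  }
}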
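Since the number changes from file to file, it may help to pull it out as a numeric value rather than as text. The following is only a sketch using the standard library: the class name HourlyRateExtractor and the method extractHourlyRate are made up for illustration, and the input string stands in for whatever text Jsoup returned (for example from doc.body().ownText()). It assumes the label is always spelled "Hourly Rate:" and that the number uses commas as thousands separators.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jsoup: extracts the number after "Hourly Rate:".
class HourlyRateExtractor {
    // Capture group 1 holds the digits, possibly with commas (e.g. "23,016").
    private static final Pattern RATE =
            Pattern.compile("Hourly Rate:\\s*([0-9][0-9,]*)");

    // Returns the rate as a long, or an empty OptionalLong if no rate is found.
    static OptionalLong extractHourlyRate(String text) {
        Matcher m = RATE.matcher(text);
        if (!m.find()) {
            return OptionalLong.empty();
        }
        // Drop the thousands separators before parsing: "23,016" becomes "23016".
        String digits = m.group(1).replace(",", "");
        return OptionalLong.of(Long.parseLong(digits));
    }

    public static void main(String[] args) {
        // Stand-in for text obtained from the parsed document.
        String bodyText = "Hourly Rate: 23,016 orders";
        OptionalLong rate = extractHourlyRate(bodyText);
        if (rate.isPresent()) {
            System.out.println(rate.getAsLong());
        } else {
            System.out.println("No hourly rate found");
        }
    }
}
```

Anchoring the pattern on the "Hourly Rate:" label, rather than on any comma-separated number, should keep it from picking up other figures that might appear elsewhere in the page.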

2 Comments

Thanks for the response, selig. I am getting errors on the "Pattern" and "Matcher" classes. My IDE cannot find them.
You need to import them from java.util.regex.
