Let me get straight to my problem.
public static final String EXAMPLE_TEST = "<span id=\"lblObject\"><a href=\"http://www.guideline.gov/content.aspx?id=15135\" alt=\"View object\">Manual medicine guidelines for musculoskeletal injuries.</a></span>";
//public static final String EXAMPLE_TEST ="<a href=\"http://www.guideline.gov/content.aspx?id=1112\"></a>";
public static void main(String[] args) {
    Pattern pattern = Pattern.compile("<a href=\"http://www.guideline.gov/content.aspx?id=(\\d+)\"");
    // in case you would like to ignore case sensitivity,
    // you could use this statement:
    // Pattern pattern = Pattern.compile("\\s+", Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(EXAMPLE_TEST);
    // check all occurrences
    while (matcher.find()) {
        System.out.print("Start index: " + matcher.start());
        System.out.print(" End index: " + matcher.end() + " ");
        System.out.println(matcher.group());
    }
}
There is a problem with the regex. The example string above is just a dummy. In practice I will have an HTML file containing many links of the form http://www.guideline.gov/content.aspx?id=some_number, and I need to extract those links from that file. Can anyone help me find what's wrong with my regex?
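
One likely cause, for reference: in a Java regex, `?` and `.` are metacharacters. The unescaped `?` in `content.aspx?id=` makes the preceding `x` optional instead of matching a literal question mark, so the pattern never lines up with the real URL. Below is a minimal sketch with both characters escaped. The class name `LinkExtractor` and the method `extractLinks` are hypothetical names chosen for illustration, and the sketch assumes every link uses double-quoted `href` attributes exactly as in the dummy string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original code.
class LinkExtractor {
    // '.' and '?' are escaped so they match literally.
    // Group 1 captures the full URL, group 2 the numeric id.
    private static final Pattern LINK = Pattern.compile(
            "<a href=\"(http://www\\.guideline\\.gov/content\\.aspx\\?id=(\\d+))\"");

    // Returns every matching URL found in the given HTML text, in order.
    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher matcher = LINK.matcher(html);
        while (matcher.find()) {
            links.add(matcher.group(1));
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<span id=\"lblObject\"><a href=\"http://www.guideline.gov/content.aspx?id=15135\" "
                + "alt=\"View object\">Manual medicine guidelines for musculoskeletal injuries.</a></span>";
        for (String link : extractLinks(html)) {
            System.out.println(link);
        }
    }
}
```

If only the id is needed, `matcher.group(2)` could be collected instead. The literal part of the URL could also be wrapped with `Pattern.quote(...)` rather than escaping each character by hand.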