I am trying to extract both the tag and the text between the tags from a text file. I want to do this with a regex, since the file contains only a few XML tags.
Below is what I have tried so far:
String txt = "<DATE>December</DATE>";
String re1 = "(<[^>]+>)"; // Tag 1
String re2 = "(.*?)";     // Variable Name 1
String re3 = "(<[^>]+>)"; // Tag 2
Pattern p = Pattern.compile(re1 + re2 + re3, Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
Matcher m = p.matcher(txt);
if (m.find())
{
    String tag1 = m.group(1);
    String var1 = m.group(2);
    String tag2 = m.group(3);
    //System.out.print("("+tag1.toString()+")"+"("+var1.toString()+")"+"("+tag2.toString()+")"+"\n");
    System.out.println(tag1.replaceAll("<>", ""));
    System.out.println(var1);
}
As output, I get:
<DATE>
December
How do I get rid of the < and > around the tag, so that only DATE is printed?
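For reference, the pattern passed to replaceAll is itself a regex, so "<>" only matches the two characters < and > when they sit next to each other, which never happens in "<DATE>". One possible approach, sketched below, is to move the capture group inside the angle brackets so the brackets are never captured in the first place. The class name TagExtractor and the method extract are hypothetical names used only for this illustration, and the sketch assumes simple tags without attributes or nesting.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; assumes simple, non-nested tags without attributes.
final class TagExtractor {
    // Group 1: the tag name, captured without the surrounding angle brackets.
    // Group 2: the text between the tags (reluctant, so it stops at the first matching close tag).
    // \1 is a backreference that requires the closing tag name to match the opening one.
    private static final Pattern TAG =
            Pattern.compile("<([^<>/\\s]+)>(.*?)</\\1>",
                    Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns {tagName, text} for the first match, or null if nothing matches. */
    static String[] extract(String input) {
        Matcher m = TAG.matcher(input);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String[] parts = extract("<DATE>December</DATE>");
        if (parts != null) {
            System.out.println(parts[0]); // DATE
            System.out.println(parts[1]); // December
        }
    }
}
```

If the original three-group pattern has to stay as it is, replacing the call with tag1.replaceAll("[<>]", "") should also work, since the character class matches each bracket on its own.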