0

Hi,

I want to extract the text between the opening and closing tags of this div:

<div class="innercontenttxt"> 
<p><img border="1" align="left" height="170" width="324" vspace="3" hspace="2" src="/tmdbuserfiles/ramdev-balakrishna(1).jpg" alt="ramdev aide remanded, lakrishna acharya judicial remand, ramdev aide fake passport case, baba ramdev assistant judicial custody, balakrishna sent to judicial custody, yoga guru ramdev assistant remanded, yoga guru ramdev assistant balakrishna" />
Yoga guru Ramdev's aide Balakrishna Acharya remanded to 14 days judicial custody in a fake passport on Saturday. He was arrested yesterday after he failed to appear at a Dehradun court.
    <br />
    <br />
     Balakrishna Acharya, who is basically a Nepalese citizen, 
     is alleged to have submitted fake documents to procure a passport. 
     When he failed to appear in Dehradun court in connection with the case,
</p>  
</div>

After extraction, the result should be:

ramdev aide alakrishna Acharya remanded to 14 days judicial custody in a fake passport on Saturday. He was arrested yesterday after he failed to appear at a Dehradun court.Balakrishna Acharya, who is basically a Nepalese citizen, is alleged to have submitted fake documents to procure a passport. When he failed to appear in Dehradun court in connection with the case, the court had issued a non-bailable warrant and subsequently arrested him yesterday.

3
  • I have tried several HTML parsers, such as Jericho HTML Parser, HTML Parser, and jsoup, but none of them is supported in J2ME. Commented Jul 26, 2012 at 5:44
  • Do you need a general solution for parsing div tags, or one specific to your case? Commented Jul 26, 2012 at 6:09
  • Is there any parser for my case that works without the java.net.URL class? Can you help me out? Commented Jul 26, 2012 at 7:28

2 Answers

1

This problem seems similar to this other question.

Assuming you already have the HTML source stored in a String variable called htmlPage, you can locate the div with indexOf and cut out its content with substring:

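// Find the first opening <div tag, then the '>' that ends that tag.
// Note: indexOf returns -1 when nothing is found, so check for that in real code.
int divIndex = htmlPage.indexOf("<div");
divIndex = htmlPage.indexOf(">", divIndex);

// The content ends at the first closing </div> after the opening tag.
int endDivIndex = htmlPage.indexOf("</div>", divIndex);
String content = htmlPage.substring(divIndex + 1, endDivIndex);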
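
The snippet above returns the markup inside the div, so the <p>, <img> and <br /> tags are still part of the result. A minimal sketch of how the tags could be dropped as well is shown below. The class and method names (DivTextExtractor, innerHtml, stripTags, extractText) are made up for this illustration, and it assumes that class is the first attribute of the div, that the div contains no nested div, and that no attribute value contains a '>' character. It sticks to basic String and StringBuffer methods with J2ME in mind, but it has not been tested on a device.

```java
// Hypothetical helper, not part of any library: pulls the plain text out of
// the first <div class="..."> using only String and StringBuffer.
final class DivTextExtractor {

    // Returns the raw markup between the opening tag of the first div with
    // the given class and the next "</div>", or null if either is missing.
    static String innerHtml(String html, String className) {
        int open = html.indexOf("<div class=\"" + className + "\"");
        if (open < 0) {
            return null;
        }
        int contentStart = html.indexOf(">", open);
        if (contentStart < 0) {
            return null;
        }
        int contentEnd = html.indexOf("</div>", contentStart);
        if (contentEnd < 0) {
            return null;
        }
        return html.substring(contentStart + 1, contentEnd);
    }

    // Drops everything between '<' and '>' and collapses runs of whitespace
    // (and the removed tags) into a single space. Entities such as &amp;
    // are left as they are.
    static String stripTags(String markup) {
        StringBuffer out = new StringBuffer();
        boolean inTag = false;
        boolean pendingSpace = false;
        for (int i = 0; i < markup.length(); i++) {
            char c = markup.charAt(i);
            if (inTag) {
                if (c == '>') {
                    inTag = false;
                }
                continue;
            }
            if (c == '<') {
                inTag = true;
                pendingSpace = true;
                continue;
            }
            if (c == ' ' || c == '\n' || c == '\r' || c == '\t') {
                pendingSpace = true;
                continue;
            }
            // Emit one separating space, but never at the very start.
            if (pendingSpace && out.length() > 0) {
                out.append(' ');
            }
            pendingSpace = false;
            out.append(c);
        }
        return out.toString();
    }

    // Convenience wrapper: plain text of the div, or null if it is not found.
    static String extractText(String html, String className) {
        String inner = innerHtml(html, className);
        return inner == null ? null : stripTags(inner);
    }
}
```

With this in place, DivTextExtractor.extractText(htmlPage, "innercontenttxt") would give the text of the div with the tags removed and the line breaks folded into single spaces. Treating every removed tag as a space is a simplification: it keeps text on either side of <br /> apart, but it also puts a space around inline tags such as <b>.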

1

You may want to try one of the Java HTML parser libraries:

HTML Parser - http://htmlparser.sourceforge.net

jsoup - http://jsoup.org/

1 Comment

Hi, I am developing this for a J2ME application, where the java.net.URL class is not available. Can anyone tell me which parsers would work for my needs without that class?
