I have a text file that looks like XML but is not valid XML. How can I parse it? I am using Java, and I need the content of the last tag.

Example file:

<h4 class="is24qa-objektbeschreibung-label padding-top-xl margin-bottom-              s">Objektbeschreibung</h4> 
<div class="is24-text margin-bottom"> 
<pre class="is24qa-objektbeschreibung">TEST TEST TEST </pre>

1 Answer

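You can use jsoup, an HTML parser that tolerates malformed markup. The program below reads the file, keeps its last non-blank line, and prints the text of the tag on that line. It assumes the last tag sits entirely on one line.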

package com.company;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Main {
    public static void main(String[] args) {
        String line;
        String lastLine = "";

        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader("input.txt"))) {
            // Keep only the last non-blank line, which holds the last tag
            while ((line = br.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    lastLine = line;
                }
            }

            // jsoup wraps the fragment in <html><body>, so the tag becomes a child of <body>
            Document doc = Jsoup.parse(lastLine);
            Elements elements = doc.body().children();
            for (Element el : elements) {
                System.out.println("content: " + el.text());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

input.txt:

<h4 class="is24qa-objektbeschreibung-label padding-top-xl margin-bottom-              s">Objektbeschreibung</h4>
<div class="is24-text margin-bottom">
<pre class="is24qa-objektbeschreibung">TEST TEST TEST </pre>

Output:

/usr/lib/jvm/java-1.8.0-openjdk-amd64/bin/java -
content: TEST TEST TEST

Process finished with exit code 0
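
For background on why a lenient HTML parser is needed here, the following sketch feeds a shortened copy of the example file to the JDK's built-in XML parser. The class name `WellFormedCheck` and the helper `isWellFormedXml` are made up for illustration, and the class attributes in the sample are abbreviated. The file fails the check because it has several top-level elements and an unclosed `<div>`, so a strict XML parser stops with an error where jsoup carries on.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original answer
public class WellFormedCheck {

    // Returns true only if the text parses as a well-formed XML document
    public static boolean isWellFormedXml(String text) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(text)));
            return true;
        } catch (SAXException e) {
            // Thrown for any well-formedness violation
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Abbreviated version of the example file (class lists shortened)
        String sample =
                "<h4 class=\"is24qa-objektbeschreibung-label\">Objektbeschreibung</h4>\n"
              + "<div class=\"is24-text margin-bottom\">\n"
              + "<pre class=\"is24qa-objektbeschreibung\">TEST TEST TEST </pre>";

        System.out.println(isWellFormedXml(sample)); // prints false
    }
}
```

If a check like this returns `true` for a given file, a regular XML parser would work as well; for input like the example, an HTML parser such as jsoup is the more forgiving choice.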