Java Read RSS Feed

Question

I am following this YouTube tutorial, but while the presenter gets all the headlines from the CNN RSS feed, I only get one headline. Why is that?

My code (the same as the one in the tutorial, as far as I can see):

import java.net.MalformedURLException;
import java.net.URL;
import java.io.*;


public class ReadRSS {

    public static void main(String[] args) {

        System.out.println(readRSSFeed("http://rss.cnn.com/rss/edition.rss"));
    }

    public static String readRSSFeed(String urlAddress){
        try{
            URL rssUrl = new URL (urlAddress);
            BufferedReader in = new BufferedReader(new InputStreamReader(rssUrl.openStream()));
            String sourceCode = "";
            String line;
            while((line=in.readLine())!=null){
                if(line.contains("<title>")){
                    System.out.println(line);
                    int firstPos = line.indexOf("<title>");
                    String temp = line.substring(firstPos);
                    temp=temp.replace("<title>","");
                    int lastPos = temp.indexOf("</title>");
                    temp = temp.substring(0,lastPos);
                    sourceCode +=temp+ "\n" ;
                }
            }
            in.close();
            return sourceCode;
        } catch (MalformedURLException ue){
            System.out.println("Malformed URL");
        } catch (IOException ioe){
            System.out.println("Something went wrong reading the contents");
        }
        return null;
    }
}

RSS is an XML format, so you should use an XML parser, especially since RSS is well-formed XML (as opposed to, for instance, bookmark handling in Word's XML). Trying to parse RSS without an XML parser will cause these kinds of problems; an XML parser will be able to handle this. Also, since most RSS feeds contain at most around 100 articles, creating and querying a document object model (DOM) will be just fine. You don't need to do the heavy tag-by-tag parsing. Use, for instance, Rome.
– Erk, Commented Aug 18, 2023 at 9:20
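
As a rough illustration of the parser-based approach the comment suggests, the sketch below uses the DOM parser that ships with the JDK rather than Rome. The class name `RssTitleReader`, the method `readTitles`, and the sample feed are assumptions made up for this example; a real feed may need extra handling (namespaces, encodings, network errors).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class RssTitleReader {

    // Hypothetical helper: parses the whole feed into a DOM and returns
    // the text of every <title> element, in document order.
    public static List<String> readTitles(InputStream rss) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(rss);
            NodeList nodes = doc.getElementsByTagName("title");
            List<String> titles = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                titles.add(nodes.item(i).getTextContent().trim());
            }
            return titles;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the feed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample feed with all elements on a single line.
        String sample = "<rss version=\"2.0\"><channel><title>Feed</title>"
                + "<item><title>First</title></item>"
                + "<item><title>Second</title></item></channel></rss>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        readTitles(in).forEach(System.out::println);
    }
}
```

To read a live feed instead of the sample string, the stream from `new URL(urlAddress).openStream()` in the question could be passed to `readTitles`. Because the parser works on elements rather than lines, it does not matter how many titles share a line.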

janih · Accepted Answer · 2015-08-22 10:06:43Z

CNN's feed format has changed since that YouTube video was made. The code assumes that there is one title tag per line, when there are actually several. Something like this should work now:

while ((line = in.readLine()) != null) {
    int titleEndIndex = 0;
    int titleStartIndex = 0;
    // A single line may hold several <title> elements, so keep scanning it.
    while (titleStartIndex >= 0) {
        titleStartIndex = line.indexOf("<title>", titleEndIndex);
        if (titleStartIndex >= 0) {
            titleEndIndex = line.indexOf("</title>", titleStartIndex);
            if (titleEndIndex < 0) {
                break; // closing tag is not on this line; skip the partial title
            }
            sourceCode += line.substring(titleStartIndex + "<title>".length(), titleEndIndex) + "\n";
        }
    }
}
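
To try the multiple-titles-per-line idea without hitting the network, the same scan can be pulled into a small standalone class. This is only a sketch: `TitleScanner`, `titlesInLine`, and the sample line are names and data invented for the example, and a title whose closing tag sits on a later line is simply skipped.

```java
import java.util.ArrayList;
import java.util.List;

public class TitleScanner {

    // Hypothetical helper: collects the text of every <title>...</title>
    // pair that starts and ends within the given line.
    public static List<String> titlesInLine(String line) {
        final String open = "<title>";
        final String close = "</title>";
        List<String> titles = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = line.indexOf(open, from);
            if (start < 0) {
                break; // no more opening tags on this line
            }
            int end = line.indexOf(close, start);
            if (end < 0) {
                break; // closing tag is on a later line; not handled here
            }
            titles.add(line.substring(start + open.length(), end));
            from = end + close.length();
        }
        return titles;
    }

    public static void main(String[] args) {
        // Made-up line containing two items, as in a feed with no line breaks.
        String line = "<item><title>First</title></item><item><title>Second</title></item>";
        System.out.println(titlesInLine(line)); // prints [First, Second]
    }
}
```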
