I am following a YouTube tutorial, but while the presenter gets ALL the headlines from the CNN RSS feed, I only get one headline. Why is that?

My code (the same as the one in the tutorial, as far as I can see):

import java.net.MalformedURLException;
import java.net.URL;
import java.io.*;


public class ReadRSS {

    public static void main(String[] args) {

        System.out.println(readRSSFeed("http://rss.cnn.com/rss/edition.rss"));
    }

    public static String readRSSFeed(String urlAddress){
        try{
            URL rssUrl = new URL (urlAddress);
            BufferedReader in = new BufferedReader(new InputStreamReader(rssUrl.openStream()));
            String sourceCode = "";
            String line;
            while((line=in.readLine())!=null){
                if(line.contains("<title>")){
                    System.out.println(line);
                    int firstPos = line.indexOf("<title>");
                    String temp = line.substring(firstPos);
                    temp=temp.replace("<title>","");
                    int lastPos = temp.indexOf("</title>");
                    temp = temp.substring(0,lastPos);
                    sourceCode +=temp+ "\n" ;
                }
            }
            in.close();
            return sourceCode;
        } catch (MalformedURLException ue){
            System.out.println("Malformed URL");
        } catch (IOException ioe){
            System.out.println("Something went wrong reading the contents");
        }
        return null;
    }
}
  • RSS is an XML format, so you should use an XML parser, especially since RSS is well-formed XML (as opposed to, for instance, bookmarks handling in Word's XML). Trying to parse RSS without an XML parser will cause these kinds of problems; an XML parser will be able to handle this. Also, since most RSS feeds contain at most 100 articles, creating and querying a document object model (DOM) will be just fine. You don't need to do the heavy tag-by-tag parsing. Use, for instance, Rome. Commented Aug 18, 2023 at 9:20
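
To illustrate the DOM approach the comment describes, here is a minimal sketch that uses the JDK's built-in `javax.xml.parsers` API rather than Rome. The class name `RssDomTitles`, the method `titles`, and the sample XML are made up for this example; they are not from the tutorial or from CNN's feed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: collects <title> text with a DOM parser instead of string searching.
final class RssDomTitles {

    // Returns the text of every <title> element in document order,
    // regardless of how the XML is split across lines.
    static List<String> titles(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagName("title");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent().trim());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample with everything on one line, like the situation in the question.
        String sample = "<rss><channel><title>Feed</title>"
                + "<item><title>First</title></item>"
                + "<item><title>Second</title></item>"
                + "</channel></rss>";
        titles(sample).forEach(System.out::println);
    }
}
```

For a live feed, the same `DocumentBuilder` can parse an `InputStream` such as the one returned by `rssUrl.openStream()` in the question's code. Note that this collects the channel title as well as the item titles.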

1 Answer

CNN's feed format has changed since that YouTube video was made. The code assumes there is at most one title tag per line, when a single line can now contain several, so only the first title on each line is picked up. Something like this should work now:

while ((line = in.readLine()) != null) {
    int titleEndIndex = 0;
    int titleStartIndex = 0;
    // A single line may hold several <title> elements, so keep scanning it.
    while (titleStartIndex >= 0) {
        titleStartIndex = line.indexOf("<title>", titleEndIndex);
        if (titleStartIndex >= 0) {
            titleEndIndex = line.indexOf("</title>", titleStartIndex);
            if (titleEndIndex < 0) {
                break; // closing tag is not on this line; skip the partial title
            }
            sourceCode += line.substring(titleStartIndex + "<title>".length(), titleEndIndex) + "\n";
        }
    }
}
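
To try the scanning logic without hitting the network, the loop can be pulled into a small helper that works on one line at a time. This is only a sketch: the class name `TitleScanner` and the method `titlesIn` are invented for illustration, and it assumes each title's opening and closing tags sit on the same line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: finds every <title>...</title> pair within a single line.
final class TitleScanner {
    private static final String OPEN = "<title>";
    private static final String CLOSE = "</title>";

    static List<String> titlesIn(String line) {
        List<String> titles = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = line.indexOf(OPEN, from);
            if (start < 0) {
                break; // no more opening tags on this line
            }
            int end = line.indexOf(CLOSE, start);
            if (end < 0) {
                break; // title continues on a later line; not handled in this sketch
            }
            titles.add(line.substring(start + OPEN.length(), end));
            from = end + CLOSE.length(); // resume the search after the closing tag
        }
        return titles;
    }

    public static void main(String[] args) {
        // Made-up input with two titles on one line.
        String line = "<title>Feed</title><item><title>First story</title></item>";
        titlesIn(line).forEach(System.out::println);
    }
}
```

Inside `readRSSFeed`, each line read from the `BufferedReader` could be passed to `titlesIn` and the results appended to `sourceCode`.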