Java data structure to replace file I/O

Question

My program goes to my university's results page, finds all the links, and saves them to a file. It then reads that file, copies only the lines that contain the required links, and saves those to a second file. Finally, it parses the second file to extract the required data.

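import java.io.File;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.io.FileUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class net {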

    public static void main(String[] args) throws Exception {
        Document doc = Jsoup.connect("http://jntuconnect.net/results_archive/").get();

        Elements links = doc.select("a");
        File f1 = new File("flink.txt");
        File f2 = new File("rlink.txt");

            //write extracted links to f1 file
        FileUtils.writeLines(f1, links);

            // store each link from f1 file in string list
        List<String>  linklist  = FileUtils.readLines(f1);

            // second string list to store only required link elements
        List<String> rlinklist = new ArrayList<String>();

        // loop which finds required links and stores in rlinklist 
        for(String elem : linklist){
            if(elem.contains("B.Tech") && (elem.contains("R07")||elem.contains("R09"))){
                rlinklist.add(elem);                
            }           
        }           
        //store required links in f2 file
        FileUtils.writeLines(f2, rlinklist);

        // parse links from f2  file
        Document rdoc = Jsoup.parse(f2, null);
        Elements rlinks = rdoc.select("a");

        //  for storing hrefs and link text 
        List<String> rhref = new ArrayList<String>();
        List<String> rtext = new ArrayList<String>();

        for(Element rlink : rlinks){
            rhref.add(rlink.attr("href"));
            rtext.add(rlink.text());
        }

    }// end main

}

I don't want to create files to do this. Is there a better way to get hrefs and link texts of only specific urls without creating files?

It uses Apache Commons IO (FileUtils) and jsoup.

You already have the list in memory (Elements links). Just operate on that. Your code to write and read from files is completely unnecessary.
– vanza, Commented Jul 11, 2012 at 4:24

Ted Hopp · Accepted Answer · 2012-07-11 04:32:46Z

Here's how you can get rid of the first file write/read:

Elements links = doc.select("a");
List<String> linklist = new ArrayList<String>();
for (Element elt : links) {
    linklist.add(elt.toString());
}

The second round trip, if I understand the code, is intended to extract the links that meet a certain test. You can just do that in memory using the same technique.
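The in-memory filter could look something like the sketch below. It applies the same "B.Tech" plus "R07"/"R09" test from the question to a List<String> instead of to lines read back from a file. The class name LinkFilter, the method name keepRequired, and the sample markup in main are made up for illustration and are not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the name LinkFilter is not from the original post.
class LinkFilter {

    // Keeps only entries that mention "B.Tech" and either "R07" or "R09",
    // the same test the question applies to the lines read back from flink.txt.
    static List<String> keepRequired(List<String> linklist) {
        List<String> rlinklist = new ArrayList<String>();
        for (String elem : linklist) {
            if (elem.contains("B.Tech") && (elem.contains("R07") || elem.contains("R09"))) {
                rlinklist.add(elem);
            }
        }
        return rlinklist;
    }

    public static void main(String[] args) {
        // Made-up sample markup, for illustration only.
        List<String> linklist = Arrays.asList(
                "<a href=\"/r/1\">B.Tech R07 results</a>",
                "<a href=\"/r/2\">M.Tech R09 results</a>",
                "<a href=\"/r/3\">B.Tech R09 results</a>");
        for (String link : keepRequired(linklist)) {
            System.out.println(link);
        }
    }
}
```

In the question's program, the input would be the linklist built from the Elements in the loop shown earlier, so neither flink.txt nor rlink.txt is needed for this step.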

I see you're relying on Jsoup.parse to extract the href and link text from the selected links. You can do that in memory by writing the selected nodes to a StringBuffer, converting it to a String by calling its toString() method, and then using one of the Jsoup.parse methods that takes a String instead of a File argument.
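A minimal sketch of that last step, assuming jsoup is on the classpath. It uses StringBuilder (the unsynchronized counterpart of StringBuffer) to join the kept link markup, then hands the result to Jsoup.parse(String). The class name InMemoryLinks, the method name parseLinks, and the sample links are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper class; the name InMemoryLinks is not from the original post.
class InMemoryLinks {

    // Joins the kept link markup into one String and parses it in memory,
    // replacing the write to rlink.txt followed by Jsoup.parse(File, ...).
    static Document parseLinks(List<String> rlinklist) {
        StringBuilder sb = new StringBuilder();
        for (String link : rlinklist) {
            sb.append(link).append('\n');
        }
        return Jsoup.parse(sb.toString());
    }

    public static void main(String[] args) {
        // Made-up sample markup, for illustration only.
        List<String> rlinklist = new ArrayList<String>();
        rlinklist.add("<a href=\"/r/1\">B.Tech R07 results</a>");
        rlinklist.add("<a href=\"/r/3\">B.Tech R09 results</a>");

        // Collect hrefs and link text, as at the end of the question's main method.
        List<String> rhref = new ArrayList<String>();
        List<String> rtext = new ArrayList<String>();
        for (Element rlink : parseLinks(rlinklist).select("a")) {
            rhref.add(rlink.attr("href"));
            rtext.add(rlink.text());
        }
        System.out.println(rhref);
        System.out.println(rtext);
    }
}
```

As the comment on the question points out, the Element objects returned by doc.select("a") are already in memory, so calling attr("href") and text() on them directly would likely avoid the re-parse altogether.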

edited Jul 11, 2012 at 4:32

answered Jul 11, 2012 at 4:24
