I want to get all instances of jpg files from the "search results" page of Amazon using several while loops. I have included System.out.println statements to help me trace what my code is doing in the terminal output. The Java program successfully loops through the String str3 until it finds 's', 'r', 'c' in order, so it knows a source is ahead. It then takes the next 42 characters to check whether they match the prefix Amazon uses to display an image, "src=http://ecx.images-amazon.com/images/I/". The loop found all the chars in "src=http://ecx.images-amazon.com/images/I/" and turned the array of chars into a String called temp. temp is then compared to the String variable stringToFind. The two look equal: I have checked the output, and verified that .equals() was used and not ==. I haven't the faintest clue why the comparison in the second if statement does not work. Please help!

edit: TL;DR: The comparison for the if statement with the comment //DOES NOT ENTER LOOP!!!! does not work.

import java.util.Arrays;

public class JpgFinder {
    //Finds the URL for a jpg file within the Amazon.ca search results page
    //source code so that the image results of a user's search may be stored.
    public static void main(String[] args) {
        String str1 = "src=http://ecx.images-amazon.com/images/I/31IVWofSY8L._AA160_.jpg onload=";
        String str2 = "src=http://ecx.images-amazon.com/images/I/31ZTujPkvvL._AA160_.jpg onload=";
        String str3 = str1 + str2;
        int str3Length = str3.length();
        int counter1 = 0;
        int counter2 = 0;
        int counter3 = 0;
        int counter4 = 0;
        int counter5 = 0;
        int counter6 = 0;
        int sum = 0;
        String temp = "";
        char[] charArray = new char[100];
        char[] charArray2 = new char[100];
        String[] jpgArray = new String[500];
        boolean jpgFound = false;
        //Searches for src
        while (counter1 < str3Length) {
            System.out.println("1");
            if ((str3.charAt(counter1) == 's') && (str3.charAt(counter1 + 1) == 'r') && (str3.charAt(counter1 + 2) == 'c')) {
                //Found src
                System.out.println("2");
                counter3 = counter1;
                while (counter2 < 42) {
                    //Takes src=http://ecx.images-amazon.com/images/I/
                    System.out.println("3");
                    charArray[counter2] = str3.charAt(counter2);
                    counter2++;
                    counter1++;
                }
                temp = new String(charArray);
                String stringToFind = "src=http://ecx.images-amazon.com/images/I/";
                System.out.println(temp);
                System.out.println("4");
                if (temp.equals(stringToFind)) {
                    //If src=http://ecx.images-amazon.com/images/I/ is compared and confirmed, continue
                    //DOES NOT ENTER LOOP!!!!
                    System.out.println("5");
                    while ((counter2 < 82) && jpgFound == false) {
                        if ((str2.charAt(counter2) == '.') && (str3.charAt(counter2 + 1) == 'j') && (str3.charAt(counter2 + 2) == 'p') && (str3.charAt(counter2 + 3) == 'g')) {
                            counter2++;
                            jpgFound = true;
                            counter4 = counter2 + 3;
                            sum = counter4 - counter3;
                            System.out.println("6");
                            while (counter5 < sum) {
                                charArray2[counter5] = str3.charAt(counter5);
                                System.out.println("7");
                            }
                        }
                        else {
                            counter2++;
                            System.out.println("8");
                        }
                    }
                }
                System.out.println("9");
                System.out.println("DID NOT ENTER");
            }
            String temp2 = new String(charArray2);
            jpgArray[counter6] = temp2;
            counter6++;
            counter1++;
            System.out.println("10");
        }
        System.out.println("Second attempt: " + temp);
        System.out.println("Jpgs: " + Arrays.toString(jpgArray));
    }
}

Output:

1 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 src=http://ecx.images-amazon.com/images/I/ 4 9 DID NOT ENTER 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 2

src=http://ecx.images-amazon.com/images/I/

4 9 DID NOT ENTER 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 1 10 Second attempt: src=http://ecx.images-amazon.com/images/I/ Jpgs: [, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, 
null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null]

  • 1
    This won't solve this problem, but you should look at JSoup if you are trying to scrape websites. Very powerful tool which might simplify things for you and make your program less prone to bugs. Commented Jul 6, 2015 at 14:18
  • 2
    I see you're using a lot of sysouts to debug your code. Have you tried printing out temp and stringToFind prior to the condition? Commented Jul 6, 2015 at 14:19
  • Can you write output of the program? Commented Jul 6, 2015 at 14:19
  • 2
    Are you completely sure that both strings are equal? Including spaces at the start and end of them? Commented Jul 6, 2015 at 14:21
  • 3
    Well, this is because src=http://ecx.images-amazon.com/images/I/ and src=http://ecx.images-amazon.com/images/I/ are not equal. There are some trailing characters in the 'temp' string. Commented Jul 6, 2015 at 14:22

2 Answers

1

It is because temp and stringToFind are not equal.

temp has a length of 100 and stringToFind has a length of 42.

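Why does temp have a length of 100? Because new String(charArray) creates a String from every char in the array, including the ones you have not assigned, which still hold the default value '\u0000'. charArray is allocated with 100 elements, so temp is the 42 characters you copied followed by 58 null characters. Those are usually invisible when printed, which is why the two strings look identical in your output.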

Also, it would be more productive to use IDE debugging support, like that found in NetBeans or Eclipse. Debugging a program with prints is cumbersome.

Instead, use new String(charArray, 0, 42)
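
To make the length mismatch visible, here is a minimal sketch that fills a 100-element buffer the way the question does and builds a String from it both ways. The class name TrailingCharsDemo and the helper fillBuffer are made up for this illustration; only the two String constructors come from the JDK.

```java
class TrailingCharsDemo {
    // The 42-character prefix from the question.
    static final String PREFIX = "src=http://ecx.images-amazon.com/images/I/";

    // Hypothetical helper: copies PREFIX into an oversized buffer,
    // leaving the remaining slots at their default value '\u0000'.
    static char[] fillBuffer() {
        char[] buffer = new char[100];
        for (int i = 0; i < PREFIX.length(); i++) {
            buffer[i] = PREFIX.charAt(i);
        }
        return buffer;
    }

    public static void main(String[] args) {
        char[] buffer = fillBuffer();

        // Uses all 100 chars, including the 58 unassigned ones.
        String whole = new String(buffer);
        // Uses only the first 42 chars that were actually copied.
        String trimmed = new String(buffer, 0, PREFIX.length());

        System.out.println(whole.length());          // 100
        System.out.println(whole.equals(PREFIX));    // false
        System.out.println(trimmed.length());        // 42
        System.out.println(trimmed.equals(PREFIX));  // true
    }
}
```

Printing the length of temp next to the string itself is a quick way to catch this kind of invisible trailing content.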


0

I think you should use the indexOf(String str) method of the String class to get the location of the "src=" string, then use the substring(int,int) method to get the substrings containing the string parts you want, then you can use the equals method.
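
A rough sketch of that idea, using the sample strings from the question. It is one possible reading of the suggestion rather than a definitive implementation: it searches for the whole image prefix with indexOf instead of just "src=", so no separate equals check is needed, and it assumes every image URL ends at the first ".jpg" after the prefix. The class name JpgUrlExtractor and the method findJpgUrls are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class JpgUrlExtractor {
    static final String PREFIX = "src=http://ecx.images-amazon.com/images/I/";
    static final String SUFFIX = ".jpg";

    // Hypothetical helper: returns every URL that starts right after "src="
    // in an occurrence of PREFIX and ends with the next ".jpg".
    static List<String> findJpgUrls(String html) {
        List<String> urls = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = html.indexOf(PREFIX, from);
            if (start < 0) {
                break; // no more image sources
            }
            int urlStart = start + "src=".length();
            int end = html.indexOf(SUFFIX, urlStart);
            if (end < 0) {
                break; // prefix found but no ".jpg" after it
            }
            end += SUFFIX.length();
            urls.add(html.substring(urlStart, end));
            from = end; // continue searching after this URL
        }
        return urls;
    }

    public static void main(String[] args) {
        String str1 = "src=http://ecx.images-amazon.com/images/I/31IVWofSY8L._AA160_.jpg onload=";
        String str2 = "src=http://ecx.images-amazon.com/images/I/31ZTujPkvvL._AA160_.jpg onload=";
        for (String url : findJpgUrls(str1 + str2)) {
            System.out.println(url);
        }
    }
}
```

Collecting results in a List also avoids the fixed-size jpgArray and its trailing null entries.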
