Whenever I try to perform a GET request on a certain website (seen below) I always get a SocketTimeoutException. I only get this problem in Java, whereas if I try using Python's requests library I successfully manage to get the text.

String link = "https://www.yeezysupply.com/api/products/FV6125/availability";

try {
    Connection connection = Jsoup.connect(link);
    connection.header("content-type", "application/json; charset=utf-8");

    Document document = connection.get();

    System.out.println(document.text());
} catch (IOException e) {
    e.printStackTrace();
}

Here is a screenshot of the error: https://prnt.sc/rp1ym9

Line 64 from my Main class is Document document = connection.get();

Also, when I use the Chrome extension 'PlugMan' I am able to successfully obtain the body from the site using a GET request, so clearly there is an issue with how I'm doing it in Java, because it works elsewhere.

Thank you.

2 Answers

EDIT: The site has a countermeasure to block bots. The only way I got it to respond was by setting a user agent. This is how you'd set it with Jsoup:

Connection.Response resp = Jsoup.connect(link)
        // Pass only the value; Jsoup adds the "User-Agent" header name itself.
        .userAgent("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.5 Safari/605.1.15")
        .followRedirects(true)
        .execute();

Document document = resp.parse();

My original (and wrong) assumption kept for reference below:

I don't think this is a Java or coding issue. That site simply doesn't respond. Is the web site up, or possibly do you have a required proxy configured for Python and it's not used in the Java code? If that's the case, take a look at this: https://docs.oracle.com/javase/7/docs/technotes/guides/net/proxies.html

When I try a simple wget from my workstation, the site doesn't answer:

$ wget https://www.yeezysupply.com/api/products/FV6125/availability

--2020-03-29 17:59:13--  https://www.yeezysupply.com/api/products/FV6125/availability
Resolving www.yeezysupply.com (www.yeezysupply.com)... 184.28.114.123, 184.28.114.129
Connecting to www.yeezysupply.com (www.yeezysupply.com)|184.28.114.123|:443... connected.
HTTP request sent, awaiting response... Read error (Operation timed out) in headers.
Retrying.

3 Comments

When visiting the site I do receive the message I'm looking for, so the website is definitely up: prnt.sc/rp20or
I upvoted this answer. It appears to me that if the API expects a user agent or something similar, its intended consumer could only be a UI page, possibly its own UI.
I imagine this service was originally built to serve some JavaScript-based UI, with the idea that if there's a UI, then there's a single user. Obviously, that's weak sauce, but that's definitely what's going on. If you just curl that URL, you get a fancy warning telling you you're a bot, so you don't get to see it :)

Two remarks:

  • content-type describes the body of the message it is sent with. A GET request has no body, so setting it here does nothing useful. To indicate which response format you expect, use the accept header.
  • It's a good habit to add a user-agent header. Some servers don't respond without one, and that's the case here.

connection.header("accept", "text/html, application/xhtml+xml, application/xml");
connection.header("user-agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:54.0) Gecko/20100101 Firefox/74.0");

These two headers make the connection possible, but I have to disappoint you. Regardless of what you set in the accept header, the response contains JSON, and Jsoup can't parse JSON, only HTML and XML. You'll have to use another library to download and parse it.

EDIT:
To download JSON to String using Jsoup, instead of

connection.get();

use:

connection.ignoreContentType(true).execute().body();
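
If you would rather not route a JSON call through Jsoup at all, the same request can probably be made with the JDK's built-in java.net.http.HttpClient (Java 11+). The following is only a minimal sketch under a few assumptions: the class name AvailabilityFetcher, the 10-second timeouts, the "application/json" accept value, and the exact user-agent string are illustrative choices, not something the site is known to require.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper class; the name and constants are assumptions for illustration.
class AvailabilityFetcher {

    // Assumed browser-like user agent; any realistic value should behave the same.
    static final String USER_AGENT =
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:74.0) Gecko/20100101 Firefox/74.0";

    // Builds a GET request carrying the accept and user-agent headers discussed above.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(10)) // assumed limit for the whole exchange
                .header("Accept", "application/json")
                .header("User-Agent", USER_AGENT)
                .GET()
                .build();
    }

    // Sends the request and returns the raw response body as a String (JSON text here).
    static String fetchBody(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .connectTimeout(Duration.ofSeconds(10)) // assumed connect timeout
                .build();
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

Calling AvailabilityFetcher.fetchBody(link) should return the same text as the Jsoup one-liner above, and the resulting String can then be handed to a JSON library such as GSON. Whether the server accepts this client still depends on its bot detection, so treat the headers as a starting point.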

2 Comments

How should I open the connection then? I can use GSON to parse the JSON response, but if Jsoup can't retrieve the data then how else can I do it?
Your updated answer got me exactly what I needed, thanks a lot!
