HTTP response code in Java

Question

This is the code I used:

import java.net.HttpURLConnection;
import java.net.URL;

class ResponseCodeCheck
{
    public static void main(String[] args) throws Exception
    {
        URL url = new URL("http://www.amazon.co.jp/gp/seller/sell-your-stuff.html");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");
        connection.connect();

        // Print the HTTP status code returned by the server
        int code = connection.getResponseCode();
        System.out.println("Response code of the object is " + code);
        if (code == 200)
        {
            System.out.println("OK");
        }
    }
}

It returned 404 for the URL, although the URL works fine in a browser. Any reason why?

What does that mean? If I change the URL to 'google.com', the above code works fine.
– user801154, Commented Jul 3, 2012 at 10:43
You say the URL is working fine. Did you check in your browser? Does your browser access the internet through a proxy?
– Qwerky, Commented Jul 3, 2012 at 10:57
@PhilippReichart lol!!! Sorry, I've been sick the last few days and came back to work today, so I'm a bit off. Sorry!
– Th0rndike, Commented Jul 3, 2012 at 11:38

corsair · Accepted Answer · 2012-07-03 11:04:24Z

2

Set a proper value for the "User-Agent" header before connecting:

connection.addRequestProperty("User-Agent", "Safari");
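
For context, when no User-Agent is set explicitly, HttpURLConnection sends a default one of the form "Java/<version>", which some sites may filter. Below is a minimal sketch of the whole check with the header set. The class name UserAgentCheck, the helper open, and the "Mozilla/5.0" value are illustrative assumptions, not something the site is known to require; it uses setRequestProperty (which overwrites any existing value) where the line above uses addRequestProperty, and either should work here.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class UserAgentCheck {

    // Opens (but does not yet connect) a GET connection with an explicit User-Agent.
    // "open" is a hypothetical helper name used only for this sketch.
    static HttpURLConnection open(String url, String userAgent) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        connection.setRequestMethod("GET");
        // setRequestProperty overwrites any existing value for the header.
        connection.setRequestProperty("User-Agent", userAgent);
        return connection;
    }

    public static void main(String[] args) throws IOException {
        // The User-Agent string is an arbitrary placeholder (assumption).
        HttpURLConnection connection = open(
                "http://www.amazon.co.jp/gp/seller/sell-your-stuff.html", "Mozilla/5.0");
        try {
            // getResponseCode() connects implicitly and sends the request.
            System.out.println("Response code: " + connection.getResponseCode());
        } finally {
            connection.disconnect();
        }
    }
}
```

The header has to be set before the connection is made; calling setRequestProperty after connect() or getResponseCode() throws an IllegalStateException.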

answered Jul 3, 2012 at 11:04

corsair


5 Comments

Philipp Reichart Over a year ago

Interestingly, Amazon.co.jp seems to 404 any UA containing curl (probably to prevent scraping). Even a bogus UA like foo works.

user801154 Over a year ago

Yeah, setting the UA gives 200, but is there any reason why this happens?

Philipp Reichart Over a year ago

This most likely happens to prevent programs/spiders from downloading all catalogue data off Amazon.co.jp (which most likely is a violation of their TOS anyway).

user801154 Over a year ago

Still, I don't understand why changing 'http' to 'https' gives 301.

corsair Over a year ago

My guess is that Amazon simply has some filter that checks the UA value.

ŁukaszBachman · Accepted Answer · 2012-07-03 11:02:36Z

0

curl reports:

curl -v http://www.amazon.co.jp/gp/seller/sell-your-stuff.html
* About to connect() to www.amazon.co.jp port 80 (#0)
*   Trying 176.32.120.128... connected
> GET /gp/seller/sell-your-stuff.html HTTP/1.1
> User-Agent: curl/7.23.1 (x86_64-pc-win32) libcurl/7.23.1 OpenSSL/0.9.8r zlib/1.2.5
> Host: www.amazon.co.jp
> Accept: */*
>
< HTTP/1.1 301 MovedPermanently

Note the HTTP/1.1 301 MovedPermanently line. Are you sure you received a 404 and not a 301? This is common web practice: a 301 status means the content has moved to another location, given in the Location header, and the client (browser) should navigate to it.

Also, please make sure that your HttpURLConnection is allowed to follow redirects.
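
One detail worth knowing here: HttpURLConnection follows redirects by default, but it does not follow a redirect that switches protocol (for example from http to https), so a 301 can still surface in getResponseCode(). A rough sketch of following the Location header by hand is below. The class name RedirectCheck, the helper names, the hop limit of 5, and the "Mozilla/5.0" User-Agent are all assumptions made for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class RedirectCheck {

    // True for the status codes that normally carry a Location header to follow.
    static boolean isRedirect(int code) {
        return code == 301 || code == 302 || code == 303 || code == 307 || code == 308;
    }

    // Resolves a possibly relative Location value against the URL that returned it.
    static String resolve(String base, String location) {
        return URI.create(base).resolve(location).toString();
    }

    // Follows redirects manually, up to maxHops, and returns the final status code.
    static int finalStatus(String url, int maxHops) throws IOException {
        for (int hop = 0; hop <= maxHops; hop++) {
            HttpURLConnection connection =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            // Disable automatic following so every hop is visible.
            connection.setInstanceFollowRedirects(false);
            connection.setRequestProperty("User-Agent", "Mozilla/5.0"); // placeholder value
            try {
                int code = connection.getResponseCode();
                String location = connection.getHeaderField("Location");
                if (!isRedirect(code) || location == null) {
                    return code;
                }
                System.out.println(code + " -> " + location);
                url = resolve(url, location);
            } finally {
                connection.disconnect();
            }
        }
        throw new IOException("Too many redirects");
    }

    public static void main(String[] args) throws IOException {
        System.out.println(
                finalStatus("http://www.amazon.co.jp/gp/seller/sell-your-stuff.html", 5));
    }
}
```

Printing each hop makes it easy to see whether the server is redirecting between http and https, which automatic redirect handling would not follow.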

answered Jul 3, 2012 at 11:02

ŁukaszBachman


2 Comments

user801154 Over a year ago

If I change the URL to 'amazon.co.jp/gp/seller/sell-your-stuff.html' from 'amazon.co.jp/gp/seller/sell-your-stuff.html', it starts giving 301.

Philipp Reichart Over a year ago

With -H "User-Agent: foo" you get the actual page content. A 403 Forbidden with a "please don't crawl us" message would have been a lot nicer on Amazon's part, though :/
