12

I have a question about Java. I have used URLConnection to obtain a DataInputStream, and I want to convert the contents of that DataInputStream into a String variable. How should I do this? Can anyone help me? Thank you.

The following is my code:

URL data = new URL("http://google.com");
URLConnection dataConnection = data.openConnection();
DataInputStream dis = new DataInputStream(dataConnection.getInputStream());
String data_string;
// convert the DataInputStream to a String
7
  • 4
    Do you want to convert a DataInputStream to a String, or do you want to read a String from a DataInputStream? Commented Oct 6, 2010 at 8:54
  • @org.life.java, thank you for your reply. I want to convert the DataInputStream to a String, like (data_string = dis;). By the way, I think this is a different question, so I posted it as a new question rather than adding to my old one. Thank you. :-) Commented Oct 6, 2010 at 8:59
  • To convert it you can just write String str = dis.toString();, but that will give you the string representation of the object. I don't understand why you need this. Or do you want to read the content of google.com here? Commented Oct 6, 2010 at 9:01
  • @org.life.java, thank you for your reply. Google is just an example. What do you mean by 'give you the string representation of the object'? Commented Oct 6, 2010 at 9:05
  • Object has a toString method that returns a string representation of the object. I don't think that is what you are looking for. What exactly do you want to achieve by converting dis to a String? Please explain with an example. Commented Oct 6, 2010 at 9:09
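To make the toString() point from the comments concrete, here is a minimal sketch. The class and method names (ToStringDemo, describe) are made up for illustration, and an in-memory byte array stands in for the network stream. DataInputStream does not override toString(), so the call returns the class name plus a hash code, not the bytes in the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.nio.charset.StandardCharsets;

public class ToStringDemo {

    // Hypothetical helper: wraps the given text in a DataInputStream and
    // returns what toString() reports for the stream object itself.
    public static String describe(String text) {
        DataInputStream dis = new DataInputStream(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        return dis.toString();
    }

    public static void main(String[] args) {
        // Prints something like java.io.DataInputStream@1b6d3586
        // (the hexadecimal part varies from run to run).
        System.out.println(describe("<html><head>"));
    }
}
```

So data_string = dis.toString() compiles, but it never reads anything from the stream.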

3 Answers

11
import java.net.*;
import java.io.*;

class ConnectionTest {
    public static void main(String[] args) {
        try {
            URL google = new URL("http://www.google.com/");
            URLConnection googleConnection = google.openConnection();
            DataInputStream dis = new DataInputStream(googleConnection.getInputStream());
            StringBuffer inputLine = new StringBuffer();
            String tmp; 
            while ((tmp = dis.readLine()) != null) {
                inputLine.append(tmp);
                System.out.println(tmp);
            }
            //use inputLine.toString(); here it would have whole source
            dis.close();
        } catch (MalformedURLException me) {
            System.out.println("MalformedURLException: " + me);
        } catch (IOException ioe) {
            System.out.println("IOException: " + ioe);
        }
    }
}  

This is what you want.


5 Comments

@org.life.java, thank you for your answer. I think there is some misunderstanding of the problem. After 'System.out.println(inputLine);', inputLine becomes null, but I want inputLine to hold "<html><head..." so that I can use it in another class later. Would you mind giving me another suggestion? Thank you.
@org.life.java, that was a great help. Thank you very much, and sorry for taking up your time.
I don't believe this can work. readUTF() expects string data to be stored in a specific way (see download.oracle.com/javase/1.3/docs/api/java/io/…). This will not be the case if you try to read content from an arbitrary URL.
@Grodriguez Thanks for letting me know. I have changed it back to readLine. I know it is deprecated; other solutions are already here, such as bozho's.
If you use DataInputStream.readLine(), your solution will not work correctly if the content encoding of the URL you are accessing is anything different than plain ASCII. This is why the readLine method is deprecated. See my answer to this same question for a way to read the contents of the URL taking into account the content encoding, without resorting to any external libraries.
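The encoding problem described in the last comment can be reproduced without any network access. The following is a minimal sketch, assuming UTF-8 input held in a byte array; the class and method names are invented for the demo. DataInputStream.readLine() turns each byte into one char, so a two-byte UTF-8 character comes out as two wrong characters, while a reader created with the right charset decodes it correctly.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ReadLineEncodingDemo {

    // Reads the first line with the deprecated DataInputStream.readLine(),
    // which maps every byte to one char and ignores the encoding.
    @SuppressWarnings("deprecation")
    public static String readWithDataInputStream(byte[] bytes) {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return dis.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the first line through a reader that decodes the bytes as UTF-8.
    public static String readWithReader(byte[] bytes) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "café" followed by a newline; the é takes two bytes in UTF-8.
        byte[] utf8 = "caf\u00E9\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readWithDataInputStream(utf8).length()); // prints 5
        System.out.println(readWithReader(utf8).length());          // prints 4
    }
}
```

For pure ASCII content both methods return the same text, which is why the problem is easy to miss in a quick test.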
7

You can use commons-io IOUtils.toString(dataConnection.getInputStream(), encoding) in order to achieve your goal.

DataInputStream is not meant for what you want, which is to read the content of a website as a String.

4 Comments

This does not take into account the content encoding for the URL you are accessing. You should use the two argument version of the IOUtils.toString method in order to explicitly specify the encoding.
@Grodriguez or use an InputStreamReader. I added the encoding, a good practice indeed.
Even if you pass an InputStreamReader instead, you still need to specify the encoding when the InputStreamReader is created, otherwise you will have the same problem (the default platform encoding would be used, which may or may not match the encoding of the URL content).
@Grodriguez that's what I meant by the InputStreamReader suggestion. (Btw the downvote can be removed, I guess)
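For readers who would rather not add a library, the InputStreamReader approach from these comments can be sketched with the standard library alone. This is an illustration, not the commons-io implementation: the class and method names (StreamToString, readAll) are hypothetical, and the caller is assumed to know the charset already.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StreamToString {

    // Reads the whole stream into a String, decoding with the given charset.
    // The stream is closed when reading finishes.
    public static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for dataConnection.getInputStream().
        byte[] bytes = "<html>\ncaf\u00E9\n</html>".getBytes(StandardCharsets.UTF_8);
        String content = readAll(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
        System.out.println(content);
    }
}
```

Unlike a readLine() loop, reading into a char buffer keeps the original line breaks exactly as they were sent.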
7

If you want to read data from a generic URL (such as www.google.com), you probably don't want to use a DataInputStream at all. Instead, create a BufferedReader and read line by line with the readLine() method. Use the URLConnection.getContentType() method to find out the content's charset (you will need this in order to create your reader properly).

Example:

URL data = new URL("http://google.com");
URLConnection dataConnection = data.openConnection();

// Find out charset, default to ISO-8859-1 if unknown
String charset = "ISO-8859-1";
String contentType = dataConnection.getContentType();
if (contentType != null) {
    int pos = contentType.indexOf("charset=");
    if (pos != -1) {
        charset = contentType.substring(pos + "charset=".length());
    }
}

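// Create reader and read string data
BufferedReader r = new BufferedReader(
        new InputStreamReader(dataConnection.getInputStream(), charset));
// Collect the lines in a StringBuilder to avoid repeated String concatenation
StringBuilder sb = new StringBuilder();
String line;
while ((line = r.readLine()) != null) {
    sb.append(line).append("\n");
}
r.close();
String content = sb.toString();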
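One caveat with the charset lookup in the example: a Content-Type value can put the charset in quotes or follow it with further parameters, and the plain substring would then include that extra text. Below is a slightly more defensive sketch. The helper name charsetFromContentType is hypothetical, and the parsing is deliberately simple (it splits on semicolons and does not handle semicolons inside quoted values).

```java
import java.util.Locale;

public class ContentTypeCharset {

    // Hypothetical helper: extracts the charset parameter from a Content-Type
    // value such as "text/html; charset=UTF-8", or returns the fallback.
    public static String charsetFromContentType(String contentType, String fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String param = part.trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? fallback : value;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(charsetFromContentType("text/html; charset=UTF-8", "ISO-8859-1")); // prints UTF-8
        System.out.println(charsetFromContentType("text/html", "ISO-8859-1"));                // prints ISO-8859-1
    }
}
```

The result can be passed to the InputStreamReader constructor in place of the charset variable computed above.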

1 Comment

Does the Content-Encoding header really contain the character set? According to the specs it should contain a value such as gzip. You should be looking at the charset.
