
I'm converting a big file (18 GB) to a byte[], but I got this error:

java.lang.OutOfMemoryError: Java heap space

This is the code responsible for the exception:

byte[] content = Files.readAllBytes(path); 

I'm creating the byte array to send it over network:

createFile(filename.toString(),content);

private ADLStoreClient client; // package com.microsoft.azure.datalake.store
public boolean createFile(String filename, byte[] content) {
        try {
            // create file and write some content
            OutputStream stream = client.createFile(filename, IfExists.OVERWRITE);
            // set file permission
            client.setPermission(filename, "777");
            // append to file
            stream.write(content);
            stream.close();

        } catch (ADLException ex) {
            printExceptionDetails(ex);
            return false;
        } catch (Exception ex) {
            log.error(" Exception: {}", ex);
            return false;
        }
        return true;
    }

Obviously readAllBytes() reads all the bytes into memory and causes the OutOfMemoryError. I think this can be solved using streams, but I'm not good with them. Can anybody give a proper solution? Thanks

  • What are you actually trying to accomplish? Commented Oct 10, 2018 at 15:58

4 Answers


As the Azure ADLStoreClient documentation states:

createFile(String path, IfExists mode)

create a file. If overwriteIfExists is false and the file already exists, then an exception is thrown. The call returns an ADLFileOutputStream that can then be written to.

So something like this:

try (InputStream in = new FileInputStream(path);
     OutputStream out = client.createFile(filename, IfExists.OVERWRITE)) {
    IOUtils.copyLarge(in, out);
}

You can get IOUtils from commons-io, or write the copyLarge routine yourself; it's very simple:

void copyLarge(InputStream in, OutputStream out) throws IOException {
    byte[] buffer = new byte[65536]; // 64 KB chunks keep memory use constant regardless of file size
    int length;
    while ((length = in.read(buffer)) > 0) {
        out.write(buffer, 0, length);
    }
}
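If you want to sanity-check the copy loop without an 18 GB file, the same routine works against in-memory streams. This demo class (the name `CopyLargeDemo` is just for illustration, it is not part of commons-io) copies 200,000 bytes through the 64 KB buffer and verifies the result:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

public class CopyLargeDemo {

    // same fixed-buffer copy loop as above: memory use stays at 64 KB
    // no matter how large the source is
    static void copyLarge(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[65536];
        int length;
        while ((length = in.read(buffer)) > 0) {
            out.write(buffer, 0, length);
        }
    }

    public static void main(String[] args) throws IOException {
        // 200,000 bytes: bigger than one buffer, so the loop runs several times
        byte[] data = new byte[200_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copyLarge(new ByteArrayInputStream(data), out);

        System.out.println(Arrays.equals(data, out.toByteArray())); // prints true
    }
}
```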



Something like this? (if you like to process it line-by-line)

try (Stream<String> stream = Files.lines(Paths.get(fileName))) {
    stream.forEach(System.out::println);
} catch (IOException e) {
    e.printStackTrace();
}
...
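Note that Files.lines only fits text files; for a binary file like the one in the question, the byte-oriented equivalent is a plain buffered read loop. A minimal sketch (a small temp file stands in for the real 18 GB file so the example is self-contained):

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ChunkRead {
    public static void main(String[] args) throws IOException {
        // write a small sample file so the example is self-contained
        File file = File.createTempFile("sample", ".bin");
        try (OutputStream out = new FileOutputStream(file)) {
            out.write(new byte[150_000]);
        }

        long total = 0;
        try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) > 0) {
                total += n; // process buffer[0..n) here instead of just counting
            }
        }
        System.out.println(total); // prints 150000
        file.delete();
    }
}
```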



Here is a FileStream class that I use to read files as a stream of byte[] chunks:

/**
 * Allows a file to be read and iterated over and allow to take advantage of java streams
 * @author locus2k
 *
 */
public class FileStream implements Iterator<byte[]>, Iterable<byte[]>, Spliterator<byte[]> {

  private InputStream stream;
  private int bufferSize;
  private long blockCount;


  /**
   * Create a FileStream
   * @param stream the input stream containing the content to be read
   * @param fileSize total size in bytes of the content, used to estimate the block count
   * @param bufferSize size of the buffer that should be read at once from the stream
   */
  private FileStream(InputStream stream, long fileSize, int bufferSize) {
    this.bufferSize = bufferSize;
    //calculate how many blocks will be generated by this stream
    //(double rather than float, so multi-GB sizes don't lose precision)
    this.blockCount = (long) Math.ceil((double) fileSize / (double) bufferSize);
    this.stream = stream;
  }

  @Override
  public boolean hasNext() {
    boolean hasNext = false;
    try {
      hasNext = stream.available() > 0;
      return hasNext;
    } catch (IOException e) {
      return false;
    } finally {
      //close the stream if there is no more to read
      if (!hasNext) {
        close();
      }
    }
  }

  @Override
  public byte[] next() {
    try {
      byte[] data = new byte[Math.min(bufferSize, stream.available())];
      //read() may return fewer bytes than requested, so loop until the chunk is full
      int offset = 0;
      while (offset < data.length) {
        int read = stream.read(data, offset, data.length - offset);
        if (read < 0)
          break;
        offset += read;
      }
      return data;
    } catch (IOException e) {
      //Close the stream if next causes an exception
      close();
      throw new RuntimeException(e.getMessage());
    }
  }

  /**
   * Close the stream
   */
  public void close() {
    try {
      stream.close();
    } catch (IOException e) { }
  }

  @Override
  public boolean tryAdvance(Consumer<? super byte[]> action) {
    action.accept(next());
    return hasNext();
  }

  @Override
  public Spliterator<byte[]> trySplit() {
    return this;
  }

  @Override
  public long estimateSize() {
    return blockCount;
  }

  @Override
  public int characteristics() {
    return Spliterator.IMMUTABLE;
  }

  @Override
  public Iterator<byte[]> iterator() {
    return this;
  }

  @Override
  public void forEachRemaining(Consumer<? super byte[]> action) {
    while(hasNext())
      action.accept(next());
  }

  /**
   * Create a java stream
   * @param inParallel if true then the returned stream is a parallel stream; if false the returned stream is a sequential stream.
   * @return stream with the data
   */
  private Stream<byte[]> stream(boolean inParallel) {
    return StreamSupport.stream(this, inParallel);
  }

  /**
   * Create a File Stream reader
   * @param fileName Name of the file to stream
   * @param bufferSize size of the buffer that should be read at once from the stream
   * @return Stream representation of the file
   */
  public static Stream<byte[]> stream(String fileName, int bufferSize) {
    return stream(new File(fileName), bufferSize);
  }

  /**
   * Create a FileStream reader
   * @param file The file to read
   * @param bufferSize the size of each read
   * @return the stream
   */
  public static Stream<byte[]> stream(File file, int bufferSize) {
    try {
      return stream(new FileInputStream(file), bufferSize);
    } catch (FileNotFoundException ex) {
      throw new IllegalArgumentException(ex.getMessage());
    }
  }

  /**
   * Create a file stream reader
   * @param stream the stream to read from (note this process will close the stream)
   * @param bufferSize size of each read
   * @return the stream
   */
  public static Stream<byte[]> stream(InputStream stream, int bufferSize) {
    try {
      //note: available() is only an estimate and is capped at Integer.MAX_VALUE,
      //so prefer the File overload for very large files
      return new FileStream(stream, stream.available(), bufferSize).stream(false);
    } catch (IOException ex) {
      throw new IllegalArgumentException(ex.getMessage());
    }
  }

  /**
   * Calculate the number of segments that will be created
   * @param sourceSize the size of the file
   * @param bufferSize the buffer size (or chunk size for each segment to be)
   * @return the number of packets that will be created
   */
  public static long calculateEstimatedSize(long sourceSize, Integer bufferSize) {
    return (long) Math.ceil((double) sourceSize / (double) bufferSize);
  }
}

Then to use it you can do something like

FileStream.stream("myfile.text", 30000).forEach(b -> System.out.println(b.length));

This creates a stream over the file; each byte array passed to forEach is at most the specified buffer size, in this case 30,000 bytes.
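The chunk-count estimate used above can also be done with pure integer arithmetic, which sidesteps any floating-point precision concerns on multi-gigabyte sizes (the class and method names below are mine, for illustration):

```java
public class ChunkCount {

    // ceil(sourceSize / bufferSize) without going through floating point
    static long chunkCount(long sourceSize, int bufferSize) {
        return (sourceSize + bufferSize - 1) / bufferSize;
    }

    public static void main(String[] args) {
        System.out.println(chunkCount(90_000, 30_000));          // prints 3
        System.out.println(chunkCount(90_001, 30_000));          // prints 4: last chunk holds 1 byte
        System.out.println(chunkCount(18_000_000_000L, 30_000)); // prints 600000
    }
}
```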



From what you said, you are trying to hold all 18 GB in memory (RAM). You can raise the maximum heap size with the -Xmx option (-Xms sets the initial heap size), but you would need at least 18 GB of free memory; see the Java documentation on -Xms/-Xmx. That said, streaming the file avoids needing that much memory in the first place.
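For example, a launch command might look like this (the jar name is a placeholder, and the machine needs that much free RAM plus headroom):

```shell
# raise the maximum heap to 20 GB; -Xms sets the starting heap size
java -Xms4g -Xmx20g -jar myapp.jar
```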

