I'm using FormScanner, and after processing some images it gives the following error:

Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:717)
at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:950)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1357)
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134)
at com.albertoborsetta.formscanner.api.FormTemplate.findPoints(FormTemplate.java:852)
at com.albertoborsetta.formscanner.model.FormScannerModel.analyzeFiles(FormScannerModel.java:562)
at com.albertoborsetta.formscanner.main.FormScanner.main(FormScanner.java:145)

The findPoints method is as follows:

public void findPoints(BufferedImage image, int threshold, int density,
        int size) throws FormScannerException {
    height = image.getHeight();
    width = image.getWidth();
    int cores = Runtime.getRuntime().availableProcessors();

    ExecutorService threadPool = Executors.newFixedThreadPool(cores - 1);
    HashSet<Future<HashMap<String, FormQuestion>>> fieldDetectorThreads = new HashSet<>();

    HashMap<String, FormQuestion> templateFields = template.getFields();
    ArrayList<String> fieldNames = new ArrayList<>(templateFields.keySet());
    Collections.sort(fieldNames);

    for (String fieldName : fieldNames) {
        Future<HashMap<String, FormQuestion>> future = threadPool.submit(new FieldDetector(threshold, density, size, this, templateFields.get(fieldName), image));
        fieldDetectorThreads.add(future);
    }

    for (Future<HashMap<String, FormQuestion>> thread : fieldDetectorThreads) {
        try {
            HashMap<String, FormQuestion> threadFields = thread.get();
            for (String fieldName : threadFields.keySet()) {
                FormQuestion field = threadFields.get(fieldName);
                fields.put(fieldName, field);
                for (Entry<String, FormPoint> point : field.getPoints().entrySet()) {
                    if (point.getValue() != null) {
                        pointList.add(point.getValue());
                    }
                }
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new FormScannerException(e.getCause());
        }
    }

    threadPool.shutdown();

}

The above function is called in a loop. The number of Java processes keeps growing, and at some point the above exception is raised.

Is there any way to make sure these threads are killed after the shutdown method is called? I'm not a Java developer. I did some research, but without success.

  • Are you running a 32-bit or 64-bit JVM? Which OS are you using? What are the memory settings (Xmx, Xms, Xss...)? Commented Jun 22, 2017 at 7:58
  • The pool is supposed to limit the number of threads running, and you have a thread.get that waits for each thread to end. So unless you are calling this method from new threads (which would be visible in the stack trace), I don't see how this could overload. But of course, it could be that the first running FieldDetector instances take all the available memory. What does the FieldDetector instance do? Commented Jun 22, 2017 at 8:10
  • I'm using 64-bit Ubuntu with OpenJDK. Commented Jun 22, 2017 at 8:16
  • @AxelH I have no idea about it. Commented Jun 22, 2017 at 8:20
  • I see that you have posted a question on their Google forum. Searching there, I noticed a post here about memory usage that is only cleaned up when the app is closed. So I would say that the API has a memory leak: the previous files are not released correctly, which leads to your problem after a lot of files have been processed. It is open source, so I could try to check the source this weekend if you have time. I am curious about OMR now ;) Commented Jun 22, 2017 at 8:31

1 Answer

The problem comes from the Set<Future> used to hold every instance so they can be checked later.

In chat, you told me you were checking 120,000 files. That means that many Futures are created; whenever the pool finds a free slot, it creates a Thread to execute the Callable.

Since the Set holds every instance, the Threads are not garbage collected, and that is what causes the leak. You need to remove each Future once it has been used, so the GC can free the memory for the next Thread.

Using an iterator instead of the for-each loop is simple and lets you remove the current instance before using it:

Iterator<Future<HashMap<String, FormQuestion>>> iterator = fieldDetectorThreads.iterator();
while (iterator.hasNext()) {
    //get the next instance
    Future<HashMap<String, FormQuestion>> thread = iterator.next();
    //Remove it from the set
    iterator.remove();

    //then work on that instance just like before
    try {
        HashMap<String, FormQuestion> threadFields = thread.get();
        for (String fieldName : threadFields.keySet()) {
            FormQuestion field = threadFields.get(fieldName);
            fields.put(fieldName, field);
            for (Entry<String, FormPoint> point : field.getPoints().entrySet()) {
                if (point.getValue() != null) {
                    pointList.add(point.getValue());
                }
            }
        }
    } catch (InterruptedException | ExecutionException e) {
        throw new FormScannerException(e.getCause());
    }
}

This solution is not tested, but it should release the memory fast enough.

But if the loop that submits the requests takes too long to finish (120k futures to create before the first one is checked), this would still break before every request has been sent.

In that case, it might be necessary to split that logic across two threads: one to submit the requests, and one to check the results until the first thread is done and the set is empty.
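
A rough, untested-against-FormScanner sketch of that split could look like the following. Everything in it is an assumption for illustration: the class name SubmitAndCollectSketch, the trivial task that just returns its index, and the bounded queue of 16 pending futures (the bound is my own addition, so the submitting thread cannot run far ahead of the one collecting results).

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

public class SubmitAndCollectSketch {

    // Submits taskCount trivial tasks from one thread and collects the results in the caller.
    static long run(int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        // Bounded queue: put() blocks once 16 futures are waiting to be collected.
        BlockingQueue<Future<Integer>> pending = new LinkedBlockingQueue<>(16);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < taskCount; i++) {
                    final int value = i;
                    // Hypothetical stand-in for the real per-field work.
                    Callable<Integer> task = () -> value;
                    pending.put(pool.submit(task));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        long sum = 0;
        try {
            // take() hands each Future over, so nothing keeps a reference to it afterwards.
            for (int received = 0; received < taskCount; received++) {
                sum += pending.take().get();
            }
            producer.join();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            producer.interrupt(); // harmless if the producer has already finished
            pool.shutdown();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(run(100));
    }
}
```

The queue replaces the Set here: a Future leaves the queue as soon as it is taken, so no explicit remove step is needed.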


Just in case, I would add a shutdown request after the loop:

threadPool.shutdown();

It should not be necessary, but strangely my test program doesn't end without it. Even after every task has been processed, the threads seem to keep existing and block the main thread.
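
For what it's worth, the worker threads of a fixed thread pool are non-daemon and stay alive until the pool is shut down, which would explain that behaviour. Note also that in the original findPoints the shutdown call is skipped whenever get() throws. A minimal sketch of putting the shutdown in a finally block is shown below; the class name PoolShutdownSketch and the detect method are hypothetical stand-ins, not FormScanner code. The Math.max guard is there because cores - 1 is 0 on a single-core machine, and newFixedThreadPool rejects 0 with an IllegalArgumentException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolShutdownSketch {

    // Hypothetical stand-in for the per-field work done by FieldDetector.
    static int detect(int fieldIndex) {
        return fieldIndex * 2;
    }

    // Processes one batch on its own pool and always shuts that pool down,
    // even when a task fails and get() throws.
    static int processBatch(int fieldCount) {
        int workers = Math.max(1, Runtime.getRuntime().availableProcessors() - 1);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < fieldCount; i++) {
                final int index = i;
                Callable<Integer> task = () -> detect(index);
                futures.add(pool.submit(task));
            }
            int sum = 0;
            for (Future<Integer> future : futures) {
                sum += future.get();
            }
            return sum;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            // Lets the worker threads exit once their queued tasks are done.
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Called in a loop, like findPoints is called once per scanned file.
        for (int batch = 0; batch < 1000; batch++) {
            processBatch(10);
        }
        System.out.println("done");
    }
}
```

An alternative would be to create one pool for the whole run and reuse it across calls, shutting it down once at the end; whether that fits depends on how FormScanner is structured.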

1 Comment

I changed the code as per your suggestion, but it has the same effect: after processing 1000 sheets, 513 processes were running. I have reverted the code.
