
I'm trying to index some data in ES and I'm getting an out of memory exception:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.elasticsearch.common.jackson.core.util.BufferRecycler.balloc(BufferRecycler.java:155)
    at org.elasticsearch.common.jackson.core.util.BufferRecycler.allocByteBuffer(BufferRecycler.java:96)
    at org.elasticsearch.common.jackson.core.util.BufferRecycler.allocByteBuffer(BufferRecycler.java:86)
    at org.elasticsearch.common.jackson.core.io.IOContext.allocWriteEncodingBuffer(IOContext.java:152)
    at org.elasticsearch.common.jackson.core.json.UTF8JsonGenerator.<init>(UTF8JsonGenerator.java:123)
    at org.elasticsearch.common.jackson.core.JsonFactory._createUTF8Generator(JsonFactory.java:1284)
    at org.elasticsearch.common.jackson.core.JsonFactory.createGenerator(JsonFactory.java:1016)
    at org.elasticsearch.common.xcontent.json.JsonXContent.createGenerator(JsonXContent.java:68)
    at org.elasticsearch.common.xcontent.XContentBuilder.<init>(XContentBuilder.java:96)
    at org.elasticsearch.common.xcontent.XContentBuilder.builder(XContentBuilder.java:77)
    at org.elasticsearch.common.xcontent.json.JsonXContent.contentBuilder(JsonXContent.java:38)
    at org.elasticsearch.common.xcontent.XContentFactory.contentBuilder(XContentFactory.java:122)
    at org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder(XContentFactory.java:49)
    at EsController.importProductEs(EsController.java:60)
    at Parser.fromCsvToJson(Parser.java:120)
    at CsvToJsonParser.parseProductFeeds(CsvToJsonParser.java:43)
    at MainParser.main(MainParser.java:49)

This is how I instantiate the ES client:

System.out.println("Elastic search client is instantiated");
Settings settings = ImmutableSettings.settingsBuilder().put("cluster.name", "elasticsearch_brew").build();
client = new TransportClient(settings);
String hostname = "localhost";
int port = 9300; 
((TransportClient) client).addTransportAddress(new InetSocketTransportAddress(hostname, port));     
bulkRequest = client.prepareBulk();

and then I run the bulk request:

// for each product in the list, include its fields in the bulk request
for (HashMap<String, String> productfields : products) {
    try {
        bulkRequest.add(client.prepareIndex(index, type, productfields.get("Product_Id"))
                .setSource(jsonBuilder()
                        .startObject()
                            .field("Name", productfields.get("Name"))
                            .field("Quantity", productfields.get("Quantity"))
                            .field("Make", productfields.get("Make"))
                            .field("Price", productfields.get("Price"))
                        .endObject()));
    } catch (IOException e) {
        e.printStackTrace();
    }
}
//execute the bulk request
BulkResponse bulkResponse = bulkRequest.execute().actionGet();
if (bulkResponse.hasFailures()) {
    // process failures by iterating through each bulk response item
}
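
For reference, that failure branch can iterate over the individual bulk items, roughly like this (a sketch assuming the 1.x Java client, where BulkResponse is iterable over its BulkItemResponse items):

// sketch: report each failed item with its document id and failure reason
for (BulkItemResponse item : bulkResponse) {
    if (item.isFailed()) {
        System.err.println("Indexing failed for id " + item.getId()
                + ": " + item.getFailureMessage());
    }
}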

I am trying to index products from various shops; each shop is a different index. When I reach the 6th shop, which contains around 60,000 products, I get the above exception. I already split the bulk request into chunks of 10,000 to try to avoid the out of memory problems, but I can't understand where exactly the bottleneck is. Would it help if I somehow flushed the bulk request or restarted the client? I've seen similar posts, but none of them work for me.

EDIT

When I instantiate a new client every time I process a new bulk request, I don't get the out of memory exception. But instantiating a new client each time doesn't seem right...

Thank you

  • What do the imported data look like? Commented Mar 2, 2016 at 22:01
  • You probably just have too many products processed at once, given that this error is happening on the client side, not on an actual ES node. Commented Mar 2, 2016 at 22:14
  • @9000 - {"Name":"Wireless Shape Mouse", "Quantity":"100", "Make":"Sony","Price":"23.73"}. This is a line of a product. Commented Mar 2, 2016 at 22:25
  • @Bax, I'm processing 10,000 at each request; I'll try to reduce it down to 5,000. Commented Mar 2, 2016 at 22:25
  • Or try to increase the client memory via the -Xmx VM option. Commented Mar 2, 2016 at 22:27

1 Answer


So I figured out what was wrong.

Every new bulk request was being added on top of the previous one, and eventually this led to the out of memory error.

So now, before I start a new bulk request, I run bulkRequest = client.prepareBulk(), which creates a fresh builder and discards the previously sent requests.
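
In code, the fix looks roughly like this: a minimal sketch that reuses the same client but starts a fresh BulkRequestBuilder after every chunk. The chunk size and the indexInChunks method name are illustrative, and jsonBuilder() is the statically imported XContentFactory.jsonBuilder as in the question:

// Reuse the client, but create a fresh BulkRequestBuilder for every chunk
// instead of piling new index requests onto the old builder.
void indexInChunks(Client client, String index, String type,
                   List<HashMap<String, String>> products) throws IOException {
    int chunkSize = 10000; // illustrative; matches the chunking described in the question
    BulkRequestBuilder bulkRequest = client.prepareBulk();
    for (HashMap<String, String> productfields : products) {
        bulkRequest.add(client.prepareIndex(index, type, productfields.get("Product_Id"))
                .setSource(jsonBuilder()
                        .startObject()
                            .field("Name", productfields.get("Name"))
                            .field("Quantity", productfields.get("Quantity"))
                            .field("Make", productfields.get("Make"))
                            .field("Price", productfields.get("Price"))
                        .endObject()));
        if (bulkRequest.numberOfActions() >= chunkSize) {
            bulkRequest.execute().actionGet();
            bulkRequest = client.prepareBulk(); // fresh builder releases the sent requests
        }
    }
    if (bulkRequest.numberOfActions() > 0) {
        bulkRequest.execute().actionGet(); // send any remainder
    }
}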

Thank you guys for your comments

