
I am trying to upload a big JSON file (newclicklogs.json) into MongoDB using Java. Here is what my JSON file looks like:

{"preview":false,"result":{"search_term":"rania","request_time":"Sat Apr 01 12:47:04 -0400 2017","request_ip":"127.0.0.1","stats_type":"stats","upi":"355658761","unit":"DR","job_title":"Communications Officer","vpu":"INP","organization":"73","city":"Wash","country":"DC","title":"Tom","url":"www.demo.com","tab_name":"People-Tab","page_name":"PEOPLE","result_number":"5","page_num":"0","session_id":"df234f468cb3fe8be","total_results":"5","filter":"qterm=rina","_time":"2017-04-01T12:47:04.000-0400"}}
{"preview"......}
{"preview"......}
....

Here is my Java code:

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.io.FileUtils;
import org.bson.Document;
import com.mongodb.MongoClient;

public class Main {

    public static void main(String[] args) throws IOException {

        String jsonString = FileUtils.readFileToString(new File("data/newclicklogs.json"), "UTF-8");

        Document doc = Document.parse(jsonString);
        List<Document> list = new ArrayList<>();
        list.add(doc);

        new MongoClient().getDatabase("test2").getCollection("collection1").insertMany(list);

    }
}

When I query my MongoDB collection, only one document has been added. How can I add all the documents from my file into a MongoDB collection? I am a newbie to MongoDB. Any help is appreciated.

1 Answer

You should try using bulk writes with a buffered reader.

The code below reads the JSON data from the file one line (one document) at a time, parses each line into a Document, and batches the requests before writing them to the database.

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;

import org.bson.Document;

import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.model.BulkWriteOptions;
import com.mongodb.client.model.InsertOneModel;

MongoClient client = new MongoClient("localhost", 27017);
MongoDatabase database = client.getDatabase("test2");
MongoCollection<Document> collection = database.getCollection("collection1");

int count = 0;
int batch = 100;

List<InsertOneModel<Document>> docs = new ArrayList<>();

try (BufferedReader br = new BufferedReader(new FileReader("data/newclicklogs.json"))) {
    String line;
    while ((line = br.readLine()) != null) {
        // Each line of the file is one self-contained JSON document.
        docs.add(new InsertOneModel<>(Document.parse(line)));
        count++;
        if (count == batch) {
            // Send a full batch to the server in a single unordered bulk write.
            collection.bulkWrite(docs, new BulkWriteOptions().ordered(false));
            docs.clear();
            count = 0;
        }
    }
}

// Flush whatever is left over from the last, partially filled batch.
if (count > 0) {
    collection.bulkWrite(docs, new BulkWriteOptions().ordered(false));
}

client.close();

When you run Document.parse on the entire JSON file, you essentially reduce the documents to the last one, with each parsed document overwriting the previous ones.

More here: http://mongodb.github.io/mongo-java-driver/3.4/driver/tutorials/bulk-writes/
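
Once the load finishes, a quick sanity check is to count the documents in the collection. A minimal sketch, assuming the same test2 database and collection1 collection used above:

import org.bson.Document;

import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;

MongoClient client = new MongoClient("localhost", 27017);
MongoCollection<Document> collection = client.getDatabase("test2").getCollection("collection1");

// Should match the number of lines in newclicklogs.json.
System.out.println("documents inserted: " + collection.count());

client.close();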

5 Comments

Thanks a lot, Veeram. I've been trying to figure this out for hours. You saved my day.
May I know why you specified batch equal to 100?
You are welcome. To be honest, I didn't even think about it. You can try different batch sizes, time them, and pick the right one for your needs. I believe for 60K records it shouldn't really make a big difference from one batch size to another.
Ok. Cool. I will research on that :)
Use (count % batch == 0) if you want to keep the count throughout the entire process.
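
Building on that last comment, here is a sketch of the count % batch variant (same imports and collection setup as the answer's code). Since count is never reset, it ends up holding the total number of documents read; the final flush then has to test the list itself rather than count, because count stays positive for the whole run:

int count = 0;
int batch = 100;
List<InsertOneModel<Document>> docs = new ArrayList<>();

try (BufferedReader br = new BufferedReader(new FileReader("data/newclicklogs.json"))) {
    String line;
    while ((line = br.readLine()) != null) {
        docs.add(new InsertOneModel<>(Document.parse(line)));
        count++;
        // Flush on every full batch; count keeps growing as the running total.
        if (count % batch == 0) {
            collection.bulkWrite(docs, new BulkWriteOptions().ordered(false));
            docs.clear();
        }
    }
}

// The remainder check is on the list now, not on count.
if (!docs.isEmpty()) {
    collection.bulkWrite(docs, new BulkWriteOptions().ordered(false));
}

System.out.println("total documents read: " + count);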
