
I have to migrate 5 million records from PostgreSQL to MongoDB.

I tried using mongify, but since it runs on Ruby and I am not acquainted with Ruby at all, I couldn't solve the errors it raised.

So I tried writing code myself in Node.js that would first convert the PostgreSQL data into JSON and then insert that JSON into MongoDB. This failed because it consumed a lot of RAM, and no more than 13,000 records could be migrated.

Then I thought of writing the code in Java because of its garbage collector. It works fine in terms of RAM utilization, but the speed is very slow (around 10,000 records/hour). At this rate it would take days to migrate my data.

So, is there a more efficient and faster way of doing this? Would a Python program be faster than the Java program? Or is there a ready-made tool available for doing this?

My system configuration: Windows 7 (64-bit), 4 GB RAM, Intel i3 processor.
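A minimal Python sketch of the batched approach (the table name "records", DSN, and database/collection names below are assumptions; psycopg2 and pymongo are assumed installed): a named (server-side) cursor streams rows from PostgreSQL in fixed-size batches so only one batch is in RAM at a time, and `insert_many` sends each batch to MongoDB in a single round trip instead of one insert per record.

```python
def rows_to_docs(columns, rows):
    """Convert DB-API row tuples into MongoDB-ready dicts."""
    return [dict(zip(columns, row)) for row in rows]

def migrate(batch_size=1000):
    import psycopg2                   # assumed installed
    from pymongo import MongoClient   # assumed installed

    pg = psycopg2.connect("dbname=mydb")      # hypothetical DSN
    coll = MongoClient()["mydb"]["records"]   # hypothetical target

    # A *named* cursor makes psycopg2 use a server-side cursor, so the
    # full 5M-row result set is never materialized on the client.
    with pg, pg.cursor(name="migration") as cur:
        cur.itersize = batch_size
        cur.execute("SELECT * FROM records")  # hypothetical table
        columns = None
        while True:
            rows = cur.fetchmany(batch_size)
            if not rows:
                break
            if columns is None:
                # Column names become the document keys.
                columns = [d[0] for d in cur.description]
            # ordered=False lets MongoDB continue past individual
            # failures and can parallelize the batch internally.
            coll.insert_many(rows_to_docs(columns, rows), ordered=False)

# migrate() would be called here once both databases are reachable.
```

This keeps memory flat (one batch at a time) while cutting per-record round trips, which is usually what makes the one-by-one approach slow.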

  • Are you using bulk insert? thejavageek.com/2015/07/08/mongodb-bulk-insert Commented Feb 2, 2017 at 9:09
  • @RahulKumar No, I am fetching rows from PostgreSQL and inserting them into MongoDB one by one, since converting all 5 million records to JSON at once does not fit in RAM. So I am doing db.collection.insert(jsondata) Commented Feb 2, 2017 at 9:13
  • So you get all 5 million rows from PostgreSQL at once and then insert them into MongoDB one by one? In any case, you may want to look at batch processing. Commented Feb 2, 2017 at 9:15
  • 1
    You could read records from pg by let's say 200 each ( depends on size of records). convert these, do bulkInsert . And then do this with multiple threads at the same time ? Commented Feb 2, 2017 at 9:18
  • 1
    I think you can create a batch of 1000 records by using skip and limit in postgresql and process those 1000 and then use bulkinsert on those 1000 and do this batch process in loop until all the records are done. Most of the time is consumed for making connections to postgresql and mongodb. java is inherently faster. Commented Feb 2, 2017 at 9:22
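One caveat on the skip/limit suggestion above: `OFFSET` pagination gets slower on every batch, because PostgreSQL still scans and discards all skipped rows. If the table has an indexed, monotonically increasing key, keyset pagination (resume each batch from the last key seen) keeps every batch equally cheap. A sketch of that loop, with an in-memory stand-in for the real query (the `id` column and the `fetch_page` shape are assumptions for illustration):

```python
def keyset_batches(fetch_page, batch_size):
    """Yield row batches until fetch_page runs out of rows.

    fetch_page(last_id, limit) must return rows ordered by id, each row
    a dict containing at least an "id" key (hypothetical row shape).
    The real query would be:
        SELECT * FROM records WHERE id > %s ORDER BY id LIMIT %s
    """
    last_id = 0
    while True:
        rows = fetch_page(last_id, batch_size)
        if not rows:
            break
        yield rows
        last_id = rows[-1]["id"]      # resume after the last key seen
        if len(rows) < batch_size:    # short batch means we are done
            break

def fake_fetch(data):
    """In-memory substitute for the SQL query, for demonstration."""
    def fetch_page(last_id, limit):
        return [r for r in data if r["id"] > last_id][:limit]
    return fetch_page
```

Each batch yielded by `keyset_batches` would then be handed to a MongoDB bulk insert, as the comments suggest.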

1 Answer 1


Seems like I am late to the party. However, this might come in handy to somebody, someday!

The following Python-based migration framework should come in handy:

https://github.com/datawrangl3r/pg2mongo

Regarding your performance concern: the framework migrates each JSON object dynamically, so there shouldn't be any memory lock-up issues when you use it.

Hope it helps!

