I have a MySQL database with a couple of tables, and I want to migrate the MySQL data to Elasticsearch. Migrating the whole database to ES with a batch job is easy, but how should I update ES from MySQL in real time? That is, if there is an update operation in MySQL, the same operation should be applied in ES. I looked into the MySQL binlog, which reflects every change made in MySQL, but I would have to parse the binlog into ES syntax, which seems really painful. Thanks! (The same applies to Solr.)
3 Answers
There is an existing project that reads your binlog, transforms it, and ships it to Elasticsearch. You can check it out at: https://github.com/siddontang/go-mysql-elasticsearch
Another option is https://github.com/noplay/python-mysql-replication.
Note, however, that whichever you pick, it's good practice to pre-create your index and mappings before indexing your binlog. That gives you more control over how your data is indexed.
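To make the two points above a bit more concrete, here is a minimal Python sketch of what such a tool roughly does: define an explicit mapping up front, then translate row-change events into lines for the Elasticsearch bulk API. Everything specific here is an assumption for illustration: the event shape (`action`, `table`, `id`, `row`), the `users` index, and its fields are hypothetical, and the mapping layout assumes Elasticsearch 7 or later. In practice the events would come from a binlog reader such as the projects linked above, and the resulting body would be POSTed to `_bulk`. The sketch uses only the standard library and does not touch the network.

```python
import json

# Hypothetical explicit mapping for a `users` index (ES 7+ layout). It would be
# sent as the body of `PUT /users` before any binlog events are indexed.
USERS_INDEX_BODY = {
    "mappings": {
        "properties": {
            "name": {"type": "text"},
            "email": {"type": "keyword"},
            "created_at": {"type": "date"},
        }
    }
}


def to_bulk_lines(event):
    """Translate one row-change event into Elasticsearch bulk API lines.

    `event` is an assumed shape, not the format of any specific library:
    {"action": "insert" | "update" | "delete", "table": str, "id": ..., "row": dict}
    """
    meta = {"_index": event["table"], "_id": str(event["id"])}
    if event["action"] == "delete":
        # A delete needs only the action line, no document source.
        return [json.dumps({"delete": meta})]
    if event["action"] in ("insert", "update"):
        # Re-indexing the full row replaces the previous document version.
        return [json.dumps({"index": meta}), json.dumps(event["row"])]
    raise ValueError(f"unsupported action: {event['action']!r}")


def to_bulk_body(events):
    """Build an NDJSON bulk body; every line must end with a newline."""
    lines = []
    for event in events:
        lines.extend(to_bulk_lines(event))
    return "".join(line + "\n" for line in lines)
```

Treating both inserts and updates as a full `index` action keeps the translation simple, at the cost of resending the whole row on every change; using the MySQL primary key as `_id` is what lets later updates and deletes hit the same document.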
UPDATE:
Here is another interesting blog article on the subject: How to keep Elasticsearch synchronized with a relational database using Logstash
version of the document and then mark the previous version for deletion. This is the standard way ES works. The best open source solution would be this. You can run it from the command line and supply the incremental logic in the command as well.
Go through this session to get a complete idea.