8

I have a MySQL database for my application. I implemented Solr search and used the DataImportHandler (DIH) to index data from the database into Solr. My question is: is there a way to have the Solr index update automatically when new data is added to the database, so that I do not need to run the indexing process manually every time the database tables change? If so, please tell me how I can achieve this.

4 Answers

3

I don't think Solr has a built-in feature that indexes the data automatically whenever the DB is updated.

But there are workarounds. One is database triggers: it is possible to run an external application from a trigger.

Write a cron job that runs a PHP script which reads from the DB and indexes the data in Solr. Alternatively, write a trigger for CRUD operations that calls this script and install it in the DB, so that whenever the data changes, the trigger calls the script and the indexing happens.

Please see:

Invoking a PHP script from a MySQL trigger

Automatic Scheduling:

Please see the post "How can I Schedule data imports in Solr" for more information on scheduling. The second answer explains how to import using cron.
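The "script that reads from the DB and indexes into Solr" could look roughly like the sketch below. It is written in Python rather than the PHP mentioned above, and it uses an in-memory SQLite table as a stand-in for MySQL so that it runs on its own. The `items` table, its columns, the `db` core name, and the host and port are all assumptions for illustration. The only Solr feature it relies on is the JSON update endpoint (`/update` with a JSON array of documents).

```python
import json
import sqlite3
import urllib.request

# Assumed endpoint: adjust host, port, and core name ("db" here) to your setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/db/update?commit=true"


def demo_connection():
    """Build an in-memory SQLite database standing in for the real MySQL one."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO items (id, title) VALUES (?, ?)",
        [(1, "first"), (2, "second")],
    )
    conn.commit()
    return conn


def fetch_rows(conn):
    """Read the rows to index. The table and column names are hypothetical."""
    return conn.execute("SELECT id, title FROM items ORDER BY id").fetchall()


def rows_to_docs(rows):
    """Convert (id, title) tuples into the list of dicts sent to Solr as JSON."""
    return [{"id": str(row_id), "title": title} for row_id, title in rows]


def post_docs(docs, url=SOLR_UPDATE_URL):
    """POST the documents to Solr as a JSON array and return the HTTP status."""
    request = urllib.request.Request(
        url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status


def main():
    """What a scheduled run would do: read rows, convert them, send them."""
    conn = demo_connection()
    return post_docs(rows_to_docs(fetch_rows(conn)))
```

A cron entry (or any other scheduler) would then run a script that calls `main()` at whatever interval suits the application; `main()` needs a reachable Solr instance, while the other functions can be tried without one.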

1 Comment

Rakesh: How do I write a cron job to trigger a script that reads data from the database and indexes it into Solr?
1

Since you used a DataImportHandler to initially load your data into Solr, you could create a delta import that is executed using curl from a cron job to periodically add changes in the database to the index. Also, if you need more real-time updates, as @Rakesh suggested, you could use a trigger in your database and have that kick off the curl call to the delta DIH.
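As a rough illustration of the HTTP call involved, here is a minimal Python sketch that builds the DataImportHandler URL and requests it, which is the same thing the curl call would do. The base URL `http://localhost:8983/solr` and the core name `db` are assumptions; the function names are made up for this example.

```python
import urllib.parse
import urllib.request


def build_dataimport_url(base_url, core, command, **params):
    """Build a DataImportHandler URL such as .../solr/db/dataimport?command=delta-import."""
    query = {"command": command}
    query.update(params)
    return f"{base_url.rstrip('/')}/{core}/dataimport?{urllib.parse.urlencode(query)}"


def trigger_import(url, timeout=30):
    """Send a GET request to the handler and return (HTTP status, response body)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8", errors="replace")


def run_delta_import():
    """Assumed setup: Solr on localhost:8983 with a core named "db"."""
    url = build_dataimport_url(
        "http://localhost:8983/solr", "db", "delta-import", commit="true"
    )
    return trigger_import(url)
```

Because this uses only the Python standard library, the same script can be scheduled from cron on Linux or from Task Scheduler on Windows, with `run_delta_import()` as the entry point.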

4 Comments

How should I create a DeltaImportHandler that is executed using curl to solve my problem?
@Romi, if you look at the example in the link I provided, it talks about creating a delta query that can detect changes in your database via a lasttimestamp column (or something similar) in the database. Also, you can see this example of using a full DIH as a delta: wiki.apache.org/solr/DataImportHandlerFaq#fullimportdelta. Once you have this working, you can execute it via an HTTP call using curl and schedule that curl call via cron. Hope this helps.
In a Windows environment I need to execute this URL: localhost:8983/solr/db/dataimport?command=full-import. How can I do this using curl or any other command in Windows?
You should go to the home page for curl (curl.haxx.se) and check out the FAQ page. That should point you in the right direction.
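To make the delta query idea from the comments above concrete: the query selects only rows whose timestamp column is newer than the last index run. In DIH that comparison is usually written in the `deltaQuery` against `${dataimporter.last_index_time}`. The sketch below shows the same comparison in plain Python with an in-memory SQLite table; the `items` table, the `last_modified` column, and the sample timestamps are all invented for illustration.

```python
import sqlite3


def changed_since(conn, last_index_time):
    """Return the ids of rows modified after the given 'YYYY-MM-DD HH:MM:SS' time."""
    cur = conn.execute(
        "SELECT id FROM items WHERE last_modified > ? ORDER BY id",
        (last_index_time,),
    )
    return [row[0] for row in cur.fetchall()]


def demo():
    """Create sample rows and list the ones changed since the last index run."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, last_modified TEXT)"
    )
    conn.executemany(
        "INSERT INTO items VALUES (?, ?, ?)",
        [
            (1, "old row", "2024-01-01 09:00:00"),
            (2, "edited row", "2024-01-02 12:30:00"),
            (3, "new row", "2024-01-03 08:15:00"),
        ],
    )
    # Pretend the previous import ran at midnight on 2024-01-02.
    return changed_since(conn, "2024-01-02 00:00:00")
```

Only rows 2 and 3 are newer than the assumed last run, so those are the only ones a delta import would need to re-index. This depends on the application updating the timestamp column on every insert and update.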
1

You can import the data using your browser and Task Scheduler. Follow these steps on Windows Server: go to Administrative Tools => Task Scheduler and click "Create Task".

The Create Task window opens with the tabs General, Triggers, Actions, Conditions, and Settings.

On the General tab, enter the task name "Solrdataimport" and, in the description, enter "Import mysql data".

Now go to the Triggers tab and click New. Under Settings, select Daily. Under Advanced settings, check "Repeat task every ..." and set whatever interval you want. Click OK.

Now go to the Actions tab and click New. Under Settings, set Program/script to "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" (this is the installation path of the Chrome browser). In "Add arguments", enter http://localhost:8983/solr/#/collection1/dataimport//dataimport?command=full-import&clean=true and click OK.

With the steps above, the data import runs automatically. To stop the import process, follow the same steps, but on the Actions tab set Program/script to "taskkill" instead of "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" and enter "/f /im chrome.exe" in the arguments.

Set the trigger timing according to your requirements.

0

What you're looking for is a "delta-import", and many of the other answers already cover it. I created a Windows WPF application and service to issue commands to Solr on a recurring schedule, since cron jobs and Task Scheduler are a bit difficult to maintain if you have a lot of cores / environments.

https://github.com/systemidx/SolrScheduler

You basically just drop a JSON file into a specified folder, and it uses a REST client to issue the commands to Solr.
