Is there any way to configure Logstash so that it picks up delta records automatically in real time? If not, is there an open-source plugin or tool that can do this? Thanks for the help.

1 Answer

Try the configuration below for MSSQL Server. Add a schedule so that the query runs periodically, and a statement containing the query that fetches the data from your database. Logstash polls on that schedule rather than streaming changes, so new rows are picked up on the next run.

input {
  jdbc { 
    jdbc_connection_string => "jdbc:sqlserver://localhost:1433;databaseName=test"
    # The user we wish to execute our statement as
    jdbc_user => "sa"
    jdbc_password => "sasa"
    # The path to our downloaded jdbc driver
    jdbc_driver_library => "C:\Users\abhijitb\.m2\repository\com\microsoft\sqlserver\mssql-jdbc\6.2.2.jre8\mssql-jdbc-6.2.2.jre8.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
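    # Uncomment to reset sql_last_value and re-import from the start
    #clean_run => true
    # Cron-style schedule: run the query every minute
    schedule => "* * * * *"
    # Fetch only rows whose studentid is greater than the last value seen
    statement => "SELECT * FROM Student WHERE studentid > :sql_last_value"
    # Track the studentid column instead of the time of the last run
    use_column_value => true
    tracking_column => "studentid"
  }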
}

output {
  #stdout { codec => json_lines }
  elasticsearch {
    hosts => "localhost:9200"
    index => "student"
    document_type => "data"
    document_id => "%{studentid}"
  }
}
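
On how the tracking works: with use_column_value => true, :sql_last_value holds the largest tracking_column value seen so far; with it set to false, :sql_last_value holds the time of the last run. The value is saved between runs in a small metadata file. A sketch of the relevant options, where the metadata path is only an illustrative assumption and the connection settings are the same as above:

```
input {
  jdbc {
    # ... connection settings as in the configuration above ...
    schedule => "* * * * *"
    statement => "SELECT * FROM Student WHERE studentid > :sql_last_value"
    use_column_value => true
    tracking_column => "studentid"
    # studentid is a number; "numeric" is also the default
    tracking_column_type => "numeric"
    # Hypothetical location for the saved sql_last_value; adjust to your system
    last_run_metadata_path => "C:\logstash\.student_last_run"
  }
}
```

Deleting that file, or setting clean_run => true once, makes the next run start from the beginning again.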
2 Comments

I don't have a primary id in my table, and no combination of columns is unique either. In this case, how would I make sure that only new rows are inserted on the second scheduled run? Also, why are we using the use_column_value attribute here?
@pradyumn do you have a timestamp column on your table that is set when records are updated?
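
If such a timestamp column exists, a minimal sketch of tracking it instead of an id could look like the following. The column name last_modified is an assumption for illustration, not something from the question, and the connection settings are the same as in the answer:

```
input {
  jdbc {
    # ... connection settings as in the answer ...
    schedule => "* * * * *"
    # last_modified is a hypothetical column updated on every insert/update
    statement => "SELECT * FROM Student WHERE last_modified > :sql_last_value ORDER BY last_modified"
    use_column_value => true
    tracking_column => "last_modified"
    # Tell the plugin the tracked value is a timestamp, not a number
    tracking_column_type => "timestamp"
  }
}
```

Without a unique key there is nothing to use as document_id, so Elasticsearch would generate ids itself, and a row that is updated would likely be indexed again as a new document rather than overwrite the old one.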
