
I'm currently using Logstash to parse the output of several similar commands and ship the results to Elasticsearch, similar to this:

input {
  exec {
    type => 'hist'
    command => '/usr/bin/somecommand'
    interval => 900
    codec => "json"
  }

  exec {
    type => 'hist'
    command => '/usr/bin/somecommand'
    interval => 900
    codec => "json"
  }

  exec {
    type => 'hist'
    command => '/usr/bin/somecommand'
    interval => 900
    codec => "json"
  }
}

output {
  if [type] == "hist" {
    elasticsearch {
      hosts => ["hostname.domain.com:9200"]
      index => "monitor-hist-%{+YYYY-MM-dd}"
    }
  }
}

What I would like is to be able to output to stdout or a file if the connection to Elasticsearch fails, like:

if _connectionfails_ {
  stdout {
    codec => rubydebug
  }
}

Is this possible? Or are there any other recommendations for managing data when Elasticsearch is unavailable?

  • Do you even get an event when the input fails? If not, there would be nothing to filter or output. It will log the failure, so you could ingest the logstash logs into another elastic stack and look for failures that way. Commented Apr 19, 2016 at 19:18
  • Normally, if the connection fails it goes into infinite retries. Commented Apr 19, 2016 at 19:26
  • @PriyanshGoel I noticed that here: elastic.co/guide/en/logstash/current/… but what's not clear to me is how it "buffers" that data. Say the cluster is down for 30+ minutes; there would be two runs that failed. Do both of these keep retrying until the cluster is up? Is the data kept in the heap while it's retrying? Or perhaps committed to disk temporarily? Commented Apr 19, 2016 at 19:51
  • I hope my answer will clear your doubt Commented Apr 19, 2016 at 19:58

1 Answer


Logstash keeps all events in main memory during processing. Logstash responds to a SIGTERM by attempting to halt inputs and waiting for pending events to finish processing before shutting down. When the pipeline cannot be flushed due to a stuck output or filter, Logstash waits indefinitely. For example, when a pipeline sends output to a database that is unreachable by the Logstash instance, the instance waits indefinitely after receiving a SIGTERM.
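Because events are held in memory, a disk-backed buffer is the usual remedy. Newer Logstash versions (5.1 and later) support a persistent queue that spools in-flight events to disk, so they survive a blocked output or a restart. A minimal sketch of the relevant settings in logstash.yml (the queue path below is an assumption; by default it lives under path.data):

```
# logstash.yml -- enable the disk-backed queue (Logstash 5.1+)
queue.type: persistent                 # default is "memory"
queue.max_bytes: 4gb                   # cap on-disk usage; once full, inputs see back-pressure
path.queue: /var/lib/logstash/queue    # hypothetical path; defaults under path.data
```

With this enabled, an unreachable Elasticsearch cluster causes events to accumulate on disk up to queue.max_bytes instead of in the heap, which addresses the OutOfMemoryError concern below.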


7 Comments

Ok, what I'm wondering is: say our Elasticsearch cluster goes down in the middle of the night, and we don't notice for 8 hours. Because the Logstash config I posted above runs every 15 minutes, there could be up to 32 runs' worth of data being retried. This could be several GB. So, if Logstash keeps trying to deliver this data, and it keeps piling up, would I be likely to hit an OutOfMemoryError at some point? This is where the second part of my question comes in. Could I prevent this by having Logstash abort the connection attempt after a while and just write the data to disk?
If that happens, it will stop receiving inputs as well. The line "Logstash responds to a SIGTERM by attempting to halt inputs" in my answer suggests the same.
Ah ok, so it will only retry delivering the first missed run? The following runs will be dropped or ignored?
Yes, in your case that is what will happen.
Ok, good to know. So then, what can be done to "buffer" this retry data? I was thinking of outputting to a file, hence the second part of the question. Is there a way to output to a file or some other way to buffer data if the elastic output fails? Perhaps I can always output a file and then remove the file if the elastic connection succeeded?
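Along the lines of the last comment: Logstash outputs can't be made conditional on delivery success, but nothing stops you from writing every event to a file and to Elasticsearch in parallel, then pruning the files out-of-band once the cluster is confirmed healthy. A sketch reusing the question's config (the file path is an assumption):

```
output {
  if [type] == "hist" {
    elasticsearch {
      hosts => ["hostname.domain.com:9200"]
      index => "monitor-hist-%{+YYYY-MM-dd}"
    }
    # Unconditionally tee the same events to dated files as a safety net.
    # Delete old files with a cron job once Elasticsearch has the data.
    file {
      path => "/var/log/logstash/hist-%{+YYYY-MM-dd}.json"
      codec => json_lines
    }
  }
}
```

The trade-off is double writes during normal operation, in exchange for a replayable on-disk copy whenever Elasticsearch is down.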