
So, I'm trying to configure Logstash to fetch JSON data from a public API and insert it into Elasticsearch.

The data looks like this:

{
    "Meta Data": {
        "1. Information": "Daily Aggregation",
        "2. Name": "EXAMPLE",
        "3. Last Refreshed": "2018-04-06"
    },
    "Time Series": {
        "2018-04-06": {
            "1. Value1": "20",
            "2. Value 2": "21",
            "3. Value 3": "20",
            "4. Value 4": "21",
            "5. Value 5": "47"
        },
        "2018-04-05": {
            "1. open": "21",
            "2. high": "21",
            "3. low": "21",
            "4. close": "21",
            "5. volume": "88"
        },
        "2018-04-04": {
            "1. open": "20",
            "2. high": "20",
            "3. low": "20",
            "4. close": "20",
            "5. volume": "58"
        },
        "2018-04-03": {
            "1. Value1": "20",
            "2. Value 2": "21",
            "3. Value 3": "20",
            "4. Value 4": "21",
            "5. Value 5": "47"
        },
        ...
    }
}

I don't care about the metadata; I want each object inside "Time Series" to become a separate event sent to Elasticsearch. I just don't know how to do it.

So far, I've only got the input configuration right...

input {
  http_poller {
    urls => {
        test1 => "https://www.public-facing-api.com/query?function=TIME_SERIES_DAILY&name=EXAMPLE"
        #headers => {
        #   Accept => "application/json"
        #}
    }
    request_timeout => 60
    # Supports "cron", "every", "at" and "in" schedules by rufus scheduler
    schedule => { cron => "* * * * * * UTC"}
    codec => "json"
  }
}

filter {
    json {
        source => "message"
        target => "parsedMain"
    }
    json {
        source => "[parsedMain][Time Series]"
        target => "parsedContent"
    }
}

output {
  stdout { codec => rubydebug }
}

But it just prints everything as a single object.

I would also like to capture the date, which is the key of each nested object, and use it as the Elasticsearch @timestamp. I'd also like to set the document id to %{date}_%{name}.

Does anyone know how to do it?

1 Answer

To do this, you'll need a ruby filter + a split filter. You need to turn the Time Series hash into an array and then split on the array:

filter {
    json {
        source => "message"
    }
    ruby {
        code => '
            arrayOfEvents = Array.new()
            ts = event.get("Time Series")
            ts.each do |date,data|
                data["date"] = date # keep the date (the key) on the sub-object
                arrayOfEvents.push(data)
            end
            event.set("event", arrayOfEvents)
        '
        # the original hash and the metadata are no longer needed
        remove_field => ["Time Series", "Meta Data"]
    }
    split {
        # emit one event per element of the "event" array
        field => "event"
    }
}
output {
    stdout { codec => rubydebug }
}
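Outside Logstash, the transformation the ruby filter performs can be sketched in plain Ruby (the sample data here is a made-up subset of the question's payload):

```ruby
# Turn the "Time Series" hash into an array of per-date objects,
# which is what the split filter then fans out into separate events.
time_series = {
  "2018-04-06" => { "1. Value1" => "20", "5. Value 5" => "47" },
  "2018-04-05" => { "1. open"   => "21", "5. volume"  => "88" }
}

array_of_events = []
time_series.each do |date, data|
  data["date"] = date          # keep the key (the date) on the sub-object
  array_of_events.push(data)
end

# Each element is now one future Logstash event.
array_of_events.each { |e| puts e["date"] }
```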

Example output:

...
{
    "@timestamp" => 2018-04-09T15:01:01.765Z,
      "@version" => "1",
          "host" => "xxx.local",
          "type" => "yyyyy",
         "event" => {
              "date" => "2018-04-03",
         "1. Value1" => "20",
        "5. Value 5" => "47",
        "3. Value 3" => "20",
        "4. Value 4" => "21",
        "2. Value 2" => "21"
    }
}
{
    "@timestamp" => 2018-04-09T15:01:01.765Z,
      "@version" => "1",
          "host" => "xxx.local",
          "type" => "yyyyy",
         "event" => {
           "3. low" => "20",
             "date" => "2018-04-04",
        "5. volume" => "58",
          "1. open" => "20",
          "2. high" => "20",
         "4. close" => "20"
    }
}
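To also cover the timestamp and id part of the question: once each split event carries [event][date], a date filter can map it to @timestamp, and the elasticsearch output can build a document_id from it. A sketch, not tested against your setup — the hosts, index name, and the name field are assumptions you'll need to adapt:

```
filter {
    date {
        # parse the per-day key into @timestamp
        match => ["[event][date]", "yyyy-MM-dd"]
    }
}
output {
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "timeseries"   # assumed index name
        # "name" is assumed to exist on the event; adjust to your data
        document_id => "%{[event][date]}_%{name}"
    }
}
```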

1 Comment

Thank you so much. Your answer is spot on.
