I'm trying to index an XML file into Elasticsearch with Logstash. I have tried several times, but it still isn't working. I would really appreciate your help. My configuration file is as follows:

input {
  file {
    path => "/data/logstashtest/*.xml"
    start_position => "beginning"
  }
}
filter {
  multiline {
    pattern => "^\s|</report>|^[A-Za-z].*"
    what => "previous"
  }
  xml {
    store_xml => "false"
    source => "message"
    xpath => [
       "/report/@logtype", "logtype",
       "/report/result/@name", "name",
       "/report/result/@start-epoch", "start-epoch",
       "/report/result/@generated-at","generated-at"
    ]
  }
  date {
    match => [ "generated-at", "ISO8601" ]
  }
}
output {
  elasticsearch {
    protocol => http
    host => localhost
    port => 9200
    cluster => mycluster
    index => mylog
  }
  stdout { codec => rubydebug }
}

The XML source file is as follows:

<report reportname="" logtype="news">
  <result name="financial news" logtype="news" start-epoch="1433134800" end-epoch="1433149199" generated-at="2015/06/01 04:10:17"/>
</report>

Logstash runs on the same node as one of the ES nodes. I used the following command:

bin/logstash -f threatlog.conf

It outputs:

[2015-09-09 17:55:29.811]  WARN -- Concurrent: [DEPRECATED] Java 7 is deprecated, please use Java 8.
Java 7 support is only best effort, it may not work. It will be removed in next release (1.0).
Logstash startup completed

When I check the ES index, there is nothing in it. I'm using logstash-1.5.4. Thanks in advance!

1 Answer

You see nothing because Logstash records how far into each file it has already read (in a so-called sincedb file) and resumes from that position on the next run. The first time you launched Logstash you probably saw some output, and then nothing on later runs. To start over from the beginning each time until you get the config right, set sincedb_path to /dev/null so that Logstash does not remember its position in your XML files.

So change your input section to this:

input {
  file {
    path => "/data/logstashtest/*.xml"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
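
As a rough illustration of why a second run produces no output, the Python sketch below mimics position tracking with a plain dictionary. It is not Logstash's implementation: read_new_lines and the offsets dictionary are hypothetical names, and the real sincedb file has its own format and keying.

```python
import os
import tempfile


def read_new_lines(path, offsets):
    """Return the lines added to `path` since the previous call.

    `offsets` maps a path to the byte position already consumed; it is a
    simplified, hypothetical stand-in for the sincedb file.
    """
    start = offsets.get(path, 0)
    with open(path, "rb") as fh:
        fh.seek(start)
        data = fh.read()
        offsets[path] = fh.tell()  # remember where we stopped
    return data.decode("utf-8").splitlines()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.xml")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write('<report logtype="news">\n</report>\n')

        remembered = {}
        print(read_new_lines(path, remembered))  # first run: both lines
        print(read_new_lines(path, remembered))  # second run: nothing new
        print(read_new_lines(path, {}))          # no remembered position: both lines again
```

Passing a fresh dictionary on every call plays roughly the role that sincedb_path => "/dev/null" plays in the input section above: nothing is remembered, so the file is read from the start again.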

There is also a problem with your date filter: the pattern it is given does not match the actual date format, so you'll get a warning like the following:

Failed parsing date from field {:field=>"generated-at", :value=>"2015/06/01 04:10:17", :exception=>"Invalid format: \"2015/06/01 04:10:17\" is malformed at \"/06/01 04:10:17\"", :config_parsers=>"ISO8601", :config_locale=>"default=fr_FR", :level=>:warn}

To fix this, change your date filter to use the correct date format:

date {
    match => [ "generated-at", "yyyy/MM/dd HH:mm:ss" ]
}
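
To see the mismatch outside Logstash, here is a small Python check. It is only an analogy: Python's strptime directives stand in for the Joda-style pattern used by the date filter, datetime.fromisoformat stands in for the ISO8601 matcher, and the UTC+2 offset is an assumption chosen to match the @timestamp in the event output shown in this answer. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

RAW = "2015/06/01 04:10:17"  # value of generated-at in the sample XML


def parses_as_iso(value):
    """Return True if Python's ISO-format parser accepts the value."""
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def parse_generated_at(value, utc_offset_hours=2):
    """Parse 'yyyy/MM/dd HH:mm:ss' as local time and convert it to UTC.

    The default offset of +2 hours is an assumption about the machine's
    time zone, not something taken from the XML itself.
    """
    local = datetime.strptime(value, "%Y/%m/%d %H:%M:%S")
    local = local.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return local.astimezone(timezone.utc)


if __name__ == "__main__":
    print(parses_as_iso(RAW))                   # slashes are rejected by the ISO parser
    print(parse_generated_at(RAW).isoformat())  # explicit pattern succeeds
```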

After that, you'll get a nice and properly formatted Logstash event:

{
         "message" => "<report reportname=\"\" logtype=\"news\">\n  <result name=\"financial news\" logtype=\"news\" start-epoch=\"1433134800\" end-epoch=\"1433149199\" generated-at=\"2015/06/01 04:10:17\"/>\n</report>",
        "@version" => "1",
      "@timestamp" => "2015-06-01T02:10:17.000Z",
            "host" => "localhost",
            "path" => "/data/text.xml",
            "tags" => [
        [0] "multiline"
    ],
         "logtype" => [
        [0] "news"
    ],
            "name" => [
        [0] "financial news"
    ],
     "start-epoch" => [
        [0] "1433134800"
    ],
    "generated-at" => [
        [0] "2015/06/01 04:10:17"
    ]
}
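
The extracted fields above are arrays, which suggests that each XPath expression collects every node it matches rather than a single value. The Python sketch below imitates that behaviour with the standard library's ElementTree. extract_fields is a hypothetical helper, and this is not how the xml filter itself is implemented.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    '<report reportname="" logtype="news">\n'
    '  <result name="financial news" logtype="news" start-epoch="1433134800" '
    'end-epoch="1433149199" generated-at="2015/06/01 04:10:17"/>\n'
    '</report>'
)


def extract_fields(xml_text):
    """Collect attribute values into lists, one entry per matching element."""
    root = ET.fromstring(xml_text)
    results = root.findall("result")  # every <result> child of <report>

    def attr_values(elements, attr):
        # Skip elements that do not carry the attribute at all.
        return [e.get(attr) for e in elements if e.get(attr) is not None]

    return {
        "logtype": attr_values([root], "logtype"),
        "name": attr_values(results, "name"),
        "start-epoch": attr_values(results, "start-epoch"),
        "generated-at": attr_values(results, "generated-at"),
    }


if __name__ == "__main__":
    for field, values in extract_fields(SAMPLE).items():
        print(field, values)
```

With several <result> elements in the input, each list in this sketch would hold one value per element, all still inside a single dictionary.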

4 Comments

No prob, glad to help!
Another question, please: I modified the XML input file as follows: <report logtype="news"> <result name="news11" logtype="news" start-epoch="1433134800" end-epoch="1433149199" generated-at="2015/06/01 04:10:17"/> <result name="news22" logtype="news22" start-epoch="1433134800" end-epoch="1433149199" generated-at="2015/06/01 04:10:17"/> <result name="news33" logtype="news33" start-epoch="1433134800" end-epoch="1433149199" generated-at="2015/06/01 04:10:17"/> </report>. I expected to get 3 records, but I only get one record in the index. Do I need to modify the config?
That would be another question (for the benefit of keeping this one uncluttered), but in short, I think in your xml filter the xpath expressions only match the first <result> element.
Thank you so much, Val. Could you please give me an example of how to modify my config file? Sorry for bothering you again; the lack of Logstash documentation is giving me a real headache. I appreciate your kind help.
