I'm trying to extract a substring from my request_uri field in Logstash. Grok already splits my Apache access-log line into several fields (this part is working), so I get the request_uri in its own field. Now I want to get the root context of the URI.

/en/some/stuff
/ApplicationName/some/path
/fr/some/french/stuff

But I don't know how to store en, ApplicationName, and fr in a field of their own (in addition to the other fields). I'm thinking something like this might work:

grok {
            pattern => "\"%{GREEDYDATA:domain}\" - %{IP:client_ip} \[%{GREEDYDATA:log_timestamp}\] \"%{WORD:method}\" \"%{GREEDYDATA:request_uri}\" - \"%{GREEDYDATA:query_string}\" - \"%{GREEDYDATA:protocol}\" - %{NUMBER:http_statuscode} %{NUMBER:bytes} \"%{GREEDYDATA:user_agent}\" %{NUMBER:seconds} %{NUMBER:milliseconds} \"%{GREEDYDATA:server_node}\""
            match => [ "new_context_field", "SOME-REGEX to parse request_uri" ]
        }

Can you give me a hint?

  • Which Logstash version is this? I've never heard of a pattern option to the grok filter. Commented Jan 16, 2015 at 10:52

2 Answers

Thanks for your help. I solved it with this grok config, which is pretty similar to your suggestion.

grok {
    patterns_dir => "/path/to/elk-stack/logstash-1.4.2/bin/custom_patterns"

    match => [ "message", "\"%{GREEDYDATA:domain}\" - %{IP:client_ip} \[%{GREEDYDATA:log_timestamp}\] \"%{WORD:method}\" \"%{GREEDYDATA:request_uri}\" - \"%{GREEDYDATA:query_string}\" - \"%{GREEDYDATA:protocol}\" - %{NUMBER:http_statuscode} %{NUMBER:bytes} \"%{GREEDYDATA:user_agent}\" %{NUMBER:seconds} %{NUMBER:milliseconds} \"%{GREEDYDATA:server_node}\""]
    match => [ "request_uri", "%{CONTEXTFROMURI:context}" ]

    break_on_match => false
}

To use multiple matches in a single grok block, make sure to include break_on_match => false. Otherwise the second match is skipped if the first one is successful.

Your grok filter should actually look like this:

grok {
  match => [
    "message",
    "\"%{GREEDYDATA:domain}\" - %{IP:client_ip} \[%{GREEDYDATA:log_timestamp}\] \"%{WORD:method}\" \"%{GREEDYDATA:request_uri}\" - \"%{GREEDYDATA:query_string}\" - \"%{GREEDYDATA:protocol}\" - %{NUMBER:http_statuscode} %{NUMBER:bytes} \"%{GREEDYDATA:user_agent}\" %{NUMBER:seconds} %{NUMBER:milliseconds} \"%{GREEDYDATA:server_node}\""
  ]
}

Then, use a second grok filter after the one that matches against the whole log message in the 'message' field:

grok {
  match => ["request_uri", "/(?<context>[^/]+)"]
}
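
To see what that expression captures before putting it into a pipeline, here is a minimal Python sketch that applies the same regex to the sample URIs from the question. It is only an approximation of grok's behaviour: Python's re module writes named groups as (?P<name>...) where grok's regex engine uses (?<name>...), and the helper name root_context is made up for this illustration. Like grok, re.search is unanchored, so the pattern matches the first slash followed by at least one non-slash character.

```python
import re

# Same expression as in the grok filter, in Python's named-group syntax:
# grok writes (?<context>...), Python's re module writes (?P<context>...).
CONTEXT_RE = re.compile(r"/(?P<context>[^/]+)")


def root_context(request_uri):
    """Return the first path segment of request_uri, or None if there is none.

    root_context is a hypothetical helper for this sketch, not part of Logstash.
    """
    match = CONTEXT_RE.search(request_uri)  # unanchored search, first match wins
    return match.group("context") if match else None


# Sample URIs from the question, plus a bare "/" that has no context segment.
for uri in ["/en/some/stuff", "/ApplicationName/some/path", "/fr/some/french/stuff", "/"]:
    print(uri, "->", root_context(uri))
```

A bare "/" yields no match here; in Logstash, an event whose request_uri does not match would normally be tagged _grokparsefailure by that second filter.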
