We want to replace a (badly implemented) daily unique-IP logger, written in PHP and set up years ago, with Elasticsearch.
We're not completely sure how to structure it yet, but we're considering logging every request as its own document, which would leave more room for dynamic analysis later.
Something like this:
{
  "_index": "logger",
  "_type": "_doc",
  "_id": "-1q04XEBfzHON7FKVuMY", // Auto-generated
  "_source": {
    "ip": "211.543.232.533",
    "user": "",
    "request": "GET /index.php HTTP/1.1",
    "status": 200,
    "bytes": 10984,
    "refer": "https://www.google.com/search?q=some%20website",
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36",
    "domain": "example.org",
    "timestamp": 1662865208000
  }
}
Now the issue is that the same IP may appear many times, and I was wondering whether it's possible to count the unique IPs seen since midnight (00:00).
For instance, say there are 6 documents: 3 with the ip field set to 211.543.232.533, 2 with 192.168.1.1, and 1 with 127.0.0.1. How could this be counted as 3 hits?
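To make the expected result concrete, here is a small Python sketch of the counting I'm after, run on the six example documents. It is only an in-memory illustration of the desired semantics, not an Elasticsearch call; the function name `count_unique_ips` and the per-document timestamps are made up for the example.

```python
from typing import Any, Iterable, Mapping

MIDNIGHT_MS = 1662854400000  # epoch ms used as the day boundary in this question


def count_unique_ips(docs: Iterable[Mapping[str, Any]], since_ms: int) -> int:
    """Count distinct `ip` values among documents at or after `since_ms`."""
    return len({doc["ip"] for doc in docs if doc["timestamp"] >= since_ms})


# Six documents: three, two and one per IP (timestamps are hypothetical).
ips = ["211.543.232.533"] * 3 + ["192.168.1.1"] * 2 + ["127.0.0.1"]
docs = [
    {"ip": ip, "timestamp": MIDNIGHT_MS + i * 1000}
    for i, ip in enumerate(ips)
]

unique_today = count_unique_ips(docs, MIDNIGHT_MS)
print(unique_today)
```

The point is that duplicates of the same `ip` collapse into one, so the six documents count as three.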
Maybe a search that looks something like this:
POST /logger/_count
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "timestamp": {
              "gte": 1662854400000 // Epoch ms at 00:00 UTC
            }
          }
        }
      ]
      // And then something here? I'm not really sure what to do
    }
  }
}
Is this something that can be expressed in the search request itself, or does it need a particular mapping type or analyzer?
Currently there are around 500'000 requests each day, of which around 30'000 come from unique IPs.
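One candidate that seems to fit is the `cardinality` aggregation, which returns an approximate distinct count for a field. Below is a minimal sketch of such a request body, built in Python so it can be inspected before sending it to `POST /logger/_search`. The helper name `build_unique_ip_request` and the aggregation name `unique_ips` are hypothetical, and the sketch assumes `ip` is mapped as `ip` or `keyword` (with a default dynamic text mapping the field would likely need to be `ip.keyword`).

```python
import json

MIDNIGHT_MS = 1662854400000  # epoch ms used as the day boundary in this question


def build_unique_ip_request(since_ms: int, field: str = "ip",
                            precision_threshold: int = 40000) -> dict:
    """Build a search body that counts distinct values of `field` since `since_ms`."""
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "query": {
            "bool": {
                # filter context: no scoring is needed for a plain time range
                "filter": [{"range": {"timestamp": {"gte": since_ms}}}]
            }
        },
        "aggs": {
            "unique_ips": {
                "cardinality": {
                    "field": field,
                    # counts below this threshold are expected to be close to exact
                    "precision_threshold": precision_threshold,
                }
            }
        },
    }


body = build_unique_ip_request(MIDNIGHT_MS)
print(json.dumps(body, indent=2))
```

If this works as expected, the count would come back under `aggregations.unique_ips.value` in the response. With roughly 30'000 unique IPs a day, a `precision_threshold` of 40000 (the documented maximum) should keep the approximation error small, though that is worth checking against a known day.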