I'm trying to load a JSON stream (a file with one JSON object per line) through Logstash into Elasticsearch. Some fields in my JSON objects contain Unicode escape sequences, as you can see below.

{"status_link": "https://www.facebook.com/asia/videos/1118055131588324/", "num_loves": "4", "num_sads": "0", "num_wows": "0", "num_angrys": "0", "num_comments": "6", "num_reactions": "46", "num_hahas": "0", "link_name": "", "num_likes": "42", "timestamp": "2016-07-25 02:07:38", "num_shares": "8", "_id": "156915824368931_1118055131588324", "status_message": "\"\u0411\u0440\u0438\u0433\u0430\u0434\" \u0440\u0435\u0430\u043b\u0438\u0442\u0438 \u0448\u043e\u0443\u043d\u044b \u0448\u0438\u043d\u044d \u0434\u0443\u0433\u0430\u0430\u0440 07-\u0440 \u0441\u0430\u0440\u044b\u043d 28-\u043d\u044b \u043f\u04af\u0440\u044d\u0432 \u0433\u0430\u0440\u0430\u0433\u0438\u0439\u043d \u043e\u0440\u043e\u0439 18:00 \u0446\u0430\u0433\u0430\u0430\u0441", "status_type": "video"}

When I start logstash, it gives me an error:

"status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Mixing up field types: class org.elasticsearch.index.mapper.core.StringFieldMapper$StringFieldType != class org.elasticsearch.index.mapper.internal.IdFieldMapper$IdFieldType on field _id"}}}}, :level=>:warn}

My logstash.conf:

input
{
    file
    {
        path => "test.json"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        exclude => "*.gz"
        type => "posts"
        codec => "json"
    }
}

filter {
  json {
    source => "message"
  }
}

output {
  elasticsearch {
  hosts => ["localhost:9200"]
  index => "fb"
  codec => "json"
   }
}

When I load a JSON object without Unicode, it is parsed and indexed in Elasticsearch successfully.

1 Answer

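The problem is that your document contains an _id field. _id is a reserved metadata field in Elasticsearch, which is why the error reports a type conflict with IdFieldMapper on field _id; the Unicode text is not the cause. You need to either remove the field or rename it.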
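As a minimal sketch of the rename option, a mutate filter could move the value out of the reserved name before the event reaches the output. The target name id is only an illustrative choice, not something Elasticsearch requires:

```
filter {
  mutate {
    # Move the value out of the reserved _id field.
    # "id" is an assumed target name; any non-reserved name would do.
    rename => { "_id" => "id" }
  }
}
```

If the value is not needed at all, mutate's remove_field option could drop the field instead of renaming it.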

2 Comments

Later on I will load comments and set up a parent-child relation between them and the posts. Isn't the _id field the only field that connects posts and comments?
If you want to use that value as your document ID, I suggest renaming the field to id and setting document_id => "%{[event][id]}" in your elasticsearch output.
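The suggestion in this comment might look roughly like the output below. It is a sketch that assumes the field has already been renamed to id and sits at the top level of the event, as it would for the sample document in the question, so the reference is written as %{id}; a nested field would need the bracketed form. The hosts and index values are copied from the question's config.

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "fb"
    # Use the renamed field's value as the Elasticsearch document ID.
    # Assumes a top-level field named "id" exists on every event.
    document_id => "%{id}"
  }
}
```

With this setup the original identifier still ends up as the document's _id in Elasticsearch, so it remains available for linking comments to posts later.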
