
I am parsing data from an Elasticsearch index and have received the data in JSON format as follows:

{
    "_shards": {
        "failed": 0,
        "skipped": 0,
        "successful": 5,
        "total": 5
    },
    "hits": {
        "hits": [
            {
                "_id": "wAv4u2cB9qH5eo0Slo9O",
                "_index": "homesecmum",
                "_score": 1.0870113,
                "_source": {
                    "image": "0000000028037c08_1544283640.314629.jpg"
                },
                "_type": "dataRecord"
            },
            {
                "_id": "wwv4u2cB9qH5eo0SmY8e",
                "_index": "homesecmum",
                "_score": 1.0870113,
                "_source": {
                    "image": "0000000028037c08_1544283642.963721.jpg"
                },
                "_type": "dataRecord"
            },
            {
                "_id": "wgv4u2cB9qH5eo0SmI8Z",
                "_index": "homesecmum",
                "_score": 1.074108,
                "_source": {
                    "image": "0000000028037c08_1544283640.629583.jpg"
                },
                "_type": "dataRecord"
            }
        ],
        "max_score": 1.0870113,
        "total": 5
    },
    "timed_out": false,
    "took": 11
}

I am trying to extract only the image field from the JSON data and store the values in an array. I tried the following:

for result in res['hits']['hits']:
    post = result['_source']['image']
    print(post)

and this:

respars = json.loads(res['hits']['hits'][0]['_source'])['image']
print(json.dumps(respars, indent=4, sort_keys = True))

Both of these throw an error:

TypeError: byte indices must be integers or slices, not str

I am sure similar problems have been raised here before, but I couldn't get past this error. How can I fix it?

  • There is a nice package to handle Elasticsearch: pypi.org/project/elasticsearch-dsl (commented Dec 17, 2018 at 17:04)
  • @andreihondrari Thank you, I will give this a try. (commented Dec 17, 2018 at 17:05)

2 Answers


Instead of going through the pain of manually handling the response, you could use the Elasticsearch-DSL package from PyPI.


1 Comment

Thank you. The package helped me parse the data straight away.

To get every image value from the _source entries as a list, you can use a list comprehension:

image_list = [source['_source']['image'] for source in res['hits']['hits']]

Output:

['0000000028037c08_1544283640.314629.jpg',
 '0000000028037c08_1544283642.963721.jpg',
 '0000000028037c08_1544283640.629583.jpg']
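Note that the list comprehension assumes res is already a Python dict. The TypeError in the question ("byte indices must be integers or slices, not str") suggests that res was still a raw bytes object at that point, so it probably needs to be decoded with json.loads first. The following is a minimal sketch under that assumption; extract_images and the raw sample payload are hypothetical names introduced here for illustration and are not part of the original code.

```python
import json


def extract_images(res):
    """Return the list of image names from an Elasticsearch search response.

    Accepts either an already-parsed dict or the raw JSON body as
    bytes/str (the bytes case is an assumption based on the TypeError).
    """
    if isinstance(res, (bytes, bytearray)):
        # Raw HTTP body: decode to text, then parse the JSON document.
        res = json.loads(res.decode("utf-8"))
    elif isinstance(res, str):
        res = json.loads(res)
    # From here on, res is a dict and can be indexed with string keys.
    return [hit["_source"]["image"] for hit in res["hits"]["hits"]]


# Hypothetical raw payload, trimmed from the response shown in the question.
raw = (
    b'{"hits": {"hits": ['
    b'{"_source": {"image": "0000000028037c08_1544283640.314629.jpg"}},'
    b'{"_source": {"image": "0000000028037c08_1544283642.963721.jpg"}},'
    b'{"_source": {"image": "0000000028037c08_1544283640.629583.jpg"}}'
    b']}}'
)

image_list = extract_images(raw)
print(image_list)
```

If res is already a dict (for example, when the client library parses the response for you), the function skips the decoding step and only runs the list comprehension.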
