I am trying to build a DataFrame in pandas, using the results of a very basic query to Elasticsearch. I am getting the Data I need, but its a matter of slicing the results in a way to build the proper data frame. I really only care about getting the timestamp, and path, of each result. I have tried a few different es.search patterns.
Code:
from datetime import datetime
from elasticsearch import Elasticsearch
from pandas import DataFrame, Series
import pandas as pd
import matplotlib.pyplot as plt
es = Elasticsearch(host="192.168.121.252")
res = es.search(index="_all", doc_type='logs', body={"query": {"match_all": {}}}, size=2, fields=('path','@timestamp'))
This gives 4 chunks of data. [u'hits', u'_shards', u'took', u'timed_out']. My results are inside the hits.
res['hits']['hits']
Out[47]:
[{u'_id': u'a1XHMhdHQB2uV7oq6dUldg',
u'_index': u'logstash-2014.08.07',
u'_score': 1.0,
u'_type': u'logs',
u'fields': {u'@timestamp': u'2014-08-07T12:36:00.086Z',
u'path': u'app2.log'}},
{u'_id': u'TcBvro_1QMqF4ORC-XlAPQ',
u'_index': u'logstash-2014.08.07',
u'_score': 1.0,
u'_type': u'logs',
u'fields': {u'@timestamp': u'2014-08-07T12:36:00.200Z',
u'path': u'app1.log'}}]
The only things I care about, are getting the timestamp, and path for each hit.
res['hits']['hits'][0]['fields']
Out[48]:
{u'@timestamp': u'2014-08-07T12:36:00.086Z',
u'path': u'app1.log'}
I can not for the life of me figure out who to get that result, into a dataframe in pandas. So for the 2 results I have returned, I would expect a dataframe like.
timestamp path
0 2014-08-07T12:36:00.086Z app1.log
1 2014-08-07T12:36:00.200Z app2.log
