
I have a generator being returned from:

data = public_client.get_product_trades(product_id='BTC-USD', limit=10)

How do I turn the data into a pandas DataFrame?

The method's docstring reads:

"""{"Returns": [{
                     "time": "2014-11-07T22:19:28.578544Z",
                     "trade_id": 74,
                     "price": "10.00000000",
                     "size": "0.01000000",
                     "side": "buy"
                 }, {
                     "time": "2014-11-07T01:08:43.642366Z",
                     "trade_id": 73,
                     "price": "100.00000000",
                     "size": "0.01000000",
                     "side": "sell"
         }]}"""

I have tried:

df = [x for x in data]
df = pd.DataFrame.from_records(df)

but it does not work; I get the error:

AttributeError: 'str' object has no attribute 'keys'

When I print the result of "x for x in data" as a list, I see the list of dicts, but the end looks strange. Could this be why?

print(list(data))

[{'time': '2020-12-30T13:04:14.385Z', 'trade_id': 116918468, 'price': '27853.82000000', 'size': '0.00171515', 'side': 'sell'},{'time': '2020-12-30T12:31:24.185Z', 'trade_id': 116915675, 'price': '27683.70000000', 'size': '0.01683711', 'side': 'sell'}, 'message']

It looks like a list of dicts, but the last value is the single string 'message'.

  • You say it returns a generator, but then your example is a list... Commented Dec 30, 2020 at 12:48
  • If it's a list, what happens if you call pd.json_normalize(your_list)? Commented Dec 30, 2020 at 12:49
  • If I run type(data) I get <class 'generator'>. The 'Returns' data above is the method's docstring. Commented Dec 30, 2020 at 12:51
  • Try just removing that last element. This question really has nothing to do with generators; it has to do with munging that data into something the pd.DataFrame constructor will accept. Commented Dec 30, 2020 at 12:54
  • I thought I might have been getting the last element because I was using the generator wrong. If I run next(data) I get back one dict from the list, so I assumed the generator was providing the rest of the list. Commented Dec 30, 2020 at 12:57
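
On the next(data) point raised in the comments: a generator hands out each item once, so every next() or list() call consumes whatever it returns. Here is a minimal sketch of that behaviour. The fake_trades function is a hypothetical stand-in for the real client call, not part of any library.

```python
def fake_trades():
    # Hypothetical stand-in for public_client.get_product_trades(...)
    yield {"trade_id": 2, "side": "sell"}
    yield {"trade_id": 1, "side": "buy"}
    yield "message"  # the unexpected trailing item seen in the question


data = fake_trades()
first = next(data)   # consumes the first dict only
rest = list(data)    # consumes everything that is left
again = list(data)   # the generator is exhausted by now

print(first)
print(rest)
print(again)
```

So if the generator has already been printed or partly iterated, it probably needs to be requested again before building the DataFrame, or materialised once into a list that is then reused.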

3 Answers

3

Based on the updated question:

df = pd.DataFrame(list(data)[:-1])

Or, more cleanly:

df = pd.DataFrame([x for x in data if isinstance(x, dict)])
print(df)

                       time   trade_id           price        size  side
0  2020-12-30T13:04:14.385Z  116918468  27853.82000000  0.00171515  sell
1  2020-12-30T12:31:24.185Z  116915675  27683.70000000  0.01683711  sell

Oh, and BTW, you'll still need to change those strings into something usable...

So e.g.:

df['time'] = pd.to_datetime(df['time'])
for k in ['price', 'size']:
    df[k] = pd.to_numeric(df[k])
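
Putting the pieces together, here is a rough end-to-end sketch. It assumes the generator yields dicts followed by a stray string, as in the printed output from the question; fake_trades is a hypothetical stand-in for the real client call.

```python
import pandas as pd


def fake_trades():
    """Hypothetical stand-in for the generator in the question."""
    yield {"time": "2020-12-30T13:04:14.385Z", "trade_id": 116918468,
           "price": "27853.82000000", "size": "0.00171515", "side": "sell"}
    yield {"time": "2020-12-30T12:31:24.185Z", "trade_id": 116915675,
           "price": "27683.70000000", "size": "0.01683711", "side": "sell"}
    yield "message"  # stray non-dict item at the end


# Keep only the dict records; anything else (such as 'message') is skipped.
records = [x for x in fake_trades() if isinstance(x, dict)]
df = pd.DataFrame(records)

# Convert the string columns into usable types.
df["time"] = pd.to_datetime(df["time"])
for col in ["price", "size"]:
    df[col] = pd.to_numeric(df[col])

print(df.dtypes)
```

The isinstance filter does not depend on where the stray item sits, which makes it a little safer than slicing off the last element.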

4 Comments

I gave an example in my last line. I cannot just print data because it is a generator, so my example shows data as a list, which is the result I get when I run it. I understand the docstring is only documentation; that is what I was pointing out.
I see. What about showing list(some_generator()) then? And, as I'm guessing it is too large, please show the beginning and the end. What is the type of the elements given by the generator (str, dict, list)?
I have now added the list(generator) output at the end.
df = pd.DataFrame(list(data)[:-1]) worked for me. That is roughly what I thought might happen, but I wanted to make sure I wasn't getting the 'message' at the end because of some issue in the way I was using the generator. I would not expect that 'message' to be at the end, and it's not in the docstring.
0

You could access the values of each dictionary in the list and build a dataframe from them (although this is not particularly clean):

dict_of_data =  [{
                     "time": "2014-11-07T22:19:28.578544Z",
                     "trade_id": 74,
                     "price": "10.00000000",
                     "size": "0.01000000",
                     "side": "buy"
                 }, {
                     "time": "2014-11-07T01:08:43.642366Z",
                     "trade_id": 73,
                     "price": "100.00000000",
                     "size": "0.01000000",
                     "side": "sell"
         }]

import pandas as pd 

list_of_data = [list(dict_of_data[0].values()),list(dict_of_data[1].values())]

pd.DataFrame(list_of_data, columns=list(dict_of_data[0].keys())).set_index('time')
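
The snippet above hard-codes indices 0 and 1. A sketch that handles any number of rows could look like the following. It assumes every dict has the same keys in the same order, and the name trades is only illustrative. The direct constructor call is included for comparison.

```python
import pandas as pd

# Sample records copied from the method docstring.
trades = [
    {"time": "2014-11-07T22:19:28.578544Z", "trade_id": 74,
     "price": "10.00000000", "size": "0.01000000", "side": "buy"},
    {"time": "2014-11-07T01:08:43.642366Z", "trade_id": 73,
     "price": "100.00000000", "size": "0.01000000", "side": "sell"},
]

# Manual route: one list of values per dict, column names from the first dict.
rows = [list(d.values()) for d in trades]
columns = list(trades[0].keys())
manual = pd.DataFrame(rows, columns=columns).set_index("time")

# Direct route: the constructor accepts a list of dicts as-is.
direct = pd.DataFrame(trades).set_index("time")

print(manual)
print(direct)
```

The direct route also copes with dicts whose keys differ or appear in a different order, which the manual route does not.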

Comments

0

It's straightforward: just use the pd.DataFrame constructor:

#list_of_dicts = [{
#                     "time": "2014-11-07T22:19:28.578544Z",
#                     "trade_id": 74,
#                     "price": "10.00000000",
#                     "size": "0.01000000",
#                     "side": "buy"
#                 }, {
#                     "time": "2014-11-07T01:08:43.642366Z",
#                     "trade_id": 73,
#                     "price": "100.00000000",
#                     "size": "0.01000000",
#                     "side": "sell"
#}]
# or, if you take it from 'data' (a generator cannot be sliced, so build a list first)
list_of_dicts = list(data)[:-1]
df = pd.DataFrame(list_of_dicts)

df
Out[4]: 
                          time  trade_id         price        size  side
0  2014-11-07T22:19:28.578544Z        74   10.00000000  0.01000000   buy
1  2014-11-07T01:08:43.642366Z        73  100.00000000  0.01000000  sell

UPDATE

According to the question update, it seems you have JSON data that is still a string...

import json

data = json.loads(data)
data = data['Returns']
pd.DataFrame(data)

                          time  trade_id         price        size  side
0  2014-11-07T22:19:28.578544Z        74   10.00000000  0.01000000   buy
1  2014-11-07T01:08:43.642366Z        73  100.00000000  0.01000000  sell

4 Comments

Nope, same error as I stated in my question. The issue is that you are treating the docstring as the data I receive, but I have said that when I iterate over the data I receive, it has a 'message' at the end. Why is this?
So can you print exactly what data contains, more fully, in your question?
I have added the full return value above now.
Updated accordingly... Is that 'message' consistently in the last index?
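
One way to check whether 'message' always sits in the last position is to separate the dict records from everything else and note where the leftovers appeared. This is only a sketch: split_records is a hypothetical helper, and the sample list mimics the shape of the output shown in the question.

```python
def split_records(items):
    """Split an iterable into dict records and (position, item) leftovers."""
    records, leftovers = [], []
    for position, item in enumerate(items):
        if isinstance(item, dict):
            records.append(item)
        else:
            leftovers.append((position, item))
    return records, leftovers


# Shape of the data from the question: dicts followed by a stray string.
sample = [{"trade_id": 2}, {"trade_id": 1}, "message"]
records, leftovers = split_records(sample)

print(records)
print(leftovers)
```

Passing the generator itself works too, since the helper only iterates once. The records list can then go straight to pd.DataFrame, while leftovers shows what was dropped and at which index.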
