1

I am trying to create a loop which will return for each ticker, 1. a different data frame (by the name of ticker) 2. with a conversion of the time column to "normal" day 3. and it (the new time) will be used as index for that data frame.

If I run it for each ticker it's working without problem. I appreciate your help!

import requests
import pandas as pd
desired_width = 320
pd.set_option('display.width', desired_width)

data = pd.DataFrame()
tickers = ['BTC', 'ETH', 'XRP']  # pools of tickers to get
for t in tickers:  # a loop to get data ticker by ticker
        url = 'https://min-api.cryptocompare.com/data/histoday' + \
              '?fsym=' + \
                t +\
              '&tsym=USD' + \
              '&limit=600000000000' + \
              '&aggregate=1' + \
              '&e=CCCAGG'
        response = requests.get(url)
        data[t] = response.json()['Data']
        #the following 2 lines I failed to execute
        #data[t]['time'] = pd.to_datetime(data[t]['time'], unit='s')
        #data[t].index = data[t]['time']
        print("downloading data for: " + t)
        print("data for:" + t, data.head(5))

My results is one data frame for all three tickers:

data for:XRP BTC
ETH XRP 0 {'time': 1342742400, 'close': 8.52, 'high': 8.... {'time': 1342742400, 'close': 0, 'high': 0, 'l... {'time': 1342742400, 'close': 0, 'high': 0, 'l... 1 {'time': 1342828800, 'close': 8.85, 'high': 9.... {'time': 1342828800, 'close': 0, 'high': 0, 'l... {'time': 1342828800, 'close': 0, 'high': 0, 'l... 2 {'time': 1342915200, 'close': 8.41, 'high': 8.... {'time': 1342915200, 'close': 0, 'high': 0, 'l... {'time': 1342915200, 'close': 0, 'high': 0, 'l... 3 {'time': 1343001600, 'close': 8.45, 'high': 9.... {'time': 1343001600, 'close': 0, 'high': 0, 'l... {'time': 1343001600, 'close': 0, 'high': 0, 'l... 4 {'time': 1343088000, 'close': 8.6, 'high': 8.8... {'time': 1343088000, 'close': 0, 'high': 0, 'l... {'time': 1343088000, 'close': 0, 'high': 0, 'l...

I am using python 3.6 with pycharm + anconda on Windows 10

2 Answers 2

2

I think you can use json_normalize for dinctionary of DataFrames with concat for DataFrame with MultiIndex - first level are tickers:

from pandas.io.json import json_normalize

data = {}
tickers = ['BTC', 'ETH', 'XRP']  # pools of tickers to get
for t in tickers:  # a loop to get data ticker by ticker
        url = 'https://min-api.cryptocompare.com/data/histoday' + \
              '?fsym=' + \
                t +\
              '&tsym=USD' + \
              '&limit=600000000000' + \
              '&aggregate=1' + \
              '&e=CCCAGG'
        response = requests.get(url)
        data[t] = json_normalize(response.json()['Data'])

df = pd.concat(data)
print (df.head())

       close  high   low  open        time  volumefrom    volumeto
BTC 0   8.52  8.87  7.60  8.87  1342742400   154661.12  1267523.74
    1   8.85  9.70  7.96  8.52  1342828800   139906.90  1242153.88
    2   8.41  8.97  8.27  8.85  1342915200    30070.67   259113.81
    3   8.45  9.20  7.75  8.41  1343001600   146396.18  1238579.49
    4   8.60  8.85  8.34  8.45  1343088000    40946.86   353506.54

Then for select each level is posible use:

print (df.xs('BTC').head())

#print (df.loc['BTC'].head())

   close  high   low  open        time  volumefrom    volumeto
0   8.52  8.87  7.60  8.87  1342742400   154661.12  1267523.74
1   8.85  9.70  7.96  8.52  1342828800   139906.90  1242153.88
2   8.41  8.97  8.27  8.85  1342915200    30070.67   259113.81
3   8.45  9.20  7.75  8.41  1343001600   146396.18  1238579.49
4   8.60  8.85  8.34  8.45  1343088000    40946.86   353506.54

Another approach is not concat, only create dictionary:

data = {}
tickers = ['BTC', 'ETH', 'XRP']  # pools of tickers to get
for t in tickers:  # a loop to get data ticker by ticker
        url = 'https://min-api.cryptocompare.com/data/histoday' + \
              '?fsym=' + \
                t +\
              '&tsym=USD' + \
              '&limit=600000000000' + \
              '&aggregate=1' + \
              '&e=CCCAGG'
        response = requests.get(url)
        data[t] = json_normalize(response.json()['Data'])
        data[t] = data[t].set_index(pd.to_datetime(data[t]['time'], unit='s'))

print (data['BTC'].head())

            close  high   low  open        time  volumefrom    volumeto
time                                                                   
2012-07-20   8.52  8.87  7.60  8.87  1342742400   154661.12  1267523.74
2012-07-21   8.85  9.70  7.96  8.52  1342828800   139906.90  1242153.88
2012-07-22   8.41  8.97  8.27  8.85  1342915200    30070.67   259113.81
2012-07-23   8.45  9.20  7.75  8.41  1343001600   146396.18  1238579.49
2012-07-24   8.60  8.85  8.34  8.45  1343088000    40946.86   353506.54

EDIT: If want global variable not recommended solution:

data = {}
tickers = ['BTC', 'ETH', 'XRP']  # pools of tickers to get
for t in tickers:  # a loop to get data ticker by ticker
        url = 'https://min-api.cryptocompare.com/data/histoday' + \
              '?fsym=' + \
                t +\
              '&tsym=USD' + \
              '&limit=600000000000' + \
              '&aggregate=1' + \
              '&e=CCCAGG'
        response = requests.get(url)
        globals()['df_' + str(t)] = json_normalize(response.json()['Data'])
        globals()['df_' + str(t)] = globals()['df_' + str(t)].set_index(pd.to_datetime(globals()['df_' + str(t)]['time'], unit='s'))

print (df_BTC.head())

            close  high   low  open        time  volumefrom    volumeto
time                                                                   
2012-07-20   8.52  8.87  7.60  8.87  1342742400   154661.12  1267523.74
2012-07-21   8.85  9.70  7.96  8.52  1342828800   139906.90  1242153.88
2012-07-22   8.41  8.97  8.27  8.85  1342915200    30070.67   259113.81
2012-07-23   8.45  9.20  7.75  8.41  1343001600   146396.18  1238579.49
2012-07-24   8.60  8.85  8.34  8.45  1343088000    40946.86   353506.54
Sign up to request clarification or add additional context in comments.

11 Comments

thanks! it does solve problems 2 + 3 but not the first which is creating 3 or 99 different data frames each with "df_" + t so I will have df_BTC, df_ETH etc.
@Giladbi - Is possible use dictionary of DataFrames ?
But if really need it, then use globals()['df_' + str(t)] = json_normalize(response.json()['Data']) instead data[t] = json_normalize(response.json()['Data']) and then get print (df_BTC), but it is a bit unpythonic.
@Giladbi ... it is not advised to store many similar structure objects like dfs in your global environment as it requires too much maintenance and resource pointers. Use one container like list or dictionary and you can reference each by unique keys as jezrael shows. The same advice applies for R and other languages!
@jezrael thanks, it works but i still didn't solve the time format and the time as index. it should be done before the creation of the different data frames. I tried this: response.json()['time'] = pd.to_datetime(data[t]['time'], unit='s') for time format change but i am messing it up.
|
0

I manged to defined a function to get the data and than use a loop to get it for all the tickers. this solves the problem.

import requests
import datetime
import pandas as pd
import matplotlib.pyplot as plt
desired_width = 320
pd.set_option('display.width', desired_width)

#function to download the Historical HOUR data
def hourly_price_historical(symbol, comparison_symbol, limit, aggregate, exchange=''):
    url = 'https://min-api.cryptocompare.com/data/histohour?fsym={}&tsym={}&limit={}&aggregate={}'\
            .format(symbol.upper(), comparison_symbol.upper(), limit, aggregate)
    if exchange:
        url += '&e={}'.format(exchange)
    page = requests.get(url)
    data = page.json()['Data']
    df = pd.DataFrame(data)
    df['timestamp'] = [datetime.datetime.fromtimestamp(d) for d in df.time]
    df = df.drop('time', 1)
    df.set_index('timestamp')
    return df


data = {}
tickers = ['BTC', 'ETH', 'XRP']                 # pools of tickers to get
for t in tickers:                                                           # a loop to get data ticker by ticker
    data[t] = hour_data = hourly_price_historical(t,'USD', 9999999,1)       # calling the function defined above
    print("Getting the data for: ", t)
    globals()['df_' + str(t)] = data

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.