I'm using YahooFinancials to get the stock price and volume for a list of several companies. I can extract the prices and volumes into separate dataframes, but I would like to get both into the same dataframe without having to merge them after the fact. I believe I need a nested comprehension, but I'm not sure how to write it.

My code is as follows:

import pandas as pd
from pandas import json_normalize  # pandas >= 1.0; the pandas.io.json import path is deprecated
import numpy as np
from yahoofinancials import YahooFinancials
import matplotlib.pyplot as plt
import seaborn as sns

from datetime import date, timedelta
import warnings

%matplotlib inline
%config InlineBackend.figure_format = 'retina'

plt.style.use('seaborn')

start = date(2007,1,1)
end = date(2020,6,4)
today = date.today()
tomorrow = str(end + timedelta(days=1))

portfolio = ['AMZN', 'GOOGL', 'MSFT']
yahoo_financials = YahooFinancials(portfolio)

data = yahoo_financials.get_historical_price_data(start_date=str(start), end_date=str(today), time_interval='daily')

prices = pd.DataFrame({a: {x['formatted_date']: x['adjclose'] for x in data[a]['prices']} for a in portfolio})

volume = pd.DataFrame({a: {x['formatted_date']: x['volume'] for x in data[a]['prices']} for a in portfolio})

Ideally, the output looks something like this:

date       AMZNPrice    AMZNVolume  GOOGLPrice   GOOGLVolume  MSFTPrice   MSFTVolume
6/9/2020   2600.860107  5176000     1452.079956  1681200      189.800003  29783900
6/10/2020  2647.449951  4946000     1464.699951  1588100      196.839996  43872300
6/11/2020  2557.959961  5800100     1401.900024  2357200      186.270004  52854700
6/12/2020  2545.02002   5429600     1412.920044  1832900      187.740005  43345700
6/15/2020  2572.679932  3865100     1420.73999   1523400      188.940002  32712500

1 Answer

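Instead of a nested comprehension, you can build one long-format dataframe per ticker with pd.json_normalize, concatenate them, and pivot so that each ticker gets its own price and volume column: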

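# Same download call as in the question
data = yahoo_financials.get_historical_price_data(start_date=str(start), end_date=str(today), time_interval='daily')

# Build one long-format frame per ticker, tagging each row with its symbol
dfs = []
for s in portfolio:
    df = pd.json_normalize(data[s]['prices'])
    df['stock'] = s
    df = df[['stock', 'formatted_date', 'adjclose', 'volume']]
    dfs.append(df)

# Stack the per-ticker frames, then pivot to one row per date
df = pd.concat(dfs)
df = pd.pivot(df, index='formatted_date', columns='stock', values=['adjclose', 'volume'])
# Flatten the (field, ticker) MultiIndex columns into names such as 'adjclose_AMZN'
df.columns = ['_'.join(col) for col in df.columns.values]
print(df)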

Output:

                adjclose_AMZN  adjclose_GOOGL  adjclose_MSFT  volume_AMZN  volume_GOOGL  volume_MSFT
formatted_date
2007-01-03          38.700001      234.029022      22.123693   12405100.0    15397500.0   76935100.0
2007-01-04          38.900002      241.871872      22.086641    6318400.0    15759400.0   45774500.0
2007-01-05          38.369999      243.838837      21.960684    6619700.0    13730400.0   44607200.0
2007-01-08          37.500000      242.032028      22.175547    6783000.0     9499200.0   50220200.0
2007-01-09          37.779999      242.992996      22.197784    5703000.0    10752000.0   44636600.0
...                       ...             ...            ...          ...           ...          ...
2020-06-09        2600.860107     1452.079956     189.800003    5176000.0     1681200.0   29783900.0
2020-06-10        2647.449951     1464.699951     196.839996    4946000.0     1588100.0   43872300.0
2020-06-11        2557.959961     1401.900024     186.270004    5800100.0     2357200.0   52854700.0
2020-06-12        2545.020020     1412.920044     187.740005    5429600.0     1832900.0   43345700.0
2020-06-15        2572.679932     1420.739990     188.940002    3865100.0     1523400.0   32712500.0
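
If you would still prefer the nested comprehension the question asks about, a rough sketch follows. It is only an illustration under stated assumptions: `sample` is a hand-written stand-in that mimics the `data[ticker]['prices']` structure used in the question (with a few rows copied from the output above), and the names `tickers`, `fields`, and `combined` are made up for this example. With real data you would use `data` and `portfolio` in place of `sample` and `tickers`.

```python
import pandas as pd

# Hypothetical stand-in for the dict returned by get_historical_price_data;
# only the keys used in the question are included.
sample = {
    'AMZN': {'prices': [
        {'formatted_date': '2020-06-09', 'adjclose': 2600.860107, 'volume': 5176000},
        {'formatted_date': '2020-06-10', 'adjclose': 2647.449951, 'volume': 4946000},
    ]},
    'MSFT': {'prices': [
        {'formatted_date': '2020-06-09', 'adjclose': 189.800003, 'volume': 29783900},
        {'formatted_date': '2020-06-10', 'adjclose': 196.839996, 'volume': 43872300},
    ]},
}
tickers = ['AMZN', 'MSFT']

# Map each source field to the column suffix wanted in the output
fields = {'adjclose': 'Price', 'volume': 'Volume'}

# Outer loops: one column per (ticker, field) pair.
# Inner comprehension: date -> value for that column.
combined = pd.DataFrame({
    f'{ticker}{label}': {
        row['formatted_date']: row[field]
        for row in sample[ticker]['prices']
    }
    for ticker in tickers
    for field, label in fields.items()
})
combined.index.name = 'date'
print(combined)
```

Because the outer loops iterate over tickers first and fields second, the columns come out grouped per ticker (AMZNPrice, AMZNVolume, MSFTPrice, MSFTVolume), which matches the layout requested in the question.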