I'm having some issues with joining dataframes in python. The code is supposed to create a dataframe for each US state and put it into a main dataframe.
I think the problem is coming from the column titles. When I print them I get this:
Index([u'0AL', u'0', u'0AZ', u'0', u'0CO', u'0', u'0DE', u'0', u'0GA', u'0',
u'0IA', u'0', u'0IL', u'0', u'0KS', u'0', u'0LA', u'0', u'0MD', u'0',
u'0MI', u'0', u'0MO', u'0', u'0MT', u'0', u'0ND', u'0', u'0NH', u'0',
u'0NM', u'0', u'0NY', u'0', u'0OK', u'0', u'0PA', u'0', u'0SC', u'0',
u'0TN', u'0', u'0UT', u'0', u'0VT', u'0', u'0WI', u'0', u'0WY', u'0'],
dtype='object')
When I print the dataframe I get this:
0AL 0 0AZ 0 0CO 0 0DE 0 0GA 0 ... 0TN 0 \
2017-01-01 6.5 6.4 3.8 5 5.2 2.9 4.5 4.4 5 5.5 ... 2.9 5.4
0UT 0 0VT 0 0WI 0 0WY 0
2017-01-01 4.8 3.1 4 3.1 5.1 3.9 5.6 4.8
[1 rows x 50 columns]
It seems to be adding "0" everywhere as well as replacing some of the column headers (States). Initially, I was grabbing the States abbreviations from Wikipedia but then thought I would try and input them manually but that didn't help. Thanks in advance.
The purpose of this is to be able to rank the different states based on certain criteria.
Here is the code:
from fredapi import Fred
fred = Fred(api_key="XXXXXXXXXXXXXXXXX")
import matplotlib.pyplot as plt
from IPython.core.pylabtools import figsize
import datetime
import pandas as pd
import time
import html5lib
import numpy as np
#html= pd.read_html('https://en.wikipedia.org/wiki/List_of_states_and_territories_of_the_United_States')
#us_states= html[0][1][2:]
states =['AK','AL','AR','AZ','CA','CO','CT','DE','FL','GA','HI','IA','ID','IL','IN','KS','KY','LA','MA','MD','ME','MI','MN','MO','MS','MT','NC','ND','NE','NH', 'NJ','NM', 'NV','NY','OH','OK','OR','PA','RI','SC', 'SD','TN','TX','UT','VA','VT','WA','WI','WV',"WY"]
start = datetime.datetime(2017,1,1)
end = datetime.datetime(2017,1,1)
figsize(40, 5)
main_df = pd.DataFrame()
for abbv in states:
df= pd.DataFrame(fred.get_series(series_id=str(abbv)+"UR",observation_start=start, observation_end=end))
if main_df.empty:
main_df = df
else:
main_df = main_df.join(df, lsuffix=abbv)
print main_df
The first dataframe looks like this:
0
2017-01-01 6.5
columnsparameter when you form your firstdfdataframe. See my answer for implementation.