I read CSVs from URLs and would like to build a single DataFrame from them. Each CSV is a time series of measurements of one parameter at one location (i.e. each URL is associated with exactly one location and one parameter).
import pandas as pd

parameter = ['pm10', 'pm2.5', 'o3', 'no2']
location = ['Nabel_LUG', 'Nabel_MAG']
urls = []
dfs = []
CSV_URL = 'http://www.oasi.ti.ch/web/rest/measure/csv?domain=air&resolution=y&parameter={}&from=2007-01-01&to=2017-04-28&location={}'
# Build one URL per (location, parameter) pair
for l in location:
    for p in parameter:
        url = CSV_URL.format(p, l)
        urls.append(url)
Here urls is the list of URLs from which I get the CSVs.
dfs = [(pd.read_csv(url, comment='#', sep=';', usecols=[0, 1], index_col='data')) for url in urls]
result_pm10 = pd.concat(dfs, keys=location)
result_pm10 is a DataFrame that contains the time series of every location for one specific parameter, e.g.:
           data              PM10
Nabel_LUG  01.07.2011 01:00  21.0
Nabel_LUG  01.07.2012 01:00  21.0
Nabel_LUG  01.07.2013 01:00  18.0
Nabel_LUG  01.07.2014 01:00  15.0
Nabel_LUG  01.07.2015 01:00  18.0
Nabel_LUG  01.07.2016 01:00  16.0
Nabel_LUG  01.07.2017 01:00  24.0
Nabel_MAG  01.07.2011 01:00  24.0
Nabel_MAG  01.07.2012 01:00  21.0
Nabel_MAG  01.07.2013 01:00  19.0
Nabel_MAG  01.07.2014 01:00  15.0
Nabel_MAG  01.07.2015 01:00  19.0
Nabel_MAG  01.07.2016 01:00  15.0
Nabel_MAG  01.07.2017 01:00  22.0
I would like to obtain something like this, with one column per parameter:
           data              PM10     O3   NO2
Nabel_LUG  01.07.2011 01:00  21.0  683.0  34.0
Nabel_LUG  01.07.2012 01:00  21.0  668.0  32.0
Nabel_LUG  01.07.2013 01:00  18.0  707.0  31.0
Nabel_LUG  01.07.2014 01:00  15.0  366.0  29.0
Nabel_LUG  01.07.2015 01:00  18.0  804.0  30.0
Nabel_LUG  01.07.2016 01:00  16.0  550.0  28.0
Nabel_LUG  01.07.2017 01:00  24.0   45.0  37.0
Nabel_MAG  01.07.2011 01:00  24.0  540.0  20.0
Nabel_MAG  01.07.2012 01:00  21.0  432.0  19.0
Nabel_MAG  01.07.2013 01:00  19.0  494.0  18.0
Nabel_MAG  01.07.2014 01:00  15.0  259.0  20.0
Nabel_MAG  01.07.2015 01:00  19.0  596.0  18.0
Nabel_MAG  01.07.2016 01:00  15.0  363.0  21.0
Nabel_MAG  01.07.2017 01:00  22.0   65.0  24.0
But I'm only able to do this by repeating the above code for each parameter separately and then doing something like:
df_parameter = [result_pm10, result_pm25, result_o3, result_no2]
result = pd.concat(df_parameter, axis=1)
Is there a more efficient way to do this, especially when there are many more parameters?
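One possible direction, as a minimal sketch rather than a definitive solution: build one frame per parameter inside a loop (stacking the locations with `keys=location`, as above), then concatenate those frames along `axis=1`. The helper `load_series` below is hypothetical and returns made-up values so the sketch runs without network access; it stands in for the `pd.read_csv` call on each URL.

```python
import pandas as pd

parameter = ['pm10', 'o3', 'no2']
location = ['Nabel_LUG', 'Nabel_MAG']


def load_series(loc, par):
    """Hypothetical stand-in for pd.read_csv(url, ...).

    Returns a one-column frame indexed by 'data', filled with
    made-up values so the example is self-contained.
    """
    idx = pd.Index(['01.07.2011 01:00', '01.07.2012 01:00'], name='data')
    base = float(len(loc) + len(par))  # arbitrary placeholder value
    return pd.DataFrame({par.upper(): [base, base + 1.0]}, index=idx)


# One frame per parameter: stack the locations vertically,
# keyed by location name (outer index level).
per_parameter = [
    pd.concat([load_series(loc, par) for loc in location], keys=location)
    for par in parameter
]

# Place the parameters side by side; rows align on (location, data).
result = pd.concat(per_parameter, axis=1)
print(result)
```

With the real data, `load_series(loc, par)` would presumably be replaced by `pd.read_csv(CSV_URL.format(par, loc), comment='#', sep=';', usecols=[0, 1], index_col='data')`, so adding a parameter only means extending the `parameter` list.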