0

I have a DataFrame that is resampled into a smaller DataFrame, which retained a datetimeindex. I transposed the dataframe and now wish to remove the dateindex and replace it with strings (labels), then export it to .csv to be used in a format that can be read by javascript (doing all data manipulation in python).

I did try to write it to a .csv without header (removing the date) and then reading it again to add the labels, but that doesn't seem very efficient.

Link to csv: https://www.dropbox.com/s/qy72yht2m7lk2pg/17_predicted.csv

Python/pandas Code:

import pandas as pd
import numpy as np
from dateutil.parser import parse
from datetime import datetime
from pandas import *

# Load csv into pandas DataFrame
df = pd.read_csv("17_predicted_dummydata.csv", parse_dates=True, dayfirst=False, keep_date_col=True, index_col=0)

#Creating a date range
df.index = pd.to_datetime(pd.date_range(start='1/1/2000 00:30:00', end='1/1/2000 05:00:00', freq='30T'))

#Rename index
df.index.name = 'date'

df_year = df.resample('D', how='sum')
df_year = np.round(df_year, 0)

df_year.index.name = 'label'
df_year.column = ['value']

df_year = df_year.T

print df_year.head()
print df_year.index.name

df_year.to_csv("17_dummy.csv") #drop index through header=False

CSV input:

    Date/Time,InteriorEquipment:Electricity:Zone:4419 [J](TimeStep),InteriorEquipment:Electricity:Zone:3967 [J](TimeStep),InteriorEquipment:Electricity:Zone:3993 [J](TimeStep)
 01/01  00:30:00,0.583979872,0.428071889,0.044676234
 01/01  01:00:00,0.583979872,0.428071889,0.044676234
 01/01  01:30:00,0.583979872,0.428071889,0.044676234
 01/01  02:00:00,0.583979872,0.428071889,0.044676234
 01/01  02:30:00,0.583979872,0.428071889,0.044676234
 01/01  03:00:00,0.583979872,0.428071889,0.044676234
 01/01  03:30:00,0.583979872,0.428071889,0.044676234
 01/01  04:00:00,0.583979872,0.428071889,0.044676234
 01/01  04:30:00,0.583979872,0.428071889,0.044676234
 01/01  05:00:00,0.583979872,0.428071889,0.044676234

Proposed csv output:

    label,value
InteriorEquipment:Electricity:Zone:4419 [J](TimeStep),6.0
InteriorEquipment:Electricity:Zone:3967 [J](TimeStep),4.0
InteriorEquipment:Electricity:Zone:3993 [J](TimeStep),0.0

I tried to follow this (Insert a row to pandas dataframe) workaround, but couldn't make it work.

Any help is appreciated!

5
  • 2
    Could you edit this to a reproducible example that can be copied/pasted (using dummmy data if you need to)? Commented Jul 24, 2014 at 12:08
  • Hi again, I added some more information of what I have done up to now and the .csv file I am using. Hope this makes it a bit clearer. Thanks! Commented Jul 25, 2014 at 7:41
  • Since you want to manipulate the DataFrame further using Javascript, have you tried using JSON? DataFrame.to_json should be easy enough to use. Commented Jul 25, 2014 at 9:30
  • I wish to use .csv, because it then allows me to still analyse it and share it in excel. I didn't know about that function though, and will see if I can get that working for me as a backup. Commented Jul 25, 2014 at 10:06
  • I tried to read in the data, but got CParserError: Error tokenizing data. C error: Expected 2 fields in line 15, saw 7. A way simpler way is to simply reset_index() on your data and then paste df.head() in the post, since we can recreate dataframes through copy & paste. Commented Jul 25, 2014 at 11:54

1 Answer 1

1

You can assign directly to the index name and columns of dataframe to make it output as you want.

In [288]: df_year.index.name = 'label'

In [289]: df_year.columns = ['value']

In [290]: print df_year.to_csv()
label,value
Equipment:Electricity:LGF,79468.0
Equipment:Electricity:GF,66724.0
Equipment:Electricity:1st,30700.0
Equipment:Electricity:2nd,24126.0
Lights:Electricity:LGF,30596.0
Lights:Electricity:GF,30596.0
Lights:Electricity:1st,14078.0
Lights:Electricity:2nd,11063.0
General:Equipment:Electricity,201018.0
General:Lights:Electricity,86334.0
Electricity:Facility,314318.0
Electricity:Building,287352.0
Electricity:Plant,6329.0
Gas:Facility,279252.0
Electricity:HVAC,20637.0
General:Fans:Electricity,3554.0
Cooling:Electricity,17083.0
Pumps:Electricity,3708.0
WaterSystems:Electricity,2621.0
Sign up to request clarification or add additional context in comments.

8 Comments

This is actually not working for me. I think the issue is maybe related to the translation that I do first. I also found that index.names does not work anymore for 0.14 (stackoverflow.com/questions/19851005/…). Can't figure out how it should be done though even when using these examples: stackoverflow.com/questions/20037966/….
Make sure you're using index.name, not index.names
hmm, still no success. I was able to change the index to 'label' then transposed, but then have the format in your answer, except that 'value' is a date; '2000-12-32', which I can't seem to change. If I use df_year.columns = ['value'] after transpose, it throws away the 'label' and puts 'value' on the second column..
If it wasn't clear, the steps above should be done to the already transposed data.
I tried that, for the sake of getting it right. I updated my question with a small part of my data and my code to reproduce the situation. What I finally get is data labelled by only the date.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.