I have a CSV file like this:

2011    1   10  1000000
2011    1   11  998785
2011    1   12  1002940
2011    1   13  1004815
2011    1   14  1009415
2011    1   18  1011935

I want to read it into a DataFrame and have a datetime-typed index built from the first 3 columns. The final DataFrame should look like this:

                     values
datetime(2011,1,10)  1000000
datetime(2011,1,11)  998785
...

How should I do that? Thanks a lot!

1 Answer

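import io
import pandas as pd

# io.StringIO wraps a text literal as a file-like object
# (on Python 3, io.BytesIO accepts only bytes, not str).
content = io.StringIO('''\
2011    1   10  1000000
2011    1   11  998785
2011    1   12  1002940
2011    1   13  1004815
2011    1   14  1009415
2011    1   18  1011935''')

# parse_dates=[[0,1,2]] merges columns 0, 1 and 2 into a single date column.
# Note: recent pandas releases deprecate this nested-list form.
df = pd.read_table(content, sep=r'\s+', parse_dates=[[0,1,2]], header=None)
df.columns = ['date', 'values']
print(df)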

yields

                 date   values
0 2011-01-10 00:00:00  1000000
1 2011-01-11 00:00:00   998785
2 2011-01-12 00:00:00  1002940
3 2011-01-13 00:00:00  1004815
4 2011-01-14 00:00:00  1009415
5 2011-01-18 00:00:00  1011935
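
The result above keeps the dates in an ordinary column, while the question asks for a datetime index. A minimal sketch of one way to get there, assuming a pandas version in which pd.to_datetime accepts a DataFrame with year, month and day columns; the column names 'year', 'month', 'day' and the index name 'datetime' are chosen here for illustration and are not part of the original answer:

```python
import io
import pandas as pd

# Same sample data as in the question, wrapped as a text file-like object.
content = io.StringIO('''\
2011    1   10  1000000
2011    1   11  998785
2011    1   12  1002940
2011    1   13  1004815
2011    1   14  1009415
2011    1   18  1011935''')

# Read the four whitespace-separated columns under explicit (illustrative) names.
df = pd.read_csv(content, sep=r'\s+', header=None,
                 names=['year', 'month', 'day', 'values'])

# Assemble datetimes from the year/month/day columns and use them as the index.
df.index = pd.to_datetime(df[['year', 'month', 'day']])
df.index.name = 'datetime'

# Keep only the data column now that the date parts live in the index.
df = df[['values']]
print(df)
```

Building the dates in a separate step avoids relying on parse_dates to merge columns, so it may also help on versions where that merge behaves differently.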

5 Comments

Thanks. I got an error: "Exception: Length mismatch (2 vs 4)". I assume the number of columns is incorrect. Is there a version mismatch with pandas?
Works perfectly for me with pandas 0.11.0
I doubt it's a version issue; more likely there is a header row like "date value" that you should skip.
I copied unutbu's code directly and tested it. The error came from the line "df.columns=['date', 'values']". I am using Python 2.7 with pandas 0.7.0.
You should really try to update your pandas version. Current stable version is 0.12.0.
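
A likely reading of "Length mismatch (2 vs 4)" is that two column names were assigned to a frame that still had four columns, meaning the three date columns were not merged on that installation. A small diagnostic sketch, using a shortened two-row sample made up for illustration, prints the installed version and the column count of the raw file before any names are assigned:

```python
import io
import pandas as pd

# Report which pandas version is actually installed.
print(pd.__version__)

# Shortened sample (two rows) in the same layout as the question's data.
sample = io.StringIO("2011 1 10 1000000\n2011 1 11 998785\n")

# Read without parse_dates to see how many raw columns the file has.
raw = pd.read_csv(sample, sep=r"\s+", header=None)
n_rows, n_cols = raw.shape
print(n_rows, n_cols)
```

If the frame read with parse_dates reports the same column count as this raw read, the date columns were not combined, and the list assigned to df.columns must match that count.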
