2
  1988   1   1  7.88 15.57 25.00  0.00  0.81  4.02
  1988   1   2  6.50 10.37 24.87  0.00  0.49  4.30
  1988   1   3  6.48 -8.79 21.28  0.00  0.62  3.91

I read the above file as:

df = pd.read_fwf('27_35.txt', header=None, widths=[6,4,4,6,6,6,6,6,6])

The 1st column is year, 2nd column is month and 3rd column is day. How do I create an index out of these 3 columns ('1988-1-1')?

3 Answers 3

5

use parse_dates param and pass a list in a list (so it understands that you want to treat the list of columns as a single entity for parsing):

In [83]:
t="""1988   1   1  7.88 15.57 25.00  0.00  0.81  4.02
  1988   1   2  6.50 10.37 24.87  0.00  0.49  4.30
  1988   1   3  6.48 -8.79 21.28  0.00  0.62  3.91"""
df = pd.read_fwf(io.StringIO(t), header=None, widths=[6,4,4,6,6,6,6,6,6], parse_dates=[[0,1,2]])
df

Out[83]:
       0_1_2       3       4      5  6     7     8
0 1988-01-01  7.88 1  5.57 2   5.00  0  0.81  4.02
1 1988-01-02    6.50   10.37  24.87  0  0.49  4.30
2 1988-01-03    6.48   -8.79  21.28  0  0.62  3.91

In [84]:
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 3 entries, 0 to 2
Data columns (total 7 columns):
0_1_2    3 non-null datetime64[ns]
3        3 non-null object
4        3 non-null object
5        3 non-null float64
6        3 non-null float64
7        3 non-null float64
8        3 non-null float64
dtypes: datetime64[ns](1), float64(4), object(2)
memory usage: 192.0+ bytes

EDIT

To keep the original date columns add param keep_date_col=True

df = pd.read_fwf('27_35.txt', header=None, widths=[6,4,4,6,6,6,6,6,6], parse_dates=[[0,1,2]], keep_date_col=True)
Sign up to request clarification or add additional context in comments.

2 Comments

thanks @EdChum, is there a way to retain the original 3 columns as well?
add param keep_date_col=True
1

I'm always fogetting how to deal with dates in Pandas .... You could try this:

df.index = df.apply(lambda row: datetime.datetime(int(row[0]), int(row[1]), int(row[2])),1)

Comments

1

I would do this:

import pandas as pd
df = pd.DataFrame(data={'Year':[1988, 1988, 1988], 'Month':[1, 1, 1],
                        'Day': [1, 2, 3]})

date = [str(y)+'-'+str(m)+'-'+str(d) for y, m, d in zip(df.Year, df.Month, df.Day)]
df.index = pd.to_datetime(date)

>>> print df
            Day  Month  Year
1988-01-01    1      1  1988
1988-01-02    2      1  1988
1988-01-03    3      1  1988

After that you can drop Day, Month, and Year from your dataframe.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.