0

I have a CSV file with two columns, id and val, where id is the day number within the year (365 in total). Is it possible to convert id to dates in the format '%d-%m-%Y'? In other words, I want to replace the day numbers with the dates of 2015, e.g. 01-01-2015 and so on. How can I do this with pandas in Python?

The following is part of the file, followed by the desired output:

"id"    "val"
1   49
2   48
3   46
4   45


"date"  "val"
01-01-2015  49
02-01-2015  48
03-01-2015  46
04-01-2015  45

6 Answers

2

Use pd.tseries.offsets.Day:

df['date'] = pd.Timestamp('2015-01-01') \
             + df['id'].sub(1).apply(pd.tseries.offsets.Day)

Alternative, proposed by @HenryEcker:

df['date'] = pd.Timestamp('2015-01-01') \
             - pd.Timedelta(days=1) \
             + df['id'].apply(pd.tseries.offsets.Day)

>>> df['id'].sub(1).apply(pd.tseries.offsets.Day)
0    <0 * Days>
1         <Day>
2    <2 * Days>
3    <3 * Days>
Name: id, dtype: object


>>> df
   id  val       date
0   1   49 2015-01-01
1   2   48 2015-01-02
2   3   46 2015-01-03
3   4   45 2015-01-04
2 Comments

Seems like pd.Timestamp('2015-01-01') - pd.Timedelta(days=1) would be faster than subtracting from the entire Series and give the same results.
@HenryEcker. Thanks for this. I added your proposal.
1

You can convert id to datetime and format the output with strftime:

df['Date'] = pd.to_datetime(df['id'].astype(str)+"-2015", format='%j-%Y').dt.strftime('%d-%m-%Y')

Result:

   id  val        Date
0   1   49  01-01-2015
1   2   48  02-01-2015
2   3   46  03-01-2015
3   4   45  04-01-2015

1
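from datetime import date, timedelta

df.columns = ['date', 'val']  # rename 'id' to 'date'
start = date(2015, 1, 1)
# Day number n falls (n - 1) days after 1 January 2015.
df['date'] = [(start + timedelta(days=int(n) - 1)).strftime('%d-%m-%Y')
              for n in df['date']]

This iterates through the column and converts each day number to a date string in dd-mm-YYYY format.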

1

Or like this:

df['Date'] = pd.Timestamp('2014-12-31') + df['id'].apply(lambda x: pd.Timedelta(days=x))

Output:

   id  val       Date
0   1   49 2015-01-01
1   2   48 2015-01-02
2   3   46 2015-01-03
3   4   45 2015-01-04

1 Comment

Maybe you should use x-1 to get the reference date untouched?
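
The variant suggested in this comment might look like the sketch below: the reference date stays at 1 January 2015 and each id is reduced by one instead. The sample frame is rebuilt from the rows in the question; everything else follows the answer's own approach.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({'id': [1, 2, 3, 4], 'val': [49, 48, 46, 45]})

# Keep 1 January 2015 as the visible reference date and
# subtract 1 from each day number before building the offset.
df['Date'] = pd.Timestamp('2015-01-01') + df['id'].apply(
    lambda x: pd.Timedelta(days=int(x) - 1)
)

print(df)
```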
1

You can use pd.to_timedelta() on the id column to turn its values into offsets from the base date:

df['date'] = pd.Timestamp('2015-01-01') + pd.to_timedelta(df['id'] - 1, unit='day')

Result:

print(df)

   id  val       date
0   1   49 2015-01-01
1   2   48 2015-01-02
2   3   46 2015-01-03
3   4   45 2015-01-04

If you want the date in dd-mm-YYYY format, chain .dt.strftime() onto the result:

df['date2'] = (pd.Timestamp('2015-01-01') + pd.to_timedelta(df['id'] - 1, unit='day')).dt.strftime('%d-%m-%Y')

Result:

print(df)

   id  val       date       date2
0   1   49 2015-01-01  01-01-2015
1   2   48 2015-01-02  02-01-2015
2   3   46 2015-01-03  03-01-2015
3   4   45 2015-01-04  04-01-2015
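
To tie this back to the file in the question, here is a rough end-to-end sketch that reads the data and produces the desired "date"/"val" layout. It assumes the file is whitespace-separated with quoted header names, as the excerpt suggests; the io.StringIO stand-in and the name `out` are illustrative only.

```python
import io

import pandas as pd

# Stand-in for the file from the question (assumed whitespace-separated);
# replace io.StringIO(...) with the real file path.
raw = io.StringIO('"id"    "val"\n1   49\n2   48\n3   46\n4   45\n')
df = pd.read_csv(raw, sep=r'\s+')

# Defensive: drop any quote characters that survive in the header names.
df.columns = df.columns.str.strip('"')

# Day number n is (n - 1) days after 1 January 2015.
dates = pd.Timestamp('2015-01-01') + pd.to_timedelta(df['id'] - 1, unit='D')

# Replace the id column with the formatted date strings.
out = pd.DataFrame({'date': dates.dt.strftime('%d-%m-%Y'), 'val': df['val']})
print(out.to_string(index=False))
```

From here, `out.to_csv(...)` could write the result back to disk with whatever separator the original file uses.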

0

I'm not sure about the year, since a day number alone doesn't say which year it belongs to, but you can convert it into a month and day.

Rename the id column in your CSV to Date, then:

df['Date'] = pd.to_datetime(df['Date'], format='%j').dt.strftime('%m-%d')

This converts it into month-day strings. You can then add the year manually.
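
One way the "add the year manually" step might look is sketched below, assuming the target year is 2015 as stated in the question. The sample frame and the day-month ordering (to match the requested '%d-%m-%Y') are choices made for this illustration, not part of the answer above.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({'id': [1, 2, 3, 4], 'val': [49, 48, 46, 45]})

# With only '%j' in the format, the parsed dates fall in the default year 1900.
parsed = pd.to_datetime(df['id'].astype(str), format='%j')

# Format day and month, then append the year as plain text.
df['Date'] = parsed.dt.strftime('%d-%m') + '-2015'

print(df)
```

Note that 1900 is not a leap year, so this shortcut only lines up with non-leap target years such as 2015. For a leap year, day numbers after 59 would be off by one, and putting the year into the parsed string (format '%j-%Y') avoids that.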
