
I have a date column in a pandas DataFrame. I want to number the rows by date order, as in the "indexed" column below, where equal dates share the same number. Is this possible in Python?

date     indexed
2007-02-21  3
2007-02-18  1
2007-02-24  5
2007-02-18  1
2007-02-23  4
2007-02-20  2
2007-02-23  4

I searched for "indexation", but I guess that is the wrong term to search for. Please advise.

Edit

Actually, I want to replace the dates with the equivalent index numbers.

  • Yep, you need to sort them by date, then index them all using a simple loop. Commented Sep 16, 2017 at 19:17
  • Possible duplicate of Update index after sorting data-frame Commented Sep 16, 2017 at 19:19
  • df.sort_values(by='Date') Commented Sep 16, 2017 at 19:28

3 Answers


If I understand correctly, you want to use the pd.factorize() function:

In [190]: df['new'] = pd.factorize(df['date'], sort=True)[0] + 1

In [191]: df
Out[191]:
        date  indexed  new
0 2007-02-21        3    3
1 2007-02-18        1    1
2 2007-02-24        5    5
3 2007-02-18        1    1
4 2007-02-23        4    4
5 2007-02-20        2    2
6 2007-02-23        4    4

PS: pd.factorize() starts counting from 0, so I've added 1 to match your desired result. With sort=True, the codes follow the sorted order of the unique dates.
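For anyone who wants to try this outside an IPython session, here is a minimal self-contained sketch. It assumes the dates are parsed with pd.to_datetime, and it adds a dense rank as an alternative that should give the same numbering; the column names "new" and "rank" are only illustrative.

```python
import pandas as pd

# Sample dates from the question, parsed into datetime64 values.
df = pd.DataFrame({"date": pd.to_datetime([
    "2007-02-21", "2007-02-18", "2007-02-24", "2007-02-18",
    "2007-02-23", "2007-02-20", "2007-02-23",
])})

# factorize returns (codes, uniques); with sort=True the codes follow
# the sorted order of the unique dates and start at 0, hence the + 1.
codes, uniques = pd.factorize(df["date"], sort=True)
df["new"] = codes + 1

# Alternative: a dense rank gives equal dates the same number and
# leaves no gaps between consecutive numbers.
df["rank"] = df["date"].rank(method="dense").astype(int)

print(df)
```

Both columns should match the "indexed" column in the question. To replace the dates outright, as the edit asks, the result could be assigned back to df["date"] instead of a new column.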


3 Comments

Thanks a lot. Why do we add 1 here? Kindly clarify.
@DoubtDhanabalu, pd.factorize() starts from 0, so I've added 1 to match your desired result.
OK, I got it. Thanks a lot. I am accepting this answer. Thanks again.

What you are looking for is sort_values by date:

df = pd.DataFrame(["2007-02-21","2007-02-18","2007-02-24","2007-02-18","2007-02-23","2007-02-20","2007-02-23"],columns=["date"])


df.sort_values("date", axis=0)

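Note that sort_values only reorders the rows; it does not assign the numbers asked for in the question. As a rough sketch of how the sorted frame could be turned into those numbers (the names sorted_df and result are made up for illustration), one option is to start a new group each time the date changes:

```python
import pandas as pd

df = pd.DataFrame(
    ["2007-02-21", "2007-02-18", "2007-02-24", "2007-02-18",
     "2007-02-23", "2007-02-20", "2007-02-23"],
    columns=["date"],
)
df["date"] = pd.to_datetime(df["date"])

# Sort chronologically; the original row labels are kept.
sorted_df = df.sort_values("date")

# A row opens a new group when its date differs from the previous row's.
# The cumulative sum of those flags numbers the groups 1, 2, 3, ...
changed = sorted_df["date"] != sorted_df["date"].shift()
sorted_df["indexed"] = changed.cumsum()

# Restore the original row order using the preserved labels.
result = sorted_df.sort_index()
print(result)
```

This does by vectorized operations what the "sort, then loop" comment suggests; pd.factorize is shorter when only the numbers are needed.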


Using pandas.DataFrame.sort_index

import pandas as pd

df = pd.DataFrame(
    ['2007-02-21', '2007-02-18', '2007-02-24', '2007-02-18',
     '2007-02-23', '2007-02-20', '2007-02-23'],
    index=[3, 1, 5, 1, 4, 2, 4],
    columns=['Date'])

print(df)
         Date
3  2007-02-21
1  2007-02-18
5  2007-02-24
1  2007-02-18
4  2007-02-23
2  2007-02-20
4  2007-02-23


df2 = df.sort_index(axis=0)
print(df2)

         Date
1  2007-02-18
1  2007-02-18
2  2007-02-20
3  2007-02-21
4  2007-02-23
4  2007-02-23
5  2007-02-24
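The example above types the index values in by hand. As a hedged sketch, assuming the dates are ISO-formatted strings (which sort in chronological order), the same index could instead be derived from the Date column and then sorted:

```python
import pandas as pd

df = pd.DataFrame({"Date": [
    "2007-02-21", "2007-02-18", "2007-02-24", "2007-02-18",
    "2007-02-23", "2007-02-20", "2007-02-23",
]})

# Derive the index from the dates instead of typing it in by hand.
# ISO "YYYY-MM-DD" strings sort lexicographically in date order.
df.index = pd.factorize(df["Date"], sort=True)[0] + 1

# Sorting by this index now also sorts the rows by date.
df2 = df.sort_index()
print(df2)
```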
