0

I want to reorder the rows of my dataframe based on a custom order over multiple columns.

Say, I have the following df:

import pandas as pd
df = pd.DataFrame.from_dict({'Name': {0: 'Tim', 1: 'Tim', 2: 'Tim', 3: 'Ari', 4: 'Ari', 5: 'Ari', 6: 'Dan', 7: 'Dan', 8: 'Dan'}, 'Subject': {0: 'Math', 1: 'Science', 2: 'History', 3: 'Math', 4: 'Science', 5: 'History', 6: 'Math', 7: 'Science', 8: 'History'}, 'Test1': {0: 10, 1: 46, 2: 54, 3: 10, 4: 83, 5: 39, 6: 10, 7: 58, 8: 10}, 'Test2': {0: 5, 1: 78, 2: 61, 3: 7, 4: 32, 5: 43, 6: 1, 7: 28, 8: 50}})

which looks like this

Name  Subject  Test1  Test2
Tim     Math     10      5
Tim  Science     46     78
Tim  History     54     61
Ari     Math     10      7
Ari  Science     83     32
Ari  History     39     43
Dan     Math     10      1
Dan  Science     58     28
Dan  History     10     50

I want to sort it by Name first according to custom order ['Dan','Tim','Ari'] and then sort it by Subject according to custom order ['Science','History','Math'].

So my final df should look like

Name  Subject  Test1  Test2
Dan  Science     58     28
Dan  History     10     50
Dan     Math     10      1
Tim  Science     46     78
Tim  History     54     61
Tim     Math     10      5
Ari  Science     83     32
Ari  History     39     43
Ari     Math     10      7

It seems like a simple thing, but I can't quite figure it out how to do it. The closest solution I could find was how to custom reorder rows according to a single column here. I want to be able to do this for multiple columns simultaneously.

2 Answers 2

1

You can represent Name and Subject as categorical variables:

names = ['Dan','Tim','Ari']
subjects = ['Science','History','Math']

df = df.astype({'Name': pd.CategoricalDtype(names, ordered=True),
                'Subject': pd.CategoricalDtype(subjects, ordered=True)})
>>> df.sort_values(['Name', 'Subject'])
  Name  Subject  Test1  Test2
7  Dan  Science     58     28
8  Dan  History     10     50
6  Dan     Math     10      1
1  Tim  Science     46     78
2  Tim  History     54     61
0  Tim     Math     10      5
4  Ari  Science     83     32
5  Ari  History     39     43
3  Ari     Math     10      7

>>> df.sort_values(['Subject', 'Name'])
  Name  Subject  Test1  Test2
7  Dan  Science     58     28
1  Tim  Science     46     78
4  Ari  Science     83     32
8  Dan  History     10     50
2  Tim  History     54     61
5  Ari  History     39     43
6  Dan     Math     10      1
0  Tim     Math     10      5
3  Ari     Math     10      7
Sign up to request clarification or add additional context in comments.

Comments

1

You can create 2 temporary columns for sorting and then drop them after you've sorted your df.

(
    df.assign(key1=df.Name.map({'Dan':0, 'Tim':1, 'Ari':2}), 
              key2=df.Subject.map({'Science':0, 'History':1, 'Math':2}))
    .sort_values(['key1', 'key2'])
    .drop(['key1', 'key2'], axis=1)
)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.