2

How can I sort only specific columns of a pandas DataFrame by name? My dataframe columns look like this:

+-------+-------+-----+------+------+----------+
|movieId| title |drama|horror|action|  comedy  |
+-------+-------+-----+------+------+----------+
|                                              |
+-------+-------+-----+------+------+----------+

I would like to sort only the columns listed in columns = ['drama','horror','sci-fi','comedy'], leaving movieId and title in place, so that I get the following dataframe:

+-------+-------+------+------+------+----------+
|movieId| title |action|comedy|drama |  horror  |
+-------+-------+------+------+------+----------+
|                                               |
+-------+-------+------+------+------+----------+

I tried df = df.sort_index(axis=1) but it sorts all columns:

+-------+-------+------+------+-------+----------+
|action | comedy|drama |horror|movieId|  title   |
+-------+-------+------+------+-------+----------+
|                                                |
+-------+-------+------+------+-------+----------+
2
  • I think your first example and second example are accidentally swapped Commented Aug 31, 2020 at 9:34
  • Does this answer your question? Set order of columns in pandas dataframe Commented Oct 10, 2023 at 10:29

3 Answers

1

You can explicitly rearrange the columns like so:

df[['movieId','title','drama','horror','sci-fi','comedy']]

If you have a lot of columns to sort alphabetically, keep the two leading columns fixed and sort the rest:

import numpy as np
df[np.concatenate([['movieId','title'],df.drop(['movieId','title'],axis=1).columns.sort_values()])]
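
A minimal sketch of the same idea without hardcoding the genre names. The helper name sort_columns_except and the sample data are made up for illustration; only the list of columns to keep in front is supplied by hand.

```python
import pandas as pd

def sort_columns_except(df, fixed):
    """Keep the `fixed` columns first, in the given order, then the rest sorted by name."""
    rest = sorted(c for c in df.columns if c not in fixed)
    return df[list(fixed) + rest]

# Hypothetical sample data with the column layout from the question.
df = pd.DataFrame({
    "movieId": [1, 2],
    "title": ["A", "B"],
    "drama": [1, 0],
    "horror": [0, 1],
    "action": [1, 1],
    "comedy": [0, 0],
})

result = sort_columns_except(df, ["movieId", "title"])
print(result.columns.tolist())
# ['movieId', 'title', 'action', 'comedy', 'drama', 'horror']
```

This returns a new DataFrame and leaves df itself unchanged.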

1 Comment

Not very practical for a larger amount of cols. Perhaps suggest a not hardcoded method to sort the genres?
1

Sort all columns after the second one and prepend the first 2 columns:

c = df.columns[:2].tolist() + sorted(df.columns[2:].tolist())
print(c)
['movieId', 'title', 'action', 'comedy', 'drama', 'horror']

Finally, change the order of the columns using this list:

df1 = df[c]

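Another idea is to use DataFrame.sort_index only on the columns after the first 2, selected by DataFrame.iloc, and join the result back to the first 2 columns with concat (assigning the sorted slice back through iloc would not rename the columns):

df1 = pd.concat([df.iloc[:, :2], df.iloc[:, 2:].sort_index(axis=1)], axis=1)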
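
If the columns to sort are known by name rather than by position, as in the question, something like the following might work. It is only a sketch: the function name sort_named_columns and the sample data are hypothetical, and it assumes the column names are unique. Columns not in the list stay where they are.

```python
import pandas as pd

def sort_named_columns(df, names):
    """Reorder only the columns listed in `names` alphabetically.

    Columns that are not listed keep their positions; listed names
    that do not exist in df are ignored.
    """
    wanted = set(names)
    present = [c for c in df.columns if c in wanted]
    replacements = iter(sorted(present))
    # Walk the original order and fill each "sortable" slot with the next sorted name.
    new_order = [next(replacements) if c in wanted else c for c in df.columns]
    return df[new_order]

# Hypothetical sample data with the column layout from the question.
df = pd.DataFrame({
    "movieId": [1, 2],
    "title": ["A", "B"],
    "drama": [1, 0],
    "horror": [0, 1],
    "action": [1, 1],
    "comedy": [0, 0],
})

result = sort_named_columns(df, ["drama", "horror", "action", "comedy"])
print(result.columns.tolist())
# ['movieId', 'title', 'action', 'comedy', 'drama', 'horror']
```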


0

Another way would be to set movieId and title as the index of the DataFrame and then sort the remaining columns by name.

df.set_index(['movieId', 'title'], inplace=True)
df.sort_index(axis=1, inplace=True)
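
To get movieId and title back as ordinary columns afterwards, reset_index can be chained on. A small sketch with made-up sample data, written without inplace so the original df is not modified:

```python
import pandas as pd

# Hypothetical sample data with the column layout from the question.
df = pd.DataFrame({
    "movieId": [1, 2],
    "title": ["A", "B"],
    "drama": [1, 0],
    "horror": [0, 1],
    "action": [1, 1],
    "comedy": [0, 0],
})

out = (
    df.set_index(["movieId", "title"])  # move the fixed columns out of the way
      .sort_index(axis=1)               # sort the remaining column labels
      .reset_index()                    # restore the fixed columns at the front
)
print(out.columns.tolist())
# ['movieId', 'title', 'action', 'comedy', 'drama', 'horror']
```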

1 Comment

And then reset_index. Why not.
