I have a CSV that looks something like this:

col1, col2, col3, col4
txt,txt,error,txt
txt,txt,new,txt
txt,txt,new,txt
txt,txt,error,txt
txt,txt,new,txt
txt,txt,fix,txt

I'd like to change the order of the rows to this:

col1, col2, col3, col4
txt,txt,new,txt
txt,txt,new,txt
txt,txt,new,txt
txt,txt,fix,txt
txt,txt,error,txt
txt,txt,error,txt

so that the rows follow the order new -> fix -> error in col3.

So far I have tried different things with:

import pandas as pd
csv_dataframe = pd.read_csv(user_submitted_csv_file)
csv_dataframe = csv_dataframe.sort_values(by=['col3'])

But that is not enough, since the order I need is neither alphabetical nor ascending/descending. I also tried extracting the rows, deleting all rows, and adding them back in the correct order, but I am running into problems with that too.

1 Answer

Since pandas 1.1.0, you can pass a key argument to the .sort_values method: a function (for example a lambda) that receives the column as a Series and returns the values to sort by. This lets you define whatever custom order you prefer.

To do that, define a dictionary that maps each value to its position in the desired order:

custom_dict = {'new': 0, 'fix': 1, 'error': 2}
# sort_values returns a new DataFrame, so assign the result
df = df.sort_values(by=['col3'], key=lambda x: x.map(custom_dict))
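
For completeness, here is a minimal end-to-end sketch of the approach above, run on the sample rows from the question. The in-memory csv_text is a hypothetical stand-in for user_submitted_csv_file, and its header is written without spaces after the commas so that the column is named exactly col3; the reset_index call is optional and only renumbers the rows.

```python
import io

import pandas as pd

# Hypothetical stand-in for the uploaded file, using the rows from the question.
csv_text = """col1,col2,col3,col4
txt,txt,error,txt
txt,txt,new,txt
txt,txt,new,txt
txt,txt,error,txt
txt,txt,new,txt
txt,txt,fix,txt
"""

csv_dataframe = pd.read_csv(io.StringIO(csv_text))

# Map each value of col3 to its rank in the desired order.
custom_dict = {'new': 0, 'fix': 1, 'error': 2}

# key receives the whole column as a Series and must return a Series
# of the same length; the rows are then sorted by the mapped ranks.
sorted_df = csv_dataframe.sort_values(
    by=['col3'],
    key=lambda column: column.map(custom_dict),
).reset_index(drop=True)  # optional: renumber rows 0..n-1

print(sorted_df['col3'].tolist())
# ['new', 'new', 'new', 'fix', 'error', 'error']
```

If the result needs to go back to disk, sorted_df.to_csv(path, index=False) writes it without the index column.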

1 Comment

This answer is amazing. It is so much easier than using Categoricals. It also works on an Index (through the key argument of sort_index), and any values not in custom_dict are mapped to NaN, so by default they end up after the mapped values.
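
The points in this comment can be checked with a small sketch. The DataFrame below is made up for illustration (the value 'other' is deliberately missing from custom_dict), and the Categorical variant is shown only as the alternative the comment alludes to, not as something the answer itself uses.

```python
import pandas as pd

# Illustrative data; 'other' is intentionally absent from custom_dict.
df = pd.DataFrame({'col3': ['error', 'other', 'new', 'fix', 'new']})
custom_dict = {'new': 0, 'fix': 1, 'error': 2}

# Unmapped values become NaN under .map and, with the default
# na_position='last', are placed after all mapped values.
by_key = df.sort_values(by='col3', key=lambda s: s.map(custom_dict))
print(by_key['col3'].tolist())
# ['new', 'new', 'fix', 'error', 'other']

# The same idea applied to an Index via sort_index(key=...).
by_index = df.set_index('col3').sort_index(key=lambda idx: idx.map(custom_dict))
print(by_index.index.tolist())

# Alternative: an ordered Categorical. Values outside `categories`
# are replaced by NaN in the column itself, so 'other' is lost here.
order = ['new', 'fix', 'error']
as_cat = df.assign(col3=pd.Categorical(df['col3'], categories=order, ordered=True))
by_cat = as_cat.sort_values(by='col3')
print(by_cat['col3'].tolist()[:4])
# ['new', 'new', 'fix', 'error']
```

One practical difference: the key-based sort leaves the original strings untouched, whereas converting to a Categorical changes the column's dtype and turns unlisted values into missing values.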
