0

I am trying to pivot some data in pandas using pivot_table, but I have a specific, bespoke order that I want my columns returned in, determined by a Setting_Order field which is already in the dataframe. For a test example:


import pandas as pd

raw_data = {'Support_Reason' : ['LD', 'Mental Health', 'LD', 'Mental Health', 'LD', 'Physical', 'LD'],
            'Setting' : ['Nursing', 'Nursing', 'Residential', 'Residential', 'Community', 'Prison', 'Residential'],
            'Setting_Order' : [1, 1, 2, 2, 3, 4, 2],
            'Patient_ID' : [6789, 1234, 4567, 5678, 7890, 1235, 3456]}

Data = pd.DataFrame(raw_data, columns = ['Support_Reason', 'Setting', 'Setting_Order', 'Patient_ID'])

Data

Then pivot:

pivot = pd.pivot_table(Data, values='Patient_ID', index=['Support_Reason'],
                       columns=['Setting'], aggfunc='count', dropna=False)
pivot = pivot.reset_index()

pivot

This is exactly how I want my table to look, except that the columns have defaulted to A-Z ordering. I would like them to be in ascending order of the Setting_Order column, so that would be Nursing, Residential, Community, then Prison. Is there some additional syntax that I could add to my pd.pivot_table code that would make this possible, please?

I realise there are a few different work-arounds for this, the simplest being re-ordering the columns afterwards(!) but I want to avoid having to hard-code column names as these will change over time (both the headings and their order) and the Setting and Setting_Order fields will be managed in a separate reference table. So any form of answer that will avoid having to list Settings in code would be ideal really.

2
  • Quick remark: When you create the dataframe with Data = pd.DataFrame(raw_data, columns = ['Support_Reason', 'Setting', 'Setting_Order', 'Patient_ID']), you don't have to specify the column names, as they are already included in the dictionary raw_data. In this way, you can avoid hard-coding the column names at that place. Commented Apr 1, 2022 at 14:19
  • Thanks Flursch - this is just my lack of expertise showing. The real-world example is imported from a flat-file csv anyway Commented Apr 1, 2022 at 14:26

2 Answers

2

Try:

ordered = Data.sort_values("Setting_Order")["Setting"].drop_duplicates().tolist()
pivot = pivot[list(pivot.columns.difference(ordered))+ordered]
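For reference, here is a minimal end-to-end sketch of this approach using the sample data from the question. It assumes the question's frame is named Data, and `leading` is a name introduced here purely for illustration. It keeps any non-Setting columns (such as Support_Reason) at the front in their existing order, whereas columns.difference would return them sorted alphabetically, then appends the Setting columns ordered by Setting_Order:

```python
import pandas as pd

# Sample data copied from the question.
raw_data = {
    'Support_Reason': ['LD', 'Mental Health', 'LD', 'Mental Health', 'LD', 'Physical', 'LD'],
    'Setting': ['Nursing', 'Nursing', 'Residential', 'Residential', 'Community', 'Prison', 'Residential'],
    'Setting_Order': [1, 1, 2, 2, 3, 4, 2],
    'Patient_ID': [6789, 1234, 4567, 5678, 7890, 1235, 3456],
}
Data = pd.DataFrame(raw_data)

# Pivot exactly as in the question; columns come back in A-Z order.
pivot = pd.pivot_table(Data, values='Patient_ID', index=['Support_Reason'],
                       columns=['Setting'], aggfunc='count', dropna=False)
pivot = pivot.reset_index()

# Derive the column order from the data itself, so no Setting names are hard-coded.
ordered = Data.sort_values('Setting_Order')['Setting'].drop_duplicates().tolist()

# Everything that is not a Setting column (here just Support_Reason) stays in front.
leading = [c for c in pivot.columns if c not in ordered]
pivot = pivot[leading + ordered]

print(pivot)
```

If the settings or their order change in the source data, the column order follows without any code changes.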

2 Comments

Thanks @not_speshal - this works neatly like Flursch's example below, but also means the Support_Reason field is missing. It's vital that this field remains so the matrix type format makes sense
@nnn1234 - See the edited answer.
1
col_order = list(Data.sort_values('Setting_Order')['Setting'].unique())
pivot[['Support_Reason'] + col_order]

Does this help?
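If the ordering ends up living in a separate reference table, as the question mentions, one possible variation is to reindex the pivot's columns before calling reset_index. This is only a sketch: `setting_ref` is a hypothetical stand-in for that reference table, and it assumes the table uses the same Setting and Setting_Order column names.

```python
import pandas as pd

# Sample data from the question, without the Setting_Order column.
Data = pd.DataFrame({
    'Support_Reason': ['LD', 'Mental Health', 'LD', 'Mental Health', 'LD', 'Physical', 'LD'],
    'Setting': ['Nursing', 'Nursing', 'Residential', 'Residential', 'Community', 'Prison', 'Residential'],
    'Patient_ID': [6789, 1234, 4567, 5678, 7890, 1235, 3456],
})

# Hypothetical reference table holding the display order (an assumption, not from the question's code).
setting_ref = pd.DataFrame({
    'Setting': ['Community', 'Nursing', 'Prison', 'Residential'],
    'Setting_Order': [3, 1, 4, 2],
})

# Column order is read from the reference table rather than listed in code.
col_order = setting_ref.sort_values('Setting_Order')['Setting'].tolist()

pivot = (
    pd.pivot_table(Data, values='Patient_ID', index='Support_Reason',
                   columns='Setting', aggfunc='count', dropna=False)
    .reindex(columns=col_order)   # reorder the Setting columns
    .reset_index()                # Support_Reason becomes the first column
)

print(pivot)
```

Because the reorder happens while Support_Reason is still the index, there is no need to add it back by name. One thing to be aware of: reindex would also create an all-NaN column for any setting that is in the reference table but absent from the data.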

2 Comments

This certainly provides the correct column order in the Pivot dataframe thanks @Flursch, although it's lost the Support_Reason field which is also crucial
@nnn1234 I have edited the code in my answer to also include the Support_Reason column.
