I have an Excel sheet with 4 columns: Filename, SNR, Dynamic Range, and Level.

Filename SNR Dynamic Range Level
1___SLATE_FPGA_BESBEV_TX_AMIC_9.6MHz_Normal_IN1_G0_0_HQ_DEC0_FS8_HPOF.xlsx 5 11 8
19___SLATE_FPGA_BESBEV_TX_AMIC_9.6MHz_Normal_IN1_G0_0_HQ_DEC0_FS32_HPOF.xlsx 15 31 23
10___SLATE_FPGA_BESBEV_TX_AMIC_9.6MHz_Normal_IN1_G0_0_HQ_DEC0_FS16_HPOF.xlsx 10 21 24
28___SLATE_FPGA_BESBEV_TX_AMIC_9.6MHz_Normal_IN1_G0_0_HQ_DEC0_FS48_HPOF.xlsx 20 41 23
37___SLATE_FPGA_BESBEV_TX_AMIC_9.6MHz_Normal_IN1_G0_0_HQ_DEC0_FS8_HP4.xlsx 25 51 12

I need to reorder the rows by the first column of the table, Filename, so that the number following FS (e.g. FS8, FS16, FS32) goes from least to greatest, i.e.

Filename SNR Dynamic Range Level
1___SLATE_FPGA_BESBEV_TX_AMIC_9.6MHz_Normal_IN1_G0_0_HQ_DEC0_FS8_HPOF.xlsx 5 11 8
37___SLATE_FPGA_BESBEV_TX_AMIC_9.6MHz_Normal_IN1_G0_0_HQ_DEC0_FS8_HP4.xlsx 25 51 12
10___SLATE_FPGA_BESBEV_TX_AMIC_9.6MHz_Normal_IN1_G0_0_HQ_DEC0_FS16_HPOF.xlsx 10 21 24
19___SLATE_FPGA_BESBEV_TX_AMIC_9.6MHz_Normal_IN1_G0_0_HQ_DEC0_FS32_HPOF.xlsx 15 31 23
28___SLATE_FPGA_BESBEV_TX_AMIC_9.6MHz_Normal_IN1_G0_0_HQ_DEC0_FS48_HPOF.xlsx 20 41 23

I don't want to change the actual Excel file. I was hoping to use pandas because I am doing some other manipulation later on.

I tried this

df.sort_values(by='Xls Filename', key=lambda col: col.str.contains('_FS'),ascending=True)

but it didn't work.

Thank you in advance!

1 Answer

Extract the pattern, find the sort index using argsort and then sort with the sort index:

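# extract the number to sort by into a Series
# (raw string so \d and \w reach the regex engine unchanged)
fs = df.Filename.str.extract(r'FS(\d+)_\w+\.xlsx$', expand=False)

# find the sort positions using `argsort` and reorder the data frame with them;
# `argsort` returns positions, hence `iloc`, and kind='stable' keeps ties in their original order
df.iloc[fs.astype(int).to_numpy().argsort(kind='stable')]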

#                                                                       Filename  ...  Level
#0    1___SLATE_FPGA_BESBEV_TX_AMIC_9.6MHz_Normal_IN1_G0_0_HQ_DEC0_FS8_HPOF.xlsx  ...      8
#4    37___SLATE_FPGA_BESBEV_TX_AMIC_9.6MHz_Normal_IN1_G0_0_HQ_DEC0_FS8_HP4.xlsx  ...     12
#2  10___SLATE_FPGA_BESBEV_TX_AMIC_9.6MHz_Normal_IN1_G0_0_HQ_DEC0_FS16_HPOF.xlsx  ...     24
#1  19___SLATE_FPGA_BESBEV_TX_AMIC_9.6MHz_Normal_IN1_G0_0_HQ_DEC0_FS32_HPOF.xlsx  ...     23
#3  28___SLATE_FPGA_BESBEV_TX_AMIC_9.6MHz_Normal_IN1_G0_0_HQ_DEC0_FS48_HPOF.xlsx  ...     23

The regex FS(\d+)_\w+\.xlsx$ captures the digits that immediately follow FS and precede _\w+\.xlsx at the end of the filename.
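As a quick check of what the pattern captures, here is a minimal sketch on a few shortened filenames. The names below are made up for illustration and are not the ones from the question's sheet; the last one deliberately has no FS part.

```python
import pandas as pd

# Hypothetical, shortened filenames used only to illustrate the regex
names = pd.Series([
    "1___DEC0_FS8_HPOF.xlsx",
    "19___DEC0_FS32_HPOF.xlsx",
    "37___DEC0_FS8_HP4.xlsx",
    "notes.xlsx",  # no FS<digits> part, so there is nothing to capture
])

# expand=False returns a Series holding the single capture group
fs = names.str.extract(r"FS(\d+)_\w+\.xlsx$", expand=False)
print(fs.tolist())  # ['8', '32', '8', nan]
```

The captured values are strings, which is why they are converted to a numeric type before sorting; otherwise '16' would sort before '8'.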


If some filenames might not match the pattern, convert to float instead of int, because unmatched rows come back as NaN (NumPy's argsort places them last):

df.iloc[fs.astype(float).to_numpy().argsort(kind='stable')]
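The attempt in the question can also be made to work. `str.contains('_FS')` is True for every row here, so it gives `sort_values` nothing to order by; the key has to return the number itself. Below is a sketch of that alternative, assuming pandas 1.1 or later (where `sort_values` accepts `key`) and a column named `Filename`. The shortened filenames are placeholders that keep only the FS part and the Level values from the question, and `fs_number` is a helper name introduced here.

```python
import pandas as pd

# Shortened stand-ins for the question's filenames; only the FS part matters here
df = pd.DataFrame({
    "Filename": [
        "1___DEC0_FS8_HPOF.xlsx",
        "19___DEC0_FS32_HPOF.xlsx",
        "10___DEC0_FS16_HPOF.xlsx",
        "28___DEC0_FS48_HPOF.xlsx",
        "37___DEC0_FS8_HP4.xlsx",
    ],
    "Level": [8, 23, 24, 23, 12],
})

def fs_number(col: pd.Series) -> pd.Series:
    """Map each filename to the number after FS (NaN if there is none)."""
    return col.str.extract(r"FS(\d+)_\w+\.xlsx$", expand=False).astype(float)

# kind="stable" keeps rows with equal FS values in their original order;
# na_position="last" puts filenames that do not match at the end
result = df.sort_values(by="Filename", key=fs_number, kind="stable", na_position="last")
print(result.index.tolist())  # [0, 4, 2, 1, 3]
```

`sort_values` returns a new DataFrame, so neither `df` nor the Excel file it was read from is modified.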

1 Comment

Great. Glad it helps!
