I have the following dataframe. It has two row index levels (samples and epochs) and two column index levels (kpi and model).

kpi            Accuracy             Precision            Recall             Training time (sec)                 Model memory (MB)               HE Memory (GB)         
model                M0    M1    M2        M0   M1   M2      M0    M1    M2                  M0      M1      M2                M0     M1     M2             M0       M1
samples epochs                                                                                                                                                         
675     3          0.96  0.52  1.00       1.0  0.0  1.0  0.9166  0.00  1.00              0.2124  0.2083  0.2080             0.417  0.417  0.417       0.553547   6.2009
        4          0.96  0.52  1.00       1.0  0.0  1.0  0.9166  0.00  1.00              0.2066  0.2123  0.2137             0.417  0.417  0.417       0.553547   6.2009
1950    3          0.98  0.96  0.98       1.0  1.0  1.0  0.9600  0.92  0.96              0.2132  0.2139  0.2136             0.417  0.417  0.417       1.664447  12.3319
        4          0.98  0.90  0.98       1.0  1.0  1.0  0.9600  0.80  0.96              0.2064  0.2166  0.2152             0.417  0.417  0.417       1.664447  12.3319

This is the code that builds it:

from itertools import zip_longest

import numpy as np
import pandas as pd

# shape_ind, epoch_ind, kpi_values and flatten_list are not shown here
tuples = list(zip_longest(shape_ind, epoch_ind))
flat_list = flatten_list(kpi_values)
df = pd.DataFrame(np.reshape(flat_list, (len(kpi_values), -1)))
df.index = pd.MultiIndex.from_tuples(tuples, names=['samples', 'epochs'])

# Split each flat column position into (kpi number, model number)
df.columns = pd.MultiIndex.from_arrays(np.divmod(df.columns, len(kpi_values[0][0])), names=['kpi', 'model'])

# Label the model level M0, M1, ...
df.rename(lambda x: f'M{x}', axis=1, level=1, inplace=True)

kpi = ['Accuracy', 'Precision', 'Recall', 'Training time (sec)', 'Model memory (MB)', 'HE Memory (GB)', 'HE gen. time (sec)']

# Replace the kpi numbers with their names
df.rename(mapper=lambda x: kpi[x], axis=1, level=0, inplace=True)

print(df)

I want to rename just the last two columns and regroup them, so that this:

HE Memory (GB)
M0         M1
0.553547   6.2009
0.553547   6.2009
1.664447  12.3319
1.664447  12.3319

becomes this:

HE Memory (GB)  HE gen. time (sec)
                                   <--- note how M0 and M1 are gone
0.553547        6.2009
0.553547        6.2009
1.664447        12.3319
1.664447        12.3319

How can I achieve this while retaining the structure of the original dataframe?

2 Answers

1

I ended up with a solution like this:

model_kpi = ['ACC', 'PRC', 'REC', 'TR_T', 'MM']  # KPIs reported per model
he_kpi = ['HE_M', 'HE_GEN_T']  # HE KPIs, which get an empty model label
n_models = len(kpi_values[0][0])
# Level 0: each model KPI repeated once per model, then the HE KPIs
kpi = [item for item in model_kpi for _ in range(n_models)] + he_kpi
# Level 1: M0, M1, ... under each model KPI, '' under each HE KPI
model = [f'M{i}' for i in range(n_models)] * len(model_kpi) + [''] * len(he_kpi)
col_ind = list(zip(kpi, model))
row_ind = list(zip_longest(shape_ind, epoch_ind))
flat_list = flatten_list(kpi_values)
df = pd.DataFrame(np.reshape(flat_list, (len(kpi_values), -1)))
df.index = pd.MultiIndex.from_tuples(row_ind, names=['samples', 'epochs'])
df.columns = pd.MultiIndex.from_tuples(col_ind, names=['kpi', 'model'])
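
Since kpi_values, shape_ind, epoch_ind and flatten_list are not shown, here is a self-contained sketch of the same idea on made-up data. The two model KPIs, the two-model count and the np.arange values are assumptions chosen only to keep the example small; the point is that the last two columns carry an empty string on the model level while the column index keeps both levels.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 2 per-model KPIs x 2 models, plus 2 HE columns.
model_kpi = ['ACC', 'PRC']
he_kpi = ['HE_M', 'HE_GEN_T']
n_models = 2

# Level 0 ('kpi'): each model KPI repeated once per model, then the HE KPIs.
kpi = [k for k in model_kpi for _ in range(n_models)] + he_kpi
# Level 1 ('model'): M0, M1 under each model KPI, '' under each HE KPI.
model = [f'M{i}' for i in range(n_models)] * len(model_kpi) + [''] * len(he_kpi)

values = np.arange(12, dtype=float).reshape(2, 6)  # placeholder numbers
df = pd.DataFrame(values)
df.index = pd.MultiIndex.from_tuples([(675, 3), (675, 4)], names=['samples', 'epochs'])
df.columns = pd.MultiIndex.from_tuples(list(zip(kpi, model)), names=['kpi', 'model'])

print(df)
```

The HE columns can then be addressed by their full key, for example df[('HE_M', '')], and the per-model columns still work as before, for example df[('ACC', 'M1')].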

0

You can try the droplevel method: https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.droplevel.html

df.droplevel(1) 

should do the trick.
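
One thing worth checking before relying on this: as far as I know, DataFrame.droplevel works on the row index unless axis=1 is passed, and with axis=1 it removes the level from every column, not just the last two. A small sketch on made-up data (the two-column frame below is an assumption for illustration, not the asker's full table):

```python
import pandas as pd

rows = pd.MultiIndex.from_tuples([(675, 3), (675, 4)], names=['samples', 'epochs'])
cols = pd.MultiIndex.from_tuples([('Accuracy', 'M0'), ('Accuracy', 'M1')], names=['kpi', 'model'])
df = pd.DataFrame([[0.96, 0.52], [0.96, 0.52]], index=rows, columns=cols)

# Default axis=0: level 1 of the ROW index ('epochs') is dropped.
by_rows = df.droplevel(1)

# axis=1: level 1 of the COLUMN index ('model') is dropped for all columns.
by_cols = df.droplevel(1, axis=1)

print(by_rows.index.names, by_rows.columns.nlevels)
print(list(by_cols.columns), by_cols.index.nlevels)
```

Either way the original df is left unchanged, because droplevel returns a new frame.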

2 Comments

This would remove the complete index, wouldn't it? I want to retain the indices as in the original dataframe and change it only for the last 2 columns as per my question
Hi there, I am not sure whether I get what you want. A multi-index, or any index, is always consistent throughout a dataframe. So you can "extract" the 2 columns by subsetting and dropping the MultiIndex, but that would create a new object. If it is really just a matter of a pretty printout, None could be an interesting column name to try.
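
A rough sketch of the "subset into a new object" route described in this comment, using two rows of values from the question. The name he is hypothetical, and only four of the original columns are reproduced to keep it short.

```python
import pandas as pd

rows = pd.MultiIndex.from_tuples([(675, 3), (1950, 3)], names=['samples', 'epochs'])
cols = pd.MultiIndex.from_tuples(
    [('Accuracy', 'M0'), ('Accuracy', 'M1'),
     ('HE Memory (GB)', 'M0'), ('HE Memory (GB)', 'M1')],
    names=['kpi', 'model'],
)
df = pd.DataFrame(
    [[0.96, 0.52, 0.553547, 6.2009],
     [0.98, 0.96, 1.664447, 12.3319]],
    index=rows, columns=cols,
)

# Selecting the top-level key returns a new frame whose columns are only
# the 'model' labels (M0, M1); the row MultiIndex is kept.
he = df['HE Memory (GB)'].copy()

# Give the two columns flat names; df itself keeps its two column levels.
he.columns = ['HE Memory (GB)', 'HE gen. time (sec)']

print(he)
```

This gives the flat two-column layout from the question, but as a separate frame next to df, not as part of it.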
