I am trying to save and load a pandas DataFrame that has a MultiIndex (2 levels) for the columns. I am having trouble saving and reloading it; if possible, I want to get back exactly the same DataFrame when I reload it.
My dataframe looks like this:
> df.head()
         A                   B
        sp start  end       sp start  end
0  V5894_1   243  251  V5894_1   243  251
1  V5894_1   244  252  V5894_1   244  252
2  V5894_1   244  252  V5894_1   244  252
3  V3246_0    28   36  V3246_0    28   36
4  V3246_0    29   37  V3246_0    29   37
What I have tried so far is the regular df.to_csv("test.csv"), then loading it back with pd.read_csv("test.csv",index_col=[0,1]).
When I save it, the .csv file looks like this:
,A,A,A,B,B,B
,sp,start,end,sp,start,end
0,V5894_1,243,251,V5894_1,243,251
1,V5894_1,244,252,V5894_1,244,252
2,V5894_1,244,252,V5894_1,244,252
3,V3246_0,28,36,V3246_0,28,36
So I already feel like the structure might be a bit broken at this stage.
When I load it with the previous command, I get:
A.1 A.2 B B.1 B.2
A
NaN sp start end sp start end
0.0 V5894_1 243 251 V5894_1 243 251
1.0 V5894_1 244 252 V5894_1 244 252
2.0 V5894_1 244 252 V5894_1 244 252
3.0 V3246_0 28 36 V3246_0 28 36
As you can see, I lost my MultiIndex column structure.
I also tried to load with
pd.read_csv("test.csv",index_col=0)
But I still don't get the expected result:
           A    A.1  A.2        B    B.1  B.2
NaN       sp  start  end       sp  start  end
0.0  V5894_1    243  251  V5894_1    243  251
1.0  V5894_1    244  252  V5894_1    244  252
2.0  V5894_1    244  252  V5894_1    244  252
3.0  V3246_0     28   36  V3246_0     28   36
My questions are:
Is there a simple way to save and load it?
If not, what is the best way to restore the structure I had previously?
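Pass header=[0,1] to pd.read_csv so that the first two rows of the file are read as the two levels of the column MultiIndex. When saving, if the index is just a default range, use index=None (or index=False) in to_csv so that the index is not written to the file.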
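As a minimal sketch of that round trip, assuming a small frame shaped like the one in the question (the sample rows and the in-memory buffers are only for illustration; a real file path such as "test.csv" works the same way):

```python
import io

import pandas as pd

# Sample frame with a two-level column MultiIndex, mimicking the question.
columns = pd.MultiIndex.from_product([["A", "B"], ["sp", "start", "end"]])
df = pd.DataFrame(
    [
        ["V5894_1", 243, 251, "V5894_1", 243, 251],
        ["V5894_1", 244, 252, "V5894_1", 244, 252],
        ["V3246_0", 28, 36, "V3246_0", 28, 36],
    ],
    columns=columns,
)

# Case 1: the index is a plain range, so do not write it at all.
buffer = io.StringIO()  # stands in for a file on disk
df.to_csv(buffer, index=False)
buffer.seek(0)
# header=[0, 1] tells read_csv that the first two rows are column levels.
restored = pd.read_csv(buffer, header=[0, 1])

# Case 2: the index matters, so write it and name its column when reading.
buffer_with_index = io.StringIO()
df.to_csv(buffer_with_index)
buffer_with_index.seek(0)
restored_with_index = pd.read_csv(buffer_with_index, header=[0, 1], index_col=0)
```

In both cases the columns come back as a MultiIndex instead of the flattened A, A.1, A.2 labels. If an exact round trip including dtypes is the main goal, a binary format such as df.to_pickle / pd.read_pickle may be simpler than CSV.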