
I am trying to save and load a pandas DataFrame whose columns are a MultiIndex (2 levels). I am having trouble with the round trip: ideally I want to get exactly the same DataFrame back when I reload it.

My dataframe looks like this:

> df.head()
         A                   B
        sp start  end       sp start  end
0  V5894_1   243  251  V5894_1   243  251
1  V5894_1   244  252  V5894_1   244  252
2  V5894_1   244  252  V5894_1   244  252
3  V3246_0    28   36  V3246_0    28   36
4  V3246_0    29   37  V3246_0    29   37

What I have tried so far is the regular df.to_csv("test.csv"), then loading it afterwards with pd.read_csv("test.csv", index_col=[0,1]).

When I save it, the .csv file looks like this:

,A,A,A,B,B,B
,sp,start,end,sp,start,end
0,V5894_1,243,251,V5894_1,243,251
1,V5894_1,244,252,V5894_1,244,252
2,V5894_1,244,252,V5894_1,244,252
3,V3246_0,28,36,V3246_0,28,36

So I already feel like the structure might be a bit broken at this point.

When I load it with the previous command, I get:

                   A.1  A.2        B    B.1  B.2
        A
NaN     sp       start  end       sp  start  end
0.0     V5894_1    243  251  V5894_1    243  251
1.0     V5894_1    244  252  V5894_1    244  252
2.0     V5894_1    244  252  V5894_1    244  252
3.0     V3246_0     28   36  V3246_0     28   36

As you can see, I lost my MultiIndex column structure.

I also tried to load with

pd.read_csv("test.csv",index_col=0)

But I still don't get the expected result:

           A    A.1  A.2        B    B.1  B.2
NaN       sp  start  end       sp  start  end
0.0  V5894_1    243  251  V5894_1    243  251
1.0  V5894_1    244  252  V5894_1    244  252
2.0  V5894_1    244  252  V5894_1    244  252
3.0  V3246_0     28   36  V3246_0     28   36

My questions are:

  • Is there a simple way to save and load it?

  • If not, what is the best way to restore the structure I had previously?

  • You are supposed to use header=[0, 1]. When saving, if the index is just a range, use index=None. Commented Jun 24, 2019 at 4:49
  • Wow, that was it, so simple. Thanks!! Commented Jun 24, 2019 at 4:50

1 Answer

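import pandas as pd
df.to_csv("test.csv", index=False)  # do not write the default range index
df1 = pd.read_csv("test.csv", header=[0, 1])  # first two rows become the column MultiIndex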

Gives back:

         A                   B
        sp start  end       sp start  end
0  V5894_1   243  251  V5894_1   243  251
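
For completeness, here is a minimal round-trip sketch covering both situations mentioned in the comment: a plain range index that is not written, and an index that is written and read back with index_col=0. The sample rows are copied from the question; the in-memory buffers and the names restored and restored_with_index are only illustrative stand-ins for a real file such as test.csv.

```python
import io
import pandas as pd

# Rebuild a small frame shaped like the one in the question.
columns = pd.MultiIndex.from_product([["A", "B"], ["sp", "start", "end"]])
df = pd.DataFrame(
    [
        ["V5894_1", 243, 251, "V5894_1", 243, 251],
        ["V5894_1", 244, 252, "V5894_1", 244, 252],
        ["V3246_0", 28, 36, "V3246_0", 28, 36],
    ],
    columns=columns,
)

# Case 1: the index is a plain range, so it is not written at all.
buf = io.StringIO()
df.to_csv(buf, index=False)
buf.seek(0)
restored = pd.read_csv(buf, header=[0, 1])

# Case 2: the index is written, so it is read back from the first column.
buf_with_index = io.StringIO()
df.to_csv(buf_with_index)
buf_with_index.seek(0)
restored_with_index = pd.read_csv(buf_with_index, header=[0, 1], index_col=0)

# Both reloads should carry the same two-level column labels as the original.
assert restored.columns.tolist() == df.columns.tolist()
assert restored_with_index.columns.tolist() == df.columns.tolist()
```

In both cases header=[0, 1] is what tells read_csv to treat the first two lines of the file as the two column levels. Without it, only the first line is used as the header and the second line ends up as a data row, which is what the outputs in the question show.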