I have a multi-indexed dataframe,
>>> df
       a1      a2
       b1  b2  b1  b2
c1 d1  11  21  31  41
   d2  12  22  32  42
c2 d1  13  23  33  43
   d2  14  24  34  44
It has two levels of column headers and two levels of index. If I call df.to_csv('my_test_file.csv') directly, the file my_test_file.csv looks like this:
,,a1,a1,a2,a2
,,b1,b2,b1,b2
c1,d1,11,21,31,41
c1,d2,12,22,32,42
c2,d1,13,23,33,43
c2,d2,14,24,34,44
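However, I would like to change the layout as follows:
- remove the repeated labels in the first header level, so that each label appears only once
- drop the first index level as a column, and instead write a separate row for each first-level index label, with the other cells in that row left empty
The desired format is: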
,a1,,a2,
,b1,b2,b1,b2
c1,,,,
d1,11,21,31,41
d2,12,22,32,42
c2,,,,
d1,13,23,33,43
d2,14,24,34,44
Could you please show me how to do this? Thanks! The code below reproduces the DataFrame.
import pandas as pd

df = pd.DataFrame(
    {
        ('a1', 'b1'): [11, 12, 13, 14],
        ('a1', 'b2'): [21, 22, 23, 24],
        ('a2', 'b1'): [31, 32, 33, 34],
        ('a2', 'b2'): [41, 42, 43, 44],
    },
    index=pd.MultiIndex.from_tuples([
        ('c1', 'd1'),
        ('c1', 'd2'),
        ('c2', 'd1'),
        ('c2', 'd2'),
    ]),
)
print(df)
df.to_csv('my_test_file.csv')
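
As far as I can tell, to_csv has no option that produces this layout directly, so one possible approach is to write the rows by hand with the standard csv module. The following is only a minimal sketch under a few assumptions: the helper name write_grouped_csv is hypothetical (it is not a pandas function), the DataFrame has exactly two column levels and two index levels, and each group row is padded with empty cells so that it has the same number of fields as the data rows.

```python
import csv
import io

import pandas as pd


def write_grouped_csv(df, target):
    """Hypothetical helper: write df to the open text stream `target`
    in the grouped layout (assumes 2 column levels and 2 index levels)."""
    writer = csv.writer(target, lineterminator="\n")

    # Header row 1: keep a first-level column label only where it
    # differs from the label immediately to its left.
    top = list(df.columns.get_level_values(0))
    top_row = [
        label if i == 0 or label != top[i - 1] else ""
        for i, label in enumerate(top)
    ]
    writer.writerow([""] + top_row)

    # Header row 2: second-level column labels, unchanged.
    writer.writerow([""] + list(df.columns.get_level_values(1)))

    # One label-only row per first-level index value, followed by the
    # data rows of that group, labelled by the second index level.
    n_cols = df.shape[1]
    for key, group in df.groupby(level=0, sort=False):
        writer.writerow([key] + [""] * n_cols)
        for (_, inner), row in group.iterrows():
            writer.writerow([inner] + row.tolist())


df = pd.DataFrame(
    {
        ("a1", "b1"): [11, 12, 13, 14],
        ("a1", "b2"): [21, 22, 23, 24],
        ("a2", "b1"): [31, 32, 33, 34],
        ("a2", "b2"): [41, 42, 43, 44],
    },
    index=pd.MultiIndex.from_tuples(
        [("c1", "d1"), ("c1", "d2"), ("c2", "d1"), ("c2", "d2")]
    ),
)

# Write to an in-memory buffer here; for a real file, open it with
# open(path, "w", newline="") and pass the file object instead.
buffer = io.StringIO()
write_grouped_csv(df, buffer)
print(buffer.getvalue())
```

Grouping with sort=False keeps the first-level index labels in their original order. If the first-level column labels are not contiguous, a label reappears wherever it differs from its left neighbour, which may or may not be what is wanted.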