How to make two rows in a pandas dataframe into column headers

Question

I have seen how to work with a double index, but I have not seen how to work with two-row column headers. Is this possible?

For example, row 1 is a repetitive series of dates: 2016, 2016, 2015, 2015

Row 2 is a repeating series of data labels: Dollar Sales, Unit Sales, Dollar Sales, Unit Sales.

So each "Dollar Sales" heading is actually tied to the date in the row above.

Subsequent rows are individual items with data.

Is there a way to do a groupby, or some other way to have two column headers? Ultimately, I want to line up the "Dollar Sales" as a series by date so that I can make a nice graph. Unfortunately, there are multiple columns before the next "Dollar Sales" value (more than just the one "Unit Sales" column). Also, if I delete the date row above, nothing links each "Dollar Sales" column to its date.

You can use a pandas.MultiIndex as header. See for example this and this.
– Midnighter, Commented Dec 6, 2016 at 21:57
This works, thank you. I was unaware that MultiIndex would also apply to column headers... still learning pandas.
– Stephen, Commented Dec 8, 2016 at 21:35

Kevin · Accepted Answer · 2019-11-20 12:47:12Z


If using pandas.read_csv() or pandas.read_table(), you can pass a list of row indices as the header argument to specify which rows to use for the column headers. pandas will then build the pandas.MultiIndex for you in df.columns:

df = pandas.read_csv('DollarUnitSales.csv', header=[0,1])

You can also use more than two rows, or non-consecutive rows, to specify the column headers. Rows that fall between the listed header rows (row 1 in this case) are skipped:

df = pandas.read_table('DataSheet1.csv', sep=',', header=[0,2,3])  # read_table defaults to tab-separated input
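To show what the result looks like, here is a minimal sketch that reads a small in-memory CSV instead of a file. The contents of csv_text are made up to match the layout described in the question, and the variable names are only illustrative. It also shows one possible way to pull out every "Dollar Sales" column by year with DataFrame.xs.

```python
import io

import pandas as pd

# Made-up CSV contents standing in for a file such as 'DollarUnitSales.csv'.
csv_text = """2016,2016,2015,2015
Dollar Sales,Unit Sales,Dollar Sales,Unit Sales
1,2,3,4
5,6,7,8
"""

# header=[0, 1] tells pandas to build a two-level column MultiIndex
# from the first two rows of the input.
df = pd.read_csv(io.StringIO(csv_text), header=[0, 1])

# Each column label is now a (year, measure) tuple; header values are read as strings.
print(df.columns.tolist())

# Select the "Dollar Sales" columns from the second header level (level=1).
# The selected level is dropped, leaving one column per year.
dollar_sales = df.xs('Dollar Sales', axis=1, level=1)
print(dollar_sales)
```

The dollar_sales frame has one column per year, which is a convenient shape for plotting the values by date.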

edited Nov 20, 2019 at 12:47

answered Aug 8, 2018 at 12:00


2 Comments

PV8 Over a year ago

What will it look like? Can you add an example?

ASH Over a year ago

Awesome solution squareskittles!! This is exactly what I was looking for...for the past several days!! Thanks so much!!

cottontail · Accepted Answer · 2023-03-27 01:32:34Z

A MultiIndex can be created from rows and assigned as the new column labels.

For example, to make the following transformation, use pd.MultiIndex.from_frame().

import pandas as pd

df = pd.DataFrame([[2016, 2016, 2015, 2015],
                   ['Dollar Sales', 'Unit Sales', 'Dollar Sales', 'Unit Sales'],
                   [1, 2, 3, 4], [5, 6, 7, 8]], columns=[*'ABCD'])

new_labels = pd.MultiIndex.from_frame(df.iloc[:2].T.astype(str), names=['Year', 'Sales'])
df1 = df.set_axis(new_labels, axis=1).iloc[2:]

A MultiIndex can also be created from the old column labels and a dataframe row. For example, to make the following transformation, use pd.MultiIndex.from_arrays().

df = pd.DataFrame([['Dollar Sales', 'Unit Sales', 'Dollar Sales', 'Unit Sales'], 
                   [1, 2, 3, 4], [5, 6, 7, 8]], columns=[2016, 2016, 2015, 2015])

new_labels = pd.MultiIndex.from_arrays([df.columns, df.iloc[0]], names=['Year', 'Sales'])
df1 = df.set_axis(new_labels, axis=1).iloc[1:]

N.B. Because the header rows were stored alongside the data, the remaining columns are likely to have object dtype rather than a numeric one; a conversion such as astype(int) may be necessary at the end.

Also, reset_index(drop=True) may be needed if the index should be reset.
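As a rough sketch of that cleanup, the snippet below repeats the first example and then applies astype(int) and reset_index(drop=True). It assumes every remaining cell holds an integer, which is true for this toy data but may not be for real data. The name cleaned is only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [[2016, 2016, 2015, 2015],
     ['Dollar Sales', 'Unit Sales', 'Dollar Sales', 'Unit Sales'],
     [1, 2, 3, 4],
     [5, 6, 7, 8]],
    columns=[*'ABCD'],
)

# Build the two-level header from the first two rows, then drop those rows.
new_labels = pd.MultiIndex.from_frame(df.iloc[:2].T.astype(str), names=['Year', 'Sales'])
df1 = df.set_axis(new_labels, axis=1).iloc[2:]

# Each column originally mixed integers and strings, so it is still object dtype.
print(df1.dtypes)

# Convert to integers (assumes all remaining values are integers)
# and renumber the rows starting from 0.
cleaned = df1.astype(int).reset_index(drop=True)
print(cleaned.dtypes)
print(cleaned)
```

If some columns hold non-integer values, a per-column conversion would be a safer choice than a blanket astype(int).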
