Pandas Dataframe add header without replacing current header

Question

How can I add a header to a DF without replacing the current one? In other words I just want to shift the current header down and just add it to the dataframe as another record.

*Secondary question: how do I add tables (e.g. an example dataframe) to a Stack Overflow question?

I have this (note the header: the first data row has been read in as the column names):

   0.213231  0.314544
0 -0.952928 -0.624646
1 -1.020950 -0.883333

I couldn't read the CSV properly because I'm using s3_text_adapter for the import, and I couldn't figure out how to pass an argument that ignores the header, as pandas read_csv allows. I need this (all other records are shifted down and the old header is added as a new record):

       A          B
0  0.213231  0.314544
1 -1.020950 -0.883333

re the tables, you can just copy and paste the text repr, then make sure you highlight and CTRL+K / indent 4 spaces (puts it in code formatting).
– Andy Hayden, Commented Oct 23, 2013 at 1:08
What is s3_text_adapter and how are you using it? It ought to have a header=None option...
– Andy Hayden, Commented Oct 24, 2013 at 0:39
@AndyHayden, you were absolutely right. I went back and double checked and found that field_names=False does the trick. Thank you again!
– horatio1701d, Commented Oct 24, 2013 at 11:02

Andy Hayden · Accepted Answer · 2013-10-23 20:45:23Z

13

Another option is to add it as an additional level of the column index, to make it a MultiIndex:

In [11]: df = pd.DataFrame(np.random.randn(2, 2), columns=['A', 'B'])

In [12]: df
Out[12]: 
          A         B
0 -0.952928 -0.624646
1 -1.020950 -0.883333

In [13]: df.columns = pd.MultiIndex.from_tuples(zip(['AA', 'BB'], df.columns))

In [14]: df
Out[14]: 
         AA        BB
          A         B
0 -0.952928 -0.624646
1 -1.020950 -0.883333

This has the benefit of keeping the correct dtypes for the DataFrame, so you can still do fast and correct calculations on your DataFrame, and allows you to access by both the old and new column names.
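As a minimal sketch of that claim (the labels 'AA' and 'BB' are just the illustrative names used above, and the random data is a stand-in), the following checks that the dtypes stay float64 and that columns can be selected through either level:

```python
import numpy as np
import pandas as pd

# Stand-in data; any numeric frame with columns 'A' and 'B' would do.
df = pd.DataFrame(np.random.randn(2, 2), columns=["A", "B"])

# Add a new top level above the existing column names.
df.columns = pd.MultiIndex.from_arrays([["AA", "BB"], df.columns])

# The values were never touched, so every column is still float64.
dtypes_kept = all(dt == np.float64 for dt in df.dtypes)

# Select by the new (outer) name ...
by_new = df["AA"]
# ... or by the old (inner) name via a cross-section on level 1.
by_old = df.xs("A", axis=1, level=1)

print(dtypes_kept)
print(by_new)
print(by_old)
```

Selecting with `df["AA"]` drops the outer level and leaves a frame labelled by the old name, while `xs` on level 1 does the reverse.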

For completeness, here's DSM's (deleted) answer, which makes the column names a row. As mentioned already, this is usually not a good idea:

In [21]: df_bad_idea = df.T.reset_index().T

In [22]: df_bad_idea
Out[22]: 
              0         1
index         A         B
0     -0.952928 -0.624646
1      -1.02095 -0.883333

Note that the dtypes may change (when the old header holds column names rather than proper values), as they do in this case. So be careful if you actually plan to do any work on this frame, as it will likely be slower and may even fail:

In [23]: df.sum()
Out[23]: 
A   -1.973878
B   -1.507979
dtype: float64

In [24]: df_bad_idea.sum()  # doh!
Out[24]: Series([], dtype: float64)
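To make the dtype change explicit, here is a small sketch that rebuilds the example with the values shown above and inspects the dtypes after the transpose trick. The recovery step at the end (dropping the label row and casting back to float) is an assumption about how one might undo it, not part of the original answer, and the exact behaviour of `sum()` on the object frame varies between pandas versions:

```python
import pandas as pd

df = pd.DataFrame({"A": [-0.952928, -1.020950],
                   "B": [-0.624646, -0.883333]})

# Move the column names into the first row.
df_bad_idea = df.T.reset_index().T

# Each column now mixes strings and floats, so pandas falls back to object dtype.
all_object = all(dt == object for dt in df_bad_idea.dtypes)
print(df_bad_idea.dtypes)

# To compute anything, the label row has to be dropped and the rest cast back.
totals = df_bad_idea.iloc[1:].astype(float).sum()
print(totals)
```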

If the column names are actually a data row that was mistaken for a header row, then you should correct this when reading in the data (e.g. with read_csv, pass header=None).
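A short sketch of the difference, using an in-memory CSV built from the numbers in the question (the `csv_text` string and the names 'A' and 'B' are illustrative assumptions):

```python
import io
import pandas as pd

# Hypothetical file contents: three data rows and no header line.
csv_text = "0.213231,0.314544\n-0.952928,-0.624646\n-1.020950,-0.883333\n"

# Default behaviour: the first data row is consumed as the header.
wrong = pd.read_csv(io.StringIO(csv_text))

# header=None keeps every line as data; names= supplies the real header.
df = pd.read_csv(io.StringIO(csv_text), header=None, names=["A", "B"])

print(wrong)
print(df)
```

With `header=None` all three rows survive as float data, which avoids having to repair the frame afterwards.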

edited Oct 23, 2013 at 20:45

answered Oct 23, 2013 at 0:59

Andy Hayden


8 Comments

DSM Over a year ago

I'm going to delete mine in favour of this, because I think your point about changing dtypes is a good one.

Andy Hayden Over a year ago

@DSM you always do that after I +1! It was what the OP was after, but this is more correct I think (though could/should be easier)...

horatio1701d Over a year ago

Thank you. This is really cool and good to know but I meant how to replace the header 'A' and 'B' from the first df above but also just add the values 'A' and 'B' as another row, in other words move values 'A' and 'B' down to index 0 as the new first record in df.

TomAugspurger Over a year ago

@prometheus2305 for that you could do df.T.reset_index().T but you should think hard about why you would want to do that.

Andy Hayden Over a year ago

@tom which was DSM's deleted answer!


Brad123 · Accepted Answer · 2019-08-20 18:29:32Z

4

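The key is to specify header=None when reading, then assign to df.columns to add the header:

import pandas as pd

# header=None keeps the first line as data; skiprows=2 skips two leading lines (e.g. blank rows), if applicable
df = pd.read_csv('file.csv', skiprows=2, header=None)
df = df.iloc[:, [0, 1]]  # keep the first two columns
df.columns = ['A', 'B']  # assign the header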
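If the frame has already been loaded with the first data row as its header and re-reading is not an option, one possible repair is sketched below. It is not from the original answer and assumes the mistaken header labels are numeric strings, as in the question, so they can be cast back to float:

```python
import pandas as pd

# A frame as it would look after the first data row was read as the header.
df = pd.DataFrame([[-0.952928, -0.624646],
                   [-1.020950, -0.883333]],
                  columns=["0.213231", "0.314544"])

# Recover the numeric values hidden in the column labels.
header_values = [float(c) for c in df.columns]

# Install the real header, then put the recovered row back on top.
df.columns = ["A", "B"]
first_row = pd.DataFrame([header_values], columns=df.columns)
fixed = pd.concat([first_row, df], ignore_index=True)

print(fixed)
```

Because the recovered row is converted to float before concatenation, the result keeps float64 columns, unlike the transpose approach.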

answered Aug 20, 2019 at 18:29

Brad123
