What is the right way of repeating columns in DataFrame?

I'm working on df:

  England    Germany    US
0 -3.3199    -3.31      496.68
1 1004.0     4.01       4.01
2 4.9794     4.97       1504.97
3 3.1766     2003.17    3.17

And I'd like to obtain:

  England  England   Germany  Germany   US        US    
0 -3.3199  -3.3199   -3.31    -3.31     496.68    496.68    
1 1004.0   1004.0    4.01     4.01      4.01      4.01 
2 4.9794   4.9794    4.97     4.97      1504.97   1504.97
3 3.1766   3.1766    2003.17  2003.17   3.17      3.17

I thought of getting the headers from the original DataFrame and doubling them:

headers_double = [x for x in headers for i in range(2)]

Subsequently I tried to create df with new headers:

df.columns = [x for x in headers_double]

Unfortunately, my way of thinking was wrong. Any suggestions how to solve this problem?

3 Answers

5

I just came up with another solution that I want to share. Maybe it will be useful for somebody else.

import numpy as np
print(df[np.repeat(df.columns.values, 2)])
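For anyone who wants to run this end to end, here is a minimal sketch. It assumes the frame is rebuilt by hand from the question's data, and the names `doubled_labels` and `doubled` are only illustrative. Note that the repeated columns keep their original labels, so the result has duplicate column names.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question's data (an assumption for this sketch).
df = pd.DataFrame({
    "England": [-3.3199, 1004.0, 4.9794, 3.1766],
    "Germany": [-3.31, 4.01, 4.97, 2003.17],
    "US": [496.68, 4.01, 1504.97, 3.17],
})

# np.repeat repeats each label in place: England, England, Germany, Germany, US, US.
doubled_labels = np.repeat(df.columns.values, 2)

# Selecting with repeated labels returns each column once per label.
doubled = df[doubled_labels]

print(doubled.columns.tolist())
print(doubled.shape)
```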

1 Comment

I tried this, but the column index remains the same as the original columns. Am I missing a trick?
3

If you only have a few columns and you can name them manually, just select columns from your dataframe duplicating those names.

import io
import pandas as pd

data = io.StringIO('''\
  England    Germany    US
0 -3.3199    -3.31      496.68
1 1004.0     4.01       4.01
2 4.9794     4.97       1504.97
3 3.1766     2003.17    3.17
''')
df = pd.read_csv(data, sep=r'\s+')  # delim_whitespace=True is deprecated since pandas 2.2

print(df[['England', 'England', 'Germany', 'Germany', 'US', 'US']])

Output:

     England    England  Germany  Germany       US       US
0    -3.3199    -3.3199    -3.31    -3.31   496.68   496.68
1  1004.0000  1004.0000     4.01     4.01     4.01     4.01
2     4.9794     4.9794     4.97     4.97  1504.97  1504.97
3     3.1766     3.1766  2003.17  2003.17     3.17     3.17

If you want to do this more generally, you can get your column names, duplicate them and then select columns. The following results in the same output as above:

print(df[[col for col in df.columns for i in range(2)]])
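If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: `repeat_columns` and its `n` parameter are hypothetical names, and it uses `Index.repeat` in place of the list comprehension. It assumes the original column labels are unique.

```python
import pandas as pd


def repeat_columns(frame: pd.DataFrame, n: int = 2) -> pd.DataFrame:
    """Return a new DataFrame with every column of `frame` repeated n times, side by side."""
    # Index.repeat repeats each label n times in place: A, A, B, B, ...
    return frame.loc[:, frame.columns.repeat(n)]


# Small illustrative frame (first two rows of the question's data).
df = pd.DataFrame({
    "England": [-3.3199, 1004.0],
    "Germany": [-3.31, 4.01],
    "US": [496.68, 4.01],
})

tripled = repeat_columns(df, 3)
print(tripled.columns.tolist())
```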

3 Comments

I simplified the problem. I have too many columns to do it by hand.
Check my latest edit, which addresses duplicating all columns programmatically instead of manually.
Thank you so much, dude!
0

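You can use this to replicate all columns, or replace ':' with a slice to pick a range of columns. Note that the copies are appended after the originals (England, Germany, US, England, Germany, US) rather than placed next to them: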

df[df.columns[:].append(df.columns)]
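A short sketch of what this produces, assuming a frame rebuilt from the first two rows of the question's data. The variable names `appended` and `partial` are only illustrative. It shows the resulting column order and what happens when ':' is replaced with a narrower slice.

```python
import pandas as pd

# Small illustrative frame (an assumption for this sketch).
df = pd.DataFrame({
    "England": [-3.3199, 1004.0],
    "Germany": [-3.31, 4.01],
    "US": [496.68, 4.01],
})

# Index.append concatenates the two label sequences end to end,
# so the second copy of each column lands after all the originals.
appended = df[df.columns[:].append(df.columns)]
print(appended.columns.tolist())

# Replacing ':' with ':2' puts only the first two labels in front of the full set.
partial = df[df.columns[:2].append(df.columns)]
print(partial.columns.tolist())
```

If the copies have to sit next to their originals, as in the question's desired output, repeat the labels in place instead of appending them.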
