
I have three or more DataFrames that need to be combined into a single file. For example, these are three of them:

            0   100 200 300 400
03/06/2017  0.0 0.1 0.2 0.4 0.6
03/07/2017  1.1 4.4 1.0 ND  4.3

            0   100 200 300 400
03/06/2017  ND  ND  ND  ND  ND
03/07/2017  4.3 4.2 4.3 ND  4.3

            0   100 200 300 400
03/06/2017  0.2 0.5 1.0 0.3 ND
03/07/2017  4.3 1.1 4.3 ND  4.3

When combined, the output should have one header title above each DataFrame's block of columns, like the example below:

                    HEADER TITLE1                    HEADER TITLE2                  HEADER TITLE3
DATE        0000  0100  0200  0300  0400    0000  0100  0200  0300  0400     0000  0100 0200  0300  0400
03/06/2017   0.0   0.1   0.2   0.4   0.6      ND    ND    ND    ND    ND      0.2   0.5  1.0   0.3    ND
03/07/2017   1.1   4.4   1.0    ND   4.3     4.3   4.2   4.3    ND   4.3      4.3   1.1  4.3    ND   4.3

The problem is that when I run my code, the header title is repeated above every column of each DataFrame. What I want is a single header title per DataFrame. Here is what I've tried:

import pandas as pd
from decimal import Decimal, ROUND_HALF_UP

L = ['0000', '0100', '0200', '0300', '0400', '0500', '0600', '0700',
     '0800', '0900', '1000', '1100', '1200', '1300', '1400', '1500',
     '1600', '1700', '1800', '1900', '2000', '2100', '2200', '2300']



df1 = pd.read_csv('Dataframe1.csv')
df1.Date = pd.to_datetime(df1.Date, dayfirst=True)
df1 = df1.pivot_table(values='SampleValues', index='SampleIndex', columns='SampleColumns', aggfunc='max', fill_value="ND")
df1.index = df1.index.map(lambda t: t.strftime('%Y-%m-%d'))
# reindex(columns=...) replaces reindex_axis(..., axis=1), which was removed in pandas 1.0
df1 = df1.reindex(columns=L)
# .loc replaces .ix, which was also removed in pandas 1.0
df1.loc[:, df1.isnull().all()] = "ND"


df2 = pd.read_csv('Dataframe2.csv')
df2.Date = pd.to_datetime(df2.Date, dayfirst=True)
df2 = df2.pivot_table(values='SampleValues', index='SampleIndex', columns='SampleColumns', aggfunc='max', fill_value="ND")
df2.index = df2.index.map(lambda t: t.strftime('%Y-%m-%d'))
# reindex(columns=...) replaces reindex_axis(..., axis=1), which was removed in pandas 1.0
df2 = df2.reindex(columns=L)
# .loc replaces .ix, which was also removed in pandas 1.0
df2.loc[:, df2.isnull().all()] = "ND"

df3 = pd.read_csv('Dataframe3.csv')
df3.Date = pd.to_datetime(df3.Date, dayfirst=True)
df3 = df3.pivot_table(values='SampleValues', index='SampleIndex', columns='SampleColumns', aggfunc='max', fill_value="ND")
df3.index = df3.index.map(lambda t: t.strftime('%Y-%m-%d'))
# reindex(columns=...) replaces reindex_axis(..., axis=1), which was removed in pandas 1.0
df3 = df3.reindex(columns=L)
# .loc replaces .ix, which was also removed in pandas 1.0
df3.loc[:, df3.isnull().all()] = "ND"

keys = ['HEADER TITLE1', 'HEADER TITLE2', 'HEADER TITLE3']

# to_csv() returns None when given a path, so there is nothing useful to assign
pd.concat([df1, df2, df3], axis=1, keys=keys).to_csv("Output.csv", header=True, encoding='utf-8')

1 Answer

dfs = [df1, df2, df3]

df_combined = pd.concat(
    # zero-pad the column labels so that '100' becomes '0100'
    [df.rename(columns=lambda x: str(x).zfill(4)) for df in dfs],
    keys=['HEADER TITLE{}'.format(i) for i in range(1, len(dfs) + 1)],
    axis=1
)

df_combined
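
To try the snippet above without the CSV files, here is a minimal self-contained sketch. It assumes the three frames look like the samples in the question (string column labels, "ND" stored as a plain string); `make_frame`, `dates` and `cols` are illustrative names, not part of the original post.

```python
import pandas as pd

dates = ["03/06/2017", "03/07/2017"]
cols = ["0", "100", "200", "300", "400"]


def make_frame(rows):
    # Hypothetical helper: build one sample frame from the question's data.
    return pd.DataFrame(rows, index=dates, columns=cols)


df1 = make_frame([[0.0, 0.1, 0.2, 0.4, 0.6], [1.1, 4.4, 1.0, "ND", 4.3]])
df2 = make_frame([["ND"] * 5, [4.3, 4.2, 4.3, "ND", 4.3]])
df3 = make_frame([[0.2, 0.5, 1.0, 0.3, "ND"], [4.3, 1.1, 4.3, "ND", 4.3]])
dfs = [df1, df2, df3]

df_combined = pd.concat(
    # zero-pad the column labels so that '100' becomes '0100'
    [df.rename(columns=lambda x: str(x).zfill(4)) for df in dfs],
    keys=["HEADER TITLE{}".format(i) for i in range(1, len(dfs) + 1)],
    axis=1,
)

# The columns are now a two-level MultiIndex: (header title, time label).
print(df_combined.columns.nlevels)

# Selecting a title returns just that frame's block of columns.
print(df_combined["HEADER TITLE2"])
```

Each title exists only once as a level-0 label; the repetition only shows up when the frame is flattened to text formats such as CSV.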


And for the CSV:

print(df_combined.to_csv())

,HEADER TITLE1,HEADER TITLE1,HEADER TITLE1,HEADER TITLE1,HEADER TITLE1,HEADER TITLE2,HEADER TITLE2,HEADER TITLE2,HEADER TITLE2,HEADER TITLE2,HEADER TITLE3,HEADER TITLE3,HEADER TITLE3,HEADER TITLE3,HEADER TITLE3
,0000,0100,0200,0300,0400,0000,0100,0200,0300,0400,0000,0100,0200,0300,0400
03/06/2017,0.0,0.1,0.2,0.4,0.6,ND,ND,ND,ND,ND,0.2,0.5,1.0,0.3,ND
03/07/2017,1.1,4.4,1.0,ND,4.3,4.3,4.2,4.3,ND,4.3,4.3,1.1,4.3,ND,4.3

However, as @StephenRauch pointed out... what you want isn't really csv... so, let's do not-csv!

with pd.option_context('display.width', 1000):
    print(repr(df_combined))

           HEADER TITLE1                     HEADER TITLE2                     HEADER TITLE3                    
                    0000 0100 0200 0300 0400          0000 0100 0200 0300 0400          0000 0100 0200 0300 0400
03/06/2017           0.0  0.1  0.2  0.4  0.6            ND   ND   ND   ND   ND           0.2  0.5  1.0  0.3   ND
03/07/2017           1.1  4.4  1.0   ND  4.3           4.3  4.2  4.3   ND  4.3           4.3  1.1  4.3   ND  4.3
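
If the file still has to be comma-separated but each title should appear only once, one possible workaround is to write the two header rows by hand and leave the repeated title cells empty. The sketch below assumes a two-level column index like the one `pd.concat(..., keys=...)` produces. `write_single_titles` is a hypothetical helper, not a pandas function, and the result is only CSV-like: a tool that reads it back cannot tell which title the blank cells belong to.

```python
import csv
import io

import pandas as pd


def write_single_titles(df, buf, index_label="DATE"):
    """Write df (two-level columns) with each top-level title shown once."""
    titles = list(df.columns.get_level_values(0))
    labels = list(df.columns.get_level_values(1))
    # Keep a title only in the first column of its block; blank the repeats.
    top = [t if i == 0 or t != titles[i - 1] else "" for i, t in enumerate(titles)]
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([""] + top)
    writer.writerow([index_label] + labels)
    for idx, row in df.iterrows():
        writer.writerow([idx] + list(row))


# Small stand-in for df_combined (two blocks of two columns, for brevity).
columns = pd.MultiIndex.from_product(
    [["HEADER TITLE1", "HEADER TITLE2"], ["0000", "0100"]]
)
demo = pd.DataFrame(
    [[0.0, 0.1, "ND", "ND"], [1.1, 4.4, 4.3, 4.2]],
    index=["03/06/2017", "03/07/2017"],
    columns=columns,
)

buf = io.StringIO()
write_single_titles(demo, buf)
print(buf.getvalue())
```

To write to disk instead, pass a file opened with `open("Output.csv", "w", newline="", encoding="utf-8")` in place of the `StringIO` buffer. If a spreadsheet is an acceptable target, `DataFrame.to_excel` handles multi-level column headers itself, provided an Excel writer engine such as openpyxl is installed.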

5 Comments

But +1 for effort :-)
@StephenRauch on osx shift+ctrl+cmd+4 allows me to drag a rectangle over the screen and capture to the clipboard.
@StephenRauch every time I try to double upvote, it resets my vote... must be a bug. I'll post on meta ;-)
Thanks for your answer. Is there a way to do it with something like df_combined.to_csv('Output.csv', encoding='utf-8')? I need to save it as CSV, but without the repeating headers shown in your first example.
@KarlGuevarra No. There is no way to create a CSV without the repeating headers; CSV by its nature has no good way to express combined columns. to_excel can do that, and the ridiculously clever solution that @piRSquared posted above kind of does it too. If you could show exactly the output you want, someone may come by and help you out, but the current request is going to be tough to fulfill.
