Merging two .csv files python-pandas

Question

I have two .csv files with the same initial column-header:

NAME        RA         DEC       Mean_I1  Mean_I2  alpha_K24  class  alpha_K8  class.1  Av    avgAv
Mon-000101  100.27242  9.608597  11.082   10.034   0.39       I      0.39      I        31.1  31.1
Mon-000171  100.29230  9.522860  14.834   14.385   0.45       I      0.45      I        33.7  33.7

and

       NAME        Sdev_I1        Sdev_I2
 Mon-000002,         0.023,   0.028000001,
 Mon-000003,   0.016000001,   0.016000001,

I want to merge the two together so that the 'NAME' columns match up, basically just add the two Sdev_I1/Sdev_I2 to the end of the first sample. I've tried...

import pandas as pd

df1 = pd.read_csv('h7.csv',sep=r'\s+')
df2 = pd.read_csv('NEW.csv',sep=r'\s+')

df = pd.merge(df1,df2)

df.to_csv('Newh7.csv',index=False)

but it's printing the 'NAME' twice and everything seems to be out of order and with a lot of added zeroes as well. I thought I had solved this one awhile back, but I've totally lost it. Help would be appreciated. Thanks.

Here's the output file:

NAME,RA,DEC,Mean_I1,Mean_I2,alpha_K24,class,alpha_K8,class.1,Av,avgAv,Sdev_I1,Sdev_I2

That looks like it ought to work... Could you print the actual DataFrame output rather than what the CSVs look like? (For one thing, there are some rogue commas in your second CSV.)
– Andy Hayden, Commented Jun 17, 2013 at 6:51
It seems you didn't strip the commas, and I guess there are some spaces in your 'NAME' column. It would be better to print your DataFrame or provide your CSV files.
– waitingkuo, Commented Jun 17, 2013 at 6:54
Check the edit. Actually, I had changed something, so I'm only getting the column headers now.
– Matt, Commented Jun 17, 2013 at 6:56

waitingkuo · Accepted Answer · 2013-06-17 07:20:10Z

It seems you didn't strip the trailing commas in the second CSV. You could use converters to remove them while the file is being read:

In [81]: converters = {
             'NAME': lambda x:x[:-1], 
             'Sdev_I1': lambda x: float(x[:-1]),     
             'Sdev_I2': lambda x: float(x[:-1])
         }

In [82]: pd.read_csv('NEW.csv',sep=r'\s+', converters=converters)
Out[82]: 
         NAME  Sdev_I1  Sdev_I2
0  Mon-000002    0.023    0.028
1  Mon-000003    0.016    0.016
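
For completeness, here is a minimal sketch of the converters plugged back into the merge from the question. The in-memory strings are hypothetical stand-ins for h7.csv and NEW.csv (trimmed to a few columns, with one NAME changed so that the two samples overlap), and strip_comma is an illustrative helper name, not something from the original post. Note that pd.merge defaults to an inner join, so rows whose NAME appears in only one file are dropped; that would be consistent with getting only the column headers when no names match. Passing how='left' is an assumption about the desired result: it keeps every row of the first file.

```python
import io
import pandas as pd

# Hypothetical stand-ins for h7.csv and NEW.csv (not the asker's real data).
h7_text = """NAME RA DEC
Mon-000002 100.27242 9.608597
Mon-000171 100.29230 9.522860
"""
new_text = """NAME Sdev_I1 Sdev_I2
Mon-000002, 0.023, 0.028000001,
Mon-000003, 0.016000001, 0.016000001,
"""

def strip_comma(value):
    # Illustrative helper: drop a trailing comma only if one is present.
    return value.rstrip(',')

converters = {
    'NAME': strip_comma,
    'Sdev_I1': lambda x: float(strip_comma(x)),
    'Sdev_I2': lambda x: float(strip_comma(x)),
}

df1 = pd.read_csv(io.StringIO(h7_text), sep=r'\s+')
df2 = pd.read_csv(io.StringIO(new_text), sep=r'\s+', converters=converters)

# Join explicitly on NAME. how='left' keeps all rows of df1 and leaves NaN
# in the Sdev columns where df2 has no matching NAME.
merged = pd.merge(df1, df2, on='NAME', how='left')
print(merged)

# The default inner join keeps only names present in both files.
inner = pd.merge(df1, df2, on='NAME')
print(len(inner))
```

With real files, the io.StringIO(...) arguments would be replaced by the file names, and merged.to_csv('Newh7.csv', index=False) writes the result as in the question.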



3 Comments

Matt Over a year ago

I plugged this into my initial script as the 'df2' variable but after everything it's printing a blank file. Any advice?

Matt Over a year ago

Oops...looked into the wrong file. Worked exactly as intended. Now I just have to remove the commas and add the white-space. THANKS!!!

waitingkuo Over a year ago

You can also use replace: df2.replace(r'(.*),', r'\1', regex=True)
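
A short sketch of how that replace call might be used, assuming the file was read without converters so that every column still holds strings with a trailing comma. The sample text is a hypothetical stand-in for NEW.csv. Two things are worth keeping in mind: replace returns a new DataFrame rather than modifying df2 in place, and the cleaned Sdev columns are still strings until they are converted, which is done here with pd.to_numeric.

```python
import io
import pandas as pd

# Hypothetical stand-in for NEW.csv (not the asker's real data).
new_text = """NAME Sdev_I1 Sdev_I2
Mon-000002, 0.023, 0.028000001,
Mon-000003, 0.016000001, 0.016000001,
"""

# Without converters, the trailing commas force every column to be text.
df2 = pd.read_csv(io.StringIO(new_text), sep=r'\s+')

# Regex replace: keep everything before the final comma.
# The result must be assigned, because replace does not modify df2 itself.
cleaned = df2.replace(r'(.*),', r'\1', regex=True)

# The numeric columns are still strings at this point, so convert them.
for col in ['Sdev_I1', 'Sdev_I2']:
    cleaned[col] = pd.to_numeric(cleaned[col])

print(cleaned)
```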
