I have two .csv files that share the same first column header, 'NAME':

NAME         RA        DEC  Mean_I1  Mean_I2  alpha_K24 class  alpha_K8 class.1      Av  avgAv
Mon-000101  100.27242   9.608597   11.082   10.034       0.39     I      0.39       I              31.1      31.1
Mon-000171  100.29230   9.522860   14.834   14.385       0.45     I      0.45       I          33.7      33.7

and

       NAME        Sdev_I1        Sdev_I2
 Mon-000002,         0.023,   0.028000001,
 Mon-000003,   0.016000001,   0.016000001,

I want to merge the two so that rows are matched on the 'NAME' column, basically appending Sdev_I1 and Sdev_I2 to the end of each row of the first file. I've tried...

import pandas as pd

df1 = pd.read_csv('h7.csv',sep=r'\s+')
df2 = pd.read_csv('NEW.csv',sep=r'\s+')

df = pd.merge(df1,df2)

df.to_csv('Newh7.csv',index=False)

but it's printing the 'NAME' twice and everything seems to be out of order and with a lot of added zeroes as well. I thought I had solved this one awhile back, but I've totally lost it. Help would be appreciated. Thanks.

Here's the output file:

NAME,RA,DEC,Mean_I1,Mean_I2,alpha_K24,class,alpha_K8,class.1,Av,avgAv,Sdev_I1,Sdev_I2
  • This looks like it ought to work... Could you print the actual DataFrame output rather than what the csvs look like (for one thing, there are some rogue commas in your second csv...)? Commented Jun 17, 2013 at 6:51
  • Seems you didn't strip the commas. And I guess there are some spaces in your 'NAME' column. It's better to print your DataFrame or provide your csv files. Commented Jun 17, 2013 at 6:54
  • Check the edit. Actually, I had changed something, so I'm only getting the column headers now. Commented Jun 17, 2013 at 6:56

1 Answer

It seems the trailing commas in the second csv were not stripped, so values such as 'Mon-000002,' will not match the names in the first file. You might try using converters to strip them while reading:

In [81]: converters = {
             'NAME': lambda x:x[:-1], 
             'Sdev_I1': lambda x: float(x[:-1]),     
             'Sdev_I2': lambda x: float(x[:-1])
         }

In [82]: pd.read_csv('NEW.csv',sep=r'\s+', converters=converters)
Out[82]: 
         NAME  Sdev_I1  Sdev_I2
0  Mon-000002    0.023    0.028
1  Mon-000003    0.016    0.016
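
To connect this back to the merge in the question, here is a minimal sketch that reads both tables, strips the commas with converters, and merges on 'NAME'. The inline sample text stands in for h7.csv and NEW.csv and its values are made up for illustration; `strip_comma` is a hypothetical helper, and `how="left"` is an assumption about the desired result (keep every row of the first file, even without a match).

```python
import io
import pandas as pd

# Hypothetical stand-ins for h7.csv and NEW.csv (illustrative values only).
catalog_text = """NAME RA DEC
Mon-000101 100.27242 9.608597
Mon-000171 100.29230 9.522860
"""
sdev_text = """       NAME        Sdev_I1        Sdev_I2
 Mon-000101,         0.023,   0.028,
 Mon-000003,         0.016,   0.016,
"""

def strip_comma(value):
    # Hypothetical helper: drop any trailing comma from a raw field.
    return value.rstrip(",")

converters = {
    "NAME": strip_comma,
    "Sdev_I1": lambda x: float(strip_comma(x)),
    "Sdev_I2": lambda x: float(strip_comma(x)),
}

df1 = pd.read_csv(io.StringIO(catalog_text), sep=r"\s+")
df2 = pd.read_csv(io.StringIO(sdev_text), sep=r"\s+", converters=converters)

# Naming the key explicitly avoids surprises if the files ever share other columns.
# how="left" keeps every row of df1; unmatched rows get NaN in the Sdev columns.
merged = pd.merge(df1, df2, on="NAME", how="left")
print(merged)
```

With the default inner join, only names present in both files survive, so an output containing nothing but the header row would be consistent with no 'NAME' values matching.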

3 Comments

I plugged this into my initial script as the 'df2' variable but after everything it's printing a blank file. Any advice?
Oops...looked into the wrong file. Worked exactly as intended. Now I just have to remove the commas and add the white-space. THANKS!!!
You can also use replace: df2.replace(r'(.*),', r'\1', regex=True)
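
A rough sketch of that replace-based route, assuming the file is first read without converters so every column arrives as text. The inline sample text is a stand-in for NEW.csv, and the follow-up `pd.to_numeric` step is an addition here, since stripping the commas alone leaves the Sdev columns as strings.

```python
import io
import pandas as pd

# Hypothetical stand-in for NEW.csv.
sdev_text = """NAME Sdev_I1 Sdev_I2
Mon-000002, 0.023, 0.028000001,
Mon-000003, 0.016000001, 0.016000001,
"""

df2 = pd.read_csv(io.StringIO(sdev_text), sep=r"\s+")

# Drop the trailing comma from every string cell.
cleaned = df2.replace(r"(.*),", r"\1", regex=True)

# The Sdev columns are still text at this point, so convert them to floats.
for column in ["Sdev_I1", "Sdev_I2"]:
    cleaned[column] = pd.to_numeric(cleaned[column])

print(cleaned.dtypes)
```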
