I have a dataframe with multiple rows, similar to the one shown below:

  wave      cross cross2
0 299.0   1.25    3.30
1 299.5   1.30    4.20
2 300.0   1.45    4.36
3 300.5   1.65    4.32
4 300.8   1.56    4.56

What I want to do is average the data for rows that share the same integer wavelength, so that I get a dataframe with wave as an integer, something like this:

  wave   cross cross2
0 299    1.30  3.75
1 300    1.55  4.41

What is the best way to achieve this with pandas in Python?

2 Answers

Use groupby and aggregate with mean, but first cast the wave column to int:

df = df.assign(wave = df['wave'].astype(int)).groupby('wave').mean()

Or:

df['wave'] = df['wave'].astype(int)
df = df.groupby('wave').mean()

Or:

df = df[df.columns.difference(['wave'])].groupby(df['wave'].astype(int)).mean()

print(df)
         cross    cross2
wave                    
299   1.275000  3.750000
300   1.553333  4.413333
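
One thing worth checking is how the cast assigns rows to bins: astype(int) truncates toward zero, so 300.8 is grouped with 300 rather than 301. The sketch below rebuilds the question's data by hand and compares truncation with rounding; the variable names (truncated, rounded, by_trunc, by_round) are only illustrative, and rounding is shown purely as a contrast, not as what the question asks for.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "wave": [299.0, 299.5, 300.0, 300.5, 300.8],
    "cross": [1.25, 1.30, 1.45, 1.65, 1.56],
    "cross2": [3.30, 4.20, 4.36, 4.32, 4.56],
})

# astype(int) drops the fractional part, so 300.8 falls into the 300 bin.
truncated = df["wave"].astype(int)

# round() uses NumPy-style rounding (exact halves go to the even neighbour),
# so 299.5 and 300.5 both become 300, and 300.8 becomes 301.
rounded = df["wave"].round().astype(int)

values = df.drop(columns="wave")
by_trunc = values.groupby(truncated).mean()  # bins: 299, 300
by_round = values.groupby(rounded).mean()    # bins: 299, 300, 301

print(by_trunc)
print(by_round)
```

For the result requested in the question, the truncating version is the one that matches.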

2 Comments

Yes. I was thinking of the 3rd one. Now 2nd. I think we can do it directly
@Sanne - Thank you.

In addition to jezrael's great alternatives you can pass a column to groupby:

df = df.groupby(df['wave'].astype(int)).mean().drop(columns='wave').reset_index()

Full example:

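from io import StringIO

import pandas as pd

data = '''\
wave      cross cross2
299.0   1.25    3.30
299.5   1.30    4.20
300.0   1.45    4.36
300.5   1.65    4.32
300.8   1.56    4.56'''

# StringIO lets read_csv treat the string as a file; columns are whitespace-separated
df = pd.read_csv(StringIO(data), sep=r'\s+')

# Group by the integer part of wave, then drop the averaged float wave column
df = df.groupby(df['wave'].astype(int)).mean().drop(columns='wave').reset_index()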
print(df)

Returns:

   wave     cross    cross2
0   299  1.275000  3.750000
1   300  1.553333  4.413333
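
A possible variation, sketched here under the assumption that the value columns are known by name: selecting them before calling mean() means the float wave column is never averaged, so there is nothing to drop afterwards. The name out and the hand-built dataframe are only for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "wave": [299.0, 299.5, 300.0, 300.5, 300.8],
    "cross": [1.25, 1.30, 1.45, 1.65, 1.56],
    "cross2": [3.30, 4.20, 4.36, 4.32, 4.56],
})

# Group by the integer part of wave, but only average the value columns.
out = (
    df.groupby(df["wave"].astype(int))[["cross", "cross2"]]
      .mean()
      .reset_index()  # turn the integer wave index back into a column
)

print(out)
```

This should give the same table as the drop-based version; which one reads better is mostly a matter of taste.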
