0

I have a CSV file that looks like this:

Build,Avg,Min,Max
BuildA,56.190,39.123,60.1039
BuildX,57.11,40.102,60.200
BuildZER,55.1134,35.129404123,60.20121

I want to get the average, min, max of each column and have each of these stats be as a new row. I exclude the non numeric column (the build column) and then run the statistics. I accomplish this by doing:

df = pd.read_csv('fakedata.csv')
columns = []
builds = []

for column in df.columns:
    if(df[column].dtype == 'float64'):
        columns.append(column)
    else:
        builds.append(column)

save = df[builds]
df = df[columns]

print(df)

df.loc['Min']= df.min()
df.loc['Average']= df.mean()
df.loc['Max']= df.max()

If I were then to write this data to a CSV it would look like:

,Avg,Min,Max
0,56.19,39.123,60.1039
1,57.11,40.102,60.2
2,55.1134,35.129404123,60.20121
Min,55.1134,35.129404123,60.1039
Average,55.8817,37.3709520615,60.1522525
Max,57.11,40.102,60.20121

Which is close to what I want but I want the Build column to be column one again and have the build names exist on top of the Min, Average, Max. Basically this:

Builds,Avg,Min,Max
BuildA,56.19,39.123,60.1039
BuildX,57.11,40.102,60.2
BuildZER,55.1134,35.129404123,60.20121
Min,55.1134,35.129404123,60.1039
Average,55.8817,37.3709520615,60.1522525
Max,57.11,40.102,60.20121

I have attempted to accomplish this by doing:

df.insert(0,'builds', save)
with open('fakedata.csv', 'w') as f:
    df.to_csv(f)

But this gives me this CSV:

,builds,Avg,Min,Max
0,Build1,56.19,39.123,60.1039
1,Build2,57.11,40.102,60.2
2,Build3,55.1134,35.129404123,60.20121
Min,,55.1134,35.129404123,60.1039
Average,,55.8817,37.3709520615,60.1522525
Max,,57.11,40.102,60.20121

How can I fix this?

2
  • 1
    try df.to_csv(f,index=False) Commented Jul 5, 2017 at 14:50
  • This is very close but it results in a csv where the build names are present but now the Min, Average, Max are absent. I think the easiest way to fix this would be to add "Min", "Average", and "Max" to the save dataframe. I can't find an easy way to do this however Commented Jul 5, 2017 at 15:08

1 Answer 1

1

IIUC:

df_out = pd.concat([df.set_index('Build'),df.set_index('Build').agg(['max','min','mean'])]).rename(index={'max':'Max','min':'Min','mean':'Average'}).reset_index()

Output:

      index      Avg        Min       Max
0    BuildA  56.1900  39.123000  60.10390
1    BuildX  57.1100  40.102000  60.20000
2  BuildZER  55.1134  35.129404  60.20121
3       Max  57.1100  40.102000  60.20121
4       Min  55.1134  35.129404  60.10390
5   Average  56.1378  38.118135  60.16837
Sign up to request clarification or add additional context in comments.

6 Comments

This would be the desired result but adding this line results in me getting a Key Error on "Build." I believe this is because to ensure I am only working with numeric data I do df = df[columns] where columns is equal to: for column in df.columns: if(df[column].dtype == 'float64'): columns.append(column) reference above for better formatting
There is a select_dtypes that will "Return a subset of a DataFrame including/excluding columns based on their dtype."
I realize that your answer was actually just a one line substitute to everything I was doing.
Your answer works but if you have extra time if you could explain what the code is doing I would greatly appreciate it. I am specifically confused as to where the min, max, avg are being computed and how the concat works.
Okay, so I have the original dataframe df, I set index to move the build in to index next, I use df.agg to perform multiple aggregations returns a dataframe with the names of the aggregrations in the index, so then I use concat to combine these two dataframes one on top of the other. Next I rename index labels to get your desired result and reset_index to move that index back in to columns.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.