Print one variable's values together in a loop with Python

Question

How can I print all values of one variable together when they are computed in a loop?

I have a dataframe with several columns, and I loop through it by classification value, like this:

import pandas as pd
from sklearn.metrics import mean_squared_error
import numpy as np

data = pd.read_csv(r'C:\\example..csv')

classif = list(set(data['clasification']))
# print(classif)
# [1, 2, 3, 4]

for classif in classif:
    df_r = data[data['clasification'] == classif]

    Y = df_r['A']
    X = df_r[['B', 'C']]

    # i do some regression here

    print(np.mean(Y))
    print(np.max(Y))

When printing, I get the values sequentially, grouped by classification value. So 14.580000000000002 and 43.6 are the mean and max for rows whose classification value is 1; 17.490909090909092 and 45.3 are the mean and max for rows whose classification value is 2, and so on, like this:

14.580000000000002
43.6
17.490909090909092
45.3
29.599999999999998
67.9
14.766666666666666
29.3

But is it possible to print the values from the loop grouped by statistic rather than by classification group? The result would look something like this:

print(np.mean(Y))
print(np.max(Y))
14.580000000000002
17.490909090909092
29.599999999999998
14.766666666666666
43.6
45.3
67.9
29.3

Here is the example of dataframe used in the example:

Out[281]: 
    id  clasification     A     B      C
0    1              1   5.4   7.4   59.6
1    2              2  44.2  49.9  244.0
2    3              3   5.5   8.8   42.4
3    4              1  10.5  14.9   82.6
4    5              1  13.6  19.8   93.7
5    6              1  12.9  18.2  103.4
6    7              1   7.4  10.5   50.9
7    8              2   7.4  10.9   54.2
8    9              2   8.2  11.7   55.8
9   10              2  10.0  13.5   55.8
10  11              2   6.0   8.2   29.3
11  12              2  45.3  63.9  392.7
12  13              2   9.5   9.4   53.7
13  14              2  23.9  32.9  226.6
14  15              3  46.7  63.9  406.2
15  16              3   7.8   8.6   44.4
16  17              3  35.8  49.9  343.6
17  18              3  67.9  87.5  609.9
18  19              2  14.8  20.6  120.3

I'd probably add a mean and max column rather than looping to do it.
– JeffUK, Commented Feb 2, 2022 at 13:33
That's a solution, yes! But as I have multiple other calculations, I don't want to add multiple new columns to the dataframe.
– g123456k, Commented Feb 2, 2022 at 13:36

DaSim · Accepted Answer · 2022-02-02 13:37:30Z

1

That is not possible without changing the structure of your code: as long as the prints stay inside the loop, each iteration prints a mean and then a maximum. One way to get what you want is to save those values in lists and then print them at the end, as follows:

import pandas as pd
from sklearn.metrics import mean_squared_error
import numpy as np

data = pd.read_csv(r'C:\\example..csv')

classif = list(set(data['clasification']))
# print(classif)
# [1, 2, 3, 4]
means = []
maxes = []
# use a separate loop variable so the list `classif` is not overwritten
for c in classif:
    df_r = data[data['clasification'] == c]

    Y = df_r['A']
    X = df_r[['B', 'C']]

    # i do some regression here

    means.append(np.mean(Y))
    maxes.append(np.max(Y))

print(means)
print(maxes)
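
If the goal is the exact layout from the question (every mean first, then every max, one value per line), the collected values can be printed with one loop per list. This is only a minimal sketch: it assumes a small inline dataframe in place of the CSV (the first six rows of the sample data), and `stats` is a hypothetical name, not something from the original code.

```python
import pandas as pd

# Hypothetical stand-in for the CSV: a few rows of the question's sample data.
data = pd.DataFrame({
    "clasification": [1, 2, 3, 1, 1, 2],
    "A": [5.4, 44.2, 5.5, 10.5, 13.6, 7.4],
})

# Collect each statistic in its own list, keyed by the statistic's name.
stats = {"mean": [], "max": []}
for value in sorted(data["clasification"].unique()):
    y = data.loc[data["clasification"] == value, "A"]
    stats["mean"].append(y.mean())
    stats["max"].append(y.max())

# Print all means first, then all maxes, one value per line.
for name, values in stats.items():
    print(name)
    for v in values:
        print(v)
```

Keeping the lists in a dict means further statistics can be added by adding a key, without touching the printing loop.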

piterbarg · Accepted Answer · 2022-02-02 13:42:49Z

It sounds like you are trying to do a groupby and compute some statistics per group. You can try

df.groupby('clasification').agg(['mean', 'max'])

which prints

                      id               A                B                 C
                    mean  max       mean   max       mean   max        mean    max
clasification
1               4.600000    7   9.960000  13.6  14.160000  19.8   78.040000  103.4
2              10.888889   19  18.811111  45.3  24.555556  63.9  136.933333  392.7
3              13.800000   18  32.740000  67.9  43.740000  87.5  289.300000  609.9

The agg function gives you quite a lot of control over which statistics to print for which columns.

You can even try

df.groupby('clasification').apply(lambda g:g.describe())

That will give you a bunch of aggregated stats per group.
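
The groupby result can also be printed statistic by statistic, which may be closer to the layout the question asks for. The following is a rough sketch under the same assumption of a small inline dataframe standing in for the CSV; `summary` is a hypothetical name, and only column `A` is aggregated here.

```python
import pandas as pd

# Hypothetical stand-in for the CSV: a few rows of the question's sample data.
data = pd.DataFrame({
    "clasification": [1, 2, 3, 1, 1, 2],
    "A": [5.4, 44.2, 5.5, 10.5, 13.6, 7.4],
})

# One row per classification group, one column per statistic.
summary = data.groupby("clasification")["A"].agg(["mean", "max"])

# Iterating over the columns prints every group's mean, then every group's max.
for name in summary.columns:
    print(name)
    for v in summary[name]:
        print(v)
```

Because `summary` is an ordinary dataframe, the statistics stay separate from the original data, so no extra columns are added to it.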
