Updating variable name within a for loop while performing calculations

Question

Whenever I stumble across some sort of calculation in Python, I tend to go for an unpythonic approach because I am not too familiar with the language:

import pandas as pd
import numpy as np

v        = 8
gf       = 2.5

data_a1  = np.random.randint(5, 10, 21)
data_a2  = np.random.randint(5, 10, 21)
data_a3  = np.random.randint(5, 10, 21)
data_a4  = np.random.randint(5, 10, 21)
data_a5  = np.random.randint(5, 10, 21)

data_b1  = np.random.randint(6, 11, 21)
data_b2  = np.random.randint(6, 11, 21)
data_b3  = np.random.randint(6, 11, 21)
data_b4  = np.random.randint(6, 11, 21)
data_b5  = np.random.randint(6, 11, 21)

e_1 = 2 * (data_a1 + data_b1) / 2 / v / gf
e_2 = 2 * (data_a2 + data_b2) / 2 / v / gf
e_3 = 2 * (data_a3 + data_b3) / 2 / v / gf
e_4 = 2 * (data_a4 + data_b4) / 2 / v / gf
e_5 = 2 * (data_a5 + data_b5) / 2 / v / gf

As you can see from the example above, I explicitly write the calculation out five times instead of using Python the way I imagine it is intended to be used. I would like to calculate e by updating it on every iteration of a for loop, and I would also prefer to use numpy.

Since my efforts were not bearing fruit, I turned to pandas because, for whatever reason, I was fairly confident that I could redeem myself:

df_a     = pd.DataFrame({'data_a1': data_a1, 'data_a2': data_a2, 'data_a3': data_a3, 'data_a4': data_a4, 'data_a5': data_a5})
df_b     = pd.DataFrame({'data_b1': data_b1, 'data_b2': data_b2, 'data_b3': data_b3, 'data_b4': data_b4, 'data_b5': data_b5})

c   = 0
dfs = []
for i,j in zip(df_a, df_b):
    e = 2 * (i + j) / 2 / v / gf
    e = e.add_suffix('_' + str(c))
    dfs.append(e)
    c += 1

Alas, my stupidity prevailed and I could not do it either way.

Is there a streamlined, pythonic way to work with equations using numpy so that the variable updates itself within a for loop?
When performing these tasks, is it recommended to stick to numpy or turn to pandas?

hpaulj · Accepted Answer · 2019-02-12 19:40:51Z

3

First, let's step away from creating a lot of variable names. In Python, lists can contain other objects, including arrays.

datalist1 = []
for _ in range(5):
    datalist1.append(np.random.randint(5, 10, 21))
# same for datalist2
datalist2 = [np.random.randint(6, 11, 21), 
             np.random.randint(6, 11, 21),
            ...]

elist = [2*(a+b)/2/v/gf for a,b in zip(datalist1, datalist2)]

Working with 2d arrays of shape (5,21) is even better. But the kind of list iteration that I illustrate works for all of Python, not just numpy.

You could even make a list from pre-existing variables:

alist = [data_b1, data_b2, ...]
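
The 2d-array idea mentioned above can be sketched roughly as follows, assuming all arrays have the same length so that they can be stacked. The names a2d, b2d and e2d are illustrative and not from the question, and the random data only stands in for real measurements.

```python
import numpy as np

v = 8
gf = 2.5

# Stand-ins for data_a1..data_a5 and data_b1..data_b5 from the question
datalist1 = [np.random.randint(5, 10, 21) for _ in range(5)]
datalist2 = [np.random.randint(6, 11, 21) for _ in range(5)]

# Stack each list into a 2d array: one row per measurement
a2d = np.stack(datalist1)   # shape (5, 21)
b2d = np.stack(datalist2)   # shape (5, 21)

# One vectorized expression replaces e_1 ... e_5
e2d = 2 * (a2d + b2d) / 2 / v / gf

# Row k holds what the question calls e_(k+1)
e_1 = e2d[0]
```

Each row of e2d should match the corresponding entry of elist from the list comprehension, so the two approaches are interchangeable for equal-length data.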

edited Feb 12, 2019 at 19:40

answered Feb 12, 2019 at 19:18

hpaulj


2 Comments

naughty_waves Over a year ago

Thank you for replying, hpaulj. I am sorry if my question is misleading. The random arrays were created as an example but, in reality, I work with lab data acquired at different times. That is why I wanted to distinguish between the variables by adding a suffix of some sort. Would you advise me to store all the data in an array and then perform the calculations on the lists themselves?

hpaulj Over a year ago

You can still use lists, or dictionaries. For example you could have a list of the 'times'. But if in effect you have one or more time series, then pandas may be the way to go. The focus of my answer was on replacing a whole bunch of variable names with lists.
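
As a rough sketch of the dictionary and pandas options mentioned in this comment: the time labels "t1", "t2", "t3" and the names data_a, data_b, e_by_time and e_df are hypothetical, and the random arrays only stand in for real lab data.

```python
import numpy as np
import pandas as pd

v = 8
gf = 2.5

# Hypothetical acquisition labels; real data would use actual timestamps
times = ["t1", "t2", "t3"]

# Stand-in measurements, one pair of arrays per acquisition time
data_a = {t: np.random.randint(5, 10, 21) for t in times}
data_b = {t: np.random.randint(6, 11, 21) for t in times}

# Dictionary keyed by time label instead of suffixed variable names
e_by_time = {t: 2 * (data_a[t] + data_b[t]) / 2 / v / gf for t in times}

# The same results as a DataFrame, one column per acquisition time
e_df = pd.DataFrame(e_by_time)
```

The key (or column label) plays the role of the suffix, so a result is looked up as e_by_time["t2"] or e_df["t2"] rather than through a separately named variable.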

G. Anderson · Accepted Answer · 2019-02-12 18:45:41Z

2

I may be misunderstanding your intent, but from what I can tell, you aren't indexing or looking up anything, so there's no reason to go from numpy into pandas (which is just a really really well-dressed numpy array). Instead, you should be looking at the vectorized operations numpy provides.

Again, I'm not clear on your end goal, since you didn't provide output, but is this approaching what you're after?

v        = 8
gf       = 2.5
a=np.random.randint(5,10,(21,5))
b=np.random.randint(5,10,(21,5))
c=2*(a+b)/2/v/gf

c

array([[0.9 , 0.75, 0.75, 0.6 , 0.65],
       [0.75, 0.65, 0.5 , 0.9 , 0.75],
       [0.7 , 0.6 , 0.75, 0.75, 0.85],
       [0.6 , 0.6 , 0.7 , 0.8 , 0.7 ],
       [0.6 , 0.75, 0.9 , 0.8 , 0.8 ],
       [0.85, 0.65, 0.65, 0.7 , 0.65],
       [0.65, 0.65, 0.65, 0.55, 0.7 ],
       [0.5 , 0.7 , 0.7 , 0.55, 0.6 ],
       [0.65, 0.6 , 0.8 , 0.9 , 0.7 ],
       [0.65, 0.7 , 0.55, 0.6 , 0.8 ],
       [0.75, 0.55, 0.75, 0.7 , 0.65],
       [0.8 , 0.7 , 0.65, 0.7 , 0.55],
       [0.55, 0.8 , 0.6 , 0.6 , 0.7 ],
       [0.8 , 0.75, 0.7 , 0.85, 0.7 ],
       [0.7 , 0.55, 0.75, 0.7 , 0.55],
       [0.6 , 0.7 , 0.7 , 0.6 , 0.65],
       [0.55, 0.8 , 0.7 , 0.6 , 0.75],
       [0.65, 0.75, 0.7 , 0.65, 0.6 ],
       [0.8 , 0.85, 0.7 , 0.8 , 0.7 ],
       [0.85, 0.8 , 0.55, 0.6 , 0.8 ],
       [0.8 , 0.8 , 0.75, 0.7 , 0.7 ]])
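
To relate this (21, 5) layout back to the named arrays in the question, here is a minimal sketch that assumes the existing arrays are combined with np.column_stack. The names data_a, data_b and c_simplified are illustrative, and the random data is only a stand-in.

```python
import numpy as np

v = 8
gf = 2.5

# Stand-ins for the question's data_a1..data_a5 and data_b1..data_b5
data_a = [np.random.randint(5, 10, 21) for _ in range(5)]
data_b = [np.random.randint(6, 11, 21) for _ in range(5)]

# Columns are measurements, rows are samples: shape (21, 5)
a = np.column_stack(data_a)
b = np.column_stack(data_b)

c = 2 * (a + b) / 2 / v / gf

# Column k corresponds to e_(k+1) in the question
e_3 = c[:, 2]

# The factor 2 and the division by 2 cancel, so this is equivalent
c_simplified = (a + b) / (v * gf)
```

Here c[:, 2] takes the place of e_3, so no per-measurement variable names are needed.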

answered Feb 12, 2019 at 18:45

G. Anderson


2 Comments

naughty_waves Over a year ago

Thank you for replying, G. Anderson. I created the random variables as an example, whereas in real life I am working with data acquired in a laboratory setting. Since I am performing the same measurement at different times, I wanted to distinguish between the arrays and update them with, for example, a suffix of some sort. I apologize for the poor explanation.

G. Anderson Over a year ago

See my additional answer, maybe closer to your intent?

G. Anderson · Accepted Answer · 2019-02-12 19:52:03Z

So, given additional information, what about this:

#simulate getting new data every day for a week
n_days   = 7

#set constants
v        = 8
gf       = 2.5
data_dict={}
#append data
for i in range(n_days+1):
    a=np.random.randint(5,10,21)
    b=np.random.randint(5,10,21)
    data_dict['dayN+'+str(i)]=2*(a+b)/2/v/gf #instead of str(i), you could append the key with datetime.now(), etc.

data_dict

{'dayN+0': array([0.275, 0.275, 0.4  , 0.3  , 0.325, 0.425, 0.4  , 0.45 , 0.3  ,
        0.375, 0.375, 0.35 , 0.425, 0.35 , 0.4  , 0.325, 0.3  , 0.3  ,
        0.35 , 0.3  , 0.375]),
 'dayN+1': array([0.3  , 0.275, 0.325, 0.375, 0.4  , 0.425, 0.325, 0.325, 0.4  ,
        0.35 , 0.3  , 0.4  , 0.375, 0.25 , 0.375, 0.375, 0.45 , 0.35 ,
        0.425, 0.35 , 0.4  ]),
 'dayN+2': array([0.4...
