Create dataframe of multiple appended arrays using for loop or nested loop in Python

Question

I have a dataframe and, for each row, I want to create lists of 100 numbers (step 1), multiply these lists together (step 2), and finally build a dataframe from the results (step 3). I can do this for one row, but I'm struggling with how to write a for loop that does it for all of the rows. Using a nested loop or another method would also be fine.

Starting with the dataframe, orig:

import numpy as np
import pandas as pd

orig = pd.DataFrame(np.array([['a', 2.09328, 11.4043282, 0.1, 1.1], ['b', 5.985439, 6.59949, 0.3, 0.19], ['c', 8.5543045, 9.5402459, 0.09, 1.2]]),
                   columns=['site', 'x_min', 'x_max','y_min','y_max'])
orig = orig.set_index('site')

I want to create two new variables, x and y for each row in orig:

# Step 1: Create two new variables x and y for each row. For example, for the first row, site a, this would look like this:
x_a = np.linspace(2.09328,11.4043282,100)
y_a = np.linspace(0.1, 1.1, 100)

Then for each row, I want to multiply the x and y variables along with a constant z:

# Step 2: For each site, multiply the x and y arrays together with another variable z
z = 24
pd.DataFrame(x_a*y_a*z)

And then for Step 3 I want a dataframe where each column is named after a row (site) in orig (so, "a", "b", "c") and the rows hold the product from the previous calculation, i.e. x*y*z. The shape of this final dataframe should be three columns by 100 rows.

All I have so far is this and it's not working too well for me:

# So far all I have for step 1 is this:
xs = []
ys = []

#for each row in dataframe
for i in range(orig.shape[0]):
    row = orig.iloc[i:,]
    
    _xs = np.linspace(row['x_min'], row['x_max'], 100)
    _ys = np.linspace(row['y_max'], row['y_min'],100)
    print(_xs)
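For reference, a minimal sketch of how the loop above could be made to run. It assumes three things that are not confirmed by the attempt itself: that the string-valued columns may be converted with astype(float), that a single row is wanted (orig.iloc[i] rather than orig.iloc[i:,], which selects every row from i onward), and that y should run from y_min to y_max as in the Step 1 example. The names products and result are hypothetical.

```python
import numpy as np
import pandas as pd

# Same construction as in the question: mixing 'a' with floats makes
# np.array store every value as a string.
orig = pd.DataFrame(
    np.array([['a', 2.09328, 11.4043282, 0.1, 1.1],
              ['b', 5.985439, 6.59949, 0.3, 0.19],
              ['c', 8.5543045, 9.5402459, 0.09, 1.2]]),
    columns=['site', 'x_min', 'x_max', 'y_min', 'y_max'],
).set_index('site')

# Assumption: the remaining columns hold numeric strings, so a float cast is safe.
orig = orig.astype(float)

z = 24
products = {}  # hypothetical name: site label -> array of 100 products

for i in range(orig.shape[0]):
    row = orig.iloc[i]  # one row as a Series; row.name is the site label
    _xs = np.linspace(row['x_min'], row['x_max'], 100)
    _ys = np.linspace(row['y_min'], row['y_max'], 100)
    products[row.name] = _xs * _ys * z

# One column per site, one row per sample.
result = pd.DataFrame(products)
print(result.shape)  # (100, 3)
```

Collecting the arrays in a dict keyed by site means the column names fall out of the DataFrame constructor without a separate rename step.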

Parfait · Accepted Answer · 2020-09-28 02:22:45Z


With the data frame below, where the x and y columns are floats (unlike the posted data frame, whose np.array call casts every value to a string because of the site values):

orig = pd.DataFrame(np.array([[2.09328, 11.4043282, 0.1, 1.1], 
                              [5.985439, 6.59949, 0.3, 0.19], 
                              [8.5543045, 9.5402459, 0.09, 1.2]]),
                    index=['a', 'b', 'c'],
                    columns=['x_min', 'x_max','y_min','y_max']).rename_axis('site')

consider a list comprehension, followed by a transpose of rows and columns and a renaming of the columns:

z = 24  # constant from the question

final_df = (pd.DataFrame([np.linspace(orig.loc[s, 'x_min'], orig.loc[s, 'x_max'], 100) *
                          np.linspace(orig.loc[s, 'y_min'], orig.loc[s, 'y_max'], 100) * z
                          for s in orig.index.values])
              .transpose()
              # inplace= was removed from set_axis in pandas 2.0; a new frame is returned by default
              .set_axis(orig.index.values, axis='columns')
           )

final_df.shape
# (100, 3)

final_df.head(10)
#            a          b          c
# 0   5.023872  43.095161  18.477298
# 1   5.779856  42.980042  20.803375
# 2   6.581441  42.864592  23.134811
# 3   7.428627  42.748812  25.471608
# 4   8.321413  42.632701  27.813764
# 5   9.259799  42.516259  30.161280
# 6  10.243786  42.399486  32.514155
# 7  11.273373  42.282382  34.872391
# 8  12.348561  42.164948  37.235986
# 9  13.469349  42.047182  39.604941
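As an alternative to iterating over sites, the same table can probably be built without any Python-level loop. The sketch below is not part of the answer's method; it assumes NumPy 1.16 or later, where np.linspace accepts array-valued start and stop and returns one column per element. The name vectorized_df is hypothetical.

```python
import numpy as np
import pandas as pd

# Numeric frame as defined in the answer.
orig = pd.DataFrame(
    [[2.09328, 11.4043282, 0.1, 1.1],
     [5.985439, 6.59949, 0.3, 0.19],
     [8.5543045, 9.5402459, 0.09, 1.2]],
    index=['a', 'b', 'c'],
    columns=['x_min', 'x_max', 'y_min', 'y_max'],
).rename_axis('site')

z = 24

# With array-valued endpoints, linspace returns shape (100, n_sites):
# samples run down axis 0, sites across axis 1.
xs = np.linspace(orig['x_min'].to_numpy(), orig['x_max'].to_numpy(), 100)
ys = np.linspace(orig['y_min'].to_numpy(), orig['y_max'].to_numpy(), 100)

# Elementwise product, labelled with the site names.
vectorized_df = pd.DataFrame(xs * ys * z, columns=orig.index.values)
print(vectorized_df.shape)  # (100, 3)
```

Because the samples already run along the first axis, no transpose is needed in this variant.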


2 Comments

JAG2024 Over a year ago

Hi @Parfait, thank you so much. This answer was great and worked perfectly when I applied it to my actual data. I always struggle to use list comprehensions to solve my problems. Do you have any recommendations for reading material that could help me improve? Or will time and experience just lead me to where I want to be?

Parfait Over a year ago

Great to hear, and glad to help! I'd say simply practice, even by trial and error. Anything you need returned as a list or dictionary from iterating over an iterable can be written as a list or dict comprehension. Whenever you initialize a list/dict and then append to it or assign items (=) inside a for loop, a comprehension can be used instead.
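To illustrate the rule of thumb in the comment above, here is a small sketch showing the initialize-then-append loop next to the equivalent list and dict comprehensions. The bounds dict and the variable names are hypothetical; the numbers are the x_min/x_max values from the question.

```python
import numpy as np

# Hypothetical input: site label -> (x_min, x_max), taken from the question's data.
bounds = {
    'a': (2.09328, 11.4043282),
    'b': (5.985439, 6.59949),
    'c': (8.5543045, 9.5402459),
}

# Loop form: initialize an empty list, then append inside the loop.
loop_result = []
for lo, hi in bounds.values():
    loop_result.append(np.linspace(lo, hi, 100))

# Same result as a list comprehension.
comp_result = [np.linspace(lo, hi, 100) for lo, hi in bounds.values()]

# Assigning into a dict inside a loop maps onto a dict comprehension.
by_site = {site: np.linspace(lo, hi, 100) for site, (lo, hi) in bounds.items()}
```

The comprehension moves the expression that was passed to append (or assigned with =) to the front, followed by the same for clause.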
