3

I am using pandas Python library and I want to add rows to existing DF and also keep the existing one.

My data looks like this:

product price   max_move_%
  1     100      10

I run loops like this:

for i, row in df_merged.iterrows():
for a in range((row['max_move_%']) * (- 1), row['max_move_%']):
    df_merged['price_new'] = df_merged['price'] * (1 - a / 100.00)

I want to get:

product price   max_move_%  true_move     price_new
1       100      10          -10            90
1       100      10          -9             91
 .....
1       100      10          10             110

But nothing happens and the df looks the same like before. What can I do to add new values to columns and on the same time leave the data from existing df?

I tried this:

df_loop = []
for i, row in df_merged.iterrows():
    for a in range((row['max_move_%']) * (- 1), row['max_move_%'] + 1):
    df_loop.append((df_merged['product'], df_merged['price'], f_merged['max_move_%'],a))

pd.DataFrame(df_loop, columns=('product','price','max_move_%','price_new'))

But it doesnt work like I supposed.

Thank you!

4
  • Could you expand you data example a bit more? Does you data have a single row for each product that needs to be expanded to multiple rows? Commented May 5, 2018 at 11:32
  • Hi, yes, it is excatly! Commented May 5, 2018 at 11:43
  • Does not 100 * (1 - (-10/100.00)) equals 110 instead of 90? Can you mathematically explain what you try to get into price_new column? Commented May 5, 2018 at 11:52
  • Hi, it is % change, so once it will be min will be 90 and max 110 Commented May 5, 2018 at 12:44

2 Answers 2

2

I just created a new DataFrame with all 5 desired columns, to add rows into this one:

import pandas as pd

df_merged = pd.DataFrame(data=[[1, 100, 10]], columns=['product', 'price', 'max_move_%'])
print(df_merged)
#    product  price  max_move_%
# 0        1    100          10

new_columns = ['product', 'price', 'max_move_%', 'true_move', 'price_new']
df_new = pd.DataFrame(columns=new_columns)

idx = 0
for i, row in df_merged.iterrows():
    for true_move in range((row['max_move_%']) * (- 1), row['max_move_%']+1):
        price_new = df_merged.iloc[i]['price'] * (1 + true_move / 100.00)
        df_new.loc[idx] = row.values.tolist() + [true_move, price_new]
        idx += 1

print(df_new)
#     product  price  max_move_%  true_move  price_new
# 0       1.0  100.0        10.0      -10.0       90.0
# 1       1.0  100.0        10.0       -9.0       91.0
# 2       1.0  100.0        10.0       -8.0       92.0
# 3       1.0  100.0        10.0       -7.0       93.0
# 4       1.0  100.0        10.0       -6.0       94.0
# 5       1.0  100.0        10.0       -5.0       95.0
# 6       1.0  100.0        10.0       -4.0       96.0
# 7       1.0  100.0        10.0       -3.0       97.0
# 8       1.0  100.0        10.0       -2.0       98.0
# 9       1.0  100.0        10.0       -1.0       99.0
# 10      1.0  100.0        10.0        0.0      100.0
# 11      1.0  100.0        10.0        1.0      101.0
# 12      1.0  100.0        10.0        2.0      102.0
# 13      1.0  100.0        10.0        3.0      103.0
# 14      1.0  100.0        10.0        4.0      104.0
# 15      1.0  100.0        10.0        5.0      105.0
# 16      1.0  100.0        10.0        6.0      106.0
# 17      1.0  100.0        10.0        7.0      107.0
# 18      1.0  100.0        10.0        8.0      108.0
# 19      1.0  100.0        10.0        9.0      109.0
# 20      1.0  100.0        10.0       10.0      110.0

I just modified your % change equation for evaluating price_new column values.

Sign up to request clarification or add additional context in comments.

2 Comments

This is absulutely awesome!!!!! Thank you a lot! Please, can you explain me this thing? True move column is filled with this? df_new.loc[idx] = row.values.tolist() + [a, price_new] Is it that true_move is before price_new, thats why it is filled there?
df_new.loc[idx] is assigning a new row into df_new dataframe. This dataframe has 5 columns (the desired ones), so any new row has these 5 values (one for each col). Line row.values.tolist() + [a, price_new] creates a python list of size 5, containing all values of the row. First part of sum (row.values.tolist()) adds the first 3 values to the new row from the current iterated row of df_merged, while second part of sum ([a, price_new]) adds the last 2 new values corresponding to the true_move and price_new columns. Variable a just make reference to true_move. Editing it...
0

If i understand correctly, something like this would work. Use the following to create a larger DataFrame based on your original one (but with 21x as many rows, in this example):

max_move = df_merged['max_move_%'][0]  # 10 in this case
num_rows_needed = max_move * 2 + 1  # 21 in this case
new = pd.concat([df_merged] * num_rows_needed).reset_index(drop=True)

Then add the new columns:

new['true_move'] = [i for i in range(-max_move, max_move + 1)]
new['price_new'] = new['price'] + new['true_move']

This gives the desired results as posted in your question.

Depending on what your larger dataset looks like, this might need to be tweaked slightly. If this doesn't work for your needs, please edit your question with a representative dataset to test on.

2 Comments

It ooks nice :) But.. I am getting a error in this line new['true_move'] = [i for i in range(-max_move, max_move + 1)]
I am getting Value Error: ---> 9 new['true_move'] = [i for i in range(-max_move, max_move + 1)]

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.