4

I have a the below df1:

  Date        Tickers Qty
01-01-2018    ABC    25
02-01-2018    BCD    25
02-01-2018    XYZ    31
05-01-2018    XYZ    25

and another df2 as below

  Date         ABC  BCD  XYZ
01-01-2018    123   5   78
02-01-2018    125   7   79
03-01-2018    127   6   81
04-01-2018    126   7   82
05-01-2018    124   6   83

I want a resultant column in df1 which is the product of the correct column and row in df2 - getting the right ticker's rate on the given date and let the other dates have nan within df1

  Date       df1['Product']
01-01-2018      3075
02-01-2018      175
02-01-2018      2449
03-01-2018      nan
04-01-2018      nan
05-01-2018      2075

This seems like standard python operation, but I just am unable to achieve this without writing a loop - which is taking a very long time to execute:

I merged the above 2 tables on Date and then ran the below loop

for i in range(len(df1)):
    try:
        df1['Product'][i] = df1[df1['Ticker'][i]][i]
    except ValueError:
        df['Product'][i] = np.nan

Is there any better pythonic way of achieving this and not writing this loop pls?

1
  • I would like to add that i could have more than one tickers on a given date and so using date as a pure index may not help? Commented Sep 12, 2018 at 6:53

2 Answers 2

9

Use:

df11 = df1.pivot('Date', 'Tickers','Qty')
df22 = df2.set_index('Date')

s = df22.mul(df11).bfill(axis=1).iloc[:, 0]
print (s)
Date
01-01-2018     3075.0
02-01-2018      175.0
03-01-2018        NaN
04-01-2018        NaN
05-01-2018     2075.0
Name: ABC, dtype: float64

Solution for add new column to df1:

df11 = df1.pivot('Date', 'Tickers','Qty')
df22 = df2.set_index('Date')

df = df1.join(df22.mul(df11).stack().rename('new'), on=['Date','Tickers'], how='left')
print (df)
         Date Tickers  Qty     new
0  01-01-2018     ABC   25  3075.0
1  02-01-2018     BCD   25   175.0
2  05-01-2018     XYZ   25  2075.0

EDIT:

If pairs Dates with Tickers are duplicated, solution above is not possible use.

print (df1)
         Date Tickers  Qty
0  01-01-2018     ABC   25
1  01-01-2018     ABC   20 <-added duplicated pairs 01-01-2018 and ABC
2  02-01-2018     XYZ   31
3  02-01-2018     BCD   25
4  05-01-2018     XYZ   25

df3 = df1[['Date']].copy()
#add new values to column
df3['new'] = df2.set_index('Date').lookup(df1['Date'], df1['Tickers']) * df1['Qty']
#add missing values to duplicated Dates 
df3 = df2[['Date']].drop_duplicates().merge(df3, how='left') 
print (df3)
         Date     new
0  01-01-2018  3075.0
1  01-01-2018  2460.0
2  02-01-2018  2449.0
3  02-01-2018   175.0
4  03-01-2018     NaN
5  04-01-2018     NaN
6  05-01-2018  2075.0
Sign up to request clarification or add additional context in comments.

4 Comments

df11 = df1.pivot('Date', 'Tickers','Qty') gives ValueError: Index contains duplicate entries, cannot reshape
apologies for wasting your time so far by not providing the info that there can be multiple tickers on a single date..
@spiff - Yes, sure. Give me some time.
@spiff - Unfortunately duplicates complicated solution, so create another one.
3

you need to set 'Date' as index and multiply,

df1=df1.set_index('Date')
df2=df2.set_index('Date')


df3=(df2['ABC']*df1['Qty']).reset_index() 
print(df3)
    Date        0
0   01-01-2018  3075.0
1   02-01-2018  3125.0
2   03-01-2018  NaN
3   04-01-2018  NaN
4   05-01-2018  3100.0

1 Comment

thanks vm - but i think this may not work if I have more than one tickers on the same date? Edit: maybe I should have explained that in my example itself, apologies for that

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.