>>> p1.head()           
   StreamId            Timestamp    SeqNum
0         3  1490250116391063414  1158
1         3  1490250116391348339  3600
2         3  1490250116391542829  3600
3         3  1490250116391577184  1437
4         3  1490250116392819426  1389


>>> oss.head()
   OrderID    Symbol  Stream     SeqNo
0  5000000  AXBANK       3      1158
1  5000001  AXBANK       6      1733
2  5000002  AXBANK       6      1244
3  5000003  AXBANK       6      1388
4  5000004  AXBANK       3      1389

How can these be merged using two attributes as the key (SeqNum and StreamId)?

>>> merge
   OrderID    Symbol  Stream     SeqNo    Timestamp
0  5000000  AXBANK       3      1158      1490250116391063414
1  5000001  AXBANK       6      1733      NaN
2  5000002  AXBANK       6      1244      NaN
3  5000003  AXBANK       6      1388      NaN
4  5000004  AXBANK       3      1389      1490250116392819426

I tried using

oss['Time1'] = oss['SeqNo'].map(p1.set_index('SeqNum')['Timestamp'])

But I need to use both pairs (SeqNum-SeqNo and StreamId-Stream) as keys. I know this is easy if I rename the columns to match in both dataframes and use merge, but I want to avoid that. I would rather use something generic: take THIS dataframe, map THESE columns to THOSE columns in ANOTHER dataframe, and fetch the required columns.

2 Answers

Using join

oss.join(p1.set_index(['StreamId', 'SeqNum']), on=['Stream', 'SeqNo'])

   OrderID  Symbol  Stream  SeqNo     Timestamp
0  5000000  AXBANK       3   1158  1.490250e+18
1  5000001  AXBANK       6   1733           NaN
2  5000002  AXBANK       6   1244           NaN
3  5000003  AXBANK       6   1388           NaN
4  5000004  AXBANK       3   1389  1.490250e+18
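
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the two sample frames from the question and applies the same join. The variable name `merged` is only illustrative; everything else comes from the question.

```python
import pandas as pd

# Sample frames rebuilt from the question.
p1 = pd.DataFrame({
    "StreamId": [3, 3, 3, 3, 3],
    "Timestamp": [
        1490250116391063414,
        1490250116391348339,
        1490250116391542829,
        1490250116391577184,
        1490250116392819426,
    ],
    "SeqNum": [1158, 3600, 3600, 1437, 1389],
})
oss = pd.DataFrame({
    "OrderID": [5000000, 5000001, 5000002, 5000003, 5000004],
    "Symbol": ["AXBANK"] * 5,
    "Stream": [3, 6, 6, 6, 3],
    "SeqNo": [1158, 1733, 1244, 1388, 1389],
})

# Index p1 by its key columns, then join on the differently named
# key columns of oss. DataFrame.join defaults to a left join, so
# unmatched oss rows keep NaN in Timestamp.
merged = oss.join(p1.set_index(["StreamId", "SeqNum"]), on=["Stream", "SeqNo"])
print(merged)
```

The order of the names in `on` has to line up with the order of the index levels, so `Stream` pairs with `StreamId` and `SeqNo` with `SeqNum`.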

I think you need merge with drop:

print (pd.merge(oss, p1, left_on=['Stream','SeqNo'], 
                         right_on=['StreamId','SeqNum'],how='left')
          .drop(['StreamId','SeqNum'], axis=1))

   OrderID  Symbol  Stream  SeqNo     Timestamp
0  5000000  AXBANK       3   1158  1.490250e+18
1  5000001  AXBANK       6   1733           NaN
2  5000002  AXBANK       6   1244           NaN
3  5000003  AXBANK       6   1388           NaN
4  5000004  AXBANK       3   1389  1.490250e+18
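
The question asks for something generic ("take THIS dataframe, map THESE columns to THOSE columns"). One way to wrap the merge above is sketched below. The helper name `fetch_columns` and its parameters are hypothetical, not part of pandas, and dropping duplicate keys on the right side is an assumption made so that the result keeps one row per left row.

```python
import pandas as pd

def fetch_columns(left, right, key_map, columns):
    """Hypothetical helper: left-join `columns` from `right` onto `left`.

    key_map maps key column names in `left` to the matching names in `right`.
    """
    left_keys = list(key_map.keys())
    right_keys = list(key_map.values())
    # Assumption: keep only the first right-hand row per key so the
    # result has exactly one row per row of `left`.
    lookup = right[right_keys + list(columns)].drop_duplicates(subset=right_keys)
    merged = left.merge(lookup, how="left", left_on=left_keys, right_on=right_keys)
    # Drop right-hand key columns that do not exist in `left`.
    extra = [k for k in right_keys if k not in left.columns]
    return merged.drop(columns=extra)

# Sample frames rebuilt from the question.
p1 = pd.DataFrame({
    "StreamId": [3, 3, 3, 3, 3],
    "Timestamp": [
        1490250116391063414,
        1490250116391348339,
        1490250116391542829,
        1490250116391577184,
        1490250116392819426,
    ],
    "SeqNum": [1158, 3600, 3600, 1437, 1389],
})
oss = pd.DataFrame({
    "OrderID": [5000000, 5000001, 5000002, 5000003, 5000004],
    "Symbol": ["AXBANK"] * 5,
    "Stream": [3, 6, 6, 6, 3],
    "SeqNo": [1158, 1733, 1244, 1388, 1389],
})

result = fetch_columns(oss, p1, {"Stream": "StreamId", "SeqNo": "SeqNum"}, ["Timestamp"])
print(result)
```

This sketch does not handle the case where a right-hand key shares its name with a non-key column of the left frame, because pandas would then add `_x`/`_y` suffixes.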

Another solution, renaming the columns first:

d = {'Stream':'StreamId','SeqNo':'SeqNum'}
print (pd.merge(oss.rename(columns=d), p1, how='left'))
   OrderID  Symbol  StreamId  SeqNum     Timestamp
0  5000000  AXBANK         3    1158  1.490250e+18
1  5000001  AXBANK         6    1733           NaN
2  5000002  AXBANK         6    1244           NaN
3  5000003  AXBANK         6    1388           NaN
4  5000004  AXBANK         3    1389  1.490250e+18

2 Comments

Is there a way to keep the timestamp as it is, and not in exponent (e) notation?
I think there is only one way: convert it to str beforehand with p1.Timestamp = p1.Timestamp.astype(str). It is impossible to keep int values together with float values (NaN is a float), so the ints are always converted to float. See the docs.
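
As a possible alternative to the string conversion, and assuming pandas 1.0 or later, the nullable `Int64` extension dtype can hold integers alongside missing values. The sketch below casts `Timestamp` before merging. This matters here because the timestamps are larger than 2**53, so a float64 column cannot represent them exactly. The variable name `merged` is illustrative.

```python
import pandas as pd

# Sample frames rebuilt from the question.
p1 = pd.DataFrame({
    "StreamId": [3, 3, 3, 3, 3],
    "Timestamp": [
        1490250116391063414,
        1490250116391348339,
        1490250116391542829,
        1490250116391577184,
        1490250116392819426,
    ],
    "SeqNum": [1158, 3600, 3600, 1437, 1389],
})
oss = pd.DataFrame({
    "OrderID": [5000000, 5000001, 5000002, 5000003, 5000004],
    "Symbol": ["AXBANK"] * 5,
    "Stream": [3, 6, 6, 6, 3],
    "SeqNo": [1158, 1733, 1244, 1388, 1389],
})

# Cast to the nullable integer dtype (capital "I") before merging so
# missing matches become <NA> instead of forcing a float conversion.
p1["Timestamp"] = p1["Timestamp"].astype("Int64")

merged = (
    oss.merge(p1, how="left",
              left_on=["Stream", "SeqNo"],
              right_on=["StreamId", "SeqNum"])
       .drop(columns=["StreamId", "SeqNum"])
)
print(merged)
print(merged["Timestamp"].dtype)
```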
