
I have two pandas dataframes. One is the source and the other is the destination. I want to update values based on multiple conditions that compare both dataframes. The source dataframe looks like this:

     Old_ID    New_ID   DATE      dt_insert
     FIRM345   FIRM21   21.11.19  11.11.19
     FIRM321   FIRM41   19.10.19  18.10.19

The destination dataframe looks like this:

     Old_ID    New_ID   DATE     
     FIRM345   FIRM21   21.11.19
     FIRM321   FIRM41   19.10.19

I want to know if there is a way to apply the following logic without using loops:

if src.Old_ID == dest.Old_ID AND src.New_ID == dest.New_ID AND src.DATE == dest.DATE

THEN dest.dt_insert = src.DATE

ELSE append src row to destination dataframe

  • What is the expected output? Commented Oct 22, 2019 at 15:26
  • @Erfan I figured that was close enough. But the question is reopened, as you pointed out. Commented Oct 22, 2019 at 15:29
  • @Dan updated destination dataframe as per the logic described above. Commented Oct 22, 2019 at 15:38

2 Answers


You can solve your problem using this approach:

  1. outer join destination dataframe with a source dataframe on multiple keys (Old_ID, New_ID, DATE);
  2. replace a value in dt_insert column with a value from DATE column if the observation's merge keys are found in both dataframes;
  3. delete the auxiliary column _merge.

    import pandas as pd
    
    src_data = [{'Old_ID': 'FIRM345', 'New_ID': 'FIRM21', 'DATE': '21.11.19', 'dt_insert': '11.11.19'},
                {'Old_ID': 'FIRM321', 'New_ID': 'FIRM41', 'DATE': '19.10.19', 'dt_insert': '18.10.19'},
                {'Old_ID': 'FIRM333', 'New_ID': 'FIRM31', 'DATE': '20.10.19', 'dt_insert': '20.10.19'}]
    
    dest_data = [{'Old_ID': 'FIRM345', 'New_ID': 'FIRM21', 'DATE': '21.11.19'},
                 {'Old_ID': 'FIRM321', 'New_ID': 'FIRM41', 'DATE': '19.10.19'}]
    
    df_src = pd.DataFrame(src_data)
    print(df_src)
    
    #        DATE  New_ID   Old_ID dt_insert
    # 0  21.11.19  FIRM21  FIRM345  11.11.19
    # 1  19.10.19  FIRM41  FIRM321  18.10.19
    # 2  20.10.19  FIRM31  FIRM333  20.10.19
    
    df_dest = pd.DataFrame(dest_data)
    print(df_dest)
    
    #        DATE  New_ID   Old_ID
    # 0  21.11.19  FIRM21  FIRM345
    # 1  19.10.19  FIRM41  FIRM321
    
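    # Outer join on all three keys; indicator=True adds a _merge column
    # ('both', 'left_only' or 'right_only') for every row
    df_dest_new = pd.merge(left=df_dest, right=df_src, how='outer',
                           on=['Old_ID', 'New_ID', 'DATE'], indicator=True)
    # Use column labels rather than positions to pick values from each row
    df_dest_new['dt_insert'] = df_dest_new[['DATE', 'dt_insert', '_merge']].apply(lambda x: x['DATE'] if x['_merge'] == 'both' else x['dt_insert'], axis=1)
    df_dest_new = df_dest_new.drop(labels='_merge', axis=1)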
    print(df_dest_new)
    
    #        DATE  New_ID   Old_ID dt_insert
    # 0  21.11.19  FIRM21  FIRM345  21.11.19
    # 1  19.10.19  FIRM41  FIRM321  19.10.19
    # 2  20.10.19  FIRM31  FIRM333  20.10.19
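
Since the question asks for a loop-free solution, the row-wise `apply` in step 2 can also be swapped for a column-wise `numpy.where`. The following is only a sketch of that variant: the helper name `upsert_dt_insert` is hypothetical, and it assumes the destination dataframe has no `dt_insert` column yet (otherwise the merge would produce suffixed `dt_insert_x` / `dt_insert_y` columns).

```python
import numpy as np
import pandas as pd


def upsert_dt_insert(dest, src, keys=("Old_ID", "New_ID", "DATE")):
    """Hypothetical helper: outer-merge dest with src, then fill dt_insert.

    Assumes dest has no dt_insert column of its own.
    """
    merged = dest.merge(src, how="outer", on=list(keys), indicator=True)
    # Rows found in both frames take DATE; all other rows keep the
    # dt_insert value that came from src.
    merged["dt_insert"] = np.where(
        merged["_merge"] == "both", merged["DATE"], merged["dt_insert"]
    )
    return merged.drop(columns="_merge")


df_src = pd.DataFrame({
    "Old_ID": ["FIRM345", "FIRM321", "FIRM333"],
    "New_ID": ["FIRM21", "FIRM41", "FIRM31"],
    "DATE": ["21.11.19", "19.10.19", "20.10.19"],
    "dt_insert": ["11.11.19", "18.10.19", "20.10.19"],
})
df_dest = pd.DataFrame({
    "Old_ID": ["FIRM345", "FIRM321"],
    "New_ID": ["FIRM21", "FIRM41"],
    "DATE": ["21.11.19", "19.10.19"],
})

result = upsert_dt_insert(df_dest, df_src)
print(result)
```

The row order of an outer merge can differ between pandas versions, so it is safer to look rows up by key than to rely on their position in the result.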
    

This should work

    import pandas as pd

    data = {'Old_ID':['FIRM345', 'FIRM321', 'FIRM11'], 'New_ID':['Firm21','FIRM41','FIRM42'],
            'DATE':['21.11.19', '19.10.19', '19.12.19'], 'dt_insert':['11.11.19','18.10.19','18.12.19']}
    data2 = {'Old_ID':['FIRM345', 'FIRM321','FIRM12'], 'New_ID':['Firm21','FIRM41', 'FIRM43'],
            'DATE':['21.11.19', '19.10.19','19.12.19']}
    src = pd.DataFrame(data)
    dest = pd.DataFrame(data2)

    print(src)
    print(dest)

    # Compare rows by their (Old_ID, New_ID, DATE) key triple, not whole columns
    keys = ['Old_ID', 'New_ID', 'DATE']
    src_keys = pd.MultiIndex.from_frame(src[keys])
    dest_keys = pd.MultiIndex.from_frame(dest[keys])

    # dest rows that match a src row get dt_insert = DATE; the rest stay NaN
    dest['dt_insert'] = dest['DATE'].where(dest_keys.isin(src_keys))

    # src rows with no match in dest are appended to dest
    dest = pd.concat([dest, src[~src_keys.isin(dest_keys)]], ignore_index=True)

    print(src)
    print(dest)
