4
I want to set df1["name"] = df2["name"] where df1["id"] == df2["id"].

The two dataframes are of different sizes. I am trying to implement this behavior with the code below:

   dtl['name'] = dtlLookUp[["name"]].loc[ dtlLookUp["id"] == (dtl["id"]) ]

However, I am getting this error:

ValueError: Can only compare identically-labeled Series objects


  • dtl['name'] = dtlLookUp[["name"]].loc[ dtlLookUp["id"].isin(dtl["id"]) ] ? Can you come up with a minimal reproducible example? Commented Oct 31, 2018 at 17:32
  • Could you post the output of dtl.info() and dtlLookUp.info()? Commented Oct 31, 2018 at 17:34
  • @Vishnudev , done that. Commented Oct 31, 2018 at 17:52
  • @harvpan, I want to get the name for that particular id. So I think I need a comparison with == rather than .isin(). Does that seem right? Commented Oct 31, 2018 at 17:54
  • @frozenshine can you come up with a minimal reproducible example ? Commented Oct 31, 2018 at 17:55

2 Answers

13

My problem is solved. I am posting the solution for anyone else who might encounter the same error (I searched for this error, but none of the already posted solutions worked for me, so I simply changed my approach to the problem). I treated this problem as a left join.

    psb = pd.merge(dtl, dtlLookUp, how='left', on=['id'])
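To make the left join concrete, here is a minimal sketch. The sample data is invented for illustration (the real contents of dtl and dtlLookUp are not shown in the question), and it assumes that only dtlLookUp has a name column. If both frames had one, pandas would suffix them as name_x and name_y.

```python
import pandas as pd

# Hypothetical sample data: the lookup table is smaller than the main table.
dtl = pd.DataFrame({"id": [1, 2, 3, 4], "amount": [10, 20, 30, 40]})
dtlLookUp = pd.DataFrame({"id": [1, 3], "name": ["alpha", "gamma"]})

# A left join keeps every row of dtl and pulls in name where the id matches.
# Ids with no match in dtlLookUp get NaN in the name column.
psb = pd.merge(dtl, dtlLookUp, how="left", on=["id"])
print(psb)
```

Because the join matches on the values in id rather than on row labels, it does not matter that the two dataframes have different sizes.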

2

Convert the id columns of both dataframes to the same dtype before the condition check. I am assuming the columns named id should have dtype int.

df1['id'] = df1.id.astype(int)
df2['id'] = df2.id.astype(int)

Put values from the other dataframe based on selection

selection = (df1.id == df2.id)
df1.loc[selection, 'name'] = df2.loc[selection, 'name']

4 Comments

If I use astype(int), I get this error: ValueError: cannot convert float NaN to integer
You need to fill the NaN values before comparing. Use df.id.fillna(0) to fill in zeros where the value is empty, i.e. np.nan
OK, thanks for that. After filling the NaNs, selection = (df1.id == df2.id) gives the same error as mentioned in the question :(
@frozenshine How about using df.reset_index(drop=True) on both dataframes and then selecting? Pandas uses the index for matching
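As a rough illustration of the index-matching point, the sketch below reproduces the error with two Series of different lengths and then shows one possible alternative that looks names up by id value instead of by row position. The data is made up, and the map-based lookup is an assumption of this sketch rather than something proposed in the thread. It also assumes the ids in df2 are unique.

```python
import pandas as pd

# Hypothetical data: df1 has more rows than df2, as in the question.
df1 = pd.DataFrame({"id": [1, 2, 3, 4]})
df2 = pd.DataFrame({"id": [3, 1], "name": ["gamma", "alpha"]})

# Element-wise == requires both Series to carry identical index labels,
# so comparing Series of different lengths raises a ValueError.
error_message = None
try:
    df1["id"] == df2["id"]
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# Alternative: build an id -> name lookup Series and map df1's ids through it.
# Ids missing from df2 become NaN. Assumes df2["id"] has no duplicates.
lookup = df2.set_index("id")["name"]
df1["name"] = df1["id"].map(lookup)
print(df1)
```

Resetting the index alone would not help here, because the two Series would still have different lengths and therefore different labels.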
