How to match two DataFrames in Python

Question

I have two pandas DataFrames in Python. The first one is df1:

'ID'  'B'
 AA    10
 BB    20
 CC    30
 DD    40
   DD    40

The second one is df2:

'ID'  'C'  'D'
 BB    30    0
 DD    35    0

What I finally want to get is df3:

'ID'  'C'  'D'
 BB    30   20
 DD    35   40

How can I reach this goal? My code is:

for i in df.ID
  if len(df2.ID[df2.ID==i]):
    df2.D[df2.ID==i]=df1.B[df2.ID==i]

But it doesn't work.

So are you trying to replace the zeros with values?

– Shijo, Commented Aug 26, 2016 at 16:31

Work of Artiz · Accepted Answer · 2016-08-26 21:41:21Z

3

First of all, I may have interpreted the question differently than you intended, since your description is rather ambiguous. My reading boils down to this:

df1 is this data structure:

ID   B            <- column names
AA  10
BB  20
CC  30
DD  40

df2 is this data structure:

ID   C  D        <- column names
BB  30  0
DD  35  0

pandas provides a merge function. To merge the two DataFrames on their shared ID column, the following code works:

import pandas as pd

df1 = pd.DataFrame(
    [
        ['AA', 10],
        ['BB', 20],
        ['CC', 30],
        ['DD', 40],
    ],
    columns=['ID','B'],
)
df2 = pd.DataFrame(
    [
        ['BB', 30, 0],
        ['DD', 35, 0],
    ], columns=['ID', 'C', 'D']
)

df3 = pd.merge(df1, df2, on='ID')

Now df3 contains only the rows whose IDs appear in both df1 and df2:

ID   B   C  D    <- column names
BB  20  30  0
DD  40  35  0
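
As a minimal sketch of why only BB and DD survive: pd.merge performs an inner join by default, so IDs missing from either DataFrame are dropped. The how='left' call is included only for contrast and is not needed for the result you want; the variable names inner and left are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': ['AA', 'BB', 'CC', 'DD'], 'B': [10, 20, 30, 40]})
df2 = pd.DataFrame({'ID': ['BB', 'DD'], 'C': [30, 35], 'D': [0, 0]})

# Default (inner) join: keeps only the IDs present in both DataFrames.
inner = pd.merge(df1, df2, on='ID')

# Left join, for contrast: keeps every row of df1, and C and D are NaN
# for the IDs that have no match in df2.
left = pd.merge(df1, df2, on='ID', how='left')

print(inner)
print(left)
```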

What you were trying to do is replace the values in D with the matching values from column B, i.e.:

ID  C  D
BB 30 20 
DD 35 40

This can be done in three simple steps:

df3 = pd.merge(df1, df2, on='ID')  # merge on the shared ID column
df3['D'] = df3['B']                # set D to B's values
del df3['B']                       # remove B from df3

Or to summarize:

def match(df1, df2):
    df3 = pd.merge(df1, df2, on='ID')  # merge on the shared ID column
    df3['D'] = df3['B']                # set D to B's values
    del df3['B']                       # remove B from df3
    return df3
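
For comparison, here is an alternative sketch that is not part of the steps above: it fills D by looking up each ID of df2 in df1, which is closer to what the loop in the question attempts. The name lookup is illustrative, and the sketch assumes every ID occurs at most once in df1.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': ['AA', 'BB', 'CC', 'DD'], 'B': [10, 20, 30, 40]})
df2 = pd.DataFrame({'ID': ['BB', 'DD'], 'C': [30, 35], 'D': [0, 0]})

# Build an ID -> B lookup Series from df1 (assumes IDs in df1 are unique).
lookup = df1.set_index('ID')['B']

# Work on a copy so df2 itself is left unchanged.
df3 = df2.copy()

# For every ID in df3, take the B value stored under that ID in df1.
df3['D'] = df3['ID'].map(lookup)

print(df3)
```

Unlike the merge, this keeps every row of df2: an ID that is missing from df1 would get NaN in D instead of being dropped.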

edited Aug 26, 2016 at 21:41

answered Aug 26, 2016 at 17:38

Work of Artiz


1 Comment

dongxu mo Over a year ago

Thank you for your answer. Your interpretation was right, that's what I want.

Shijo · Answer · 2016-08-26 16:40:47Z

0

The following code replaces each zero in df1 with the value at the same position in df2:

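import pandas as pd

df1 = pd.DataFrame(['A', 'B', 0, 4, 6], columns=['x'])
df2 = pd.DataFrame(['A', 'X', 3, 0, 5], columns=['x'])
# Zeros in df1 become NaN, then fillna takes the value from df2 at the same position
df3 = df1[df1 != 0].fillna(df2)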
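
The same replacement can also be sketched with DataFrame.mask, which swaps in values from another DataFrame wherever a condition holds. This is offered as an alternative to the snippet above, not as its original form; it uses the same sample data.

```python
import pandas as pd

df1 = pd.DataFrame(['A', 'B', 0, 4, 6], columns=['x'])
df2 = pd.DataFrame(['A', 'X', 3, 0, 5], columns=['x'])

# Wherever df1 equals 0, take the value at the same position in df2;
# everywhere else, keep df1's own value.
df3 = df1.mask(df1 == 0, df2)

print(df3)
```

This avoids the intermediate NaN step, so values in df1 that were already missing are left as they are rather than being filled from df2.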

answered Aug 26, 2016 at 16:40

Shijo


1 Comment

dongxu mo Over a year ago

Maybe the description in my question confused you. Your answer is not what I want, but I can still learn something from it. Thank you as well.
