
I have two DataFrames, df_a and df_b:

# df_a
number    cur    code
1000      USD    700
2000      USD    800
3000      USD    900

# df_b
number    amount    deletion    code
1000      0.0       L           700
1000      10.0      X           700
1000      10.0      X           700
2000      20.0      X           800
2000      20.0      X           800
3000      0.0       L           900
3000      0.0       L           900

I want to left merge df_a with df_b:

df_a = df_a.merge(df_b.loc[df_b.deletion != 'L'], how='left', on=['number', 'code'])

I also want to create a flag column called deleted in the merge result df_a, with three possible values: full, partial and none.

full - all rows associated with a particular number value have deletion = L;

partial - some, but not all, rows associated with a particular number value have deletion = L;

none - no rows associated with a particular number value have deletion = L.
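The three rules can be sketched as a small per-group classifier. This is only an illustration on the sample df_b from above; the helper name classify is hypothetical, and it groups by number alone, as the rules are worded.

```python
import pandas as pd

# Sample data copied from df_b above
df_b = pd.DataFrame({
    "number": [1000, 1000, 1000, 2000, 2000, 3000, 3000],
    "amount": [0.0, 10.0, 10.0, 20.0, 20.0, 0.0, 0.0],
    "deletion": ["L", "X", "X", "X", "X", "L", "L"],
    "code": [700, 700, 700, 800, 800, 900, 900],
})


def classify(deletion: pd.Series) -> str:
    """Hypothetical helper: label one group of deletion values."""
    is_deleted = deletion.eq("L")
    if is_deleted.all():   # every row has deletion = L
        return "full"
    if is_deleted.any():   # some, but not all, rows have deletion = L
        return "partial"
    return "none"          # no row has deletion = L


# One label per number value
flags = df_b.groupby("number")["deletion"].agg(classify)
print(flags.to_dict())
```

On the sample data this labels 1000 as partial, 2000 as none and 3000 as full.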

Rows from df_b with deletion = L should be excluded from the merge itself, so the result looks like this:

 number    amount    deletion    deleted    cur    code
 1000      10.0      X           partial    USD    700
 1000      10.0      X           partial    USD    700
 2000      20.0      X           none       USD    800
 2000      20.0      X           none       USD    800
 3000      NaN       NaN         full       USD    900

I tried:

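# True for rows that are NOT flagged for deletion, grouped by (number, code)
g = df_b['deletion'].ne('L').groupby([df_b['number'], df_b['code']])
m1 = g.any()  # at least one row in the group is kept
m2 = g.all()  # every row in the group is kept

d1 = dict.fromkeys(m1.index[m1 & ~m2], 'partial')
d2 = dict.fromkeys(m1.index[~m1], 'full')  # no row is kept: all have deletion = L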

d = {**d1, **d2}
df_a = df_a.merge(df_b.loc[df_b.deletion != 'L'], how='left', on=['code', 'number'])

df_a['deleted'] = df_a[['number', 'code']].map(d).fillna('none')

but I got an error:

AttributeError: 'DataFrame' object has no attribute 'map'

It seems a DataFrame does not have a map method, so I am wondering whether there is an alternative way to achieve this.

  • @jpp Sorry, updated again. I was trying df_a['deleted'] = df_a[['number', 'code']].map(d).fillna('none'), which caused the error, so I am wondering if there is any other way to do the same thing. Commented Aug 8, 2018 at 10:56
  • Does this answer your question? AttributeError: 'DataFrame' object has no attribute 'map' Commented Feb 8, 2020 at 1:06

1 Answer


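pd.DataFrame objects don't have a map method that looks up rows in a dictionary. (pandas 2.1 added DataFrame.map, but that applies a function elementwise.) You can instead construct an index from the two columns and use pd.Index.map with a function: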

df_a['deleted'] = df_a.set_index(['number', 'code']).index.map(d.get)
df_a['deleted'] = df_a['deleted'].fillna('none')
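For reference, here is a self-contained sketch of the whole flow on the sample data from the question. The names kept, any_kept, all_kept and merged are illustrative only, and the dictionary is built so that full means no row in the group survives the deletion filter.

```python
import pandas as pd

# Sample data copied from the question
df_a = pd.DataFrame({
    "number": [1000, 2000, 3000],
    "cur": ["USD", "USD", "USD"],
    "code": [700, 800, 900],
})
df_b = pd.DataFrame({
    "number": [1000, 1000, 1000, 2000, 2000, 3000, 3000],
    "amount": [0.0, 10.0, 10.0, 20.0, 20.0, 0.0, 0.0],
    "deletion": ["L", "X", "X", "X", "X", "L", "L"],
    "code": [700, 700, 700, 800, 800, 900, 900],
})

# True where a row is NOT flagged for deletion, grouped by (number, code)
kept = df_b["deletion"].ne("L").groupby([df_b["number"], df_b["code"]])
any_kept = kept.any()
all_kept = kept.all()

# Keys are (number, code) tuples; groups missing from d fall back to 'none'
d = {
    **dict.fromkeys(any_kept.index[any_kept & ~all_kept], "partial"),
    **dict.fromkeys(any_kept.index[~any_kept], "full"),
}

# Left merge, ignoring rows with deletion = L
merged = df_a.merge(df_b.loc[df_b["deletion"] != "L"],
                    how="left", on=["number", "code"])

# Look up each (number, code) pair through the index, then fill the gaps
merged["deleted"] = merged.set_index(["number", "code"]).index.map(d.get)
merged["deleted"] = merged["deleted"].fillna("none")
print(merged)
```

The 3000 row has no surviving match in df_b, so its amount and deletion come out as NaN while deleted is full.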

Compatibility note

For pandas versions >0.25, you can pass a dictionary directly to pd.Index.map, i.e. use d instead of d.get.

For earlier versions, we use d.get instead of d because, unlike pd.Series.map, pd.Index.map does not accept a dictionary directly, but it does accept a function such as dict.get. Note also that the fillna operation is split out because pd.Index.map returns an array rather than a series.
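If the version-dependent behaviour of pd.Index.map is a concern, one alternative (a different technique from the one above, sketched here under the same sample data) is to build the flag as a small lookup frame and attach it with a second merge. The names share, flags, label and merged are illustrative.

```python
import pandas as pd

# Sample data copied from the question
df_a = pd.DataFrame({
    "number": [1000, 2000, 3000],
    "cur": ["USD", "USD", "USD"],
    "code": [700, 800, 900],
})
df_b = pd.DataFrame({
    "number": [1000, 1000, 1000, 2000, 2000, 3000, 3000],
    "amount": [0.0, 10.0, 10.0, 20.0, 20.0, 0.0, 0.0],
    "deletion": ["L", "X", "X", "X", "X", "L", "L"],
    "code": [700, 700, 700, 800, 800, 900, 900],
})

# Share of rows with deletion = L per (number, code) group
share = df_b["deletion"].eq("L").groupby([df_b["number"], df_b["code"]]).mean()


def label(s: float) -> str:
    """Hypothetical helper: 1.0 means all rows are L, 0.0 means none are."""
    if s == 1:
        return "full"
    if s == 0:
        return "none"
    return "partial"


# Lookup frame with columns number, code, deleted
flags = share.map(label).rename("deleted").reset_index()

# Merge the surviving rows first, then attach the flag by key
merged = (
    df_a.merge(df_b.loc[df_b["deletion"] != "L"], how="left", on=["number", "code"])
        .merge(flags, how="left", on=["number", "code"])
)
# Keys that never appear in df_b have no flag; treat them as 'none'
merged["deleted"] = merged["deleted"].fillna("none")
print(merged)
```

This avoids building a tuple-keyed dictionary at all, at the cost of a second merge.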
