pandas string replace TypeError - replace words using pandas df column

Question

I am using Python 3.7 and have a large data frame that includes a column of addresses, like so:

import pandas as pd

c = ['420 rodeo drive', '22 alaska avenue', '919 franklin boulevard', '39 emblem alley', '67 fair way']
df1 = pd.DataFrame(c, columns=["address"])

There is also a dataframe of common address abbreviations and their USPS versions. An example:

x = ['avenue', 'drive', 'boulevard', 'alley', 'way']
y = ['ave', 'dr', 'blvd', 'aly', 'way']
df2 = pd.DataFrame(list(zip(x,y)), columns = ['common_abbrev', 'usps_abbrev'], dtype = "str")

What I would like to do is as follows:

Search df1['address'] for occurrences of words in df2['common_abbrev'] and replace with df2['usps_abbrev'].
Return the transformed df1.

To this end, I have tried several variations of what appears to be the canonical approach, Series.str.replace(), like so:

df1["address"] = df1["address"].str.replace(df2["common_abbrev"], df2["usps_abbrev"])

However, I get the following error:

`TypeError: repl must be a string or callable`.

My questions are:

Since I've already given the dtype for df2, why am I receiving the error?

How can I produce my desired result of:

             address
0       420 rodeo dr
1      22 alaska ave
2  919 franklin blvd
3      39 emblem aly
4        67 fair way

Thanks for your help.

Mayank Porwal · Accepted Answer · 2021-02-08 17:01:50Z


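First create a dict that maps each common word to its USPS abbreviation using zip. Then pass it to Series.replace with regex=True, so that matching substrings inside each address are replaced: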

In [703]: x = ['avenue', 'drive', 'boulevard', 'alley', 'way']
     ...: y = ['ave', 'dr', 'blvd', 'aly', 'way']

In [686]: d = dict(zip(x, y))

In [691]: df1.address = df1.address.replace(d, regex=True)

In [692]: df1
Out[692]: 
             address
0       420 rodeo dr
1      22 alaska ave
2  919 franklin blvd
3      39 emblem aly
4        67 fair way
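
As for the TypeError: Series.str.replace takes a single pattern and a single replacement (a string or a callable), so passing a whole Series as repl fails no matter what dtype that Series has. The mapping can also be built straight from df2 rather than from the separate lists. A minimal, self-contained sketch, assuming the df1 and df2 from the question with the USPS column lined up so that 'avenue' pairs with 'ave' and 'alley' with 'aly' (the name `mapping` is only illustrative):

```python
import pandas as pd

df1 = pd.DataFrame(
    {"address": ["420 rodeo drive", "22 alaska avenue",
                 "919 franklin boulevard", "39 emblem alley", "67 fair way"]}
)
df2 = pd.DataFrame(
    {"common_abbrev": ["avenue", "drive", "boulevard", "alley", "way"],
     "usps_abbrev": ["ave", "dr", "blvd", "aly", "way"]}
)

# Build {common word: USPS abbreviation} directly from the lookup frame.
mapping = dict(zip(df2["common_abbrev"], df2["usps_abbrev"]))

# regex=True makes Series.replace substitute matching substrings
# instead of requiring the whole cell to equal a dict key.
df1["address"] = df1["address"].replace(mapping, regex=True)
print(df1)
```

Without regex=True, Series.replace only swaps cells whose entire value equals a key, so none of these addresses would change.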

edited Feb 8, 2021 at 17:01

answered Feb 8, 2021 at 16:58

Mayank Porwal


5 Comments

Mustafa Aydın Over a year ago

d = dict(zip(x, y)) might be better

Mayank Porwal Over a year ago

@MustafaAydın Yes, updated my answer. Thanks for the suggestion.

jvalenti Over a year ago

@MayankPorwal thanks, this is clean and intuitive.

jvalenti Over a year ago

@MayankPorwal one more question: what is it about dict that converts my datatype to strings?

jvalenti Over a year ago

Also, some of the 'ave' replacements are adding an 'e' to the new string, so 'avenue' becomes 'avee' or 'aveee' instead of 'ave'. Is there a reason why?
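
One plausible cause of the stray 'e' characters (an assumption, since the full abbreviation table isn't shown) is that the real table contains overlapping keys such as 'av' and 'avenue'. With regex=True every key is matched as a substring, so a short key can fire inside a longer word or inside text that has already been replaced. A hedged sketch of one way around this: build a single pattern that matches keys only as whole words, and look up the replacement in a callable. The overlapping `mapping` and the sample addresses below are hypothetical:

```python
import re

import pandas as pd

# Hypothetical lookup with overlapping keys, for illustration only.
mapping = {"av": "ave", "aven": "ave", "avenue": "ave", "drive": "dr"}

addresses = pd.Series(["22 alaska avenue", "5 ocean av", "420 rodeo drive"])

# One alternation of all keys, escaped and wrapped in word boundaries (\b)
# so that 'av' cannot match inside 'avenue'. Longest keys are listed first.
keys = sorted(mapping, key=len, reverse=True)
pattern = r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b"

# Each whole-word match is replaced exactly once via the dict lookup,
# so text that was already replaced is never rescanned by another key.
result = addresses.str.replace(
    pattern, lambda m: mapping[m.group(0)], regex=True
)
print(result.tolist())
```

Because the whole series is scanned in a single pass, the order of the dict entries no longer affects the result.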
