mysterious Python Pandas lambda function error

Question

I have a pandas DataFrame with a column called 'email'. I have verified that its dtype is object. It contains normally formatted emails such as [email protected]

When I do this:

df['emaillower'] = df['email'].apply(lambda x: x.lower())

I get this:

Traceback (most recent call last):

File "<ipython-input-153-e951d53133eb>", line 1, in <module>
df['emaillower'] = df['email'].apply(lambda x: x.upper())

File "C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\series.py", line 2355, in apply
mapped = lib.map_infer(values, f, convert=convert_dtype)

File "pandas\_libs\src\inference.pyx", line 1569, in pandas._libs.lib.map_infer (pandas\_libs\lib.c:66440)

File "<ipython-input-153-e951d53133eb>", line 1, in <lambda>
df['emaillower'] = df['email'].apply(lambda x: x.upper())

AttributeError: 'float' object has no attribute 'upper'

What is going on?

Does your 'email' column contain only strings?

– Espoir Murhabazi, commented Feb 22, 2018 at 18:23

Sevy · Accepted Answer · 2018-02-22 18:27:07Z

3

One of the entries in the 'email' column is a float, not a string, and floats have no upper() method. This commonly happens when an entry is empty: pandas reads it as NaN, which is a float, and that is the source of your error. Something like this may fix the problem:

df['emaillower'] = df['email'].apply(lambda x: x.upper() if type(x) is str else 'empty')

Also note that you call the column emaillower but are actually making it upper case, which might cause some confusion in the future.
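To find the entries that trigger the error, one option is to check the Python type of every value in the column. This is only a sketch: the sample addresses and the names type_counts and non_string_rows are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the middle entry is missing, so it is stored as NaN (a float).
# dtype=object mirrors the dtype reported in the question.
df = pd.DataFrame(
    {"email": pd.Series(["Alice@Example.com", np.nan, "BOB@example.org"], dtype=object)}
)

# Count how many values of each Python type the column holds.
type_counts = df["email"].apply(type).value_counts()
print(type_counts)

# Keep only the rows whose 'email' value is not a string, to inspect them.
is_string = df["email"].apply(lambda x: isinstance(x, str))
non_string_rows = df[~is_string]
print(non_string_rows)
```

An object dtype only says the column holds arbitrary Python objects, so it can contain floats alongside strings without the dtype changing.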

answered Feb 22, 2018 at 18:27


2 Comments

Mark Ginsburg Over a year ago

The situation is funny because the SQL that generated the input file for the dataframe load specified that email > ' ', so all my email values are populated. Nevertheless, Python complained about a float. I just did df['email'] = df['email'].astype(str)

Mark Ginsburg Over a year ago

For clarity, I corrected the original question so the field name emaillower matches the lower() function.

Espoir Murhabazi · Accepted Answer · 2018-02-22 18:36:55Z

2

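I suggest using the .str accessor from pandas:

df['emaillower'] = df['email'].astype(str).str.upper()

I used astype(str) to make sure all values are converted to strings. (The np.str alias that originally appeared here has since been removed from NumPy; the built-in str does the same job.)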
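As a hedged alternative, the .str methods can also be applied without casting first, since they pass missing values through untouched. The sketch below uses made-up sample data, and the variable names are assumptions for illustration. Depending on the pandas version, casting with astype(str) may turn a missing value into the literal text 'nan', so it is worth checking the result if missing entries matter.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing entry (NaN is a float).
emails = pd.Series(["Alice@Example.com", np.nan, "BOB@example.org"], dtype=object)

# The .str accessor lowercases the strings and leaves NaN as NaN.
lowered = emails.str.lower()
print(lowered)

# If a placeholder is preferred over NaN, fill the gaps explicitly first.
lowered_filled = emails.fillna("").str.lower()
print(lowered_filled)
```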

edited Feb 22, 2018 at 18:36

answered Feb 22, 2018 at 18:27


2 Comments

Mark Ginsburg Over a year ago

Thanks, these comments suggested that I put in df['email'] = df['email'].astype(str), and that fixed the issue. When I eyeballed the data, it all looked proper, in the form [email protected], but clearly somewhere Python interpreted something as a float.

Espoir Murhabazi Over a year ago

@MarkGinsburg, please consider upvoting this answer if it gives you any help
