
I have a pandas DataFrame with a column called 'email'. I have verified that its dtype is object. It contains normally formatted emails such as [email protected]

When I do this:

df['emaillower'] = df['email'].apply(lambda x: x.lower())

I get this:

Traceback (most recent call last):

File "<ipython-input-153-e951d53133eb>", line 1, in <module>
df['emaillower'] = df['email'].apply(lambda x: x.upper())

File "C:\ProgramData\Anaconda2\lib\site-packages\pandas\core\series.py", line 2355, in apply
mapped = lib.map_infer(values, f, convert=convert_dtype)

File "pandas\_libs\src\inference.pyx", line 1569, in pandas._libs.lib.map_infer (pandas\_libs\lib.c:66440)

File "<ipython-input-153-e951d53133eb>", line 1, in <lambda>
df['emaillower'] = df['email'].apply(lambda x: x.upper())

AttributeError: 'float' object has no attribute 'upper'      

What is going on?

Does your "email" column contain only strings? Commented Feb 22, 2018 at 18:23

2 Answers


At least one entry in the 'email' column is a float, not a string, and floats have no upper() method. This commonly happens when an entry is missing: pandas stores it as NaN, which is a float, and an object-dtype column can hold strings and floats side by side. That NaN is the likely source of your error. Something like this may fix the problem:

df['emaillower'] = df['email'].apply(lambda x: x.upper() if type(x) is str else 'empty')

Also note that you named the column emaillower but are actually converting it to upper case, which might cause confusion later.
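To confirm which entries are not strings before patching around them, a quick check along these lines may help. This is a minimal sketch: the sample data is made up for illustration, and it assumes the non-string values are missing entries stored as NaN, as described above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: an object-dtype column with one missing value
df = pd.DataFrame({
    "email": pd.Series(["Alice@Example.com", np.nan, "BOB@example.com"], dtype=object)
})

# Count the Python types actually stored in the column, by type name
type_counts = df["email"].map(lambda x: type(x).__name__).value_counts()

# Select the rows whose value is not a string
non_string_rows = df[~df["email"].map(lambda x: isinstance(x, str))]

print(type_counts)
print(non_string_rows)
```

If non_string_rows is empty, the problem lies elsewhere; if it is not, its index shows exactly which rows to inspect.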


2 Comments

The situation is funny because the SQL that generated the input file for the dataframe load specified email > ' ', so all my email values should be populated. Nevertheless, Python complained about a float. I just did df['email'] = df['email'].astype(str)
For clarity, I corrected the original question so that the field name emaillower matches the lower() function.

I suggest using the .str accessor from pandas:

df['emaillower'] = df['email'].astype(str).str.upper()

I used astype(str) to make sure all values are converted to strings before .str.upper() is applied.
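As a possible variation on the above, the .str methods can also be called without casting first, in which case missing values stay missing instead of being turned into text. The sketch below uses made-up sample data and assumes the non-string entries are NaN; it uses lower() to match the column name in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: an object-dtype column with one missing value
s = pd.Series(["Alice@Example.com", np.nan, "BOB@example.com"], dtype=object)

# The .str accessor lower-cases the strings and leaves NaN as NaN
lowered = s.str.lower()

print(lowered)
```

Keeping NaN as NaN means the gaps can still be found later with isna(), which may be preferable if the missing emails need to be tracked down.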

2 Comments

Thanks. These comments suggested that I put in df['email'] = df['email'].astype(str), and that fixed the issue. When I eyeballed the data, it all looked proper, in the form [email protected], but clearly somewhere Python interpreted something as a float.
@MarkGinsburg, please consider upvoting this answer if it gives you any help
