I have to clean the data in a CSV file. A sample of the column I am trying to clean is below. Requirement: append @myclinic.com.au to the end of every string where it is missing.

[email protected]
mildura
[email protected]
[email protected]
nowa [email protected]
[email protected]
[email protected]
[email protected]
logan village
[email protected]

The code for this is

    DataFrame = pandas.read_csv(ClinicCSVFile)
    DataFrame['Email'] = DataFrame['Email'].apply(lambda x: x if '@' in str(x) else str(x)+'@myclinic.com.au')
    DataFrameToCSV = DataFrame.to_csv('Temporary.csv', index = False)   
    print(DataFrameToCSV)

But the output I get is None, and I cannot work on the later part of the problem because it raises the error below:

TypeError: 'NoneType' object is not iterable

The error originates from the DataFrame code above. Please help me with this.

2 Answers

Use str.endswith for the condition, invert the mask with ~, and append the string to the rows that match:

df.loc[~df['Email'].str.endswith('@myclinic.com.au'), 'Email'] += '@myclinic.com.au'
#if need check only @
#df.loc[~df['Email'].str.contains('@'), 'Email'] += '@myclinic.com.au'
print (df)
                           Email
0        [email protected]
1        [email protected]
2      [email protected]
3        [email protected]
4      nowa [email protected]
5   [email protected]
6       [email protected]
7      [email protected]
8  logan [email protected]
9        [email protected]

It works fine for me with the sample data:

df = pd.DataFrame({'Email': ['[email protected]', 'mildura', '[email protected]', '[email protected]', 'nowa [email protected]', '[email protected]', '[email protected]', '[email protected]', 'logan village', '[email protected]']})
df.loc[~df['Email'].str.contains('@'), 'Email'] += '@myclinic.com.au'
print (df)
                           Email
0        [email protected]
1        [email protected]
2      [email protected]
3        [email protected]
4      nowa [email protected]
5   [email protected]
6       [email protected]
7      [email protected]
8  logan [email protected]
9        [email protected]
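
If the real column contains empty cells, the mask approach above may need a small adjustment, because str.contains returns a missing value for those rows and the ~ inversion can then fail. The following is only a sketch under that assumption; the sample rows are made up (the real addresses are obscured above), and the names suffix, has_at and needs_suffix are just illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, including one missing cell.
df = pd.DataFrame({"Email": ["mildura", "reception@myclinic.com.au", np.nan, "logan village"]})

suffix = "@myclinic.com.au"

# na=False treats missing values as "no match" so the mask stays boolean;
# regex=False matches the literal "@" character.
has_at = df["Email"].str.contains("@", regex=False, na=False)

# Only touch rows that hold a real string without an "@".
needs_suffix = df["Email"].notna() & ~has_at
df.loc[needs_suffix, "Email"] = df.loc[needs_suffix, "Email"] + suffix

print(df)
```

Missing cells are left as they are here; whether they should instead be dropped or filled depends on the data.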

8 Comments

I am getting SyntaxError: invalid syntax with this line: df['Email'] = df.loc[~df['Email'].str.contains('@'), 'Email'] += '@myclinic.com.au'
@Damian - That is strange. What about df.loc[~df['Email'].str.contains('@'), 'Email'] = df.loc[~df['Email'].str.contains('@'), 'Email'] + '@myclinic.com.au' ?
@Damian - Or maybe it is a copy error from my post; I think it works if you retype it.
@Damian - If you copy my data from the edited answer, is the problem still the same?
Sir, I do not understand this error. Can you provide me with your email address so I can explain my problem?
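
A likely cause of the SyntaxError in the first comment: in Python, += is a statement and produces no value, so it cannot sit on the right-hand side of another = assignment. The sketch below checks this with compile() on a toy statement and then runs the augmented assignment on its own line; the two sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({"Email": ["mildura", "a@myclinic.com.au"]})
mask = ~df["Email"].str.contains("@", regex=False)

# `x = y += z` is not valid Python, whatever x, y and z are.
try:
    compile("a = b += 1", "<example>", "exec")
    chained_is_valid = True
except SyntaxError:
    chained_is_valid = False
print(chained_is_valid)

# The augmented assignment works as a statement of its own,
# with no `df['Email'] =` in front of it.
df.loc[mask, "Email"] += "@myclinic.com.au"
print(df)
```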

Use apply with endswith.

Example:

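import pandas as pd
# filename is the path to the CSV; names=["Email"] assumes the file has no header row
df = pd.read_csv(filename, names=["Email"])
# Append the domain only where the value does not already end with it
print(df["Email"].apply(lambda x: x if x.endswith("@myclinic.com.au") else x + "@myclinic.com.au"))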

Output:

0          [email protected]
1          [email protected]
2        [email protected]
3          [email protected]
4        nowa [email protected]
5     [email protected]
6         [email protected]
7        [email protected]
8    logan [email protected]
9          [email protected]
Name: Email, dtype: object
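
On the None in the question: DataFrame.to_csv returns None when it is given a path or buffer to write to, and returns the CSV text only when called without a target. So DataFrameToCSV in the question holds None, and iterating over it later would raise the TypeError shown. A small sketch of the difference, using an in-memory buffer in place of 'Temporary.csv' and a made-up row:

```python
import io
import pandas as pd

df = pd.DataFrame({"Email": ["mildura@myclinic.com.au"]})  # hypothetical row

# Writing to a path or buffer: the data goes there and the call returns None.
buffer = io.StringIO()  # stands in for 'Temporary.csv' so no file is left behind
result = df.to_csv(buffer, index=False)
print(result)

# With no target, to_csv returns the CSV text as a string instead.
csv_text = df.to_csv(index=False)
print(csv_text)
```

For the later steps it is probably simplest to keep working with the DataFrame itself after calling to_csv, rather than with the return value.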
