I need help deleting "None", along with the extra comma, from a language column whose entries contain one or more languages.

Here is the existing data, loaded from a CSV:

import pandas as pd

f = pd.DataFrame({'Movie': ['name1','name2','name3','name4'],
                  'Year': ['1905', '1905','1906','1907'],
                  'Id': ['tt0283985', 'tt0283986','tt0284043','tt3402904'],
                  'language':['Mandarin,None','None,Cantonese','Mandarin,None,Cantonese','None,Cantonese']})

Where f now looks like:

   Movie  Year         Id   language
0  name1  1905  tt0283985  Mandarin,None
1  name2  1905  tt0283986  None,Cantonese
2  name3  1906  tt0284043  Mandarin,None,Cantonese
3  name4  1907  tt3402904  None,Cantonese

And the result should be like this:

   Movie  Year         Id            language
0  name1  1905  tt0283985            Mandarin
1  name2  1905  tt0283986           Cantonese
2  name3  1906  tt0284043  Mandarin,Cantonese
3  name4  1907  tt3402904           Cantonese

There are also rows whose language column contains only 'None', so I can't simply use Find and Replace in Excel; doing that also leaves stray commas behind. I probably need a different approach using pandas or something similar. Thanks in advance!

3 Answers

You could do it this way:

f["language"] = f.apply(
    lambda x: ",".join(filter(lambda y: y != "None", x.language.split(","))), axis=1
)

Or, more readably, with a list comprehension:

f["language"] = f.apply(
    lambda x: ",".join([y for y in x.language.split(",") if y != "None"]), axis=1
)
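
The question also mentions rows whose language is only 'None'. With the snippets above those rows end up as an empty string. A minimal sketch that keeps a 'None' placeholder in that case, assuming that is the behaviour you want (the helper name drop_none_tokens and the small sample frame are made up for illustration):

```python
import pandas as pd


def drop_none_tokens(value, sep=","):
    """Remove 'None' tokens from a separator-joined string."""
    kept = [tok.strip() for tok in value.split(sep) if tok.strip() != "None"]
    # Assumption: if no language is left, keep a single 'None' placeholder.
    return sep.join(kept) if kept else "None"


# Sample data: the question's values plus one row that is only 'None'.
f = pd.DataFrame(
    {"language": ["Mandarin,None", "None,Cantonese", "Mandarin,None,Cantonese", "None"]}
)

# Series.map applies the helper to each cell of the one column.
f["language"] = f["language"].map(drop_none_tokens)
print(f["language"].tolist())
# ['Mandarin', 'Cantonese', 'Mandarin,Cantonese', 'None']
```

Mapping over the single column avoids building a row object for every record, which is what f.apply(..., axis=1) does.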

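You could remove the 'None' entries as follows (df here is the question's f). Two replacements are needed, because 'None' can be followed by a comma ('None,Cantonese') or preceded by one ('Mandarin,None'):

df['language'] = df['language'].str.replace('None,', '', regex=False).str.replace(',None', '', regex=False)

Then, wherever the language column ends up empty, you can insert a 'None' value using a regex: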

df['language'] = df['language'].replace(r'^\s*$', 'None', regex=True)
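
If you prefer a single pass, one regular expression can cover 'None' at the start, middle, or end of the list. This is only a sketch under the assumption that the values are comma-separated with no spaces; the sample frame is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    {"language": ["Mandarin,None", "None,Cantonese", "Mandarin,None,Cantonese", "None"]}
)

cleaned = (
    df["language"]
    # Drop a whole 'None' token together with the comma in front of it, if any.
    # The lookahead makes sure the token ends at a comma or at the end of the string.
    .str.replace(r"(?:^|,)None(?=,|$)", "", regex=True)
    # A leading 'None' leaves its trailing comma behind; trim it.
    .str.strip(",")
    # Rows that contained nothing but 'None' are now empty; restore the placeholder.
    .replace("", "None")
)
print(cleaned.tolist())
# ['Mandarin', 'Cantonese', 'Mandarin,Cantonese', 'None']
```

Because the pattern matches whole tokens only, a value that merely starts with "None" (a hypothetical "Nonesuch", say) is left alone.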

1 Comment

That helps. I also added df['language'] = df['language'].str.replace('None,', '') so that it replaces every None. Thanks!

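You can use the string replace method to remove 'None' and the leftover commas. This assumes f has the default 0..n-1 index:

for i in range(len(f)):
    # str.replace returns a new string, so the result has to be assigned back
    lang = f.loc[i, "language"].replace('None', '')
    # collapse the doubled comma left in the middle, then trim commas at the ends
    f.loc[i, "language"] = lang.replace(',,', ',').strip(',')
print(f)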
