I have a DataFrame and need to apply the same lambda function to multiple columns.

sample data:

col1                    col2                  col3
xxx;#2;yyy              zzz;#46;zyzcz         1
aaa;#3;bbbccc           bbbb;cccc;dd#5        2

I need to clean it up, and the result should look like this:

col1                    col2                  col3 
xxx;yyy                 zzz;zyzcz             1
aaa;bbbccc              bbbb;cccc;dd          2

function I used:

import re

def cleanDigit(row):
    # Strip digits and '#', then collapse the doubled ';' left behind.
    replacements = [(r'\d', ''), ('#', ''), (';;', ';')]

    for old, new in replacements:
        row = re.sub(old, new, row)

    return row

code to apply function to multiple columns:

df[['col1', 'col2']] = df[['col1', 'col2']].apply(lambda r: cleanDigit(r))

Error message:

TypeError: ('expected string or buffer', u'occurred at index col1')

1 Answer

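Use DataFrame.applymap, which calls the function on each element. DataFrame.apply passes a whole column (a Series) to the function, and re.sub expects a string, hence the TypeError. The lambda is also unnecessary; pass the function itself: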

df[['col1', 'col2']] = df[['col1', 'col2']].applymap(cleanDigit)
print(df)
         col1          col2  col3
0     xxx;yyy     zzz;zyzcz     1
1  aaa;bbbccc  bbbb;cccc;dd     2
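For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds the sample frame from the question. One assumption that is not part of the original answer: recent pandas versions (2.1 and later) rename applymap to DataFrame.map, so the sketch picks whichever one is available.

```python
import re
import pandas as pd

def cleanDigit(value):
    # Same replacements as in the question, applied to one string at a time.
    for old, new in [(r'\d', ''), ('#', ''), (';;', ';')]:
        value = re.sub(old, new, value)
    return value

# Sample data copied from the question.
df = pd.DataFrame({
    'col1': ['xxx;#2;yyy', 'aaa;#3;bbbccc'],
    'col2': ['zzz;#46;zyzcz', 'bbbb;cccc;dd#5'],
    'col3': [1, 2],
})

cols = ['col1', 'col2']
subset = df[cols]
# Assumption: prefer DataFrame.map (pandas 2.1+), fall back to applymap on older versions.
elementwise = subset.map if hasattr(subset, 'map') else subset.applymap
df[cols] = elementwise(cleanDigit)
print(df)
```

col3 is left untouched because it is not in the selected column list.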

3 Comments

@iamklaus - yes, if the question is about other solutions, then it seems good.
@jezrael applymap is not working in my case. I found the reason: col2 contains NaN, and neither apply nor applymap works when a value is NaN. I tried the following, but it does not work either: df[['col1', 'col2']] = df[['col1', 'col2']].apply(lambda r: cleanDigit(r) if pd.notnull(r) else r)
@Ling - By definition np.nan != np.nan, so x == x is False only for NaN. Use that to skip missing values: df[['col1', 'col2']] = df[['col1', 'col2']].applymap(lambda x: cleanDigit(x) if x == x else np.nan)
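A sketch of the NaN case from these comments, for reference. The attempt with apply fails because apply hands the lambda a whole Series, so `if pd.notnull(r)` has no single truth value. The variant below is not the commenter's exact code: it assumes a hypothetical helper, clean_or_keep, that cleans only strings and returns anything else (NaN included) unchanged, and it uses Series.map inside apply so it does not depend on applymap being available.

```python
import re
import numpy as np
import pandas as pd

def cleanDigit(value):
    # Same replacements as in the question.
    for old, new in [(r'\d', ''), ('#', ''), (';;', ';')]:
        value = re.sub(old, new, value)
    return value

def clean_or_keep(value):
    # Hypothetical helper: clean strings, pass NaN and other non-strings through.
    return cleanDigit(value) if isinstance(value, str) else value

# Sample data from the question, with one col2 value replaced by NaN (assumed).
df = pd.DataFrame({
    'col1': ['xxx;#2;yyy', 'aaa;#3;bbbccc'],
    'col2': ['zzz;#46;zyzcz', np.nan],
    'col3': [1, 2],
})

cols = ['col1', 'col2']
# apply passes each column as a Series; Series.map then works element by element.
df[cols] = df[cols].apply(lambda s: s.map(clean_or_keep))
print(df)
```

Checking isinstance(value, str) rather than x == x also guards against None and stray numeric values, at the cost of silently skipping them.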
