I have a large DataFrame of shape (104959, 298), i.e. 104,959 rows and 298 columns.

One of the string columns contains multiple substrings that I need to replace.

I've tried

df.EVENT_DTL.replace(['SPOUSE_2','SPOUSE_nan','PARENT_2','PARENT_nan','GRANDPARENT_2','GRANDPARENT_nan','CHILD_2',
        'CHILD_nan','RELATIVE_2','RELATIVE_nan','LOVER_2','LOVER_nan','FRIEND_2','FRIEND_nan',
        '세부 대인관계문제 기타 상세_nan','세부 대인관계문제 기타 상세_','대인관계문제_1',
        '애인 관련_2','애인 관련_nan','직장 내_2','직장 내_nan','소외 문제_2','소외 문제_nan',
        '수면제_2','수면제_nan','진통제_2','진통제_nan','병원에서 처방 받은 약물_2',
        '병원에서 처방 받은 약물_nan','기타약물_nan','농약_2','농약_nan','살충제_2','살충제_nan',
        '제초제_2','제초제_nan','쥐약_2','쥐약_nan','화학약품_nan','목매달기_2','목매달기_nan',
        '가스 질식_2','가스 질식_nan','물에 뛰어들기_2','물에 뛰어들기_nan','뛰어내림_2',
        '뛰어내림_nan','칼, 송곳으로 찌르기_2','칼, 송곳으로 찌르기_nan','세부 동거자 기타 상세_nan'],"")

(I'm trying to delete all of the substrings above.)

But it causes a memory error.

I've found a method to replace multiple substrings in a single string, but I haven't found a way to replace substrings across a DataFrame column.

2 Answers

Found the answer in the question "Replace multiple substrings in a Pandas series with a value".

The trick is to avoid building a dictionary and to use a single regex instead.
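For anyone landing here, a minimal sketch of the regex approach, not necessarily the exact code from the linked question. The sample data and the shortened `to_remove` list are made up for illustration; only the `EVENT_DTL` column name comes from the question. Each substring is escaped with `re.escape`, and longer substrings go first so that an entry like '세부 대인관계문제 기타 상세_' cannot match ahead of its longer '..._nan' variant.

```python
import re
import pandas as pd

# Hypothetical sample data standing in for the real df.EVENT_DTL column
df = pd.DataFrame(
    {"EVENT_DTL": ["SPOUSE_2 fight", "CHILD_nan", "ok LOVER_nan FRIEND_2", None]}
)

# Shortened, illustrative list of substrings to delete
to_remove = ["SPOUSE_2", "SPOUSE_nan", "CHILD_nan", "LOVER_nan", "FRIEND_2"]

# Escape every substring so regex metacharacters are treated literally,
# and sort longest-first so a short alternative never wins over a longer one.
pattern = "|".join(re.escape(s) for s in sorted(to_remove, key=len, reverse=True))

# One vectorised pass over the column; missing values stay missing.
df["EVENT_DTL"] = df["EVENT_DTL"].str.replace(pattern, "", regex=True)

print(df["EVENT_DTL"].tolist())
```

This makes a single pass over the column instead of one pass per substring. Note that surrounding whitespace is left in place, so you may want a `.str.strip()` afterwards.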

You could iterate through the list of strings you want to replace as shown. Other ideas here

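to_replace = ['SPOUSE_2','SPOUSE_nan'...]  # for example (list truncated)
for str_rep in to_replace:
    # Series.replace matches whole cell values and returns a new Series;
    # str.replace removes the substring, and the result must be assigned back
    df['EVENT_DTL'] = df['EVENT_DTL'].str.replace(str_rep, '', regex=False)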
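One thing to watch for when looping like this: without `regex=True`, `Series.replace` only changes cells whose entire value equals the target, and it returns a new Series instead of modifying the column in place. A small sketch on a made-up two-element Series shows the difference from `Series.str.replace`:

```python
import pandas as pd

# Made-up data: one cell equals the target exactly, one merely contains it
s = pd.Series(["SPOUSE_2", "note SPOUSE_2 here"])

# Whole-value replacement: only the exact-match cell is changed
whole = s.replace("SPOUSE_2", "")

# Substring replacement: every occurrence inside each cell is removed
sub = s.str.replace("SPOUSE_2", "", regex=False)

print(whole.tolist())
print(sub.tolist())
print(s.tolist())  # the original Series is untouched by both calls
```

So if the goal is to delete substrings inside longer text, `str.replace` (or `replace(..., regex=True)`) with the result assigned back to the column is probably what you want.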
