I have a pandas DataFrame with a column called "accredited". This column should contain either the accreditation date, e.g. "10/10/2011", or the text "Not accredited". But in most cases where the business isn't accredited, the column contains longer text, like "This business is not accredited.....". I want to replace the whole text with just "Not accredited".

Now, I wrote a function:

def notAcredited(string):
    if ('Not' in string or 'not' in string):
        return  'Not Accredited'

I'm currently applying the function with a loop. Is it possible to do this with the ".apply" method instead?

for i in range(len(df_1000_1500)):
    accreditacion = notAcredited(df_1000_1500['BBBAccreditation'][i])
    if accreditacion == 'Not Accredited':
        df_1000_1500['BBBAccreditation'][i] = accreditacion

1 Answer

You could use the vectorized string method Series.str.replace:

In [72]: df = pd.DataFrame({'accredited': ['10/10/2011', 'is not accredited']})

In [73]: df
Out[73]: 
          accredited
0         10/10/2011
1  is not accredited

In [74]: df['accredited'] = df['accredited'].str.replace(r'(?i).*not.*', 'not accredited', regex=True)

In [75]: df
Out[75]: 
       accredited
0      10/10/2011
1  not accredited

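The first argument passed to replace, e.g. r'(?i).*not.*', can be any regex pattern. The second can be any regex replacement value, the same kind of string that re.sub would accept. The (?i) in the regex pattern makes the pattern case-insensitive, so not, Not, NOt, NoT, etc. would all match. Note that since pandas 2.0, Series.str.replace treats the pattern as a literal string by default, so regex=True must be passed for the pattern to be interpreted as a regex.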
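To illustrate the case-insensitive match, here is a minimal sketch. The sample rows are made up for the illustration, and the replacement text is capitalized as the question requested. It assumes a pandas version where the regex keyword is available.

```python
import pandas as pd

# Hypothetical sample data; the column name follows the example above.
df = pd.DataFrame({'accredited': ['10/10/2011',
                                  'This business is NOT accredited.',
                                  'Not accredited',
                                  'not yet accredited']})

# (?i) makes the match case-insensitive; .*not.* consumes the whole string,
# so the entire text is replaced rather than just the word "not".
cleaned = df['accredited'].str.replace(r'(?i).*not.*', 'Not accredited', regex=True)
print(cleaned.tolist())
```

Because the pattern looks for the bare letters "not", a value such as "notary" would also be replaced. If that matters for your data, a stricter pattern such as r'(?i).*\bnot\b.*' may be a better fit.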

Series.str.replace Cythonizes the calls to re.sub, which makes it faster than what you could achieve using apply, since apply uses a Python loop.
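For comparison, the apply-based version the question asks about could look roughly like the sketch below. The function name not_accredited and the sample rows are made up for the illustration. Unlike the function in the question, it returns the value unchanged when there is no match, since apply would otherwise fill those rows with None.

```python
import pandas as pd

def not_accredited(value):
    # Normalize any text mentioning "not" (in any letter case);
    # leave every other value, such as a date string, untouched.
    if isinstance(value, str) and 'not' in value.lower():
        return 'Not accredited'
    return value

# Hypothetical sample data using the column name from the question.
df = pd.DataFrame({'BBBAccreditation': ['10/10/2011',
                                        'This business is not accredited.']})

# apply calls not_accredited once per element.
via_apply = df['BBBAccreditation'].apply(not_accredited)

# The vectorized string method gives the same result on this data.
via_replace = df['BBBAccreditation'].str.replace(
    r'(?i).*not.*', 'Not accredited', regex=True)

print(via_apply.tolist())
print(via_replace.tolist())
```

Both approaches agree on this sample, so the choice between them comes down mostly to speed and readability.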
