>>> df
                       Time
    5/10/2017 (135) 01:05:03
    5/11/2017 (136) 04:05:06

Given an input date such as this in a DataFrame, how would I delete the Julian date values, (135) and (136), along with the extra whitespace in the middle, so that the output looks like:

>>> df
                       Time
    5/10/2017 01:05:03
    5/11/2017 04:05:06

I've tried:

df['Time'].replace('(135)','', regex=True, inplace=True)

which outputs:

>>> df
                    Time
0  5/10/2017 () 01:05:03

I was wondering what I'm doing wrong here.

1 Answer

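You can use str.replace with a regular expression:

First, escape ( and ) with \ because parentheses are special characters in regex. Then match the digits with \d+, and finally match zero or more whitespace characters after ) with \s*. Pass regex=True explicitly, because recent pandas versions treat the pattern as a literal string by default, and use a raw string so the backslashes reach the regex engine unchanged.


df['Time'] = df['Time'].str.replace(r"\(\d+\)\s*", '', regex=True)
print(df)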
                 Time
0  5/10/2017 01:05:03
1  5/11/2017 04:05:06
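
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the two sample rows in the question, and the variable name pattern is only illustrative.

```python
import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame(
    {"Time": ["5/10/2017 (135) 01:05:03", "5/11/2017 (136) 04:05:06"]}
)

# Literal "(", one or more digits, literal ")", then any trailing whitespace.
pattern = r"\(\d+\)\s*"

# regex=True is passed explicitly so the pattern is not treated as a literal string.
df["Time"] = df["Time"].str.replace(pattern, "", regex=True)
print(df)
```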

If you also need to convert the column to datetime:

df['Time'] = pd.to_datetime(df['Time'].str.replace(r"\(\d+\)\s*", '', regex=True))
print(df)
                 Time
0 2017-05-10 01:05:03
1 2017-05-11 04:05:06
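
If the cleaned strings always follow the month/day/year layout shown in the question, it may be safer to give to_datetime an explicit format rather than rely on inference. The sketch below assumes that layout; the format string is my assumption and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    {"Time": ["5/10/2017 (135) 01:05:03", "5/11/2017 (136) 04:05:06"]}
)

# Strip the parenthesised number and the whitespace that follows it.
cleaned = df["Time"].str.replace(r"\(\d+\)\s*", "", regex=True)

# Assumed layout: month/day/year hour:minute:second.
df["Time"] = pd.to_datetime(cleaned, format="%m/%d/%Y %H:%M:%S")
print(df.dtypes)
```

With an explicit format, a row that does not match the expected layout raises an error instead of being parsed in an unexpected way.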

EDIT:

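Your attempt is missing the escaping backslashes. Without them, (135) is a regex capture group that matches only the digits 135, which is why the empty parentheses were left behind. You can also use [0-9]+ instead of \d+. Assigning the result back is more reliable than inplace=True on a selected column:

df['Time'] = df['Time'].replace(r'\([0-9]+\)\s*', '', regex=True)
print(df)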
                 Time
0  5/10/2017 01:05:03
1  5/11/2017 04:05:06
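
To see the difference between the two patterns outside pandas, here is a small sketch using Python's re module directly on one of the sample strings. The variable names are illustrative only.

```python
import re

s = "5/10/2017 (135) 01:05:03"

# Unescaped: the parentheses form a group, so only the digits "135" match.
unescaped = re.sub("(135)", "", s)

# Escaped: the literal parentheses, the digits, and trailing whitespace all match.
escaped = re.sub(r"\([0-9]+\)\s*", "", s)

print(repr(unescaped))  # '5/10/2017 () 01:05:03'
print(repr(escaped))    # '5/10/2017 01:05:03'
```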

2 Comments

Awesome, that works! Could you describe the backslash and d+ values in str.replace() and how they're used, or if there's documentation on how you came up with that input? Thanks!
Unfortunately, regex is a very broad topic. The pandas documentation has only a few examples here - they are about extracting, but str.replace works in a similar way.
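
As a rough answer to the question about the backslash and \d+, the pattern can be written piece by piece with re.VERBOSE, which lets each part carry a comment. This is only an annotated sketch of the same pattern used in the answer; the names pattern and result are illustrative.

```python
import re

pattern = re.compile(
    r"""
    \(      # a literal "(": the backslash removes its special grouping meaning
    \d+     # one or more digits (equivalent to [0-9]+ for ASCII digits)
    \)      # a literal ")"
    \s*     # zero or more whitespace characters
    """,
    re.VERBOSE,
)

result = pattern.sub("", "5/11/2017 (136) 04:05:06")
print(result)
```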
