extract substring and convert to datetime python

Question

I have a dataframe, df:

import pandas as pd

data = [{0: 18, 1: '(Responses) 17th Nov 20'},
        {0: 304, 1: '(Responses) 17th Nov 20'},
        {0: 1177, 1: '(Responses) 17th Nov 20'},
        {0: 899, 1: '(Responses) 17th Nov 20'}]

df = pd.DataFrame(data)

0                1                                          
18    (Responses) 17th Nov 20
304   (Responses) 17th Nov 20
1177  (Responses) 17th Nov 20
899   (Responses) 17th Nov 20

Is there an efficient way to extract 17th Nov 2020 from column 1 and store it in a new column, 2, as the date 17-11-2020?

For other dates the day can also be written as 1st, 2nd or 3rd.

Expected output:

0                1                 2                         
18    (Responses) 17th Nov 20   17-11-2020
304   (Responses) 17th Nov 20   17-11-2020
1177  (Responses) 17th Nov 20   17-11-2020
899   (Responses) 17th Nov 20   17-11-2020

If we can see how the column was created, it might be possible to provide a more optimal solution that avoids this awkward uncleaned data.
– cs95, Commented Dec 28, 2020 at 3:54

U13-Forward · Accepted Answer · 2020-12-28 04:20:48Z

1

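Try using str.replace and pd.to_datetime:

# The pattern is a regular expression, so pass regex=True explicitly (newer pandas versions default to a literal match).
df[2] = pd.to_datetime(df[1].str.replace(r'\(Responses\) ', '', regex=True))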
print(df)

Output:

      0                        1          2
0    18  (Responses) 17th Nov 20 2020-11-17
1   304  (Responses) 17th Nov 20 2020-11-17
2  1177  (Responses) 17th Nov 20 2020-11-17
3   899  (Responses) 17th Nov 20 2020-11-17
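
The output above is a datetime column, while the question asks for 17-11-2020, and the comments mention rows with extra text before "(Responses)". A minimal sketch covering both, assuming every value ends in a day with an ordinal suffix, a three-letter English month and a two-digit year (the sample rows other than the first are made up for illustration):

```python
import pandas as pd

# Hypothetical sample rows: one with a prefix, plus 1st/2nd/3rd variants.
data = [{0: 18, 1: '(Responses) 17th Nov 20'},
        {0: 304, 1: 'Copy of BeAChampion (Responses) 1st Dec 20'},
        {0: 1177, 1: '(Responses) 2nd Jan 21'},
        {0: 899, 1: '(Responses) 3rd Feb 21'}]
df = pd.DataFrame(data)

# Capture day, month abbreviation and two-digit year at the end of the string;
# the ordinal suffix sits in a non-capturing group, so it is dropped.
parts = df[1].str.extract(r'(\d{1,2})(?:st|nd|rd|th)\s+([A-Za-z]{3})\s+(\d{2})\s*$')

# Rebuild e.g. "17 Nov 20" and parse it with an explicit format.
# Note: %b depends on the locale; English month names are assumed here.
dates = pd.to_datetime(parts[0] + ' ' + parts[1] + ' ' + parts[2], format='%d %b %y')

# Render as day-month-year strings, as requested in the question.
df[2] = dates.dt.strftime('%d-%m-%Y')
print(df)
```

Keeping `dates` as a datetime column is usually more useful for sorting and filtering; the `strftime` step only matters if the dd-mm-yyyy text form is really needed.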

edited Dec 28, 2020 at 4:20

answered Dec 28, 2020 at 4:05


6 Comments

user6308605 Over a year ago

I'm getting NaN values instead.

U13-Forward Over a year ago

@user6308605 I edited my answer, please check again

user6308605 Over a year ago

I realised why I get NaN. The actual string is Copy of BeAChampion (Responses), so it should be str[4], correct? But I am still getting NaN.

U13-Forward Over a year ago

@user6308605 I have edited my answer again, check it out

U13-Forward Over a year ago

@user6308605 We shouldn't use split, see my example of replace


ljuk · Accepted Answer · 2020-12-28 03:53:34Z

0

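Just split your string on the literal text "(Responses) " and take the second element of the result:

  # regex=False (pandas 1.4+) treats the parentheses literally; the column label is the integer 1
  df['new_column'] = df[1].str.split("(Responses) ", regex=False).str[1]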
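
The split only yields text such as "17th Nov 20", so a conversion step is still needed to get a date. A rough sketch of the whole chain, assuming the marker "(Responses) " appears in every row and the date always follows it (the second sample row is hypothetical, modelled on the prefix mentioned in the comments):

```python
import pandas as pd

df = pd.DataFrame({0: [18, 304],
                   1: ['(Responses) 17th Nov 20',
                       'Copy of BeAChampion (Responses) 3rd Nov 20']})

# Split on the literal marker (regex=False needs pandas 1.4+) and keep the
# last piece, so any text before the marker is ignored.
tail = df[1].str.split('(Responses) ', regex=False).str[-1]

# Drop the ordinal suffix that follows the day number: "17th" -> "17".
tail = tail.str.replace(r'(?<=\d)(?:st|nd|rd|th)\b', '', regex=True)

# Parse with an explicit format; %b assumes English month abbreviations.
df['new_column'] = pd.to_datetime(tail, format='%d %b %y')
print(df)
```

Taking `.str[-1]` rather than `.str[1]` is a small design choice here: both give the same piece when the marker occurs once, but the last element stays correct if the prefix itself happens to contain the marker.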

answered Dec 28, 2020 at 3:53

