I have a data frame with 4 columns, col4 is a string including texts and digits:
Col1 Col2 Col3 Col4
Syslog 2016,09,17 1 PD380_003 %LINK-3-UPDOWN
Syslog 2016,09,17 1 NM380_005 %BGP-5-NBR_RESET
Syslog 2016,09,14 1 NM380_005 %BGP-5-NBR_RESET
Syslog 2016,09,08 1 DO NOT TICKET LO380_004 %SYS-5-CONFIG_I Config
i need to keep a substring of that column and delete anything else, so I used regex and i made a pattern but when i run the following query the result is not what i want, it replace everything with the pattern itself:
data.replace({'Col4':{'.*':'([A-Z]{2}[0-9]{3}_[0-9]{3})'}},regex=True)
desired result is:
Col1 Col2 Col3 Col4
Syslog 2016,09,17 1 PD380_003
Syslog 2016,09,17 1 NM380_005
Syslog 2016,09,14 1 LO380_004
Syslog 2016,09,08 1 LO380_004
but the result i get is like:
Col1 Col2 Col3 Col4
Syslog 2016,09,17 1 ([A-Z]{2}[0-9]{3}_[0-9]{3})
Syslog 2016,09,17 1 ([A-Z]{2}[0-9]{3}_[0-9]{3})
Syslog 2016,09,14 1 ([A-Z]{2}[0-9]{3}_[0-9]{3})
Syslog 2016,09,08 1 ([A-Z]{2}[0-9]{3}_[0-9]{3})
what am i doing wrong?
dataDF before replacing?