Hi All,
I have a question about regular expressions in SQL.
I need to extract a portion of a string from a column (Col B). The portion to extract always starts with the value in Col A. Please see the screenshot for the sample data. I have also added the expected output in a separate column (highlighted in green).
Scenarios:
- If a column value contains more than one unique number, the result should be NULL. E.g.: To verify CAN06010025, CAN06010026 & CAN06010030 after the approval.
This string contains more than one number (CAN06010025, CAN06010026 and CAN06010030), so this case should be ignored, meaning the result has to be NULL.
- If there is only one unique number, even if it is repeated, that number should be extracted. E.g.: Project USA12: Id USA12S001: Contact required -USA12S001- form to be updated
In this example, the portion I want to extract (USA12S001) appears twice, and I want to extract only that portion.
The same rules apply to the other rows as well.
I tried the SQL below. The challenge is that the Col A value can also appear on its own in Col B (line 2 in the screenshot). REGEXP_COUNT counts that bare occurrence as well, so the query returns NULL for that row. My expectation is to extract the USA12S001 portion from the column.
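To illustrate the over-counting on the second sample row, here is a small check. It assumes Oracle (the `dual` table is Oracle-specific), and the stricter pattern in the second column is only my assumption about what an ID looks like: the Col A value followed directly by at least one alphanumeric character.

```sql
-- Count matches in the second sample row, with the Col A value 'USA12' written inline.
SELECT
    -- Plain Col A value: also matches the bare 'USA12' in 'Project USA12:'
    REGEXP_COUNT('Project USA12: Id USA12S001: Contact required -USA12S001- form to be updated',
                 'USA12') AS plain_count,                 -- 3
    -- Assumed ID shape: Col A value followed by one or more alphanumerics
    REGEXP_COUNT('Project USA12: Id USA12S001: Contact required -USA12S001- form to be updated',
                 'USA12' || '[[:alnum:]]+') AS id_count   -- 2
FROM dual;
```

Because the plain count is 3, the `> 2` check in my query treats this row as having multiple numbers and returns NULL.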
Could you please help me achieve this so that both of the above conditions are satisfied?
SQL:
SELECT
    ColA,
    ColB,
    CASE
        WHEN REGEXP_COUNT(ColB, ColA) > 2 THEN NULL
        ELSE REPLACE(REPLACE(
                 CONCAT(
                     REGEXP_SUBSTR(ColB, ColA || '([[:alnum:]]+\.?)'),
                     NVL(REGEXP_SUBSTR(ColB, ColA || '(\-[[:digit:]]+)'),
                         REGEXP_SUBSTR(ColB, ColA || '([[:space:]]\-[[:space:]][[:digit:]]+)'))
                 ),
                 ' ', ''), '.', '')
    END AS Result
FROM
    table
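One direction I have been considering is sketched below. It is untested and rests on several assumptions: Oracle syntax, an ID being the Col A value followed directly by one or more alphanumeric characters (so it ignores the hyphenated forms my query above tries to handle), and Col A values containing no regex metacharacters. The names `sample_data`, `ids`, `first_id` and `id_count` are made up for illustration. The idea is to take the first ID, count all IDs, and keep the first ID only if every ID found is that same value.

```sql
-- Inline copies of three sample rows so the sketch can run on its own.
WITH sample_data AS (
    SELECT 'CAN06' AS ColA,
           'to verify CAN06010025, CAN06010026 & CAN06010030 after the approval' AS ColB
    FROM dual
    UNION ALL
    SELECT 'USA12',
           'Project USA12: Id USA12S001: Contact required -USA12S001- form to be updated'
    FROM dual
    UNION ALL
    SELECT 'CAN05',
           'ID: CAN05S005 with the details as CAN05S005 are completed.'
    FROM dual
),
ids AS (
    SELECT ColA,
           ColB,
           -- First ID: Col A value plus at least one alphanumeric (assumed ID shape)
           REGEXP_SUBSTR(ColB, ColA || '[[:alnum:]]+') AS first_id,
           -- Total number of IDs of that shape, repeats included
           REGEXP_COUNT(ColB, ColA || '[[:alnum:]]+')  AS id_count
    FROM sample_data
)
SELECT ColA,
       ColB,
       CASE
           -- Keep first_id only when every ID found is exactly first_id, i.e. the
           -- number of whole-token occurrences of first_id equals the total ID count.
           WHEN id_count = REGEXP_COUNT(ColB, first_id || '([^[:alnum:]]|$)')
           THEN first_id
       END AS Result
FROM ids;
```

On these three rows I would expect NULL, USA12S001 and CAN05S005 respectively, but I have not verified this against the full data.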
Test Data:
Col A | Col B
CAN06 | to verify CAN06010025, CAN06010026 & CAN06010030 after the approval
USA12 | Project USA12: Id USA12S001: Contact required -USA12S001- form to be updated
USA27 | Project USA27: Id: USA27S001: Prod
HUN04 | To review id HUN04S002-HUN04S004 after the due date.
CAN05 | ID: CAN05S005 with the details as CAN05S005 are completed.
USA24 | Project USA24: Id: USA24S009: Data Issue
CAN06 | "Project: Subject CAN06S009: V2 & V3- Id CAN06S010: V1"
