
I want to extract some strings from the VAL column, according to the regex given further below. This is an example of my source data:

Table1
-----------------
ID         VAL       
-----------------
1          GR-RDE
2          GR-RZA-RDE
3          GR-RZA-RDE_RZA
4          GR-RGS
5          GR-RZA-OR-ORC
6          GR-RZA-RDE-OR-ORC_RZA  

Desired result:

Output
-----------------
ID         RESULT       
-----------------
1          RDE
2          RZA
2          RDE
3          RZA
3          RDE
4          RGS
5          RZA
5          OR
6          RZA
6          RDE
6          OR  

To do that, I wrote this regex: (?<=-)(RDE|RZA|RGS|OR)(?![A-Z])

  • (?<=-) : checks that the preceding character is '-'
  • (RDE|RZA|RGS|OR) : matches one of the strings 'RDE', 'RZA', 'RGS', 'OR'
  • (?![A-Z]) : rejects the match if it is followed by an uppercase letter

The regex works perfectly and ignores all the unwanted parts.

My problem is that I can't find a way to use this regex in a SQL statement (Oracle database). I tried a test like the following, which returns NULL:

select REGEXP_SUBSTR(VAL,'(?<=-)(RDE|RZA|RGS|OR)(?![A-Z])') from Table1;
  • I'm afraid that Oracle doesn't support lookaround :( Commented Oct 11, 2017 at 9:55
  • Ok so maybe I need to find another way to do this, without using regex. Commented Oct 11, 2017 at 9:59

1 Answer

SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE Table1 ( ID, VAL ) AS
SELECT 1, 'GR-RDE' FROM DUAL UNION ALL
SELECT 2, 'GR-RZA-RDE' FROM DUAL UNION ALL
SELECT 3, 'GR-RZA-RDE_RZA' FROM DUAL UNION ALL
SELECT 4, 'GR-RGS' FROM DUAL UNION ALL
SELECT 5, 'GR-RZA-OR-ORC' FROM DUAL UNION ALL
SELECT 6, 'GR-RZA-RDE-OR-ORC_RZA' FROM DUAL

Query 1:

WITH words ( id, val, lvl, str, maxlvl ) AS (
  SELECT id,
         val,
         1,
         REGEXP_SUBSTR( val, '[A-Z]+', 1, 1 ),
         REGEXP_COUNT( val, '[A-Z]+' )
  FROM   table1
UNION ALL
  SELECT id,
         val,
         lvl + 1,
         REGEXP_SUBSTR( val, '[A-Z]+', 1, lvl + 1 ),
         maxlvl
  FROM   words
  WHERE  lvl < maxlvl
)
SELECT id, str, lvl
FROM   words
ORDER BY id, lvl

Results:

| ID | STR | LVL |
|----|-----|-----|
|  1 |  GR |   1 |
|  1 | RDE |   2 |
|  2 |  GR |   1 |
|  2 | RZA |   2 |
|  2 | RDE |   3 |
|  3 |  GR |   1 |
|  3 | RZA |   2 |
|  3 | RDE |   3 |
|  3 | RZA |   4 |
|  4 |  GR |   1 |
|  4 | RGS |   2 |
|  5 |  GR |   1 |
|  5 | RZA |   2 |
|  5 |  OR |   3 |
|  5 | ORC |   4 |
|  6 |  GR |   1 |
|  6 | RZA |   2 |
|  6 | RDE |   3 |
|  6 |  OR |   4 |
|  6 | ORC |   5 |
|  6 | RZA |   6 |

Query 2:

SELECT t.id, w.COLUMN_VALUE AS str
FROM   Table1 t
       CROSS JOIN
       TABLE(
         CAST(
           MULTISET(
             SELECT REGEXP_SUBSTR( t.val, '[A-Z]+', 1, LEVEL )
             FROM   DUAL
             CONNECT BY LEVEL <= REGEXP_COUNT( t.val, '[A-Z]+' )
           ) AS SYS.ODCIVARCHAR2LIST
         )
       ) w

Results:

| ID | STR |
|----|-----|
|  1 |  GR |
|  1 | RDE |
|  2 |  GR |
|  2 | RZA |
|  2 | RDE |
|  3 |  GR |
|  3 | RZA |
|  3 | RDE |
|  3 | RZA |
|  4 |  GR |
|  4 | RGS |
|  5 |  GR |
|  5 | RZA |
|  5 |  OR |
|  5 | ORC |
|  6 |  GR |
|  6 | RZA |
|  6 | RDE |
|  6 |  OR |
|  6 | ORC |
|  6 | RZA |
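
Both queries above return every uppercase token, including GR and ORC, which the desired output excludes. One possible way to emulate the lookbehind and lookahead without lookaround support is sketched below as a variation on Query 2. It is not part of the original answer and has not been run against a live database. It assumes uppercase-only data as in the sample; the column alias result and the splitting pattern '-[^-]*' are choices made for this sketch.

```sql
-- Sketch: emulate (?<=-)(RDE|RZA|RGS|OR)(?![A-Z]) without lookaround.
-- Step 1: split VAL into pieces that each start with '-' (e.g. '-RZA', '-RDE_RZA'),
--         so the leading dash plays the role of the lookbehind.
-- Step 2: anchor the keyword right after that dash and require a non-letter
--         or end of string after it, which plays the role of the lookahead.
SELECT t.id,
       REGEXP_SUBSTR(
         w.COLUMN_VALUE,
         '^-(RDE|RZA|RGS|OR)([^A-Z]|$)',
         1, 1, NULL,
         1                       -- return only the first capture group
       ) AS result
FROM   Table1 t
       CROSS JOIN
       TABLE(
         CAST(
           MULTISET(
             SELECT REGEXP_SUBSTR( t.val, '-[^-]*', 1, LEVEL )
             FROM   DUAL
             CONNECT BY LEVEL <= REGEXP_COUNT( t.val, '-[^-]*' )
           ) AS SYS.ODCIVARCHAR2LIST
         )
       ) w
-- Drop pieces that do not start with one of the wanted keywords
-- (this also drops the NULL piece produced when VAL contains no '-').
WHERE  REGEXP_LIKE( w.COLUMN_VALUE, '^-(RDE|RZA|RGS|OR)([^A-Z]|$)' )
ORDER BY t.id
```

Splitting on the dash first keeps the delimiter from being consumed between two adjacent matches, which is what would happen if '-(RDE|RZA|RGS|OR)([^A-Z]|$)' were applied directly to VAL with an increasing occurrence number. The ORDER BY sorts by ID only, so the order of rows within one ID is not guaranteed.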