1

I have the string '1_A_B_C_D_E_1_2_3_4_5' and I am trying to extract the data 'A_B_C_D_E'. I am trying to remove the _1_2_3_4_5 & the 1_ portion from the string. Which is essentially the numeric portion in the string. any special characters after the last alphabet must also be removed. In this example the _ after the character E must also not be present.

and the Query I am trying is as below

SELECT 
REGEXP_SUBSTR('1_A_B_C_D_E_1_2_3_4_5','[^0-9]+',1,1) 
from dual

The Data I get from the above query is as below: -

_A_B_C_D_E_

I am trying to figure a way to remove the underscore towards the end. Any other way to approach this?

3
  • Please state more specifically what you must keep and what you must remove from the string. Is the part you must keep always at the beginning, or can it also be in the middle? Is the pattern letter-underscore-letter-underscore (etc) and then same with digits, and you must just keep the letters part? Or what? Don't ask us to guess! Commented Feb 28, 2018 at 3:41
  • So, are the "letters" always first and the "digits" always last? Commented Feb 28, 2018 at 3:51
  • You must clarify your question with some sample strings with varying combinations of string available in you data and the expected output for each. It is hard to assume things while answering. Commented Feb 28, 2018 at 4:28

2 Answers 2

2

Assuming the "letters" come first and then the "digits", you could do something like this:

select regexp_substr('A_B_C_D_E_1_2_3_4_5','.*[A-Z]') from dual;

This will pull all the characters from the beginning of the string, up to the last upper-case letter in the string (.* is greedy, it will extend as far as possible while still allowing for one more upper-case letter to complete the match).

Sign up to request clarification or add additional context in comments.

2 Comments

select regexp_substr('1_A_B_C_D_E_1_2_3_4_5','.*[A-Z]') from dual; returns 1_A_B_C_D_E which is not good. I voted up your answer as it works perfectly on the string used in the question.
@cdaiga - see my comments under the original question. I asked the OP for clarification, he amended the question but it is still not clear. In my answer above, read the first sentence - I stated my assumption explicitly.
1

I have the string '1_A_B_C_D_E_1_2_3_4_5' and I am trying to extract the data 'A_B_C_D_E'

Use REGEXP_REPLACE:

SQL> SELECT trim(BOTH '_' FROM
  2         (REGEXP_SUBSTR('1_A_B_C_D_E_1_2_3_4_5','[0-9]+', ''))) str
  3  FROM dual;

STR
---------
A_B_C_D_E

How it works:

  1. REGEXP_REPLACE will replace all numeric occurrences '[0-9]+' from the string. Alternatively, you could also use POSIX character class '[^[:digit:]]+'
  2. TRIM BOTH '_' will remove any leading and lagging _ from the string.

Also using REGEXP_SUBSTR:

SELECT trim(BOTH '_' FROM 
       (REGEXP_SUBSTR('1_A_B_C_D_E_1_2_3_4_5','[^0-9]+'))) str 
FROM dual;
STR
---------
A_B_C_D_E

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.