We need to remove certain characters from VARCHAR2 and CLOB values using SQL or PL/SQL and load the results into destination tables.

Oracle provides functions (e.g. REGEXP_REPLACE or SUBSTR) that could be used for this.

However, we have a large amount of data.

Would it be faster to extract the data to a Linux host and use a combination of tr (/bin/tr) and Oracle External Files?

  • There are probably tools out there which can do the replacement faster than Oracle. The question is do you want to go to the trouble of exporting and reimporting? Commented Apr 12, 2021 at 12:40
  • You could benchmark with a subset of the data to know for sure. My guess would be that writing all the data out to disk and reading it all back in would take vastly longer than whatever speed improvement you get from improving the update process. But if you're changing 90% of the data and you optimize the unload and load process, maybe you can make it faster. Commented Apr 12, 2021 at 12:43
  • You might want to consider a solution such as using views and generated columns to hide the data. That requires no changes to the actual data -- and updating large numbers of rows takes a looooooong time. Commented Apr 12, 2021 at 12:59
  • If you just need to remove some characters, no matter where they appear in the string, you should use REPLACE (not any regular expression functions; as to SUBSTR, I don't see how you would use it for this task). I don't think anything you can do will be faster than using REPLACE in the database. Same thing if you need to translate rather than remove; you can use the TRANSLATE function in the database. Commented Apr 12, 2021 at 14:05

2 Answers

I usually use TRANSLATE (see: https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/TRANSLATE.html#GUID-80F85ACB-092C-4CC7-91F6-B3A585E3A690) to delete characters from a string. But it depends a bit on how many characters you want to delete.

The following example is intended to illustrate this. The characters 'D' and 'E' are deleted from the input string.

SELECT TRANSLATE('ABCDEFG', '_DE', '_') FROM DUAL;

It returns 'ABCFG'.

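'ABCDEFG' is the input string. In the second argument, '_DE', the '_' maps to the '_' in the third argument, while 'D' and 'E' have no corresponding character there, so they are removed.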
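
For the load described in the question, the same call could sit inside the statement that fills the destination table. This is only a minimal sketch: the tables src_tab and dest_tab and their columns id and txt are hypothetical names, not taken from the question, and txt is assumed to be a VARCHAR2 column.

```sql
-- Hypothetical tables: src_tab(id, txt) and dest_tab(id, txt).
-- Copies every row and removes all 'D' and 'E' characters from txt on the way.
INSERT INTO dest_tab (id, txt)
SELECT id,
       TRANSLATE(txt, '_DE', '_')
FROM   src_tab;

COMMIT;
```

Doing the cleanup inside the INSERT ... SELECT means the data is read and written once, without a separate update pass.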

1 Comment

It's worth noting that in place of '_' there could be any other character. That one character's purpose is to avoid passing NULL (empty string is treated as null) as the third argument.
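
The point about the placeholder can be checked with two short queries. This is an illustrative sketch; the '#' placeholder is an arbitrary choice.

```sql
-- The third argument is an empty string, which Oracle treats as NULL,
-- so the whole expression evaluates to NULL instead of 'ABCFG'.
SELECT TRANSLATE('ABCDEFG', 'DE', '') AS result FROM DUAL;

-- Any placeholder works, provided it comes first in both arguments.
-- This returns 'ABCFG', the same as the '_' version.
SELECT TRANSLATE('ABCDEFG', '#DE', '#') AS result FROM DUAL;
```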

You can also use REPLACE, see the examples:

select REPLACE('AB 123456','AB','YZ') from dual;

It returns: 'YZ 123456'

In the next example, the third parameter of the REPLACE function has been omitted, so the corresponding string is removed from the original one.

select REPLACE('AB 123456','AB ') from dual;

It returns: '123456'
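
REPLACE handles one search string per call, so removing several different strings means nesting the calls. A small sketch that extends the example above; the extra '6' to remove is chosen purely for illustration.

```sql
-- The inner call removes 'AB ', the outer call then removes every '6'.
SELECT REPLACE(REPLACE('AB 123456', 'AB '), '6') AS result FROM DUAL;
```

It returns: '12345'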

See the full documentation for REPLACE in the Oracle SQL Language Reference.
