2

Say I have a column in postgres database called pid that looks like this:

set1/2019-10-17/ASR/20190416832-ASR.pdf
set1/2019-03-15/DEED/20190087121-DEED.pdf
set1/2021-06-22/DT/20210376486-DT.pdf

I want to remove everything after the last dash "-" including the dash itself. So expected results:

set1/2019-10-17/ASR/20190416832.pdf
set1/2019-03-15/DEED/20190087121.pdf
set1/2021-06-22/DT/20210376486.pdf

I've looked into replace() and split_part() functions but still can't figure out how to do this. Please advise.

1
  • Show us your attempt so we can see where we could assist. Commented Jan 4, 2023 at 9:34

1 Answer 1

3

We can use a regex replacement here:

SELECT col, REGEXP_REPLACE(col, '^(.*)-[^-]+(\.\w+)$', '\1\2') AS col_out
FROM yourTable;

The regex used above captures the column value before the last dash in \1, and the extension in \2. It then builds the output using \1\2 as the replacement.

Here is a working regex demo.

Sign up to request clarification or add additional context in comments.

2 Comments

What is the '\1\2' part after the comma in REGEXP_REPLACE()?
@amnesic \1 refers to the first capture group, which is (.*). The \2 is the second capture group and is (\.\w+).

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.