1

Querying data in a Postgres SQL database..

I have a column job_name with data like so, where I want to parse the year out of the string (the first 4 numbers)..

AG_2017_AJSJGHJS8GJSJ_7_25_WK_AHGSGHGHS

In this case I want to pull 2017

This works fine and pulls the rows where the 4 digits are >= 2019...

select 
    job_id, 
    cast(substring(job_name, 4, 4) as integer) as year 
from job_header 
where 
    job_type = 'PAMA' 
    and cast(substring(job_name, 4, 4) as integer) >= 2019

The issue is we have some data where the string is formatted slightly differently, and there are 3 characters in the beginning, or 4 (instead of 2), followed by an underscore like this...

APR3_2018_Another_Test_SC_PAMA

I still want to pull the first 4 numbers (so 2018 in this case), cast to int etc...Is there a way to make my first query dynamic to handle this?

1 Answer 1

3

You can use a regular expression to find the first group of 4 digits, e.g.:

with job_header(job_id, job_name, job_type) as (
values
    (1, 'AG_2017_AJSJGHJS8GJSJ_7_25_WK_AHGSGHGHS', 'PAMA'),
    (2, 'APR3_2018_Another_Test_SC_PAMA', 'PAMA')
)
select 
    job_id, 
    cast(substring(job_name from '\d{4}') as int) as year 
from job_header 
where job_type = 'PAMA'

 job_id | year
--------+------
      1 | 2017
      2 | 2018
(2 rows)

Read about POSIX Regular Expressions in the documentation.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.