I have a table in PostgreSQL that has two date fields (start and end). Both fields contain many invalid dates, such as 0988-08-11 and 4987-09-11. Is there a simple query to identify them? The data type of both fields is DATE. Thanks in advance.

  • A column defined as date cannot contain invalid dates. Commented Sep 18, 2018 at 18:40
  • What is your definition of an "invalid date"? Commented Sep 18, 2018 at 18:42
  • The dates in your question look just fine... Commented Sep 18, 2018 at 18:43
  • @a_horse_with_no_name: These incorrect dates entered the system through an old tool. Manual input does not accept such values. I am trying to find these bad dates and delete them. Thanks. Commented Sep 18, 2018 at 18:45
  • Select all the dates that are not between a min and a max date (a range of "valid" dates according to your business logic). Commented Sep 18, 2018 at 18:49

2 Answers

Values in a date column are valid by definition. The year 0988 (that is, 988) is a valid historical date, just as the year 4987 is a valid date far in the future.

To find dates that are too far in the past or too far in the future, compare the column against a minimum and a maximum date of your choosing:

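SELECT
    date_col
FROM
    my_table    -- placeholder name; "table" itself is a reserved word
WHERE
       date_col < DATE '1900-01-01'    -- <MINIMUM DATE>, adjust to your rules
    OR date_col > DATE '2100-01-01'    -- <MAXIMUM DATE>, adjust to your rules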

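Since each row has a start and an end date, you could also use PostgreSQL's daterange functionality: build a range from each row and test whether the range of acceptable dates (your minimum and maximum date) contains it, using the @> operator: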

Example table:

start_date    end_date
2015-01-01    2017-01-01   -- valid
0200-01-01    0900-01-01   -- completely too early
3000-01-01    4000-01-01   -- completely too late
0200-01-01    2000-01-01   -- begin too early
2000-01-01    4000-01-01   -- end too late
0200-01-01    4000-01-01   -- begin too early, end too late

Query:

SELECT 
    start_date, 
    end_date 
FROM 
    dates 
WHERE 
    daterange('1900-01-01', '2100-01-01') @> daterange(start_date, end_date)

Result:

start_date    end_date
2015-01-01    2017-01-01
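
The query above returns the rows that pass the check. To list the rows that fail it instead, the condition can be negated. This is only a sketch, assuming the same dates table and the same illustrative bounds of 1900-01-01 and 2100-01-01. One caveat: daterange() raises an error when start_date is later than end_date, so rows like that would have to be found and handled separately first.

SELECT
    start_date,
    end_date
FROM
    dates
WHERE
    -- keep only rows whose range is NOT fully inside the accepted range
    NOT (daterange('1900-01-01', '2100-01-01') @> daterange(start_date, end_date))

Against the example table this should return the other five rows.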

demo:db<>fiddle


1 Comment

Thank you so much. It should help. The organization is just 20 years old and the start and end of an employee can't be 988 or 4987.

Those are valid dates, but if you have business rules that state they are not valid for your purpose, you can delete them based on those rules:

For example, if you don't want any dates before 1900-01-01 or on or after 2999-01-01, this statement would delete the records with those dates:

DELETE FROM mytable
WHERE
    start_date < '1900-01-01'::DATE OR
    start_date >= '2999-01-01'::DATE OR
    end_date < '1900-01-01'::DATE OR
    end_date >= '2999-01-01'::DATE;
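
Because a DELETE like this is destructive, it may be worth previewing what it removes before committing to it. One possible approach, sketched here with the same mytable and the same bounds, is to run it inside a transaction with a RETURNING clause and then roll back:

BEGIN;

DELETE FROM mytable
WHERE
    start_date < '1900-01-01'::DATE OR
    start_date >= '2999-01-01'::DATE OR
    end_date < '1900-01-01'::DATE OR
    end_date >= '2999-01-01'::DATE
RETURNING *;    -- lists every row the DELETE removed

ROLLBACK;       -- undoes the delete; use COMMIT instead once the output looks right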

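If you want to replace out-of-range dates with the nearest boundary date instead of deleting the entire record, you could do something like this (note that a value clamped to 2999-01-01 still matches the >= condition, so tighten the upper bound if that matters for your rules):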

UPDATE mytable
SET
    start_date = least('2999-01-01'::DATE, greatest('1900-01-01'::DATE, start_date)),
    end_date = least('2999-01-01'::DATE, greatest('1900-01-01'::DATE, end_date))
WHERE
    start_date < '1900-01-01'::DATE OR
    start_date >= '2999-01-01'::DATE OR
    end_date < '1900-01-01'::DATE OR
    end_date >= '2999-01-01'::DATE;
