I have IDs, their as-at dates (in end-of-month format), and their corresponding flags. I only care about rows where the flag is Y.

It is in the following format:

ID as_at_date Enabled_Flag
1 31/01/2025 Y
1 28/02/2025 Y
1 31/03/2025 Y
1 30/06/2025 Y
2 31/01/2025 Y
2 28/02/2025 Y
2 31/03/2025 Y
2 30/04/2025 Y
2 30/06/2025 Y
2 30/09/2025 Y

As you can see, ID 1's flag is enabled from January to March. Because it has no entries for April and May, it is disabled in those months, but it is enabled again in June. The same applies to ID 2.

I want to transform the data into the following format using SQL (Teradata SQL preferably, but any SQL is workable). A to_date of "01/01/3000" indicates that the row is the current/most recent record.

ID from_date to_date Enabled_Flag
1 31/01/2025 31/03/2025 Y
1 30/06/2025 01/01/3000 Y
2 31/01/2025 30/04/2025 Y
2 30/06/2025 30/06/2025 Y
2 30/09/2025 01/01/3000 Y

Using MIN()/MAX() on the data doesn't work, because it just takes the overall MAX() date and doesn't tell me whether an ID left the 'Y' population at any point. Please help me.

3
  • You don't show any rows with enabled='N'. Do they exist and, if so, how should they be treated? Commented Jul 11 at 7:12
  • Note for the future, if you provide sample data as we have done in the answers, you make it much easier for people to answer. Commented Jul 11 at 7:32
  • Check this question from a few days ago: stackoverflow.com/q/79695686/2527905. It's similar; you just need to convert as_at_date to a period covering the full month using period(as_at_date, oAdd_months(as_at_date, 1)). Commented Jul 11 at 8:39

3 Answers

2

I guess there is no simple, short query which produces the expected result.

You have to implement a heavy mix of steps (or you need to do it in another way rather than in pure SQL):

  1. Spotting real gaps
    You need to know when one month jumps straight to the next and when it doesn’t. SQL won’t just “see” that for you. You have to compare each date to the one before and say “yes, that’s a gap” or “no, that’s consecutive.”

  2. Bunching back‑to‑back months together
    Once you know where the gaps are, you still have to turn each run of consecutive months into a single row. That means tagging every row with a bucket number, then collapsing those buckets with a simple MIN/MAX call.

  3. Handling as many IDs as you’ve got
    You can’t assume there are just one or two IDs. The same logic has to work if you have dozens or hundreds, each with its own pattern of gaps and runs.

  4. Ending the last run differently
    Every ID’s final enabled period should not end on its real last date. Instead, it needs the date 3000-01-01 to show that the period is “ongoing.” You have to detect “this is the last group for that ID” and swap in the sentinel end date.

  5. Putting it all together
    To keep it in a single query in pure SQL, you need to...

    • First tag each row with “where’s the previous date?”

    • Then turn those tags into running group numbers

    • Then group by those numbers to get from/to

    • Then find the final group and patch its end date

Each of those steps is straightforward on its own, but to combine them inside one SQL query is quite hard. The following query is the best solution I could build using ANSI-compliant functions:

SELECT
  id,
  MIN(as_at_date) AS from_date,
  /* If this group's end date is the last date we see for this ID,
     make it open‑ended as 3000‑01‑01 */
  CASE
    WHEN MAX(as_at_date) = MAX(MAX(as_at_date)) OVER (PARTITION BY id)
    THEN CAST('3000-01-01' AS DATE)
    ELSE MAX(as_at_date)
  END AS to_date,
  'Y' AS Enabled_Flag

FROM (
  SELECT
    id,
    as_at_date,
    /* 
       Build groups so that true month‑to‑month runs get the same group:
       1) turn the date into a month number (year * 12 + month)
       2) subtract the row’s sequence number for that ID
       as long as dates are back‑to‑back months,
       the difference stays the same. When there's a gap, it changes.
    */
    (EXTRACT(YEAR FROM as_at_date) * 12
     + EXTRACT(MONTH FROM as_at_date))
    - ROW_NUMBER() OVER (PARTITION BY id ORDER BY as_at_date)
      AS grp
  FROM your_table
  WHERE Enabled_Flag = 'Y'
) t
-- group by id and group to collapse each run into one row
GROUP BY id, grp
ORDER BY id, from_date;

This produces the expected result for your sample data. I verified it on this db<>fiddle (the fiddle uses Postgres because I could not find one for Teradata, but the query should work in Teradata, too).
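To see why the grp expression in the query above separates the runs, it can help to compute it by hand. The following is only a sketch in plain Python rather than SQL (chosen here purely for illustration; `month_number` and `islands` are hypothetical helper names, not part of the answer). It mirrors the subquery, assuming one row per ID and month: month number minus the per-ID row number, then the first and last date per group.

```python
from datetime import date
from itertools import groupby

OPEN_END = date(3000, 1, 1)  # sentinel "still current" date from the question

# Sample data from the question (every row has Enabled_Flag = 'Y').
rows = [
    (1, date(2025, 1, 31)), (1, date(2025, 2, 28)), (1, date(2025, 3, 31)),
    (1, date(2025, 6, 30)),
    (2, date(2025, 1, 31)), (2, date(2025, 2, 28)), (2, date(2025, 3, 31)),
    (2, date(2025, 4, 30)), (2, date(2025, 6, 30)), (2, date(2025, 9, 30)),
]


def month_number(d):
    """year * 12 + month, as in the SQL subquery."""
    return d.year * 12 + d.month


def islands(rows):
    """Return (id, from_date, to_date) for each run of consecutive months."""
    result = []
    for id_, id_rows in groupby(sorted(rows), key=lambda r: r[0]):
        dates = [d for _, d in id_rows]
        # ROW_NUMBER() starts at 1; the difference stays constant within a run
        # and grows whenever a month is skipped.
        keyed = [(month_number(d) - n, d) for n, d in enumerate(dates, start=1)]
        for _, run in groupby(keyed, key=lambda k: k[0]):
            run_dates = [d for _, d in run]
            last = run_dates[-1]
            # The run that contains the ID's latest date is open-ended.
            to_date = OPEN_END if last == dates[-1] else last
            result.append((id_, run_dates[0], to_date))
    return result


for row in islands(rows):
    print(row)
```

For ID 1 the group keys are 24300, 24300, 24300 and 24302: the first three rows share a key, and the June row gets a new one because two months were skipped.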

Note 1: There may be ways to simplify this code (for example, in Postgres you could likely use functions such as RANGE_MERGE), but they require RDBMS-specific functions. Since I don't know much about Teradata, I stuck to standard SQL.

Note 2: What about rows with other values in the column Enabled_Flag?

2

You said any SQL will do, so here is a solution for SQL Server - it will certainly need translating to Teradata.

It's not as concise as the other answers; my way of thinking is to build up the logic methodically. It's potentially not as performant, though.

This is called a gaps-and-islands problem: you are trying to find "islands" where each date is exactly one month after the date on the previous row. The calculation works as follows:

  • Use LAG to determine whether or not the row is exactly one month on from the previous row.
  • Assign a number representing the group to each set of consecutive rows.
  • For each "island" group take the min date and the max date.
  • Have some special logic to detect the last row for an ID and assign the fixed date in the future.

WITH TempCalc1 AS (
  SELECT *
    -- Check whether this row is exactly 1 month on from the previous row
    , CASE WHEN dateadd(day, -1, dateadd(month, -1, dateadd(day, 1, as_at_date))) = LAG(as_at_date) OVER (PARTITION BY ID ORDER BY as_at_date asc) THEN 0 ELSE 1 END ChangeOfGroup
  FROM MyData
), TempCalc2 AS (
  SELECT *
    -- Use the ChangeOfGroup value to allocate a Group Number
    , SUM(ChangeOfGroup) OVER (PARTITION BY ID ORDER BY as_at_date ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) GroupNumber
    -- Work out which is the last date so we can allocate the fixed end date
    , MAX(as_at_date) OVER (PARTITION BY ID) MaxIdDate
  FROM TempCalc1
)
SELECT Id
  -- Take the first date in the group
  , MIN(as_at_date) from_date
  -- And the last date in the group, unless it's the last row for the ID and then take the fixed end date
  , CASE WHEN MAX(as_at_date) = MaxIdDate THEN CAST('3000-01-01' AS datetime2) ELSE MAX(as_at_date) END to_date
  , MAX(Enabled_Flag) Enabled_Flag
FROM TempCalc2
GROUP BY ID, GroupNumber, MaxIdDate
ORDER BY ID, GroupNumber;

This returns:

Id from_date to_date Enabled_Flag
1 2025-01-31 00:00:00.0000000 2025-03-31 00:00:00.0000000 1
1 2025-06-30 00:00:00.0000000 3000-01-01 00:00:00.0000000 1
2 2025-01-31 00:00:00.0000000 2025-04-30 00:00:00.0000000 1
2 2025-06-30 00:00:00.0000000 2025-06-30 00:00:00.0000000 1
2 2025-09-30 00:00:00.0000000 3000-01-01 00:00:00.0000000 1

db<>fiddle

Note: You should always use an unambiguous date format when presenting date data as strings, which is why I changed it.
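The nested dateadd calls in the ChangeOfGroup expression are needed because month-end dates fall on different day numbers: subtracting one month from 28 Feb gives 28 Jan, not 31 Jan. Stepping forward a day, back a month, and back a day always lands on the previous month end. Below is a minimal sketch of that check and of the running group number, written in Python purely for illustration (`previous_month_end` and `group_numbers` are hypothetical names, and the input dates are assumed to be month ends).

```python
from datetime import date, timedelta


def previous_month_end(month_end):
    """Mirror dateadd(day, -1, dateadd(month, -1, dateadd(day, 1, d)))."""
    first_of_next = month_end + timedelta(days=1)  # e.g. 31 Mar -> 1 Apr
    if first_of_next.month == 1:                   # step back one month
        first_of_this = first_of_next.replace(year=first_of_next.year - 1, month=12)
    else:
        first_of_this = first_of_next.replace(month=first_of_next.month - 1)
    return first_of_this - timedelta(days=1)       # e.g. 1 Mar -> 28 Feb


def group_numbers(dates):
    """Running sum of the change-of-group flag for one ID's sorted dates."""
    numbers, current, previous = [], 0, None
    for d in dates:
        # Like LAG() returning NULL on the first row, no previous date
        # means a new group starts here.
        if previous is None or previous_month_end(d) != previous:
            current += 1
        numbers.append(current)
        previous = d
    return numbers


id2 = [date(2025, m, d) for m, d in [(1, 31), (2, 28), (3, 31), (4, 30), (6, 30), (9, 30)]]
print(group_numbers(id2))  # [1, 1, 1, 1, 2, 3]
```

The group numbers correspond to GroupNumber in TempCalc2: January to April share one group, while June and September each start a new one.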

1 Comment

General remark about your answers: I'm wondering if you could perhaps omit the VALUES from your answers (let them for the fiddle): they can make a big, impressive block of SQL that makes your answer look discouraging (and sometimes requires scrolling), while in fact after ignoring that test data it appears that it is concise and clean.
0

For Teradata, this can be solved using the PERIOD data type and NORMALIZE. Since your as_at_date is always the end of the month, we can create a period for each row running from the first of the month through the as_at_date; NORMALIZE then merges periods that are contiguous. The last row for each ID (detected with LEAD returning null) is treated differently: its period begins at the as_at_date itself and is open-ended, so that it comes out with 3000-01-01 as the end date. I've tried to explain things in SQL comments. Based on your very limited sample data, this looks to work.

select
    id,
    begin(prd) as from_date,
    end(prd) - 1 as to_date,
    enabled_flag
from (
    select normalize
    id,
    enabled_flag,
    period(
        -- beginning of the period:
        -- the first day of the month, or the as_at_date itself
        -- for the last row of the ID (lead returns null)
        case when lead(trunc(as_at_date, 'MONTH')) over (partition by id order by as_at_date) is null
             then as_at_date
             else trunc(as_at_date, 'MONTH') end,
        -- end of the period (the period's begin is inclusive, its end is exclusive):
        -- as_at_date + 1 day, or 3000-01-02 for the last row of the ID,
        -- so that end(prd) - 1 returns 3000-01-01
        case when lead(trunc(as_at_date, 'MONTH')) over (partition by id order by as_at_date) is null
             then date '3000-01-02'
             else as_at_date + 1 end) as prd
    from
    <your table>
) t
order by 1, 2
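For readers without a Teradata system at hand, the merging step can be approximated outside the database. The sketch below is Python, not Teradata SQL, and rests on an assumption: that NORMALIZE, with its default behavior, combines periods that meet or overlap. `normalize_periods` is a hypothetical helper, and periods are modeled as half-open (begin, end) pairs, matching the inclusive-begin, exclusive-end convention noted in the query comments.

```python
from datetime import date


def normalize_periods(periods):
    """Merge half-open [begin, end) periods that meet or overlap."""
    merged = []
    for begin, end in sorted(periods):
        if merged and begin <= merged[-1][1]:
            # Meets (begin == previous end) or overlaps: extend the previous period.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((begin, end))
    return merged


# The first five rows of ID 2 from the question, built the way the query above
# builds prd for rows that are not the last one for their ID:
# first of the month through as_at_date + 1 day.
periods = [
    (date(2025, 1, 1), date(2025, 2, 1)),
    (date(2025, 2, 1), date(2025, 3, 1)),
    (date(2025, 3, 1), date(2025, 4, 1)),
    (date(2025, 4, 1), date(2025, 5, 1)),
    (date(2025, 6, 1), date(2025, 7, 1)),
]

for begin, end in normalize_periods(periods):
    print(begin, end)
```

January through April collapse into one period because each one ends exactly where the next begins; the June period stays separate because nothing covers May.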

3 Comments

Tried this on my data. Works great!! Appreciate the answer. I need the dates to be month-ends rather than start of the months, I'm guessing it's the lead function doing that.
- try `last_day(begin(prd))` to get the month end date.
