Assume a table with three columns (the real table is much larger): id, is_deleted, date. I need to check whether each id is deleted and add a new column holding that value (TRUE or FALSE). Simplified, the table looks like this (before):
| id | is_deleted | date |
|---|---|---|
| A | False | 03-07-2022 |
| A | True | 04-07-2022 |
| B | False | 05-07-2022 |
| B | False | 06-07-2022 |
| C | True | 07-07-2022 |
(after):
| id | is_deleted | date | deleted |
|---|---|---|---|
| A | False | 03-07-2022 | TRUE |
| A | True | 04-07-2022 | TRUE |
| B | False | 05-07-2022 | FALSE |
| B | False | 06-07-2022 | FALSE |
| C | True | 07-07-2022 | TRUE |
So every row with id A or C should have TRUE in the new column. A given id can have more than one TRUE value in the is_deleted column. If an id has at least one TRUE value, all rows with that id should be marked as deleted (TRUE in the new column). I need to do this within the table itself, without GROUP BY, because with GROUP BY I would have to create another CTE and join against it, which complicates the query and hurts performance.
I just want to add a single column with the new deleted value to this table.
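For reference, the GROUP BY version I'm trying to avoid would look roughly like the sketch below. The table name `my_table` and the CTE name `deleted_ids` are placeholders I made up for this example, not the real names:

```sql
-- Sketch of the GROUP BY + join approach (the one I'd like to avoid).
-- my_table and deleted_ids are placeholder names.
with deleted_ids as (
    -- One row per id: TRUE if any row for that id has is_deleted = TRUE.
    select id,
           bool_or(is_deleted) as deleted
    from my_table
    group by id
)
select t.id,
       t.is_deleted,
       t."date",      -- quoted because date is also a type name
       d.deleted
from my_table t
join deleted_ids d
  on d.id = t.id;
```

This gives the (after) table above, but at the cost of the extra CTE and join.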
I found the bool_or function, but it doesn't work as a window function in Redshift. My code:
bool_or(is_deleted) over(partition by id) as is_del
I can't use max or sum on booleans, and casting bool to int hurts performance. Is there another way to do this with booleans while keeping good performance?
Thank you.
Should every row with id A have an is_del value of True, or just the one row with is_deleted = True? It's not clear to me. Perhaps sharing the desired results after this operation is complete would help clarify.