1

I want to duplicate a row based on the Parameter column (ParamComp, which is 999 by default) and the ID. In the sample below, the threshold value is 999. If an ID has a row with ParamComp = 999 and another row with ParamComp <> 999, then for each row with ParamComp <> 999 we must create a new record that keeps that row's ParamComp but takes the ColVal of the ParamComp = 999 row.

If an ID has only rows with ParamComp = 999, just load them to the target directly (no duplication logic is needed).

Likewise, if an ID has only rows with ParamComp <> 999, just load them to the target directly (no duplication logic is needed).

Input Data

id  ParamComp   ColVal 
1   999         a
1   80          b
2   999         c
3   85          d

Target Data

id  ParamComp   ColVal  
1   999         a
1   80          b
1   80          a
2   999         c
3   85          d
4
  • I don't think your sample data follows the rules you have specified. If so, then id = 1 would have two rows with 999. Please clarify the question. Commented Feb 24, 2017 at 11:47
  • When you say you want to "duplicate a column" you mean a row, correct? I don't see any additional columns created. Then: if you create a new row based on an existing one, but the ParamComp value is different, then it's not even a "duplicate". And in the title: a relational database has "rows" rather than "records." Commented Feb 24, 2017 at 11:51
  • @GordonLinoff For an ID, 999 is the value to be compared. Based on the ColVal of the 999 row, the other rows are duplicated (in the case where an ID has both a ParamComp of 999 and a non-999 value). Commented Feb 24, 2017 at 12:03
  • @mathguy I have corrected the question. Commented Feb 24, 2017 at 12:03

2 Answers

1

An alternative to Gordon's answer (which may or may not be faster) is to do a partial cross join on a two-row dummy "table", like so:

WITH your_table AS (SELECT 1 ID, 999 paramcomp, 'a' colval FROM dual UNION ALL
                    SELECT 1 ID, 80 paramcomp, 'b' colval FROM dual UNION ALL
                    SELECT 2 ID, 999 paramcomp, 'c' colval FROM dual UNION ALL
                    SELECT 3 ID, 85 paramcomp, 'd' colval FROM dual UNION ALL
                    SELECT 4 ID, 999 paramcomp, 'e' colval FROM dual UNION ALL
                    SELECT 4 ID, 75 paramcomp, 'f' colval FROM dual UNION ALL
                    SELECT 4 ID, 70 paramcomp, 'g' colval FROM dual)
-- end of mimicking your table; see SQL below:
SELECT yt.ID,
       yt.paramcomp,
       CASE WHEN dummy.id = 1 THEN yt.colval
            WHEN dummy.id = 2 THEN yt.paramcomp_999_colval
       END colval
FROM   (SELECT ID,
               paramcomp,
               colval,
               MAX(CASE WHEN paramcomp = 999 THEN colval END) OVER (PARTITION BY ID) paramcomp_999_colval
        FROM   your_table) yt
       INNER JOIN (SELECT 1 ID FROM dual UNION ALL
                   SELECT 2 ID FROM dual) dummy ON dummy.id = 1 -- ensures every yt row is returned
                                                   OR (dummy.id = 2
                                                       AND paramcomp_999_colval IS NOT NULL
                                                       AND yt.paramcomp != 999) -- returns an extra row if the 999 paramcomp row exists but the current row isn't 999
ORDER BY yt.ID, yt.paramcomp DESC, yt.colval;

        ID  PARAMCOMP COLVAL
---------- ---------- ------
         1        999 a
         1         80 b
         1         80 a
         2        999 c
         3         85 d
         4        999 e
         4         75 e
         4         75 f
         4         70 g
         4         70 e

This assumes that there is only ever one 999 paramcomp row per id (e.g. a unique constraint on (id, paramcomp) exists).
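
That assumption matters because, with more than one 999 row for an id, MAX() would silently keep only the largest colval. A minimal sketch of how it could be checked (the cnt_999 alias is just an illustrative name, and the WITH clause again stands in for the real table): any id returned has more than one paramcomp = 999 row. For the sample data it returns no rows.

```sql
WITH your_table AS (SELECT 1 ID, 999 paramcomp, 'a' colval FROM dual UNION ALL
                    SELECT 1 ID, 80 paramcomp, 'b' colval FROM dual UNION ALL
                    SELECT 2 ID, 999 paramcomp, 'c' colval FROM dual UNION ALL
                    SELECT 3 ID, 85 paramcomp, 'd' colval FROM dual UNION ALL
                    SELECT 4 ID, 999 paramcomp, 'e' colval FROM dual UNION ALL
                    SELECT 4 ID, 75 paramcomp, 'f' colval FROM dual UNION ALL
                    SELECT 4 ID, 70 paramcomp, 'g' colval FROM dual)
-- end of mimicking your table; illustrative check below:
-- list every id that has more than one paramcomp = 999 row
SELECT ID,
       COUNT(*) cnt_999
FROM   your_table
WHERE  paramcomp = 999
GROUP BY ID
HAVING COUNT(*) > 1;
```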

You'd have to test this and Gordon's answer to see which is most performant against your data.


ETA: here's a fixed version of Gordon's answer for you to compare with:

select id, paramcomp, colval
from your_table
union all
select id, paramcomp, paramcomp_999_colval colval
from (select yt.*, MAX(CASE WHEN paramcomp = 999 THEN colval END) OVER (PARTITION BY ID) paramcomp_999_colval
      from your_table yt
     ) t
where paramcomp_999_colval IS NOT NULL and paramcomp <> 999
ORDER BY ID, paramcomp DESC, colval;
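
As a possible starting point for the comparison, you could look at the execution plan of each query. The sketch below assumes your_table is your real table (not the WITH clause used to mimic it) and that the standard plan table is available. It only shows the optimizer's estimates, so timing both queries against real data is still needed.

```sql
-- assumes your_table is the real table, not the WITH clause from the sample
EXPLAIN PLAN FOR
select id, paramcomp, colval
from your_table
union all
select id, paramcomp, paramcomp_999_colval colval
from (select yt.*, MAX(CASE WHEN paramcomp = 999 THEN colval END) OVER (PARTITION BY ID) paramcomp_999_colval
      from your_table yt
     ) t
where paramcomp_999_colval IS NOT NULL and paramcomp <> 999;

-- display the plan that was just generated
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Repeating the same two steps for the dummy-join query gives a second plan to set against this one.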

ETA2: More explanation of the use of the dummy table:

If you wanted to duplicate all rows in your table, you would do a cross join to a table/subquery that has two rows, like so:

SELECT *
FROM   your_table yt
CROSS JOIN (SELECT 1 ID FROM dual UNION ALL
            SELECT 2 ID FROM dual) dummy;

        ID  PARAMCOMP COLVAL         ID
---------- ---------- ------ ----------
         1        999 a               1
         1         80 b               1
         2        999 c               1
         3         85 d               1
         4        999 e               1
         4         75 f               1
         4         70 g               1
         1        999 a               2
         1         80 b               2
         2        999 c               2
         3         85 d               2
         4        999 e               2
         4         75 f               2
         4         70 g               2

However, you don't always want the duplicate row to appear, so you need to do an inner join that's a bit selective. I'll break down the inner join in my initial answer so you can hopefully see what it's doing a bit better.

First, here's the part of the join that ensures that each row in your_table is returned:

SELECT *
FROM   your_table yt
INNER JOIN (SELECT 1 ID FROM dual UNION ALL
            SELECT 2 ID FROM dual) dummy ON dummy.id = 1;

        ID  PARAMCOMP COLVAL         ID
---------- ---------- ------ ----------
         1        999 a               1
         1         80 b               1
         2        999 c               1
         3         85 d               1
         4        999 e               1
         4         75 f               1
         4         70 g               1

Next, here's the part of the join that ensures the selective joining:

SELECT *
FROM   your_table yt
INNER JOIN (SELECT 1 ID FROM dual UNION ALL
            SELECT 2 ID FROM dual) dummy ON dummy.id = 2
                                            AND yt.paramcomp != 999;

        ID  PARAMCOMP COLVAL         ID
---------- ---------- ------ ----------
         1         80 b               2
         3         85 d               2
         4         75 f               2
         4         70 g               2

You can see with this second part that we still get the id = 3 row, which we don't want. So, in my final answer above, I found out what the colval of the paramcomp = 999 row was and returned that for all rows, using a conditional max analytic function. Then, I added that into the 2nd join condition part to only return rows that had a 999 colval (if they don't have a value, then we assume that the 999 row doesn't exist). This does assume that the colval will always be present for the 999 row.
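
To see what the conditional max analytic function contributes on its own, the inner subquery can be run by itself. This is just the yt subquery from the initial answer, lifted out and given an ORDER BY for readability:

```sql
WITH your_table AS (SELECT 1 ID, 999 paramcomp, 'a' colval FROM dual UNION ALL
                    SELECT 1 ID, 80 paramcomp, 'b' colval FROM dual UNION ALL
                    SELECT 2 ID, 999 paramcomp, 'c' colval FROM dual UNION ALL
                    SELECT 3 ID, 85 paramcomp, 'd' colval FROM dual UNION ALL
                    SELECT 4 ID, 999 paramcomp, 'e' colval FROM dual UNION ALL
                    SELECT 4 ID, 75 paramcomp, 'f' colval FROM dual UNION ALL
                    SELECT 4 ID, 70 paramcomp, 'g' colval FROM dual)
-- end of mimicking your table; see SQL below:
SELECT ID,
       paramcomp,
       colval,
       -- colval of the 999 row, repeated on every row of the same id (NULL if the id has no 999 row)
       MAX(CASE WHEN paramcomp = 999 THEN colval END) OVER (PARTITION BY ID) paramcomp_999_colval
FROM   your_table
ORDER BY ID, paramcomp DESC;
```

For the sample data, paramcomp_999_colval comes back as a on both id = 1 rows, c for id = 2, e on all three id = 4 rows, and NULL for id = 3. That NULL is what keeps the id = 3 row from being duplicated.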

8 Comments

Gordon's answer was not giving the desired output: it was duplicating the 999 record itself and copying the colval of the non-999 record to it. Your answer is working and gives me the desired output. Is there any workaround to correct Gordon's answer, so that I can compare the performance?
I've updated my answer to include a working version of Gordon's answer.
I've also updated my answer to include a more detailed explanation of the dummy table usage. If you still don't understand, I suggest you break the query down and have a play around for yourself, e.g. commenting out bits of the join conditions etc, to see what it's doing.
You'd need to use a conditional analytic function to check for rows in the partition (aka group) that aren't paramcomp = 999, then you'd include that check in the join to dummy.id = 1 section. Try to work it out for yourself and if you get stuck, please raise a new question.
If you look in my initial solution, you will see that in the yt subquery I select MAX(CASE WHEN paramcomp = 999 THEN colval END) OVER .... The presence of the OVER clause indicates that it's an analytic function. The CASE statement inside the MAX() indicates that it's conditional. In your case, you need to check that there is at least one row that doesn't have a paramcomp = 999. Then you can add that into the join condition in the relevant place.
0

If I assume that 999 is the maximum value for paramcomp, then an analytic function and union all can solve the problem. Following the rules you specify in the text this would be:

select id, paramcomp, colval
from t
union all
select id, 999 as paramcomp, colval
from (select t.*, max(paramcomp) over (partition by id) as max_paramcomp
      from t
     ) t
where max_paramcomp = 999 and paramcomp <> 999;

This is easily modified for a simple variant of the rules.

1 Comment

In this table we have only '999' as the threshold. Also, I don't want the 999 row to get duplicated.
