I have data that looks something like:

ID1   ID2   ID3   ID4
123    32    43   123
 56    67    56    89
123    56   123    56

Each row describes a sequence starting at ID1 and ending at ID4. I am interested only in extracting the pattern, not the IDs involved. For example, the pattern in the first row would be:

ABCA: since it starts at an ID, goes to a new ID (B), then another new ID (C), and back to the original ID (A).

For the second row it would be: ABAC

and for the third it would be: ABAB.
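To pin down the rule outside SQL, here is a minimal Python sketch of the labelling described above: each ID gets the next unused letter the first time it appears and reuses that letter afterwards. The function name `pattern` and the `rows` list are hypothetical, chosen only for illustration.

```python
def pattern(ids):
    """Label each ID by the order in which it first appears: A, B, C, ..."""
    labels = {}  # maps an ID to the letter assigned at its first occurrence
    out = []
    for value in ids:
        if value not in labels:
            # Next unused letter: 'A' for the first distinct ID, 'B' for the second, ...
            labels[value] = chr(ord("A") + len(labels))
        out.append(labels[value])
    return "".join(out)


# The three sample rows from the question.
rows = [(123, 32, 43, 123), (56, 67, 56, 89), (123, 56, 123, 56)]
patterns = [pattern(r) for r in rows]
print(patterns)  # ['ABCA', 'ABAC', 'ABAB']
```

This is only a reference for checking results; the goal remains to compute the same string inside SQL Server.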

I am looking for an efficient way to do this in SQL Server, rather than writing a massive IF statement that covers every potential case.

1 Answer

Hmmm. Here is a brute force method:

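select 'A' +
       (case when id2 = id1 then 'A' else 'B' end) +
       (case when id3 = id1 then 'A'
             when id3 = id2 then 'B'
             when id2 = id1 then 'B'   -- id3 is new and only 'A' is in use
             else 'C'
        end) +
       (case when id4 = id1 then 'A'
             when id4 = id2 then 'B'
             when id4 = id3 and id2 = id1 then 'B'
             when id4 = id3 then 'C'
             -- from here on id4 is a new ID: count the distinct IDs before it
             when id2 = id1 and id3 = id1 then 'B'
             when id2 = id1 or id3 = id1 or id3 = id2 then 'C'
             else 'D'
        end) as pattern
from t;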

This is a bit complicated, but something like this should work.
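A ladder of CASE branches like this is easy to get subtly wrong, so it can help to check it exhaustively. With four columns there are at most four distinct IDs, so trying every combination of the values 0 to 3 covers every possible pattern. The sketch below is a Python port of one possible four-column ladder, compared against a first-occurrence labelling. The names `reference` and `case_ladder` are hypothetical, and the branch set is my assumption of what a complete ladder needs; it need not match the SQL above line for line.

```python
from itertools import product


def reference(ids):
    """First-occurrence labelling: the first distinct ID is 'A', the next 'B', ..."""
    labels = {}
    out = []
    for value in ids:
        if value not in labels:
            labels[value] = chr(ord("A") + len(labels))
        out.append(labels[value])
    return "".join(out)


def case_ladder(id1, id2, id3, id4):
    """Branch-by-branch version, mirroring a SQL CASE expression per position."""
    second = "A" if id2 == id1 else "B"

    if id3 == id1:
        third = "A"
    elif id3 == id2:
        third = "B"
    elif id2 == id1:
        third = "B"  # id3 is new and only 'A' is in use
    else:
        third = "C"

    if id4 == id1:
        fourth = "A"
    elif id4 == id2:
        fourth = "B"
    elif id4 == id3:
        fourth = "B" if id2 == id1 else "C"  # id3 was new at position 3
    elif id1 == id2 == id3:
        fourth = "B"  # id4 is new, one distinct ID before it
    elif id2 == id1 or id3 == id1 or id3 == id2:
        fourth = "C"  # id4 is new, two distinct IDs before it
    else:
        fourth = "D"  # id4 is new, three distinct IDs before it

    return "A" + second + third + fourth


# Every combination of four values drawn from {0, 1, 2, 3}: 256 cases.
mismatches = [c for c in product(range(4), repeat=4)
              if case_ladder(*c) != reference(c)]
print(len(mismatches))  # 0
```

Rows such as (56, 67, 56, 89), where the third ID repeats the first and the fourth is new, are the kind of case worth checking by hand as well.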

EDIT:

Here is another method that should work:

select t.*, p.pattern
from t outer apply
     (select (max(case when pos = 1 then val end) +
              max(case when pos = 2 then val end) +
              max(case when pos = 3 then val end) +
              max(case when pos = 4 then val end)
             ) as pattern
      from (select v.*,
                   char(ascii('A') + dense_rank() over (order by minpos) - 1) as val
            from (select v.*, min(pos) over (partition by id) as minpos
                  from (values (id1, 1), (id2, 2), (id3, 3), (id4, 4)) as v(id, pos)
                 ) v
           ) v
     ) p;

Explaining how this works is quite a challenge. The values() clause unpivots the four columns into rows, one per position, so the first row ends up like:

id    pos
123     1
 32     2
 43     3
123     4

The next level puts the minimum pos where the value is found:

id    pos    minpos
123     1      1
 32     2      2
 43     3      3
123     4      1

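(Note: it is a coincidence that the distinct minpos values here are the consecutive numbers 1, 2, 3. In general there can be gaps, which is why the next step uses dense_rank() rather than minpos itself.)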

Then the dense_rank() turns this into letters:

id    pos    minpos   val
123     1      1       A
 32     2      2       B
 43     3      3       C
123     4      1       A

And the final aggregation puts this into the pattern ABCA.
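The four steps above (unpivot, minimum position, dense rank, aggregate) can be mirrored in plain Python to check the reasoning on rows where minpos has gaps. This is a sketch of the same idea, not the SQL itself; the function name `pattern_by_rank` is hypothetical.

```python
def pattern_by_rank(id1, id2, id3, id4):
    # Step 1: unpivot the four columns into (id, pos) rows, like the VALUES clause.
    rows = [(id1, 1), (id2, 2), (id3, 3), (id4, 4)]

    # Step 2: minpos = smallest position at which each id occurs,
    # the equivalent of MIN(pos) OVER (PARTITION BY id).
    minpos = {}
    for value, pos in rows:
        minpos[value] = min(pos, minpos.get(value, pos))

    # Step 3: dense rank of the distinct minpos values (1, 2, 3, ... with no gaps).
    rank = {m: r for r, m in enumerate(sorted(set(minpos.values())), start=1)}

    # Step 4: turn each rank into a letter and concatenate in position order.
    ordered = sorted(rows, key=lambda row: row[1])
    return "".join(chr(ord("A") + rank[minpos[value]] - 1) for value, _ in ordered)


print(pattern_by_rank(123, 32, 43, 123))  # ABCA
print(pattern_by_rank(7, 7, 9, 7))        # AABA
```

In the second call the minpos values are 1 and 3, so there is a gap; the dense rank maps them to 1 and 2, which is what keeps the letters consecutive.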
