As part of my attribution modeling setup, I need to assign each order_id to the interactions that happened before that order but after any previous order.

For example, I have a table like this:

enter image description here

And I would like to have something like this:

enter image description here

thus filling in where the order_id is null.

I have tried a self join, taking the order_id where the time is <= the time in the second join, but with no luck: this duplicates some of the rows.

EDIT:

Here is my SQL attempt:

select
c.cookie_id,
c.order_id,
c.channel,
c.min_report_timestamp
,case when lead(c.cookie_id) over (order by min_report_timestamp desc) is null then c.order_id else c.order_id end as order_id_ok

,lead(c.order_id) over (order by min_report_timestamp desc) as lead_order
--,lag(c.order_id) over (order by min_report_timestamp asc) as lag_order

from table c

group by 1,2,3,4

I have an idea of the conditions I should use. The problem is that I can't make it fill in the missing values with the order_id I need: it doesn't "carry" the order id from row to row.

  • You might have luck with a LAG function, if available. Commented Mar 11, 2016 at 17:09
  • Can you post the actual sample data rather than a screenshot, so we can grab it for a test run? Also, please post your attempted query, which might only need a slight adjustment and shows your effort. Commented Mar 11, 2016 at 18:13
  • Thanks guys, that really did the trick. Commented Mar 12, 2016 at 13:01

2 Answers

--Create table with dummy data.
with c(cookie, channel, order_id, order_timestamp) AS(
values
('hxaxlw79u', 'price_comparison', Null, '2016-03-10 10:24:55'),
('hxaxlw79u', 'price_comparison', Null, '2016-03-10 10:24:56'),
('hxaxlw79u', 'price_comparison', Null, '2016-03-10 10:24:57'),
('hxaxlw79u', 'price_comparison', 1, '2016-03-10 10:24:58'),
('hxaxlw79u', 'price_comparison', Null, '2016-03-10 10:24:59'),
('hxaxlw79u', 'price_comparison', Null, '2016-03-10 10:25:00'),
('hxaxlw79u', 'price_comparison', Null, '2016-03-10 10:25:01'),
('hxaxlw79u', 'price_comparison', 2, '2016-03-10 10:25:02'),
('hxaxlw79u2', 'price_comparison', Null, '2016-03-10 10:25:00'),
('hxaxlw79u2', 'price_comparison', 1, '2016-03-10 10:25:01'),
('hxaxlw79u2', 'price_comparison', Null, '2016-03-10 10:25:02'),
('hxaxlw79u2', 'price_comparison', 2, '2016-03-10 10:25:02')

),

--Get a lagged table.
Data AS
(
SELECT      c.cookie, c.channel, c.order_id, c.order_timestamp, 
        LAG(COALESCE(c.order_id, 0), 1, 0) OVER (PARTITION BY c.cookie, c.channel   ORDER BY c.order_timestamp) as lag

FROM c
)

--Get the result
SELECT      d.cookie, d.channel, d.order_id, d.order_timestamp,
        1+ SUM(d.lag) OVER(PARTITION BY d.cookie, d.channel ORDER BY d.order_timestamp) as result


FROM        data d

EDIT: Changed names of table to reflect names in OP and removed extra column in final query.
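
One caveat with the running sum above: adding up the lagged ids reproduces the next order id for ids 1 and 2, as in the dummy data, but not in general (for ids 1, 2, 3 the third group comes out as 4). The following is a minimal sketch of a variant that does not depend on the id values. It counts orders while walking backwards in time to build a group key, then takes the group's order id. The table name `interactions`, the sample rows, and the use of SQLite through Python's `sqlite3` module are assumptions made for illustration; the question does not name a database, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical sample rows: (cookie, order_id, order_timestamp).
# Order ids are deliberately non-consecutive and not increasing over time.
ROWS = [
    ("a", None, "2016-03-10 10:24:55"),
    ("a", 7,    "2016-03-10 10:24:56"),
    ("a", None, "2016-03-10 10:24:57"),
    ("a", None, "2016-03-10 10:24:58"),
    ("a", 3,    "2016-03-10 10:24:59"),
    ("a", None, "2016-03-10 10:25:00"),  # no later order, so it stays NULL
    ("b", None, "2016-03-10 10:25:00"),
    ("b", 5,    "2016-03-10 10:25:01"),
]

QUERY = """
WITH grouped AS (
    SELECT cookie, order_id, order_timestamp,
           -- Walking backwards in time, count the orders seen so far.
           -- An order and the interactions leading up to it share a count.
           COUNT(order_id) OVER (
               PARTITION BY cookie
               ORDER BY order_timestamp DESC
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS grp
    FROM interactions
)
SELECT cookie, order_timestamp, order_id,
       -- Each group holds at most one non-null order_id, so MAX picks it.
       MAX(order_id) OVER (PARTITION BY cookie, grp) AS filled_order_id
FROM grouped
ORDER BY cookie, order_timestamp
"""


def fill_order_ids(rows):
    """Load rows into an in-memory table and return the back-filled result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE interactions "
        "(cookie TEXT, order_id INTEGER, order_timestamp TEXT)"
    )
    conn.executemany("INSERT INTO interactions VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in fill_order_ids(ROWS):
        print(row)
```

The sketch assumes timestamps are unique within a cookie. If an interaction and an order can share a timestamp, as two rows in the dummy data do, the `ORDER BY` would need a tiebreaker column to make the grouping deterministic.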

2 Comments

Whoops, I accidentally pasted out my comments... I was saying that this answer is a bit convoluted but seems to work for me. I added some extra data to test it for a more general case.
Oh and btw, it's the result column in the end that will give you what you're after. I seem to have left a lot of unnecessary columns in the end.

Consider a correlated aggregate subquery, which avoids the need for a window function:

SELECT c.cookie, c.channel, c.order_id, c.order_timestamp,
      (SELECT Min(sub.order_id) 
       FROM Table sub 
       WHERE sub.order_timestamp >= c.order_timestamp
         AND sub.cookie = c.cookie) as new_id
FROM Table c
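
As a rough check, the query above can be run against the dummy data from the other answer. This is only a sketch: the table name `interactions` stands in for the `Table` placeholder, and SQLite via Python's `sqlite3` module is an assumption, since the question does not say which database is in use. Note that taking `MIN` over the later rows only picks the next order when order ids increase over time within a cookie, which holds for this sample.

```python
import sqlite3

# Dummy rows taken from the other answer: (cookie, channel, order_id, ts).
ROWS = [
    ("hxaxlw79u", "price_comparison", None, "2016-03-10 10:24:55"),
    ("hxaxlw79u", "price_comparison", None, "2016-03-10 10:24:56"),
    ("hxaxlw79u", "price_comparison", None, "2016-03-10 10:24:57"),
    ("hxaxlw79u", "price_comparison", 1,    "2016-03-10 10:24:58"),
    ("hxaxlw79u", "price_comparison", None, "2016-03-10 10:24:59"),
    ("hxaxlw79u", "price_comparison", None, "2016-03-10 10:25:00"),
    ("hxaxlw79u", "price_comparison", None, "2016-03-10 10:25:01"),
    ("hxaxlw79u", "price_comparison", 2,    "2016-03-10 10:25:02"),
    ("hxaxlw79u2", "price_comparison", None, "2016-03-10 10:25:00"),
    ("hxaxlw79u2", "price_comparison", 1,    "2016-03-10 10:25:01"),
    ("hxaxlw79u2", "price_comparison", None, "2016-03-10 10:25:02"),
    ("hxaxlw79u2", "price_comparison", 2,    "2016-03-10 10:25:02"),
]

QUERY = """
SELECT c.cookie, c.order_timestamp, c.order_id,
       -- Smallest order_id at or after this row, within the same cookie.
       (SELECT MIN(sub.order_id)
        FROM interactions sub
        WHERE sub.order_timestamp >= c.order_timestamp
          AND sub.cookie = c.cookie) AS new_id
FROM interactions c
ORDER BY c.cookie, c.order_timestamp
"""


def fill_with_subquery(rows):
    """Load rows into an in-memory table and run the correlated subquery."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE interactions "
        "(cookie TEXT, channel TEXT, order_id INTEGER, order_timestamp TEXT)"
    )
    conn.executemany("INSERT INTO interactions VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in fill_with_subquery(ROWS):
        print(row)
```

Dropping the `sub.cookie = c.cookie` condition makes the subquery look across all cookies, which is the behaviour discussed in the comments on this answer.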

3 Comments

Thanks parfait, but in your solution the order_id selected for each group is the minimum across all cookie ids, so all the order_ids for cookie='hxaxlw79u' are 1.
Your solution works if I add "and sub.cookie=c.cookie" to the new_id subquery
Indeed, adding that to subquery correlates by group. Your posted data only had one distinct cookie which I did not think would be a grouping factor. Had I known otherwise, I would have definitely added that condition to subquery.
