0

I am trying to use user defined variables to limit the results of a subquery, in order to get the difference between two timestamps in some analytics data. The code I am working with is as follows:

SELECT  @visitID := `s`.`visit_id` AS `visit_id`,                             # Get the visit ID and assign to a variable
        @dt :=      `s`.`dt` AS `visit`,                                      # Get the timestamp of the visit and assign to a variable
                    `tmp`.`dt` AS `next-visit`                                # Get the 'next visit' timestamp which should be returned by the subquery

FROM `wp_slim_stats` AS `s`                                                   # From the main table...

LEFT JOIN (SELECT   `s`.`visit_id`,                                           # Start the subquery 
                    MIN(`s`.`dt`) as `dt`                                     # Get the lowest timestamp returned
               FROM `wp_slim_stats` AS `s`                                    # ...from the same table
               WHERE    `s`.`visit_id` = @visitID                             # Visit ID should be the same as the row the main query is working on
                   AND  `s`.`dt` > @dt                                        # Timestamp should be HIGHER than the row we are working on
               LIMIT 0, 1) as `tmp` ON `tmp`.`visit_id` = `s`.`visit_id`      # Join on visit_id

WHERE `s`.`resource` LIKE 'foo%'                                              # Limit all results to the page we are looking for

The intention is to get an individual pageview and record its visit ID and the timestamp. The subquery should then return the next record from the database with the same visit ID. I can then subtract one from the other to get the seconds spent on a page.

The problem I am running into is that the subquery seems to be re-evaluating for each row returned, and not populating the next-visit column until the end. This means that all the rows returned are matched against the subquery's results for the final row, thus all next-visit columns are null apart from the final row.

The results I am looking for would be something like:

_________________________________________________
| visit_id     |     visit     |      next-visit|
|--------------|---------------|----------------|
|      1       | 123456789     |  123457890     |
|--------------|---------------|----------------|
|      4       | 234567890     |  234567891     |
|--------------|---------------|----------------|
|      6       | 345678901     |  345678902     |
|--------------|---------------|----------------|
|      8       | 456789012     |  456789013     |
|______________|_______________|________________|

But I am getting

_________________________________________________
| visit_id     |     visit     |      next-visit|
|--------------|---------------|----------------|
|      1       | 123456789     |  NULL          |
|--------------|---------------|----------------|
|      4       | 234567890     |  NULL          |
|--------------|---------------|----------------|
|      6       | 345678901     |  NULL          |
|--------------|---------------|----------------|
|      8       | 456789012     |  456789013     |
|______________|_______________|________________|

I am still fairly new to using variables in mySQL, particularly when assigning them dynamically. As I mentioned, I think I am messing up the order of operations somewhere, which is causing the subquery to re-populate each row at the end.

Ideally I need to be able to do this in pure mySQL due to restrictions that from the client, so no PHP unfortunately. Is it possible to do what I am trying to do?

Thank you!

1
  • Comment deleted to avoid misunderstanding, thanks for the tip @fancyPants Commented Aug 30, 2013 at 10:04

2 Answers 2

2

You don't need variables here at all.

SELECT `s`.`visit_id` AS `visit_id`,                          
       `s`.`dt` AS `visit`,                                   
        (SELECT MIN(dt) FROM `wp_slim_stats` ws WHERE ws.visit_id = s.visit_id AND ws.dt > s.dt)
FROM `wp_slim_stats` AS `s`                                                
WHERE `s`.`resource` LIKE 'foo%'

And to answer why your solution doesn't work, have a look at the order of operations in a sql query:

  1. FROM clause
  2. WHERE clause
  3. GROUP BY clause
  4. HAVING clause
  5. SELECT clause
  6. ORDER BY clause
Sign up to request clarification or add additional context in comments.

4 Comments

Brilliant! I have obviously over-thought this one ;)
With regard to the order of operations, how do sub-queries fit in? Would it go through the 6 stages for the main query, then the subquery, or the other way round, or process the FROMs from the each, then the WHEREs from each, and so forth?
The subquery is in the select clause of the main query, plus it's a dependant subquery (meaning it's output depends on values from the main query), so it would go through the first 5 stages of the main query, then the 6 stages of the subquery, then the last stage of main query again.
Cool, I will have a play around with it.
0

Here's the query you would need to run.

select visits.visitid as vId, temp.time as tTime, visits.time as vTime from visits inner join (select min(id) as firstId, visitid, time from visits v1 group by visitid)temp on visits.visitid = temp.visitid where id > temp.firstid group by visits.visitid;

See this SQL fiddle

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.