I am trying to get the count of items within an interval, with no fixed start or stop times specified. I imagine this could be done with window functions, but I am not sure how to go about it.

The problem is as follows: I would like to get the number of times people log in to a website within an arbitrary interval, say 20 minutes.

Example A

     1. 2015-06-24 23:00:00
     2. 2015-06-24 23:45:00
     3. 2015-06-25 00:00:00
     4. 2015-06-25 00:15:00
     5. 2015-06-25 00:17:00
     6. 2015-06-25 00:21:00

In the above example I would highlight items (2,3), (3,4,5), (4,5,6), (5,6). The output I would like is:

start,end,count
2015-06-24 23:45:00,2015-06-25 00:00:00,2
2015-06-25 00:00:00,2015-06-25 00:17:00,3
2015-06-25 00:15:00,2015-06-25 00:21:00,3

Also, only keep the rows where count >= 2; otherwise every single login would be a valid grouping.

Is a window function the way to go, or a CTE, or is there another approach I should adopt?

6
  • Are those windows fixed or floating? If you had 6. 2015-06-25 01:21:00, how would it change the result? Commented Jun 26, 2015 at 23:31
  • If that were the case, the results would be batched as (2,3), (3,4,5). I guess the thing I missed was the condition that the count is >= 2; otherwise everything would be caught. Commented Jun 26, 2015 at 23:33
  • Why is 3 going into two groups at once? Commented Jun 26, 2015 at 23:37
  • Because it looks at every login time, adds the interval of 20 minutes, and counts how many items fall within that range. So: 1 + interval contains only item 1, and since the count is less than 2 it is ignored; 2 + interval contains two items, 2 and 3; 3 + interval contains 3, 4, 5; 4 + interval contains 4, 5, 6; 5 + interval contains 5, 6. Commented Jun 26, 2015 at 23:40
  • 5+ is not in your result set; is that intentional? Commented Jun 26, 2015 at 23:41

2 Answers

1

Try this query with a self join:

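-- For each login a, count the logins b in [a.log_at, a.log_at + 20 minutes]
select a.id,
       a.log_at      as start_at,
       max(b.log_at) as end_at,
       count(*)      as cnt
from logs a
join logs b
  on b.log_at >= a.log_at
 and b.log_at <= a.log_at + interval '20 minutes'
group by a.id, a.log_at
having count(*) > 1  -- keep only windows with at least 2 logins
order by a.id;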
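
Since the question asks whether a window function could do this, here is a possible alternative to the self join, offered only as a sketch. It assumes PostgreSQL 11 or later (which added `RANGE` frames with an interval offset) and the same `logs(id, log_at)` table used in the query above; the aliases `start_at`, `end_at`, and `cnt` are made up for illustration.

```sql
-- Sketch only: assumes PostgreSQL 11+ and a table logs(id, log_at timestamp).
select start_at, end_at, cnt
from (
    select log_at             as start_at,
           max(log_at) over w as end_at,  -- last login inside the 20-minute frame
           count(*)    over w as cnt      -- logins in [log_at, log_at + 20 minutes]
    from logs
    -- The frame spans from the current row's timestamp to 20 minutes after it.
    window w as (order by log_at
                 range between current row and interval '20 minutes' following)
) t
where cnt >= 2  -- a window function result cannot be filtered in the same query level
order by start_at;
```

Like the self join, this treats the upper bound as inclusive and counts rows that share the same timestamp. It reads the table once instead of joining it to itself, which may help on larger tables, though that has not been measured here.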

2 Comments

Brilliant, thanks! It works. I am not going to worry about performance at this moment; I will work on that in the future.
If the query is too slow (on a large amount of data), first create an index on log_at.
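
A minimal sketch of the index suggested in the comment above, assuming the `logs` table and `log_at` column from the answer's query; the index name is hypothetical.

```sql
-- Hypothetical index name; a plain B-tree index on the timestamp column.
create index logs_log_at_idx on logs (log_at);
```

Whether the planner actually uses it depends on the data and the query, so it is worth checking the plan with `explain analyze` before and after.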
0

You can get per-day groups with counts using a query like:

SELECT MIN(last_seen_at), MAX(last_seen_at), COUNT(*)
FROM user_kinds
GROUP BY DATE(last_seen_at)
ORDER BY DATE(last_seen_at) DESC LIMIT 5;

Which on my sample data set yields a result like:

 2015-06-26 00:12:30.476548 | 2015-06-26 22:06:25.134322 |    69
 2015-06-25 00:46:03.392651 | 2015-06-25 23:49:46.616964 |    14
 2015-06-24 14:22:33.578176 | 2015-06-24 23:39:01.32241  |    10
 2015-06-23 01:42:53.438663 | 2015-06-23 20:12:21.864601 |     2
(5 rows)

1 Comment

If someone logged in at 2015-06-23 23:59:00 and then at 2015-06-24 00:10:00, they would show a count of 1 for each day, even though the two logins are within my 20-minute interval.
