I have a huge log table and I need to fetch some data for usage statistics. Let's say we have a log table like this:

| user_id | action            |
|---------|-------------------|
| 12345   | app: IOs          |
| 12345   | app_version: 2018 |
| 12346   | app: Android      |
| 12346   | app_version: 2019 |
| 12347   | app: Windows      |
| 12347   | app_version: 2019 |

Is there a way to fetch the IDs of all users who use the old (2018) mobile apps?

I found one way to do it, but it is not efficient:

SELECT
    user_id
FROM
    log
WHERE
    action LIKE '%2018%'
AND
    user_id IN (SELECT DISTINCT user_id FROM log WHERE (action LIKE '%IOs%' OR action LIKE '%Android%'))
GROUP BY user_id

This query took about half an hour on production.

In the end, I want to get the list of user IDs as efficiently as possible, because I will also join another table to get their emails. What options do I have?

1
  • Any EXPLAIN output? Commented Apr 25, 2019 at 11:49

3 Answers

1

You can use aggregation:

SELECT l.user_id
FROM log l
WHERE l.action LIKE '%2018%' OR
      l.action LIKE '%IOs%' OR
      l.action LIKE '%Android%'
GROUP BY l.user_id
HAVING SUM(l.action LIKE '%2018%') > 0 AND       -- at least one 2018
       SUM(l.action LIKE '%2018%') <> COUNT(*);  -- at least one other
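
To see what the HAVING clause does, here is a minimal sketch that runs the query above against the six sample rows from the question. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in (an assumption: the real table presumably lives in MySQL, where a boolean can also be summed), and the helper name users_on_old_mobile_app is made up for this example.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (12345, "app: IOs"), (12345, "app_version: 2018"),
    (12346, "app: Android"), (12346, "app_version: 2019"),
    (12347, "app: Windows"), (12347, "app_version: 2019"),
]

# Same logic as the aggregation query; ORDER BY is added only to make
# the output deterministic.
QUERY = """
SELECT l.user_id
FROM log l
WHERE l.action LIKE '%2018%' OR
      l.action LIKE '%IOs%' OR
      l.action LIKE '%Android%'
GROUP BY l.user_id
HAVING SUM(l.action LIKE '%2018%') > 0 AND       -- at least one 2018 row
       SUM(l.action LIKE '%2018%') <> COUNT(*)   -- plus at least one platform row
ORDER BY l.user_id
"""


def users_on_old_mobile_app(rows):
    """Load rows into an in-memory table and return the matching user ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (user_id INTEGER, action TEXT)")
    conn.executemany("INSERT INTO log VALUES (?, ?)", rows)
    result = [r[0] for r in conn.execute(QUERY)]
    conn.close()
    return result


if __name__ == "__main__":
    print(users_on_old_mobile_app(ROWS))  # prints [12345]
```

User 12346 is dropped because no surviving row contains 2018 (the sum is 0). A Windows user on the 2018 version would also be dropped, because the only row that passes the WHERE filter is the 2018 row, so the sum equals COUNT(*).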

Unfortunately, the LIKE comparisons require scanning the log table. The only way around this would be to use a full text index.
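
The scan can be made visible in a query plan. The sketch below is an illustration only: it uses SQLite through Python's sqlite3 module as a stand-in (MySQL's EXPLAIN output looks different), and the index name idx_log_action is made up. It compares a leading-wildcard LIKE with an exact match, which is only an option if the stored strings are exactly as regular as in the sample data.

```python
import sqlite3


def plan_details(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (user_id INTEGER, action TEXT)")
conn.execute("CREATE INDEX idx_log_action ON log (action)")  # hypothetical index

# A pattern that starts with a wildcard cannot seek into the index.
wildcard_plan = plan_details(
    conn, "SELECT user_id FROM log WHERE action LIKE '%2018%'"
)

# An exact match on the same column can.
exact_plan = plan_details(
    conn, "SELECT user_id FROM log WHERE action = 'app_version: 2018'"
)

if __name__ == "__main__":
    print(wildcard_plan)  # reports a SCAN of the table
    print(exact_plan)     # reports a SEARCH using idx_log_action
```

The exact wording of the plan lines depends on the SQLite version, but the first plan is a full scan and the second is an index lookup.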

You can simplify the logic to:

SELECT l.user_id
FROM log l
WHERE l.action REGEXP '2018|IOs|Android'
GROUP BY l.user_id
HAVING SUM(l.action LIKE '%2018%') > 0 AND       -- at least one 2018
       SUM(l.action LIKE '%2018%') <> COUNT(*);  -- at least one other

I'm not sure if one REGEXP is (marginally) faster than three LIKEs or not.

1 Comment

Thank you for your input, I tried it but my initial query seems to be faster.
0

You can use EXISTS:

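SELECT DISTINCT l.user_id
FROM log l
WHERE (l.action LIKE '%IOs%' OR l.action LIKE '%Android%')   -- mobile platform row
  AND EXISTS (SELECT 1                                       -- same user also has a 2018 row
              FROM log l1
              WHERE l1.user_id = l.user_id AND l1.action LIKE '%2018%');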
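
As a quick check of the EXISTS pattern on the question's sample rows, here is a small sketch. It runs in an in-memory SQLite database via Python's sqlite3 module, which is only a stand-in for the real database, and the helper name run_exists_query is made up. The query in the sketch filters the outer rows by platform and uses the correlated subquery to require a 2018 row for the same user.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (12345, "app: IOs"), (12345, "app_version: 2018"),
    (12346, "app: Android"), (12346, "app_version: 2019"),
    (12347, "app: Windows"), (12347, "app_version: 2019"),
]

QUERY = """
SELECT DISTINCT l.user_id
FROM log l
WHERE (l.action LIKE '%IOs%' OR l.action LIKE '%Android%')
  AND EXISTS (SELECT 1
              FROM log l1
              WHERE l1.user_id = l.user_id
                AND l1.action LIKE '%2018%')
ORDER BY l.user_id
"""


def run_exists_query(rows):
    """Load rows into an in-memory table and return the matching user ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (user_id INTEGER, action TEXT)")
    conn.executemany("INSERT INTO log VALUES (?, ?)", rows)
    result = [r[0] for r in conn.execute(QUERY)]
    conn.close()
    return result


if __name__ == "__main__":
    print(run_exists_query(ROWS))  # prints [12345]
```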

1 Comment

I've tried to run it, but it takes ages on my test system, so regardless of the output it's much slower than my query. But still, thank you for your input.
0

Here is my solution with a LEFT JOIN. I understand that you have a big logging table so this might not be the best one. I also added a few more records for testing:

Basically, I use the LEFT JOIN to put each user's rows side by side as columns (one for the platform, one for the version) so that I can simply filter with WHERE.

SQL fiddle: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=9db538e59b3d265e4e8d8559762e79d4

WITH log_table AS (
      SELECT *
      FROM (VALUES (12345, 'app: iOS'),
                   (12345, 'app_version: 2018'),
                   (12346, 'app: Android'),
                   (12346, 'app_version: 2019'),
                   (12347, 'app: Windows'),
                   (12347, 'app_version: 2019'),
                   (12348, 'app: iOS'),
                   (12348, 'app_version: 2019'),
                   (12349, 'app: Android'),
                   (12349, 'app_version: 2018'),
                   (12350, 'app: Windows'),
                   (12350, 'app_version: 2018')
           ) v(user_id, action)
)
SELECT 
    L.user_id
FROM 
    log_table AS L 
    LEFT JOIN log_table AS L2 
         ON L.user_id = L2.user_id
WHERE (L.action LIKE '%iOS%' OR L.action LIKE '%Android%') AND L2.action LIKE '%2018%'

The result (only users that have iOS or Android and the 2018 version):

user_id
 12345
 12349
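
One detail of this self-join is worth checking: because the WHERE clause filters on L2.action, any NULL-extended row produced by the LEFT JOIN is discarded, so the query behaves like an INNER JOIN. The sketch below tests that on the twelve sample rows, using an in-memory SQLite database via Python's sqlite3 module as a stand-in for SQL Server (the helper name run_self_join is made up for this example).

```python
import sqlite3

# The twelve test rows from the answer above.
ROWS = [
    (12345, "app: iOS"), (12345, "app_version: 2018"),
    (12346, "app: Android"), (12346, "app_version: 2019"),
    (12347, "app: Windows"), (12347, "app_version: 2019"),
    (12348, "app: iOS"), (12348, "app_version: 2019"),
    (12349, "app: Android"), (12349, "app_version: 2018"),
    (12350, "app: Windows"), (12350, "app_version: 2018"),
]

# {join} is replaced with the join keyword under test.
QUERY_TEMPLATE = """
SELECT L.user_id
FROM log AS L
{join} log AS L2 ON L.user_id = L2.user_id
WHERE (L.action LIKE '%iOS%' OR L.action LIKE '%Android%')
  AND L2.action LIKE '%2018%'
ORDER BY L.user_id
"""


def run_self_join(join_keyword, rows=ROWS):
    """Run the self-join with the given join keyword and return user ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (user_id INTEGER, action TEXT)")
    conn.executemany("INSERT INTO log VALUES (?, ?)", rows)
    sql = QUERY_TEMPLATE.format(join=join_keyword)
    result = [r[0] for r in conn.execute(sql)]
    conn.close()
    return result


if __name__ == "__main__":
    print(run_self_join("LEFT JOIN"))   # prints [12345, 12349]
    print(run_self_join("INNER JOIN"))  # prints [12345, 12349]
```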

2 Comments

Do I understand correctly that you are using the WITH clause to emulate the data, and that with my real data I can just run the SELECT statement against my log table? If so, it also takes a long time to run :( Still, thank you for your help.
@Vladk Yeah you only need the SELECT part to run, and change the name of the table. You might want to try out Gordon's answer as he is much more experienced with SQL.
