1

I have an ecommerce table 'orders', and the table has a 'state' record and a 'fruit' record. I am trying (and failing) to create a query that will return a result of every state and the top 3 most popular fruits, shown in order.

So the 'orders' table looks like this:

id  State  Fruit
----------------
1    CA     grape
2    FL     orange
3    CA     grape
4    FL     grapefruit
5    CA     orange
6    CA     grape
7    FL     orange
8    CA     peach
9    CA     orange
10   FL     orange
11   FL     grapefruit
12   FL     peach
etc etc etc

The result of the query on this table would be:

the_state   the_fruits
------------------------
CA          grape, orange, peach
FL          orange, grapefruit, peach

I tried this:

SELECT state as the_state, 
(select count(id) as count, fruit from orders where state = the_state order by count(id) limit 3  ) as the_fruits
FROM orders
group by fruit
order by count(id) DESC

But that is not valid a valid query, and I am not sure I am on the right track

2 Answers 2

1

Limiting results of grouped data in MySQL is quite difficult. There are many solutions on various threads, but it may depend a lot on the type and amount of data you have.

The following is probably the easiest solution.

mysql> INSERT INTO orders VALUES
    -> ('1', 'CA', 'grape'),
    -> ('2', 'FL', 'orange'),
    -> ('3', 'CA', 'grape'),
    -> ('4', 'FL', 'grapefruit'),
    -> ('5', 'CA', 'orange'),
    -> ('6', 'CA', 'grape'),
    -> ('7', 'FL', 'orange'),
    -> ('8', 'CA', 'peach'),
    -> ('9', 'CA', 'orange'),
    -> ('10', 'FL', 'orange'),
    -> ('11', 'FL', 'grapefruit'),
    -> ('12', 'FL', 'peach'),
    -> ('13', 'CA', 'apple'),
    -> ('14', 'FL', 'apple');
Query OK, 14 rows affected (0.03 sec)
Records: 14  Duplicates: 0  Warnings: 0

mysql> select state, fruit, count(fruit) cf from orders group by state, fruit order by state, cf desc;
+-------+------------+----+
| state | fruit      | cf |
+-------+------------+----+
| CA    | grape      |  3 |
| CA    | orange     |  2 |
| CA    | peach      |  1 |
| CA    | apple      |  1 |
| FL    | orange     |  3 |
| FL    | grapefruit |  2 |
| FL    | peach      |  1 |
| FL    | apple      |  1 |
+-------+------------+----+
8 rows in set (0.00 sec)

SELECT state
     , SUBSTRING_INDEX(GROUP_CONCAT(fruit ORDER BY cf DESC, fruit),',',3) top3 
  FROM 
     ( SELECT state
            , fruit
            , COUNT(fruit) cf 
         FROM orders 
        GROUP
           BY state
            , fruit
     ) t1 
 GROUP 
    BY state;
+-------+-------------------------+
| state | top3                    |
+-------+-------------------------+
| CA    | grape,orange,peach      |
| FL    | orange,grapefruit,apple |
+-------+-------------------------+
2 rows in set (0.00 sec)
Sign up to request clarification or add additional context in comments.

2 Comments

My table actually has many fruits, and this query does not return the top 3. It returns random 3. The first query with the table of state, fruit, and cf is correct.
I updated the group_concat function to order the top 3, ties ordered by fruit name. Let me know if this works.
0

As doog abides pointed out, my previous solution handled ties poorly. Here's an alternative, using variables as I mentioned before. It's also blindingly fast...

  DROP TABLE IF EXISTS orders;

  CREATE TABLE orders 
  ( id  INT NOT NULL AUTO_INCREMENT PRIMARY KEY
  , state CHAR(2)
  , fruit VARCHAR(20) NOT NULL
  );

  INSERT INTO orders VALUES
  (1    ,'CA','grape'),
  (2    ,'FL','orange'),
  (3    ,'CA','grape'),
  (4    ,'FL','grapefruit'),
  (5    ,'CA','orange'),
  (6    ,'CA','grape'),
  (7    ,'FL','orange'),
  (8    ,'CA','peach'),
  (9    ,'CA','orange'),
  (10   ,'FL','orange'),
  (11   ,'FL','grapefruit'),
  (12   ,'FL','peach'),
  (13   ,'FL','banana');

  SELECT state
       , fruit
       , total
       , IF(@prev_state = state, IF(@prev_total=total,@rank,@rank:=@rank+1),@rank:=1) rank
       , @prev_state := state
       , @prev_total := total
    FROM 
       ( SELECT state 
              , fruit
              , COUNT(*) total
           FROM orders 
          GROUP  
             BY state
              , fruit
        ) x
        , ( SELECT @prev_state := null, @prev_total:=null, @rank := 0) vars
    ORDER 
       BY state,total DESC;

  +-------+------------+-------+------+----------------------+----------------------+
  | state | fruit      | total | rank | @prev_state := state | @prev_total := total |
  +-------+------------+-------+------+----------------------+----------------------+
  | CA    | grape      |     3 |    1 | CA                   |                    3 |
  | CA    | orange     |     2 |    2 | CA                   |                    2 |
  | CA    | peach      |     1 |    3 | CA                   |                    1 |
  | FL    | orange     |     3 |    1 | FL                   |                    3 |
  | FL    | grapefruit |     2 |    2 | FL                   |                    2 |
  | FL    | banana     |     1 |    3 | FL                   |                    1 |
  | FL    | peach      |     1 |    3 | FL                   |                    1 |
  +-------+------------+-------+------+----------------------+----------------------+ 

1 Comment

Unfortunately, if you add another single count fruit you will have two rows with rank 4 and none with rank 3 which means there is no way to limit it to the top 3.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.