0

That's my query to get first $count rows for each city/subcategory combination

$contacts = $dbh->prepare("
    SELECT *
    FROM (SELECT c.*,
                 (@rn := IF(@cc = CONCAT_WS(':', city_id, subcategory_id), @rn + 1,
                            IF(@cc := CONCAT_WS(':', city_id, subcategory_id), 1, 1)
                           )
                 ) as rn
          FROM (SELECT reg.title as region_title, cnt.title, cnt.city_id, cnt.id, cnt.catalog_id, cnt.address, cnt.phone, cnt.email, cnt.website,  cnt.subcategory_title, cnt.subcategory_id, cnt.manufacturer 
                FROM contacts as cnt
                LEFT JOIN regions as reg 
                ON cnt.city_id = reg.id 
                WHERE city_id IN (".implode(',', $regions).") AND 
                      subcategory_id IN (".implode(',', $categories).") 
                ORDER BY subcategory_title, city_id, title
               ) c CROSS JOIN
               (SELECT @cc := '', @rn := 0) params
         ) c
    WHERE rn <= $count");

And i'm using $contacts->fetchAll(PDO::FETCH_GROUP); to group rows by reg.title

[ 
 ['City 1'] = > [ 
   [ contact 1 ],
   [ contact 2 ],
   ...
 ],
 ['City 2'] = > [ 
   [ contact 3 ],
   [ contact 4 ],
   ...
 ]
 ...
]

Now I need to upgrade that query but it's too complicated for me :( Selected rows must have unique contacts.catalog_id value.
How it can be done?

UPD
Here is a demo database - http://sqlfiddle.com/#!9/ac71d7/2

3
  • See meta.stackoverflow.com/questions/333952/… Commented Dec 12, 2017 at 0:19
  • when we say "unique contacts.catalog_id" value, do we want to exclude from the result any and all rows where there is more than one row with a given catalog_id? Or do we just want to ensure that catalog_id isn't repeated in result? Or that catalog_id isn't repeated for a given (city_id,subcategory_id)? Understanding and communicating the specification is 80% of the battle. And example data along with expected output would go a long ways towards illuminating the specification. Commented Dec 12, 2017 at 0:30
  • I have add an example data. We need unique catalog_id globally. Commented Dec 12, 2017 at 1:12

1 Answer 1

1

"We need unique catalog_id globally"

To identify unique values of catalog_id in contacts, we could use a query like this:

   SELECT r.catalog_id
     FROM contacts r
    GROUP BY r.catalog_id
   HAVING COUNT(1) = 1

That says, for a given row in contacts, if the value of catalog_id matches catalog_id on any other row in contacts, that catalog_id will be excluded from the result.

If we want to restrict the original query to returning only those values of catalog_id, we could include this query as an inline view, and join that to rows in contacts with matching catalog_id.


                    FROM contacts cnt
  -- ------------
                    JOIN ( SELECT r.catalog_id
                             FROM contacts r
                            GROUP BY r.catalog_id
                           HAVING COUNT(1) = 1
                         ) s
                      ON s.catalog_id = cnt.catalog_id
  -- ------------
                    LEFT
                    JOIN regions reg
                      ON reg.id = cnt.city_id

EDIT

If the specification is interpreted differently, instead of meaning catalog_id must be unique in contacts, we mean that a catalog_id should not be repeated in the result... we can use the same approach, but get single value of id from contacts for each catalog_id. We could write a query like this:

   SELECT MAX(r.id) AS max_id
        , r.catalog_id
     FROM contacts r
    GROUP BY r.catalog_id

We could use MIN() aggregate in place of MAX(). The goal is to return a single contacts.id for each discrete value of catalog_id.

We can incorporate that into the query as an inline view, matching max_id from the inline view to the id from contacts table.

Something like this:

                    FROM contacts cnt
  -- ------------
                    JOIN ( SELECT MAX(r.id) AS max_id
                             FROM contacts r
                            WHERE ... 
                            GROUP BY r.catalog_id
                         ) s
                      ON s.max_id = cnt.id
  -- ------------
                    LEFT
                    JOIN regions reg
                      ON reg.id = cnt.city_id

We probably want to move the conditions in the WHERE clause of the outer query into that inline view. If we don't, then the max_id returned by the inline view might reference an row (id) in contacts that doesn't satisfy the conditions in the WHERE clause.

Relocating the WHERE conditions on cnt into the inline view ...

SELECT d.*
  FROM ( SELECT c.*
              , ( @rn := IF( @cc = CONCAT_WS(':', city_id, subcategory_id)
                           , @rn + 1
                           , IF( @cc := CONCAT_WS(':', city_id, subcategory_id),1,1)
                         )
                ) AS rn
           FROM ( SELECT reg.title AS region_title
                       , cnt.title
                       , cnt.city_id
                       , cnt.id
                       , cnt.catalog_id
                       , cnt.address
                       , cnt.phone
                       , cnt.email
                       , cnt.website
                       , cnt.category_title
                       , cnt.subcategory_title
                       , cnt.subcategory_id
                       , cnt.manufacturer
                    FROM contacts cnt
  -- --------------
                    JOIN ( SELECT MAX(r.id) AS max_id
                             FROM contacts r
                            WHERE r.city_id        IN ( ... ) 
                              AND r.subcategory_id IN ( ... )
                              AND r.email          IS NOT NULL
                              AND r.manufacturer   = 1
                            GROUP BY r.catalog_id
                         ) s
                      ON s.max_id = cnt.id
  -- --------------
                    LEFT
                    JOIN regions reg
                      ON reg.id = cnt.city_id
                   ORDER
                      BY cnt.category_title
                       , cnt.subcategory_title
                       , cnt.city_id
                       , cnt.title
                ) c
          CROSS
           JOIN ( SELECT @cc := '', @rn := 0) i
       ) d
 WHERE d.rn <= 10

Sign up to request clarification or add additional context in comments.

3 Comments

Thanks for your answer! But this query a bit wrong. Look here sqlfiddle.com/#!9/e42628/5 Here should be 5 rows. Rows with catalog_id '4926348963357290 and '70000001018295015 should be included
catalog_id 4926348963357290 does not appear to satisfy the specification "unique catalog_id globally". As I said in my comment on the question, it really depends on what the specification is, how the specification is interpreted. My comment asked whether the specification meant that we should only return catalog_id values that are unique in contacts, or whether we are just ensuring a catalog_id value isn't repeated in the result. Again, this is where providing example of the expected result would help clarify the specification (in a way that "unique catalog_id globally" doesn't.)
sorry for the language barrier and thank you for help!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.