I have two tables, products and categories, and a join table products_categories.

Categories are nested (via categories.parent_id). So, for any given product, it can belong to many, potentially nested categories.

The categories are structured like this:

  • department (depth = 0)
    • category (depth = 1)
      • class (depth = 2)

Each product will belong to one "department", one "category", and one "class".

So a product like "Rad Widget" could belong to the "Electronics" department, the "Miscellaneous" category, and the "Widgets" class. My schema would look like this:

# products

id | name
---------------
1  | Rad Widget


# categories

id | parent_id | depth | name
--------------------------------------
1  | null      | 0     | Electronics
2  | 1         | 1     | Miscellaneous
3  | 2         | 2     | Widgets

# products_categories

product_id | category_id
------------------------
1          | 1
1          | 2
1          | 3
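
To reproduce the sample data above, a setup script along these lines should work. This is only a sketch: the column types, key constraints, and `VARCHAR` lengths are assumptions, since the post shows the rows but not the DDL.

```sql
-- Assumed DDL: types and constraints are guesses, not taken from the post.
CREATE TABLE products (
    id   INTEGER PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE categories (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER NULL REFERENCES categories (id),  -- NULL for a department
    depth     INTEGER NOT NULL,                         -- 0, 1, or 2
    name      VARCHAR(100) NOT NULL
);

CREATE TABLE products_categories (
    product_id  INTEGER NOT NULL REFERENCES products (id),
    category_id INTEGER NOT NULL REFERENCES categories (id),
    PRIMARY KEY (product_id, category_id)
);

-- The sample rows exactly as listed in the tables above.
INSERT INTO products (id, name) VALUES (1, 'Rad Widget');

INSERT INTO categories (id, parent_id, depth, name) VALUES
    (1, NULL, 0, 'Electronics'),
    (2, 1,    1, 'Miscellaneous'),
    (3, 2,    2, 'Widgets');

INSERT INTO products_categories (product_id, category_id) VALUES
    (1, 1),
    (1, 2),
    (1, 3);
```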

I'd like to run a query that lists a product's department, category, and class in a single row, like this:

product.id | product.name | department  | category      | class
-----------------------------------------------------------------
1          | Rad Widget   | Electronics | Miscellaneous | Widgets

I can't think of a way to do this, so I'm considering denormalizing my data, but I want to make certain I'm not missing something first.

  • I totally couldn't think of a decent way to word the title of this question, and I know it's terrible. :( Commented Jun 30, 2014 at 19:24
  • Are there ALWAYS three levels/depths of categories, or can the depth be variable? Commented Jun 30, 2014 at 19:26
  • And can a product have more categories from the same dept? Commented Jun 30, 2014 at 19:32
  • I assume from your statement that each product will belong to one department, one category, and one class. If that is the case, why do we need to map a product to each level? If we map a product only to the last level (class), we can identify its category and department from that, and go from there if the nesting is only three levels deep. Commented Jun 30, 2014 at 19:41
  • “Each product will belong to one "department", one "category", and one "class".” One and only one of each, should I edit to clarify? Commented Jun 30, 2014 at 23:01

2 Answers

Each category has a single known parent but an indeterminate number of unknown children. (By the way, you might want to rename either the table or the level so that "category" doesn't mean two different things.) That means you need to "walk up" from the most specific category (at depth = 2) to the most general one, performing a self-join on the categories table for each additional value you want to include.

If you're impatient, skip to the SQL Fiddle link at the bottom of the post. If you'd rather be walked through it, continue reading - it's really not that different from any other case where you have a surrogate ID that you want to replace with data from the corresponding table.

You could start by looking at all the information:

SELECT * FROM products AS P
        JOIN
    products_categories AS PC ON P.id = PC.product_id
        JOIN
    categories AS C ON PC.category_id = C.id
WHERE P.id = 1 AND C.depth = 2;

+----+------------+------------+-------------+----+-----------+-------+---------+
| id | name       | product_id | category_id | id | parent_id | depth | name    |
+----+------------+------------+-------------+----+-----------+-------+---------+
| 1  | Rad Widget | 1          | 3           | 3  | 2         | 2     | Widgets |
+----+------------+------------+-------------+----+-----------+-------+---------+

First thing you have to do is recognize which information is useful and which is not. You don't want to be SELECT *-ing all day here. You have the first two columns you want, and the last column (recognize this as your "class"); you need parent_id to find the next column you want, and let's hold onto depth just for illustration. Forget the rest, they're clutter.

So replace that * with specific column names, alias "class", and go after the data represented by parent_id. This information is stored in the categories table. You might be thinking, "But I already joined that table!" Don't care; do it again, only give it a new alias. Remember that your ON condition is a bit different (products_categories has done its job already, and now you want the row that matches C.parent_id) and that you only need certain columns to find the next parent:

SELECT
    P.id,
    P.name,
    C1.parent_id,
    C1.depth,
    C1.name,
    C.name AS 'class'
FROM
    products AS P
        JOIN
    products_categories AS PC ON P.id = PC.product_id
        JOIN
    categories AS C ON PC.category_id = C.id
        JOIN
    categories AS C1 ON C.parent_id = C1.id
WHERE
    P.id = 1
        AND C.depth = 2;

+----+------------+-----------+-------+---------------+---------+
| id | name       | parent_id | depth | name          | class   |
+----+------------+-----------+-------+---------------+---------+
| 1  | Rad Widget | 1         | 1     | Miscellaneous | Widgets |
+----+------------+-----------+-------+---------------+---------+

Repeat the process one more time, aliasing the column you just added and using the new C1.parent_id in your next join condition:

SELECT
    P.id,
    P.name,
    C2.parent_id,
    C2.depth,
    C2.name,
    C1.name AS 'category',
    C.name AS 'class'
FROM
    products AS P
        JOIN
    products_categories AS PC ON P.id = PC.product_id
        JOIN
    categories AS C ON PC.category_id = C.id
        JOIN
    categories AS C1 ON C.parent_id = C1.id
        JOIN
    categories AS C2 ON C1.parent_id = C2.id
WHERE
    P.id = 1
        AND C.depth = 2;

+----+------------+-----------+-------+-------------+---------------+---------+
| id | name       | parent_id | depth | name        | category      | class   |
+----+------------+-----------+-------+-------------+---------------+---------+
| 1  | Rad Widget | NULL      | 0     | Electronics | Miscellaneous | Widgets |
+----+------------+-----------+-------+-------------+---------------+---------+

Now we're clearly done; we can't join another copy on C2.parent_id = NULL and we also see that depth = 0, so all that's left to do is get rid of the columns we don't want to display and double check our aliases. Here it is in action on SQL Fiddle.
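
For reference, the cleaned-up query described above might look like the sketch below. It keeps only the columns from the desired output and aliases each level. The `department` alias is taken from the question's expected result rather than from the walkthrough, and the aliases are written unquoted, which is an assumption about what your database accepts.

```sql
-- Final shape of the walkthrough: start at the class (depth = 2)
-- and self-join categories twice to walk up to the department.
SELECT
    P.id,
    P.name,
    C2.name AS department,  -- depth 0, parent of C1
    C1.name AS category,    -- depth 1, parent of C
    C.name  AS class        -- depth 2, linked via products_categories
FROM
    products AS P
        JOIN
    products_categories AS PC ON P.id = PC.product_id
        JOIN
    categories AS C ON PC.category_id = C.id
        JOIN
    categories AS C1 ON C.parent_id = C1.id
        JOIN
    categories AS C2 ON C1.parent_id = C2.id
WHERE
    P.id = 1
        AND C.depth = 2;
```

Against the sample data in the question, this returns the single row 1, Rad Widget, Electronics, Miscellaneous, Widgets. Dropping the `P.id = 1` condition should give one row per product, as long as every product is linked to exactly one depth = 2 category.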

2 Comments

Thanks, this is great. Your fiddle link didn't work for me though. Still a great answer.
Oh, I also simplified all of the table names, it's actually store_collections. :D
If you just want a list of all of a product's categories, you can simply run:

Select Distinct p.category_id, c.name
From products_categories p Join
     categories c On p.category_id = c.id
Where p.product_id = 1

The problem is that you are putting classes and departments into your categories table. Technically, you'd be correctly normalizing your data by moving each of these to its own table. I know the overhead of creating more tables is a pain, but it'll simplify your queries (saving processing power and potentially bandwidth).
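
As a rough illustration of that suggestion, a schema with one table per level might look like the following. Everything here is hypothetical: the table names, the foreign-key columns, and the choice to link a product only to its class are assumptions, and that last choice relies on each product having exactly one department, one category, and one class. Note that `categories` and `products` are redefined, so this is meant for a fresh database, not alongside the original tables.

```sql
-- Hypothetical replacement schema: one table per level.
CREATE TABLE departments (
    id   INTEGER PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE categories (
    id            INTEGER PRIMARY KEY,
    department_id INTEGER NOT NULL REFERENCES departments (id),
    name          VARCHAR(100) NOT NULL
);

CREATE TABLE classes (
    id          INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL REFERENCES categories (id),
    name        VARCHAR(100) NOT NULL
);

-- A product points only at its class; the other levels follow from it.
CREATE TABLE products (
    id       INTEGER PRIMARY KEY,
    class_id INTEGER NOT NULL REFERENCES classes (id),
    name     VARCHAR(100) NOT NULL
);

INSERT INTO departments (id, name) VALUES (1, 'Electronics');
INSERT INTO categories (id, department_id, name) VALUES (1, 1, 'Miscellaneous');
INSERT INTO classes (id, category_id, name) VALUES (1, 1, 'Widgets');
INSERT INTO products (id, class_id, name) VALUES (1, 1, 'Rad Widget');

-- Each level is now a plain join, with no self-join and no depth filter.
SELECT
    P.id,
    P.name,
    D.name  AS department,
    CA.name AS category,
    CL.name AS class
FROM
    products AS P
        JOIN
    classes AS CL ON P.class_id = CL.id
        JOIN
    categories AS CA ON CL.category_id = CA.id
        JOIN
    departments AS D ON CA.department_id = D.id;
```

The trade-off is that the number of levels is fixed by the schema, so adding a fourth level would mean adding a table instead of a row.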
