1

I am (via a web service) receiving a flat XML file that contains a listing of categories, subcategories, subsubcategories, etc, all at the same level. (i.e. Subcategories are not nested under their parent categories.) This, unfortunately, cannot be changed. I have to work with the data provided.

I'm taking this XML and converting it into an object, then into an array using simplexml_load_string and array_map. This is all working as expected. What I'm left with is a master category array that looks something like this.

Array
(
    [0] => Array
        (
            [CategoryID] => HARDWARE
            [Description] => Hardware Issue
        )
    [1] => Array
        (
            [CategoryID] => MAC_OSX
            [Description] => Mac OSX
            [ParentCategoryID] => OS
        )
    [2] => Array
        (
            [CategoryID] => OFFICE
            [Description] => Microsoft Office
            [ParentCategoryID] => SOFTWARE
        )
    [3] => Array
        (
            [CategoryID] => OS
            [Description] => Operating Systems
            [ParentCategoryID] => SOFTWARE
        )
    [4] => Array
        (
            [CategoryID] => WIN_7
            [Description] => Windows 7
            [ParentCategoryID] => OS
        )
    [5] => Array
        (
            [CategoryID] => SOFTWARE
            [Description] => Software Issue
        )
)

As you can see, there are subcategories mixed in there, all keyed off of the ParentCategoryID. Parent Categories have that field omitted.

The CategoryID will always be unique, no matter what level it is on. And the array is sorted alphabetically by the Description. The real array is over 250 categories long, above is an abbreviated version for example sake.

I need to take that master array, loop through it and come up with a new array that looks something like this.

Array
(
    [0] => Array
        (
            [CategoryID] => SOFTWARE
            [Description] => Software Issue
            [SubCategories] => Array
                (
                    [0] => Array
                        (
                            [CategoryID] => OS
                            [Description] => Operating Systems
                            [SubCategories] => Array
                                (
                                    [0] => Array
                                        (
                                            [CategoryID] => WIN_7
                                            [Description] => Windows 7
                                        )
                                    [1] => Array
                                        (
                                            [CategoryID] => MAC_OSX
                                            [Description] => Mac OSX
                                        )
                                )
                        )
                    [1] => Array
                        (
                            [CategoryID] => OFFICE
                            [Description] => Microsoft Office
                        )
                )
        )
    [1] => Array
        (
            [CategoryID] => HARDWARE
            [Description] => Hardware Issue
        )
)

Things I have to keep in mind is there may be infinitely many subcategories (so there isn't a set number of sublevels I can hard-code in). I've been trying to mess around with array_search and array_filter, but I'm just not having any luck getting something working. I obviously don't want to loop hundreds of times for this process to happen.

Does anyone with a little more experience with multidimensional arrays under their belt have some ideas, direction, or an example that may help me achieve my desired result?

8
  • Will the API always return the category before the subcategory items? Commented Jan 10, 2012 at 21:50
  • No, the API returns the categories in alphabetical order based on the Description. The array above was hand-crafted here as an example; The real array has over 250 categories. Commented Jan 10, 2012 at 21:51
  • Wow, that's not much to work with, especially if you happen to have a child category with the same name as another child category - it could get really messed up. Was there any more dimensional structure in the XML that you flattened? or does the XML come flat like this as well? Commented Jan 10, 2012 at 21:53
  • The CategoryID will always be unique, no matter what level it is on. The XML is flat as well. Commented Jan 10, 2012 at 21:54
  • 1
    I feel like there is a way to do this with 2 passes (1 to build a list of category links, another to populate the data) but it eludes me Commented Jan 10, 2012 at 22:08

2 Answers 2

1

OK, I figured out an algorithm (i think). The key is to build a linked list, and use placeholders for parent categories later in your list.

Create a category obect, which includes all the relevant data, a link to its parent and an array of links to its children. These will all go into a 1d array

Next loop through your inputs and: If there is a parent category, check if we have an object for it already. If not, create a placeholder for the parent. If we don't have an object for it yet, create an object for the category. Set the parent/child relationships in both objects.

Finally, loop through your array again, and add any object without a parent to a new array.

This new array should be a list of all the parent categories, and usign the relationship you defined you should be able to browse it like a tree. You could also do another pass and build a native 2d array if you need.

Sign up to request clarification or add additional context in comments.

4 Comments

I followed an idea very similar to this already, starting with the children first and saving the parents for last. The problem is, the only information I have on the children is the ParentCategoryID. Unfortunately, this doesn't tell me how "deep" it is in the structure. So, what I'm seeing happen is subcategories are being placed at the same level as the top-level categories depending on where they fall in the master list. The only logical way around this behavior is to start with the parents first and find a way to fill in. Blah. Still trying to work through it today. Thanks for the idea.
The point of a tree structure like this is that you don't need to know how deep it is. all you need is to start at the parent, and then follow the relationship pointers down.
I cannot get what you have suggested to work. Do you have an example you can share?
Sadly, no. I feel like I made one in algorithms when I was in college, but even if i could find it, we worked in C
0

I finally got it! It seems so simple now, I'm almost embarrassed to say it took so long to figure out. This was my final code to accomplish my goal.

if($cats['HasError'] == "false") {
    $cats = $cats['Support_SubjectsList']['Support_Subjects'];

    //Generate a hierarchy array of categories
    $count = count($cats);
    $i=0;
    while($count > 0) {
        foreach($cats as $k => $v) {
            if($i==0) {
                //Parents   
                if(is_array($v['ParentCategoryID'])) {
                    $categories[$i][$v['CategoryID']] = array("Description" => $v['Description']);
                    unset($cats[$k]);
                }
            } else {
                //Children
                if(array_key_exists($v['ParentCategoryID'], $categories[($i-1)])) {
                    $categories[$i][$v['CategoryID']] = array("Description" => $v['Description'], "ParentCategoryID" => $v['ParentCategoryID']);
                    unset($cats[$k]);
                }
            }
        }
        $count = count($cats);
        $i++;
    }

    //Traverse the hierarchy array backwards to make a nested category array
    $count = count($categories)-1;
    for($i=$count;$i>0;$i--) {
        foreach($categories[$i] as $k => $v) {
            $categories[($i-1)][$v['ParentCategoryID']]['SubCategories'][$k] = array("Description" => $v['Description']);
            if(is_array($v['SubCategories'])) {
                $categories[($i-1)][$v['ParentCategoryID']]['SubCategories'][$k]['SubCategories'] = $v['SubCategories'];
            }
        }
        unset($categories[$i]);
    }
    $categories = $categories[0];
}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.