3

I'm trying to iterate over an array containing multi-parent dependencies in the most efficient way possible for the purpose of building an efficient file-dependency script implementing RequireJS.

I have used the answer to this question Category Hierarchy (PHP/MySQL) successfully for a 1 parent scenario. However there is another constraint in that one child-script may depend on more than one parent-script.

For example a list of scripts and their dependencies which are currently an array in PHP (consider the script name unique and the key):

╔══════════════════╦═════════════════════════════╗
║   Script Name    ║        Dependencies         ║
╠══════════════════╬═════════════════════════════╣
║ jquery           ║ jquery-core, jquery-migrate ║
║ jquery-core      ║ null                        ║
║ jquery-migrate   ║ null                        ║
║ statistics       ║ null                        ║
║ backtotop        ║ jquery                      ║
║ jquery-circle    ║ jquery                      ║
║ interface-slider ║ jquery-circle               ║
║ test1            ║ jquery, statistics          ║
║ test2            ║ test1                       ║
║ test3            ║ null                        ║
╚══════════════════╩═════════════════════════════╝

Would create an array (demonstrated as JSON for clarity) like the one below where a shared dependency such as that of test1 would be nested under ['jquery-core jquery-migrate']['jquery']['statistics'] as opposed to having a new inclusion of ['jquery-core jquery-migrate']['jquery statistics']['test1']

allScripts = [
    {
        "name": "jquery-core jquery-migrate",
        "children":[
            {
                "name":"jquery",
                "children":[
                    {
                        "name":"backtotop",
                        "children":null
                    },
                    {
                        "name":"statistics",
                        "children":[
                            {
                                "name":"test1",
                                "children":[
                                    {
                                        "name":"test2",
                                        "children":null
                                    }
                                ]
                            }
                        ]
                    },
                    {
                        "name":"jquery-circle",
                        "children":[
                            {
                                "name":"interface-slider",
                                "children":null
                            }
                        ]
                    }
                ]
            }
        ]
    },
    {
        "name":"test3",
        "children":null
    }
];

Perhaps this needs a form of lowest-common-ancestor approach?http://www.stoimen.com/blog/2012/08/24/computer-algorithms-finding-the-lowest-common-ancestor/

Any help is much appreciated!

2 Answers 2

2
+50

I put together some demo code that loads the dependencies of a script first, then loads the script. There's some sample output to demonstrate what's going on as well. Make sure to read all of the comments as there's too much to explain here. Please let me know if you want any changes.
+1, interesting challenge!


#!/usr/bin/php
<?php
/*
 * -------
 * There's room for improvement, but this is a good start.
 * Let me know if you need any changes.
 * -------
 * 
 * This loads scripts with dependencies being loaded
 * first, with efficiency being key here.
 * Reference counting is also performed, and can be
 * seen in the $loaded array.  A script can be referenced
 * indirectly many times through the loading of various
 * scripts and their dependencies.
 * Circular dependencies are handled by checking if the
 * script has already been loaded.  Since it can only
 * be loaded once, a circular dependency is avoided.
 *
 * Original Sample Data:
 * ╔══════════════════╦═════════════════════════════╗
 * ║   Script Name    ║        Dependencies         ║
 * ╠══════════════════╬═════════════════════════════╣
 * ║ jquery           ║ jquery-core, jquery-migrate ║
 * ║ jquery-core      ║ null                        ║
 * ║ jquery-migrate   ║ null                        ║
 * ║ statistics       ║ null                        ║
 * ║ backtotop        ║ jquery                      ║
 * ║ jquery-circle    ║ jquery                      ║
 * ║ interface-slider ║ jquery-circle               ║
 * ║ test1            ║ jquery, statistics          ║
 * ║ test2            ║ test1                       ║
 * ║ test3            ║ null                        ║
 * ╚══════════════════╩═════════════════════════════╝
 * 
 */

define('NO_DEPENDENCIES_LIST',    TRUE);  //create a list of scripts with no dependencies

//sample data, taken from OP
$scripts = array(
    'jquery'=>array('jquery-core','jquery-migrate',),
    'jquery-core'=>array(),
    'jquery-migrate'=>array(null ),
    'statistics'=>array( ),
    'backtotop'=>array('jquery' ),
    'jquery-circle'=>array('jquery' ),
    'interface-slider'=>array('jquery-circle' ),
    'test1'=>array('jquery','statistics', 'test3'  ),
    'test2'=>array('test1' ),
    'test3'=>array( ),
);

$loaded     = array();  //list of loaded scripts, order is important
$nodepends  = array();  //list of scripts with no dependencies. async load perhaps?


/**
 * Adds a null item to an empty array, strictly for json output
 * as the OP used null instead of empty arrays in the example.
 * @param array $scripts
 */
function process_array(&$scripts) { 
  foreach($scripts as $s=>&$data)
    if (count($data)==0)
      $data = array(null);
}


/**
 * Finds dependents of $scriptName.
 * @param array $scripts script test data
 * @param string $scriptName name of script to search for dependcies for
 * @param boolean $retNames TRUE to return script names, false to return ref count 
 * @return multitype:number unknown 
 */
function find_dependencies(array $scripts, $scriptName, $retNames=FALSE) {
  $ret = array();
  foreach($scripts as $s=>$data)
    foreach($data as $d) {
      if ($d == $scriptName)
        if (!$retNames) {
          $ret[$s] = (isset($ret[$s])?$ret[$s] + 1 : 0);
        } else {
          $ret[$s] = $s;
        }
    }
  return $ret;
}

/**
 * Checks $script to see if it has already been added to the list of
 * loaded scripts.
 * @param string|array $script script name or array of script names
 * @return boolean
 */
function script_loaded($script) {
  global $loaded;
  if (is_array($script)) {
    foreach($script as $s)
      if (!isset($loaded[$s]))
        return FALSE;
  } 
  if (is_string($script)) 
    return isset($loaded[$script]);
  return TRUE;
}

/**
 * Loads a script into the $loaded array, first loading all
 * dependencies, and the dependencies of those, etc., etc.
 * Ensures that scripts are only loaded after all required
 * scripts are loaded.  Remember - order of $loaded is important!
 * Return value is unimportant in this version.
 * 
 * @param array $scripts
 * @param string $script
 * @param array|null $children
 * @param integer $level
 * @return boolean
 */
function load_script(array &$scripts, $script, &$children, $level) {
  global $loaded, $nodepends;
  if ($script == null)
    return FALSE;
  if (script_loaded($script))
    return TRUE;

  if (count($scripts[$script]) > 0 && $scripts[$script][0] != null) {
    for($i=0;$i<count($scripts[$script]);$i++) {
      if (!isset($scripts[$script][$i]))
        break;
      if ($i >= count($scripts[$script]))
        break;
      if (!script_loaded($scripts[$script][$i])) {
        load_script($scripts, $scripts[$script][$i], $scripts[$script], $level+1);
      } 

      if (isset($children[$i]) && script_loaded($children[$i]))
        $children[$i] = null;
    }
  } 
  if ($scripts[$script][0]==null) {
    if (!isset($loaded[$script]))
      $loaded[$script] = $script;
      if (NO_DEPENDENCIES_LIST) 
        $nodepends[$script] = $loaded[$script];
  }


  if (!isset($loaded[$script])) {
    $loaded[$script] = 0;
  } else {
    $loaded[$script] = $loaded[$script] + 1;
    return TRUE;
  }

  $loaded[$script] = $loaded[$script] + 1;

  echo "load_script($script)\n";  
  return TRUE;
}


/**
 * demo main function
 * @param array $scripts - array of scripts and their dependencies: test data
 */
function main(&$scripts, &$loaded, &$nodepends) {
  process_array($scripts);
  foreach($scripts as $s=>$data) {
    load_script($scripts, $s, $data, 0);
  } 


if (NO_DEPENDENCIES_LIST)
  //reverse this to put the less-referenced scripts at the top
  $nodepends = array_reverse($nodepends);

//here we print out a table of the $loaded array.
//it's just here for a visual representation of
//what scripts were loaded and what their dependent scripts
//are.  
//The left column is the loaded script, the right column
//is a list of scripts that tried to load the script.
echo "
 ╔══════════════════════════════════════════════════╗
 ║  Loaded Scripts: with scripts that loaded them   ║
 ╚══════════════════════════════════════════════════╝
 ╔══════════════════╦═══════════════════════════════╗
 ║   Script Name    ║         Loading Scripts       ║
 ╠══════════════════╬═══════════════════════════════╣\n";
 foreach($loaded as $s=>$n) { 
   $d = find_dependencies($scripts, $s, TRUE);
   $n = 16-strlen($s);
   $s2 = implode(",", $d);
   if ($s2=="")
     $s2 = "null";
   $n2 = 29-strlen($s2);
   printf (" ║ %s%s ║ %s%s ║\n", $s, str_repeat(" ", $n), $s2, str_repeat(" ", $n2));
 }
 echo " ╚══════════════════╩═══════════════════════════════╝\n";

//print json of loaded scripts -- just because OP used json
print_r( json_encode($scripts, JSON_PRETTY_PRINT) );

//print array of loaded scripts: the order of this array is important;
//scripts that are depended on by other scripts are loaded first.  If
//a script has no dependencies, its order is not significant.
//--
//this is the actual result that we're looking for: all scripts loaded
//with dependencies loaded first.
print_r( $loaded );

//print array of scripts that have no dependencies.  Since efficiency is
//a requirement, you could async load these first, since it doesn't matter
//what order they load in.
if (NO_DEPENDENCIES_LIST)
  print_r( $nodepends );
}


//run the demo
main($scripts, $loaded, $nodepends);
Sign up to request clarification or add additional context in comments.

5 Comments

Thanks @trick, I will take a thorough look and get back with findings!
I wrote it as a shell script, so make sure to chmod +x demo.php && demo.php -or- php -f demo.php (in bash)
Looking good in so far that we're getting the correct order of dependencies. However I've gotten to that point on my own previously and would like to take it a step further but grouping by mutual exclusivity to take advantage of the asynchronous loading in requireJS. For example I would then be able to require backtotop, statistics and jquery-circle (then thereafter their dependants) at the same time once jquery had loaded as opposed to synchronously as suggested here by the output as I currently have no way of knowing the grouping/reliance. Hence I demo'd a nested JSON/array.
Is the JSON you posted the exact format that you need, or was it just an example?
It was really just an example of displaying a nested array/structure which I can then process for HTML/Javascript output to get an efficient requireJS chain
0

If I understand your question correctly, you have chosen the wrong data structure to represent dependencies. Whenever multiple parents are allowed in a dependency hierarchy, you have a Directed Acyclic Graph (DAG), not a tree as you have shown. (A tree is a specialization of DAG where every node happens to have one parent. DAGs are strictly more powerful.)

You have shown a quirky kind of tree where a single node (e.g. "jquery-core jquery-migrate",) represents a pair of DAG nodes. This won't work in general. If A is a dependency of (B and C) and D is one of (B and E), then what? If you choose to use two different tree nodes for the (B,C) and (B,E) combinations, what do edges from these combination nodes to their dependencies mean? Moreover, even if you adopt some conservative answer to this question, there are 2^N combinations of N nodes. This approach could easily lead to an enormous tree.

In JSON or PHP, a DAG is easy to represent and search. Store it as an adjacency list: A map taking any library to a list of other libraries that are either its dependencies or dependent on it (choose the direction based on your purpose; to match the tree you showed, it would be the second choice). Search it with any graph search algorithm (e.g. depth first or breadth first).

NB In a general system you must also worry about dependency cycles, which can be accomplished with a search as above.

2 Comments

Thanks for the idea of a DAG @Gene. Reading your answer, what do you mean by it would be the second choice ?
@RobSchmuecker I meant choose that the DAG edges run from the dependency to the dependent node rather than the opposite. But note that for many purposes, it's handy to have both "dependency" and "dependent" edges labeled as such you can navigate over the DAG in any way you need. In this case, the data structure is a map from libraries to two lists of libraries.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.