I know there are various discussions around this subject already, but I have a specific, slightly different question: most existing questions I have found focus on the external (inter)dependencies of other packages, while my interest is mostly in my own package.

I have found a variety of tools that help to find & visualize interdependencies:

The problem I have with using them is that they show all the dependencies of all the modules, while I would really like to focus on my own internal dependencies plus the "first" external dependency per module. As an example: I use pandas & scipy in many places, so I would like to see those referenced, but not the internal structure of those packages or their dependencies on other things. You can imagine that those cause an explosion of other dependencies that are not in my control and therefore not of direct interest to me.

Pycallgraph does work, but it gives gigantic results that obscure the tiny part of the total dependencies that I'm interested in. Does anyone have any pointers? Do I need to build something simpler myself, or am I overlooking something?

Thank you for your help!

Edit:

So pycallgraph is not really handy for me, as it works by actually executing the code. The problem with modulegraph is that (as said in the comment too) it creates a huge dot file (9000 lines). However (argh), it does not show dependencies between modules at the same package level. So if you have a package "main" with modules "a", "b", "c" and a subpackage "main.file_import" with modules "x", "y", "z", it only gives a dependency between "main" and "main.file_import". That is not what I'm looking for, as I'm trying to figure out whether the actual structure should be refactored (at module level and at function/class level). I'll keep adding things here when I find or create a good solution for this. I had thought this would be a common issue, though.
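To make the desired output concrete, here is a minimal sketch of one way to collect it statically with the standard library `ast` module, so nothing gets executed. The names `direct_imports` and `scan_package` are made up for this illustration, and the sketch assumes that `from . import b` refers to a sibling module rather than a name defined in `__init__.py`. Imports inside the package keep their full dotted name; everything else is cut down to its top-level name, so `pandas.core.frame` simply shows up as `pandas`.

```python
import ast
from pathlib import Path


def direct_imports(source, module, package, is_package=False):
    """Return the modules that `module` imports directly.

    Imports inside `package` keep their full dotted name; anything else is
    cut down to its top-level name (pandas.core.frame -> pandas).
    """
    found = set()
    # Package that relative imports are resolved against.
    parent = module.split(".") if is_package else module.split(".")[:-1]
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            targets = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            if node.level:  # relative import such as "from . import b"
                base = parent[: len(parent) - node.level + 1]
                if node.module:
                    targets = [".".join(base + [node.module])]
                else:
                    # Assumption: the imported names are sibling modules.
                    targets = [".".join(base + [a.name]) for a in node.names]
            else:
                targets = [node.module]
        else:
            continue
        for name in targets:
            internal = name == package or name.startswith(package + ".")
            found.add(name if internal else name.split(".")[0])
    return found


def scan_package(root):
    """Map every module under the package directory `root` to its direct imports."""
    root = Path(root)
    graph = {}
    for path in sorted(root.rglob("*.py")):
        parts = [root.name, *path.relative_to(root).with_suffix("").parts]
        is_package = parts[-1] == "__init__"
        if is_package:
            parts.pop()  # main/__init__.py is the module "main"
        name = ".".join(parts)
        graph[name] = direct_imports(path.read_text(), name, root.name, is_package)
    return graph
```

Calling `scan_package("path/to/main")` would give a dict such as `{"main.a": {"main.b", "pandas"}, ...}`, which is small enough to turn into a dot file by hand. Dynamic imports (`importlib`, `__import__`) are not picked up by this approach.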

  • Have you tried fetching the raw graph representation of the output (possibly dot) and writing a script to trim the graph (e.g. using pydot)? Commented Dec 29, 2013 at 20:30
  • No, overlooked that! :) Good idea, ideally I would like to have it trimmed while it generates, but the end result is the thing that matters. I'm going to try to do that now for the modulegraph output! Commented Dec 29, 2013 at 20:32
  • If it works, be sure to post it as an answer, for posterity :) Commented Dec 29, 2013 at 20:42
  • Thanks. To be honest, looking at the output of these tools, on the one hand they give too much (all the external dependencies), but on the other hand they show the internal dependencies without enough detail (see the edit I made). I will create something this week I guess and will post it here! Commented Dec 29, 2013 at 20:56

2 Answers

With regard to pycallgraph, I ended up with something somewhat useful, starting from basically the same point as you.

  1. Hack pycallgraph to save the intermediate dot file somewhere you can see it.

  2. Run egrep -v to trim out the stuff you don't care about in the dot file. This is where you strip out all logging calls, for example.

  3. Run gvpr, a DOT-manipulation utility that comes with graphviz, to select the nodes you are interested in.

Basic proof of concept code is at https://gist.github.com/jpeyret/33739f6cd99f6108ad5046bd47df5a16
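If you would rather stay in Python for steps 2 and 3, the same idea can be sketched roughly as below. This is not the code from the gist and it is not a replacement for gvpr: it is a plain line filter, and it assumes the dot file has one edge per line in the form `"src" -> "dst"`. The function name `trim_dot` and its parameters are made up for this example.

```python
import re

# Matches simple edge lines such as:  "main.a" -> "main.b" [label="1"];
EDGE = re.compile(r'^\s*"?([\w.]+)"?\s*->\s*"?([\w.]+)"?')


def trim_dot(lines, drop_patterns, focus=None):
    """Filter the lines of a DOT file.

    Lines matching any regex in `drop_patterns` are removed (like egrep -v).
    If `focus` is given, an edge line is kept only when at least one of its
    endpoints starts with that prefix.
    """
    drop = [re.compile(p) for p in drop_patterns]
    kept = []
    for line in lines:
        if any(p.search(line) for p in drop):
            continue
        edge = EDGE.match(line)
        if focus and edge and not any(n.startswith(focus) for n in edge.groups()):
            continue
        kept.append(line)
    return kept
```

For example, `trim_dot(open("calls.dot"), [r"logging"], focus="main")` would drop every line that mentions logging and every edge that does not touch a node starting with `main` (the file name is a placeholder). Node declaration lines for nodes that end up without edges are left in, so the result may still need a little cleanup.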

Snakefood can restrict the dependencies that it will draw: http://furius.ca/snakefood/doc/snakefood-doc.html#restricting-dependencies

You might also be able to use clustering to group all dependencies in the same package (e.g. only show pandas once): http://furius.ca/snakefood/doc/snakefood-doc.html#filtering-and-clustering-dependencies

Snakefood is also a good option if you plan on filtering the output, as it can output data for each stage of its processing.
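To illustrate what the clustering step buys you, here is a small sketch in plain Python. It does not use Snakefood's own commands or file format; it assumes you already have the dependencies as `(source, target)` pairs of dotted module names, and the function names are made up for this example.

```python
def is_internal(name, package):
    """True if `name` is the package itself or one of its submodules."""
    return name == package or name.startswith(package + ".")


def cluster_edges(edges, package):
    """Collapse external targets to their top-level package and drop duplicates."""
    clustered = set()
    for src, dst in edges:
        if not is_internal(src, package):
            continue  # dependencies of third-party code are not of interest
        if not is_internal(dst, package):
            dst = dst.split(".")[0]  # pandas.core.frame -> pandas
        clustered.add((src, dst))
    return sorted(clustered)
```

With this, a module that imports both `pandas.core.frame` and `pandas.io.parsers` ends up with a single edge to `pandas`, and edges that start inside pandas itself disappear.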
