1

I'm doing a program that is supposed to load a graph with 200k vertex and 500k edges. Then, I'm required to look into the vertexes that are near to some distance of a vertex (they all have a specific ubication). I'm doing this through recursion (DFS), however, for a sufficient big amount, the recursion I have to get through is large enough to crash the program (stack overflow). I've been reading about it and there's a common consensus about this error being a Bad Programming issue, and not a size of the data issue. I can increment stack space and it works, however I would like to know whether I could change my DFS implementation to make it less heavy for the stack. I was also wondering if the "weight" of each element in the stack is determined by the recursion (is it creating new variables? What is going on for it to crash?)

If someone could help me with this I would be really thankful!!

4
  • 1
    Does your graph have cycles? Are you able to use iteration rather than recursion? Commented Nov 20, 2015 at 22:08
  • 6
    DFS can be factored into an iterative algorithm (like any other recursion, actually). But sometimes switching to a different algorithm might help. Can BFS for instance solve your problem? Commented Nov 20, 2015 at 22:08
  • 1
    In one of my application i was dealing with quite complex graphs, but not as big as yours, and most of the time when I had stackoverflow error, it was caused by bug in code and visiting same node infinite times rather than large data. Commented Nov 20, 2015 at 22:11
  • Yes it probably has cycles, however I apply a "marked" BitSet that doesn't allow to recur on those vertex (It is a city map, with each intersection as a vertex). Is implementing the iterative version of the algorithm recommendable? I was eager to implement BFS, but is a little bit ineffective for the problem I'm trying to solve (find the vertexes within a circle, but I would have to drop one branch if I found it is out of range (as I said it would be a little bit more complex haha). Commented Nov 20, 2015 at 22:15

2 Answers 2

4

The crash is caused by the use of a thread-specific stack that is limited in size. You won't get away from that.

Each time a method is called, the arguments to the method, and other information, is put on the stack, and it stays there until the method returns, even if the values are modified. With a recursive call, this means every recursing increases the amount of stack space used.

Every recrusive solution can be modified to an iterative one, although it may be tricky to implement.

Sign up to request clarification or add additional context in comments.

2 Comments

Ok, I'm passing by parameter some sets, so are you saying the state of that set is saved in each recursion? Or is it unique for all of them?
@kickingnico, Each (recursive) invocation of the method has its own, independent set of local variables, including all its arguments. For arguments of reference type (i.e. references to objects) it is the references that are indpendent. If you forward such an argument in a recursive call then you thereby make a copy of the reference, but the copy refers to the same object as the original reference does.
1

A stack overflow arising from a recursive algorithm often indicates a programming error (which perhaps is what you mean by "a Bad Programming issue"). This is because such stack overflows typically arise because the programmer has failed to ensure that the recursion will always terminate.

It is possible, however, for a stack overflow to arise from a basically correct program, when large enough data are presented to it that it runs out of stack before it reaches the recursion termination condition. One might nevertheless still characterize this as a programming error: the program implementation is not suitable for some of the inputs it is expected to be able to handle. In this sense, you can't really separate data-size issues from programming issues.

If your program needs to traverse very long paths in a graph (where the definition of "very long" is fuzzy) then a recursive algorithm is probably a poor choice. If DFS is indeed a good match to the problem then it isn't so hard to implement it iteratively.

2 Comments

Thanks a lot! It is then a programming mistake to use recursion when it is the input that is not suitable for the recursion! However, is the iterative solution better than recursion? I mean, does it not affect memory in the same way, even if it does not affect the stack?
Generally speaking, you have a lot more heap to work with than you have stack. Also, an iterative implementation can always beat a recursive implementation on total memory use. In the end, though, if you have big data then you need proportionally big memory to work with it. Even so, 200k vertices and 500k edges is largish, but not huge.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.