Timeout for recursion Python

Question

I have a script that analyzes javascript in links, recursively, so if it finds a javascript, then it analyzes the javascript, and if the javascript it's analyzing contains more javascript then it keeps going, so on so forth. However, I've encountered issues where this recursion would never stop, is there a way to add a timeout for this recursion?

Care to share some code?

Jari
– Jari

2012-07-17 18:00:04 +00:00
Commented Jul 17, 2012 at 18:00 — Jari
– Jari, Commented Jul 17, 2012 at 18:00

kindall · Accepted Answer · 2012-07-17 18:34:41Z

6

Python has a built-in recursion limit and will raise a RuntimeError if this is exceeded. By default its stack limit is 1000. So:

try:
    func_that_may_recurse_infinitely()   # i.e., your JavaScript crawler func
except RuntimeError as e:
    if "recursion" in str(e):
        print "stop all the downloadin'!"

You may modify the initial recursion limit with sys.setrecursionlimit() if you need it to be deeper or shallower.

However, a better approach may be to keep a set() of the items you've already seen and simply decline to process any you've already processed. This can keep you from getting into recursive situations in the first place.

edited Jul 17, 2012 at 18:34

answered Jul 17, 2012 at 18:10

kindall

185k36 gold badges291 silver badges321 bronze badges

Sign up to request clarification or add additional context in comments.

5 Comments

Stupid.Fat.Cat Over a year ago

Oh, this is very useful. But I think for my purposes I would like the depth version more. Thank you!

abarnert Over a year ago

The memoization idea is neat, but it won't necessarily help here. If the cause of the infinite recursion is that the JS is actually evaluating to itself, it's fine; if the cause of the recursion is that the JS is evaluating to a slightly longer string of JS each time, it'll still go forever.

kindall Over a year ago

Hence the "may." I don't know that the "analysis" consists of so I don't know how practical it is for the problem at hand.

Stupid.Fat.Cat Over a year ago

@abarnert yes, that is exactly the problem, the js in the particular sample I was working on refers to each other, causing an endless recursion.

abarnert Over a year ago

Well, if the first one evals to exactly the second, and the second to exactly the first, then the trick kindall suggests will be able to detect the loop and bail out after 2 steps instead of having to go through the loop N times or N seconds—not just saving lots of time and CPU, but also meaning you can be sure it really is an infinite loop, rather than just a finite one that's a hair too deep for your limits.

Joel Cornett · Accepted Answer · 2012-07-17 18:34:34Z

2

I tend to agree with kindall. If, however, you did want to limit the depth of recursion, you could do something like this:

def foo(max_depth = 10, cur_depth=0):
    if cur_depth >= max_depth:
        return BASE_CASE

    else:
        return foo(max_depth, cur_depth+1)

edited Jul 17, 2012 at 18:34

answered Jul 17, 2012 at 18:14

Joel Cornett

24.8k9 gold badges69 silver badges90 bronze badges

1 Comment

Stupid.Fat.Cat Over a year ago

Oh question, the if depth >=max_depth, I assume you mean cur_depth?

abarnert · Accepted Answer · 2012-07-17 18:20:43Z

A quick&dirty solution is to just call time.time() and compare. For example, let's say you've got a simple factorial function like this:

def fact(i):
  if i == 0 or i == 1: return 1
  return i * fact(i-1)

If you call fact(-1), this will spin for a while and then raise a RuntimeError because of maximum recursion depth.

You can add a timeout like this:

import time

def factWithTimeout(i, timeout):
  def fact(i, endtime):
    if time.time() > endtime:
      raise RuntimeError('Timeout')
    if i == 0 or i == 1: return 1
    return i * fact(i-1, endtime)
  return fact(i, time.time() + timeout)

Now, if you call factWithTimeout(-1, 0.0001), it'll only spin for about 100us and then quit with RuntimeError because of timeout.

Obviously, for a function so trivial that it hits the recursion limit in under a millisecond, this isn't much different, but for a more realistic function, that won't be an issue.

Michael0x2a · Accepted Answer · 2012-07-17 18:13:13Z

1

You could do something like this:

import time

start = time.time()
timeout_limit = 30   # 30 seconds, or some other number.

def foo():
    if time.time() > start + timeout_limit:
        return 0

    # insert code here.

    foo()

...and if you don't want global variables, you could try this:

class Foo(object):
    def __init__(self, timeout_limit):
        self.timeout_limit = timeout_limit

    def run(self, ...):
        self.start = time.time()
        self._run(...)

    def _run(self, ...):
        if time.time() > self.start + self.timeout_limit:
            return

        # insert code here.

        self._run(...)

...although that might be overkill.

answered Jul 17, 2012 at 18:13

Michael0x2a

65.6k32 gold badges190 silver badges241 bronze badges

2 Comments

Stupid.Fat.Cat Over a year ago

Yea that's slightly more complicated than what I bargained for.

abarnert Over a year ago

In this case, I think it makes more sense to use a local function and a closure (as in my answer) than an object, because the state is so trivial, and there's no need for an interface with multiple methods. But it is more explicit this way, especially to people who have more OO than FP background.

Collectives™ on Stack Overflow

Timeout for recursion Python

4 Answers 4

5 Comments

1 Comment

Comments

2 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

5 Comments

1 Comment

Comments

2 Comments

Your Answer

Sign up or log in

Post as a guest

Related