0

I have spent a lot of time to learn about implementing/visualizing dynamic programming problems using iteration but I find it very hard to understand, I can implement the same using recursion with memoization but it is slow when compared to iteration.

Can someone explain the same by a example of a hard problem or by using some basic concepts. Like the matrix chain multiplication, longest palindromic sub sequence and others. I can understand the recursion process and then memoize the overlapping sub problems for efficiency but I can't understand how to do the same using iteration.

Thanks!

2
  • 3
    This problem, as written, is a little broad. Can you give an example of a specific problem you tried to solve iteratively, how you solved it recursively, and what problem you ran into when trying to do it iteratively? Commented Jul 22, 2016 at 17:34
  • For example you can take the matrix chain multiplication geeksforgeeks.org/…, see my code here stackoverflow.com/questions/38464887/…, I can't think how to maintain a dp matrix or top-down/bottom-up. Commented Jul 22, 2016 at 18:01

2 Answers 2

8
+50

Dynamic programming is all about solving the sub-problems in order to solve the bigger one. The difference between the recursive approach and the iterative approach is that the former is top-down, and the latter is bottom-up. In other words, using recursion, you start from the big problem you are trying to solve and chop it down to a bit smaller sub-problems, on which you repeat the process until you reach the sub-problem so small you can solve. This has an advantage that you only have to solve the sub-problems that are absolutely needed and using memoization to remember the results as you go. The bottom-up approach first solves all the sub-problems, using tabulation to remember the results. If we are not doing extra work of solving the sub-problems that are not needed, this is a better approach.

For a simpler example, let's look at the Fibonacci sequence. Say we'd like to compute F(101). When doing it recursively, we will start with our big problem - F(101). For that, we notice that we need to compute F(99) and F(100). Then, for F(99) we need F(97) and F(98). We continue until we reach the smallest solvable sub-problem, which is F(1), and memoize the results. When doing it iteratively, we start from the smallest sub-problem, F(1) and continue all the way up, keeping the results in a table (so essentially it's just a simple for loop from 1 to 101 in this case).

Let's take a look at the matrix chain multiplication problem, which you requested. We'll start with a naive recursive implementation, then recursive DP, and finally iterative DP. It's going to be implemented in a C/C++ soup, but you should be able to follow along even if you are not very familiar with them.

/* Solve the problem recursively (naive)

   p - matrix dimensions
   n - size of p
   i..j - state (sub-problem): range of parenthesis */
int solve_rn(int p[], int n, int i, int j) {
    // A matrix multiplied by itself needs no operations
    if (i == j) return 0;

    // A minimal solution for this sub-problem, we
    // initialize it with the maximal possible value
    int min = std::numeric_limits<int>::max();

    // Recursively solve all the sub-problems
    for (int k = i; k < j; ++k) {
        int tmp = solve_rn(p, n, i, k) + solve_rn(p, n, k + 1, j) + p[i - 1] * p[k] * p[j];
        if (tmp < min) min = tmp;
    }

    // Return solution for this sub-problem
    return min;
}

To compute the result, we starts with the big problem:

solve_rn(p, n, 1, n - 1)

The key of DP is to remember all the solutions to the sub-problems instead of forgetting them, so we don't need to recompute them. It's trivial to make a few adjustments to the above code in order to achieve that:

/* Solve the problem recursively (DP)

   p - matrix dimensions
   n - size of p
   i..j - state (sub-problem): range of parenthesis */
int solve_r(int p[], int n, int i, int j) {
    /* We need to remember the results for state i..j.
       This can be done in a matrix, which we call dp,
       such that dp[i][j] is the best solution for the
       state i..j. We initialize everything to 0 first.

       static keyword here is just a C/C++ thing for keeping
       the matrix between function calls, you can also either
       make it global or pass it as a parameter each time.

       MAXN is here too because the array size when doing it like
       this has to be a constant in C/C++. I set it to 100 here.
       But you can do it some other way if you don't like it. */
    static int dp[MAXN][MAXN] = {{0}};

    /* A matrix multiplied by itself has 0 operations, so we
       can just return 0. Also, if we already computed the result
       for this state, just return that. */
    if (i == j) return 0;
    else if (dp[i][j] != 0) return dp[i][j];

    // A minimal solution for this sub-problem, we
    // initialize it with the maximal possible value
    dp[i][j] = std::numeric_limits<int>::max();

    // Recursively solve all the sub-problems
    for (int k = i; k < j; ++k) {
        int tmp = solve_r(p, n, i, k) + solve_r(p, n, k + 1, j) + p[i - 1] * p[k] * p[j];
        if (tmp < dp[i][j]) dp[i][j] = tmp;
    }

    // Return solution for this sub-problem
    return dp[i][j];;
}

We start with the big problem as well:

solve_r(p, n, 1, n - 1)

Iterative solution is only to, well, iterate all the states, instead of starting from the top:

/* Solve the problem iteratively

   p - matrix dimensions
   n - size of p

   We don't need to pass state, because we iterate the states. */
int solve_i(int p[], int n) {
    // But we do need our table, just like before
    static int dp[MAXN][MAXN];

    // Multiplying a matrix by itself needs no operations
    for (int i = 1; i < n; ++i)
        dp[i][i] = 0;

    // L represents the length of the chain. We go from smallest, to
    // biggest. Made L capital to distinguish letter l from number 1
    for (int L = 2; L < n; ++L) {
        // This double loop goes through all the states in the current
        // chain length.
        for (int i = 1; i <= n - L + 1; ++i) {
            int j = i + L - 1;
            dp[i][j] = std::numeric_limits<int>::max();

            for (int k = i; k <= j - 1; ++k) {
                int tmp = dp[i][k] + dp[k+1][j] + p[i-1] * p[k] * p[j];
                if (tmp < dp[i][j])
                    dp[i][j] = tmp;
            }
        }
    }

    // Return the result of the biggest problem
    return dp[1][n-1];
}

To compute the result, just call it:

solve_i(p, n)

Explanation of the loop counters in the last example:

Let's say we need to optimize the multiplication of 4 matrices: A B C D. We are doing an iterative approach, so we will first compute the chains with the length of two: (A B) C D, A (B C) D, and A B (C D). And then chains of three: (A B C) D, and A (B C D). That is what L, i and j are for.

L represents the chain length, it goes from 2 to n - 1 (n is 4 in this case, so that is 3).

i and j represent the starting and ending position of the chain. In case L = 2, i goes from 1 to 3, and j goes from 2 to 4:

(A B) C D     A (B C) D     A B (C D)
 ^ ^             ^ ^             ^ ^
 i j             i j             i j

In case L = 3, i goes from 1 to 2, and j goes from 3 to 4:

(A B C) D     A (B C D)
 ^   ^           ^   ^
 i   j           i   j

So generally, i goes from 1 to n - L + 1, and j is i + L - 1.

Now, let's continue with the algorithm assuming that we are at the step where we have (A B C) D. We now need to take into account the sub-problems (which are already calculated): ((A B) C) D and (A (B C)) D. That is what k is for. It goes through all the positions between i and j and computes the sub problems.

I hope I helped.

Sign up to request clarification or add additional context in comments.

1 Comment

Can you explain the loops in the last code, why the 2nd loop is upto n - L +1 and how j is calculated and the 3rd loop i to j-1. With a example perhaps.
0

The problem with recursion is the high number of stack frames that need to be pushed/popped. This can quickly become the bottle-neck.

The Fibonacci Series can be calculated with iterative DP or recursion with memoization. If we calculate F(100) in DP all we need is an array of length 100 e.g. int[100] and that's the guts of our used memory. We calculate all entries of the array pre-filling f[0] and f[1] as they are defined to be 1. and each value just depends on the previous two.

If we use a recursive solution we start at fib(100) and work down. Every method call from 100 down to 0 is pushed onto the stack, AND checked if it's memoized. These operations add up and iteration doesn't suffer from either of these. In iteration (bottom-up) we already know all of the previous answers are valid. The bigger impact is probably the stack frames; and given a larger input you may get a StackOverflowException for what was otherwise trivial with an iterative DP approach.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.