Dynamic Programming Matrix
March 04, 2024
Background - Dynamic Programming
Dynamic Programming is an algorithmic technique that allows a problem, with a potentially very large problem space, to be structured in a way that dramatically decreases this problem space. This is accomplished by discovering a problem's:
- Optimal substructure.
- Overlapping sub-problems.
There are several approaches to structuring a solution to perform Dynamic Programming:
- Memoization (inherently top-down).
- Tabulation (inherently bottom-up).
Both of the above variations of Dynamic Programming require finding a problem's optimal substructure, and overlapping sub-problems. The problem space that is being referred to in the opening sentence is a tree data structure that does not exist in code, other than in the way the problem recurses (memoization) or iterations (tabulation) through the problem space. However, it's important to note that it's possible to build a dynamic programming solution to a problem that iterates and uses memoization.
Table of Contents
Optimal Substructure and Overlapping Sub-Problems
This can be a challenge. Each algorithmic problem has a set of 0 to N
optimal substructures that also have overlapping
sub-problems (which is necessary for a Dynamic Programming solution). This means that not every problem is suited for Dynamic Programming. A great way to understand what this means is to examine
a solution to a problem that has an optimal substructure, but does not have overlapping sub-problems. Here's a high-level
overview of Merge Sort.
Merge Sort has an optimal substructure. It breaks down the problem of sorting all the way down to neighboring elements, and then begins
to recursively sort and merge (using a simpler algorithm that excels at smaller arrays, like Insertion Sort).
Merge Sort does not have overlapping sub-problems where the solution to one sub-problem is re-used as a pre-computation to
potentially N
other sub-problems. Each sub-problem's solution does help the efficiency of subsequent sub-problems, but it's not
an entirely computed value to be used as input to the subsequent sub-problem. This illustrates the distinction between Divide and Conquer and Dynamic Programming.
However, because there is some increase in efficiency from one sub-problem's solution to the next sub-problem, it can be argued
that Merge Sort exhibits some qualities of Dynamic Programming.
Matrix (Plural: Matrices)
A matrix is a sequence of objects. Each of those objects could be another sequence of objects. Furthermore, each of those objects could themselves be another sequence of objects. And so on and so forth.
A 1D (1-dimensional) matrix is technically an array, although matrix is not commonly used to refer to an array. The most
common structure that is being referenced when one uses the term matrix is a 2D (2-dimensional) array. This can be extended N
times to N
dimensions.
Tabulation
Tabulation is the use of a 1
to N
dimension matrix to store precomputed values of sub-problems to solve subsequent sub-problems
in a Dynamic Programming solution. In theory, any problem that can be solved using tabulation can also be solved using memoization.
Tabulation is inherently bottom-up, because you start computing values at the start of the matrix at the first sub-problem and work
your way through it until the end where the final sub-problem is computed using previous sub-problems (that may or may not be adjacent
cells in the matrix, as this depends on the structure of the tabulation).
Examples
A simple example of tabulation using a 1-dimensional array is a solution to compute the Nth Fibonacci sequence number.
A more complex example of tabulation using a 2-dimensional array is a solution to create an algorithm that is able to process regular expressions.
Conclusion
Tabulation is a bottom-up, iterative approach to Dynamic Programming, whereas memoization is typically a recursive, top-down approach (but it can be iterative). There's no call stack to worry about, which lends this approach of Dynamic Programming to problems with a very high combinatorial complexity (which could cause a stack overflow). However, this call stack problem can easily be alleviated using a literally defined stack (or even a double-ended queue or heap, given the problem). The specific problem and implementation considerations are how you decide between these approaches.
To be updated with a section on how to choose the right table structure for the problem.