DFS Backtracking

March 12, 2024

Overview

DFS (Depth-First Search) Backtracking is a brute-force technique for solving optimization problems. You write a recursive function that tries every possible solution and keeps the best one that meets the conditions of the problem. This style of recursion traces out a tree. Not a literal tree data structure that exists in memory (at least, not in its entirety), but a tree in the recursive path that the DFS function takes. Only the current path, whose length is at most the tree's height, exists in memory on the call stack.

It's good to be familiar with both recursion and tree traversal, as covered in InOrder vs PreOrder vs PostOrder.

Table of Contents

Problem Space
Branching Factor of the Problem Space Tree
Height of the Problem Space Tree
Memoization
Minimize Problem Space with a Heuristic
Conclusion

Problem Space
^

The size of the problem space of a full tree (including a tree that represents the problem space of a DFS Backtracking solution) is:

x = (Bf^H - 1) / (Bf - 1)

Where:

  • x is the number of possible solutions in the problem space.
  • Bf is the branching factor of each node in the tree of possible solutions (the number of times the recursive function calls itself in its function definition).
  • H is the height of the tree of possible solutions, counting the root level (the root is H=1 in this context; the derivation just after this list shows how the root can be H=0 in another convention). For many problems, H cannot be easily ascertained.
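
The formula is the closed form of a geometric series. A full tree with branching factor Bf has Bf^0 = 1 node at the root level, Bf^1 nodes at the next level, and Bf^i nodes at level i if we label the root i=0. Summing over all H levels:

x = Bf^0 + Bf^1 + ... + Bf^(H-1) = (Bf^H - 1) / (Bf - 1)

Labeling levels from 0 is the other context mentioned above: the exponents in the sum start at 0, while H counts the total number of levels.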

Branching Factor of the Problem Space Tree
^

A good way to explain Bf in the above formula is with an example:

import sys

minCost = sys.maxsize

def dfs(curBearCost, curCatCost, curDogCost):
    global minCost  # minCost is module-level, so `global` (not `nonlocal`) is required
    # someBearConditionIsTrue, someCatConditionIsTrue, someDogConditionIsTrue,
    # and someGoodCondition are problem-specific placeholders.
    if someBearConditionIsTrue:
        dfs(curBearCost + 1, curCatCost, curDogCost)
    if someCatConditionIsTrue:
        dfs(curBearCost, curCatCost + 1, curDogCost)
    if someDogConditionIsTrue:
        dfs(curBearCost, curCatCost, curDogCost + 1)

    if someGoodCondition:
        minCost = min(minCost, curBearCost + curCatCost + curDogCost)

dfs(0, 0, 0)
# At this point, the entire problem space has been searched and `minCost` holds the optimal value.

In the above example, Bf would be equal to 3.

Height of the Problem Space Tree
^

The height of the problem space tree in the above example is difficult to ascertain because the variables and conditions don't place a limit on the depth of the recursion. The example below does:

import sys

minCost = sys.maxsize

maxCost = 10

def dfs(curBearCost, curCatCost, curDogCost):
    global minCost  # minCost is module-level, so `global` (not `nonlocal`) is required
    if curBearCost < maxCost:
        dfs(curBearCost + 1, curCatCost, curDogCost)
    if curCatCost < maxCost:
        dfs(curBearCost, curCatCost + 1, curDogCost)
    if curDogCost < maxCost:
        dfs(curBearCost, curCatCost, curDogCost + 1)

    if someGoodCondition:  # problem-specific placeholder, as before
        minCost = min(minCost, curBearCost + curCatCost + curDogCost)

dfs(0, 0, 0)
# At this point, the entire problem space has been searched and `minCost` holds the optimal value.

In the above example, the height of the problem space tree is bounded: every recursive call increases the total cost (curBearCost + curCatCost + curDogCost) by exactly 1, and each of the three counters is capped at maxCost = 10, so no path can be more than 3 × 10 = 30 calls deep. Including the root level, that makes H = 31, and the maximum size of our problem space is:

(3^31 - 1) / (3 - 1) = 308,836,698,141,973

The reason we say maximum size is that the tree is not full: a branch stops as soon as its counter reaches maxCost, and a real problem might have other conditions (not explicitly defined in our example above) that skip further branches of the tree.
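
We can sanity-check both the bound and the "maximum" caveat by counting actual calls with a smaller cap, where the whole tree is cheap to enumerate. This is a minimal sketch; the maxCost = 2 value and the calls counter are ours, not part of the original example:

maxCost = 2   # a smaller cap than the running example, purely to keep the tree tiny
calls = 0     # counts every dfs invocation, i.e. every node in the problem space tree

def dfs(bear, cat, dog):
    global calls
    calls += 1
    if bear < maxCost:
        dfs(bear + 1, cat, dog)
    if cat < maxCost:
        dfs(bear, cat + 1, dog)
    if dog < maxCost:
        dfs(bear, cat, dog + 1)

dfs(0, 0, 0)

# Full-tree upper bound: Bf = 3, H = 3 * maxCost + 1 levels (including the root).
bound = (3 ** (3 * maxCost + 1) - 1) // (3 - 1)
print(calls, bound)  # 271 1093 -- the real tree is far smaller than the bound

For maxCost = 2 this prints 271 1093: the actual tree is already far below the full-tree bound, because each branch stops as soon as its counter hits the cap.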

Memoization
^

The above example does duplicate work. For example, there are necessarily two different paths in the tree that arrive at the same state, curBearCost=1, curCatCost=1, curDogCost=0:

  1. (0 + 1, 0, 0) -> (1, 0 + 1, 0) (first Bear increments, then Cat)
  2. (0, 0 + 1, 0) -> (0 + 1, 1, 0) (first Cat increments, then Bear)

However, what's important to realize is that the second encounter with curBearCost=1, curCatCost=1, curDogCost=0 only occurs after the first branch from the root DFS call has been fully explored. Because dfs(curBearCost + 1, curCatCost, curDogCost) is the first recursive call in the function's order of operations, the first branch from the root increments Bear until its maximum value is reached.

Once that maximum is reached at curBearCost=10, curCatCost=0, curDogCost=0, the call stack begins to unwind one Bear at a time, and at each level it increments Cat and Dog to their maximums. The unwinding continues until Bear is 1 and Cat begins incrementing once again: curBearCost=1, curCatCost=1, curDogCost=0. The incrementing of Cat and Dog then continues until curBearCost=1, curCatCost=10, curDogCost=10, and the unwinding of Bear continues until we're back at the root (original, or base) call of dfs(0, 0, 0).

The difference now is that the current operation is the second recursive call: dfs(curBearCost, curCatCost + 1, curDogCost). That starts a new call at the very top of the order of operations, which calls dfs(curBearCost + 1, curCatCost, curDogCost) with curCatCost already at 1 -- the second encounter with curBearCost=1, curCatCost=1, curDogCost=0.

Note: This simplified walkthrough can make it hard to see the breadth of the tree at its greatest depths; states with high values (like 10) appear many times in the problem space tree. Overall, there are only 11^3 = 1331 unique states (potential solutions), since each of the three counters takes a value from 0 to 10.
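
To make the duplication concrete, here's a tiny sketch of the same recursion with the cap shrunk to maxCost = 1 (our choice, purely so the full visit order fits on screen), printing every state in visit order:

maxCost = 1  # tiny cap so the complete visit order is short enough to read

def dfs(bear, cat, dog):
    print((bear, cat, dog))  # log every node the DFS visits
    if bear < maxCost:
        dfs(bear + 1, cat, dog)
    if cat < maxCost:
        dfs(bear, cat + 1, dog)
    if dog < maxCost:
        dfs(bear, cat, dog + 1)

dfs(0, 0, 0)
# (1, 1, 0) appears twice in the output: once via the Bear-first branch,
# then again via the Cat-first branch -- exactly the duplication described above.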

This duplicate execution can be resolved with memoization:

import sys

minCost = sys.maxsize

maxCost = 10
seen = set()

def dfs(curBearCost, curCatCost, curDogCost):
    global minCost  # minCost is module-level, so `global` (not `nonlocal`) is required

    # A tuple of the arguments identifies a state; skip states already explored.
    if (curBearCost, curCatCost, curDogCost) in seen:
        return
    seen.add((curBearCost, curCatCost, curDogCost))
    
    if curBearCost < maxCost:
        dfs(curBearCost + 1, curCatCost, curDogCost)
    if curCatCost < maxCost:
        dfs(curBearCost, curCatCost + 1, curDogCost)
    if curDogCost < maxCost:
        dfs(curBearCost, curCatCost, curDogCost + 1)

    if someGoodCondition:  # problem-specific placeholder, as in the earlier examples
        minCost = min(minCost, curBearCost + curCatCost + curDogCost)

dfs(0, 0, 0)
# At this point, the entire problem space has been searched and `minCost` holds the optimal value.

It's arguable whether the seen set in our use case fits the definition of memoization, because no return value is stored in a dictionary (or HashMap in Java). However, it does record the function call arguments, and it cuts off entire branches of our problem space tree exactly as memoization would.
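
For comparison, here is what classic, value-returning memoization could look like on this problem. This is only a sketch under an assumed base case that is not part of the original example: we pretend the single state where all three counters equal maxCost is the only "good" state, so the function can return a definite value, and we let functools.lru_cache store each return value keyed by the argument tuple:

from functools import lru_cache

maxCost = 10

@lru_cache(maxsize=None)  # cache the return value for each (bear, cat, dog) tuple
def best(bear, cat, dog):
    # Assumed base case, purely for illustration: the only "good" state
    # is the one where every counter has reached maxCost.
    if bear == cat == dog == maxCost:
        return bear + cat + dog
    candidates = []
    if bear < maxCost:
        candidates.append(best(bear + 1, cat, dog))
    if cat < maxCost:
        candidates.append(best(bear, cat + 1, dog))
    if dog < maxCost:
        candidates.append(best(bear, cat, dog + 1))
    return min(candidates)

print(best(0, 0, 0))  # 30 -- each of the 1331 states is computed exactly once

Because every one of the 11^3 = 1331 states is computed once and then served from the cache, the exponential tree collapses to polynomial work -- the same pruning effect the seen set achieves, with the return value stored as well.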

Minimize Problem Space with a Heuristic
^

Further optimizations would require more information about the problem. Maybe we could add more conditions beyond curBearCost < maxCost, curCatCost < maxCost, and curDogCost < maxCost. If we did, we would be introducing a very simple heuristic, as sketched below.
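
One heuristic needs no information beyond what's already in the example: along any path the total cost only grows, so once the running total reaches the best minCost found so far, nothing deeper in that branch can improve on it. Below is a minimal sketch of that prune (often called branch and bound); the isGood function is a hypothetical stand-in for someGoodCondition, included only so the sketch runs end to end:

import sys

minCost = sys.maxsize
maxCost = 10

def isGood(bear, cat, dog):
    # Hypothetical stand-in for someGoodCondition, purely so the sketch
    # runs: call a state "good" when all three counters match.
    return bear == cat == dog

def dfs(curBearCost, curCatCost, curDogCost):
    global minCost
    # Heuristic prune: the total cost only grows along a path, so once it
    # reaches the best answer found so far, nothing deeper can improve it.
    if curBearCost + curCatCost + curDogCost >= minCost:
        return
    if isGood(curBearCost, curCatCost, curDogCost):
        minCost = min(minCost, curBearCost + curCatCost + curDogCost)
    if curBearCost < maxCost:
        dfs(curBearCost + 1, curCatCost, curDogCost)
    if curCatCost < maxCost:
        dfs(curBearCost, curCatCost + 1, curDogCost)
    if curDogCost < maxCost:
        dfs(curBearCost, curCatCost, curDogCost + 1)

dfs(0, 0, 0)
print(minCost)  # 0 -- (0, 0, 0) is already "good" here, so every branch is pruned

Note that the good-condition check moved above the recursive calls (the earlier examples check it afterward); updating minCost as early as possible makes the prune strictly more effective.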

Conclusion
^

DFS Backtracking can solve a very broad set of optimization problems, and it's an essential tool in any serious software engineer's toolbox. However, without good heuristics it explores an enormous amount of the problem space unnecessarily, and for some problems the combinatorial complexity is effectively unbounded. Use it judiciously.

Alternatively, A-Star is usually associated with pathfinding on a graph, but it can also be used to find a path to the optimal solution of an optimization problem. It can excel where DFS Backtracking fails due to very large combinatorial complexity.

To be updated with diagrams that depict the problem space tree. This will be very valuable in this post.

Updated: 2024-03-14