Topological Sort
December 31, 2025
Overview
Topological sort orders the nodes of a directed graph so that for every edge u → v, node u appears before node v. It’s the canonical solution for prerequisite-driven problems: build systems, dependency graphs, task pipelines, and course schedules.
Topological sort only applies to Directed Acyclic Graphs (DAGs). If a cycle exists, no valid ordering is possible.
Table of Contents
When You Need Topological Sort
- Build / compilation order
- Task scheduling and workflows
- Course prerequisites
- ETL pipelines and DAG schedulers
- Import/module dependency resolution
- Spreadsheet recalculation order
Key Nuances (That Often Trip People Up)
- Order is not unique — many valid answers may exist
- Cycle detection is mandatory in real systems
- Disconnected graphs are valid — results interleave components
- Edge direction matters —
u → vmeans u must come first - Adjacency lists are almost always the right representation
Two Standard Algorithms
1) Kahn’s Algorithm (In-degree / BFS-style)
- Compute in-degree for every node
- Enqueue all nodes with in-degree 0
- Pop nodes, reduce neighbors’ in-degree
- If nodes remain unprocessed → cycle exists
from collections import deque, defaultdict
def topo_sort_kahn(nodes, edges):
adj = defaultdict(list)
indeg = {n: 0 for n in nodes}
for u, v in edges:
adj[u].append(v)
indeg[v] += 1
q = deque([n for n in nodes if indeg[n] == 0])
order = []
while q:
u = q.popleft()
order.append(u)
for v in adj[u]:
indeg[v] -= 1
if indeg[v] == 0:
q.append(v)
if len(order) != len(nodes):
raise ValueError("Graph has a cycle")
return orderNuance: Use a priority queue for deterministic output.
2) DFS Postorder (Stack-style)
Idea
- DFS the graph
- Add nodes to output after exploring children
- Reverse at the end
- Detect cycles via recursion stack
from collections import defaultdict
def topo_sort_dfs(nodes, edges):
adj = defaultdict(list)
for u, v in edges:
adj[u].append(v)
visiting = set()
visited = set()
order = []
def dfs(u):
if u in visiting:
raise ValueError("Cycle detected")
if u in visited:
return
visiting.add(u)
for v in adj[u]:
dfs(v)
visiting.remove(u)
visited.add(u)
order.append(u)
for n in nodes:
if n not in visited:
dfs(n)
order.reverse()
return orderKahn vs DFS: Trade-offs
| Aspect | Kahn’s Algorithm | DFS Postorder |
|---|---|---|
| Cycle detection | Remaining nodes after processing | Back-edge detection |
| Determinism | Easy with heap | Depends on traversal order |
| Scheduling model | Natural “ready queue” | Less explicit |
Complexity
- Time:
O(V + E) - Space:
O(V + E)
Handling Non-Unique Orders (Determinism)
- Use a min-heap with Kahn’s algorithm
- Sort adjacency lists for DFS
Trade-off: determinism often costs log V.
Common Pitfalls (and Fixes)
| Pitfall | Fix |
|---|---|
| Isolated nodes missing | Initialize all nodes explicitly |
| Wrong edge direction | Use prereq → dependent |
| Silent cycles | Validate full traversal |
Practical Example: Course Schedule
Valid outputs are often multiple, and that’s expected.
Beyond Basics: Scheduling With Parallelism
Nodes with in-degree 0 can run in parallel. Kahn’s algorithm directly exposes this.
Alternatives and Related Concepts
- Cycle detection only → SCC algorithms
- Scheduling with durations → longest path in DAG
- Priority ordering → weighted topological sort
Conclusion
Topological sort is foundational for dependency-driven systems. Understanding cycle handling, non-uniqueness, and algorithm trade-offs is what turns it from an interview topic into a production-grade tool.
Key Takeaways
- Only DAGs can be topologically sorted
- Ordering is often not unique
- Kahn’s algorithm fits scheduling pipelines
- DFS fits graph-analysis workflows

