---
title: "Bellman-Ford Shortest Path"
format:
  html:
    toc: true
    number-sections: true
    code-line-numbers: true
---

[Pre-class notes](hand_written_notes/BellmanPre.pdf)

[In-class notes](hand_written_notes/Bellman2.pdf)

## Learning Goals

- Describe the Shortest Path problem and its applications
- Design a dynamic programming algorithm for the Shortest Path Problem

## Shortest Path

### Problem Definition

**Input:** A directed graph $G=(V,E)$, $w:E\rightarrow \mathbb{R}$, $s,t\in V$ (we denote $n:=|V|$ and $m:=|E|$), where $G$ has no negative cycles.

**Output:** A path $P$ of directed edges from $s$ to $t$ in $G$ (e.g. $P=((s,u), (u,v), \dots, (r,t))$), such that $L(P)$ is minimized, where
$$
L(P)=\sum_{e\in P}w(e).
$${#eq-pathLength}

**Example:**

```{dot}
//| label: fig-simple
//| fig-cap: "Example input to Shortest Path."
digraph G {
  node [shape=circle, width=0.4];
  nodesep=1.5;
  s -> u [label = "-1"];
  s -> v [label = "3"];
  u -> t [label = "4"];
  v -> t [label = "-2"];
}
```

For the graph in @fig-simple, the shortest path is $P=((s,v),(v,t)).$

### Approaches

We can use three different paradigms to solve this one problem (and there are even more variants than those I list).

1. Greedy $\rightarrow$ Dijkstra's Algorithm.
    * Use when all edges have positive weights.
2. Brute Force $\rightarrow$ Breadth First Search.
    * Use when all edges have weight one.
3. Dynamic Programming $\rightarrow$ Bellman-Ford.
    * Edges can have negative weight.
    * Can take a global or distributed description of $G$ (more on this later).
    * Fails if negative cycles exist and the path must avoid them.

### Applications of the Bellman-Ford Dynamic Programming Algorithm

* Data routing in networks
* Bartering
* Arbitrage: finding inefficiencies (negative cycles) in financial markets and using them repeatedly to make money.

For bartering and arbitrage, please consider who might be the winners and who might be the losers if they were implemented.

## Dynamic Programming Approach

### Defining Subproblems

The most clever thing about this algorithm is how the subproblems are defined. Rather than thinking about the subproblems in terms of smaller graphs, as we did with MWIS on a line, the subproblems are in terms of the number of edges in the path.
$$
P_{u,i} = \textrm{the shortest path from $s$ to $u$ that uses at most $i$ edges}
$${#eq-subproblems}

In @fig-subproblems, what is $L(P_{t,1})$, $L(P_{t,2})$, and $L(P_{t,3})$?

```{dot}
//| label: fig-subproblems
//| fig-cap: "Graph for practicing identifying subproblems."
digraph G {
  node [shape=circle, width=0.4];
  nodesep=1.5;
  s -> u [label = "5"];
  s -> v [label = "1"];
  s -> t [label = "2"];
  u -> t [label = "-2"];
  v -> w [label = "1"];
  w -> t [label = "-1"];
}
```

A) $\infty$, 3, 1
B) $\infty$, 2, 3
C) 0, 2, 3
D) 0, 3, 1

### Developing Recurrence Relations

We first create a recurrence relation for $P_{u,i}$:
$$
\begin{align}
P_{u,i}=
\begin{cases}
\rule{5cm}{0.15mm} &\textrm{ if the shortest path to $u$ using at most $i$ edges passes through $v$ immediately before $u$}\\
\rule{5cm}{0.15mm} &\textrm{ if the shortest path to $u$ using at most $i$ edges passes through $w$ immediately before $u$}\\
\vdots &\\
\rule{5cm}{0.15mm} &\textrm{ if the shortest path to $u$ uses fewer than $i$ edges}
\end{cases}
\end{align}
$${#eq-recurrence}

Next we create a recurrence relation for $L(P_{u,i})$:
$$
\begin{align}
L(P_{u,i})=
\begin{cases}
\rule{5cm}{0.15mm} &\\
\rule{5cm}{0.15mm} &\textrm{ for base cases}\\
\end{cases}
\end{align}
$$

### Writing Pseudocode {#sec-Pseudocode}

Write pseudocode to create the array $A$:

**Input:** An $n\times n$ array $w$ where $w[u,v]=$ weight of edge $(u,v)$ (vertices are given names from $1$ to $n$), and a starting vertex $s\in [n]$. Note $w[u,v]=\infty$ if there is no edge $(u,v)$, and $\forall u\in [n]$, $w[u,u]=0.$

**Output:** ?$\times$? array $A$ such that $A[u,i]=L(P_{u,i}).$

> .
> .
> .

After completing your pseudocode, walk through your code to create the array $A$ for the following graph.

```{dot}
//| label: fig-pseudocode-practice
//| fig-cap: "Graph for testing pseudocode"
digraph G {
  node [shape=circle, width=0.4];
  nodesep=1.5;
  s -> z [label = "4"];
  s -> y [label = "-1"];
  y -> r [label = "2"];
  z -> r [label = "-3"];
  z -> y [label = "2"];
  y -> z [label = "3"];
}
```
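If you want to check your walkthrough, below is one possible sketch of the table-filling loop, written in Python rather than pseudocode. The function name `bellman_ford_table`, the 0-indexed vertex names, and the use of `float('inf')` for missing edges are conventions assumed for this sketch, not part of the problem statement; it uses an $n\times n$ table with columns $i=0,\dots,n-1$, which is one reasonable way to fill in the "?$\times$?" above.

```python
from math import inf

def bellman_ford_table(w, s):
    """Fill the DP table A where A[u][i] = L(P_{u,i}), the length of a
    shortest path from s to u that uses at most i edges.

    w is an n x n list of lists: w[u][v] is the weight of edge (u, v),
    inf if the edge is absent, and w[u][u] = 0.  Vertices are 0-indexed
    here (the notes name them 1 to n).
    """
    n = len(w)
    # Columns i = 0, 1, ..., n-1: with no negative cycles, a shortest
    # path never needs more than n - 1 edges.
    A = [[inf] * n for _ in range(n)]

    # Base case: with 0 edges we can only reach s itself.
    A[s][0] = 0

    for i in range(1, n):
        for u in range(n):
            # Option 1: the shortest path to u already uses fewer than i edges.
            best = A[u][i - 1]
            # Option 2: arrive at u from some vertex v along edge (v, u).
            for v in range(n):
                best = min(best, A[v][i - 1] + w[v][u])
            A[u][i] = best
    return A


# Walkthrough graph from the figure above, with s=0, z=1, y=2, r=3:
w = [[0,   4,   -1,  inf],
     [inf, 0,    2,  -3],
     [inf, 3,    0,   2],
     [inf, inf, inf,  0]]
A = bellman_ford_table(w, s=0)
# A[3][3] == -1: the shortest path to r uses 3 edges, s -> y -> z -> r.
```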
### Analyzing Runtime

What is the runtime of Bellman-Ford?

A) $O(n)$
B) $O(n^2)$
C) $O(n^3)$
D) $O(nm)$ ($m$ is the number of edges in the graph)

### Taking advantage of better data structures

What if, instead of getting an adjacency matrix $w$ as we did in @sec-Pseudocode, we got a different data structure? Notice that in our recurrence relation @eq-recurrence, we only need to consider an option if there is actually an edge from that prior vertex to the current vertex. Suppose we had access to a reverse adjacency list. For the graph in @fig-pseudocode-practice, this would look like:
$$
\begin{align}
W[s]&=\emptyset\\
W[z]&=((s,4), (y,3))\\
W[r]&=((z,-3), (y,2))\\
\vdots
\end{align}
$$

If we have access to this data structure instead of an adjacency matrix, what is the runtime of Bellman-Ford?

A) $O(n)$
B) $O(n^2)$
C) $O(n^3)$
D) $O(nm)$ ($m$ is the number of edges in the graph)

This type of data structure allows for a distributed version of Bellman-Ford. In this model, a vertex $v$ can calculate $A[v,i]$ if it gets passed the values of $A[u,i-1]$ for each $u$ with an edge $(u,v).$ Then, once $v$ has calculated $A[v,i]$, it passes that value along its outgoing edges so that its neighbors can make further calculations.
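To make the runtime difference concrete, here is a sketch of the same table-filling loop driven by a reverse adjacency list. As with the earlier sketch, the Python setting, the 0-indexed vertices, and the representation of $W$ as a list of `(in-neighbor, weight)` pairs are assumptions for illustration. Each round, a vertex $v$ only looks at its incoming edges and only needs the previous-round values $A[u,i-1]$ from its in-neighbors, which is exactly the information the distributed version passes along edges.

```python
from math import inf

def bellman_ford_reverse_adj(W, s):
    """Same DP as before, but W[v] is a list of (u, weight) pairs, one
    per incoming edge (u, v): the reverse adjacency list.

    Each of the n - 1 rounds does O(n + m) work, since every vertex
    scans only its own incoming edges.
    """
    n = len(W)
    A = [[inf] * n for _ in range(n)]
    A[s][0] = 0  # base case: 0 edges reaches only s

    for i in range(1, n):
        for v in range(n):
            # Option 1: keep the best path that uses fewer than i edges.
            best = A[v][i - 1]
            # Option 2: extend a path ending at some in-neighbor u.
            for (u, weight) in W[v]:
                best = min(best, A[u][i - 1] + weight)
            A[v][i] = best
    return A


# Reverse adjacency list for the graph in @fig-pseudocode-practice
# (s=0, z=1, y=2, r=3):
W = [[],                  # nothing points to s
     [(0, 4), (2, 3)],    # into z: from s (4) and y (3)
     [(0, -1), (1, 2)],   # into y: from s (-1) and z (2)
     [(1, -3), (2, 2)]]   # into r: from z (-3) and y (2)
A = bellman_ford_reverse_adj(W, s=0)
```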