Training Material

## Heaps

### Dijkstra's algorithm with heaps

Recall Dijkstra's algorithm. We initially set:

• ExpectedBurnTime = 0;
• ExpectedBurnTime[i] = infinity, for i other than 1
(Use a sufficiently large value, say the sum of all the weights in the graph)
• Burnt[i] = 0, for all i

In each round:

• Among all unburnt vertices, pick j with minimum ExpectedBurnTime. (Initially, this is 1 since Burnt[i] = 0 for all i and only ExpectedBurnTime is not infinity
• Set Burnt[j] = 1.
• Update ExpectedBurnTime[k] for each neighbour k of j, if necessary.

The number of rounds is N, the number of vertices, since we pick a vertex and process it in each round.

If we use adjacency matrix, each round takes time N because

• (a) it takes time N to find the minimum unburnt value
• (b) it takes time N to scan all neighbours

We can fix the complexity of (b) by using an adjacency list instead of an adjacency matrix.

To fix (a) we keep the values of the form (v,ExpectedBurnTime) of unburnt vertices in a heap. This allows us to find the minimum unburnt vertex in log n time.

Now, we still have to update the ExpectedBurnTime of all its neighbours. Suppose w is an unburnt neighbour of v. We know we can update the ExpectedBurnTimed of w in log n time, but how do we find w in the heap? If we scan the array searching for a value (w,i), it will again take time N to update values.

To get around this, we maintain extra information telling us where each vertex sits in the heap. Maintain an array Index such that Index[v] = i means that the value for vertex v is at position i in the heap. Note that we have to update entries in Index each time we swap values while inserting, deleting or modifying values in the heap.

What is the complexity?

Naively, a vertex could have N-1 neighbours, so each round can take N log N time.

n

However, let us count more carefully. How many times will the value of vertex w be updated? Once for each of its neighbours. In other words, the value of w will be updated degree(w) times. If we sum up the degrees of all the nodes in the graph, we get 2*M where M is the number of edges in the graph (each edge is counted twice, once at each end point).

So, overall, we make M updates, each taking log N time, so this version takes 2 M log N + N log N = M log N

M could be N2 in which case M log N is N2 log N, which is worse than the complexity of the original implementation, so we should use the heap version only when M is small compared to N2.

### Video material

Video lecture on Heapsort and updating values in a heap, NPTEL online course on Design and Analysis of Algorithms (Lecture slides)