Indian Computing Olympiad

Training Material


Dynamic Programming

Other classical problems


Longest common subword

Suppose we have two words made up from some letters, such as

  • V = T G A T G G A T C
  • W = G T T T G C A C

We want to find the longest contiguous sequence of letters (subword) that is contained in both words. In this case, the longest common subword is "T G".

Solution

  • LCSW[i,j] = length of longest common subword starting at position i in V and position j in W
  • LCSW[i,j] = 0, if V[i] != W[j]
                    = 1 + LCSW[i+1,j+1], otherwise
  • Base case:
    • LCSW[M,j] = 1 if V[N] = W[j], 0 otherwise
    • LCSW[i,N] = 1 if V[i] = W[N], 0 otherwise

We calculate B[i,j] in an M x N array. We know the values in the last row and last column from the base case. Once we know the value at (i+1,j+1), we can compute the value at (i,j). So, we can systematically fill up the values for B[i,j] row by row, right to left, starting from the bottom (or, equivalently, column by column, bottom to top, starting from the right).

Longest common subsequence

As before, we have two words over some letters, such as

  • V = T G A T G G A T C
  • W = G T T T G C A C

Now we can drop some positions and obtain a subsequence. For instance "T G A C" is a subsequence of both words. Our aim is to find the length of the longest common subsequence.

Solution

  • LCSS[i,j] = length of longest common subsequence in the words V[i] V[i+1] ... V[M] and W[j] W[j+1] ... W[N]
  • LCSS[i,j] = LCSS[i+1,j+1] + 1, if V[i] = W[j]
                    = max (LCSS[i+1,j], LCSS[i,j+1]), otherwise

Once again, we can fill up the values LCSS[i,j] in an M x N array, starting with the last row and column as base cases.

  • LCSS[M,j] = 1 if V[M] appears anywhere in W[j] ... W[N]

Fill the last row from right to left and once we fill a 1, all entries to the left are 1.

Similarly

  • LCSS[j,N] = 1 if W[N] appears anywhere in V[j] ... V[N]

Fill the last column from bottom to top and once we fill a 1, all entries above are 1.

Exercise

The recurrences for LCSW and LCSS only tell us the lengths of the longest common subword and subsequence, respectively. How do we recover a witness — that is, a common subword or subsequence of this maximum length?

For subword it is easy — since we know the position where the longest common subword starts, we can read off the subword.

How do we do it for longest common subsequence?

Longest ascending subsequence

Let s = a[1] a[2] ... a[n] be an unsorted sequence of numbers. A subsequence of s is one in which we drop some elements and retain the rest (so we need not pick up a contiguous segment, but we preserve the order of elements).

For instance, a[1] a[3] a[n] is a subsequence.

Problem: Find the (length of the) longest ascending subsequence.

Solution 1

Sort the array and find the longest common subsequence between sorted sequence and original sequence.

This takes time O(n2) (sorting takes time O(n log n) and longest common subsequence takes O(n2).

Solution 2

A direct dynamic programming solution for this problem.

  • LAS[i] : length of longest ascending subsequence in that ends with a[i] (and hence only contains elements from a[1] a[2] ... a[i])
  • LAS[1] = 1
  • LAS[i+1] = 1 + max { LAS[j] | 1 <= j <= i, a[j] <= a[i+1] }

    i.e. to find the length of the longest ascending subsequence ending at a[i+1], look at all values to the left of a[i+1] that are less than or equal to a[i+1] and pick the position among these with the maximum LAS value.

This can be computed in an array. This takes time O(n2) — for each LAS[i], we have to scan values from LAS[1] to LAS[i-1].

This solution is no better than our initial solution of sorting the sequence and then computing the longest common subsequence.

We can improve the second solution to solve the problem in O(n log n) by maintaining an additional array M.

  • M[j] = of all the longest ascending subsequences of length j, M[i] is the smallest final value

    M[j] = min { a[k] | 1 <= k <= j, LAS[k] = j }

Observe that M[j]'s always form an ascending sequence. So, we can find M[j-1] <= a[i] < M[j] using binary search.

Adding a[i] requires log(i) time. Overall we make n updates, so the overall algorithm is O(n log n).

An example:

	i       1    2    3    4    5    6    7    8

	a[i]    8    2    7    3    9    4    6    5

	LAS[i]  1    1    2    2    3    3    4    4

	M[i]    8X   7X   9X   6X
	        2    3    4    5
      

Maximum sum subsection

Given a sequence of integers, a subsection is a block of contiguous integers in the sequence. The task is to find the subsection whose sum is maximum.

For instance, here is a sequence and a subsection marked out whose sum is -1.

        3 -1  2 -1  2 -3  4 -2  1  2  1
          |<------------>|
              sum = -1
      

Find an O(N) algorithm for this problem.

Solution

  • Best(i) = Maximum sum segment ending at i
  • Best(i) = max { A[i], // Best segment starts here
                              A[i] + Best(i-1) // Best segment extends previous
                          }
  • Best(1) = A[1]

This assumes subsections are of length > 0. If zero length segments are permitted, we also have to add a term 0 to the max term when computing Best(i). (For instance, if all values are negative, the maximum sum subsection would then be the empty subsection with sum 0.)

Video material

Video lecture on Dynamic programming, classic problems, NPTEL online course on Design and Analysis of Algorithms