Indian Computing Olympiad

Training Material


Range queries

Range queries with interval updates and point queries

So far, we have considered an array in which we make updates at single positions and ask questions about intervals. What if we have, instead, interval updates and point queries?

Paint (introductory version)

A painter paints a strip N squares wide

        ----------------------
        | 1 | 2 | 3 | .... | N |
        ----------------------
      

In one step, the painter paints a continuous strip from square i to square j, which we denote [i,j]. Periodically, we want to know the number of layers of paint that have been applied at square p.

Assume we have a sequence of instructions of the form:

	  paint([i1,j1])
	  paint([i2,j2])
	  .
	  .
	  paint([ik,jk])
	  layers(p)
	

There are K paint operations and M queries.

Naive solution:

Initialize layers(p) = 0 for all 1 ≤ p ≤ N.

On each paint([i,j]), increment layers(p) for each i ≤ p ≤ j

In the worst case, this could take O(N) time for each paint operation. A query can be answered in constant time. Thus, this solution takes time O(KN).

Better solution:

Maintain a segment tree, but what do we store at each node?

Recall that each node represents an interval. For N = 8, here are the intervals spanned by the nodes in the tree.

                              [1,8]
                                |
                 _______________|________________
                |                                |
              [1,4]                            [5,8]
         _______|_______                _________|________
        |               |              |                  |
      [1,2]           [3,4]          [5,6]              [7,8]
     ___|___         ___|___        ___|___            ___|___
    |       |       |       |      |       |          |       |
  [1,1]   [2,2]   [3,3]   [4,4]  [5,5]   [6,6]      [7,7]   [8,8]
      

At each vertex v, record

coats(v) = number of paint strokes that contain the entire interval Interval(v), but do not contain the entire interval Interval(parent(v)) spanned by the parent of v.

How do we update this?

Consider an operation paint([i,j]):

If [i,j] = [1,N], update coats of the root node and stop. For every other node, Interval(v) is contained in [i,j] but so is Interval(parent(v)), so coats(v) is unchanged.

If [i,j] is not equal to [1,N], break up the interval according to the children.

For instance, in the example above, suppose [i,j] = [2,5]. This does not span [1,8] so we split [2,5] as [2,4] and [5,5], send [2,4] down the left tree and [5,5] down the right tree.

We repeat this recursively at these two children of the root using the same rule. Thus, on the left child of the root, since [2,4] does not span [1,4], we further split it as [2,2] and [3,4] and send [2,2] left and [3,4] right. [3,4] stops, but [2,2] goes down further to the leaf labelled [2,2].

We update the coats(v) for every v where the (sub)interval of [i,j] that reaches v matches Interval(v).

As before, we can argue that each update takes time log N. Thus, over K paint operations, updating the entries takes O(K log N) time (as opposed to O(KN) time in the naive solution).

How do we answer a query of the form layers(p)?

We just have to add up coats(v) for all nodes on the path from the root to [p,p]. How do we do this efficiently? Recall that we have stored the tree in an array of size 2N-1. The leaf nodes are in positions N to 2N-1. Thus, the leaf node corresponding to the interval [P,P] is at position (N-1)+P. We can start here and walk up to the parent till we reach the root.

Alternatively, we can start at the root and walk down the path to [P,P] by checking the current interval and deciding whether to go left or right at each step.

This requires O(log N) steps for each query, or O(M log N) steps for M queries.

Thus, overall, this takes O((K+M) log N) steps to update information and answer queries. If K,M and N are roughly equal, this takes O(N log N) time whereas the naive solution takes O(N^2) time.