Indian National Olympiad in Informatics

Related:
Online Programming Contest, 5-6 March 2005

IARCS home > OLYMPIAD

Advanced Division

Solution to Problem 2: The Bickering Task Force
It is convenient to model this problem as a graph. In general,
any problem that involves a set of entities and relationships
between pairs of them suggests graphs as a possible method to
model and solve the problem.

Let us recall, from the solution to the "The Great Escape
Problem" of October 2004, the basic definitions regarding graphs:

A graph, say G, consists of two sets, a set of vertices or nodes
(V) and a set of edges (E). The set of vertices could be given
finite set.  Each edge is a set consisting of two vertices. We
may interpret the set of edges as describing which pairs of
vertices are "neighbours".

Example: Here is a graph G = (V,E) with 7 vertices and 9 edges:

V = { 1,2,3,4,5,6,7 }

E = { {1,2},{1,3},{2,3},{3,4},{5,6},{5,7},{6,7} }

It is often convenient to draw graphs with boxes or circles to
represent the nodes and lines/curves connecting boxes/circles to
denote edges:

       ---           ---                 ---
      | 1 |---------| 2 |---------------| 7 |
       ---          /---                /---\
        |          /                   /     \
        |   /-----/                   /       \
        |  /                         /         \
       ---/         ---          ---/           \ ---
      | 3 |--------| 4 |--------| 5 |------------| 6 |
       ---          ---          ---              ---

Note that the placement of the boxes or the style of drawing the
edges are irrelevant. The "real" graph is given by the two sets V
and E and the picture is just to help us think.

For the purposes of our problem, the vertices of our graph are
the nobles (1, 2, ... N). Two vertices i and j are connected by a
(undirected edge) if they hate each other.  Thus, the degree of a
vertex i (i.e. the number of edges incident on vertex i) is the
number of nobles that i hates.  For example, in the above graph,
noble 3 hates three other nobles, namely 1, 2 and 4.

We say that a graph G'=(V',E') is a subgraph of a graph G = (V,E)
if V' is a subset of V and E' is a subset of E. For example,

       ---          ---          ---              ---
      | 3 |--------| 4 |        | 5 |------------| 6 |
       ---          ---          ---              ---

is a subgraph of the graph G described above. 

The king wants us to identify a collection of nobles such that
everyone in the collection hates at least K others in the
collection.  In a graph G = (V, E) any subset V_1 of V identifies
a special subgraph G_1 of G containing the elements of V_1 as
vertices with the edge set E_1 containing all the edges in E with
both endpoints in V_1. G_1 is called the "induced subgraph" on
V_1.  The above example of subgrpahs is not an induced subgraph
since the edge connecting vertices 4 and 5 has been omitted. The
induced subgraph on V_1 = { 3, 4, 5, 6 } is

       ---          ---          ---              ---
      | 3 |--------| 4 |--------| 5 |------------| 6 |
       ---          ---          ---              ---

The degree of a vertex v in G_1 is the number of elements in V_1
that are neighbours of v in G (or equivalently G_1).  Thus, our
task is to identify the largest subset V_1 such that in the
induced subgraph on V_1 every vertex has degree at least K.  Let
us call a graph to be K-good if every vertex in the graph has
degree at least K.

The naive idea would be to consider each subset of V and such an 
algorithm would take time proportional to 2^N. We can do much better.

Suppose, every noble hated at least K other nobles, then the
minister could simply appoint every noble to the
committee. Otherwise, there is a nonempty set D_0 containing all
the nobles who hate fewer than K nobles each.  Clearly they
cannot be part of any valid committee that the minister will
constitute. So we need to drop them and get V_1 = V - D_0 and he
might as well restrict his attention to the nobles in this set in
trying to constitute his committee.

If every noble in V_1 hates at least K other nobles in V_1 then
the minister may choose the set V_1 to be his committee.
Otherwise, the subset D_1 of V_1, consisting of all the nobles in
V_1 which hate fewer than K nobles in G_1 is not empty.  Clearly
any member of D_1 cannot belong to any committee (since no member
of D_0 can, and once the members of D_0 are not in the committee
no member of D_1 can have at least K members he hates in the
committee). Thus, the minister should drop all the nobles in D_1
from V_1 and focus on the set of nobles V_2 = V_1 - D_1 to
constitute his committee. Continuing in this manner will lead us
to our solution.

In other words, we come up with two sequences of sets of vertices 

    V = V_0, V_1, ... V_k and D_0, D_1, ... D_k

such that 

   1) D_i is the set of vertices in G_i (the induced subgraph on
      V_i) whose degree is less than K and
    
   2)  V_{i+1} is V_i - D_i.

Eventually we will find a k with D_k being the empty set (since
there are only finitely many elements in V_0). Then, every vertex
in V_k has at least K neighbours in G_k (i.e.) G_k is a K-good
graph.  (Thus, the nobles corresponding to the vertices in V_k
can be picked to form a committee meeting the requirement.)

Is this the largest set with this property?

We first show, by induction on j, that if v is in D_j then it
cannot appear in any K-good induced subgraph of G.

No member of D_0 can appear in any K-good induced subgraph of G_0
by definition. Suppose this is true for D_0, ... D_i.

We exclude all the vertices in

        D_0 U ... U D_i

from V to get V_{i+1} and by the definition of D_{i+1} there is
no K-good induced subgraph of G_{i+1} that contains any vertex
from D_{i+1}. Any K-good induced subgraph of G is also an induced
subgraph of G_{i+1} (by the induction hypothesis) and this means
that it cannot contain any vertex from D_{i+1}.

Thus, any K-good induced subgraph of G is also a (K-good) induced
subgraph of G_k. But G_k itself is a K-good subgraph and thus it
is the largest K-good subgraph of G.

How efficiently can we compute G_k ?

The direct method is to compute G_0, G_1, ... G_k. Computing
G_{i+1} from G_i involves deleting all the vertices whose degree
is less than K from G_i.  In time proportional to E one can
determine the degree of each vertex in G_0. In time proportional
to N one can determine the vertices to be deleted and update the
degrees of the undeleted vertices in time proportional to E. Thus
the computation of the entire sequence takes time proportional to
E*N (since k is bounded by N).

However, we now show that the computation of G_k can be done in
time proportional to E. The idea is to maintain a queue where all
the vertices that have been identifed as "to be deleted" are
placed before they are actually "deleted". Initially the queue is
initialized with all the vertices of G whose degree is below
K. In each iteration a vertex is picked up from the queue, it is
marked as deleted and the degree of all its neighbours is reduced
by 1. If this results in the degree of any of its neighbours
dropping below K, then add that vertex to the queue. The
procedure ends when the queue is empty. The undeleted vertices
will constitute V_k. Here are the details:

-----------------------------------------------------------------------

        /* Determine the degree of all the vertices. If the graph
        is stored as an adjacency matrix, then this will take time
        proportional to N*N, since counting the number of neighbours
        will need a scan of a row of the matrix. If the graph is
        stored as an adjacency list, then it takes time proportional
        to E. */

        for i=1 to N {
                Deg[i] = number of neighbours of i.
                }

        /* Head and tail are set to ensure that the Q is emtpy */

        H = 1;  
        T = 0; 

        /* Initialise queue with all vertices with degree < K */

        for i=1 to N {
                if (Deg[i] < K) {
                        T=T+1;
                        Q[T]=i;
                        InQ[i] = True;  /* Remember that i is in the Q*/
                        }
                }

        While (H ≤ T) {   /* while the queue is not empty */
                v = Q[H];
                H = H+1;  /* delete v from the Queue */

                /* Move v from Q to deleted status */

                InQ[v] = False; /* v is removed from Q*/
                Deleted[v] = True; /* Mark v as deleted */

                for each neighbour w of v {
                        /* For as yet unqueued/undeleted neighbours*/
                        if ((InQ[w]=False) and (Deleted[w]=False)) {
                                Deg[w]=Deg[w]-1;   /* reduce degree */
                                /* Move it to Q if necessary */
                                if (Deg[w] < K) {
                                        T=T+1;
                                        Q[T]=w;
                                        InQ[w]=True;
                                  }
                          }
                  }
          }

          Print the number of undeleted vertices

---------------------------------------------------------------------------

Notice that each vertex enters and leaves the queue at most once.
Thus, the body of the while loop is executed at most once for
each vertex.  Moreover, the inner for loop takes at most time
proportional to N (if the graph is stored as an adjacency matrix
and hence requires a scan of the row).  Thus, this algorithm
takes time bounded by N*N.

We now show that if the graph is stored using what are called
adjacency lists then this algorithm would run in time
proportional to |E|.

In the adjacency matrix representation, it takes time
proportional to N to examine all the neighbours of a given vertex
v. It turns out that there is an alternative representation that
allows us to examine the neighbours of a vertex v in time
proportional to the degree of v.  This turns out to be extremely
useful in many algorithms.

What is an adjacency list?

Let the vertices of the graph be 1 ... N.  For each vertex i, we
maintain an array Ai whose elements are the neighbours of the
vertex i. In effect, we maintain a two dimensional array A[][]
where the elements of the ith row are the neighbours of i.
Observe that not all rows will have equal number of elements. To
keep track of the number of elements in each row, we maintain a
array D where D[i] is the degree of i (i.e. the number of
neighbours of i).

For example, the adjacency lists corresponding the example graph
given at the beginning of this section is the follwoing:

      A   1   2   3   4   5   6                 i    D[i]
          ---------------------                 ---------
     1 |  2   3                                  1     2
     2 |  2   7   1                              2     3
     3 |  1   2   4                              3     3
     4 |  3   5                                  4     2
     5 |  4   6   7                              5     3
     6 |  5   7                                  6     2
     7 |  2   5   6                              7     3

Notice that the number ``used'' entries in the matrix A is just
two times the number of edges in the graph. However, when we use
a 2 dimensional matrix to store the adjacency lists, we need to
define A as a N x N-1 matrix as it is possible that some vertex
may have upto N-1 neighbours. (It is possible to use pointers and
store these much more efficiently. However, we shall not concern
ourselves with pointers in this section.)

Once a graph is stored in the adjacency list form, in order to
process the neighbours of a vertex v it suffices to do

        for i=1 to D[v] 
                process(A[v][i])

which clearly takes time proportional to D[v].

However note that if we need to determine whether there is an
edge between v and a vertex w then we need to scan the adjacency
lists of v (or w) to see if w (or v) appears there. Thus,
checking whether an edge exists takes upto N steps if we use the
adjacency lists representation. On the other hand, in the
adjacency matrix representation this can be done in one step by
examining the A[v][w] entry. So, the choice of representation
should depend on the needs of the algorithm.

We now modify the above algorithm to work with adjacency
lists. The queue stored the ``candidates'' for deletion as
usual. The array Deg stores the number of neighbours of a vertex
that have not been marked as deleted.

-----------------------------------------------------------------------

        /* We assume that the graph is stored in the adjacency
        lists notation using arrays A[][] and D[]. */

        /* Deg will store the number of neighbours of i that have not
           been marked as deleted. Initial Deg[i] = D[i]  */

        for i=1 to N {
                Deg[i] =  D[i]
                }

        /* Head and tail are set to ensure that the Q is emtpy */

        H = 1;  
        T = 0; 

        /* Initialise queue with all vertices with degree < K */

        for i=1 to N {
                if (Deg[i] < K) {
                        T=T+1;
                        Q[T]=i;
                        InQ[i] = True;  /* Remember that i is in the Q*/
                        }
                }

        While (H ≤ T) {   /* while the queue is not empty */
                v = Q[H];
                H = H+1;  /* delete v from the Queue */

                /* Move v from Q to deleted status */

                InQ[v] = False; /* v is removed from Q*/
                Deleted[v] = True; /* Mark v as deleted */

                /* for each neighbour of v in G do */
                for i=1 to D[v]  do { 

                        w = A[v][i];

                        /* If it is still unqueued and not 
                            marked as deleted */

                        if ((InQ[w]=False) and (Deleted[w]=False)) {
                                Deg[w]=Deg[w]-1;   /* reduce degree */
                                /* Move it to Q if necessary */
                                if (Deg[w] < K) {
                                        T=T+1;
                                        Q[T]=w;
                                        InQ[w]=True;
                                  }
                          }
                  }
          }

          Print the number of undeleted vertices

---------------------------------------------------------------------------

As observed earlier the outer while loop executes at the most N
times as in each iteration at least one new vertex is marked as
deleted.  In each iteration, of the while loop we examine the
neighbours of exactly one vertex, the newly deleted vertex, in
the inner for loop. Each vertex is deleted at the most once. Thus
across all the iterations of the while loop the inner for loop is
executed at the most D[1] + D[2]+ ...  D[N] times. But this is
just 2*|E|. Thus, this algorithm runs in time proportional to
|E|. Notice that if E has say 10 * N elements this is a
considerable improvement over N*N. Thus, if there are not too
many edges in the graph the algorithm using adjacency lists does
significantly better. However, if the graph is complete, then it
has of the order of N*(N-1)/2 edges and then its performance is
not significantly better than that of the one using adjacency
matrix.

Adjacency lists can be used to improve the running time of the
DFS and BFS algorithms describe earlier to run in time
proportional to |E| instead of N*N. We encourage the reader to
work out the details of these modifications and convince
themselves that the modified algorithms do run in time
proportional to |E|. We shall be using these facts in some of the
algorithms that follow.