Solution to Problem 1: Average
A simple solution that takes time proportional to N*N*N is quite direct: ----------------------------------------------------------------- count=1; for i=1 to N { Check if there are j and k such that a_i is the average of a_j and a_k if so increment count } print count ----------------------------------------------------------------- The check inside the loop will take roughly N*N steps (examine each pair and see if a_i is the average of the pair). Thus the overall time taken is proportional to N*N*N. This solution would get 30% of the marks. However, we can do significantly better. We can find whether a given element is an average element in roughly N steps instead of the naive N*N approach. Our first observation leading to this algorithm is the following: if a_i is the average of a_j and a_k then one of a_j or a_k must be less than or equal to a_i and the other must be greater than or equal to a_i. Suppose the given sequence is sorted in ascending order, i.e., a_1 ≤ a_2 ≤ a_3 ... a_N. Let us further assume that the numbers are distinct (We leave the extension of the solution to the case where there are repetitions as an exercise.) Thus a_1 < a_2 < a_3 ... a_N. To check if a_i is an average element, we first consider a_{i-1} and a_{i+1}. If Average(a_{i-1},a_{i+1}) is a_i we are done. Otherwise, Average(a_{i-1},a_{i+1}) > a_i or Average(a_{i-1},a_{i+1}) < a_i. Case 1: Average(a_{i-1},a_{i+1}) > a_i Then, a_i cannot be the average of a_{i-1} and any other element --- the average of a_{i-1} and any element a_j with j < i is less then a_i and the average of a_{i-1} and any element a_j with j > i is more than a_i. Thus, we don't need to consider a_{i-1} anymore. Case 2: Average(a_{i-1},a_{i+1}) < a_i. Then, a_i cannot be the average of a_{i+1} and any other element --- the average of a_{i+1} and any element a_j with j > i is more then a_i and the average of a_{i+1} and any element a_j with j < i is less than or equal to Average(a_{i-1},a_{i+1}) < a_i. Thus, we don't have to consider a_{i+1} anymore. This is nice, since we seem to discard a number of pairs after just one comparison. Can we use this to get a algorithm that takes N steps (rather than N*N)? The more general property, whose proof is identical to the proofs given above, is the following: In general, let l < i and u > i. Suppose Average{a_l,a_u} > a_i. - (P1) Then, a_i cannot be the average of a_l and any a_j with j ≥ u. - (P2) Similarly, if Average{a_l,a_u} < a_i then, a_i cannot be the average of a_u and any a_j with j ≤ l. This leads us directly to the efficient algorithm. Our strategy is to maintain two indices l and u, such that 1) l < i and u > i 2) a_i is not the average of any pair involving any a_j with l < j < u. To begin with we initialize l to i-1 and u to i+1. (Recall that our list has no repetitions and is sorted in ascending order and so 2 is also satisfied.) If Average{a_l,a_u} is a_i, we are done. If Average{a_l,a_u} > a_i then we decrement l by 1. Clearly 1 still holds. By 2, we already know that for any j with l < j < u, a_j cannot appear in any pair whose average is a_i. Further, by (P1) we know that a_i cannot be the average of a_l with any a_j with j ≥ i. And clearly a_i cannot be the average of a_l with any j<l. Thus a_i cannot be the average of a_l and any a_j. Thus decrementing l does not violate 2. If Average{a_l,a_u} < a_i, we increment u by 1. A similar argument will verify that l and u continue to satisfy 1 and 2. Of course we stop and conclude that a_i is not an average element if either l reaches 0 or u reaches N+1. Here is the algorithm: --------------------------------------------------------------------- count = 0; for i = 1 to N { l = i-1 u = i+1 while ((l > 0) and (u ≤ N)) { if (Average(a_l,a_u) = a_i) { count = count + 1 break } if (Average(a_l,a_u) < a_i) u = u+1 else l = l-1 } } print count ----------------------------------------------------------------------- In each iteration of the while loop either l is reduced by 1 or u is increased by 1. Thus, the while loop can iterate at the most N times before it finishes. Thus, the entire algorithm takes roughly N*N steps. Well, but we assumed that a_1 < a_2 < ... a_N. As long as the numbers are distinct, we can arrange this by sorting the array. As we know from earlier problems, this can be done easily in N*N steps and in N*log(N) steps using clever algorithms. Either way, the overall time takes time proportional to N*N (since N*log(N) is much less than N*N). If the elements are not distinct then, after sorting our array would look like a_1 ≤ a_2 ≤ ... ≤ a_N. The above algorithm can be modified slightly to work in this case too and this is left as an exercise. You are strongly encouraged to solve this exercise as it illustrates a crucial fact --- neat ideas take you close to the solution, but to solve something completely, you have to be willing to dirty your hands a bit. Notice that the only property of "Average" that is used in the algorithm above is that if x ≤ y and w ≤ z then Average(x,w) ≤ Average(y,w). This property is called monotonicty. If we were asked to compute the number of elements that are sums of two other elements, or that are sums of squares of 2 other elements ... any such properties that satisfies monotonicity can be solved as above in time proportional to N*N rather than N*N*N. (And if we asked to find one such element we can do it in time N instead of the N*N that the naive algorithm would take.)