Demystifying Derandomization

Report on a Workshop at TIFR

February 25 -- March 6, 1999

Devdatt P. Dubhashi

Department of Computer Science and Engineering,
Indian Institute of Technology, Delhi,
Hauz Khas, New Delhi 110016, INDIA.

E-mail: dubhashi@cse.iitd.ernet.in

``I think I'll stop here.''

The scene was Auditorium AG80, TIFR Mumbai, not the Newton Institute, Cambridge; the date was March 5, 1999, not June 23, 1993; and the speaker was Manindra Agrawal, not Andrew Wiles. But the atmosphere was similar, building up to a dramatic climax. Initially misled by the innocent title ``Randomness and Algebra'', scheduled in the afternoon session when the audience is normally just recovering from lunch and siesta, we soon realised that something special was afoot. Manindra was onto a new primality test, and moreover one reached via another classical algebraic problem: testing polynomial identity. We sat transfixed through some magical transmutations as cyclotomic polynomials devoured others and Chinese remainders did the cleaning up.

After the talk, we huddled together in groups trying to understand what had just hit us. Somebody announced, \`a la Fermat, a bug in the proof. Would Manindra, like Andrew Wiles, have to go through the agony of retraction? Happily, the similarity ended there: the ``bug'' turned out to be something too embarrassing to detail here. After many rounds of masala chai (made into a feature by our Roman participant and sucker for the Orient, Alessandro Panconesi) and ruthless application of the ``Vinay Doctrine'' -- ``Strip away the details!'' (footnote - since repudiated by its creator who, in a classic turnaround, now swears that ``God dwells in the details.'') -- we finally understood it better.

After tendering the obligatory apologies, I nevertheless insist that this accurately captures the spirit of the workshop on ``Derandomization'' organised by Jaikumar Radhakrishnan at TIFR Mumbai between February 25 and March 6, 1999. While the rest of the country was indulging in the annual orgy of colours, we were indulging in one of our own flavour, blending colours of Mathematics and Computer Science.

The Many Facets of Derandomization

``Derandomization'' means, of course, literally removing the randomness. Why and How? That was the central theme of the workshop. The origins lie, I believe, in some work of (who else?!) the Old Master, Paul Erd\"os and Joel Spencer. But the impetus to the subject from a Computer Science viewpoint derives, I believe, from a paper of Mike Luby (and it was therefore quite appropriate to start off the workshop with that paper). In it, he first designed a randomized (parallel) algorithm to find a maximal independent set in a graph. Then he made the beautiful observation that the analysis of the algorithm used such weak properties of the randomness involved that it could be eliminated altogether -- the algorithm could be derandomized! This realization that ``The Probabilistic Construction of Deterministic Algorithms'' (the title of a paper by Prabhakar Raghavan) via derandomization offered a systematic and highly successful methodology for algorithm design sparked off a flood of research in the area. Complexity theorists then asked about more general and less problem--specific versions of the method -- could one derandomize entire complexity classes? Today we have a sophisticated toolkit for derandomization, both algorithmic and complexity--theoretic, at our disposal. The aim of the workshop, as conceived by Jaikumar, was to carry out a comprehensive review of the most important of these concepts and techniques and to keep pace with the latest developments in the area.

Readers who have followed previous reports by the author will be only too aware of his free--wheeling armchair travel guide style and know what to expect below. To my previous defences, I would only like to add that Jaikumar is planning to create a web page for the workshop, where all the material of the workshop, diligently scribed in full detail by hapless student ``volunteers'', will be placed, and a hard copy of the proceedings will also be generated from it.

One of the difficulties I faced in writing this report (other than the customary difficulties with deadlines and failing memory) was the format: should I present it in temporal order, with tutorials supposedly coming first and advanced material later? But often the later material was tutorial and the early material advanced! Or should one separate by topic? But then is pseudorandomness an algorithmic topic or a complexity--theoretic one? My eye was caught by a (now famous) tutorial by Vinay from a past National Seminar where he experimented with a different, non--linear style of presentation. It struck me that if I were giving this report on the board, I would have liked to adopt a similar non--linear style -- it would reflect much better the interconnections between seemingly different topics. Unfortunately, my skills with psfig or other such gadgetry are much too scant to permit it here.

Algorithms and Data Structures

Sundar Viswanathan got the workshop off to a great start with a crisp talk on the mother of it all -- Luby's maximal independent set algorithm. This is a deceptively simple algorithm, for its analysis is quite subtle. Indeed the obvious extension to hypergraphs is still a compelling open problem! Anyway, the analysis of the algorithm set the stage for the whole workshop since it demonstrated that only pairwise independence (as against full independence) sufficed. Luby's original paper makes one wonder if he himself understood the beauty of the analysis! But there are now some much nicer accounts, for instance the one in Dexter Kozen's book The Design and Analysis of Algorithms.
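
To fix ideas, here is my own minimal sketch in Python of the randomized algorithm (not the version presented in the talk); the derandomized variant replaces the independent coin flips below by a point drawn from a small pairwise independent sample space.

    import random

    def luby_mis(adj):
        """Randomized maximal independent set in the spirit of Luby's
        algorithm.  adj: dict mapping each vertex to a set of neighbours
        (undirected graph, comparable vertex labels)."""
        adj = {v: set(ns) for v, ns in adj.items()}
        mis = set()
        while adj:
            marked = set()
            for v, ns in adj.items():
                # Isolated vertices join directly; others mark themselves
                # with probability 1/(2 * degree).
                if not ns or random.random() < 1.0 / (2 * len(ns)):
                    marked.add(v)
            # Conflict resolution: of two adjacent marked vertices, the
            # one with lower degree (ties broken by label) is dropped.
            selected = set(marked)
            for v in marked:
                for u in adj[v] & marked:
                    if (len(adj[v]), v) < (len(adj[u]), u):
                        selected.discard(v)
            mis |= selected
            # Remove the selected vertices together with their neighbours.
            removed = set(selected)
            for v in selected:
                removed |= adj[v]
            adj = {v: ns - removed for v, ns in adj.items() if v not in removed}
        return mis

The point of the analysis is that the expected progress per round can be established using only pairwise independence of the marking events, which is what makes the coin flips dispensable.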

K.V. Subrahmanyam then gave a series of talks on the constructions of small sample spaces that satisfy these reduced independence requirements: ``KV speaks k--wise on the Kolaba Koast'', as Ravi quipped. K.V. gave a fairly comprehensive account, starting with the constructions converting linear independence (in the linear algebra sense) of columns of generator matrices of codes into stochastic independence, and going on to fairly sophisticated (yet easily described) constructions based on Weil estimates of character sums. I wonder if there is one place where one can find a nice account of all these constructions; if not, the scribes assigned to this would do a very worthy service to us all: these are fundamental tools that are used repeatedly in the derandomization arena.
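
For the reader who has not seen such a construction, here is the textbook example (my own illustration in Python; I do not claim it is one of the specific constructions covered in the talks):

    def pairwise_independent_points(p):
        """The classic pairwise independent sample space over Z_p, p prime:
        each seed (a, b) yields the point (a*i + b mod p) for i = 0..p-1.
        The p coordinates are uniform and pairwise independent, yet the
        space has only p*p points, against p**p for full independence."""
        for a in range(p):
            for b in range(p):
                yield tuple((a * i + b) % p for i in range(p))

An algorithm whose analysis needs only pairwise independence can thus be derandomized by exhaustively trying all $p^2$ seeds.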

Three such examples where the method has been successfully applied were presented by the three speakers I mention next.

Venkatesh Raman gave a talk in two parts, with a timer alarm at the halfway point for people who had fallen asleep. The first part dealt with hashing in the Fredman--Koml\'os--Szemer\'edi two--level strategy. The second dealt with the method of colour coding, which he illustrated with two examples. The first example is to find a simple path of a prescribed length in a graph. The strategy is to take a random colouring and then find a ``colourful'' path, i.e., one on which all vertices have distinct colours. This example kills two birds with one stone: it illustrates the fundamental idea underlying randomised algorithms, namely ``abundance of witnesses'', and also serves as a simple illustration of the derandomization method using small sample spaces. I think it can easily be introduced in a beginning algorithms class.
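
Here is a rough sketch (mine, not Venkatesh's, and only of one randomized trial) of how a colourful path is found by dynamic programming over colour subsets; a fixed path on $k$ vertices becomes colourful with probability $k!/k^k \ge e^{-k}$, and the random colouring can be replaced by one drawn from a small explicit family.

    import random

    def colourful_path_exists(adj, k, seed=None):
        """One trial of colour coding: randomly k-colour the vertices of an
        undirected graph (adj: vertex -> iterable of neighbours) and check
        whether some simple path on k vertices is colourful, i.e. uses k
        distinct colours."""
        rng = random.Random(seed)
        colour = {v: rng.randrange(k) for v in adj}
        # reach[v] = colour-sets S such that some colourful path using
        # exactly the colours in S ends at v.
        reach = {v: {frozenset([colour[v]])} for v in adj}
        for _ in range(k - 1):
            new = {v: set() for v in adj}
            for v in adj:
                for u in adj[v]:
                    for S in reach[u]:
                        if colour[v] not in S:
                            new[v].add(S | {colour[v]})
            for v in adj:
                reach[v] |= new[v]
        return any(len(S) == k for sets in reach.values() for S in sets)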

Peter Bro Miltersen gave a wonderful panoramic exposition of hashing, starting with the Carter--Wegman definition of universal hashing (correcting a common misconception along the way), moving on to the Fredman--Koml\'os--Szemer\'edi two--level scheme, and finally to methods that use ``real'' computer operations (like & in C for masking) and error--correcting codes.
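
For concreteness, a sketch (mine) of sampling from the standard universal family $h_{a,b}(x) = ((ax+b) \bmod p) \bmod m$, which already exhibits the property that hashing-based constructions exploit:

    import random

    def carter_wegman_hash(p, m, seed=None):
        """Sample a hash function from the family ((a*x + b) mod p) mod m,
        with p a prime larger than every key.  For any fixed pair of
        distinct keys, the collision probability over the random choice
        of (a, b) is at most 1/m."""
        rng = random.Random(seed)
        a = rng.randrange(1, p)   # a is nonzero
        b = rng.randrange(0, p)
        return lambda x: ((a * x + b) % p) % m

The two--level scheme then uses such functions twice: once to split the keys into buckets, and once more, with quadratically sized tables, to resolve each bucket without collisions.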

V. Kamakoti talked about derandomization in Computational Geometry, a field in which the technique has achieved some of its most dramatic successes. Kamakoti talked mainly about the technology of the so--called $\epsilon$--nets. One wishes there had also been coverage of Ketan Mulmuley's paper ``Pseudorandomness in Computational Geometry'', where a beautiful general framework of ``configuration spaces'' is set up in which one can uniformly derandomize a whole host of applications. A good account is in Mulmuley's book Computational Geometry: An Introduction through Randomized Algorithms.
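
For the record, the basic definition (my paraphrase): given a range space and a finite set $P$ of points, a subset $N \subseteq P$ is an $\epsilon$--net if every range containing at least $\epsilon |P|$ points of $P$ contains a point of $N$. Random samples of size roughly $(d/\epsilon)\log(d/\epsilon)$ are $\epsilon$--nets for range spaces of VC dimension $d$, and derandomizing this sampling step is the engine behind many deterministic geometric algorithms.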

Ramesh Hariharan gave a fine account of the second basic method of derandomization, the so--called method of conditional probabilities, going back to the old Erd\"os--Selfridge paper. He illustrated it with an application to the set discrepancy problem. The clarity of his exposition is evident from the fact that when I asked our undergraduate students from IIT Delhi to apply it to derandomize the MAXSAT approximation algorithm they had seen in class, they proceeded to do so without any ado, despite never having seen any derandomization before! Ramesh also gave, in 10 minutes, a completely clear account of the important application of the method to derandomize the celebrated semi--definite programming based algorithm for MAXCUT due to Goemans and Williamson. Many have been the times when I have tried to read that paper, only to put it away at the sight of the forbidding chains of iterated integrals; how much nicer to get the basic idea in a few minutes!
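
As an illustration of the kind of thing the students did (my own reconstruction, not Ramesh's code), here is the method of conditional probabilities applied to the naive MAXSAT algorithm: a random assignment satisfies a clause with $j$ unset distinct literals with probability $1 - 2^{-j}$, and we fix the variables one by one so that this conditional expectation never decreases.

    def derandomized_maxsat(clauses, n):
        """clauses: list of clauses, each a list of non-zero ints, where
        literal +i / -i stands for variable i / its negation (literals in
        a clause assumed distinct).  n: number of variables 1..n.
        Returns an assignment satisfying at least the expected number
        sum_C (1 - 2^-|C|) of clauses."""
        def expected(assignment):
            total = 0.0
            for clause in clauses:
                unset, satisfied = 0, False
                for lit in clause:
                    v = abs(lit)
                    if v in assignment:
                        if assignment[v] == (lit > 0):
                            satisfied = True
                            break
                    else:
                        unset += 1
                total += 1.0 if satisfied else 1.0 - 0.5 ** unset
            return total

        assignment = {}
        for v in range(1, n + 1):
            assignment[v] = True
            e_true = expected(assignment)
            assignment[v] = False
            e_false = expected(assignment)
            # Keep whichever value does not decrease the conditional
            # expectation of the number of satisfied clauses.
            assignment[v] = e_true >= e_false
        return assignment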

Alessandro Panconesi gave two talks on the ``extraordinarily successful careers'' of trivial (randomized) algorithms. The first was on edge--colouring algorithms (where he tried to sneak in his latest agenda of ``experimental analysis of algorithms''). The second was a rather more interesting talk on an application to proving a lower bound on the performance of approximation algorithms for MAX-k-CUT. He showed a very neat probabilistic argument in the course of constructing an approximation--preserving reduction from MAXCUT; it was rather curious that one needs a probabilistic argument and a derandomization (albeit trivial) in order to get a lower bound!

C. Pandurangan talked about a skip--list based priority queue that achieves performance comparable to that of a considerably more complicated deterministic solution due to Gerth Brodal. Once again, the power of randomization is illustrated by the fact that the solution was the work of IIT Madras undergraduates!

Combinatorics

Sundar gave two delightful talks covering derandomization in combinatorics. The first talk was on derandomizing the Lov\'asz Local Lemma. The Local Lemma is a very useful tool for showing the existence of objects by the probabilistic method even when the probability of producing one at random is exponentially small. Because of the exponentially small probability, however, we do not immediately get even a randomized algorithm. However, the method discovered by Beck, and simplified and extended by Alon, simply skips the intermediate step and goes directly to a deterministic algorithm! Sundar showed the original application to hypergraph 2--colouring; the method has since been applied to many more examples.
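
For reference, the symmetric form of the Local Lemma (my statement, not necessarily the one used in the talk): if each bad event $A_i$ has probability at most $p$, each depends on at most $d$ of the others, and
\[
e \, p \, (d+1) \le 1,
\]
then $\Pr\big[\bigcap_i \overline{A_i}\big] > 0$. In hypergraph 2--colouring, an edge with $k$ vertices is monochromatic under a random colouring with probability $2^{1-k}$, so a proper 2--colouring exists whenever every edge meets at most about $2^{k-1}/e$ other edges -- but the lemma by itself gives no efficient way of finding one, which is exactly the gap Beck's method closes.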

The second talk was on Ramsey graphs. One of the classical class--room illustrations of the probabilistic method is to show that there are 2--colourings of the edges of the complete graph on $n$ vertices with no monochromatic clique on more than about $2\log n$ vertices. How does one derandomize this? Sundar gave a delightful exposition of the so--called linear algebra method and its applications to set systems with prescribed intersections, which give explicit constructions of such Ramsey graphs. This was another talk particularly enjoyed by our IIT Delhi undergraduates. A great account of the linear algebra method and its applications in combinatorics and computer science is in the draft of a book by that name due to L\'aszl\'o Babai and Peter Frankl, available from the University of Chicago (but not, alas, for free!).
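
The calculation one has in mind here is the classical one (my paraphrase): colouring the edges of $K_n$ red and blue at random,
\[
\Pr[\exists \text{ a monochromatic } K_k] \;\le\; \binom{n}{k}\, 2^{\,1-\binom{k}{2}} \;<\; 1 \qquad \text{for } n \le 2^{k/2},\ k \ge 3,
\]
so colourings with no monochromatic clique on $k$ vertices exist; the whole difficulty, which the linear algebra method addresses, is to exhibit them explicitly.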

Complexity Theory

Perhaps the hardest theme running through the workshop was derandomization in complexity theory. The question is a fascinating and fundamental one: does randomness help in computation? Specifically, does BPP, the class of problems solvable efficiently by randomized algorithms, i.e., in randomized polynomial time with bounded error, equal P? That is, can we obtain a general derandomization of BPP (as opposed to a derandomization of specific problems, as we saw with various algorithms)? The answer is, like many other results in complexity theory, a conditional one at present. However, the nature of the condition is rather interesting: one can get the general derandomization, i.e. BPP=P, if a certain separation between two other complexity classes holds! V. Arvind started the series of talks on this ``hardness versus randomness'' theme. It all started with a result of Nisan and Wigderson which showed that if there is a function computable in exponential time that is ``exponentially hard'' for circuits of slightly less than exponential size, then BPP=P. A series of subsequent results then successively weakened the hypothesis to the point where one needs only a ``weak hardness'' assumption on the function. The tortuous passage requires many of the fundamental concepts and techniques in complexity theory. Arvind spoke about the pseudorandom generator of Nisan and Wigderson and about the Goldreich--Levin predicate. Manindra Agrawal continued, speaking about Impagliazzo's concept of hard--core sets. Interestingly, one can take a game--theoretic view and deduce the existence of hard--core sets via the von Neumann minimax theorem or LP duality, as V. Vinay showed. I thought this was a very nice conceptual framework, and the introduction to LP that Vinay gave was hugely enjoyed by the IIT Delhi undergraduate students.
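
For the record, the end point of this chain, as I understand it (my paraphrase of the Impagliazzo--Wigderson form, not a statement taken from the talks), reads:
\[
\exists\, f \in \mathrm{E} = \mathrm{DTIME}\big(2^{O(n)}\big) \ \text{requiring circuits of size } 2^{\Omega(n)}
\;\Longrightarrow\; \mathrm{BPP} = \mathrm{P}.
\]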

One of the alluring results in complexity theory, and one that was used in the passage above, is Yao's XOR Lemma: to amplify the hardness of a function, take repeated XORs. Although simple to state and intuitive, the proof is not easy. Jaikumar Radhakrishnan gave a gem of a talk in which he gave a completely understandable proof of the XOR Lemma. One of the revealing features of the proof is how much difference it makes to adopt good notation, even if it is only using $+1$ and $-1$ in place of $0$ and $1$!
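
To state the lemma roughly, and in my own words: writing
\[
f^{\oplus k}(x_1,\dots,x_k) \;=\; f(x_1)\oplus\cdots\oplus f(x_k),
\]
if every circuit of a given size errs on $f$ on at least a $\delta$ fraction of inputs, then every circuit of somewhat smaller size computes $f^{\oplus k}$ with advantage at most about $(1-2\delta)^k + \epsilon$ over random guessing. In the $\pm 1$ notation the advantage becomes a plain correlation, which is what makes the bookkeeping in the proof so pleasant.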

It was rather a pity, therefore, that after finally having understood the proof of the XOR Lemma and its use in complexity theory, Jaikumar returned in a later talk to utter the following:

Friends, Romans and Countrymen (footnote - Although the Romans had departed by then)
Lend me your ears
I come to bury the XOR Lemma, not to praise it
The evil that men do lives after them
The good is oft interred with their bones
So be it with the XOR Lemma.

He presented results from a recent paper by Madhu Sudan, Luca Trevisan and Salil Vadhan that short--circuits a couple of steps in the chain above, using ideas that followed the discovery of a new construction of an extractor.

Extractors are among the most basic concepts in derandomization. As Amnon Ta-Shma put it, an extractor is something that takes the energy out of the source in which it is sitting, like squeezing an orange. Amnon gave a very nice discussion of the what, why and how of extractors, which made the literature suddenly much more accessible. Nisan's survey on extractors, which was circulated, is also a great help. Luca Trevisan realised that pseudorandom generators that had been around for a while could, when combined with error--correcting codes, actually be regarded as extractors, and this in fact gave better constructions than previously known. The consequences for derandomization in complexity are, of course, momentous. Ran Raz also gave a talk describing the Trevisan extractor.
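
The definition, for the uninitiated (my paraphrase): a $(k,\epsilon)$--extractor is a function
\[
\mathrm{Ext}\colon \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m
\]
such that for every source $X$ on $\{0,1\}^n$ with min--entropy at least $k$, the output $\mathrm{Ext}(X, U_d)$ is within statistical distance $\epsilon$ of the uniform distribution on $\{0,1\}^m$; the game is to make the seed length $d$ as small, and the output length $m$ as large, as possible.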

A different approach to derandomizing BPP is due to Andreev, Clementi and Rolim. It is based on the concept of hitting sets, which are somewhat weaker objects than pseudorandom generators. The result obtained is correspondingly somewhat weaker, though of the same ``hardness versus randomness'' form. The proof, however, is considerably simpler, and Peter Bro Miltersen gave a quite superb account which would have been accessible to the most uninitiated.
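
Roughly (again my paraphrase): a hitting set for a class of circuits is a set $H \subseteq \{0,1\}^n$ such that every circuit in the class accepting at least half of its inputs accepts at least one element of $H$; a pseudorandom generator must do more, namely produce a distribution on which every circuit's acceptance probability is close to what it is under the uniform distribution.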

Meena Mahajan gave a couple of talks on derandomization of space--bounded complexity classes. In contrast to the earlier theme, the results here are unconditional. In the first talk, she covered the seminal result of Nisan about derandomizing randomized logspace computation, a paper that came soon after his influential discovery of the pseudorandom generator. Meena's talk was a good advertisement for the virtues of using probabilistic concepts like expectation and variance over direct counting or averaging. In the second talk, she covered the significant improvement of Saks and Zhou where a combination of applications of the Nisan pseudorandom generator and perturbations is used to reduce the space needed by the deterministic simulation dramatically.
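
As I recall the parameters: Nisan's generator stretches a seed of $O(\log^2 n)$ truly random bits into enough bits to fool a randomized logspace machine, giving a deterministic simulation in space $O(\log^2 n)$, and Saks and Zhou push this down to
\[
\mathrm{BPL} \subseteq \mathrm{DSPACE}\big(\log^{3/2} n\big).
\]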

Muli Safra gave two talks, the first about the original seminal PCP characterization of NP, one of the most important results in Theoretical Computer Science, and its consequences for the complexity of approximation. The second talk was about a different, even more bizarre characterization of NP (involving such beasts as linear integer combinations of functionals), which gave the best known hardness results for lattice problems -- the closest vector problem and the shortest vector problem. He gave his patented laptop presentation -- by his own admission, he started doing this after several disasters with blackboard presentations.
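
The first characterization, in its usual shorthand (my paraphrase), is
\[
\mathrm{NP} = \mathrm{PCP}\big(O(\log n),\, O(1)\big):
\]
every NP statement has a proof that a randomized verifier can check by tossing $O(\log n)$ coins and probing only a constant number of bits of the proof.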

Ramesh Hariharan gave a very nice talk about work towards proving lower bounds on a set cover problem with restrictions on the set intersections, a problem motivated by geometric applications. He showed how the construction of hitting sets for combinatorial rectangles (different from the hitting sets for BPP above) was a key component in the solution.

Somewhat removed from the main theme, Ran Raz gave a talk which was misleadingly titled ``Quantum Communication Complexity''. In fact, the talk was about the usual probabilistic communication complexity, only applied to a problem of interest in the quantum model. Ran just about managed to give the definition of the probabilistic communication complexity model and the statement of the result, but ran out of time before giving a single line of the proof.

Great Innings!

There were several fall--outs from the workshop that I highly recommend for future expositions. For instance, we now have a standard way to benchmark the exercises we leave for our audiences: the Sundar Scale. Starting from $S^-$, $S$ and $S^+$ exercises, we saw the whole range covered, with $S^{\epsilon}$ standing for the truly trivial and $S^{\infty}$ for an unsolved problem (although Sundar might have some objection to placing an unsolved problem outside his ken).

Next we have Vinay's Cricket Scale for rating talks, which I will illustrate by an application to his own talk. Copybook defence, bat and pad close together, straight bat and head right on top of the ball at the start of the innings, followed by extravagant cross--batted heaves over mid--wicket in the middle of the innings, but surviving eventually to reach a century. Most talks in the workshop reached the magic figure as well, and some, like Jaikumar's XOR Lemma talk and Peter Bro's hitting sets talk, reached the pinnacle of achievement, the unbeaten chanceless triple hundred!

Where the Gods Dwell

My friend Alessandro would often dream about his ideal of a place inhabited by mathematicians and philosophers: it would be by the seaside, the breeze wafting up to the hallways supported by stately pillars, amongst which they walked, engaged in discussions on the most profound problems of the age. Well, he finally saw his dream in real life! It was indeed a great experience to go out onto the rocks by the sea after the talks and spend hours talking. The more diligent of us would in fact reconstruct the talks sitting over cups of masala chai. I personally found these to be the most stimulating 10 days I have spent in a long time.

On Modesty: ``Do you know what is a clique?''

One of our Israeli participants started his talk by asking if his audience knew what a clique was; another declared that, keeping in mind the audience, he would limit himself to ``simple concepts and definitions''. This kind of condescension predictably annoyed many people, especially those for whom this was their first encounter with Israeli behaviour. From my experience with Israeli friends, I know that this kind of cocksure behaviour is just something one has to get used to, and in retrospect it is quite amusing.

In sharp contrast was the self--effacing attitude of the Indian participants, and this despite the overwhelming impression I got of how many capable and quite brilliant people we have in the country today. But for this workshop, which gathered these people together in one place, one would never have realised this. It showed very dramatically how isolated the research community in India unfortunately is. Recently people have made commendable attempts to change this, for instance through the annual FST&TCS conference, the National Seminar, the Banasthali gathering, etc. I for one strongly feel that we should take this opportunity to turn the momentum generated by this workshop into a regular series of such events. The agenda of this kind of series would be somewhat different from that of the other gatherings (such as the National Seminar): it should provide a mechanism for Indian research in Theoretical Computer Science to operate at the highest level on the most challenging problems of the day. Incidentally, the inspiration for this came from the public lecture by Armand Borel at the Bhabha Auditorium, in which he told us how the young Einstein and his friends resolved to ``study the most important books of the day''. If Indian research were seen to be making visible contributions at the most prestigious international forums, there would be correspondingly less condescending talk directed at us.

By the way, as I sign off, I just get this very uncomfortable feeling that I missed out something important. Well, if I can't remember it, it couldn't have been anything of consequence or very memorable, could it ...?

Dodo Piadico
Contact address: Email: dubhashi@cse.iitd.ernet.in