
Probability

A large series of talks was devoted to tools for probabilistic analysis. The starting point was something one needs in probably the very first probabilistic analysis one ever does -- some nice simple randomized algorithm has an equally nice expected running time, but to be practical we need to bound the probability that the running time deviates significantly from its mean. Fortunately, in most cases this probability is gratifyingly small (i.e. the measure is concentrated around the mean), but how does one go about showing this? Devdatt described how isoperimetric inequalities provide a meta-solution to be appropriately invoked. Talagrand's isoperimetric inequality (TII), used with his weird notion of convex distance, is amazingly easy to apply and often provides tail bounds better than Chernoff-Hoeffding. So far this sounded like a hard sell of something relatively unknown, so Devdatt proceeded to give concrete evidence -- a series of examples where either the (non-uniform) method of bounded differences (NMBD) or the use of hereditary functions of index sets (HFIS), via TII, easily yields quite sharp bounds. This could not but convince the sceptics, who then wished to see more of why TII holds and how it gives NMBD or HFIS, and this led us through another maze where an information-theoretic lemma for measure concentration popped up, involving the mysterious information-theoretic concept of divergence. Janson's inequality peeked in, and Talagrand's convex distance function was partially demystified.
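To make the concentration phenomenon concrete, here is a minimal Monte Carlo sketch in Python (my own illustration, not taken from the talks), built around the classic example of the longest increasing subsequence of $n$ independent uniforms: changing one coordinate changes its length by at most one, so the method of bounded differences applies directly, while Talagrand's inequality exploits the fact that membership in an increasing subsequence is certified by few coordinates and gives a much sharper bound.

\begin{verbatim}
# A Monte Carlo sketch (my own illustration, not from the talks) of measure
# concentration.  f = length of the longest increasing subsequence (LIS) of
# n iid uniforms changes by at most 1 when one coordinate is changed, so the
# method of bounded differences gives P(|f - E f| >= t) <= 2 exp(-2 t^2 / n);
# Talagrand's inequality replaces the n in the exponent by roughly E f, which
# is only of order sqrt(n) -- a far stronger statement.
import bisect
import math
import random

def lis_length(xs):
    """Length of the longest increasing subsequence, via patience sorting."""
    piles = []
    for x in xs:
        i = bisect.bisect_left(piles, x)
        if i == len(piles):
            piles.append(x)
        else:
            piles[i] = x
    return len(piles)

def experiment(n=1000, trials=2000, t=20):
    samples = [lis_length([random.random() for _ in range(n)]) for _ in range(trials)]
    mean = sum(samples) / trials
    empirical = sum(abs(s - mean) >= t for s in samples) / trials
    bounded_diff = 2 * math.exp(-2 * t * t / n)
    print(f"mean LIS ~ {mean:.1f}  (roughly 2*sqrt(n) = {2 * math.sqrt(n):.1f})")
    print(f"P(|LIS - mean| >= {t}): empirical {empirical:.4f}  "
          f"vs bounded-differences bound {bounded_diff:.4f}")

if __name__ == "__main__":
    experiment()
\end{verbatim}

For these parameters the generic bounded-differences bound comes out close to 1 and is almost vacuous, while the empirical deviation probability is essentially zero; that gap is exactly the kind that TII closes for functions like the LIS length.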

A most interesting aside was a surprising combinatorial result with an even more surprisingly simple proof; I offer the result here with the proof left as a puzzle for the reader. Given any number of points in the unit square, there is a tour through them such that the sum of the squares of the edge lengths in the tour is at most 4!

Further on, Devdatt talked about qualifying the dependence of variables. In analysing a random event, one often needs independence of sub-events, or at least $k$-wise independence. The theme here was that in practice it is not always necessary to quantify the dependence; it often suffices to qualify it as positive or negative. Examples followed for positive dependence. For negative dependence there is apparently not much literature, despite the hope that negative association or regression will give sharper concentration of measure.
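As a toy numerical illustration of what merely qualifying the dependence buys (the example is mine, not from the lecture), consider throwing $m$ balls into $k$ bins: the occupancy counts are certainly not independent, but their dependence is negative -- the covariance of any two counts is exactly $-m/k^2$ -- and negatively associated families of this kind are known to still admit Chernoff-Hoeffding-type bounds on sums.

\begin{verbatim}
# A quick check (my example) that balls-into-bins occupancy counts are
# negatively correlated: one bin being full makes the others emptier.
import random

def bin_counts(m, k, rng):
    """Occupancy counts after throwing m balls uniformly into k bins."""
    counts = [0] * k
    for _ in range(m):
        counts[rng.randrange(k)] += 1
    return counts

def empirical_cov(m=50, k=10, trials=20000, seed=0):
    """Empirical covariance of the first two bin counts."""
    rng = random.Random(seed)
    samples = [bin_counts(m, k, rng) for _ in range(trials)]
    mean0 = sum(s[0] for s in samples) / trials
    mean1 = sum(s[1] for s in samples) / trials
    return sum((s[0] - mean0) * (s[1] - mean1) for s in samples) / trials

if __name__ == "__main__":
    # The exact (multinomial) covariance is -m/k^2 = -0.5 for these parameters.
    print(f"empirical Cov(count_0, count_1) = {empirical_cov():.3f}   (exact: -0.5)")
\end{verbatim}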

The BK inequality, due to van den Berg and Kesten, is an analogue of the FKG inequality for negative dependence. It applies to the disjoint occurrence of two events in a probability space, i.e. the two events occurring for ``disjoint'' reasons. The inequality states that for monotone events in the discrete space $\{0,1\}^n$ equipped with the uniform measure (or a product measure derived from the component spaces), the probability of the disjoint occurrence of two events is at most the product of the probabilities of the events, as intuition suggests. There is a simple proof of this inequality via a coupling-like construction. However, the inequality for non-monotone events was left open and remained so for almost a decade. Finally, just a few years ago, the general case was resolved by Dave Reimer, a student of Jeff Kahn. It is quite a story. Reimer was almost at the point of leaving academia for Wall Street, and it wasn't clear he even had a thesis in hand. One day, perusing the material Reimer had left, Kahn discovered that he had proved the general conjecture, which is now the van den Berg-Kesten-Reimer (BKR) Theorem: any product space satisfies the BK inequality for arbitrary events!
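To pin the statement down, here is a brute-force sketch (with small events of my own choosing) of the disjoint-occurrence operation on the uniform cube: $A \Box B$ consists of those $x$ for which there are disjoint index sets $K$ and $L$ such that every point agreeing with $x$ on $K$ lies in $A$ and every point agreeing with $x$ on $L$ lies in $B$; the BKR theorem asserts $\Pr[A \Box B] \le \Pr[A]\cdot\Pr[B]$ with no monotonicity assumption, and for tiny $n$ one can simply verify this exhaustively.

\begin{verbatim}
# Brute-force illustration (my own toy events) of the BK/BKR inequality on
# the uniform cube {0,1}^n:  P(A box B) <= P(A) * P(B).
from itertools import product, combinations

def cylinder_inside(x, K, event, n):
    """Does every y that agrees with x on the index set K lie in `event`?"""
    free = [i for i in range(n) if i not in K]
    for bits in product((0, 1), repeat=len(free)):
        y = list(x)
        for i, b in zip(free, bits):
            y[i] = b
        if tuple(y) not in event:
            return False
    return True

def disjoint_occurrence(A, B, n):
    """The event A box B over {0,1}^n, computed by brute force."""
    cube = list(product((0, 1), repeat=n))
    subsets = [frozenset(s) for r in range(n + 1)
               for s in combinations(range(n), r)]
    box = set()
    for x in cube:
        if any(cylinder_inside(x, K, A, n)
               and any(cylinder_inside(x, L, B, n)
                       for L in subsets if not (K & L))
               for K in subsets):
            box.add(x)
    return box

if __name__ == "__main__":
    n = 3
    cube = list(product((0, 1), repeat=n))
    A = {x for x in cube if x[0] != x[1]}             # a non-monotone event
    B = {x for x in cube if x[1] == 1 or x[2] == 1}   # a monotone event
    AB = disjoint_occurrence(A, B, n)
    N = 2 ** n
    print(f"P(A box B) = {len(AB)/N:.3f}  <=  P(A)*P(B) = {len(A)/N * len(B)/N:.3f}")
\end{verbatim}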

There is something very unusual about the proof. First, one can easily reduce the general case to that of the simple space $\{0,1\}^n$ with the uniform measure, so it is just a counting problem. Or so it seems, because no direct counting proof seems to work; nor can one even exhibit the injective mapping such an argument would require. Reimer's insight was to transform the problem into the domain of linear algebra, where a dimension argument gives the result! The linear algebra involved is quite elementary, but nevertheless seems indispensable. Expository accounts are now available in [10].

Further on in this series, C R Subramaniam tried to outline the use of Janson's inequalities in proving properties of random graphs. For instance, I had not realised how non-trivial it is to estimate the number of $K_4$'s in a random graph.
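To see the difficulty in miniature, here is a small simulation (parameters chosen arbitrarily by me): the expected number of copies of $K_4$ in $G(n,p)$ is simply $\binom{n}{4}p^6$, but distinct copies share edges and are therefore positively correlated, which rules out naive independence arguments; Janson's inequality is what turns the Poisson heuristic $e^{-\mu}$ for the probability of seeing no copy into a rigorous bound, with a correction term accounting for the overlapping pairs.

\begin{verbatim}
# A tiny simulation (my own parameters) of counting K_4's in G(n,p).
# The copies overlap in edges, so they are positively correlated; Janson's
# inequality handles exactly this kind of dependent sum of indicators.
import math
import random
from itertools import combinations

def count_k4(n, p, rng):
    """Sample G(n,p) and count its K_4 subgraphs by brute force."""
    edge = {frozenset(e): rng.random() < p for e in combinations(range(n), 2)}
    return sum(
        all(edge[frozenset(pair)] for pair in combinations(quad, 2))
        for quad in combinations(range(n), 4)
    )

if __name__ == "__main__":
    n, p, trials = 12, 0.25, 500
    rng = random.Random(0)
    counts = [count_k4(n, p, rng) for _ in range(trials)]
    mu = math.comb(n, 4) * p ** 6
    print(f"E[#K4] = {mu:.3f}   empirical mean = {sum(counts) / trials:.3f}")
    print(f"P(no K4): empirical = {sum(c == 0 for c in counts) / trials:.3f}"
          f"   Poisson heuristic exp(-mu) = {math.exp(-mu):.3f}")
\end{verbatim}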

I hope that some decent lecture notes from this workshop will materialise, particularly for the probabilistic tools. (I can't push this demand too aggressively, though, not having written up my own lecture notes yet!) Even better, a reference text should be written and published; royalties will be non-zero, as I promise to keep a copy at my elbow as I work... Maybe Devdatt's book-in-the-making will fill the gap.


Meena Mahajan 2002-02-13