A large series of talks was devoted to tools for probabilistic analysis. It began with something one needs in probably the very first probabilistic analysis one ever carries out -- some nice simple randomized algorithm has an equally nice expected running time, but to be practical we need to bound the probability that the running time deviates significantly from its mean. Fortunately, in most cases this probability is gratifyingly small (i.e. the measure is concentrated around the mean), but how does one go about showing this? Devdatt described how isoperimetric inequalities provide a meta-solution to be invoked as appropriate. Talagrand's isoperimetric inequality (TII), used with his weird notion of convex distance, is amazingly easy to apply and often provides tail bounds better than Chernoff-Hoeffding. So far this sounded like a hard sell of something relatively unknown, so Devdatt proceeded to give concrete evidence -- a series of examples where either the (non-uniform) method of bounded differences (NMBD) or the use of hereditary functions of index sets (HFIS), via TII, easily yields quite sharp bounds. This could not but convince the sceptics, who then wished to see more of why TII holds and how it gives NMBD or HFIS, and this led us through another maze in which an information-theoretic lemma for measure concentration popped up, involving the mysterious information-theoretic concept of divergence. Janson's inequality peeked in, and Talagrand's convex distance function was partially demystified.
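To fix ideas, here is a minimal Python sketch (my own illustration, not something from the talk) of the simplest instance of the phenomenon: the empirical tail of a sum of fair coin flips against the plain Chernoff-Hoeffding bound $P(|S - n/2| \ge t) \le 2e^{-2t^2/n}$. TII and the heavier machinery are needed precisely when such sums are replaced by more complicated functions of independent variables.

\begin{verbatim}
# Compare the empirical tail of a Binomial(n, 1/2) sum with the
# two-sided Hoeffding bound 2*exp(-2*t^2/n).  Illustration only.
import math
import random

n, trials, t = 500, 20_000, 40
rng = random.Random(0)
deviations = 0
for _ in range(trials):
    s = sum(rng.randrange(2) for _ in range(n))   # n fair coin flips
    if abs(s - n / 2) >= t:
        deviations += 1

print("empirical tail P(|S - n/2| >= t):", deviations / trials)
print("Hoeffding bound 2*exp(-2t^2/n): ", 2 * math.exp(-2 * t**2 / n))
\end{verbatim}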
A most interesting aside was a surprising combinatorial result with an even more surprisingly simple proof; I offer the result here, with the proof left as a puzzle for the reader. Given any number of points in the unit square, there is a tour through them such that the sum of the squares of the edge lengths in the tour is at most four!
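For the sceptical, here is a quick brute-force sanity check in Python (entirely my own, and of course no substitute for the proof being withheld): for small random point sets it computes the closed tour minimising the sum of squared edge lengths and confirms that this minimum never exceeds four.

\begin{verbatim}
# Brute-force check: for random small point sets in the unit square,
# the best closed tour has sum of squared edge lengths <= 4.
import itertools
import random

def min_sum_sq_tour(points):
    """Minimum over all closed tours of the sum of squared edge lengths."""
    n = len(points)
    first = points[0]
    best = float("inf")
    for perm in itertools.permutations(points[1:]):
        tour = (first,) + perm
        s = sum((tour[i][0] - tour[(i + 1) % n][0]) ** 2 +
                (tour[i][1] - tour[(i + 1) % n][1]) ** 2
                for i in range(n))
        best = min(best, s)
    return best

rng = random.Random(0)
for _ in range(200):
    pts = [(rng.random(), rng.random()) for _ in range(7)]
    assert min_sum_sq_tour(pts) <= 4.0
print("sum of squared edge lengths <= 4 in all trials")
\end{verbatim}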
Further on, Devdatt talked about qualifying the dependence of variables. In analysing a random event, one often needs independence of sub-events, or at least $k$-wise independence. The theme here was that in practice it is not always necessary to quantify the dependence; it often suffices to qualify it as positive or negative. Examples followed for positive dependence. For negative dependence there is apparently not much literature, despite the hope that negative association or negative regression will give sharper concentration of measure.
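To make the notion concrete, here is a toy Python illustration of my own (not an example from the talk): the occupancy counts obtained by throwing $m$ balls independently into $k$ bins are negatively associated, and in particular any two counts are negatively correlated (the exact covariance is $-m/k^2$), which a quick simulation confirms.

\begin{verbatim}
# Balls in bins: estimate the covariance of two occupancy counts.
# Negative dependence predicts a negative value; exactly -m/k^2 here.
import random

def bin_counts(m, k, rng):
    counts = [0] * k
    for _ in range(m):
        counts[rng.randrange(k)] += 1
    return counts

rng = random.Random(1)
m, k, trials = 20, 5, 100_000
sum_x = sum_y = sum_xy = 0.0
for _ in range(trials):
    c = bin_counts(m, k, rng)
    sum_x += c[0]
    sum_y += c[1]
    sum_xy += c[0] * c[1]

cov = sum_xy / trials - (sum_x / trials) * (sum_y / trials)
print("empirical Cov(X_0, X_1) =", cov)   # exact value: -m/k^2 = -0.8
\end{verbatim}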
The BK inequality, due to van den Berg and Kesten, is an analogue of the FKG inequality for negative dependence. It applies to the disjoint conjunction of two events in a probability space, i.e. the two events occurring for ``disjoint'' reasons. The inequality states that for monotone events in the discrete space $\{0,1\}^n$ equipped with the uniform measure (or a product measure derived from the component spaces), the probability of the disjoint occurrence of two events is at most the product of the probabilities of the events, as is intuitive. There is a simple proof of this
inequality via a coupling-like construction. However, the
inequality for non-monotone events was left open and
remained so for almost a decade. Finally, just a few years
ago, the general case was resolved by Dave Reimer, a student
of Jeff Kahn. It is quite a story. Reimer was almost at the
point of leaving academia to enter Wall Street, and it
wasn't clear whether he even had a thesis in hand. One day, perusing the stuff Reimer had left, Kahn discovered that Reimer had in fact proved the general conjecture, which is now the van den Berg-Kesten-Reimer (BKR) Theorem: Any product space satisfies the BK inequality for arbitrary events!
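Since the statement is so clean, it is easy to sanity-check by brute force. The Python sketch below (my own; the helper names are invented for the occasion) enumerates disjoint occurrence on the cube $\{0,1\}^3$ with the uniform measure and verifies that the probability of the disjoint occurrence of $A$ and $B$ is at most $P(A)P(B)$ for random, not necessarily monotone, events.

\begin{verbatim}
# Brute-force check of the BKR inequality on {0,1}^n, uniform measure:
# x lies in the disjoint occurrence of A and B if there are disjoint
# index sets I, J such that fixing x on I forces A and fixing x on J
# forces B.  We verify P(disjoint occurrence) <= P(A) * P(B).
from itertools import product, combinations
import random

n = 3
cube = list(product((0, 1), repeat=n))
subsets = [set(s) for r in range(n + 1)
           for s in combinations(range(n), r)]

def cylinder_in(x, I, event):
    """True iff every y agreeing with x on the index set I lies in event."""
    free = [i for i in range(n) if i not in I]
    for bits in product((0, 1), repeat=len(free)):
        y = list(x)
        for i, b in zip(free, bits):
            y[i] = b
        if tuple(y) not in event:
            return False
    return True

def disjoint_occurrence(A, B):
    result = set()
    for x in cube:
        for I in subsets:
            if not cylinder_in(x, I, A):
                continue
            if any(not (I & J) and cylinder_in(x, J, B) for J in subsets):
                result.add(x)
                break
    return result

rng = random.Random(42)
for _ in range(200):
    A = {x for x in cube if rng.random() < 0.5}
    B = {x for x in cube if rng.random() < 0.5}
    lhs = len(disjoint_occurrence(A, B)) / len(cube)
    rhs = (len(A) / len(cube)) * (len(B) / len(cube))
    assert lhs <= rhs + 1e-12
print("BKR inequality held in all 200 random trials")
\end{verbatim}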
There is something very unusual about the proof. First, one can easily reduce the general case to the case of the simple space $\{0,1\}^n$ with the uniform measure, so it's just a counting problem. Or so it seems, because no direct counting proof seems to work; nor can one even exhibit or assert the injective mapping such an argument would require. Reimer's insight was to transform the problem into the domain of linear algebra, where a dimension argument gives the result! The linear algebra involved is quite elementary, but nevertheless seems indispensable. Expository accounts are now available in [10].
Further on in this series, C R Subramaniam tried to outline the use of Janson's inequalities in proving properties of random graphs. For instance, I had not realised how non-trivial it is to estimate the number of copies of a fixed small subgraph in a random graph.
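To give a flavour (taking triangles as the textbook example; the lecture may well have used a different subgraph), the Python sketch below compares a Janson-type upper bound on the probability that $G(n,p)$ contains no triangle with a Monte Carlo estimate. Here $\mu$ is the expected number of triangles and $\Delta$ sums $P(\text{both present})$ over ordered pairs of distinct triangles sharing an edge, giving the bound $e^{-\mu + \Delta/2}$ (one of several equivalent conventions in the literature).

\begin{verbatim}
# Janson bound on P(G(n,p) has no triangle) versus simulation.
import math
import random

def janson_no_triangle_bound(n, p):
    mu = math.comb(n, 3) * p**3                       # E[# triangles]
    # ordered pairs of distinct triangles sharing an edge: choose the
    # common edge, then the two remaining apexes; the pair needs
    # 5 edges in total, hence the factor p^5
    delta = math.comb(n, 2) * (n - 2) * (n - 3) * p**5
    return math.exp(-mu + delta / 2)

def has_triangle(n, p, rng):
    adj = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            adj[i][j] = adj[j][i] = (rng.random() < p)
    return any(adj[i][j] and adj[j][k] and adj[i][k]
               for i in range(n)
               for j in range(i + 1, n)
               for k in range(j + 1, n))

n, p, trials = 12, 0.1, 10_000
rng = random.Random(7)
empirical = sum(not has_triangle(n, p, rng) for _ in range(trials)) / trials
print("Monte Carlo P(no triangle) ~", empirical)
print("Janson upper bound          ", janson_no_triangle_bound(n, p))
\end{verbatim}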
I hope that some decent lecture notes from this workshop will materialise, particularly for the probabilistic tools stuff. (I can't push this demand too aggressively, though, not having written up my own lecture notes yet!) Even better, a reference text should be written and published; royalties will be non-zero, as I promise to keep a copy at my elbow as I work ... . Maybe Devdatt's book-in-the-making will fill the gap.