**1. Probabilistic Approximation of Metrics**

For many optimization problems, the input data involves some notion of distance, which we formalize as a metric. Unfortunately, many optimization problems can be quite difficult to solve in an arbitrary metric. In this lecture we present a very general approach to dealing with such problems: a method to approximate any metric by much simpler metrics. The simpler metrics we will use are trees, i.e., the shortest path metric on a graph that is a tree. Many optimization problems are easy to solve on trees, so in one fell swoop we get algorithms to approximate a huge number of optimization problems.

Roughly speaking, our main result is: *any metric on $n$ points can be represented by a distribution on trees, while preserving distances up to an $O(\log n)$ factor*. Consequently: *for many optimization problems involving distances in a metric, if you are content with an $O(\log n)$-approximate solution, you can assume that your metric is a tree*.

In order to state our results more formally, we will need to deal with an important issue. To illustrate the issue, and how to deal with it, we first present an example.

**1.1. Example: Approximating a cycle**

Let $G = (V,E)$ be a cycle on $n$ nodes. The (spanning) subtrees of $G$ are simply the paths obtained by deleting a single edge. So let $uv \in E$ be an edge and let $T = G \setminus uv$ be the corresponding tree. Is the shortest path metric of $T$ a good approximation of the shortest path metric of $G$? The answer is no: the distance between $u$ and $v$ in $G$ is only $1$, whereas the distance between $u$ and $v$ in $T$ is $n-1$. So, no matter which subtree of $G$ we pick, there will be some pair of nodes whose distance is poorly approximated.

Is there some way around this problem? Perhaps we don’t need $T$ to be a *sub*tree of $G$. We could consider a tree $T = (U,F)$ (possibly with lengths on the edges) where $U \supseteq V$ and $F$ is completely unrelated to $E$. Can such a tree do a better job of approximating distances in $G$? It turns out that the answer is still no: there will always be a pair of nodes whose distance is only preserved up to a factor $\Omega(n)$. But here is a small observation: any subtree of $G$ approximately preserves the *average* distances. One can easily check that the total distance between all pairs of nodes is $\Theta(n^3)$, both for $G$ and for any subtree of $G$. Thus, subtrees approximate the distances in $G$ “on average”.
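The claim about total distances is easy to check numerically. The following is a minimal sketch, assuming the nodes of the cycle are labelled $0,\ldots,n-1$ and the subtree is the path obtained by deleting the edge between nodes $n-1$ and $0$ (by symmetry, every subtree gives the same total). The function names are illustrative only.

```python
def cycle_total(n):
    """Sum of shortest-path distances over all unordered pairs on the n-cycle."""
    # On the cycle, nodes i < j are at distance min(j - i, n - (j - i)).
    return sum(min(j - i, n - (j - i)) for i in range(n) for j in range(i + 1, n))


def path_total(n):
    """Same sum on the path 0 - 1 - ... - (n-1), i.e. the cycle minus one edge."""
    return sum(j - i for i in range(n) for j in range(i + 1, n))


for n in (10, 100, 1000):
    c, p = cycle_total(n), path_total(n)
    # The ratio stays bounded (it approaches 4/3), so both totals are Theta(n^3).
    print(n, c, p, p / c)
```

Both totals grow like $n^3$ (roughly $n^3/8$ for the cycle and $n^3/6$ for the path), so they agree up to a constant factor even though individual distances can differ by a factor of $n-1$.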

So for the $n$-cycle, a subtree cannot approximate all distances, but it can approximate the average distance. This motivates us to apply a trick that is both simple and counterintuitive. It turns out that we *can* approximate all distances if we allow ourselves to pick the subtree randomly. (The trick is Von Neumann’s minimax theorem, and it implies that approximating the average distance is equivalent to finding a *distribution* on trees for which *every* distance is approximated in expectation.) To illustrate this, choose any pair of vertices $u, v \in V$. Let $d = d_G(u,v)$ be the distance between $u$ and $v$ in $G$. Pick a subtree $T$ by deleting an edge $e$ uniformly at random and let $d_T$ be the $u$–$v$ distance in $T$. Obviously $d_T \geq d$ since we constructed $T$ by removing $e$ from $G$. We now give an upper bound on $\mathrm{E}[d_T]$. If $e$ is on the shortest $u$–$v$ path then $d_T = n - d \leq n$; the probability of that happening is $d/n$. Otherwise, $d_T = d$. Thus,

$$\mathrm{E}[d_T] \;\leq\; \frac{d}{n} \cdot n + \Big(1 - \frac{d}{n}\Big) \cdot d \;\leq\; 2d.$$

So, *every* distance in $G$ is approximated to within a factor of $2$, in expectation.
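The expectation in the cycle example can be computed exactly. The sketch below assumes the nodes are labelled $0,\ldots,n-1$ around the cycle and that the deleted edge is chosen uniformly at random; `expected_tree_distance` and `worst_ratio` are hypothetical helper names.

```python
from fractions import Fraction


def expected_tree_distance(n, u, v):
    """Exact expected u-v distance after deleting a uniformly random cycle edge."""
    k = abs(u - v)
    d = min(k, n - k)  # distance between u and v in the cycle
    # With probability d/n the deleted edge lies on the shortest u-v path,
    # and the only remaining route has length n - d.
    # With probability (n - d)/n the shortest path survives, giving distance d.
    return Fraction(d, n) * (n - d) + Fraction(n - d, n) * d


def worst_ratio(n):
    """Largest value of E[tree distance] / (cycle distance) over all pairs."""
    # By symmetry it suffices to fix u = 0.
    return max(expected_tree_distance(n, 0, v) / min(v, n - v) for v in range(1, n))


for n in (5, 10, 100):
    print(n, worst_ratio(n))  # always strictly below 2
```

The exact expectation is $2d(n-d)/n$, so the worst ratio is $2 - 2/n$, attained by adjacent nodes. This is consistent with the bound of $2$ derived above.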

**1.2. Main Theorem**

We now show that, for *every* metric $(X,d)$ with $|X| = n$, there is an algorithm that generates a random tree for which *all* distances are approximated to within a factor of $O(\log n)$, in expectation.

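**Theorem 1** *Let $(X,d)$ be a finite metric with $|X| = n$. There is a randomized algorithm that generates a set of vertices $V$, a map $f : X \rightarrow V$, a tree $T = (V,E)$, and weights $w : E \rightarrow \mathbb{R}_{>0}$ such that*

$$d(x,y) \;\leq\; d_T(f(x),f(y)) \qquad \forall x,y \in X \qquad\qquad (1)$$

$$\mathrm{E}[\, d_T(f(x),f(y)) \,] \;\leq\; O(\log n) \cdot d(x,y) \qquad \forall x,y \in X \qquad\qquad (2)$$

*where $d_T$ denotes the shortest path metric of $T$ under the weights $w$.*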

The main tool in the proof is the random partitioning algorithm that we developed in the last two lectures. For notational simplicity, let us scale our distances and pick a value $m$ such that $1 < d(x,y) \leq 2^m$ for all distinct $x,y \in X$. Note that $m$ does not appear in the statement of the theorem, so we do not care how big it is.

The main idea is to generate a $2^i$-bounded random partition $P_i$ of $X$ for every $i = 0,\ldots,m-1$, then assemble those partitions into the desired tree. Assembling them is not too difficult, but there is one annoyance: the parts of $P_i$ have absolutely no relation to the parts of $P_j$ for any $j \neq i$. If the parts of $P_i$ were nicely nested inside the parts of $P_{i+1}$ then this would induce a natural hierarchy on the parts, and therefore give us a nice tree structure.

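The solution to this annoyance is to forcibly construct a nice partition $Q_i$, for $i = 0,\ldots,m-1$, that is nested inside all of $P_i, \ldots, P_{m-1}$. In lattice theory terminology, we define the partition

$$Q_i \;=\; P_i \wedge P_{i+1} \wedge \cdots \wedge P_{m-1},$$

where $\wedge$ is the meet operation in the partition lattice. If you’re not familiar with this notation, don’t worry; it is easy to explain. Simply define $Q_m = \{X\}$, then let

$$Q_i \;=\; \{\, A \cap B \;:\; A \in P_i,\; B \in Q_{i+1},\; A \cap B \neq \emptyset \,\} \qquad \text{for } i = m-1, \ldots, 0.$$

Note that $Q_i$ is also a partition of $X$. Furthermore, the parts of $Q_i$ are nicely nested inside the parts of $Q_{i+1}$, so we have obtained the desired hierarchical structure.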
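The nesting construction is a repeated common refinement, which is short to write down. The sketch below is illustrative only: it assumes the top level is the trivial partition consisting of the whole point set, represents each partition as a Python set of `frozenset`s, and uses hand-picked partitions that are not tied to any particular metric. The names `meet` and `nested_partitions` are hypothetical.

```python
def meet(P, Q):
    """Common refinement of two partitions: all nonempty intersections A & B."""
    return {A & B for A in P for B in Q if A & B}


def nested_partitions(X, P):
    """Given partitions P[0], ..., P[m-1] of X, return the nested list Q[0..m].

    Q[m] is the trivial partition {X}, and Q[i] refines both P[i] and Q[i+1].
    """
    m = len(P)
    Q = [None] * (m + 1)
    Q[m] = {frozenset(X)}
    for i in range(m - 1, -1, -1):
        Q[i] = meet(P[i], Q[i + 1])
    return Q


X = "abcd"
P = [
    {frozenset("a"), frozenset("b"), frozenset("c"), frozenset("d")},  # P[0]
    {frozenset("ab"), frozenset("cd")},                                # P[1]
    {frozenset("abc"), frozenset("d")},                                # P[2]
]
Q = nested_partitions(X, P)
for i in range(len(Q) - 1, -1, -1):
    print(i, sorted("".join(sorted(S)) for S in Q[i]))
```

In this toy input the part `{c, d}` of `P[1]` straddles two parts of `P[2]`, so it is split into `{c}` and `{d}` at level 1. This is exactly the forced nesting described above.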

**1.3. Example**

Consider the following example, which shows some possible partitions $P_i$ of a small point set $X$, and the corresponding partitions $Q_i$.

The tree corresponding to these partitions is as follows.

**1.4. Algorithm**

More formally, here is our algorithm for generating the random tree.

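- For $i = 0,\ldots,m-1$, let $P_i$ be a $2^i$-bounded random partition generated by our algorithm from the last lecture.
- *The vertices in $V$ will be pairs of the form $(i,S)$ where $i \in \{0,\ldots,m\}$ and $S \subseteq X$. The vertices and edges of the tree are generated by the following steps.*
- Define $Q_m = \{X\}$. Add the vertex $(m,X)$ as the root of the tree.
- For $i = m-1$ downto $0$
  - Define $Q_i = \{\, A \cap B : A \in P_i,\; B \in Q_{i+1},\; A \cap B \neq \emptyset \,\}$.
  - For every such set $A \cap B$, add the vertex $(i, A \cap B)$ to $T$ as a child of $(i+1, B)$, connected by an edge of length $2^i$.
- Since $d(x,y) > 1$ for all distinct $x,y \in X$ and since $P_0$ is $1$-bounded, the partition $Q_0$ must partition $X$ into singletons. Therefore we may define the map $f : X \rightarrow V$ by $f(x) = (0, \{x\})$.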
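The steps above can be sketched in a few lines. This is a minimal illustration, not the lecture's implementation: the partitions are supplied by the caller (hard-coded here rather than generated randomly), the edge above a level-$i$ vertex is assumed to have length $2^i$, and all function names are hypothetical.

```python
def build_tree(X, P):
    """Build the tree from partitions P[0..m-1] of X (each a set of frozensets).

    Returns (parent, f): parent maps a vertex (i, S) to (its parent, edge length),
    and f maps each point of X to its leaf (0, {x}).
    """
    m = len(P)
    parent = {}
    level = {frozenset(X)}             # parts of Q_m; (m, X) is the root
    for i in range(m - 1, -1, -1):
        refined = set()
        for B in level:                # B ranges over the parts of Q_{i+1}
            for A in P[i]:
                S = A & B
                if S:                  # keep only nonempty intersections
                    # Assumption: the edge above a level-i vertex has length 2**i.
                    parent[(i, S)] = ((i + 1, B), 2 ** i)
                    refined.add(S)
        level = refined                # parts of Q_i
    assert all(len(S) == 1 for S in level), "P[0] must separate all points"
    f = {x: (0, frozenset([x])) for x in X}
    return parent, f


def ancestor_distances(parent, w):
    """Map each ancestor of w (including w itself) to its distance from w."""
    dist = {w: 0}
    while w in parent:
        p, length = parent[w]
        dist[p] = dist[w] + length
        w = p
    return dist


def tree_distance(parent, u, v):
    """Distance in the tree: the cheapest route through a common ancestor."""
    du, dv = ancestor_distances(parent, u), ancestor_distances(parent, v)
    return min(du[a] + dv[a] for a in du if a in dv)


X = "abcd"
P = [
    {frozenset("a"), frozenset("b"), frozenset("c"), frozenset("d")},  # level 0
    {frozenset("ab"), frozenset("cd")},                                # level 1
    {frozenset("abc"), frozenset("d")},                                # level 2
]
parent, f = build_tree(X, P)
print(tree_distance(parent, f["a"], f["b"]))  # 2:  first separated at level 0
print(tree_distance(parent, f["a"], f["c"]))  # 6:  last separated at level 1
print(tree_distance(parent, f["a"], f["d"]))  # 14: last separated at level 2
```

Under the assumed edge lengths, two points whose highest separating level is $i$ end up at tree distance $2\sum_{j=0}^{i} 2^j = 2^{i+2}-2$, which matches the three printed values.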
**1.5. Analysis**

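**Claim 2** *Fix any distinct points $x,y \in X$. Let $i$ be the largest index with $P_i(x) \neq P_i(y)$, where $P_i(x)$ denotes the part of $P_i$ containing $x$. Then $2^{i+1} \leq d_T(f(x),f(y)) \leq 2^{i+2}$.*

*Proof:* The level $i$ is the highest level of the $P$ partitions in which $x$ and $y$ are separated. A simple inductive argument shows that $i$ is also the highest level of the $Q$ partitions in which $x$ and $y$ are separated. So the least common ancestor in $T$ of $f(x)$ and $f(y)$ is at level $i+1$. Let us call the least common ancestor $v$. Then

$$d_T(f(x),f(y)) \;=\; 2 \cdot d_T(f(x),v) \;=\; 2 \sum_{j=0}^{i} 2^j \;=\; 2^{i+2} - 2.$$

Since $2^{i+1} \leq 2^{i+2} - 2 \leq 2^{i+2}$ for all $i \geq 0$, the proof is complete.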

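**Claim 3** *(1) holds.*

*Proof:* The inequality is trivial if $x = y$, so assume $x \neq y$. Let $i$ be such that $2^i < d(x,y) \leq 2^{i+1}$. Since $P_i$ is $2^i$-bounded, $x$ and $y$ must lie in different parts of $P_i$, i.e., $P_i(x) \neq P_i(y)$. So the largest index at which $x$ and $y$ are separated is at least $i$ and, by Claim 2,

$$d_T(f(x),f(y)) \;\geq\; 2^{i+1} \;\geq\; d(x,y),$$

as required.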

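**Claim 4** *(2) holds.*

*Proof:* Fix any distinct $x,y \in X$ and let $r = d(x,y)$. We have

$$\mathrm{E}[\, d_T(f(x),f(y)) \,] \;\leq\; \sum_{i=0}^{m-1} 2^{i+2} \cdot \Pr[\, P_i(x) \neq P_i(y) \,] \;\leq\; 4 \sum_{i=0}^{m-1} 2^{i} \cdot \Pr[\, B(x,r) \not\subseteq P_i(x) \,] \;\leq\; O(\log n) \cdot r,$$

where the first inequality follows from Claim 2, the second holds because $y \in B(x,r)$, and the last inequality, proven in the following claim, applies Theorem 2 of Lecture 22 and performs a short calculation.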

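**Claim 5** *For any $x \in X$ and $r > 0$,*

$$\sum_{i=0}^{m-1} 2^{i} \cdot \Pr[\, B(x,r) \not\subseteq P_i(x) \,] \;\leq\; O(\log n) \cdot r.$$

*Proof:* Let $k$ be the integer with $2^{k-1} \leq r < 2^k$. Write $H(a,b) = \sum_{\ell=a+1}^{b} 1/\ell$. Bounding the probability by $1$ for $i \leq k+2$, and by the bound of Theorem 2 of Lecture 22 for $i \geq k+3$, we get

$$\sum_{i=0}^{m-1} 2^{i} \Pr[\, B(x,r) \not\subseteq P_i(x) \,] \;\leq\; \sum_{i \leq k+2} 2^i \;+\; \sum_{i=k+3}^{m-1} 2^i \cdot \frac{8r}{2^i} \cdot H\big( |B(x,2^{i-2}-r)|,\, |B(x,2^{i-1}+r)| \big) \;\leq\; 16 r \;+\; 8r \sum_{i=k+3}^{m-1} H\big( |B(x,2^{i-3})|,\, |B(x,2^{i})| \big),$$

since $r < 2^{i-3}$ when $i \geq k+3$. The final sum is upper bounded as follows.

$$\sum_{i=k+3}^{m-1} H\big( |B(x,2^{i-3})|,\, |B(x,2^{i})| \big) \;=\; \sum_{i=k+3}^{m-1} \; \sum_{j=i-2}^{i} H\big( |B(x,2^{j-1})|,\, |B(x,2^{j})| \big) \;\leq\; 3 \sum_{j} H\big( |B(x,2^{j-1})|,\, |B(x,2^{j})| \big) \;\leq\; 3 \cdot H(0,n) \;=\; O(\log n).$$

This proves the claimed inequality.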