Alexey and I have recently discussed the following quiz:
You are given an array of length $n$ of pairwise independent random real numbers uniformly distributed on $[0,1]$. Suppose you are searching for the maximal element in the array by traversing it from left to right and updating a temporary variable holding the maximal value found so far (the variable is initialized with the 0th element of the array before the loop). How many updates will you be making on average?
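For concreteness, the traversal can be sketched in Python (the function and variable names here are mine, not part of the quiz), together with a Monte Carlo estimate of the average:

```python
import random

def count_updates(xs):
    # Scan left to right, keeping the maximum seen so far.
    # The initial assignment of xs[0] is not counted as an update.
    best = xs[0]
    updates = 0
    for x in xs[1:]:
        if x > best:
            best = x
            updates += 1
    return updates

# Estimate the average number of updates for arrays of length 10.
n, trials = 10, 200_000
avg = sum(count_updates([random.random() for _ in range(n)])
          for _ in range(trials)) / trials
print(avg)
```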
Frankly, my own intuition for these kinds of things is pretty weak, which is why I felt challenged and sat down to solve the quiz. Below is what I ended up with. This is how not to solve it.
First, let us formalize the problem. Let $a_n$ be the answer for an array of length $n$. Let us figure out what exactly the “on average” part means here. Let $U(x)$ denote the number of updates required by an array $x$ of length $n$. If the set of all possible arrays were finite, say $X = \{x^{(1)}, \dots, x^{(N)}\}$, we would count the number of updates for each array, and then $a_n$ would be the average of these numbers:

$$a_n = \frac{1}{N} \sum_{i=1}^{N} U(x^{(i)}).$$
Because $U$ can have as its values only integers between $0$ and $n-1$, we can rearrange the terms in the above formula and write it as follows:

$$a_n = \sum_{k=0}^{n-1} k \cdot \frac{|U^{-1}(k)|}{N}.$$
Here $|A|$ denotes the cardinality of a set $A$. The quotient

$$\frac{|U^{-1}(k)|}{N}$$
can be interpreted as the probability that an array picked at random from the set $X$ requires $k$ updates, provided all arrays in $X$ are equally likely. In other words, $a_n$ becomes the expectation of $U$. Unfortunately, the set $X$ is infinite; in fact, it can be identified with the $n$-dimensional cube $[0,1]^n$. And yet the logic remains the same. Because each coordinate is uniformly distributed over the interval $[0,1]$, the distribution of arrays in $[0,1]^n$ is also uniform. That is, the probability that an array picked at random from the set $[0,1]^n$ is contained in a subset $A \subseteq [0,1]^n$ is the $n$-dimensional volume $\mu(A)$ of $A$. In the fancy language of probability theory, the $n$-dimensional volume $\mu$ is the product of $n$ copies of the Lebesgue measure on $[0,1]$ corresponding to the uniform distribution of each of the coordinates. The measure $\mu$ is a probability measure on the set $[0,1]^n$, i.e., $\mu([0,1]^n) = 1$, and hence the pair $([0,1]^n, \mu)$ is a probability space. Furthermore, $U$ is a function $[0,1]^n \to \{0, 1, \dots, n-1\}$, i.e., a discrete random variable, and $a_n$ is the expectation of $U$:

$$a_n = \mathbb{E}[U] = \sum_{k=0}^{n-1} k \, \mu\left(U^{-1}(k)\right).$$
Let us describe the set $U^{-1}(k)$. An array $x = (x_0, x_1, \dots, x_{n-1})$ requires $k$ updates if and only if there exists a sequence of indexes $0 < i_1 < i_2 < \dots < i_k \le n-1$ where the updates happen such that $x_{i_0} < x_{i_1} < \dots < x_{i_k}$ and for each $0 \le j \le k$ and index $m$ with $i_j < m < i_{j+1}$ holds $x_m < x_{i_j}$; here we put $i_0 = 0$ and $i_{k+1} = n$ for uniformity. If the indexes are fixed, the volume of the set of arrays satisfying these inequalities is given by the integral

$$\int_0^1 \int_0^{y_k} \cdots \int_0^{y_1} \prod_{j=0}^{k} y_j^{\,i_{j+1} - i_j - 1} \, dy_0 \, dy_1 \cdots dy_k,$$

where $y_j$ stands for the value $x_{i_j}$ of the running maximum after the $j$-th update, which is fairly easy to compute and which is equal to

$$\frac{1}{i_1 i_2 \cdots i_k \, n}.$$
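This prediction is easy to test empirically (a sketch of my own, not part of the argument): for $n = 4$ with updates happening exactly at indexes $1$ and $3$, the volume should be $\frac{1}{1 \cdot 3 \cdot 4} = \frac{1}{12} \approx 0.0833$.

```python
import random

def update_indexes(xs):
    # Indexes at which the running maximum is updated.
    best, idxs = xs[0], []
    for i, x in enumerate(xs[1:], start=1):
        if x > best:
            best = x
            idxs.append(i)
    return idxs

# Fraction of length-4 arrays whose updates happen exactly at 1 and 3;
# the integral predicts 1/(1 * 3 * 4) = 1/12.
trials = 200_000
hits = sum(update_indexes([random.random() for _ in range(4)]) == [1, 3]
           for _ in range(trials))
print(hits / trials)
```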
The volume of the preimage $U^{-1}(k)$ is equal to the sum of these integrals over all possible sequences of indexes:

$$\mu\left(U^{-1}(k)\right) = \frac{1}{n} \sum_{0 < i_1 < i_2 < \cdots < i_k < n} \frac{1}{i_1 i_2 \cdots i_k}.$$
The average number of updates is then given by the formula

$$a_n = \frac{1}{n} \sum_{k=0}^{n-1} k \sum_{0 < i_1 < i_2 < \cdots < i_k < n} \frac{1}{i_1 i_2 \cdots i_k}.$$
Clearly, this cannot possibly be an answer to a job interview quiz. It is too complicated. I wasn’t pleased with my solution, so I emailed the quiz to Alexey to see if he could come up with anything better. And indeed, his solution turned out to be very slick: The probability that the largest item in the array is the last one is $1/n$. Whether it is or no, the prefix of length $n-1$ is isomorphic to the original problem. Thus we have to do $a_{n-1}$ work, and with probability $1/n$ we have one more update after that:

$$a_n = a_{n-1} + \frac{1}{n}.$$
An array of length $1$ doesn’t require updates, so that $a_1 = 0$. Together these equations imply that $a_n = H_n - 1$, the $n$-th harmonic number less $1$. In particular, this immediately suggests that the number of updates grows logarithmically in $n$, which is very hard to see from my formula.
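The recurrence is trivial to turn into code (a sketch with names of my choosing):

```python
def average_updates(n):
    # a_1 = 0; a_n = a_{n-1} + 1/n, i.e. a_n = H_n - 1.
    a = 0.0
    for i in range(2, n + 1):
        a += 1.0 / i
    return a

# The growth is logarithmic: doubling n adds roughly ln 2 more updates.
print(average_updates(10))
```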
By the way, is my answer actually correct? At first, I didn’t think so. How can something as hairy as

$$\frac{1}{n} \sum_{k=0}^{n-1} k \sum_{0 < i_1 < i_2 < \cdots < i_k < n} \frac{1}{i_1 i_2 \cdots i_k}$$

possibly be equal to $H_n - 1$? I tried to prove that by induction and didn’t succeed. However, this is indeed the case, and I have discovered an elegant algebraic proof, which I am pleased to present.
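Before the proof, a numeric sanity check (my own sketch; `hairy` and `harmonic_minus_one` are names I made up) confirms the equality exactly, in rational arithmetic, for small $n$:

```python
from fractions import Fraction

def hairy(n):
    # e[k] = k-th elementary symmetric polynomial of 1, 1/2, ..., 1/(n-1),
    # built up by multiplying in one value at a time.
    e = [Fraction(1)]
    for i in range(1, n):
        e.append(Fraction(0))
        for k in range(len(e) - 1, 0, -1):
            e[k] += Fraction(1, i) * e[k - 1]
    # a_n = (1/n) * sum_k k * e_k
    return sum(k * ek for k, ek in enumerate(e)) / n

def harmonic_minus_one(n):
    # H_n - 1 = 1/2 + 1/3 + ... + 1/n
    return sum(Fraction(1, i) for i in range(2, n + 1))

print(all(hairy(n) == harmonic_minus_one(n) for n in range(1, 15)))  # True
```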
First, observe that the sum

$$e_k = \sum_{0 < i_1 < i_2 < \cdots < i_k < n} \frac{1}{i_1 i_2 \cdots i_k}$$

occurring in $a_n$ is the $k$-th elementary symmetric polynomial in $1$, $\frac{1}{2}$, …, $\frac{1}{n-1}$. Consider the polynomial

$$p(x) = \prod_{i=1}^{n-1} \left( x + \frac{1}{i} \right).$$
The coefficients of $p$ are precisely the elementary symmetric polynomials in $1, \frac{1}{2}, \dots, \frac{1}{n-1}$:

$$p(x) = \sum_{k=0}^{n-1} e_k x^{n-1-k}.$$

The value of $p$ at the point $1$ is the sum of these polynomials, which is almost $n a_n$, except that in $n a_n$ these polynomials show up with coefficients $k$. That’s pretty easy to fix. The trick is to consider the derivative of $p$. On the one hand, it equals

$$p'(x) = \sum_{k=0}^{n-1} (n-1-k) \, e_k x^{n-2-k},$$

and therefore $p'(1) = (n-1) \, p(1) - n a_n$. On the other hand, using the product rule, we obtain

$$p'(x) = \sum_{i=1}^{n-1} \prod_{j \neq i} \left( x + \frac{1}{j} \right) = p(x) \sum_{i=1}^{n-1} \frac{1}{x + \frac{1}{i}}, \qquad \text{so} \qquad p'(1) = p(1) \sum_{i=1}^{n-1} \frac{i}{i+1} = p(1) \, (n - H_n).$$

Furthermore, $p(1) = \prod_{i=1}^{n-1} \frac{i+1}{i} = n$, and hence $n a_n = (n-1) \, n - n \, (n - H_n) = n \, (H_n - 1)$, i.e., $a_n = H_n - 1$. Q. E. D.
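For the skeptical, the identities used in the proof can themselves be checked in exact rational arithmetic (again a sketch of my own, not part of the proof):

```python
from fractions import Fraction

def check_proof(n):
    # Coefficients of p(x) = prod_{i=1}^{n-1} (x + 1/i), lowest degree first.
    coeffs = [Fraction(1)]
    for i in range(1, n):
        v = Fraction(1, i)
        new = [Fraction(0)] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            new[k] += v * c      # the (x + v) factor contributes v * p(x)...
            new[k + 1] += c      # ...plus x * p(x)
        coeffs = new
    p1 = sum(coeffs)                                # p(1)
    dp1 = sum(k * c for k, c in enumerate(coeffs))  # p'(1)
    H = sum(Fraction(1, i) for i in range(1, n + 1))
    a = H - 1                                       # a_n = H_n - 1
    return p1 == n and dp1 == (n - 1) * p1 - n * a and dp1 == p1 * (n - H)

print(all(check_proof(n) for n in range(2, 12)))  # True
```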