What is the connection betweeen birthday paradox and sublinear sortedness testing

I was reading this article. On the first page there is statement in part 2, that when we randomly pick 2 indeccs each iteration, then we’ll have to select s=Ω(√(n))s=\Omega(\sqrt(n)) different iterations. And that can be calculated with birthday paradox. I don’t see the relation between this two parts.

I see it this way:
in total we can pick (n2)n \choose 2 index pairs each iteration. Indeces are ordered.
We have \frac {n}{2}\frac {n}{2} ordered pairs which can catch unsortedness.
So the probability to catch it:
p=\frac{\frac {n}{2}}{{{n} \choose {2}}}p=\frac{\frac {n}{2}}{{{n} \choose {2}}}.

Probability of error in 1 iteration is
q=1-p=p=1-\frac{\frac {n}{2}}{{{n} \choose {2}}}q=1-p=p=1-\frac{\frac {n}{2}}{{{n} \choose {2}}}

Probability of error in s runs is

Then s=\Omega(n)s=\Omega(n).
But in the article stated that s=\Omega(\sqrt(n))s=\Omega(\sqrt(n))
What is wrong?



1 Answer


Let me include what the question is asking about, in case the site goes down:

Firstly, I’d say it doesn’t matter: they’re only saying that (with high probability) we’ll need to select \Omega(\sqrt{n})\Omega(\sqrt{n}) pairs, so if you think you’ve found a better lower bound \Omega(n)\Omega(n), then it doesn’t contradict their claim: any function that is \Omega(n)\Omega(n) is also \Omega(\sqrt{n})\Omega(\sqrt{n}), so you’ve just proved a stronger result than theirs: either way we know that this test won’t work.

Secondly, nothing is wrong in your analysis; your analysis is correct for your (natural) understanding of the algorithm: if on each iteration we choose a pair (x_i, x_j)(x_i, x_j) independently of all previous choices, then indeed the probability of success in any given iteration is p = \frac{n}{2} / \binom{n}{2}p = \frac{n}{2} / \binom{n}{2}, so we need an expected number of 1/p = \Theta(n)1/p = \Theta(n) iterations till we are successful.

What their analysis does is in fact rule out all similar/smarter algorithms one can get by tweaking this algorithm: suppose we say we’re going to remember our choices from earlier iterations and not choose the same numbers again, also we’re somehow going to compare all the numbers we’ve selected so far to see if some pair is out of order. Then, still, we have n/2n/2 buckets from which we’re drawing balls, two at a time, so by the birthday paradox we need \Omega(\sqrt{n/2})\Omega(\sqrt{n/2}) drawings until we happen to randomly draw twice from the same bucket. This is the same as \Omega(\sqrt{n})\Omega(\sqrt{n}).