Large Sieve and Hilbert Inequalities

Large sieve inequalities are very powerful tools in analytic number theory for exhibiting cancellation in sums of harmonics twisted by coefficients that we know little about or that are hard to deal with (imagine non-smooth arithmetic coefficients). By regarding the coefficients as general unknown coefficients, we can still obtain cancellation when we average over a set of “almost” orthogonal harmonics!

Establishing such inequalities depends on our ability to manipulate and sum over the harmonics under consideration (we need some kind of “trace formula” to execute the summation and identify orthogonality/completeness).

In this post, we consider the simplest cases of large sieves for harmonics like e^{2\pi i nx} , e^{2\pi ia n/q} , \chi(x) , basically those arising from harmonic analysis over the reals, the integers, and their quotient groups.

Consider the trigonometric sum

\displaystyle S(x)=\sum_{M+1}^{M+N} a_{n} e(n x) \quad \quad \text{where} \quad \quad e(x) = e^{2 \pi i x}.

A trivial bound on \left|S(x)\right|^2 is \displaystyle N \sum_{M+1}^{M+N}\left|a_{n}\right|^{2} , which is sharp when a_n = e(-nx).


Can we improve this, say on average over x ?


If the points \displaystyle x_r, r =1, 2,\cdots ,R we want to average over are all equal modulo integers (same fractional part), then all the values \displaystyle S(x_r) are equal and the sum

\displaystyle \sum_{r=1}^{{R}}\left|S\left(x_{r}\right)\right|^{2}

is as large as

\displaystyle NR \sum_{M+1}^{M+N}\left|a_{n}\right|^{2} .

Hence we want the \displaystyle x_r to be such that the e(nx_r) are fairly orthogonal if we are to expect any savings/cancellation on average. So we assume that the points are well spaced.

Let \displaystyle \|x\|=\min_{n \in \mathbb{Z}}|x-n| denote the distance to the nearest integer.

Large Sieve inequality:

If the points x_r, \quad r=1, 2, \cdots, R are real numbers whose minimum spacing modulo 1,

\displaystyle\min_{r \neq s}\left\|x_{r}-x_{s}\right\|,

equals \displaystyle \delta , then

\displaystyle \sum_{r=1}^{{R}}\left|S\left(x_{r}\right)\right|^{2} =\sum_{r=1}^{R}\left|\sum_{n=M+1}^{M+N}a_n e(nx_r) \right|^2\le  \Delta \sum_{M+1}^{M+N}\left|a_{n}\right|^{2}

where \displaystyle \Delta = N+ \frac{1}{\delta}-1.

The constant \Delta =N+ \frac{1}{\delta}-1 is tight. For instance, take

\displaystyle x_r = x_0+ \frac{r}{R} for r=1, \cdots, R , with R \mid(N-1) ; take M=-1 (so n runs over 0, 1, \cdots, N-1 ), a_{n}=a\, e\left(-n x_{0}\right) for \displaystyle R \mid n , and a_{n}=0 otherwise. Then \displaystyle \delta = R^{-1} , i.e. \delta^{-1}=R ,

and we have

\displaystyle\sum_{r=1}^{R}\left|\sum_{n=0}^{N-1}a_n e(nx_r) \right|^2=R|a|^{2}\left(\frac{N-1}{R}+1\right)^{2}=\left(N-1+\delta^{-1}\right) \sum_{n=0}^{N-1}\left|a_{n}\right|^{2}
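This equality case can be checked numerically. The sketch below uses hypothetical small values ( N = 13 , R = 4 , x_0 = 0.123 , a = 1 ) and verifies that both sides agree.

```python
import numpy as np

# Extremal example: x_r = x0 + r/R with R | (N-1), a_n = e(-n*x0) for R | n.
# N, R, x0 below are illustrative choices, not prescribed by the text.
N, R = 13, 4                      # R divides N - 1, so delta = 1/R
x0 = 0.123
x = x0 + np.arange(1, R + 1) / R  # the R points x_r
n = np.arange(N)                  # take M = -1, so n runs over 0, ..., N-1

# a_n = e(-n*x0) when R | n, and 0 otherwise (here e(t) = exp(2*pi*i*t))
a = np.where(n % R == 0, np.exp(-2j * np.pi * n * x0), 0)

S = np.exp(2j * np.pi * np.outer(x, n)) @ a        # S(x_r) for each r
lhs = np.sum(np.abs(S) ** 2)
rhs = (N - 1 + R) * np.sum(np.abs(a) ** 2)         # (N - 1 + 1/delta) * sum |a_n|^2
print(abs(lhs - rhs))                              # ~0: the bound is attained
```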

Applying this to the Farey fractions a/q with q \le Q , which are separated by \delta \ge \frac{1}{Q^2} , we get the following inequality

\displaystyle \sum_{q\le Q} \sum_{\substack{a=1 \\ (a, q)=1 } }^{q} \left|\sum_{n=M+1}^{M+N} a_n\, e\left(\frac{a n}{q}\right)\right|^{2} \leq \left(N+Q^{2}\right) \sum_{M+1}^{M+N}\left|a_{n}\right|^{2}
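As a sanity check, the inequality can be tested numerically with random coefficients; the values of M, N, Q below are arbitrary illustrative choices.

```python
import numpy as np
from math import gcd

# Large sieve over Farey fractions a/q, q <= Q, with random coefficients.
rng = np.random.default_rng(1)
M, N, Q = 0, 40, 6
n = np.arange(M + 1, M + N + 1)
a = rng.normal(size=N) + 1j * rng.normal(size=N)

# Sum |sum_n a_n e(h n / q)|^2 over reduced fractions h/q with q <= Q
lhs = sum(
    abs(np.sum(a * np.exp(2j * np.pi * h * n / q))) ** 2
    for q in range(1, Q + 1)
    for h in range(1, q + 1)
    if gcd(h, q) == 1
)
rhs = (N + Q**2) * np.sum(np.abs(a) ** 2)
print(lhs <= rhs)   # True
```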

This inequality has many important arithmetic applications! For instance, we get the following.

For any subset Z\subset [M+1, M+N] of size Z (using the same letter for the set and its cardinality), we have

\displaystyle \sum_{p<Q} p \sum_{h=1}^{p}(Z(p, h)-Z / p)^{2} \leqslant\left(N+Q^{2}\right) Z

where Z(p, h) is the number of elements of Z that are congruent to h \bmod p . This lets us control the size of sets that occupy only a few residue classes \bmod p for many primes: a “large” portion of the \bmod p classes is sifted out, hence the name “Large Sieve”. If, for each prime p , the elements of Z avoid \omega(p) residue classes \bmod p , then

\displaystyle Z \leqslant\left(N+Q^{2}\right)\left(\sum_{q<Q} \mu^{2}(q) \prod_{p \mid q} \frac{\omega(p)}{p-\omega(p)}\right)^{-1}
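A quick numerical test of the variance inequality above, with a random subset and hypothetical parameters N = 100 , Q = 9 :

```python
import random

# Check sum_{p <= Q} p * sum_h (Z(p,h) - Z/p)^2 <= (N + Q^2) * Z
# for a random subset of [1, N]; N, Q, and the set are illustrative.
random.seed(0)
N, Q = 100, 9
primes = [2, 3, 5, 7]                       # primes up to Q
Zset = random.sample(range(1, N + 1), 30)   # a random subset with Z = 30
Z = len(Zset)

lhs = 0.0
for p in primes:
    counts = [0] * p                        # counts[h] = Z(p, h)
    for z in Zset:
        counts[z % p] += 1
    lhs += p * sum((c - Z / p) ** 2 for c in counts)
print(lhs <= (N + Q**2) * Z)   # True
```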

If we move from additive characters to multiplicative characters, we get the following large sieve for characters:

\displaystyle \sum_{q\le Q} \frac{q}{\varphi(q)} \sum_{\chi}^{*}\left|\sum_{n=M+1}^{M+N} a_n\chi(n) \right|^{2} \leq \left(N+Q^{2}\right) \sum_{M+1}^{M+N}\left|a_{n}\right|^{2}

where the starred sum is over primitive characters \chi \bmod q .

This is a very powerful inequality and is the main ingredient in the proof of the Bombieri-Vinogradov Theorem (one of the most important tools in prime number theory):


\displaystyle \sum_{q<Q} \max_{y \le x} \max_{\substack {a \\ (a, q)=1}} \left|\pi(y ; q, a)-\frac{\text{li}(y)}{\varphi(q)}\right|=O_{A}\left(x(\log x)^{-A}\right)

provided \displaystyle Q \leq x^{1 / 2}(\log x)^{-B} where B=B(A) .


We now move onto the proof of the large sieve:

Gallagher’s Proof: Gallagher proves the inequality with a larger constant

\displaystyle \Delta =\pi N +\frac{1}{\delta} .

Start with the simple Sobolev type inequality:

\displaystyle |f(x)| \leqslant \int_{0}^{1}\left(|f|+\left|f^{\prime}\right|\right) \quad \text{for } x \in [0, 1]

that follows from

\displaystyle f(x)=\int_{0}^{1} f(u) d u+\int_{0}^{x} u f^{\prime}(u) d u+\int_{x}^{1}(u-1) f^{\prime}(u) d u
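The identity is easy to verify numerically; the test function below, f(u) = e^u \cos(3u) , and the evaluation point are arbitrary choices.

```python
import numpy as np

# Verify f(x) = int_0^1 f + int_0^x u f'(u) du + int_x^1 (u-1) f'(u) du
# for the (arbitrary) test function f(u) = exp(u) * cos(3u), at x = 0.37.
f = lambda u: np.exp(u) * np.cos(3 * u)
fp = lambda u: np.exp(u) * (np.cos(3 * u) - 3 * np.sin(3 * u))

def trap(g, lo, hi, m=200_000):
    # composite trapezoid rule on m uniformly spaced nodes
    u = np.linspace(lo, hi, m)
    y = g(u)
    return float(np.sum((y[1:] + y[:-1]) / 2) * (hi - lo) / (m - 1))

x = 0.37
rhs = (trap(f, 0, 1)
       + trap(lambda u: u * fp(u), 0, x)
       + trap(lambda u: (u - 1) * fp(u), x, 1))
print(abs(f(x) - rhs))   # ~0 up to quadrature error
```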

A change of variables to the interval [x-\delta/2, x+\delta/2] gives

\displaystyle \left|f\left(x\right)\right| \leqslant \delta^{-1} \int_{x-\delta / 2}^{x+\delta / 2}|f(\beta)| d \beta+\frac{1}{2} \int_{x-\delta / 2}^{x+\delta / 2}\left|f^{\prime}(\beta)\right| d \beta

Applying this to \displaystyle f(x)= S(x)^2 , we get

\displaystyle \left|S\left(x_{r}\right)\right|^{2} \leqslant \delta^{-1} \int_{x_{r}-\delta / 2}^{x_{r}+\delta / 2}|S(\beta)|^{2} d \beta+\int_{x_{r}-\delta / 2}^{x_{r}+\delta / 2}\left|S(\beta) S^{\prime}(\beta)\right| d \beta

The \delta spacing implies that the intervals \left(x_{r}-\frac{1}{2} \delta, x_{r}+\frac{1}{2} \delta\right) are disjoint (modulo 1) and we have


\displaystyle \sum_{r=1}^{R}\left|S\left(x_{r}\right)\right|^{2} \leqslant \delta^{-1} \int_{0}^{1}|S(\beta)|^{2} d \beta+\int_{0}^{1}\left|S(\beta) S^{\prime}(\beta)\right| d \beta

The first term on the right, by Parseval (opening the square and integrating), gives

\displaystyle \frac{1}{\delta} \sum_{n=M+1}^{M+N}\left|a_{n}\right|^{2}.

The second term, after Cauchy-Schwarz, is

\displaystyle \leqslant\left(\int_{0}^{1}|S|^{2} d \beta \right)^{1 / 2}\left(\int_{0}^{1}\left|S^{\prime}\right|^{2} d \beta\right)^{1 / 2}=\left(\sum\left|a_{n}\right|^{2}\right)^{1 / 2}\left(\sum\left|2 \pi n a_{n}\right|^{2}\right)^{1 / 2}

The bound does not depend on M because we can always shift the interval and assume |n| \le N/2 , so that the second factor is at most \pi N \left(\sum\left|a_{n}\right|^{2}\right)^{1/2} ,

and we get

\displaystyle \sum_{r=1}^{R}\left|S\left(x_{r}\right)\right|^{2} \leqslant\left(\delta^{-1}+\pi N\right) \sum\left|a_{n}\right|^{2}
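A random numerical test of Gallagher’s bound, with points built to be at least \delta apart modulo 1 (all sizes below are illustrative):

```python
import numpy as np

# Check sum_r |S(x_r)|^2 <= (pi*N + 1/delta) * sum |a_n|^2 on random data.
rng = np.random.default_rng(2)
N, R = 30, 10
delta = 1 / (2 * R)
# R points in [0,1): consecutive gaps and the wrap-around gap are >= delta
x = np.arange(R) / R + rng.uniform(0, delta, size=R)
M = 5
n = np.arange(M + 1, M + N + 1)
a = rng.normal(size=N) + 1j * rng.normal(size=N)

S = np.exp(2j * np.pi * np.outer(x, n)) @ a   # S(x_r) for each r
lhs = np.sum(np.abs(S) ** 2)
rhs = (np.pi * N + 1 / delta) * np.sum(np.abs(a) ** 2)
print(lhs <= rhs)   # True
```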


Montgomery’s Proof: The main idea is to use duality: if for an R\times S matrix \displaystyle \left[a_{r s}\right] we have

\displaystyle \sum_{s=1}^{S}\left|\sum_{r=1}^{R} a_{r s} x_{r}\right|^{2} \leqslant C \sum_{r=1}^{R}\left|x_{r}\right|^{2}

for every complex sequence (x_r), \; r=1, 2, \cdots, R , then we also have

\displaystyle \sum_{r=1}^{R}\left|\sum_{s=1}^{S} a_{r s} y_{s}\right|^{2} \leqslant C \sum_{s=1}^{S}\left|y_{s}\right|^{2}

for arbitrary sequences \displaystyle (y_s), \; s=1, 2, \cdots, S .

Duality is really the fact that the operator norms of a matrix and its conjugate transpose are equal.
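This linear-algebra fact, that a matrix and its conjugate transpose have the same operator norm (largest singular value), is easy to confirm numerically:

```python
import numpy as np

# The best constant C in both inequalities is ||A||^2 for the spectral
# norm; A and its conjugate transpose share that norm.
rng = np.random.default_rng(3)
A = rng.normal(size=(5, 8)) + 1j * rng.normal(size=(5, 8))
print(np.linalg.norm(A, 2), np.linalg.norm(A.conj().T, 2))  # equal
```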

So we are going to prove the dual form

\displaystyle \sum_{n=M+1}^{M+N}\left|\sum_{r} b_{r} e\left(n x_{r}\right)\right|^{2}\le \Delta \sum_{r}\left|b_{r}\right|^{2}

This form is better because we can open the square and execute the summation over M+1\le n \le M+N to get

\displaystyle \sum_{r_1, r_2} b_{r_1} \bar b _{r_2} \sum _{n=M+1}^{M+N} e\left(n\left(x_{r_1}-x_{r_2}\right)\right)

The diagonal terms r_1=r_2 give

\displaystyle N \sum_{r=1}^{R}\left|b_{r}\right|^{2}

And the off-diagonal terms contribute

\displaystyle \sum_{r_1 \neq r_2} b_{r_1}\bar b_{r_2}\, e\left(\left(M+\frac{N+1}{2}\right)\left(x_{r_1} -x_{r_2}\right)\right)  \frac{\sin \pi N\left(x_{r_1}-x_{r_2}\right)}{\sin \pi\left(x_{r_1}-x_{r_2}\right)}

because \displaystyle \sum_{n=M+1}^{M+N} e\left(nx\right) = e\left(\left(M+\frac{N+1}{2}\right)x\right)\frac{\sin \left(\pi Nx\right)}{\sin \pi x} .
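This closed form for the geometric sum, with phase e\left(\left(M+\frac{N+1}{2}\right)x\right) , can be confirmed numerically (arbitrary test values, x not an integer):

```python
import numpy as np

# Check sum_{n=M+1}^{M+N} e(nx) = e((M + (N+1)/2) x) * sin(pi*N*x) / sin(pi*x)
M, N, x = 3, 7, 0.29          # arbitrary test values
n = np.arange(M + 1, M + N + 1)
lhs = np.sum(np.exp(2j * np.pi * n * x))
rhs = (np.exp(2j * np.pi * (M + (N + 1) / 2) * x)
       * np.sin(np.pi * N * x) / np.sin(np.pi * x))
print(abs(lhs - rhs))   # ~0
```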

Taking u_r = b_r\, e\left(\left(M+\frac{N+1}{2}\right)x_r\right) , we are reduced to bounding

\displaystyle \left|\sum_{r_1 \neq r_2} u_{r_1} \bar{u_{r_2}} \frac{\sin \pi N\left(x_{r_1}-x_{r_2}\right)}{\sin \pi\left(x_{r_1}-x_{r_2}\right)}\right|

This is a Hilbert inequality problem for the kernel K(x, y) =\frac{\sin (\pi N(x-y))}{\sin (\pi (x-y))} .

We will show that, for any real \lambda ,

\displaystyle \left|\sum_{r_1 \neq r_2} u_{r_1} \bar{u_{r_2}} \frac{\sin \pi \lambda \left(x_{r_1}-x_{r_2}\right)}{\sin \pi\left(x_{r_1}-x_{r_2}\right)}\right| \le \frac{1}{\delta} \sum_{r=1}^{R}|u_r|^2

and hence the bound for the dual form is \displaystyle \left(N+\frac{1}{\delta}\right) \sum_{r=1}^{R}|b_r|^2 , and by duality we get

\displaystyle \Delta = N+ \frac{1}{\delta}.

Writing the sine in the numerator in terms of exponentials, the problem reduces, with the choice w_{r}=u_{r} e^{\pm \pi i \lambda x_{r}} , to bounding

\displaystyle \left|\sum_{r_1 \neq r_2} w_{r_1} \bar{w_{r_2}} \frac{1}{\sin \pi\left(x_{r_1}-x_{r_2}\right)}\right| \le \frac{1}{\delta} \sum_{r=1}^{R}|w_r|^2

By writing \csc (\pi x) as

\displaystyle \frac{1}{\pi} \sum_{n}(-1)^{n}\frac{1}{x-n}

we can further reduce the problem to proving the following:

Montgomery-Vaughan: For any sequence of real numbers x_1\le x_2 \le \cdots \le x_R , well separated in the sense that |x_{i}-x_{i-1}|\ge \delta , we have

\displaystyle \left|\sum_{r \neq s} \frac{w_{r} \bar{w}_{s}}{x_{r}-x_{s}}\right| \leq \frac{\pi}{\delta} \sum\left|w_{r}\right|^{2}

\square
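A random numerical test of the Montgomery-Vaughan inequality (sizes, spacing, and data below are arbitrary):

```python
import numpy as np

# Check |sum_{r != s} w_r * conj(w_s) / (x_r - x_s)| <= (pi/delta) * sum |w_r|^2
rng = np.random.default_rng(4)
R, delta = 50, 0.1
x = np.cumsum(delta + rng.uniform(0, 1, size=R))   # increasing, gaps >= delta
w = rng.normal(size=R) + 1j * rng.normal(size=R)

D = x[:, None] - x[None, :]
np.fill_diagonal(D, np.inf)                        # drop the diagonal r = s
lhs = abs(np.sum(np.outer(w, w.conj()) / D))
rhs = (np.pi / delta) * np.sum(np.abs(w) ** 2)
print(lhs <= rhs)   # True
```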


Selberg’s improvement:

Some of the loss comes from the sharp cut-off of the interval [M+1, M+N] for the values of n . So we majorize the indicator of the interval by a smooth function; a good choice of smooth function gives \Delta = N+\delta^{-1}-1 .

If F(n) \ge 1_{[M+1, M+N]}(n) , we observe that

\displaystyle \sum_{n=M+1}^{M+N}\left|\sum_{r} e(nx_r) v_r \right|^2 \le \sum_{n} F(n) \left|\sum_{r} e(nx_r) v_r \right|^2

Opening the square, the right-hand side equals \displaystyle \sum_{r_1 , r_2} v_{r_1} \bar{v}_{r_2}\, L (x_{r_1}-x_{r_2}) , where

\displaystyle L(x) =\sum_{n} F(n) e(nx)

If L(x) vanishes for \| x\|\ge \delta , then by the well-spaced property only the diagonal terms survive in the above sum, giving the bound

\displaystyle L(0) \sum_{r} |v_r|^2

Selberg’s choice of F (a majorant with \hat{F} supported in [-\delta, \delta] ) gives, by Poisson summation,

\displaystyle L(0) = \sum_{n} F(n) = \sum_{n}\hat {F}(n) =\hat {F}(0) =\int F(t) dt=N+\delta^{-1}-1

Let \displaystyle G(z)=\left(\frac{\sin \pi z}{\pi}\right)^{2}\left(\sum_{n=0}^{\infty}(z-n)^{-2}-\sum_{n=1}^{\infty}(z+n)^{-2}+2 z^{-1}\right) be Beurling’s function majorizing \operatorname{sgn}(z) , and set

\displaystyle G_1(z) = \frac{1}{2} G(z)+\frac{1}{2} G(\delta(N-1)-z)

\displaystyle G_2(z) =G_1(\delta z)

Then F(n) = G_2(n) (after translating the interval to start at 0 ) works, and we have L(x) =0 for \| x\|> \delta .
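As a sanity check, one can truncate the two series \sum_{n\ge 0}(z-n)^{-2} and \sum_{n\ge 1}(z+n)^{-2} at a hypothetical cutoff K (their tails nearly cancel, so the truncation error is small) and verify numerically that G(z) majorizes \operatorname{sgn}(z) on a grid avoiding the integers:

```python
import numpy as np

# Truncated Beurling function: G(z) >= sgn(z), equality at nonzero integers.
# The cutoff K and the sample grid are arbitrary illustrative choices.
def G(z, K=2000):
    n = np.arange(0, K + 1)
    m = np.arange(1, K + 1)
    s = np.sum((z - n) ** -2.0) - np.sum((z + m) ** -2.0) + 2.0 / z
    return (np.sin(np.pi * z) / np.pi) ** 2 * s

z = np.arange(-5, 5) + 0.137          # avoids integers (and 0)
vals = np.array([G(t) for t in z])
print(np.all(vals >= np.sign(z) - 1e-4))   # True up to truncation error
```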

Cohen’s Trick:

One can repeat the sum a number of times, thereby creating more points. Because S(x) is invariant under translation by integers, consider x_r +k , \; k=1,2,3,\cdots, K for each of the points to get

\displaystyle K \sum_{r=1}^{R}\left|S\left(x_{r}\right)\right|^{2}=\sum_{k=1}^{K} \sum_{r=1}^{R}\left|S\left(x_{r}+k\right)\right|^{2}

We have n(x+k)= nK \left(\frac{x}{K}+\frac{k}{K}\right) , so apply the large sieve (with Montgomery’s constant) to the frequencies nK and the points \frac{x_r}{K}+\frac{k}{K} (which are \frac{\delta}{K} -spaced) to get

\displaystyle \le (K(N-1)+1+\frac{K}{\delta}) \sum\left|a_{n}\right|^{2}

for the repeated sum and

\displaystyle \le \frac{1}{K} (K(N-1)+1+\frac{K}{\delta}) \sum\left|a_{n}\right|^{2}

for the original sum.

Taking the limit \displaystyle K \to \infty , we recover \displaystyle \Delta = N+\delta^{-1}-1 .
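The improvement as K grows is transparent numerically; the values of N and \delta below are arbitrary.

```python
# Bound per copy: (K(N-1) + 1 + K/delta) / K -> N - 1 + 1/delta as K grows.
N, delta = 100, 0.01
for K in (1, 10, 1000):
    print(K, (K * (N - 1) + 1 + K / delta) / K)
# K = 1 gives Montgomery's N + 1/delta; large K approaches N - 1 + 1/delta.
```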

