Large Sieve and Hilbert Inequalities

Large sieve inequalities are very powerful tools in analytic number theory for exhibiting cancellation in sums of harmonics twisted by coefficients that we know little about or that are hard to deal with (imagine non-smooth arithmetic coefficients). By regarding the coefficients as general unknown coefficients, we can still obtain cancellation when we average over a set of “almost” orthogonal harmonics!

Establishing such inequalities depends on our ability to manipulate and sum over the harmonics under consideration (we need some kind of “trace formula” to execute the summation and identify orthogonality/completeness).

In this post, we consider the simplest cases of large sieves for harmonics like e^{2\pi i nx} , e^{2\pi ia n/q} , \chi(x) , basically those arising from harmonic analysis over the reals, the integers, and their quotient groups.

Consider the trigonometric sum

\displaystyle S(x)=\sum_{M+1}^{M+N} a_{n} e(n x) \quad \quad \text{where} \quad \quad e(x) = e^{2 \pi i x}.

A trivial bound on \left|S(x)\right|^2 is \displaystyle N \sum_{M+1}^{M+N}\left|a_{n}\right|^{2} , which is sharp when a_n = e(-nx).


Can we improve this, say on average over x ?


If the points \displaystyle x_r, r =1, 2,\cdots ,R we want to average over are all equal modulo integers (same fractional part), then all the values \displaystyle S(x_r) are equal and the sum

\displaystyle \sum_{r=1}^{{R}}\left|S\left(x_{r}\right)\right|^{2}

is as large as

\displaystyle NR \sum_{M+1}^{M+N}\left|a_{n}\right|^{2} .

Hence we want the \displaystyle x_r to be such that the e(nx_r) are fairly orthogonal if we are to expect any savings/cancellation on average. So we assume that the points are well spaced.

Let \displaystyle \|x\|=\min_{n \in \mathbb{Z}}|x-n| denote the distance to the nearest integer.

Large Sieve inequality:

If the points x_r, \quad r=1, 2, \cdots, R are real numbers whose minimum spacing modulo 1,

\displaystyle\min_{r \neq s}\left\|x_{r}-x_{s}\right\|,

equals \displaystyle \delta , then

\displaystyle \sum_{r=1}^{{R}}\left|S\left(x_{r}\right)\right|^{2} =\sum_{r=1}^{R}\left|\sum_{n=M+1}^{M+N}a_n e(nx_r) \right|^2\le  \Delta \sum_{M+1}^{M+N}\left|a_{n}\right|^{2}

where \displaystyle \Delta = N+ \frac{1}{\delta}-1.

The constant \Delta =N+ \frac{1}{\delta}-1 is tight. For instance, take

\displaystyle x_r = x_0+ \frac{r}{R} for r=1, \cdots, R , with R \mid(N-1) ; take M=-1 (so n runs over 0, 1, \cdots, N-1 ), a_{n}=a\, e\left(-n x_{0}\right) for \displaystyle R \mid n , and a_{n}=0 otherwise. Then \displaystyle \delta = R^{-1} , i.e. \delta^{-1}=R ,

and we have

\displaystyle\sum_{r=1}^{R}\left|\sum_{n=0}^{N-1}a_n e(nx_r) \right|^2=R|a|^{2}\left(\frac{N-1}{R}+1\right)^{2}=\left(N-1+\delta^{-1}\right) \sum_{n=0}^{N-1}\left|a_{n}\right|^{2}
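This equality case can be checked numerically. The sketch below uses hypothetical small values ( N = 13 , R = 4 , x_0 = 0.123 , a = 1 ) and verifies that both sides agree.

```python
import numpy as np

# Extremal example: x_r = x0 + r/R with R | (N-1), a_n = e(-n*x0) for R | n.
# N, R, x0 below are illustrative choices, not prescribed by the text.
N, R = 13, 4                      # R divides N - 1, so delta = 1/R
x0 = 0.123
x = x0 + np.arange(1, R + 1) / R  # the R points x_r
n = np.arange(N)                  # take M = -1, so n runs over 0, ..., N-1

# a_n = e(-n*x0) when R | n, and 0 otherwise (here e(t) = exp(2*pi*i*t))
a = np.where(n % R == 0, np.exp(-2j * np.pi * n * x0), 0)

S = np.exp(2j * np.pi * np.outer(x, n)) @ a        # S(x_r) for each r
lhs = np.sum(np.abs(S) ** 2)
rhs = (N - 1 + R) * np.sum(np.abs(a) ** 2)         # (N - 1 + 1/delta) * sum |a_n|^2
print(abs(lhs - rhs))                              # ~0: the bound is attained
```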

Applying this to the Farey fractions a/q with q \le Q , which are separated by \delta \ge \frac{1}{Q^2} , we get the following inequality

\displaystyle \sum_{q\le Q} \sum_{\substack{a=1 \\ (a, q)=1 } }^{q} \left|\sum_{n=M+1}^{M+N} a_n\, e\left(\frac{a n}{q}\right)\right|^{2} \leq \left(N+Q^{2}\right) \sum_{M+1}^{M+N}\left|a_{n}\right|^{2}
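As a sanity check, the inequality can be tested numerically with random coefficients; the values of M, N, Q below are arbitrary illustrative choices.

```python
import numpy as np
from math import gcd

# Large sieve over Farey fractions a/q, q <= Q, with random coefficients.
rng = np.random.default_rng(1)
M, N, Q = 0, 40, 6
n = np.arange(M + 1, M + N + 1)
a = rng.normal(size=N) + 1j * rng.normal(size=N)

# Sum |sum_n a_n e(h n / q)|^2 over reduced fractions h/q with q <= Q
lhs = sum(
    abs(np.sum(a * np.exp(2j * np.pi * h * n / q))) ** 2
    for q in range(1, Q + 1)
    for h in range(1, q + 1)
    if gcd(h, q) == 1
)
rhs = (N + Q**2) * np.sum(np.abs(a) ** 2)
print(lhs <= rhs)   # True
```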

This inequality has many important arithmetic applications! For instance, we get the following.

For any subset Z\subset [M+1, M+N] of size Z (using the same letter for the set and its cardinality), we have

\displaystyle \sum_{p<Q} p \sum_{h=1}^{p}(Z(p, h)-Z / p)^{2} \leqslant\left(N+Q^{2}\right) Z

where Z(p, h) is the number of elements of Z that are congruent to h \bmod p . This lets us control the size of sets that occupy only a few residue classes \bmod p for many primes: a “large” portion of the \bmod p classes is sifted out, hence the name “Large Sieve”. If, for each prime p , the elements of Z avoid \omega(p) residue classes \bmod p , then

\displaystyle Z \leqslant\left(N+Q^{2}\right)\left(\sum_{q<Q} \mu^{2}(q) \prod_{p \mid q} \frac{\omega(p)}{p-\omega(p)}\right)^{-1}
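A quick numerical test of the variance inequality above, with a random subset and hypothetical parameters N = 100 , Q = 9 :

```python
import random

# Check sum_{p <= Q} p * sum_h (Z(p,h) - Z/p)^2 <= (N + Q^2) * Z
# for a random subset of [1, N]; N, Q, and the set are illustrative.
random.seed(0)
N, Q = 100, 9
primes = [2, 3, 5, 7]                       # primes up to Q
Zset = random.sample(range(1, N + 1), 30)   # a random subset with Z = 30
Z = len(Zset)

lhs = 0.0
for p in primes:
    counts = [0] * p                        # counts[h] = Z(p, h)
    for z in Zset:
        counts[z % p] += 1
    lhs += p * sum((c - Z / p) ** 2 for c in counts)
print(lhs <= (N + Q**2) * Z)   # True
```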

If we move from additive characters to multiplicative characters, we get the following large sieve for characters:

\displaystyle \sum_{q\le Q} \frac{q}{\varphi(q)} \sum_{\chi}^{*}\left|\sum_{n=M+1}^{M+N} a_n\chi(n) \right|^{2} \leq \left(N+Q^{2}\right) \sum_{M+1}^{M+N}\left|a_{n}\right|^{2}

where the starred sum is over primitive characters \chi \bmod q .

This is a very powerful inequality and is the main ingredient in the proof of the Bombieri-Vinogradov Theorem (one of the most important tools in prime number theory):


\displaystyle \sum_{q<Q} \max_{y \le x} \max_{\substack {a \\ (a, q)=1}} \left|\pi(y ; q, a)-\frac{\text{li}(y)}{\varphi(q)}\right|=O_{A}\left(x(\log x)^{-A}\right)

provided \displaystyle Q \leq x^{1 / 2}(\log x)^{-B} where B=B(A) .


We now move onto the proof of the large sieve:

Gallagher’s Proof: Gallagher proves the inequality with a larger constant

\displaystyle \Delta =\pi N +\frac{1}{\delta} .

Start with the simple Sobolev type inequality:

\displaystyle |f(x)| \leqslant \int_{0}^{1}\left(|f|+\left|f^{\prime}\right|\right) \quad \text{for } x \in [0, 1]

that follows from

\displaystyle f(x)=\int_{0}^{1} f(u) d u+\int_{0}^{x} u f^{\prime}(u) d u+\int_{x}^{1}(u-1) f^{\prime}(u) d u
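The identity is easy to verify numerically; the test function below, f(u) = e^u \cos(3u) , and the evaluation point are arbitrary choices.

```python
import numpy as np

# Verify f(x) = int_0^1 f + int_0^x u f'(u) du + int_x^1 (u-1) f'(u) du
# for the (arbitrary) test function f(u) = exp(u) * cos(3u), at x = 0.37.
f = lambda u: np.exp(u) * np.cos(3 * u)
fp = lambda u: np.exp(u) * (np.cos(3 * u) - 3 * np.sin(3 * u))

def trap(g, lo, hi, m=200_000):
    # composite trapezoid rule on m uniformly spaced nodes
    u = np.linspace(lo, hi, m)
    y = g(u)
    return float(np.sum((y[1:] + y[:-1]) / 2) * (hi - lo) / (m - 1))

x = 0.37
rhs = (trap(f, 0, 1)
       + trap(lambda u: u * fp(u), 0, x)
       + trap(lambda u: (u - 1) * fp(u), x, 1))
print(abs(f(x) - rhs))   # ~0 up to quadrature error
```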

A change of variables to the interval [x-\delta/2, x+\delta/2] gives

\displaystyle \left|f\left(x\right)\right| \leqslant \delta^{-1} \int_{x-\delta / 2}^{x+\delta / 2}|f(\beta)| d \beta+\frac{1}{2} \int_{x-\delta / 2}^{x+\delta / 2}\left|f^{\prime}(\beta)\right| d \beta

Applying this to \displaystyle f(x)= S(x)^2 , we get

\displaystyle \left|S\left(x_{r}\right)\right|^{2} \leqslant \delta^{-1} \int_{x_{r}-\delta / 2}^{x_{r}+\delta / 2}|S(\beta)|^{2} d \beta+\int_{x_{r}-\delta / 2}^{x_{r}+\delta / 2}\left|S(\beta) S^{\prime}(\beta)\right| d \beta

The \delta spacing implies that the intervals \left(x_{r}-\frac{1}{2} \delta, x_{r}+\frac{1}{2} \delta\right) are disjoint (modulo 1) and we have


\displaystyle \sum_{r=1}^{R}\left|S\left(x_{r}\right)\right|^{2} \leqslant \delta^{-1} \int_{0}^{1}|S(\beta)|^{2} d \beta+\int_{0}^{1}\left|S(\beta) S^{\prime}(\beta)\right| d \beta

The first term on the right, by Parseval (opening the square and integrating), gives

\displaystyle \frac{1}{\delta} \sum_{n=M+1}^{M+N}\left|a_{n}\right|^{2}.

The second term, after Cauchy-Schwarz, is

\displaystyle \leqslant\left(\int_{0}^{1}|S|^{2} d \beta \right)^{1 / 2}\left(\int_{0}^{1}\left|S^{\prime}\right|^{2} d \beta\right)^{1 / 2}=\left(\sum\left|a_{n}\right|^{2}\right)^{1 / 2}\left(\sum\left|2 \pi n a_{n}\right|^{2}\right)^{1 / 2}

The bound does not depend on M because we can always shift the interval and assume |n| \le N/2 , so that the second factor is at most \pi N \left(\sum\left|a_{n}\right|^{2}\right)^{1/2} ,

and we get

\displaystyle \sum_{r=1}^{R}\left|S\left(x_{r}\right)\right|^{2} \leqslant\left(\delta^{-1}+\pi N\right) \sum\left|a_{n}\right|^{2}
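A random numerical test of Gallagher’s bound, with points built to be at least \delta apart modulo 1 (all sizes below are illustrative):

```python
import numpy as np

# Check sum_r |S(x_r)|^2 <= (pi*N + 1/delta) * sum |a_n|^2 on random data.
rng = np.random.default_rng(2)
N, R = 30, 10
delta = 1 / (2 * R)
# R points in [0,1): consecutive gaps and the wrap-around gap are >= delta
x = np.arange(R) / R + rng.uniform(0, delta, size=R)
M = 5
n = np.arange(M + 1, M + N + 1)
a = rng.normal(size=N) + 1j * rng.normal(size=N)

S = np.exp(2j * np.pi * np.outer(x, n)) @ a   # S(x_r) for each r
lhs = np.sum(np.abs(S) ** 2)
rhs = (np.pi * N + 1 / delta) * np.sum(np.abs(a) ** 2)
print(lhs <= rhs)   # True
```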


Montgomery’s Proof: The main idea is to use duality: if for an R\times S matrix \displaystyle \left[a_{r s}\right] we have

\displaystyle \sum_{s=1}^{S}\left|\sum_{r=1}^{R} a_{r s} x_{r}\right|^{2} \leqslant C \sum_{r=1}^{R}\left|x_{r}\right|^{2}

for every complex sequence (x_r), \; r=1, 2, \cdots, R , then we also have

\displaystyle \sum_{r=1}^{R}\left|\sum_{s=1}^{S} a_{r s} y_{s}\right|^{2} \leqslant C \sum_{s=1}^{S}\left|y_{s}\right|^{2}

for arbitrary sequences \displaystyle (y_s), \; s=1, 2, \cdots, S .

Duality is really the fact that the operator norms of a matrix and its conjugate transpose are equal.
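This linear-algebra fact, that a matrix and its conjugate transpose have the same operator norm (largest singular value), is easy to confirm numerically:

```python
import numpy as np

# The best constant C in both inequalities is ||A||^2 for the spectral
# norm; A and its conjugate transpose share that norm.
rng = np.random.default_rng(3)
A = rng.normal(size=(5, 8)) + 1j * rng.normal(size=(5, 8))
print(np.linalg.norm(A, 2), np.linalg.norm(A.conj().T, 2))  # equal
```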

So we are going to prove the dual form

\displaystyle \sum_{n=M+1}^{M+N}\left|\sum_{r} b_{r} e\left(n x_{r}\right)\right|^{2}\le \Delta \sum_{r}\left|b_{r}\right|^{2}

This form is better because we can open the square and execute the summation over M+1\le n \le M+N to get

\displaystyle \sum_{r_1, r_2} b_{r_1} \bar b _{r_2} \sum _{n=M+1}^{M+N} e\left(n\left(x_{r_1}-x_{r_2}\right)\right)

The diagonal terms r_1=r_2 give

\displaystyle N \sum_{r=1}^{R}\left|b_{r}\right|^{2}

And the off-diagonal terms contribute

\displaystyle \sum_{r_1 \neq r_2} b_{r_1}\bar b_{r_2}\, e\left(\left(M+\frac{N+1}{2}\right)\left(x_{r_1} -x_{r_2}\right)\right)  \frac{\sin \pi N\left(x_{r_1}-x_{r_2}\right)}{\sin \pi\left(x_{r_1}-x_{r_2}\right)}

because \displaystyle \sum_{n=M+1}^{M+N} e\left(nx\right) = e\left(\left(M+\frac{N+1}{2}\right)x\right)\frac{\sin \left(\pi Nx\right)}{\sin \pi x} .
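This closed form for the geometric sum, with phase e\left(\left(M+\frac{N+1}{2}\right)x\right) , can be confirmed numerically (arbitrary test values, x not an integer):

```python
import numpy as np

# Check sum_{n=M+1}^{M+N} e(nx) = e((M + (N+1)/2) x) * sin(pi*N*x) / sin(pi*x)
M, N, x = 3, 7, 0.29          # arbitrary test values
n = np.arange(M + 1, M + N + 1)
lhs = np.sum(np.exp(2j * np.pi * n * x))
rhs = (np.exp(2j * np.pi * (M + (N + 1) / 2) * x)
       * np.sin(np.pi * N * x) / np.sin(np.pi * x))
print(abs(lhs - rhs))   # ~0
```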

Taking u_r = b_r\, e\left(\left(M+\frac{N+1}{2}\right)x_r\right) , we are reduced to bounding

\displaystyle \left|\sum_{r_1 \neq r_2} u_{r_1} \bar{u_{r_2}} \frac{\sin \pi N\left(x_{r_1}-x_{r_2}\right)}{\sin \pi\left(x_{r_1}-x_{r_2}\right)}\right|

This is a Hilbert inequality problem for the kernel K(x, y) =\frac{\sin (\pi N(x-y))}{\sin (\pi (x-y))} .

We will show that, for any real \lambda ,

\displaystyle \left|\sum_{r_1 \neq r_2} u_{r_1} \bar{u_{r_2}} \frac{\sin \pi \lambda \left(x_{r_1}-x_{r_2}\right)}{\sin \pi\left(x_{r_1}-x_{r_2}\right)}\right| \le \frac{1}{\delta} \sum_{r=1}^{R}|u_r|^2

and hence the bound for the dual form is \displaystyle \left(N+\frac{1}{\delta}\right) \sum_{r=1}^{R}|b_r|^2 , and by duality we get

\displaystyle \Delta = N+ \frac{1}{\delta}.

Writing the sine in the numerator in terms of exponentials, the problem reduces, with the choice w_{r}=u_{r} e^{\pm \pi i \lambda x_{r}} , to bounding

\displaystyle \left|\sum_{r_1 \neq r_2} w_{r_1} \bar{w_{r_2}} \frac{1}{\sin \pi\left(x_{r_1}-x_{r_2}\right)}\right| \le \frac{1}{\delta} \sum_{r=1}^{R}|w_r|^2

By writing \csc (\pi x) as

\displaystyle \frac{1}{\pi} \sum_{n}(-1)^{n}\frac{1}{x-n}

we can further reduce the problem to proving the following:

Montgomery-Vaughan: For any sequence of real numbers x_1\le x_2 \le \cdots \le x_R , well separated in the sense that |x_{i}-x_{i-1}|\ge \delta , we have

\displaystyle \left|\sum_{r \neq s} \frac{w_{r} \bar{w}_{s}}{x_{r}-x_{s}}\right| \leq \frac{\pi}{\delta} \sum\left|w_{r}\right|^{2}

\square
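A random numerical test of the Montgomery-Vaughan inequality (sizes, spacing, and data below are arbitrary):

```python
import numpy as np

# Check |sum_{r != s} w_r * conj(w_s) / (x_r - x_s)| <= (pi/delta) * sum |w_r|^2
rng = np.random.default_rng(4)
R, delta = 50, 0.1
x = np.cumsum(delta + rng.uniform(0, 1, size=R))   # increasing, gaps >= delta
w = rng.normal(size=R) + 1j * rng.normal(size=R)

D = x[:, None] - x[None, :]
np.fill_diagonal(D, np.inf)                        # drop the diagonal r = s
lhs = abs(np.sum(np.outer(w, w.conj()) / D))
rhs = (np.pi / delta) * np.sum(np.abs(w) ** 2)
print(lhs <= rhs)   # True
```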


Selberg’s improvement:

Some of the loss comes from the sharp cut-off of the interval [M+1, M+N] for the values of n . So we majorize the indicator of the interval by a smooth function; a good choice of smooth function gives \Delta = N+\delta^{-1}-1 .

If F(n) \ge 1_{[M+1, M+N]}(n) , we observe that

\displaystyle \sum_{n=M+1}^{M+N}\left|\sum_{r} e(nx_r) v_r \right|^2 \le \sum_{n} F(n) \left|\sum_{r} e(nx_r) v_r \right|^2

Opening the square, the right-hand side equals \displaystyle \sum_{r_1 , r_2} v_{r_1} \bar{v}_{r_2}\, L (x_{r_1}-x_{r_2}) , where

\displaystyle L(x) =\sum_{n} F(n) e(nx)

If L(x) vanishes for \| x\|\ge \delta , then by the well-spaced property only the diagonal terms survive in the above sum, giving the bound

\displaystyle L(0) \sum_{r} |v_r|^2

Selberg’s choice of F (a majorant with \hat{F} supported in [-\delta, \delta] ) gives, by Poisson summation,

\displaystyle L(0) = \sum_{n} F(n) = \sum_{n}\hat {F}(n) =\hat {F}(0) =\int F(t) dt=N+\delta^{-1}-1

Let \displaystyle G(z)=\left(\frac{\sin \pi z}{\pi}\right)^{2}\left(\sum_{n=0}^{\infty}(z-n)^{-2}-\sum_{n=1}^{\infty}(z+n)^{-2}+2 z^{-1}\right) be Beurling’s function majorizing \operatorname{sgn}(z) , and set

\displaystyle G_1(z) = \frac{1}{2} G(z)+\frac{1}{2} G(\delta(N-1)-z)

\displaystyle G_2(z) =G_1(\delta z)

Then F(n) = G_2(n) (after translating the interval to start at 0 ) works, and we have L(x) =0 for \| x\|> \delta .
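As a sanity check, one can truncate the two series \sum_{n\ge 0}(z-n)^{-2} and \sum_{n\ge 1}(z+n)^{-2} at a hypothetical cutoff K (their tails nearly cancel, so the truncation error is small) and verify numerically that G(z) majorizes \operatorname{sgn}(z) on a grid avoiding the integers:

```python
import numpy as np

# Truncated Beurling function: G(z) >= sgn(z), equality at nonzero integers.
# The cutoff K and the sample grid are arbitrary illustrative choices.
def G(z, K=2000):
    n = np.arange(0, K + 1)
    m = np.arange(1, K + 1)
    s = np.sum((z - n) ** -2.0) - np.sum((z + m) ** -2.0) + 2.0 / z
    return (np.sin(np.pi * z) / np.pi) ** 2 * s

z = np.arange(-5, 5) + 0.137          # avoids integers (and 0)
vals = np.array([G(t) for t in z])
print(np.all(vals >= np.sign(z) - 1e-4))   # True up to truncation error
```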

Cohen’s Trick:

One can repeat the sum a number of times, thereby creating more points. Because S(x) is invariant under translation by integers, consider x_r +k , \; k=1,2,3,\cdots, K for each of the points to get

\displaystyle K \sum_{r=1}^{R}\left|S\left(x_{r}\right)\right|^{2}=\sum_{k=1}^{K} \sum_{r=1}^{R}\left|S\left(x_{r}+k\right)\right|^{2}

We have n(x+k)= nK \left(\frac{x}{K}+\frac{k}{K}\right) , so apply the large sieve (with Montgomery’s constant) to the frequencies nK and the points \frac{x_r}{K}+\frac{k}{K} (which are \frac{\delta}{K} -spaced) to get

\displaystyle \le (K(N-1)+1+\frac{K}{\delta}) \sum\left|a_{n}\right|^{2}

for the repeated sum and

\displaystyle \le \frac{1}{K} (K(N-1)+1+\frac{K}{\delta}) \sum\left|a_{n}\right|^{2}

for the original sum.

Taking the limit \displaystyle K \to \infty , we recover \displaystyle \Delta = N+\delta^{-1}-1 .
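The improvement as K grows is transparent numerically; the values of N and \delta below are arbitrary.

```python
# Bound per copy: (K(N-1) + 1 + K/delta) / K -> N - 1 + 1/delta as K grows.
N, delta = 100, 0.01
for K in (1, 10, 1000):
    print(K, (K * (N - 1) + 1 + K / delta) / K)
# K = 1 gives Montgomery's N + 1/delta; large K approaches N - 1 + 1/delta.
```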

