Hölder’s inequality is repeated Cauchy–Schwarz

Cauchy–Schwarz and Hölder’s inequality are the basic tools for controlling the interaction, or correlation, of two functions. For finite sequences $f=(f_i)$ and $g=(g_i),$ their correlation is measured by the inner product

$\displaystyle \langle f,g\rangle=\sum_i f_i\overline{g_i}.$

The triangle inequality reduces the problem to estimating $\sum_i|f_i g_i|.$ Thus the central question is this: how can one control the total interaction $\displaystyle \sum_i|f_i g_i|$ using only separate information about the sizes of $f$ and $g?$ Cauchy–Schwarz answers this when both sequences are measured using squares:

$\displaystyle \sum_i|f_i g_i|\le\Big(\sum_i|f_i|^2\Big)^{1/2}\Big(\sum_i|g_i|^2\Big)^{1/2}.$

Hölder’s inequality is the general version. It says that one may measure $f$ and $g$ using different powers, provided the powers fit together correctly. If $\displaystyle 1<p,q<\infty$ and $\displaystyle \frac1p+\frac1q=1,$ then

$\displaystyle \Big|\langle f,g\rangle\Big|\le\sum_i|f_i g_i|\le\Big(\sum_i|f_i|^p\Big)^{1/p}\Big(\sum_i|g_i|^q\Big)^{1/q}.$

Equivalently, in norm notation,

$\displaystyle \Big|\langle f,g\rangle\Big|\le \|f\|_{p}\,\|g\|_{q}.$
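As a quick numerical sanity check (not a proof), the following Python sketch evaluates both sides of the norm form of Hölder for random complex sequences. The helper names `p_norm` and `holder_gap`, the sample size, and the exponents tried are illustrative assumptions, not part of the argument.

```python
import random

def p_norm(x, p):
    """Return the l^p norm (sum |x_i|^p)^(1/p) of a finite sequence."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def holder_gap(f, g, p):
    """Return ||f||_p * ||g||_q - |<f, g>| for the conjugate exponent q = p/(p-1)."""
    q = p / (p - 1.0)
    inner = sum(a * b.conjugate() for a, b in zip(f, g))
    return p_norm(f, p) * p_norm(g, q) - abs(inner)

random.seed(0)
f = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(50)]
g = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(50)]

# Hölder predicts a nonnegative gap for every conjugate pair (p, q).
for p in (1.5, 2.0, 4.0, 8.0):
    print(p, holder_gap(f, g, p) >= 0)
```

The case `p = 2.0` is Cauchy–Schwarz itself; the other exponents exercise the general statement.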

The deeper point of this discussion is that Hölder is not an independent miracle that sits above Cauchy–Schwarz. Morally, it is Cauchy–Schwarz repeated at finer and finer interpolation scales. Cauchy–Schwarz controls a midpoint. Repeating midpoint estimates reaches dyadic fractions such as $\displaystyle 1/2,1/4,3/4,1/8,3/8,$ while finite chains of such steps already produce every rational proportion. Continuity then fills in every real interpolation parameter.

Examples:

For $p=q=2,$ Hölder is precisely Cauchy–Schwarz. The next case, $(p,q)=(4,\frac43),$ already shows the main idea. Let

$\displaystyle S:=\sum_i|f_i g_i|,\quad A:=\sum_i|f_i|^4,\quad B:=\sum_i|g_i|^{4/3}.$

We want to move from the mixed product $|f_i g_i|$ toward the two pure endpoint expressions $|f_i|^4$ and $|g_i|^{4/3}.$ The first split is

$\displaystyle |f_i g_i|=(|f_i|^2|g_i|^{2/3})^{1/2}(|g_i|^{4/3})^{1/2}.$

Applying Cauchy–Schwarz gives

$\displaystyle S^2\le\Big(\sum_i|f_i|^2|g_i|^{2/3}\Big)B.$

This does not finish the proof, because the first factor is still mixed. But it is less mixed than before: the power of $f$ has moved from $1$ to $2,$ while the remaining power of $g$ has dropped from $1$ to $2/3.$ Since $|f_i|^2|g_i|^{2/3}=(|f_i|^4)^{1/2}(|g_i|^{4/3})^{1/2},$ we may apply Cauchy–Schwarz once more:

$\displaystyle \Big(\sum_i|f_i|^2|g_i|^{2/3}\Big)^2\le AB.$

Combining the two inequalities gives $\displaystyle S^4\le AB^3,$ and hence

$\displaystyle \sum_i|f_i g_i|\le\Big(\sum_i|f_i|^4\Big)^{1/4}\Big(\sum_i|g_i|^{4/3}\Big)^{3/4}.$

Thus Hölder for $\displaystyle \Big(4,\frac43\Big)$ is simply Cauchy–Schwarz applied twice. The proof works because each application moves the mixed expression one step nearer to a pure endpoint.
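The two steps can be checked numerically. The sketch below is only an illustration on random real data; the function name `chain_4_43` and the intermediate symbol `T` (for the mixed sum $\sum_i|f_i|^2|g_i|^{2/3}$) are labels introduced here, not notation from the argument.

```python
import random

def chain_4_43(f, g):
    """Return (S, T, A, B): the mixed sum S, the intermediate mixed sum T,
    and the two pure endpoint sums A and B for (p, q) = (4, 4/3)."""
    S = sum(abs(a) * abs(b) for a, b in zip(f, g))
    T = sum(abs(a) ** 2 * abs(b) ** (2 / 3) for a, b in zip(f, g))
    A = sum(abs(a) ** 4 for a in f)
    B = sum(abs(b) ** (4 / 3) for b in g)
    return S, T, A, B

random.seed(1)
f = [random.uniform(-1, 1) for _ in range(30)]
g = [random.uniform(-1, 1) for _ in range(30)]
S, T, A, B = chain_4_43(f, g)

print(S ** 2 <= T * B)        # first Cauchy-Schwarz step
print(T ** 2 <= A * B)        # second Cauchy-Schwarz step
print(S ** 4 <= A * B ** 3)   # the two steps combined
```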

The case $(p,q)=(8,\frac87)$ is the same story, with one additional intermediate stage. Define

$\displaystyle S_0:=\sum_i|f_i g_i|,\quad S_1:=\sum_i|f_i|^2|g_i|^{6/7},\qquad S_2:=\sum_i|f_i|^4|g_i|^{4/7},$

and let $A:=\sum_i|f_i|^8,\quad B:=\sum_i|g_i|^{8/7}.$ Cauchy–Schwarz supplies the chain

$\displaystyle S_0^2\le S_1B,\quad S_1^2\le S_2B,\quad S_2^2\le AB.$

The final inequality reaches the pure $f$ endpoint. Working backwards through the chain gives

$\displaystyle S_0^8\le AB^7,$

and taking eighth roots gives Hölder for $(p,q)=(8,\frac87)$:

$\displaystyle \sum_i|f_i g_i|\le\Big(\sum_i|f_i|^8\Big)^{1/8}\Big(\sum_i|g_i|^{8/7}\Big)^{7/8}.$

The pattern is now visible. Every Cauchy–Schwarz step doubles the current power of $f,$ while the unassigned power of $g$ is moved into the fixed endpoint quantity $\displaystyle \sum_i|g_i|^{8/7}.$ These are the easiest cases because the relevant interpolation parameters are dyadic.
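The ladder can be sketched for any $p=2^n.$ In the code below, the stage-$j$ exponent of $|g_i|$ is taken to be $(2^n-2^j)/(2^n-1)$; this formula is an extrapolation from the two worked cases (it reproduces $1,\frac23$ for $n=2$ and $1,\frac67,\frac47$ for $n=3$) and should be read as an assumption of the sketch, as should the name `ladder_sums`.

```python
import random

def ladder_sums(f, g, n):
    """Return ([S_0, ..., S_n], B) for p = 2**n and q = 2**n / (2**n - 1).

    S_j = sum |f_i|^(2^j) |g_i|^(e_j) with e_j = (2^n - 2^j) / (2^n - 1),
    so S_0 is the mixed sum and S_n is the pure f endpoint.
    """
    P = 2 ** n
    q = P / (P - 1)
    sums = []
    for j in range(n + 1):
        e = (P - 2 ** j) / (P - 1)   # power of |g_i| still attached at stage j
        sums.append(sum(abs(a) ** (2 ** j) * abs(b) ** e for a, b in zip(f, g)))
    B = sum(abs(b) ** q for b in g)
    return sums, B

random.seed(2)
f = [random.gauss(0, 1) for _ in range(20)]
g = [random.gauss(0, 1) for _ in range(20)]
sums, B = ladder_sums(f, g, 3)       # the (8, 8/7) case

# Each rung is one Cauchy-Schwarz step: S_j^2 <= S_{j+1} * B.
print([sums[j] ** 2 <= sums[j + 1] * B for j in range(3)])
# Chaining the rungs: S_0^8 <= A * B^7, where A = S_3.
print(sums[0] ** 8 <= sums[3] * B ** 7)
```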

The pair $\displaystyle (p,q)=(\frac32,3)$ has a different appearance. The proof no longer moves along one straight ladder toward an endpoint. Instead, two mixed sums arise and control each other, so repeated Cauchy–Schwarz forms a short cycle. Take $x_i,y_i\ge0$ (for instance $x_i=|f_i|$ and $y_i=|g_i|$) and let

$\displaystyle S:=\sum_i x_i y_i,\quad A:=\sum_i x_i^{3/2},\quad B:=\sum_i y_i^3.$

We begin from the product $x_i y_i$ and split it as $x_i y_i=x_i^{3/4}(x_i^{1/4}y_i),$ so Cauchy–Schwarz gives

$\displaystyle S^2\le A U,\quad U:=\sum_i x_i^{1/2}y_i^2.$

The first step has created a new mixed quantity $U.$ It is not an endpoint norm, but it admits a complementary decomposition: $x_i^{1/2}y_i^2=y_i^{3/2}\Big(x_i^{1/2}y_i^{1/2}\Big),$ and hence applying Cauchy–Schwarz again gives $\displaystyle U^2\le BS.$

The two estimates form a loop: $S^2\le AU,\quad U^2\le BS.$ The first inequality carries us from $S$ to $U,$ while the second carries us back from $U$ to $S.$ Eliminating the intermediate quantity $U$ gives

$\displaystyle S^4\le A^2U^2\le A^2BS.$

If $S=0$ there is nothing to prove. Otherwise, dividing by $S$ and taking cube roots gives

$\displaystyle \sum_i x_i y_i\le\Big(\sum_i x_i^{3/2}\Big)^{2/3}\Big(\sum_i y_i^3\Big)^{1/3}.$

This is Hölder for $(\frac32,3).$ The point of this example is not merely that it proves one more case. It shows that “repeated Cauchy–Schwarz” need not mean a rigid one-directional iteration. It may produce a small system of inequalities between several mixed sums; solving that system gives the final exponent.
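A small numerical illustration of the loop, on random nonnegative data. The function name `loop_quantities` is introduced here for the sketch; the quantities it returns are the $S,U,A,B$ defined above.

```python
import random

def loop_quantities(x, y):
    """Return (S, U, A, B) for nonnegative sequences x, y in the (3/2, 3) case."""
    S = sum(a * b for a, b in zip(x, y))
    U = sum(a ** 0.5 * b ** 2 for a, b in zip(x, y))
    A = sum(a ** 1.5 for a in x)
    B = sum(b ** 3 for b in y)
    return S, U, A, B

random.seed(3)
x = [random.uniform(0, 1) for _ in range(25)]
y = [random.uniform(0, 1) for _ in range(25)]
S, U, A, B = loop_quantities(x, y)

print(S ** 2 <= A * U)         # S is controlled by U
print(U ** 2 <= B * S)         # U is controlled by S
print(S ** 3 <= A ** 2 * B)    # the closed loop
```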

Every rational Hölder exponent

The preceding examples are all instances of one finite construction. Fix an integer $\displaystyle m\ge2,$ let $\displaystyle u_i,v_i\ge0,$ and define

$\displaystyle M_r:=\sum_i u_i^{m-r}v_i^r,\qquad 0\le r\le m.$

The endpoints are the two pure sums $M_0=\sum_i u_i^m$ and $M_m=\sum_i v_i^m.$ The intermediate sums are obtained by transferring one power at a time from $u_i$ to $v_i.$ The crucial observation is that each intermediate monomial is exactly the geometric mean of its two neighbors:

$\displaystyle u_i^{m-r}v_i^r=\Big(u_i^{m-r+1}v_i^{r-1}\Big)^{1/2}\Big(u_i^{m-r-1}v_i^{r+1}\Big)^{1/2}.$

Therefore Cauchy–Schwarz gives, for every $1\le r\le m-1,$

$\displaystyle M_r^2\le M_{r-1}M_{r+1}.$

Thus the list $M_0,M_1,\ldots,M_m$ is log-convex. This is the finite-chain version of midpoint convexity. Assuming every $M_r$ is positive (for instance when all $u_i,v_i>0$) and writing $\displaystyle L_r:=\log M_r,$ this says

$\displaystyle 2L_r\le L_{r-1}+L_{r+1}.$

Equivalently, the discrete slopes increase:

$L_1-L_0\le L_2-L_1\le\cdots\le L_m-L_{m-1}.$

Defining the successive slopes $d_r:=L_r-L_{r-1},$ we have $d_1\le d_2\le\cdots\le d_m.$ This ordered list of slopes is the whole mechanism. To estimate $L_k,$ split the total change from $L_0$ to $L_m$ at $k$:

$\displaystyle L_k-L_0=d_1+\cdots+d_k,\quad L_m-L_k=d_{k+1}+\cdots+d_m.$

Every slope in the first sum is at most every slope in the second sum. Hence, for $1\le k\le m-1,$ the average slope on the left is at most the average slope on the right:

$\displaystyle \frac{L_k-L_0}{k}\le\frac{L_m-L_k}{m-k}.$

Rearranging gives $mL_k\le(m-k)L_0+kL_m.$ In geometric language, the points $(r,L_r)$ lie on or below the line segment joining the endpoints $(0,L_0)$ and $(m,L_m).$ Thus, for every $0\le k\le m$ (the cases $k=0$ and $k=m$ being equalities),

$\displaystyle L_k\le\frac{m-k}{m}L_0+\frac{k}{m}L_m.$

Exponentiating gives $M_k\le M_0^{(m-k)/m}M_m^{k/m}.$ In other words,

$\displaystyle \sum_i u_i^{m-k}v_i^k\le\Big(\sum_i u_i^m\Big)^{(m-k)/m}\Big(\sum_i v_i^m\Big)^{k/m}.$
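Because the chain involves only integer powers of $u_i$ and $v_i,$ it can be verified in exact integer arithmetic when the inputs are integers. The following sketch does this; the function name `moments` and the sample data are illustrative choices.

```python
def moments(u, v, m):
    """Return [M_0, ..., M_m] with M_r = sum_i u_i^(m-r) * v_i^r."""
    return [sum(a ** (m - r) * b ** r for a, b in zip(u, v)) for r in range(m + 1)]

# Positive integer data keeps every comparison exact (no rounding).
u = [1, 2, 3, 5]
v = [4, 1, 2, 3]
m = 6
M = moments(u, v, m)

# Local midpoint inequalities: M_r^2 <= M_{r-1} * M_{r+1}.
local_ok = all(M[r] ** 2 <= M[r - 1] * M[r + 1] for r in range(1, m))

# Global chain estimate, written without roots: M_k^m <= M_0^(m-k) * M_m^k.
global_ok = all(M[k] ** m <= M[0] ** (m - k) * M[m] ** k for k in range(m + 1))

print(local_ok, global_ok)
```

Raising both sides to the power $m$ avoids fractional exponents, so the global estimate is tested in the form $M_k^m\le M_0^{m-k}M_m^k.$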

This formula already contains Hölder. For $1\le k\le m-1,$ choose $u_i:=|f_i|^{1/(m-k)},\quad v_i:=|g_i|^{1/k}.$ Then $u_i^{m-k}v_i^k=|f_i g_i|,$ while $u_i^m=|f_i|^{m/(m-k)},\quad v_i^m=|g_i|^{m/k}.$ The chain estimate becomes

$\displaystyle \sum_i|f_i g_i|\le\Big(\sum_i|f_i|^{m/(m-k)}\Big)^{(m-k)/m}\Big(\sum_i|g_i|^{m/k}\Big)^{k/m}.$

This is Hölder with $p=\frac{m}{m-k},\quad q=\frac{m}{k},\quad \frac1p+\frac1q=1.$ Every rational exponent $\displaystyle p>1$ arises in this way. Indeed, write $\displaystyle p=s/r$ with integers $\displaystyle s>r\ge1.$ Taking $m=s,\quad k=s-r,$ gives

$\displaystyle p=\frac{m}{m-k}=\frac{s}{r},\quad q=\frac{m}{k}=\frac{s}{s-r}.$
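The passage from a rational exponent to chain parameters is mechanical. A minimal sketch using Python's `fractions.Fraction`; the function name `chain_parameters` is hypothetical, and the fraction is automatically reduced to lowest terms, which yields the shortest chain of this kind.

```python
from fractions import Fraction

def chain_parameters(p):
    """Given a rational p > 1, return (m, k, q) with p = m/(m-k) and q = m/k."""
    p = Fraction(p)
    if p <= 1:
        raise ValueError("p must be a rational number greater than 1")
    s, r = p.numerator, p.denominator   # p = s/r in lowest terms, s > r >= 1
    m, k = s, s - r
    return m, k, Fraction(m, k)

for p in (Fraction(3, 2), Fraction(5, 3), Fraction(4), Fraction(8)):
    m, k, q = chain_parameters(p)
    # The pair (p, q) is conjugate: 1/p + 1/q equals 1 exactly.
    print(p, (m, k), q, 1 / p + 1 / q == 1)
```

For $p=4$ this gives $m=4,\ k=3,$ a chain of three local inequalities; the two-step dyadic ladder used earlier for $(4,\frac43)$ is a different, shorter route to the same estimate.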

There is also a completely multiplicative elimination procedure, which shows exactly how the local Cauchy–Schwarz inequalities combine. Define

$\displaystyle D_r:=\frac{M_r^2}{M_{r-1}M_{r+1}}\le1,\qquad 1\le r\le m-1.$

We want to multiply powers of the $D_r$ so that every interior $M_j$ cancels except $M_k.$ The required powers are not guessed. Set $c_0=c_m=0$ and require that the exponent of $M_j$ in the product $\prod_r D_r^{c_r}$ vanish for every interior index $j\ne k.$ Since $\log D_r=2L_r-L_{r-1}-L_{r+1},$ the coefficient of $L_j$ in the logarithm of the product is $2c_j-c_{j-1}-c_{j+1},$ so the cancellation condition is $2c_j-c_{j-1}-c_{j+1}=0,\quad 1\le j\le m-1,\ j\ne k.$ Thus the weights must change at a constant rate on either side of $k,$ so up to a common positive factor they are forced to be the piecewise-linear sequence

$\displaystyle c_r=\begin{cases}r(m-k),&1\le r\le k,\\ k(m-r),&k\le r\le m-1.\end{cases}$

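At $r=k,$ the slope drops from $m-k$ to $-k,$ a jump of size $m,$ so $2c_k-c_{k-1}-c_{k+1}=m$; this produces precisely the desired power $M_k^m.$ The boundary terms contribute $M_0^{-c_1}=M_0^{-(m-k)}$ and $M_m^{-c_{m-1}}=M_m^{-k}.$ Expanding the product now gives the exact identity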

$\displaystyle \prod_{r=1}^{m-1}D_r^{c_r}=\frac{M_k^m}{M_0^{m-k}M_m^k}.$

Since every $D_r\le1$ and every $c_r\ge0,$ the product is at most $1$ , proving again that $M_k^m\le M_0^{m-k}M_m^k$ .
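The identity can be confirmed in exact rational arithmetic. The sketch below uses positive integer data so that every $M_r$ is a positive integer; the names `moments`, `weights`, and `defect_product` are introduced for illustration only.

```python
from fractions import Fraction

def moments(u, v, m):
    """Return [M_0, ..., M_m] with M_r = sum_i u_i^(m-r) * v_i^r."""
    return [sum(a ** (m - r) * b ** r for a, b in zip(u, v)) for r in range(m + 1)]

def weights(m, k):
    """Return [c_0, ..., c_m]: r(m-k) for r <= k and k(m-r) for r >= k."""
    return [r * (m - k) if r <= k else k * (m - r) for r in range(m + 1)]

def defect_product(M, k):
    """Return the exact value of prod_{r=1}^{m-1} D_r^{c_r} as a Fraction."""
    m = len(M) - 1
    c = weights(m, k)
    prod = Fraction(1)
    for r in range(1, m):
        D = Fraction(M[r] ** 2, M[r - 1] * M[r + 1])   # local defect, at most 1
        prod *= D ** c[r]
    return prod

u, v, m, k = [1, 2, 3], [2, 1, 4], 5, 2
M = moments(u, v, m)
lhs = defect_product(M, k)
rhs = Fraction(M[k] ** m, M[0] ** (m - k) * M[m] ** k)
print(weights(m, k), lhs == rhs, lhs <= 1)
```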

The $(\frac32,3)$ cycle above is exactly the case $\displaystyle m=3$ and $\displaystyle k=1.$ Indeed, taking $\displaystyle u_i=x_i^{1/2}$ and $\displaystyle v_i=y_i$ gives

$\displaystyle M_0=\sum_i x_i^{3/2},\quad M_1=\sum_i x_i y_i,\quad M_2=\sum_i x_i^{1/2}y_i^2,\quad M_3=\sum_i y_i^3.$

The two local midpoint inequalities are precisely

$\displaystyle M_1^2\le M_0M_2,\qquad M_2^2\le M_1M_3.$

Eliminating $M_2$ yields $M_1^3\le M_0^2M_3.$

Likewise, $\displaystyle m=5$ and $\displaystyle k=2$ give

$\displaystyle {M_1^2}\le {M_0M_2},\quad {M_2^2}\le {M_1M_3},\quad {M_3^2}\le{M_2M_4},\quad {M_4^2}\le {M_3M_5}.$

We want to start from $M_2^2\le M_1M_3$ and remove every quantity except $M_2$ itself and the endpoints $M_0,M_5$ . First remove $M_1$ using $M_1\le M_0^{1/2}M_2^{1/2}$ , obtaining $M_2^{3/2}\le M_0^{1/2}M_3$ . The only remaining unwanted term is now $M_3$ . To remove it, use its local relation $M_3^2\le M_2M_4$ ; this introduces $M_4$ , but only as a temporary obstruction. Remove that new term immediately with $M_4\le M_3^{1/2}M_5^{1/2}$ , which follows from $M_4^2\le M_3M_5$ . Hence $M_3^2\le M_2M_3^{1/2}M_5^{1/2}$ , so $M_3\le M_2^{2/3}M_5^{1/3}$ . Substituting this into $M_2^{3/2}\le M_0^{1/2}M_3$ leaves only $M_0,M_2,M_5$ : $M_2^{3/2}\le M_0^{1/2}M_2^{2/3}M_5^{1/3}$ . Thus $M_2^{5/6}\le M_0^{1/2}M_5^{1/3}$ , and therefore $M_2^5\le M_0^3M_5^2$ . Equivalently,

$\displaystyle M_2\le M_0^{3/5}M_5^{2/5},$

which is Hölder for $\displaystyle (p,q)=(\frac53,\frac52).$
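Each intermediate estimate in this elimination can be tested exactly after clearing fractional exponents: $M_2^{3/2}\le M_0^{1/2}M_3$ becomes $M_2^3\le M_0M_3^2,$ and $M_3\le M_2^{2/3}M_5^{1/3}$ becomes $M_3^3\le M_2^2M_5.$ The sketch below checks these on sample integer data; the function name and the data are illustrative assumptions.

```python
def moments(u, v, m):
    """Return [M_0, ..., M_m] with M_r = sum_i u_i^(m-r) * v_i^r."""
    return [sum(a ** (m - r) * b ** r for a, b in zip(u, v)) for r in range(m + 1)]

u = [3, 1, 4, 1, 5]
v = [2, 7, 1, 8, 2]
M0, M1, M2, M3, M4, M5 = moments(u, v, 5)

steps = {
    "M1 removed: M2^3 <= M0 * M3^2": M2 ** 3 <= M0 * M3 ** 2,
    "M4 removed: M3^3 <= M2^2 * M5": M3 ** 3 <= M2 ** 2 * M5,
    "final:      M2^5 <= M0^3 * M5^2": M2 ** 5 <= M0 ** 3 * M5 ** 2,
}
for label, holds in steps.items():
    print(label, holds)
```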

Thus finite repeated Cauchy–Schwarz arguments prove Hölder for every rational conjugate pair.

Cauchy–Schwarz as midpoint interpolation

The general mechanism is clearest when one removes the specific exponents. Let $\displaystyle A_i,B_i\ge0$ and define

$\displaystyle H(t):=\sum_{i=1}^N A_i^{1-t}B_i^t,\quad 0\le t\le1.$

The endpoints are $H(0)=\sum_iA_i$ and $H(1)=\sum_iB_i.$ The parameter $t$ measures how much of the exponent has been transferred from $A_i$ to $B_i.$ Now take two parameters $s,t\in[0,1].$ At their midpoint,

$\displaystyle H\Big(\frac{s+t}{2}\Big)=\sum_i\Big(A_i^{1-s}B_i^s\Big)^{1/2}\Big(A_i^{1-t}B_i^t\Big)^{1/2}.$

Cauchy–Schwarz therefore gives the midpoint inequality

$\displaystyle H\Big(\frac{s+t}{2}\Big)^2\le H(s)H(t).$

Whenever $H$ is positive, this is equivalent to

$\displaystyle \log H\Big(\frac{s+t}{2}\Big)\le\frac{\log H(s)+\log H(t)}{2}.$

Thus $\displaystyle \log H$ is midpoint convex. Combined with continuity, midpoint convexity yields full convexity: the logarithm of the mixed sum lies on or below the straight line joining its endpoint values. This is the exact analytic content of repeated Cauchy–Schwarz.

Applying the midpoint estimate first to $\displaystyle 0$ and $\displaystyle 1,$ then to the pairs $\displaystyle (0,1/2)$ and $\displaystyle (1/2,1),$ and so on, gives

$\displaystyle H\Big(\frac{k}{2^m}\Big)\le H(0)^{1-k/2^m}H(1)^{k/2^m}$

for every dyadic rational $\displaystyle k/2^m\in[0,1].$ Since $H$ is continuous on $(0,1)$ and the dyadic rationals are dense, the estimate extends to every $\displaystyle \theta\in[0,1]$ (the endpoints being trivial), and hence

$\displaystyle \sum_iA_i^{1-\theta}B_i^\theta\le\Big(\sum_iA_i\Big)^{1-\theta}\Big(\sum_iB_i\Big)^\theta.$

This is the basic interpolation inequality. Its proof is nothing more than Cauchy–Schwarz at all dyadic midpoint scales, followed by a limiting argument.
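The dyadic step can be made concrete. The sketch below builds an upper bound for $\log H(t)$ at a dyadic $t$ using nothing but midpoint averaging of bounds at coarser dyadic points, then compares it with the true value. The names `H` and `midpoint_bound` and the positive sample data are assumptions of the sketch; positivity is imposed so that all logarithms are defined.

```python
import math
import random
from fractions import Fraction

def H(A, B, t):
    """Return sum_i A_i^(1-t) * B_i^t for positive A_i, B_i."""
    t = float(t)
    return sum(a ** (1 - t) * b ** t for a, b in zip(A, B))

def midpoint_bound(A, B, t):
    """Upper bound for log H(t) at dyadic t in [0, 1], using only midpoint steps."""
    t = Fraction(t)
    d = t.denominator
    if not 0 <= t <= 1 or d & (d - 1):
        raise ValueError("t must be a dyadic rational in [0, 1]")
    if d == 1:                           # t is 0 or 1: use the endpoint itself
        return math.log(H(A, B, t))
    step = Fraction(1, d)                # t = k/2^n with k odd
    lo, hi = t - step, t + step          # neighbours with smaller denominators
    # Midpoint inequality: log H(t) <= (log H(lo) + log H(hi)) / 2.
    return 0.5 * (midpoint_bound(A, B, lo) + midpoint_bound(A, B, hi))

random.seed(4)
A = [random.uniform(0.1, 2.0) for _ in range(15)]
B = [random.uniform(0.1, 2.0) for _ in range(15)]
t = Fraction(5, 8)
bound = midpoint_bound(A, B, t)
line = (1 - float(t)) * math.log(H(A, B, 0)) + float(t) * math.log(H(A, B, 1))
print(math.log(H(A, B, t)) <= bound, abs(bound - line) < 1e-9)
```

The recursion never evaluates $H$ at an interior point; averaging the endpoint logarithms alone reproduces the linear interpolant, which is why the dyadic bound has exponents $1-k/2^m$ and $k/2^m.$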

Now let $\displaystyle 1<p,q<\infty$ satisfy $\displaystyle 1/p+1/q=1.$ Set

$\displaystyle A_i:=|f_i|^p,\quad B_i:=|g_i|^q,\qquad \theta:=\frac1q.$

Then $\displaystyle 1-\theta=1/p,$ and the mixed monomial becomes

$\displaystyle A_i^{1-\theta}B_i^\theta=\Big(|f_i|^p\Big)^{1/p}\Big(|g_i|^q\Big)^{1/q}=|f_i g_i|.$

The interpolation inequality is therefore exactly Hölder’s inequality:

$\displaystyle \sum_i|f_i g_i|\le\Big(\sum_i|f_i|^p\Big)^{1/p}\Big(\sum_i|g_i|^q\Big)^{1/q}.$

Thus Hölder is the continuous completion of repeated Cauchy–Schwarz: Cauchy–Schwarz gives midpoint control, finite chains give rational positions, and continuity gives every real position.

Convexity proof

The standard proof uses convexity. For $\displaystyle u,v\ge0$ and conjugate exponents $p,q,$ weighted arithmetic-geometric mean gives

$\displaystyle u^{1/p}v^{1/q}\le\frac{u}{p}+\frac{v}{q}.$

This is Young’s inequality, and it follows from convexity of the exponential function. If $\|f\|_p=0$ or $\|g\|_q=0,$ then $f$ or $g$ vanishes identically and Hölder is trivial. Otherwise, normalize the sequences by setting

$\displaystyle u_i:=\frac{|f_i|^p}{\|f\|_{p}^p},\quad v_i:=\frac{|g_i|^q}{\|g\|_{q}^q}.$

Then $\sum_i u_i=\sum_i v_i=1.$ Summing Young’s inequality gives

$\displaystyle \frac{\sum_i|f_i g_i|}{\|f\|_{p}\,\|g\|_{q}}=\sum_i u_i^{1/p}v_i^{1/q}\le\frac1p\sum_i u_i+\frac1q\sum_i v_i=1.$

This proves Hölder immediately. It is often the quickest route, but the interpolation proof explains the deeper geometry: Hölder is the log-convex continuation of Cauchy–Schwarz from one midpoint to every point between two endpoints.
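The normalization argument is easy to trace numerically. In the sketch below, `normalized_sum` (a name chosen here) forms the normalized sequences, checks Young's inequality term by term up to a small floating-point tolerance, and returns $\sum_i u_i^{1/p}v_i^{1/q}.$ It assumes both norms are nonzero.

```python
import random

def p_norm(x, p):
    """Return (sum |x_i|^p)^(1/p)."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def normalized_sum(f, g, p):
    """Return sum_i u_i^(1/p) v_i^(1/q) for the normalized sequences u, v."""
    q = p / (p - 1.0)
    nf, ng = p_norm(f, p), p_norm(g, q)      # assumed nonzero
    u = [abs(a) ** p / nf ** p for a in f]   # sums to 1
    v = [abs(b) ** q / ng ** q for b in g]   # sums to 1
    terms = [ui ** (1 / p) * vi ** (1 / q) for ui, vi in zip(u, v)]
    # Young's inequality for each index: u^(1/p) v^(1/q) <= u/p + v/q.
    assert all(t <= ui / p + vi / q + 1e-12 for t, ui, vi in zip(terms, u, v))
    return sum(terms)

random.seed(5)
f = [random.gauss(0, 1) for _ in range(40)]
g = [random.gauss(0, 1) for _ in range(40)]
p = 3.0
ratio = normalized_sum(f, g, p)
direct = sum(abs(a * b) for a, b in zip(f, g)) / (p_norm(f, p) * p_norm(g, p / (p - 1)))
print(ratio <= 1.0, abs(ratio - direct) < 1e-12)
```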

The same argument works for integrals. One replaces finite sums by integrals, applies the finite or simple-function case first, and then passes to general measurable functions by approximation. Thus Hölder is the fundamental principle that a product can be controlled by assigning complementary portions of its exponent to the two factors.
