The four-square theorem through Hermitian forms and reduction

The equation \displaystyle n=x_1^2+x_2^2+x_3^2+x_4^2 already suggests grouping the four integer variables into two Gaussian integers. Put z=x_1+ix_2 and w=x_3+ix_4 . Then

\displaystyle x_1^2+x_2^2+x_3^2+x_4^2=|z|^2+|w|^2.

Thus Lagrange’s theorem says that every positive integer is represented by the binary Hermitian form H_0(z,w)=|z|^2+|w|^2 on \mathbb Z[i]^2 . This reformulation changes the point of view. Rather than seeking four integers directly, we study positive definite Hermitian forms over the Gaussian integers and ask whether an apparently new unimodular form can really differ from the standard norm form.

A general positive definite integral binary Hermitian form is

\displaystyle H(z,w)=A|z|^2+B\overline z\,w+\overline B\,z\overline w+C|w|^2,

where A,C\in\mathbb Z , B\in\mathbb Z[i] , and positivity means A>0 together with

\displaystyle \det(H):=AC-|B|^2>0.

Equivalently, H is represented by the Hermitian matrix

\begin{pmatrix}A&B\\ \overline B&C\end{pmatrix} .

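A Gaussian-integral change of variables by U\in{\text{GL}}_2(\mathbb Z[i]) , that is, by a Gaussian-integral matrix whose determinant is a unit of \mathbb Z[i] , replaces this matrix by U^*HU , where U^* is the conjugate transpose of U . Since |\det U|^2=1 , the determinant AC-|B|^2 is unchanged. When one writes the transformation in real and imaginary parts, it becomes an integral unimodular change of four real variables compatible with the complex structure.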

Reduction of forms

Lemma (reduction). Every positive definite integral binary Hermitian form of determinant 1 is equivalent over {\text{GL}}_2(\mathbb Z[i]) to the standard form \displaystyle |z|^2+|w|^2.

This is the Hermitian analogue of Gauss reduction. Let H have determinant 1 , and choose a nonzero vector v\in\mathbb Z[i]^2 on which H(v) is as small as possible; a minimum exists because H takes positive integer values on nonzero vectors. The vector v must be primitive: if v=\delta v' with a nonzero nonunit \delta\in\mathbb Z[i] , then |\delta|^2\ge2 and so H(v)=|\delta|^2H(v')>H(v') , contradicting minimality. Since \mathbb Z[i] is Euclidean, a primitive vector extends to a Gaussian-integral basis. After changing variables, we may therefore suppose that the minimum occurs at (1,0) , so that A is the least positive value of H on nonzero Gaussian-integral vectors.

Now use a shear. Replacing z by z+qw , with q\in\mathbb Z[i] , leaves A unchanged and replaces the mixed coefficient B by B'=B+qA . Choose q so that the real and imaginary parts of B' lie between -A/2 and A/2 . Then |B'|^2\le A^2/2 . The new coefficient C' of |w|^2 is the value of the transformed form at (0,1) , hence C'\ge A by minimality of A . The determinant is unchanged, so

\displaystyle 1=AC'-|B'|^2\ge A^2-\frac{A^2}{2}=\frac{A^2}{2}.

As A is a positive integer, this forces A=1 . Then |B'|^2\le1/2 , so B'=0 , and the determinant identity gives C'=1 . The reduced form is exactly |z|^2+|w|^2 . Thus there is only one positive definite unimodular binary Hermitian form over \mathbb Z[i] .
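The reduction can also be carried out mechanically. The sketch below is not the minimal-vector argument used above but the classical shear-and-swap loop in the style of Gauss: shear until the mixed coefficient is small, and swap the variables whenever C<A . The names reduce_form and nearest are hypothetical helpers introduced for illustration, and Gaussian integers are stored as (re, im) pairs of Python integers.

```python
def nearest(n, d):
    """Nearest integer to n/d for an integer d > 0 (ties round up)."""
    return (2 * n + d) // (2 * d)


def reduce_form(A, B, C):
    """Reduce the form with matrix [[A, B], [conj(B), C]] of determinant 1.

    A and C are positive ints; B is a Gaussian integer stored as (re, im).
    Returns the coefficients (A, B, C) of an equivalent reduced form.
    """
    assert A > 0 and A * C - (B[0] ** 2 + B[1] ** 2) == 1
    while True:
        # Shear z -> z + q*w: B becomes B + q*A with both parts in [-A/2, A/2).
        q = (-nearest(B[0], A), -nearest(B[1], A))
        re_B_conj_q = B[0] * q[0] + B[1] * q[1]  # Re(B * conj(q))
        # C' = C + A|q|^2 + 2 Re(B conj(q)), computed from the old B.
        C = C + A * (q[0] ** 2 + q[1] ** 2) + 2 * re_B_conj_q
        B = (B[0] + A * q[0], B[1] + A * q[1])
        if C >= A:
            return A, B, C
        # Swap z and w: A and C trade places and B is conjugated.
        A, B, C = C, (B[0], -B[1]), A
```

Each swap strictly decreases the positive integer A , so the loop stops, and at that point the inequality in the proof applies. For instance, the form with A=7 , B=2+3i , C=2 reduces to (1, (0, 0), 1) , the standard form.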

Congruence

We now apply the lemma to an odd prime p . First choose a,b\in\mathbb Z with

\displaystyle a^2+b^2\equiv-1\pmod p.

Such a pair always exists. The set of squares a^2 modulo p , including zero, has (p+1)/2 elements, and so does the set of values -1-b^2 . Since the two sets together have p+1 elements inside a set of only p residue classes, they intersect: a^2\equiv-1-b^2\pmod p for some a,b . Let \lambda=a+bi\in\mathbb Z[i] . Then |\lambda|^2\equiv-1\pmod p .
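The counting argument is effective. The following sketch tabulates the (p+1)/2 squares and looks for a value -1-b^2 among them. The function name solve_minus_one is a hypothetical helper, and p is assumed to be an odd prime.

```python
def solve_minus_one(p):
    """Return (a, b) with a^2 + b^2 congruent to -1 modulo the odd prime p."""
    half = (p + 1) // 2
    # The squares of 0, 1, ..., (p-1)/2 are pairwise distinct modulo p.
    squares = {(a * a) % p: a for a in range(half)}
    for b in range(half):
        target = (-1 - b * b) % p
        if target in squares:
            return squares[target], b
    raise ValueError("no solution found; p must be an odd prime")
```

For p=3 this returns (1, 1) , since 1+1\equiv-1\pmod 3 .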

Define the Gaussian lattice

\displaystyle L_{\lambda}=\{(z,w)\in\mathbb Z[i]^2 \mid z\equiv\lambda w\pmod p\}.

It is a rank-two \mathbb Z[i] -module, with basis (p,0) and (\lambda,1) : every element has the form (z,w)=(p\xi+\lambda\eta,\eta) for \xi,\eta\in\mathbb Z[i] . The reason for choosing this lattice is that its vectors automatically have norm divisible by p . Indeed, if (z,w)\in L_\lambda , then

\displaystyle |z|^2+|w|^2\equiv|\lambda w|^2+|w|^2=\bigl(|\lambda|^2+1\bigr)|w|^2\equiv0\pmod p.

We may therefore divide the norm by p and obtain the integral Hermitian form

\displaystyle H_\lambda(\xi,\eta):=\frac{|p\xi+\lambda\eta|^2+|\eta|^2}{p}.

Expanding gives \displaystyle H_\lambda(\xi,\eta)=p|\xi|^2+\lambda\overline\xi\eta+\overline\lambda\xi\overline\eta+\frac{|\lambda|^2+1}{p}|\eta|^2.

Its Hermitian matrix is

\displaystyle \begin{pmatrix}p&\lambda\\ \overline\lambda&\dfrac{|\lambda|^2+1}{p}\end{pmatrix},

and its determinant is \displaystyle p\cdot\frac{|\lambda|^2+1}{p}-|\lambda|^2=1. Thus H_\lambda is a positive definite integral binary Hermitian form of determinant 1 . The reduction lemma says that it is equivalent to H_0 ; in particular it represents 1 . Hence there are \xi,\eta\in\mathbb Z[i] such that \displaystyle H_\lambda(\xi,\eta)=1.
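As a sanity check on the expansion and the determinant, one can compare the two expressions for H_\lambda numerically. This is only a sketch: the helper names are hypothetical, Gaussian integers are stored as (re, im) pairs, and the sample values p=7 , \lambda=2+3i (for which |\lambda|^2+1=14 ) are chosen for illustration.

```python
def gmul(a, b):
    """Product of two Gaussian integers stored as (re, im) pairs."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])


def norm(a):
    """|a|^2 for a Gaussian integer a = (re, im)."""
    return a[0] * a[0] + a[1] * a[1]


def h_quotient(p, lam, xi, eta):
    """(|p*xi + lam*eta|^2 + |eta|^2) / p, computed inside the lattice."""
    le = gmul(lam, eta)
    z = (p * xi[0] + le[0], p * xi[1] + le[1])
    total = norm(z) + norm(eta)
    assert total % p == 0  # norms on the lattice are divisible by p
    return total // p


def h_expanded(p, lam, xi, eta):
    """p|xi|^2 + 2 Re(lam conj(xi) eta) + ((|lam|^2 + 1)/p) |eta|^2."""
    c = (norm(lam) + 1) // p
    mixed = gmul(gmul(lam, (xi[0], -xi[1])), eta)
    return p * norm(xi) + 2 * mixed[0] + c * norm(eta)


def form_determinant(p, lam):
    """Determinant p * (|lam|^2 + 1)/p - |lam|^2 of the matrix of H_lambda."""
    return p * ((norm(lam) + 1) // p) - norm(lam)
```

With these sample values, h_quotient(7, (2, 3), (1, 0), (1, 0)) and h_expanded(7, (2, 3), (1, 0), (1, 0)) both give 13 , and form_determinant(7, (2, 3)) gives 1 .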

Returning to the corresponding vector (z,w)=(p\xi+\lambda\eta,\eta)\in L_\lambda , the definition of H_\lambda yields \displaystyle |z|^2+|w|^2=p. Writing z=x_1+ix_2 and w=x_3+ix_4 , we obtain

\displaystyle p=x_1^2+x_2^2+x_3^2+x_4^2.
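The whole construction can be run as an algorithm. The sketch below works directly with the basis (p,0) , (\lambda,1) of L_\lambda and applies shear-and-swap reduction to the vectors themselves, which is one way (an assumption of this sketch, not the only possibility) of making the reduction lemma explicit. The function four_squares_prime and its helpers are hypothetical names, \lambda is found by brute force, and p is assumed to be an odd prime.

```python
def gmul(a, b):
    """Product of two Gaussian integers stored as (re, im) pairs."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])


def nearest(n, d):
    """Nearest integer to n/d for an integer d > 0 (ties round up)."""
    return (2 * n + d) // (2 * d)


def length(v):
    """|z|^2 + |w|^2 for a vector v = (z, w) of Gaussian integers."""
    return sum(x * x + y * y for (x, y) in v)


def inner(u, v):
    """Hermitian product conj(u1)*v1 + conj(u2)*v2."""
    terms = [gmul((a[0], -a[1]), b) for a, b in zip(u, v)]
    return (terms[0][0] + terms[1][0], terms[0][1] + terms[1][1])


def four_squares_prime(p):
    """Return (x1, x2, x3, x4) with x1^2 + x2^2 + x3^2 + x4^2 == p."""
    # Brute-force lambda = a + bi with a^2 + b^2 congruent to -1 mod p.
    lam = next((a, b) for a in range(p) for b in range(p)
               if (a * a + b * b + 1) % p == 0)
    u, v = ((p, 0), (0, 0)), (lam, (1, 0))  # basis of the lattice L_lambda
    while True:
        s, n = inner(u, v), length(u)
        q = (nearest(s[0], n), nearest(s[1], n))
        # Shear: subtract q*u from v so that inner(u, v) becomes small.
        qu = (gmul(q, u[0]), gmul(q, u[1]))
        v = tuple((c[0] - d[0], c[1] - d[1]) for c, d in zip(v, qu))
        if length(v) >= n:
            break
        u, v = v, u  # the shorter vector becomes the new first basis vector
    (x1, x2), (x3, x4) = u
    assert x1 * x1 + x2 * x2 + x3 * x3 + x4 * x4 == p
    return x1, x2, x3, x4
```

The Gram matrix of the basis (u, v) divided by p is a form equivalent to H_\lambda throughout, so when the loop stops the first basis vector has norm exactly p . Calling four_squares_prime(7) , for example, returns four integers whose squares sum to 7 .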

Thus every odd prime is a sum of four squares. The remaining prime is handled directly by 2=1^2+1^2+0^2+0^2 , and trivially 1=1^2+0^2+0^2+0^2 . Finally, Euler’s four-square identity shows that products of sums of four squares are again sums of four squares. Since every integer greater than 1 is a product of primes, every positive integer is a sum of four integer squares.
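Euler's identity is the statement that the quaternion norm is multiplicative. The sketch below writes out the four components of the product; euler_product is a hypothetical name, and the sign convention is the one coming from quaternion multiplication (other equivalent conventions exist).

```python
def euler_product(a, b):
    """Combine two four-square representations into one for the product.

    If a represents m and b represents n as sums of four squares,
    the returned 4-tuple represents m * n.
    """
    a1, a2, a3, a4 = a
    b1, b2, b3, b4 = b
    return (
        a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
        a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
        a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
        a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1,
    )
```

For example, 3=1^2+1^2+1^2+0^2 and 5=2^2+1^2+0^2+0^2 combine to euler_product((1, 1, 1, 0), (2, 1, 0, 0)) , which is (1, 3, 2, -1) , and indeed 1+9+4+1=15 .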

The elementary descent proof begins with a representable multiple mp and repeatedly reduces its multiplier. The present proof packages that descent into reduction theory. The congruence z\equiv\lambda w\pmod p builds a Gaussian lattice on which the usual norm is automatically divisible by p . After dividing by p , one obtains a unimodular integral Hermitian form. The reduction lemma says that there is only one such form, namely |z|^2+|w|^2 . A vector of norm 1 for the rescaled form is exactly a vector of norm p in the original congruence lattice.

This is the Hermitian analogue of the class-number-one proof of the two-square theorem. For \mathbb Z[i] itself, a prime p\equiv1\pmod4 yields a Gaussian ideal of norm p above p , and principality yields an element of norm p . Here the passage from one Gaussian variable to two turns the binary norm into the four-square form. The congruence a^2+b^2\equiv-1\pmod p always has a solution, so the Hermitian setting has enough room to represent every prime.
