We prove Fermat’s two-squares theorem for primes. The theorem says that whenever a prime is congruent to one modulo four, it is a sum of two squares.
The purpose of this exposition is not merely to collect five proofs. It is to explain how the first family of proofs is really organized around one object and one problem. The object is created by a square root of minus one modulo the prime. The problem is to turn that local congruence into an actual element of norm equal to the prime. In elementary number theory, this process appears as descent. In Gaussian integers, it appears as the Euclidean algorithm. In quadratic forms, it appears as reduction of a form of discriminant minus four. In ideal theory, it appears as principalization of an ideal. In geometry of numbers, it appears as finding a short vector in the same ideal viewed as a lattice. These are not five unrelated miracles. They are five coordinate systems for the same underlying phenomenon: a local splitting gives an object above the prime, and reduction forces that object to become principal.
Assume throughout that the prime satisfies the congruence condition.
By Euler’s criterion, minus one is a quadratic residue modulo this prime. Hence there exists an integer whose square is congruent to minus one modulo the prime. We choose such an integer and call it .
Equivalently, there is a positive integer such that
This equation is the seed of the entire argument. It says that a multiple of the prime is already a sum of two squares.
What remains is to remove the extra multiplier. This is the whole problem. The descent proof lowers the multiplier directly. The Gaussian proof lowers it by norm-Euclidean division. The quadratic-form proof lowers it by basis reduction. The ideal proof says that the obstruction to lowering it vanishes because the class number is one. The geometry-of-numbers proof finds, in the same ideal lattice, a short vector whose norm is forced to be exactly the prime.
1. Elementary descent Proof:
The elementary proof begins from a general representation . At the start, one may take
because
The descent step proves the following: if the multiplier is greater than one, then one can construct a new representation
with
Once this is known, the proof is finished, because a strictly decreasing sequence of positive integers cannot continue forever. Eventually the multiplier must become one, and then the desired representation appears.
To perform the descent, choose least absolute residues of and
modulo
. Thus choose integers
and
such that
and
Since we have
. Passing to the chosen residues gives
. Therefore there is an integer
such that
.
The smallness of and
is the key point. It gives
. Since
we obtain
.
Thus the new multiplier will be smaller, provided we can produce a new sum-of-two-squares representation of . This is where the multiplicative identity for sums of two squares enters.
The left-hand side is
Now we check that the two terms on the right are divisible by . Since
and
, we have
and
Hence the following are integers:
Dividing the two-squares identity by gives
with
This is the descent. It is important to see exactly what the proof has done. The congruence condition produced a multiple of the prime represented as a sum of two squares. Then, by choosing small residues modulo the multiplier and using the norm identity, we divided out the multiplier in a controlled way and replaced it by a smaller one. The proof terminates because the multiplier is a positive integer. The endpoint is precisely the disappearance of the multiplier.
This proof is elementary, but it already contains the whole story. The two-squares identity is norm multiplicativity in disguise. The least-residue choice is Euclidean division in disguise. The decreasing multiplier is the norm being reduced in disguise.
2. Gaussian integers: Euclidean Algorithm
Now rewrite the previous descent inside the Gaussian integers.
Let . Then
The descent proof chose residues and
of
and
modulo
. In Gaussian language, combine them into one Gaussian integer.
.
The congruences become the single congruence
. The equation
becomes
and the inequality
is just the assertion that the residue
was chosen with norm
, smaller than
.
Now define . This is a Gaussian integer. Indeed, since
we get
Therefore divides
in the Gaussian integers, so
belongs to
. Taking norms gives
. Using
we obtain
.
This is exactly the descent step. To see that it is literally the same formula, expand the numerator.
Thus The real and imaginary parts are exactly the integers obtained in the elementary descent proof, up to the harmless sign difference between
and
. So the elementary descent proof is not merely analogous to the Gaussian proof. It is the Gaussian norm-reduction step written in real and imaginary coordinates.
The Gaussian proof is often stated in the language of greatest common divisors. From we get
. Thus
inside
. But
cannot divide
, since that would force
to divide both
and
. Similarly,
cannot divide
. Therefore
is not prime in
.
More constructively, compute the Gaussian greatest common divisor . This greatest common divisor is not a unit, because
has a nontrivial common factor with
. It is also not associated to
, because
does not divide
. Therefore
is a proper nontrivial divisor of
in
. Taking norms, we know that
. The only possible norms are
. The first would make
a unit, and the last would make
associated to
. Hence
. Writing
we obtain
.
This is the Gaussian integer proof. Its relationship with descent is exact: the Euclidean algorithm in repeatedly chooses small remainders, and the elementary descent proof is the same remainder-reduction process after unpacking the real and imaginary parts. The multiplier
in the descent proof is the visible shadow of the Gaussian norm being reduced.
3. Quadratic forms: Reduction
Now we pass to binary quadratic forms. From the seed equation define the form
Its discriminant is
Thus is a positive definite binary quadratic form of discriminant
. Also,
. So
is represented by a form of discriminant
. If this form is equivalent to the principal form
then
is represented by
, and the theorem follows.
The crucial point is that this form has not been pulled out of thin air. It is exactly the normalized norm form of the Gaussian ideal attached to the congruence. Define As a free abelian group of rank two, this ideal has basis
. A general element of the ideal is
Compute its norm: . Therefore
. Expanding gives
Using this becomes
. Factoring out
, we obtain
. Hence
The ideal has norm
. One way to see this is to look at the quotient. In the quotient by
, we have
and
, and because
, this gives exactly a quotient of size
. Thus
So we can write the form more conceptually as
This identity is the bridge. It says that the quadratic form is the norm form of the ideal
, expressed in the particular basis
and normalized by the ideal norm.
Now consider what equivalence of quadratic forms means. A change of variables
does not change the set of represented integers. It only changes the coordinates. In the ideal language, this is exactly a change of -basis.
If the old basis is then a unimodular change gives a new basis
. The determinant condition ensures that the new pair is still a basis of the same ideal. Therefore equivalent quadratic forms are not different objects. They are the same ideal written in different bases. Form reduction is therefore basis reduction. One changes the basis of the ideal so that the coefficients of the norm form become smaller and better balanced. This is the quadratic-form version of Euclidean reduction. The Gaussian proof chooses smaller remainders. The form proof chooses a better basis. These are two presentations of the same act.
For discriminant , the reduction theory is especially simple. Suppose
is a reduced positive definite form of discriminant
. Then
and reduction means
. From the discriminant equation,
. Since
we have
and since
we have
. Thus
. Hence
so
. Substituting back gives
Since
, we must have
and then
Thus the only reduced positive definite form of discriminant is
which is the form
.
Therefore the form reduces to
Since represents
, the equivalent principal form also represents
. Hence
.
This is the quadratic-form proof. But now its meaning is clear: it is the same Gaussian ideal being rewritten in a better basis until its norm form becomes the principal form.
4. Ideals: class number one
The ideal-theoretic proof is the cleanest conceptual version. The congruence creates the ideal
This ideal lies above and has norm
. The theorem is now reduced to a single question: is this ideal principal? If
for some Gaussian integer
then
and therefore
.
Thus the whole theorem becomes the principalization statement .
The reason this is true is that has class number one. In fact, it is Euclidean with respect to the norm
and every Euclidean domain is a principal ideal domain. Therefore every ideal of
is principal. In particular, the ideal
must be generated by one Gaussian integer. Because its norm is
, that generator has norm
.
This is not a different proof in spirit from descent. It is what descent is proving by hand. The descent proof shows directly that the multiplier can be lowered until an element of norm appears. The Gaussian Euclidean proof computes a generator by a greatest-common-divisor algorithm. The quadratic-form proof changes the basis until the norm form becomes principal. The ideal proof says that such a generator must exist because the ideal class group is trivial.
The dictionary is exact. The elementary representation is the same as the Gaussian norm equation
. The starting congruence produces the ideal
and the quadratic form
. The descent endpoint
is the Gaussian endpoint
which is the form endpoint
which is the ideal endpoint
.
This is the central point of the first family of proofs. They are not merely similar. They are the same principalization process written in progressively more invariant languages. Descent is the coordinate computation. Gaussian integers reveal the norm structure. Quadratic forms record the norm structure as a basis-dependent form. Ideals remove the basis and identify the true invariant object.
5. Geometry of numbers:
The geometry-of-numbers proof belongs to the same family because it uses the same ideal, but it produces the generator by a different reduction principle. Instead of iteratively reducing a multiplier or applying Euclidean division, it views the ideal as a lattice in the plane and finds a short vector.
Consider a lattice
This lattice is not merely analogous to the ideal . It is exactly that ideal written in real coordinates. Indeed, an element of
has the form
. So if
then
. Conversely, if
then
for some integer
, and hence
.
Therefore as a lattice in the complex plane is precisely
The covolume of this lattice is
. This follows because
is the kernel of the surjective homomorphism
Thus the index of
in
is
, and so
.
Now observe the arithmetic property of this lattice. If then
. Therefore
So every vector in the lattice has squared length divisible by
.
. But
.
Thus the geometry-of-numbers proof is still a norm proof. It is looking for an element of the same ideal whose norm is exactly .
Minkowski’s convex body theorem now supplies the reduction step. In a lattice of covolume , every centrally symmetric convex region in the plane whose area is greater than
contains a nonzero lattice point. Consider the disk
. Its radius is
and its area is
. Since
Minkowski’s theorem gives a nonzero vector
inside this disk. Hence
.
But we already know that . The only positive multiple of
strictly smaller than
is
itself. Therefore
. This proves the theorem.
The geometry proof is sometimes described as different from descent, and in a narrow algorithmic sense that is true: it does not visibly run the sequence . But conceptually it is still principalizing the same ideal. The lattice
is the ideal
. Minkowski finds a nonzero element of this ideal with norm less than
. Since all norms in this ideal are divisible by
, this short element must have norm exactly
. If
and
then
.
So geometry of numbers proves principalization by shortest-vector reduction (the existence of which here is proved by a pigeonhole style argument). Descent principalizes by repeated local improvements. Euclidean algorithm principalizes by small remainders. Form reduction principalizes by changing the basis. Geometry principalizes by finding the shortest possible nonzero vector in the ideal lattice.
Each proof finds the same hidden generator by some form of reduction. Elementary descent reduces the extra multiplier in a norm equation until the multiplier becomes . Gaussian integers reveal this as Euclidean norm reduction in
, where the generator appears as
. Quadratic forms express the same object as the normalized norm form of the ideal; reducing
to
is exactly the passage to a principal basis. Geometry of numbers performs the same search geometrically: it draws the ideal as a lattice and uses short-vector reduction to find an element whose norm, being positive and divisible by
, must equal
.
The conceptual reason all these reductions succeed is class number one. The congruence creates an ideal
with
. The task is to reduce this ideal, form, lattice, or norm equation until it becomes principal. In a ring with nontrivial class group, such a reduction could stop in a nonprincipal class; the local congruence would create an object above
, but not necessarily a global element of norm
. For
, however, the class number is one, so there is no such obstruction. Every reduction route must end at a principal generator. Thus
and taking norms gives
. The theorem is therefore the passage from local splitting to global principality: the square root of
modulo
creates the ideal, and class-number-one reduction forces that ideal to produce the desired element of norm
.