Euler’s Proof of Fermat’s theorem for n=3

We will prove that there do not exist nonzero integers (X,Y,Z) such that

\displaystyle X^{3}+Y^{3}+Z^{3}=0.

Replacing (Z) by (-Z) , this is the same as saying that (X^{3}+Y^{3}=Z^{3}) has no nonzero integer solutions. The statement is the exponent-three case of Fermat’s Last Theorem. Euler’s proof is an early and striking example of a general mathematical strategy: begin with the ordinary factorization that everyone knows, notice exactly why it is not strong enough, enlarge the setting just enough to repair that defect, and then use the repaired factorization to force an impossible infinite descent.

The proof will use complex numbers, but only in a concrete way. We will not need the language of number fields, ideals, or abstract algebra. We will construct a triangular grid in the plane, show that one can divide inside that grid much as one divides ordinary integers, and use that fact to justify a special kind of factorization. The complex numbers enter because cubes naturally bring in the third roots of unity, just as squares naturally bring in (i=\sqrt{-1}) .

The first elementary observation is worth keeping in mind throughout. If a product of pairwise coprime ordinary integers is a cube, then each factor is itself a cube, allowing negative cubes when the product is negative. This is simply prime factorization. Every ordinary prime occurs in the total product to an exponent divisible by (3) . If the factors do not share primes, then the exponent of every prime inside each individual factor must already be divisible by (3) . Much of the proof consists of arranging situations where this simple fact can be applied.

We argue by contradiction. Suppose that some nonzero triple ((X,Y,Z)) satisfies (X^{3}+Y^{3}+Z^{3}=0) . Among all such triples choose one for which the positive number (|XYZ|) is as small as possible. The entire proof will be devoted to constructing another nonzero solution whose product of absolute values is strictly smaller. Since there cannot be an endlessly decreasing sequence of positive integers, that will give the contradiction.

The chosen three integers must be pairwise coprime. Indeed, if an ordinary prime divided both (X) and (Y) , then the equation would show that it divides (Z^{3}) , hence divides (Z) as well. Dividing all three integers by that prime would give a new nonzero solution with a strictly smaller product. The same argument works for any pair. Since the three integers are pairwise coprime, at most one of them is even. They cannot all be odd, because the cube of an odd integer is odd, and the sum of three odd integers is odd rather than zero. Thus exactly one of them is even. By renaming the variables, we may suppose that (Z) is even and (X,Y) are odd. Now set \displaystyle R=\frac{X+Y}{2}, \quad S=\frac{X-Y}{2}.

This is just a change of coordinates: (X=R+S) and (Y=R-S) . Since (X) and (Y) are odd, both (R) and (S) are integers. They are coprime, because a common divisor of (R) and (S) would divide both (R+S=X) and (R-S=Y) . They have opposite parity, because (R+S=X) is odd. They are also nonzero. If (R=0) , then (X=-Y) , forcing (Z=0) . If (S=0) , then (X=Y) , so (Z^{3}=-2X^{3}) ; but (X) is odd, and the power of (2) in the right-hand side is exactly one, while the power of (2) in a cube must be divisible by three.

The advantage of the substitution is that the sum of cubes becomes symmetrical. Expanding ((R+S)^{3}+(R-S)^{3}) , all terms with odd powers of (S) cancel, leaving

\displaystyle (R+S)^{3}+(R-S)^{3}=2R(R^{2}+3S^{2}).

Thus our supposed Fermat solution gives

\displaystyle -Z^{3}=2R(R^{2}+3S^{2}).

This is the crucial shape of the problem. The ordinary factorization of (X^{3}+Y^{3}) has become a product of (2R) and (R^{2}+3S^{2}) . If these two factors were always coprime, the proof would now be nearly finished. The trouble is that the prime (3) can interfere. The expression (R^{2}+3S^{2}) therefore needs to be understood more deeply.

The right way to see it is to draw a triangular grid in the complex plane. Let \displaystyle \omega=-\frac12+\frac{\sqrt3}{2}i. This number has length one and points at an angle of (120^\circ) . It satisfies (\omega^{3}=1) and (1+\omega+\omega^{2}=0) . Consider all points of the form (a+b\omega) , where (a) and (b) are ordinary integers. These points form the vertices of a tiling of the plane by equilateral triangles of side length one. From each point one can move to its six closest neighbors by adding one of (\pm1,\pm\omega,\pm\omega^{2}) . The important fact is that this triangular grid is closed under multiplication. Since (\omega^{2}=-1-\omega) , multiplying two points in the grid gives

\displaystyle (a+b\omega)(c+d\omega)=(ac-bd)+(ad+bc-bd)\omega,

which is again a point of the same grid. Thus the grid has both addition and multiplication, just as the ordinary integers do. For a grid point (a+b\omega) , define its size to be its ordinary squared Euclidean distance from the origin. Since the complex conjugate of (\omega) is (\omega^{2}) , this size is

\displaystyle (a+b\omega)(a+b\omega^{2})=a^{2}-ab+b^{2}.

The key property is that sizes multiply. This is just the familiar fact that the squared length of a product of complex numbers is the product of the squared lengths. In symbols, if (\alpha) and (\beta) are grid points, then the size of (\alpha\beta) equals the size of (\alpha) times the size of (\beta) .

The six points of size one are precisely (\pm1,\pm\omega,\pm\omega^{2}) . Multiplying by one of them merely rotates or reverses the triangular grid. These six elements play the role played by (\pm1) among ordinary integers: they do not change divisibility in any serious way, because each has an inverse still inside the grid.

The whole reason for using this particular triangular grid is geometric. Every point in the plane lies in one of the equilateral triangles of the tiling. The furthest point of a unit equilateral triangle from its three vertices is its center, which lies at distance (1/\sqrt3) from each vertex. Therefore every complex number lies within distance at most (1/\sqrt3) of some point of the grid.

This gives a division algorithm. Take two grid points (\alpha,\beta) , with (\beta\neq0) . The quotient (\alpha/\beta) is an ordinary complex number. Choose a nearby grid point (q) whose distance from (\alpha/\beta) is at most (1/\sqrt3) , and form the remainder (r=\alpha-q\beta) . Then (\alpha=q\beta+r) , and the size of (r) is strictly smaller than the size of (\beta) , unless (r=0) . The reason is simply that

\displaystyle |r|=|\beta|\left|\frac{\alpha}{\beta}-q\right| \leq \frac{|\beta|}{\sqrt3}.

After squaring, the size of the remainder is at most one third of the size of the divisor.

This is exactly the feature of ordinary integers that makes the Euclidean algorithm work. Repeatedly divide and take remainders; the sizes are nonnegative integers and strictly decrease, so eventually the remainder becomes zero. Hence grid points have greatest common divisors, up to multiplication by one of the six rotations. Running the Euclidean algorithm backward gives the usual Bézout relation: if two grid points have no common nontrivial divisor, then some grid-linear combination of them equals (1) . From this one obtains the familiar prime-divisibility rule. If an indivisible grid point divides a product but does not divide one factor, then it must divide the other. The proof is the same as for ordinary integers. If such a point (\pi) does not divide (\alpha) , then (\pi) and (\alpha) have no common nontrivial divisor, so there are grid points (u,v) with (u\pi+v\alpha=1) . Multiplying by (\beta) , one finds that (\beta=u\pi\beta+v\alpha\beta) , and both terms on the right are divisible by (\pi) . Therefore (\pi) divides (\beta) .

Consequently, factorization into indivisible grid factors works exactly as it does in the ordinary integers. Every nonzero grid point can be broken into indivisible factors, because if it breaks into two nontrivial pieces then each piece has smaller size. A factorization is unique because an indivisible factor on one side must divide one factor on the other side, allowing cancellation and repetition. The only ambiguity comes from the six rotations of the grid.

We now return to the expression (R^{2}+3S^{2}) . Write \displaystyle D=\sqrt{-3}=i\sqrt3. This is a point of the triangular grid because (D=1+2\omega) . Therefore (R+SD) and (R-SD) are both grid points. Their product is exactly

\displaystyle (R+SD)(R-SD)=R^{2}+3S^{2}.

So the troublesome quadratic expression is not really a quadratic expression at all. Inside the triangular grid, it becomes the product of two ordinary-looking linear factors. The next fact is the heart of the proof. Suppose that (r,s) are coprime nonzero ordinary integers and that (r^{2}+3s^{2}) is a cube. Then there exist ordinary integers (a,b) such that

\displaystyle r=a(a^{2}-9b^{2}), \qquad s=3b(a^{2}-b^{2}).

At first this formula may look invented. In reality it is forced by the expansion of a cube. Indeed,

\displaystyle (a+bD)^{3}=a(a^{2}-9b^{2})+3b(a^{2}-b^{2})D.

Thus the two formulas simply say that (r+sD) itself is a cube in the triangular grid.

Before proving that, observe first that (r) and (s) have opposite parity. They cannot both be even because they are coprime. If they were both odd, then (r^{2}+3s^{2}) would be congruent to (1+3=4) modulo (8) . But no cube is congruent to (4) modulo (8) : an odd cube is odd, while an even cube is divisible by (8) . Also, (3) cannot divide (r) . If it did, then (3) would not divide (s) , and modulo (9) one would have (r^{2}+3s^{2}\equiv3) . But cubes modulo (9) are only (0,1,-1) .

These two observations show that the ordinary integers (2r) and (r^{2}+3s^{2}) are coprime. The quadratic expression is odd because (r,s) have opposite parity; it is not divisible by (3) because (3) does not divide (r) ; and any other common prime divisor would divide both (r) and (s) .

Now factor the cube inside the triangular grid: \displaystyle (r+sD)(r-sD)=r^{2}+3s^{2}. The two factors on the left have no common nontrivial grid divisor. Indeed, any common divisor divides their sum (2r) , and it also divides their product (r^{2}+3s^{2}) . But these two ordinary integers are coprime. Therefore a common grid divisor would divide an ordinary integer combination of them equal to (1) , so it can only be one of the six harmless rotations.

Because the product of the two relatively prime grid points is a cube, unique factorization says that each one is a cube, apart from one of the six rotations. Thus \displaystyle r+sD=\varepsilon\Gamma^{3}, where (\Gamma) is a grid point and (\varepsilon) is one of (\pm1,\pm\omega,\pm\omega^{2}) .

There is a minor parity adjustment before we can write (\Gamma) in the desired shape (a+bD) . Write (\Gamma=p+q\omega) . A point has the form (a+bD) exactly when its coefficient of (\omega) is even, because (a+bD=(a+b)+2b\omega) . If (q) is already even, nothing needs to be done. If (q) is odd and (p) is odd, multiply (\Gamma) by (\omega) ; its new (\omega) -coefficient becomes (p-q) , which is even. If (q) is odd and (p) is even, multiply (\Gamma) by (\omega^{2}) ; its new (\omega) -coefficient becomes (-p) , again even. Since both (\omega) and (\omega^{2}) have cube equal to (1) , this rotation changes neither (\Gamma^{3}) nor the essential content of the equation. We may therefore assume that (\Gamma=a+bD) .

It remains to show that the leftover rotation (\varepsilon) is actually (\pm1) , rather than one of the four nonreal rotations. Let \displaystyle (a+bD)^{3}=F+GD, so that (F=a(a^{2}-9b^{2})) and (G=3b(a^{2}-b^{2})) . The four nonreal rotations have the form ((\epsilon+\eta D)/2) , where (\epsilon,\eta) are each either (1) or (-1) . Multiplying gives

\displaystyle \frac{\epsilon+\eta D}{2}(F+GD) = \frac{\epsilon F-3\eta G}{2} + \frac{\eta F+\epsilon G}{2}D.

For this to have integer coefficients, (F) and (G) must have the same parity. But if (a) and (b) had opposite parity, then the formulas for (F) and (G) show that they would have opposite parity. Therefore (a) and (b) must have the same parity.

If (a,b) are both even, then (F) and (G) are both divisible by (8) . If (a,b) are both odd, then (a^{2}\equiv b^{2}\equiv1\pmod8) , so both (a^{2}-9b^{2}) and (a^{2}-b^{2}) are divisible by (8) , again making (F) and (G) divisible by (8) . In either case the displayed formula would make both (r) and (s) even. But (r,s) were coprime and of opposite parity. This contradiction shows that a nonreal rotation is impossible. The remaining rotation is (\pm1) , and the minus sign can be absorbed by changing (a,b) to (-a,-b) .

We have therefore proved the promised cubic description: every primitive solution of (r^{2}+3s^{2}=\text{cube}) comes from expanding a cube of the form (a+b\sqrt{-3}) .

We now return to the supposed smallest solution ((X,Y,Z)) . Recall that \displaystyle -Z^{3}=2R(R^{2}+3S^{2}), that (R,S) are coprime and of opposite parity, and that (Z) is even. Since (R^{2}+3S^{2}) is odd, the right side shows that (2R) is divisible by (8) . Hence (R) is divisible by (4) , and (S) is odd.

There are now two possibilities. First suppose that (3) does not divide (R) . Then (2R) and (R^{2}+3S^{2}) are coprime. The latter is odd; it is not divisible by (3) ; and any other common prime would divide both (R) and (S) . Their product is a signed cube, namely (-Z^{3}) , so each factor is a signed cube. Thus there are integers (L,M) with

\displaystyle 2R=L^{3}, \quad R^{2}+3S^{2}=M^{3}.

The cubic description just proved supplies integers (a,b) such that

\displaystyle R=a(a^{2}-9b^{2}), \quad S=3b(a^{2}-b^{2}).

Substituting the first formula into (2R=L^{3}) gives

\displaystyle L^{3}=2a(a-3b)(a+3b).

The three factors (2a) , (a-3b) , and (a+3b) are pairwise coprime. To see why, first note that (a,b) are coprime, because any common divisor would divide both (R) and (S) . Since (S) is odd, (b) is odd and (a) is even. Since (3) does not divide (R) , it does not divide (a) . Thus (a-3b) and (a+3b) are odd and are not divisible by (3) . Any common prime divisor of (2a) and one of (a\pm3b) would therefore have to divide both (a) and (b) . Similarly, a common divisor of (a-3b) and (a+3b) divides both their sum (2a) and their difference (6b) , and again must be trivial.

Since these three pairwise coprime factors multiply to a cube, each one is a signed cube. There are therefore nonzero integers (U,V,W) such that

\displaystyle U^{3}=-2a, \quad V^{3}=a-3b, \quad W^{3}=a+3b.

Adding the three displayed cubes gives

\displaystyle U^{3}+V^{3}+W^{3}=-2a+(a-3b)+(a+3b)=0.

Thus ((U,V,W)) is another nonzero solution of the original cubic equation. It is smaller, because

\displaystyle |UVW|^{3} = |2a(a-3b)(a+3b)| = |2R| = |X+Y|.

Since (Z) is a nonzero even integer, (|Z|\geq2) . Therefore

\displaystyle |X+Y| \leq |X|+|Y| \leq 2|X||Y| \leq |XYZ|.

The number (|XYZ|) is at least two, and hence is strictly smaller than its own cube. It follows that (|UVW|<|XYZ|) , contradicting the choice of ((X,Y,Z)) as the smallest nonzero solution.

It remains to consider the case in which (3) divides (R) . Write (R=3T) . Since (R) is divisible by (4) , (T) is even; since (R,S) are coprime, (T,S) are coprime and (3) does not divide (S) . The main identity becomes

\displaystyle -Z^{3}=18T(S^{2}+3T^{2}).

The two factors (18T) and (S^{2}+3T^{2}) are coprime. The second is odd because (S) is odd and (T) is even. It is not divisible by (3) , because modulo (3) it is congruent to (S^{2}) , and (3) does not divide (S) . Finally, any prime larger than (3) dividing both factors would divide both (T) and (S) , which is impossible.

Thus (18T) and (S^{2}+3T^{2}) are each signed cubes. Apply the cubic description to the latter. There are integers (c,d) such that

\displaystyle S=c(c^{2}-9d^{2}), \quad T=3d(c^{2}-d^{2}).

Substituting the second formula into the cube (18T) gives

\displaystyle 18T=54d(c-d)(c+d).

Since the right side is divisible by (27) , its cube root is divisible by (3) . After dividing the cube identity by (27) , we obtain a cube of the form

\displaystyle \text{cube}=2d(c-d)(c+d).

The three factors (2d) , (c-d) , and (c+d) are pairwise coprime. Since (S) is odd, the formula (S=c(c^{2}-9d^{2})) forces (c) to be odd and (d) to be even. Hence (c-d) and (c+d) are odd. Any common divisor of (2d) and (c\pm d) would divide both (c) and (d) . Any common divisor of (c-d) and (c+d) divides both (2c) and (2d) . Since (c,d) are coprime, these common divisors are trivial.

Again each factor must be a signed cube. Thus there are nonzero integers (U,V,W) such that

\displaystyle U^{3}=-2d, \quad V^{3}=d-c, \quad W^{3}=d+c.

Their cubes add to zero: \displaystyle U^{3}+V^{3}+W^{3}=-2d+(d-c)+(d+c)=0.

This is another nonzero solution of the same equation. Its product is even smaller than before, because

\displaystyle |UVW|^{3} = |2d(c^{2}-d^{2})| = \frac{2|T|}{3} = \frac{|X+Y|}{9}.

Thus (|UVW|^{3}<|X+Y|\leq|XYZ|) , and therefore (|UVW|<|XYZ|) . This contradicts minimality once again.

Both possibilities for (R) lead to a strictly smaller nonzero solution. Hence the supposedly smallest solution cannot exist. Therefore there was no nonzero solution in the first place.

\displaystyle X^{3}+Y^{3}+Z^{3}=0 \text{ has no solution in nonzero integers.}

The proof becomes easier to remember once one sees its internal shape. Ordinary algebra transforms the original equation into a statement about the product (2R(R^{2}+3S^{2})) . The factor (R^{2}+3S^{2}) is the squared length of the complex point (R+S\sqrt{-3}) . The triangular grid lets us factor that squared length into two conjugate pieces, (R+S\sqrt{-3}) and (R-S\sqrt{-3}) , and its geometry guarantees that divisibility and factorization behave as well there as they do for ordinary integers.

The cubic parametrization is not a lucky guess. It is simply the expansion of ((a+b\sqrt{-3})^{3}) . Once the quadratic expression is shown to arise from an actual cube in the triangular grid, the factors (a) , (a-3b) , and (a+3b) , or in the second case (d) , (c-d) , and (c+d) , appear automatically. Their sum has exactly the form needed to manufacture a new equation of the same type. The descent is built into the algebra.

One can also now see why the prime (3) was exceptional from the beginning. In the triangular grid one has \displaystyle (1-\omega)^{2}=-3\omega. So, up to a rotation, the ordinary integer (3) is already a square of a special grid point. This is why the prime (3) interacts differently with the factor (R^{2}+3S^{2}) , and why Euler’s proof naturally divides into two cases. Nothing arbitrary is happening: the special treatment of (3) is already encoded in the geometry of the triangular grid.

Leave a comment