Fundamental Theorem of Algebra: an algebraic proof

We will prove the Fundamental Theorem of Algebra, which says that every nonconstant polynomial with complex coefficients has a complex root. Equivalently, every nonconstant polynomial with complex coefficients can be broken completely into linear factors. That is, a polynomial of degree n\geq 1 can be written in the form \displaystyle c(x-z_1)(x-z_2)\cdots(x-z_n), where c,z_1,\dots,z_n\in\mathbb C.

The theorem says that once we have added the number i , satisfying i^2=-1, to the real numbers, no further kind of number is needed in order to solve polynomial equations. At first this is not obvious. Quadratic equations forced us to leave the real line: the equation x^2+1=0 has no real solution, so we introduced i . But why should a cubic, a quartic, or a polynomial of much higher degree not force us to introduce a new number beyond the complex numbers? Why should the process stop at \mathbb C ? The proof below answers exactly that question.

The basic idea is this. Suppose there really were a larger algebraic number system containing roots which are not complex numbers, so that some irreducible real polynomial really did require a root beyond the complex numbers. Adjoin all of its roots to \mathbb R to create one finite extension L . We will study the arithmetic symmetries of L : the ways of permuting its roots while preserving addition, multiplication, and all real numbers. The symmetry argument does not itself produce the final contradiction. Its purpose is to reduce the possible counterexample step by step. It shows that, if there were anything genuinely new beyond the complex numbers, then there would have to be a field M with \mathbb R\subseteq \mathbb C\subseteq M such that [M:\mathbb C]=2. In words: there would have to be one more quadratic layer after the complex numbers. The final calculation then deals directly with a proposed new element of this quadratic layer. After completing the square, that element would satisfy \theta^2=a, \quad a\in\mathbb C. But every complex number already has a complex square root. Thus a=c^2 for some c\in\mathbb C , and \theta^2=c^2. Therefore (\theta-c)(\theta+c)=0, so \theta=\pm c . The supposedly new element was already complex. Thus the whole proof reduces every alleged number beyond \mathbb C to a quadratic equation which already splits in \mathbb C .

Only two facts about the real numbers are used from analysis. Every real polynomial of odd degree has a real root. Every positive real number has a real square root. Both facts follow from the intermediate value theorem. Everything else below is algebra. In modern terms this proof belongs to Galois theory, but we will calculate directly with roots, coordinate systems, sums over symmetries, products over symmetries, and the fields fixed by those symmetries.

Before beginning the details, here is the route of the proof. We explain what it means for a number system (field) to have finite size over \mathbb R . We prove that there is no nontrivial finite enlargement of \mathbb R of odd size. We prove that the only nontrivial quadratic enlargement of \mathbb R is \mathbb C . We build a temporary finite number system containing all roots of a given real polynomial. We count the arithmetic-preserving ways of replacing its roots by other roots. We show that the symmetry group must have size a power of 2 and must be generated by one repeated symmetry. We show that a larger system would force a quadratic enlargement of \mathbb C . We prove directly that no such enlargement exists. At the end, we first obtain the result for real polynomials and then pass to arbitrary complex coefficients.

Finite extensions

A field is a number system in which we can add, subtract, multiply, and divide by every nonzero element. The real numbers \mathbb R and the complex numbers \mathbb C are fields. Suppose that K is a field containing another field \mathbb F . We say that K has degree m over \mathbb F if there are elements e_1,e_2,\dots,e_m\in K such that every element of K can be written in one and only one way as

\displaystyle a_1e_1+a_2e_2+\cdots+a_me_m, \quad a_1,\dots,a_m\in\mathbb F.

We write [K:\mathbb F]=m. The notation looks abstract, but its meaning is very concrete: m is the number of coordinates needed to describe an element of K . For example, every complex number has the form a+bi, ~ a,b\in\mathbb R. The two numbers 1,\ i are enough to describe every complex number, and the description is unique. Therefore [\mathbb C:\mathbb R]=2. So one may think of the complex numbers as a two-dimensional coordinate system over the real numbers.

Suppose that F is a field and that h(x)\in F[x] is irreducible of degree d . We create a formal symbol \alpha satisfying h(\alpha)=0. Every polynomial expression in \alpha can be reduced to the form

\displaystyle a_0+a_1\alpha+\cdots+a_{d-1}\alpha^{d-1}, \quad a_j\in F.

Why? The equation h(\alpha)=0 allows us to replace \alpha^d by a combination of lower powers. Repeating this process reduces every higher power. We also need to know that nonzero expressions can be divided. Let A(\alpha)\neq 0. After reducing powers, assume that \deg A<d. The polynomials A(x) and h(x) have no common factor, because h is irreducible and has larger degree. The Euclidean algorithm therefore gives polynomials U(x),V(x)\in F[x] such that U(x)A(x)+V(x)h(x)=1. Substitute x=\alpha . Since h(\alpha)=0 , we get U(\alpha)A(\alpha)=1. Thus U(\alpha) is the inverse of A(\alpha) . So the formal expressions form a field. We denote it by F(\alpha). It has degree [F(\alpha):F]=d.
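The division step can be tried out by machine. The following is only a sketch under stated assumptions: it takes F to be the rational numbers (so that arithmetic is exact), takes h(x)=x^3-2 as the irreducible polynomial, and uses helper names (poly, trim, divmod_poly, inverse_mod) invented for this illustration. It runs the Euclidean algorithm on A(x) and h(x) and returns the polynomial U(x) with U(\alpha)A(\alpha)=1.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly(*coeffs):
    """Build a polynomial with exact rational coefficients."""
    return trim([Fraction(c) for c in coeffs])

def add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p, q):
    """Return (quotient, remainder) of p divided by the nonzero polynomial q."""
    p, q = trim(p), trim(q)
    quo = [Fraction(0)] * max(len(p) - len(q) + 1, 1)
    while len(p) >= len(q):
        shift = len(p) - len(q)
        c = p[-1] / q[-1]
        quo[shift] = c
        for i, b in enumerate(q):
            p[shift + i] -= c * b      # cancel the leading term of p
        p = trim(p)
    return trim(quo), p

def inverse_mod(a, h):
    """Return U with U*a = 1 modulo h, assuming a and h have no common factor."""
    r0, r1 = trim(h), trim(a)
    u0, u1 = [], [Fraction(1)]         # invariant: r_k = u_k * a modulo h
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        u0, u1 = u1, add(u0, mul([Fraction(-1)], mul(q, u1)))
    assert len(r0) == 1, "a and h share a common factor"
    return [c / r0[0] for c in u0]     # rescale so that the gcd becomes 1

# Inverse of 1 + alpha, where alpha^3 = 2:
U = inverse_mod(poly(1, 1), poly(-2, 0, 0, 1))
# U == [1/3, -1/3, 1/3], i.e. (1 - alpha + alpha^2)/3
```

The answer can be confirmed by hand: (1+\alpha)(1-\alpha+\alpha^2)=1+\alpha^3=3.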

Suppose that we have three fields \mathbb R\subseteq F\subseteq K. Assume that F needs r real coordinates and that K needs s coordinates from F . In symbols, suppose [F:\mathbb R]=r, \quad [K:F]=s. Then [K:\mathbb R]=rs. This is the multiplication rule for degrees. To see it directly, choose numbers u_1,\dots,u_r which form a coordinate system for F over \mathbb R , and choose numbers v_1,\dots,v_s which form a coordinate system for K over F . We claim that the rs products \displaystyle u_iv_j, \qquad 1\leq i\leq r,\ 1\leq j\leq s, form a coordinate system for K over \mathbb R . First, they span K . Take any x\in K . Since the v_j form a basis over F , we can write x=a_1v_1+\cdots+a_sv_s, \quad a_j\in F. Each coefficient a_j belongs to F , so it can itself be written as a_j=c_{1j}u_1+\cdots+c_{rj}u_r, \qquad c_{ij}\in\mathbb R. Putting these two expressions together gives

\displaystyle x=\sum_{j=1}^{s}\sum_{i=1}^{r} c_{ij}u_iv_j.

Thus the products u_iv_j span all of K . Now suppose that \sum_{j=1}^{s}\sum_{i=1}^{r} c_{ij}u_iv_j=0. Group the expression according to the v_j . \left(\sum_{i=1}^{r}c_{i1}u_i\right)v_1+\cdots+\left(\sum_{i=1}^{r}c_{is}u_i\right)v_s=0. The v_j are independent over F , so every coefficient must vanish: \sum_{i=1}^{r}c_{ij}u_i=0 for every j. But the u_i are independent over \mathbb R . Hence every c_{ij} is zero. So the products are independent. This proves the multiplication rule.

We will repeatedly use the following simple idea. If a number system has only finitely many real coordinates, then sufficiently many powers of any element must satisfy a relation. Suppose that [K:\mathbb R]=m and let u\in K . Look at the m+1 numbers \displaystyle 1,u,u^2,\dots,u^m. Since K has only m independent real coordinate directions, these m+1 numbers cannot all be independent over \mathbb R . Therefore there are real numbers a_0,\dots,a_m , not all zero, such that

\displaystyle a_0+a_1u+\cdots+a_mu^m=0.

In other words, u satisfies some polynomial equation with real coefficients. Among all nonzero real polynomials which vanish at u , choose one of smallest degree. Call it m_u(x). This is called the minimal polynomial of u over \mathbb R . Suppose that \displaystyle m_u(x)=A(x)B(x) with both A(x) and B(x) having smaller positive degree. Substituting x=u gives \displaystyle 0=m_u(u)=A(u)B(u). Therefore either A(u)=0 or B(u)=0. But this would produce a polynomial of smaller degree which vanishes at u , contradicting the choice of m_u . Therefore: m_u(x) is irreducible over \mathbb R.

Suppose that \deg m_u=d. Then the powers 1,u,u^2,\dots,u^{d-1} are independent over \mathbb R . Indeed, a relation among them would be a polynomial equation for u of degree smaller than d , which is impossible. They also span every expression made from u . The equation m_u(u)=0 allows us to rewrite u^d as a linear combination of lower powers. Repeating this process reduces every higher power u^{d+1},u^{d+2},\dots to a combination of 1,u,\dots,u^{d-1}. Thus the number system generated by u , written \mathbb R(u) , has degree [\mathbb R(u):\mathbb R]=d=\deg m_u. This is important because it turns a polynomial degree into a concrete count of coordinates.

Odd-degree extensions of \mathbb R

We now prove the first major restriction. There is no nontrivial finite extension of \mathbb R having odd degree. Suppose that K is a finite extension containing \mathbb R and suppose that [K:\mathbb R] is odd. We want to show that no genuinely new number can occur in K .

Assume, for contradiction, that there is some u\in K\setminus\mathbb R. Let m_u(x) be its minimal polynomial. We have already seen that [\mathbb R(u):\mathbb R]=\deg m_u. Since \mathbb R\subseteq \mathbb R(u)\subseteq K, the multiplication rule gives [K:\mathbb R]=[K:\mathbb R(u)][\mathbb R(u):\mathbb R]. Therefore \deg m_u=[\mathbb R(u):\mathbb R] divides the odd number [K:\mathbb R] . So \deg m_u is itself odd.

But every real polynomial of odd degree has a real root. Thus there is some r\in\mathbb R such that \displaystyle m_u(r)=0. Therefore m_u(x)=(x-r)q(x) for some real polynomial q(x) . Since u\notin\mathbb R , the degree of m_u is at least 2 , so q(x) is not constant. This says that m_u is reducible. But the minimal polynomial was irreducible. Contradiction. Hence no such u exists. Therefore K=\mathbb R. So there are no nontrivial finite extensions of \mathbb R of odd degree.
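The analytic input here, that an odd-degree real polynomial changes sign and therefore has a real root, can be illustrated numerically. The sketch below is not part of the proof: it uses bisection, which is the intermediate value theorem made into a procedure, and the function names and the sample cubic x^3-2x-5 are chosen only for illustration.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial, coefficients listed from highest to lowest degree."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c          # Horner's rule
    return acc

def real_root_odd(coeffs, tol=1e-12):
    """Approximate a real root of an odd-degree real polynomial by bisection."""
    assert len(coeffs) % 2 == 0 and coeffs[0] != 0, "degree must be odd"
    # Cauchy bound: every root has absolute value below R,
    # so the sign at -R and at R is decided by the leading term.
    R = 1 + max(abs(c) for c in coeffs[1:]) / abs(coeffs[0])
    lo, hi = -R, R
    if poly_eval(coeffs, lo) > 0:
        coeffs = [-c for c in coeffs]   # arrange: negative at lo, positive at hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if poly_eval(coeffs, mid) <= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

root = real_root_odd([1, 0, -2, -5])    # a real root of x^3 - 2x - 5
```

The sign change at the two ends of the interval is exactly what fails for even degree, which is why the argument gives nothing for polynomials such as x^2+1.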

This is our first major piece of information. Any nontrivial finite extension of the real numbers must have even degree.

Quadratic Extensions of \mathbb R

Now let us examine the smallest possible nontrivial extension. Suppose that [K:\mathbb R]=2. Choose some u\in K\setminus\mathbb R. Then 1,\ u form a basis of K over \mathbb R . Since every element of K has the form a+bu , the square u^2 must have the form

u^2=A+Bu, ~ A,B\in\mathbb R.

Rearranging gives u^2-Bu-A=0. This is just a quadratic equation. Complete the square. Define \beta=u-\frac{B}{2}. Then \beta^2=A+\frac{B^2}{4}. Write \beta^2=d, \quad d\in\mathbb R. There are three possibilities. If d>0 , then d=s^2 for some real s . Hence (\beta-s)(\beta+s)=\beta^2-s^2=0. Since K is a field, one of the factors must be zero. Thus \beta=s \quad \text{or}\quad \beta=-s. So \beta\in\mathbb R , which would imply u\in\mathbb R , contrary to our choice of u . If d=0 , then \beta^2=0. Again this gives \beta=0, which is again impossible. The only remaining possibility is d<0. Write d=-s^2, \quad s>0. Then \left(\frac{\beta}{s}\right)^2=-1. Define i=\frac{\beta}{s}. Then i^2=-1. Since every element of K has the form a+bu , it also has the form \displaystyle a+bi, \quad a,b\in\mathbb R. Therefore \displaystyle K=\mathbb R(i)=\mathbb C.

We have proved: Every nontrivial quadratic extension of \mathbb R is equal to \mathbb C. In particular, there is only one way to leave the real line by solving a quadratic equation.
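The completing-the-square step can be checked without assuming that complex numbers exist. In the sketch below, an element a+bu of K is stored as the pair (a, b), and products are reduced with the rule u^2=A+Bu. The function names and the sample values A=-5, B=2 are assumptions made for this illustration only.

```python
import math

def make_mul(A, B):
    """Multiplication on pairs (a, b) standing for a + b*u, where u^2 = A + B*u."""
    def mul(p, q):
        a, b = p
        c, d = q
        # (a + b u)(c + d u) = ac + (ad + bc) u + bd (A + B u)
        return (a * c + b * d * A, a * d + b * c + b * d * B)
    return mul

def imaginary_unit(A, B):
    """Return the pair for i = (u - B/2)/s, assuming d = A + B^2/4 is negative."""
    d = A + B * B / 4
    assert d < 0, "d >= 0 would force u to be real"
    s = math.sqrt(-d)
    return (-B / (2 * s), 1 / s)

mul = make_mul(-5, 2)            # u^2 = -5 + 2u
i = imaginary_unit(-5, 2)        # here i = (u - 1)/2, stored as (-0.5, 0.5)
print(mul(i, i))                 # (-1.0, 0.0), that is, i^2 = -1
```

The pair arithmetic uses only the relation u^2=A+Bu, so the computation shows the element i appearing inside K itself.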

Complex square roots

The next fact looks familiar, but we will prove it carefully because it is the final obstruction which prevents any further quadratic extension. Let z=u+vi, \quad u,v\in\mathbb R. We claim that there is a complex number w such that w^2=z. If v=0 , then z is real. If u\geq 0 , then \sqrt u is a square root of z . If u<0 , then i\sqrt{-u} is a square root of z . Now suppose that v\neq 0. Define r=\sqrt{u^2+v^2}. Since v\neq 0 , we have r>|u|. In particular, r+u>0. Define x=\sqrt{\frac{r+u}{2}}, \quad y=\frac{v}{2x}. Then 2xy=v. Also, y^2=\frac{v^2}{4x^2}=\frac{v^2}{2(r+u)}. Since r^2=u^2+v^2, we have v^2=r^2-u^2=(r-u)(r+u). Therefore y^2=\frac{r-u}{2}. Hence x^2-y^2=\frac{r+u}{2}-\frac{r-u}{2}=u. Now calculate:

\displaystyle (x+yi)^2=x^2-y^2+2xyi=u+vi=z.

Thus every complex number has a complex square root. This fact means that no equation of the form x^2=z, \quad z\in\mathbb C, forces us to leave the complex numbers.
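The formulas for x and y translate directly into a short routine. This is a sketch that follows the case split above, with a function name chosen for illustration; it uses only real square roots of nonnegative real numbers, which is the single analytic fact the construction needs.

```python
import math

def complex_sqrt(u, v):
    """Return real (x, y) with (x + y i)^2 = u + v i."""
    if v == 0:
        if u >= 0:
            return (math.sqrt(u), 0.0)      # real square root
        return (0.0, math.sqrt(-u))         # i * sqrt(-u)
    r = math.hypot(u, v)                    # r = sqrt(u^2 + v^2) > |u|
    x = math.sqrt((r + u) / 2)              # r + u > 0 because v != 0
    y = v / (2 * x)
    return (x, y)

x, y = complex_sqrt(3, 4)                   # r = 5, x = 2, y = 1
```

For z=3+4i this gives r=5, x=2, y=1, and indeed (2+i)^2=3+4i.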

Splitting field

We now begin the main part of the proof. Let f(x)\in\mathbb R[x] be an irreducible real polynomial. We want to prove that its degree is at most 2 . At this point we are not allowed to assume that f already has complex roots, because that is exactly what we are trying to prove. Instead, we will temporarily build a larger algebraic number system in which all roots of f exist. Think of this as a calculation device. We introduce formal symbols which obey polynomial equations, and then calculate with them exactly as we would calculate with ordinary numbers. Start with the real numbers. Adjoin one root of f , then adjoin another root which is still missing, then another, and continue until all roots of f are present. After finitely many steps we obtain a field L which contains every root of f . Because each step has finite degree, the multiplication rule shows that [L:\mathbb R] is finite. This field L is called a splitting field of f . We will not use any mysterious property of it. We will only use two simple facts: It contains every root of f . It has finitely many real coordinates. Now work inside the finite extension L .

Galois Group

A Galois symmetry is a map \sigma:L\longrightarrow L which preserves the ordinary arithmetic rules: \sigma(x+y)=\sigma(x)+\sigma(y), \sigma(xy)=\sigma(x)\sigma(y), \sigma(1)=1, and which leaves every real number unchanged: \sigma(r)=r \quad \text{for every }r\in\mathbb R. Such a map cannot destroy a real polynomial equation. If p(\alpha)=0, \quad p(x)\in\mathbb R[x], then 0=\sigma(p(\alpha))=p(\sigma(\alpha)). Therefore \sigma(\alpha) is another root of the same real polynomial. So the map permutes roots of a polynomial. For example, in \mathbb C there are symmetries fixing \mathbb R : z\longmapsto z, and z\longmapsto \overline z. The second map fixes all real numbers and sends i\longmapsto -i. It does not change addition or multiplication: \overline{z+w}=\overline z+\overline w, \overline{zw}=\overline z\overline w.

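Let G be the collection of all symmetries of L which fix \mathbb R . We can combine two such symmetries by applying one after the other. We can also reverse one. A symmetry is injective, because \sigma(x)=0 with x\neq 0 would give 1=\sigma(x)\sigma(x^{-1})=0 . It is also linear over \mathbb R , and an injective linear map from a space with finitely many real coordinates to itself is automatically onto. Thus each symmetry has an inverse. The collection G is finite, because a symmetry is determined by how it permutes the finitely many roots of f . So G is a finite group of invertible arithmetic symmetries.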

The first important claim is: |G|=[L:\mathbb R]. Here |G| means the number of symmetries. The number of arithmetic-preserving ways to move the roots around is exactly the number of real coordinates in the field.

Write the construction of L as a sequence \mathbb R=F_0\subseteq F_1\subseteq F_2\subseteq\cdots\subseteq F_s=L. At the j -th stage, we adjoin a root \alpha_j : F_j=F_{j-1}(\alpha_j). Let d_j=[F_j:F_{j-1}]. The multiplication rule gives [L:\mathbb R]=d_1d_2\cdots d_s. We now show that the number of symmetries is exactly the same product. Suppose that a symmetry has already been defined on F_{j-1} . The element \alpha_j satisfies a minimal polynomial m_j(x)\in F_{j-1}[x] of degree d_j . Since \alpha_j is a root of the original polynomial f , the polynomial m_j(x) divides f(x) when we work over F_{j-1} . Apply the already-defined symmetry to the coefficients of m_j . This creates another polynomial of degree d_j , which still divides f because the symmetry fixes the real coefficients of f . Its roots are therefore roots of f , and all roots of f lie in L . These roots are distinct, because an irreducible polynomial over a field containing \mathbb R has no common factor with its derivative and hence no repeated root. Therefore there are exactly d_j possible choices for the image of \alpha_j . Once we choose one of these roots, the extension of the map is forced. Indeed, every element of F_j has the form a_0+a_1\alpha_j+\cdots+a_{d_j-1}\alpha_j^{d_j-1}, \qquad a_k\in F_{j-1}. If we decide that \alpha_j goes to some root \beta , then this element must go to \sigma(a_0)+\sigma(a_1)\beta+\cdots+\sigma(a_{d_j-1})\beta^{d_j-1}. Thus each stage has exactly d_j possible choices. Multiplying the choices gives |G|=d_1d_2\cdots d_s=[L:\mathbb R]. So we have proved: |G|=[L:\mathbb R].

Fixed Fields

Let H be a collection of symmetries which is closed under composition and inverses. In other words, H is a smaller symmetry subgroup inside G . Consider all elements of L which are unchanged by every symmetry in H :

\displaystyle L^H=\{x\in L:\sigma(x)=x\text{ for every }\sigma\in H\}.

These fixed elements form a field. Indeed, if x,y\in L^H , then for every \sigma\in H , \sigma(x+y)=\sigma(x)+\sigma(y)=x+y, \sigma(xy)=\sigma(x)\sigma(y)=xy. If x\neq 0 , then 1=\sigma(1)=\sigma(xx^{-1})=\sigma(x)\sigma(x^{-1})=x\sigma(x^{-1}), so \sigma(x^{-1})=x^{-1}. Thus x^{-1}\in L^H as well. The fixed field L^H contains exactly the numbers which the symmetries in H are unable to move.

For any x\in L , define the sum over H by S_H(x)=\sum_{\sigma\in H}\sigma(x), and define the product over H by P_H(x)=\prod_{\sigma\in H}\sigma(x). These two expressions are fixed by every symmetry in H . For example, let \rho\in H . Then \rho(S_H(x))=\sum_{\sigma\in H}\rho(\sigma(x))=\sum_{\sigma\in H}(\rho\sigma)(x). As \sigma runs through H , so does \rho\sigma . Therefore \rho(S_H(x))=S_H(x). The same reordering argument gives \rho(P_H(x))=P_H(x). Thus S_H(x)\in L^H, \quad P_H(x)\in L^H. This is the general version of familiar complex-number formulas. If H consists of the identity and conjugation, then S_H(z)=z+\overline z=2{\text{Re}}(z), and P_H(z)=z\overline z=|z|^2.

We now prove the central fact: [L:L^H]=|H|. Once this is proved, the multiplication rule gives [L:\mathbb R]=[L:L^H][L^H:\mathbb R]. Since [L:\mathbb R]=|G| and [L:L^H]=|H| , we obtain [L^H:\mathbb R]=\frac{|G|}{|H|}. The quotient \frac{|G|}{|H|} is usually written [G:H] and means the number of equally sized blocks into which H divides G .

Every element of H fixes the field L^H . Choose finitely many elements \beta_1,\dots,\beta_t\in L which generate L over L^H . We can build L in stages: L^H\subseteq L^H(\beta_1)\subseteq L^H(\beta_1,\beta_2)\subseteq\cdots\subseteq L. At each stage, the image of a new generator has at most as many choices as the degree of its minimal polynomial over the previous field. Therefore the number of possible L^H -preserving symmetries is at most [L:L^H]. Since every symmetry in H is one of these possible maps, |H|\leq [L:L^H].

Now write H=\{\tau_1,\tau_2,\dots,\tau_h\}, \quad h=|H|. We will prove that L can be described using at most h coordinates from the fixed field L^H . The key claim is that the maps \tau_1,\dots,\tau_h are linearly independent as functions. Suppose, for contradiction, that there is a relation

\displaystyle c_1\tau_1(x)+c_2\tau_2(x)+\cdots+c_h\tau_h(x)=0

for every x\in L , where the coefficients c_j\in L are not all zero. Choose such a relation with as few nonzero coefficients as possible. It cannot have only one nonzero term, because each \tau_j is a nonzero map. So at least two maps occur. Choose two different maps, say \tau_1,\tau_k. Because they are different maps, there is some b\in L such that \tau_1(b)\neq \tau_k(b). Apply the relation to bx instead of x : \sum_{j=1}^{h}c_j\tau_j(bx)=0. Since each \tau_j preserves multiplication, \tau_j(bx)=\tau_j(b)\tau_j(x). Thus \sum_{j=1}^{h}c_j\tau_j(b)\tau_j(x)=0. Now subtract \tau_1(b) times the original relation: \sum_{j=1}^{h}c_j\bigl(\tau_j(b)-\tau_1(b)\bigr)\tau_j(x)=0. The coefficient of \tau_1 is now zero. But the coefficient of \tau_k is nonzero because \tau_k(b)-\tau_1(b)\neq 0. So we have produced a nonzero relation with fewer nonzero terms. This contradicts our minimal choice. Therefore the maps \tau_1,\dots,\tau_h are linearly independent.

Now form the column vector

\displaystyle v(x)=\begin{pmatrix}\tau_1(x)\\ \tau_2(x)\\ \vdots\\ \tau_h(x)\end{pmatrix}.

As x ranges over L , these vectors span all of L^h . Otherwise there would be a nonzero row vector (c_1,\dots,c_h) which annihilates every v(x) , producing a forbidden relation c_1\tau_1(x)+\cdots+c_h\tau_h(x)=0. Therefore we can choose elements y_1,\dots,y_h\in L such that the matrix

\displaystyle A=\begin{pmatrix}\tau_1(y_1)&\tau_1(y_2)&\cdots&\tau_1(y_h)\\ \tau_2(y_1)&\tau_2(y_2)&\cdots&\tau_2(y_h)\\ \vdots&\vdots&&\vdots\\ \tau_h(y_1)&\tau_h(y_2)&\cdots&\tau_h(y_h) \end{pmatrix}

is invertible. Take any x\in L . Since A is invertible, there are unique coefficients c_1,\dots,c_h\in L such that

\displaystyle A\begin{pmatrix}c_1\\c_2\\ \vdots\\ c_h\end{pmatrix} =\begin{pmatrix}\tau_1(x)\\ \tau_2(x)\\ \vdots\\ \tau_h(x)\end{pmatrix}.

Now apply any \rho\in H to this system. Applying \rho changes every row indexed by \tau_j into the row indexed by \rho\tau_j . But as \tau_j runs through H , so does \rho\tau_j . Thus applying \rho merely rearranges the rows of the system. The transformed system therefore has exactly the same equations in a different order. Since the solution is unique, \rho(c_j)=c_j \qquad \text{for every }\rho\in H. Hence c_j\in L^H. Now look at the row corresponding to the identity map. It gives x=c_1y_1+\cdots+c_hy_h. So every element of L is a linear combination of y_1,\dots,y_h with coefficients in L^H . Therefore [L:L^H]\leq h=|H|.

Combining the two inequalities gives [L:L^H]=|H|. This tells us exactly how the number of symmetries controls the size of the fixed field.
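The coordinate argument can be followed by hand in the smallest familiar case. The sketch below assumes L=\mathbb C , with H consisting of the identity and complex conjugation, so that L^H=\mathbb R , and it picks y_1=1, y_2=i. It builds the matrix A with entries \tau_j(y_k) and solves the 2\times 2 system by Cramer's rule. The function name is hypothetical.

```python
def fixed_coordinates(x):
    """Solve A c = (tau_1(x), tau_2(x)) for H = {identity, conjugation}, y = (1, i)."""
    taus = [lambda z: z, lambda z: z.conjugate()]
    ys = [1 + 0j, 1j]
    A = [[tau(y) for y in ys] for tau in taus]       # A[j][k] = tau_j(y_k)
    rhs = [tau(x) for tau in taus]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]      # equals -2i, so A is invertible
    c1 = (rhs[0] * A[1][1] - A[0][1] * rhs[1]) / det
    c2 = (A[0][0] * rhs[1] - rhs[0] * A[1][0]) / det
    return c1, c2

c1, c2 = fixed_coordinates(3 + 4j)   # c1 and c2 come out real: 3 and 4
```

The solution is c_1=\frac{x+\overline x}{2} and c_2=\frac{x-\overline x}{2i}, the real and imaginary parts of x . Both are fixed by conjugation, and the identity row reads x=c_1\cdot 1+c_2\cdot i, just as the general argument predicts.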

Sylow Theorems

We know that |G|=[L:\mathbb R]. Write |G|=2^rq, where q is odd. We will prove that q=1. In other words, the number of symmetries must be a power of 2 .

Choose a subgroup P\subseteq G having the largest possible size among all subgroups whose sizes are powers of 2 . Write |P|=2^s. We claim that s=r . Assume instead that s<r. Consider the left cosets of P in G : gP=\{gp:p\in P\}. There are [G:P]=2^{r-s}q such cosets. This number is even because r-s>0 . The group P acts on these cosets by left multiplication: p\cdot(gP)=(pg)P. Every orbit has a number of elements dividing |P| , so every orbit has power-of-two size. In particular, every orbit with more than one point has even size. The total number of cosets is even. Therefore the number of one-point orbits must also be even. One one-point orbit is certainly P itself. So there must be another fixed coset gP\neq P. Being fixed means that (pg)P=gP for every p\in P. Equivalently, g^{-1}Pg=P. Thus g carries the subgroup P back to itself under conjugation. Let N be the collection of all elements of G with this property: N=\{g\in G:g^{-1}Pg=P\}. The quotient N/P has even size because there are an even number of fixed cosets. In any finite group with even size, there is a nonidentity element whose square is the identity. Indeed, pair every element with its inverse. Elements which are not their own inverses occur in pairs. If the identity were the only self-inverse element, the total number of elements would be odd. Therefore there must be another self-inverse element. Thus there is some g\in N\setminus P such that g^2\in P. Now consider the union P\cup gP. Because g preserves P under conjugation and because g^2\in P , this union is closed under multiplication. It is therefore a subgroup. It contains two distinct cosets of P , so it has size 2|P|. But this is a larger power-of-two subgroup than P , contradicting the choice of P . Therefore |P|=2^r.

Now consider the fixed field F=L^P. The fixed-field counting rule gives [F:\mathbb R]=[G:P]=q. But q is odd, and we proved that there are no nontrivial odd-degree extensions of \mathbb R . Hence F=\mathbb R. Therefore q=1. We have proved: |G|=2^r.

Thus the possible size of the group of symmetries is a power of two and no odd prime factor can occur.
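The pairing argument used above, that a finite group of even size contains a nonidentity element whose square is the identity, can be observed in small examples. The sketch below uses the groups of all permutations of 3 and of 4 symbols purely as test cases; these groups and the function names are assumptions of the illustration and play no role in the proof.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p' (tuples of images)."""
    return tuple(p[i] for i in q)

def involutions(n):
    """List the nonidentity self-inverse permutations of n symbols."""
    identity = tuple(range(n))
    return [p for p in permutations(range(n))
            if p != identity and compose(p, p) == identity]

for n in (3, 4):
    group_size = len(list(permutations(range(n))))
    self_inverse = len(involutions(n)) + 1      # include the identity
    # Elements that are not self-inverse pair off with their inverses,
    # so group_size and self_inverse have the same parity.
    assert (group_size - self_inverse) % 2 == 0
```

For 3 symbols there are 3 such elements (the transpositions), and for 4 symbols there are 9 (six transpositions and three double transpositions).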

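Now suppose that H\subseteq G has index 2 , that is, [G:H]=2. The fixed-field counting rule gives [L^H:\mathbb R]=2. But every quadratic extension of \mathbb R is \mathbb C . Therefore L^H=\mathbb C. Inside L there is only one such subfield, because it is generated by a root of x^2+1 , and if i and j are two roots in L then (j-i)(j+i)=0 , so j=\pm i . Different subgroups give different fixed fields: if H and H' had the same fixed field E , then the subgroup generated by H and H' would also fix E , so its size would be at most [L:E]=|H| , forcing H'\subseteq H , and by symmetry H=H' . So there can be only one subgroup of index 2 .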

We now use a fact about groups. Let M be a maximal proper subgroup of a finite group whose size is a power of 2 . “Maximal proper” means that M is not the whole group, but there is no subgroup strictly between M and the whole group. Using the same coset-counting argument as above, one shows that M must be normal: every group element carries M back to itself under conjugation. Therefore the quotient group G/M is defined. The quotient has power-of-two size and has no nontrivial proper subgroup, because M was maximal. The only power-of-two group with no nontrivial proper subgroup has two elements. Hence [G:M]=2. Therefore every maximal proper subgroup has index 2 .

But G has only one subgroup of index 2 . Call it M . Choose \sigma\in G\setminus M. We claim that repeated application of \sigma gives every element of G : G=\{1,\sigma,\sigma^2,\sigma^3,\dots\}. Suppose not. Then the subgroup generated by \sigma , written \langle \sigma\rangle, would be proper. Since the group is finite, it would lie inside some maximal proper subgroup. But every maximal proper subgroup has index 2 , and there is only one such subgroup, namely M . This would imply \sigma\in M, contradicting the choice of \sigma . Therefore G=\langle\sigma\rangle.

Suppose, for contradiction, that |G|\geq 4. Since G is cyclic and has power-of-two size, write |G|=2^r, \quad r\geq 2. Let \sigma generate G . Consider the subgroup H=\langle \sigma^4\rangle. The quotient G/H has four elements, so the fixed-field counting rule gives [L^H:\mathbb R]=4. Call this four-dimensional field M=L^H. Inside M lies the unique quadratic field \displaystyle \mathbb C. Indeed, the subgroup generated by \sigma^2 has index 2 , so its fixed field has degree 2 over \mathbb R , hence is \mathbb C . Therefore

\displaystyle \mathbb R\subseteq \mathbb C\subseteq M.

Since [M:\mathbb R]=4 and [\mathbb C:\mathbb R]=2, the multiplication rule gives [M:\mathbb C]=2. So a larger degree extension would force a nontrivial quadratic extension of the complex numbers.

We now show directly that such a thing cannot exist. Suppose that [M:\mathbb C]=2. Choose some \beta\in M\setminus\mathbb C. Then 1,\beta form a basis for M over \mathbb C . Therefore \beta^2=A+B\beta, \qquad A,B\in\mathbb C. Complete the square by putting \theta=\beta-\frac{B}{2}. Then \theta^2=A+\frac{B^2}{4}. Write \theta^2=a, \qquad a\in\mathbb C. We proved that every complex number has a complex square root. Therefore there is some c\in\mathbb C such that c^2=a. Thus \theta^2-c^2=0. Factor the left-hand side: (\theta-c)(\theta+c)=0. Since M is a field, one factor must vanish. Therefore \theta=c \qquad \text{or}\quad \theta=-c. In either case, \theta\in\mathbb C. But then \beta=\theta+\frac{B}{2}\in\mathbb C, contradicting the choice of \beta . Therefore: There is no nontrivial quadratic extension of \mathbb C.

This contradicts the conclusion reached above, that |G|\geq 4 forces a field M with [M:\mathbb C]=2. Hence |G|<4. Since |G| is a power of 2 , we must have |G|\leq 2. Because |G|=[L:\mathbb R], we obtain [L:\mathbb R]\leq 2.
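The calculation that rules out a quadratic layer over \mathbb C is easy to run numerically. The following sketch solves \beta^2=A+B\beta for complex A and B by completing the square. It relies on the standard-library function cmath.sqrt for the complex square root, and the function name and sample coefficients are assumptions made for illustration.

```python
import cmath

def roots_by_completing_square(A, B):
    """Solve beta^2 = A + B*beta over the complex numbers."""
    a = A + B * B / 4            # theta^2 = a, where theta = beta - B/2
    c = cmath.sqrt(a)            # a complex square root of a
    return (B / 2 + c, B / 2 - c)

A, B = 1j, 2 + 0j
for beta in roots_by_completing_square(A, B):
    residual = beta * beta - (A + B * beta)   # should vanish up to rounding
```

Both candidates for \beta are ordinary complex numbers, which is the content of the contradiction: no choice of A and B produces anything outside \mathbb C .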

Return to the irreducible polynomial f(x)\in\mathbb R[x]. We built a finite extension L of \mathbb R containing all of its roots. We proved that [L:\mathbb R]\leq 2. Let \alpha be any root of f . Then \mathbb R(\alpha)\subseteq L. Therefore [\mathbb R(\alpha):\mathbb R]\leq [L:\mathbb R]\leq 2. But [\mathbb R(\alpha):\mathbb R]=\deg f. Hence \deg f\leq 2. Thus every irreducible polynomial in \mathbb R[x] has degree 1 or 2. Every real polynomial can therefore be factored into real linear factors and real quadratic factors. Every real quadratic splits over \mathbb C , by the quadratic formula. Therefore every polynomial with real coefficients splits completely into linear factors over \mathbb C .

Now let Q(x)=c_nx^n+c_{n-1}x^{n-1}+\cdots+c_0, \qquad c_j\in\mathbb C, with n\geq 1 and c_n\neq 0. We want to prove that Q has a complex root. Form the polynomial obtained by conjugating each coefficient: Q^\ast(x)=\overline{c_n}x^n+\overline{c_{n-1}}x^{n-1}+\cdots+\overline{c_0}. Now multiply: P(x)=Q(x)Q^\ast(x). The polynomial P(x) has real coefficients. To see this, conjugate all of its coefficients. Since conjugating coefficients interchanges Q and Q^\ast , we get \overline{P(x)}=\overline{Q(x)}\,\overline{Q^\ast(x)}=Q^\ast(x)Q(x)=P(x). Hence every coefficient of P(x) is real. We have already proved that every nonconstant real polynomial has a complex root. So there is some z\in\mathbb C such that P(z)=0. Therefore Q(z)Q^\ast(z)=0, so Q(z)=0 or Q^\ast(z)=0. In the second case, take complex conjugates: 0=\overline{Q^\ast(z)}=Q(\overline z). Thus Q has a complex root in either case. After finding one root, divide by the corresponding linear factor and repeat. Therefore every complex polynomial splits completely into linear factors.
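The claim that Q(x)Q^\ast(x) has real coefficients can be tested on examples. The sketch below multiplies coefficient lists directly. The helper names and the sample polynomials are chosen for illustration, and coefficients are listed from the constant term upward.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0j] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def conjugate_poly(p):
    """Conjugate every coefficient: this is Q^* in the notation of the text."""
    return [complex(c).conjugate() for c in p]

def real_companion(q):
    """Return P = Q * Q^*, whose coefficients should all be real."""
    return poly_mul(q, conjugate_poly(q))

P = real_companion([-1j, 1])             # Q(x) = x - i, so P(x) = x^2 + 1
P2 = real_companion([3j, 1 + 2j, 1])     # Q(x) = x^2 + (1 + 2i) x + 3i
imaginary_parts = [c.imag for c in P2]   # all zero up to rounding
```

In the first example Q has the root i while Q^\ast has the root -i, and the product x^2+1 has both, matching the two cases in the argument.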

Summary of the Proof

Take a real irreducible polynomial and put all of its roots in a finite splitting field L . The symmetry group of L has as many elements as L has real coordinates. Odd-degree extensions add no new elements, so the odd part of the symmetry group disappears. The only possible first quadratic layer over \mathbb R is \mathbb C . The uniqueness of that quadratic layer forces the remaining symmetry group into one cyclic chain. If the splitting field were larger than \mathbb C , the chain would contain a degree-four field M over \mathbb R , and therefore a quadratic extension of \mathbb C : \mathbb R\subseteq\mathbb C\subseteq M, \qquad [M:\mathbb C]=2. But a proposed new element of such an extension can be completed to a square: \displaystyle \theta^2=a, \quad a\in\mathbb C. Since a already has a complex square root, this forces \theta and hence the proposed new element to lie in \mathbb C . Thus no new layer remains after the complex numbers.

The precise conclusion is that \mathbb C has no nontrivial finite extension: adjoining roots of polynomial equations with complex coefficients produces nothing new. That is exactly what it means for \mathbb C to be algebraically closed.
