Hensel’s Lemma

Hensel’s lemma gives necessary conditions to lift a solution mod {p} of a polynomial equation to a solution mod {p^k}. Collecting all these solutions mod {p^k}, we get a solution in the p-adic integers.


Basic Version:

If an integer polynomial {f(X) \in \mathbb Z[X]} satisfies

\displaystyle f(a) \equiv 0 \bmod p, \quad f^{\prime}(a) \not \equiv 0 \bmod p,

then there is unique solution to {f(x_k) =0 \bmod p^k}. All these solutions can be put together to get a solution {\alpha} in {\mathbb Z_p} such that {f(\alpha)=0.}


Examples:

Consider the equation {X^2-11 =0}

There is a solution {\bmod ~5} given by {X=1 \bmod 5}.

\displaystyle 11=1^2 \mod 5

{11 \neq 1^2 \mod 5^2}, but we can add a multiple of {5} to get a solution, and continue similarly.

\displaystyle \begin{aligned} 11=& (1+ 1.5)^2 \mod 5^2\\ 11=& (1+1.5 + 2.5^2) \mod 5^3\\ 11=& (1+1.5 + 2.5^2+ 0.5^3) \mod 5^4\\ 11=& (1+1.5 + 2.5^2+ 0.5^3 +0.5^4) \mod 5^5\\ 11=& (1+1.5 +2.5^2+ 0.5^3 +0.5^4+2.5^5) \mod 5^6\\ 11=& (1+1.5 +2.5^2+ 0.5^3 +0.5^4+2.5^5+0.5^6) \mod 5^7\\ 11=& (1+1.5 +2.5^2+ 0.5^3 +0.5^4+2.5^5+0.5^6+3.5^7) \mod 5^8\\ \end{aligned}

Therefore

\displaystyle \alpha =1+1.5 +2.5^2+ 0.5^3 +0.5^4+2.5^5+0.5^6+3.5^7+ \cdots

is a solution to {X^2-11} lifting {1 \mod 5}.

It’s not clear that we can continue the above process indefinitely, but Hensel lemma guarantees that there is a way to do it, and in fact a unique way to lift at each stage.

The other solution is obtained by lifting {X= 4 \mod 5}.

\displaystyle \beta = 4 + 3.5 + 2.5^2 + 4.5^3 + 4.5^4 + 2.5^5 + 4.5^6 + 5^7 + 5^8 + 2.5^9 \cdots

Example 2: {X^3-3} in {\mathbb Z_5[X]}

The solution lifting {X=2 \mod 5} is given by

\displaystyle 2 + 2.5 + 3.5^2 + 5^3 + 3.5^5 + 3.5^6 + 2.5^7 + 5^8 + 2.5^9 \cdots

This is the unique solution in {\mathbb Z_5} because {2 \mod 5} is the unique solution. Note that {f'(2)= 12 \neq 0 \mod 5.}

Example 3: \displaystyle f(X)=X^6-X-5 \in \mathbb Z_5[X]

There are two solutions {X=0, 1 \mod 5}. But \displaystyle f'(0)=-1, f'(1)=0 \mod 5

{X=0} lifts to a unique solution

\displaystyle \alpha =0+ 4.5 + 4.5^2 + 4.5^3 + 4.5^4 + 4.5^5 \cdots

But {X=1} doesn’t lift because, in fact for {\mod 25}, {X=20 =0 \mod 5} is the only solution.

Example 4: \displaystyle X^2-2X + 1 \in \mathbb Z_5[X]

The only root {\mod 5} is {X=1}. But {f(X)=f'(X)=0 \mod 5.} So Hensel’s lemma doesn’t guarantee a lift, but {X=1} is a solution {\mod 5^k} and hence {X=1} is a solution in {\mathbb Z_5}.

Example 5: {X^n-a \in \mathbb Z_p}, where {a=1 \mod p}.

If {a= 1 \mod p}, then {X=1} is a solution modulo {p}. And we have {f'(1) = n.}

So if {p \nmid n}, there is unique solution to {X^n=a} in {\mathbb Z_p} with {X =1 \mod p}.

In fact, if we have {a=1}, we are trying to find roots of unity. If {p} is co-prime to {n}, we have a unique solution lifting a solution {\mod p}. We can see that {X^{p-1}=1} has a unique solution lifting the solution {x \mod p}, where {x} is any non-zero residue. So we have {p-1} roots of unity for {n=p-1}. What about other {n}?

When {n} is coprime to {p}, then {X^{n(p-1)}=1} has solutions that lift any residue class, but these solutions because of the uniqueness of the lifting should be exactly the solutions to {X^{p-1}=1}. It can also seen that for {n} divisible by {p} (odd), there are no solutions to {X^n=1} other than the trivial solution {X=1}. (For {p=2}, we also get {X=\pm 1})

Example 6: \displaystyle f(X) = X^3-19 \in \mathbb Z_{3}[X]

The only solution {\mod 3} turns out to be {X=1} because {f(X) =(X-1)^3 \mod 3.} But {f'(1)=0 \mod 3}, we cannot decide using the lemma if there is a lifting.

We will come back to this example later.


Proof of Lemma: The proof is easy. If we have lifted the solution to {f(x_n) =0 \mod p^n}, then to get a solution mod {p^{n+1}}, we need

{x_{n+1} =x_n + p^n y_n}, and we want to determine {y_n} up to {\mod p} such that

\displaystyle f(x_{n+1}) = f(x_n + p^n y_n) =0 \mod p^{n+1}

\displaystyle \implies f(x_n) + f'(x_n)y_np^n = 0 \mod p^{n+1} \implies y_n = -\frac{f(x_n)}{p^n f'(a)} \mod p

So we can find y_n and hence find a lift to the solution. Note that we crucially use that the derivative doesn’t vanish in the above equation.


Newton’s method of approximations: We can think of the above method as starting from an approximate solution and applying Newton’s iteration to get better approximations. The difference from classical Newton’s method is that our notion of approximation or closeness is p-adic. (Number divisible by large powers of {p} are small) Newton infact used this method to solve equation in power series, the p-adic integers are somewhat like the power series in p

What happens when {f'(a)=o \mod p}.? We can see examples of cases where there are no lifts, cases where there are unique lifts and also cases with multiple lifts from the same solution {\mod p}.


We now give a stronger version of the lemma.

Stronger version: If a polynomial {f(X) \in \mathbb Z_p[X]}

\displaystyle |f(a)|_{p}<\left|f^{\prime}(a)\right|_{p}^{2}

for some { a \in \mathbb Z_p}, then we have a unique {\alpha \in \mathbb Z_p} such that {f(\alpha)=0} and {\alpha} is close to {a}, that is with

\displaystyle |\alpha-a|_{p} = \left|f(a) / f^{\prime}(a)\right|_{p}<\left|f^{\prime}(a)\right|_{p}

The case of {\left|f^{\prime}(a)\right|_{p}=1} reduces to the basis version mentioned before.


As mentioned before we want to prove the convergence of the Newton iteration

\displaystyle x_{n+1}=x_{n}-\frac{f\left(x_{n}\right)}{f^{\prime}\left(x_{n}\right)}.

The conditions of the theorem are exactly designed to get the convergence. We skip the calculations in the proof. The strong theorem can be deduced from the weak version by applying the theorem to a modification of the polynomial involved, so the abstract content of both the results is the same although the strong weaker seems to have the basic version as a special case.


IMPORTANT: In the weaker version, we lifted each solution uniquely to higher powers and constructed the solution. But for the strong version, the uniqueness needn’t be at every finite level {\mod p^n}, the solution is unique only in the limit in {\mathbb Z_p} .That is there could be multiple solutions lifting {\mod p} solutions to {\mod p^n}, for a large {n}. but only one of these solutions extends to all the higher powers indefinitely. Some solutions lift to higher powers and then stop extending further. To get uniqueness we have to look at solution not just mod {p}, but a high power {p^n}. There is a high power after which there is a unique lift provided the conditions of the theorem are satisfied.


{X^2-17 \in \mathbb Z_2[X]} has only one solution {X=1\mod 2}. But {f'(0)=0}, and hence we cannot say that the lift are unique or even that they exist. But

\displaystyle \alpha=1+2^{3}+2^{5}+2^{6}+2^{7} \cdots

\displaystyle \beta=1+2+2^{2}+2^{4}+2^{8}+\cdots

are the solutions. So we see that there are multiple lifts of the solution {\bmod 2}.

\displaystyle |f(1)|_{2} =\frac{1}{16} < \left|f^{\prime}(1)\right|_{2}^{2} =\frac{1}{4},

so the conditions of the second version hold and we have convergence to a unique solution with

\displaystyle |x-1|_{2} <\left|f^{\prime}(1)\right|_{2} =\frac{1}{2}

So {x} has to be {\alpha= 1+2^{3}+2^{5}+2^{6}+2^{7} \cdots.}.

If we start with {a=3}, then

\displaystyle |f(3)|_{2} =\frac{1}{8} < \left|f^{\prime}(3)\right|_{2}^{2} =\frac{1}{4},

so the solution converges to {x} such that

\displaystyle |x-3|_{2} <\left|f^{\prime}(3)\right|_{2} =\frac{1}{2}

Hence the solution in this case has to be {\beta=1+2+2^{2}+2^{4}+2^{8}+\cdots}

Example 6: We will look at the previous example.

\displaystyle f(X) = X^3-19 \in \mathbb Z_{3}[X]

{X=1} doesn’t satisfy the conditions of the basic version and even this version, but {X=7} satisfies the strong version:

\displaystyle |f(1)|_{3}= \frac{1}{81}< \frac{1}{3^2} =\left|f^{\prime}(1)\right|_{3}^{2}.

Hence there is a unique solution such that {|\alpha-7|_{3} < \frac{1}{3}} which is given by

\displaystyle \alpha=1 + 2.3 + 2.3^3 + 2.3^4 + 3^5 + 3^6 + 3^7\cdots

Finally, the conditions of the stronger version are always true if we choose {a} to be close to a root of {f(X)} in {\mathbf Z_p[X]} provided the root is a simple root. That is the conditions are necessary and not just sufficient to find a simple root.


Multivariable Hensel’s Lemma:

The strong version can be generalized as follows.

If a polynomial {f(X_1, X_2, \cdots X_n) \in \mathbb Z_p[X_1, X_2, \cdots, X_n]} is such that

\displaystyle |f(\mathbf a)|<\left\|\nabla f(\mathbf a)\right\|_{\infty}^{2}

for some { \mathbf a \in \mathbb Z_p^n}, then we have a unique {\boldsymbol{\alpha} \in \mathbb Z_p^n} such that {f(\boldsymbol\alpha)=0} and {\boldsymbol \alpha} is close to {\mathbf a}, that is with

\displaystyle |\boldsymbol \alpha-\mathbf a|_{\infty} <\left\|\nabla f(\mathbf a)\right\|_{ \infty}

Where are the norms are p-adic and {\infty} refers to taking maximum of the absolute values of coordinates. That is

\displaystyle |\mathbf a|_{\infty} =\max_{i} |a_i|_p, \text{ where } \mathbf a = (a_1, a_2, \cdots a_n).

Here {\nabla} is the gradient.

More generally if {\mathbf f =(f_1, f_2, \cdots f_m)} are polynomials in {\mathbb Z_p[X_1, X_2, \cdots, X_n]} such that

\displaystyle \left\|\mathbf f(\mathbf a)\right\|_{\infty}<\left\|J f(\mathbf a)\right\|^{2}

for some { \mathbf a \in \mathbb Z_p^n}, then we have a unique {\boldsymbol{\alpha} \in \mathbb Z_p^n} such that {\mathbf f(\boldsymbol\alpha)=\mathbf 0} and {\boldsymbol \alpha} is close to {\mathbf a}, that is with

\displaystyle |\boldsymbol \alpha-\mathbf a|_{\infty} <\left\| J f(\mathbf a)\right\|

Here {J} is the Jacobian, the determinant of the Derivative matrix.

Posted in $.

Leave a comment