Hilbert Finite Generation of Invariants, Null Cones, Derksen’s Ideal

Suppose a group {G} acts on a vector space and we want to classify the invariant polynomials on the space.

1. For example, consider the action of {SL_2(\mathbb R)} on the three dimensional space of binary quadratic forms {ax^2+bxy+cy^2}. The invariant polynomials are generated by the discriminant {\Delta = b^2-4ac}; that is, the ring of invariants is {\mathbb C[\Delta]}.

2. Similarly, for the action of {SL_2} on binary cubic forms \displaystyle a x_{1}^{3}+bx_{1}^{2} x_{2}+c x_{1} x_{2}^{2}+d x_{2}^{3}, the invariants are generated by the discriminant polynomial (a symbolic check of its invariance appears after these examples)

\displaystyle \Delta=27 a^2 d^{2}-18 a b c d+4 a c^{3}+4 b^{3} d-b^{2} c^{2}.

3. If we act by {\mathbb Z/2} on a two dimensional vector space by {(x, y) \rightarrow (-x, -y)}, the polynomials invariant under this action are given by

\displaystyle \{ P \mid P(x, y) = P(-x, -y)\} = k[x^2, y^2, xy] \cong k[X, Y, Z]/\langle Z^2-XY \rangle.

4. The ring of symmetric polynomials is generated by the elementary symmetric polynomials. That is

\displaystyle k[x_1, x_2, \cdots , x_n]^{S_n} = k[e_1, e_2, \cdots, e_n]

where

\displaystyle e_{k}=\sum_{1 \leq i_{1}<i_{2}<\cdots<i_{k} \leq n} x_{i_{1}} x_{i_{2}} \cdots x_{i_{k}},

that is

\displaystyle e_1=x_1+x_2+\cdots + x_n, \quad e_2 = x_1x_2 + x_1x_3 + \cdots + x_{n-1}x_n, \quad \cdots, \quad e_n = x_1x_2\cdots x_n.

5. What are the invariants of the Grassmannian {G(k, n)}, the space of {k+1} dimensional planes in {n+1} space, when viewed as sets of {k+1} basis vectors under the action of {GL_{k+1}} (change of basis/coordinates)?

The Plücker coordinates, which are the {(k+1) \times (k+1)} minors of the matrix containing the {k+1} vectors, are invariant functions. In fact, they generate all the invariant functions.

6. For the action of {SL_3} on ternary cubic forms, the invariants are generated by {I_4} (the Aronhold invariant) and {I_6}.
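
For a concrete sanity check of example 2, here is a minimal sympy sketch (the matrix-entry names {p, q, r, s} and all code names are ours): it verifies that the cubic discriminant transforms with the factor {(\det g)^6} under {GL_2}, hence is an honest {SL_2} invariant.

```python
# Check that the cubic discriminant from example 2 is an SL_2 invariant.
from sympy import symbols, Poly, expand

a, b, c, d = symbols('a b c d')   # coefficients of the binary cubic
p, q, r, s = symbols('p q r s')   # entries of a matrix g in GL_2
x, y = symbols('x y')

f = a*x**3 + b*x**2*y + c*x*y**2 + d*y**3
Delta = 27*a**2*d**2 - 18*a*b*c*d + 4*a*c**3 + 4*b**3*d - b**2*c**2

# Act by g: substitute (x, y) -> (p*x + q*y, r*x + s*y) and read off the
# coefficients of the transformed cubic.
gf = Poly(expand(f.subs({x: p*x + q*y, y: r*x + s*y}, simultaneous=True)), x, y)
A = gf.coeff_monomial(x**3)
B = gf.coeff_monomial(x**2*y)
C = gf.coeff_monomial(x*y**2)
D = gf.coeff_monomial(y**3)
gDelta = 27*A**2*D**2 - 18*A*B*C*D + 4*A*C**3 + 4*B**3*D - B**2*C**2

# Delta transforms with weight 6: g.Delta = (det g)**6 * Delta, so it is
# invariant exactly when det g = 1, i.e. on SL_2.
print(expand(gDelta - (p*s - q*r)**6 * Delta))  # prints 0
```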

In all these cases we have a finite list of generators, possibly with relations between them. Is the list always finite? If it is, how do we find a generating set and the relations between the generators?

It’s not always true that the ring is finitely generated (Nagata gave counterexamples), but Hilbert proved finite generation in many interesting cases (complex representations of finite groups, matrix groups, etc.).

To prove this result, we first need finite generation of ideals in polynomial rings.


Hilbert Basis Theorem: Every ideal of a polynomial ring is finitely generated.

Proof: For polynomials in one variable, we can take a polynomial of minimum degree inside the ideal. By the Euclidean/division algorithm, this polynomial has to divide all the polynomials in the ideal and hence generates the ideal. What about more variables?

Let’s say we have two variables {X, Y}. View the elements of the ideal as polynomials in the second variable {Y}, and find the polynomials in the ideal whose leading coefficients (which are polynomials in the one variable {X}) have the smallest degree in {X}. Some combination of these polynomials can then be used to remove the leading term of any given polynomial in the ideal. We can repeat the same with the smaller-degree polynomial obtained; thus we have a way to write any given polynomial in terms of these few polynomials with “smallest” leading terms.

We can continue the induction and do this generally for more variables: view the polynomials as polynomials in the last variable, and consider finitely many polynomials in the ideal whose leading coefficients (polynomials in fewer variables) generate the ideal of all the leading coefficients.

To summarize: the ideal in {R[X]} is generated by polynomials in the ideal whose leading coefficients generate the ideal of leading coefficients in {R}. (Here {R} is the polynomial ring in fewer variables, which is Noetherian by induction.)
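
The mechanism in this proof, cancelling the leading term of a polynomial against the leading terms of finitely many chosen generators, is exactly multivariate polynomial division. Here is a minimal sketch using sympy’s reduced (the particular polynomials are ours, a standard textbook example):

```python
# Multivariate division: write f = q1*g1 + q2*g2 + r by repeatedly
# cancelling the current leading term of f against those of g1, g2,
# as in the proof sketch above.
from sympy import symbols, reduced

X, Y = symbols('X Y')

f = X**2*Y + X*Y**2 + Y**2
g1 = X*Y - 1
g2 = Y**2 - 1

quotients, remainder = reduced(f, [g1, g2], X, Y, order='lex')
print(quotients)  # [X + Y, 1]
print(remainder)  # X + Y + 1
```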


We will use the Hilbert basis theorem to prove finite generation of the ring of invariant polynomials in many cases.

Consider the case where {G} is finite and {k} is {\mathbb C} or any characteristic {0} field.

Consider the ring of invariants {k[x_1, x_2, \cdots, x_n]^{G}}. The non-constant invariants generate an ideal in the whole ring {k[x_1, x_2, \cdots, x_n]}, and by the Hilbert basis theorem this ideal is generated by a finite set of invariants, say {e_1, e_2, \cdots, e_k}.

Given any invariant polynomial {F}, we can write it in terms of the generators as

\displaystyle F= a_1e_1+a_2e_2+ \cdots + a_k e_k.

Now apply the following averaging operation, called the Reynolds operator,

\displaystyle R(x) =\frac{1}{|G|} \sum_{g \in G} g\cdot x

to get \displaystyle R(F) =F= R(a_1)e_1+R(a_2)e_2+ \cdots + R(a_k)e_k, where we used that {R} is linear and that {R(a_ie_i)=R(a_i)e_i} since each {e_i} is invariant.

The {R(a_i)} are again invariant and (taking everything homogeneous) of strictly smaller degree than {F}, so we can repeat this process with the {R(a_i)} to get a polynomial expression for {F} completely in terms of {e_1, e_2, \cdots, e_k}. We are done!
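
Here is a minimal sketch of the Reynolds operator in sympy, for the slightly less trivial group {\mathbb Z/4} acting by rotation {(x, y) \rightarrow (-y, x)} (this example and the helper names are ours):

```python
# Reynolds operator R(p) = (1/|G|) * sum over g in G of g.p,
# for Z/4 acting on the plane by (x, y) -> (-y, x).
from sympy import symbols, Rational, expand

x, y = symbols('x y')

# The four group elements, each given as a substitution.
group = [
    {x: x, y: y},
    {x: -y, y: x},
    {x: -x, y: -y},
    {x: y, y: -x},
]

def reynolds(p):
    """Average the polynomial p over the group."""
    return expand(Rational(1, len(group))
                  * sum(p.subs(g, simultaneous=True) for g in group))

print(reynolds(x**2))         # x**2/2 + y**2/2 : projects onto an invariant
print(reynolds(x**2 * y**2))  # x**2*y**2       : already invariant
print(reynolds(x**3))         # 0               : no invariant component
```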


What about non-finite {G}?

If {G} is compact, we can integrate and the averaging makes sense. If {G} is non-compact, like {SL_2(\mathbb C)}, one has to use Weyl’s unitary trick to reduce to the compact case: we can integrate over a maximal compact subgroup, and the result is automatically invariant under the whole group.

Hilbert proved the result for linearly “reductive” groups. A group is (linearly) reductive if every invariant subspace has a complementary invariant subspace. The Reynolds operator {R} gives the splitting of the whole vector space into the subspace of invariants and a complement, but in general an abstract splitting is enough. Examples of reductive groups are {GL_n, SL_n, O_n}.

This proof is not explicit and doesn’t give any bounds on the number or complexity of the generators, because we need to assume that the list of all invariant polynomials is given to us and that we can find a collection of them with “small” leading terms. And finally, the Reynolds operator may not always be as explicit as in the case of finite groups.

One obvious way to find invariants is to solve the equation {g\cdot P=P}. If {G} is finite, this gives a finite set of linear equations on the space of degree {d} polynomials. For a Lie group we can solve the infinitesimal version: start with {g \cdot P =P} and differentiate at the identity to get {X \cdot P= 0} for {X} in the Lie algebra.
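
For instance, for the {\mathbb Z/2} action of example 3, the invariants of degree {\le 2} can be found by exactly this kind of linear algebra; here is a minimal sympy sketch (the coefficient names are ours):

```python
# Find invariants of degree <= 2 for (x, y) -> (-x, -y) by solving the
# linear system g.P = P on the unknown coefficients of P.
from sympy import symbols, Poly, expand, solve

x, y = symbols('x y')
c0, c1, c2, c3, c4, c5 = symbols('c0:6')

# A general polynomial of degree <= 2 with unknown coefficients.
P = c0 + c1*x + c2*y + c3*x**2 + c4*x*y + c5*y**2

# g.P - P must vanish identically; its coefficients give linear equations.
diff = expand(P.subs({x: -x, y: -y}, simultaneous=True) - P)
equations = Poly(diff, x, y).coeffs()

print(solve(equations, [c0, c1, c2, c3, c4, c5]))  # {c1: 0, c2: 0}
```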


So we give another proof, this time constructive, due to Hilbert. We will focus on {k =\mathbb C}.

Null Cone Proof:

The null cone is defined as the set of vectors whose orbit closures contain {0}. In terms of the invariant polynomials, these are the vectors indistinguishable from the origin using the values of the invariant polynomials:

\displaystyle \{v \in V \mid 0 \in \overline{G \cdot v}\}=\left\{v \in V \mid P(v)=P(0) \text { for all } P \in {k}[{x_1, x_2, \cdots x_n}]^{G}\right\}

If {P} is homogeneous and non-constant, we have {P(0)=0}. Thus the null cone is the variety defined by the homogeneous generating invariants of positive degree.

Hilbert-Mumford Criterion: A vector is in the null cone if and only if there is a one-parameter subgroup {a: \mathbb C^{*} \rightarrow G} such that

\displaystyle a(t) \cdot v \rightarrow 0 \text{ as } t \rightarrow 0.

Examples:

1. Consider the conjugation action of {GL_n} on {n \times n} matrices. The invariants of this action are generated by the traces {\text{tr}(A^k)}, or equivalently by the coefficients of the characteristic polynomial: {\text{tr}(A), \ldots, \det(A)}. So the null cone of this action consists of the matrices with the same characteristic polynomial as {\mathbf 0}, namely {t^n}; these are exactly the nilpotent matrices ({A^n=0}). We can see directly that nilpotent matrices are exactly the matrices that are driven to zero by a torus action (see the sketch after these examples).

2. The action of {S_n} by permuting the coordinates. The null cone is just the origin, since the elementary symmetric polynomials vanish simultaneously only at the origin.

3. The action of {SL_n} on the matrices {M_{n \times n}}, say by left multiplication, so that the invariants are generated by the determinant. The null cone is the set of singular matrices.
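
To illustrate the Hilbert-Mumford criterion in example 1, here is a minimal sympy sketch (the particular one-parameter subgroup is our choice): conjugating a nilpotent Jordan block by {\text{diag}(t^3, t^2, t)} scales the entry in position {(i, j)} by {t^{j-i}}, so the whole matrix goes to {0} as {t \rightarrow 0}.

```python
# A one-parameter subgroup a(t) = diag(t**3, t**2, t) of GL_3 drags a
# nilpotent Jordan block to the zero matrix as t -> 0.
from sympy import Matrix, symbols, diag

t = symbols('t', positive=True)

A = Matrix([[0, 1, 0],
            [0, 0, 1],
            [0, 0, 0]])          # nilpotent: A**3 == 0
D = diag(t**3, t**2, t)          # the one-parameter subgroup a(t)

conj = D * A * D.inv()
print(conj)                      # off-diagonal entries scaled by t**(j-i)
print(conj.limit(t, 0))          # the zero matrix
```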

To decide whether a vector is in the null cone, we look at the element of minimal norm in its orbit closure.

Moment map: The derivative of the map {g \rightarrow \|g \cdot v \|^2} at the identity is called the moment map at {v}.

Kempf-Ness: An element is not in the null cone iff there is a non-zero element in its orbit closure whose moment map is zero. Indeed, if the orbit closure doesn’t contain the origin, we can take the element closest to the origin; it is a critical point of the norm, so the moment map vanishes there.


Derksen’s Ideal and an Algorithm to Find the Generators:

Consider the pairs of vectors which cannot be distinguished by invariant polynomials, that is, pairs {v, w} such that

{P(v) =P(w)} for all {P \in k[x_1, x_2, \cdots, x_n]^{G}}.

The set of such pairs {(v, w)} can be captured by the conditions


\displaystyle P(v)-P(w)=P(x_1, x_2, \cdots, x_n)-P(y_1, y_2, \cdots, y_n)=0.


These differences generate an ideal in the polynomial ring in {2n} variables, called Derksen’s ideal.

Now the Hilbert ideal of non-constant invariant polynomials from the finiteness proof above can be obtained by setting {y_i=0} in the generators of Derksen’s ideal.

Example: For the action of {SL_2} on binary quadratic forms, the group is cut out by the relation

\displaystyle ad-bc=1

The action of {SL_2} on the binary form {[x_1, x_2, x_3]= x_1x^2+x_2xy+x_3y^2} (via the substitution {(x, y) \rightarrow (ax+by, cx+dy)}) is given by

\displaystyle x_1 \rightarrow y_1 =a^2x_{1}+ac x_{2}+c^{2} x_{3}
\displaystyle x_2 \rightarrow y_2 =2 ab x_{1}+\left(a d+ bc\right) x_{2}+2 cd x_{3}
\displaystyle x_3 \rightarrow y_3=b^{2} x_{1}+b d x_{2} + d^{2} x_{3}

Derksen’s ideal is obtained by taking the ideal in {k[a, b, c, d, x_1, x_2, x_3, y_1, y_2, y_3]} given by

\displaystyle \left\langle ad-bc-1, a^2x_{1}+ac x_{2}+c^{2} x_{3}-y_1, 2 ab x_{1}+\left(a d+ bc\right) x_{2}+2 cd x_{3}-y_2, b^{2} x_{1}+b d x_{2} + d^{2} x_{3} -y_3 \right\rangle

and eliminating the variables {a, b, c, d}.

The resulting elimination ideal can be computed to be

\displaystyle D= \left\langle 4 x_{1} x_{3}-x_{2}^{2}-4 y_{1} y_{3}+y_{2}^{2} \right\rangle

This is Derksen’s ideal.

Now setting {y_i=0} we get \displaystyle \Delta = \langle 4 x_{1} x_{3}-x_{2}^{2} \rangle, the ideal generated by the discriminant (up to sign).
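
The elimination above can be reproduced with a Gröbner basis computation. Here is a minimal sympy sketch (variable names are ours, and the lex computation may take a moment): with lex order and {a, b, c, d} listed first, the basis elements free of {a, b, c, d} generate the elimination ideal.

```python
# Compute Derksen's ideal for SL_2 acting on binary quadratics by
# eliminating a, b, c, d with a lex Groebner basis.
from sympy import symbols, groebner

a, b, c, d, x1, x2, x3, y1, y2, y3 = symbols('a b c d x1 x2 x3 y1 y2 y3')

gens = [
    a*d - b*c - 1,
    a**2*x1 + a*c*x2 + c**2*x3 - y1,
    2*a*b*x1 + (a*d + b*c)*x2 + 2*c*d*x3 - y2,
    b**2*x1 + b*d*x2 + d**2*x3 - y3,
]

# Lex order with a, b, c, d first: the basis elements not involving
# them generate the elimination ideal.
G = groebner(gens, a, b, c, d, x1, x2, x3, y1, y2, y3, order='lex')
eliminated = [g for g in G.exprs if not g.free_symbols & {a, b, c, d}]
print(eliminated)  # expect 4*x1*x3 - x2**2 - 4*y1*y3 + y2**2, up to sign
```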


Primary Invariants:

There are algebraically independent homogeneous elements {f_1, f_2, \cdots, f_l} of the invariant ring such that the whole invariant ring is a finitely generated module over the polynomial ring {k[f_1, f_2, \cdots, f_l]} (Noether normalization).

Secondary Invariants:

We have

\displaystyle k[x_1, x_2, \cdots x_n]^{G}=\bigoplus_{j=1}^{r} k\left[f_{1}, \ldots, f_{l}\right] g_{j}

The {f_i} are called primary invariants and the {g_j} are called secondary invariants. For instance, in example 3 above, {k[x, y]^{\mathbb Z/2} = k[x^2, y^2] \oplus k[x^2, y^2]\, xy}: the primary invariants are {x^2, y^2} and the secondary invariants are {1, xy}.
