Orthogonal Matrices over the Rationals and the Cayley Correspondence

Orthogonal matrices are matrices such that \displaystyle AA^{T} =A^TA=I_n . Skew-symmetric matrices, those with \displaystyle X+X^{T} =O , correspond to infinitesimal orthogonal matrices (the Lie algebra). That is:

If \displaystyle A \approx I+\epsilon X , then \displaystyle AA^{T} =A^TA=I_n \implies (I+\epsilon X)(I+\epsilon X^{T}) = I_n+\epsilon (X+X^{T})+\epsilon^2 XX^{T} =I_n , and to first order in {\epsilon} this gives \displaystyle X+X^{T}=0 .

Question: How do we parametrize orthogonal matrices over rational numbers?

The standard method over the reals is to pick an orthonormal basis for the columns: pick any unit vector, then move to its orthogonal complement and pick two orthogonal unit vectors there. But over the rationals, even the first step of choosing a unit vector with rational coordinates puts constraints on which vectors we can choose: by Gauss's theorem on numbers of the form {x^2+y^2+z^2}, we get congruence constraints on the common denominator of the coordinates. And then it is not clear that the orthogonal complement has an orthonormal basis with rational coordinates. (The orthogonal complement certainly has a rational basis, but it is not obvious that we can choose unit vectors in it, that is, that the unit sphere restricted to this subspace has any rational points.)

Before we describe a way to parametrize orthogonal matrices, here is a way to see that there is some orthogonal matrix with a given rational unit vector as its first column: consider the reflection which takes the standard unit vector {e_1} to the chosen rational unit vector. This reflection is a rational matrix, and hence the image of the standard basis gives us a rational orthonormal basis with the chosen vector as the first column. By induction, this also implies that there is a rational orthogonal matrix with any given set of rational orthonormal vectors as its first few columns. That is, we can extend a rational orthonormal basis of a subspace to a rational orthonormal basis of the whole space.
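To make the reflection trick concrete, here is a small Python sketch (my own illustration, not part of the original construction) using exact rational arithmetic: for a rational unit vector {u \neq e_1}, the reflection {R = I - 2vv^{T}/(v^{T}v)} with {v = u - e_1} is rational, orthogonal, and sends {e_1} to {u}. The function name householder_to is just a label for this sketch.

```python
from fractions import Fraction as F

def householder_to(u):
    """Rational orthogonal matrix whose first column is the rational unit
    vector u: the reflection R = I - 2 v v^T / (v.v) with v = u - e_1."""
    n = len(u)
    e1 = [F(1)] + [F(0)] * (n - 1)
    v = [ui - ei for ui, ei in zip(u, e1)]
    vv = sum(vi * vi for vi in v)
    if vv == 0:                      # u is already e_1; use the identity
        return [[F(i == j) for j in range(n)] for i in range(n)]
    return [[F(i == j) - 2 * v[i] * v[j] / vv for j in range(n)]
            for i in range(n)]

u = [F(2, 3), F(2, 3), F(1, 3)]      # a rational unit vector
R = householder_to(u)
assert [row[0] for row in R] == u    # first column is u
assert all(sum(R[k][i] * R[k][j] for k in range(3)) == F(i == j)
           for i in range(3) for j in range(3))   # R^T R = I, exactly
```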

Now we describe the parametrization of rational orthonormal bases:

In 3 dimensions, rational unit vectors correspond to integer vectors with integer length (up to a common scaling). That is, clearing denominators gives the equation

\displaystyle X^2+Y^2+Z^2= d^2

We can parametrize these solutions, or equivalently the rational solutions to {x^2+y^2+z^2=1}, by drawing lines from a rational point on the sphere; the formulas below come from projecting from the point {(0,0,-1)}. (This is the same construction that gives Pythagorean triples and rational points on the circle.)

The result is a parametrization

\displaystyle X= 2ac, ~~Y = 2bc,~~ Z= -a^2-b^2+c^2,~~ d =a^2+b^2+c^2

for the integer points with integer lengths and

\displaystyle x= \frac{2t_1}{1+t_1^2+t_2^2}, \quad y= \frac{2t_2}{1+t_1^2+t_2^2}, \quad z= \frac{1-t_1^2-t_2^2}{1+t_1^2+t_2^2}

for rational points on the sphere {x^2+y^2+z^2=1} (the two parametrizations are related by {t_1=a/c,\ t_2=b/c}).
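As a quick sanity check (a sketch of my own, assuming Python with exact rational arithmetic), both parametrizations can be verified directly:

```python
from fractions import Fraction as F

def sphere_point(t1, t2):
    """Rational point on x^2 + y^2 + z^2 = 1 from the parameters t1, t2."""
    d = 1 + t1 * t1 + t2 * t2
    return (2 * t1 / d, 2 * t2 / d, (1 - t1 * t1 - t2 * t2) / d)

x, y, z = sphere_point(F(1, 2), F(1, 3))
assert x * x + y * y + z * z == 1         # here (x, y, z) = (36/49, 24/49, 23/49)

# integer form: (2ac)^2 + (2bc)^2 + (c^2 - a^2 - b^2)^2 = (a^2 + b^2 + c^2)^2
a, b, c = 3, 5, 7
assert (2*a*c)**2 + (2*b*c)**2 + (c*c - a*a - b*b)**2 == (a*a + b*b + c*c)**2
```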

So start with any rational unit vector {\mathbf u= (r, s, t)} (just pick some values in the above parametrization), and then substitute the parametrization into the equation {\mathbf u \cdot \mathbf x=0}; after clearing the common denominator {1+t_1^2+t_2^2}, this becomes

\displaystyle r (2t_1) + s(2t_2) + t(1-t_1^2-t_2^2)=0

When {t=0} we get the solution {\mathbf x = (s, -r, 0)}. So assume {t \neq 0.}

Viewing this as a quadratic equation in {t_1}, we need the discriminant (divided by {4}) to be the square of a rational number:

\displaystyle r^{2}-t\left(t t_2^{2}-2 s t_2-t\right) = k^2

\displaystyle \implies (s-t t_2)^{2}+k^{2}= r^2+s^2+t^2=1

This is a circle with a rational point {(t_2, k)=(s/t, 1)}, and we can parametrize its rational points by drawing lines through this rational point (as with Pythagorean triples) to get

\displaystyle s-t t_2 = \frac{2b}{1+b^2}, \quad k = \frac{1-b^2}{1+b^2}.

From these relations we can solve for {t_1, t_2} (the quadratic formula gives {t_1 = (r \pm k)/t}), and hence we find a one-parameter family of rational unit vectors orthogonal to the given vector. The third vector, orthogonal to both, is then determined uniquely up to sign.

To summarize:

a) Start with any rational unit vector {\mathbf u_1}; we can do this by choosing values for the parameters in the parametrization {x= \frac{2t_1}{1+t_1^2+t_2^2},\ y= \frac{2t_2}{1+t_1^2+t_2^2},\ z= \frac{1-t_1^2-t_2^2}{1+t_1^2+t_2^2}}.

b) Use the process shown above to find a parametrized family of rational unit vectors {\mathbf u_2} orthogonal to the first vector.

c) Finally, the third vector can be obtained as the cross product {\pm \mathbf u_1 \times \mathbf u_2}. (Its entries are {2\times 2} determinants in the coordinates of {\mathbf u_1} and {\mathbf u_2}, hence rational.) A code sketch of this pipeline follows below.
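Here is a minimal Python sketch of the whole pipeline under the assumptions above ({t \neq 0}, a generic choice of the circle parameter {b}, and the root {t_1=(r+k)/t} of the quadratic); the function orthogonal_matrix and the parameter names are mine, introduced only for illustration.

```python
from fractions import Fraction as F

def orthogonal_matrix(r, s, t, b):
    """3x3 rational orthogonal matrix whose first column is the rational
    unit vector u = (r, s, t), t != 0; b parametrizes the circle
    (s - t*t2)^2 + k^2 = 1 from the discussion above."""
    assert r*r + s*s + t*t == 1 and t != 0
    # step b): a rational point on the circle, then back-substitute
    k  = (1 - b*b) / (1 + b*b)
    t2 = (s - 2*b / (1 + b*b)) / t
    t1 = (r + k) / t                   # one root of the quadratic in t1
    d  = 1 + t1*t1 + t2*t2
    v  = (2*t1 / d, 2*t2 / d, (1 - t1*t1 - t2*t2) / d)   # second column
    u  = (r, s, t)
    # step c): third column as the cross product u x v
    w  = (u[1]*v[2] - u[2]*v[1],
          u[2]*v[0] - u[0]*v[2],
          u[0]*v[1] - u[1]*v[0])
    return [[u[i], v[i], w[i]] for i in range(3)]

A = orthogonal_matrix(F(2, 3), F(2, 3), F(1, 3), F(1, 5))
for i in range(3):                     # columns are exactly orthonormal
    for j in range(3):
        assert sum(A[k][i] * A[k][j] for k in range(3)) == F(i == j)
```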

This handles 3-dimensional orthogonal matrices. We can carry out a similar process for n-dimensional matrices, with n-1 parameters for the first column, n-2 for the second column, and so on, recovering the \frac{n(n-1)}{2}-dimensional orthogonal group. We did this computation explicitly only in 3 dimensions, and a priori it is not clear that the lower-dimensional spheres cut out by the orthogonality constraints have rational points at all, which we need in order to draw lines and get a rational parametrization. But, as mentioned before, the reflection construction produces at least one rational orthogonal matrix with any given choice of rational orthonormal columns, and this means that the spheres orthogonal to the chosen columns do have rational points.

Next, we describe another way to generate the orthogonal matrices.

We have the following correspondence, due to Cayley, between orthogonal matrices without eigenvalue {-1} and skew-symmetric matrices:

\displaystyle X \rightarrow A =(I+X)(I-X)^{-1} \quad \mbox{ and } \quad A \rightarrow X = (I+A)^{-1}(A-I)

Proof: First note that given a skew-symmetric matrix {X}, {A=(I+X)(I-X)^{-1}} satisfies

\displaystyle A^{T} A= \left((I+X)(I-X)^{-1}\right)^{T}(I+X)(I-X)^{-1} =\left((I-X)^{-1}\right)^{T}(I+X)^{T}(I+X)(I-X)^{-1}

\displaystyle = (I+X)^{-1}(I-X)(I+X)(I-X)^{-1}=(I+X)^{-1}(I+X)(I-X)(I-X)^{-1}=I_n

where we used that {X} is skew-symmetric ({X^{T}=-X}) and that {(I-X)} and {(I+X)} commute.

Similarly, if {A} is an orthogonal matrix without eigenvalue {-1}, then {X=(I+A)^{-1}(A-I)} satisfies

\displaystyle X^{T} = \left((I+A)^{-1}(A-I)\right)^{T} =\left(A^{T}-I\right)\left(A^{T}+I\right)^{-1} =\left(A^{-1}-I\right)\left(A^{-1}+I\right)^{-1}

\displaystyle =\left(A^{-1}-I\right) A A^{-1}\left(A^{-1}+I\right)^{-1} =-(A-I)(A+I)^{-1}

\displaystyle = -(A+I)^{-1}(A-I) =-X

Check that the maps are inverses of each other:

\displaystyle \displaystyle A =(I+X)(I-X)^{-1} \implies A (I-X) = (I+X)

\displaystyle A -AX = I + X \implies A-I = AX+ X =(A+I)X \implies X = (I+A)^{-1}(A-I)

This construction is purely algebraic; it works over the rationals, and more generally over any field of characteristic {\neq 2} in which {I-X} is invertible. Over {\mathbb{Q}}, {I-X} is invertible for every skew-symmetric {X}: if {(I-X)v=0} then {v=Xv}, so {v^{T}v=v^{T}Xv=0} and hence {v=0}.
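As an illustration (a sketch of my own, not a prescribed implementation), the Cayley map can be computed exactly over the rationals with Python's fractions; the helpers mat_mul and mat_inv below are minimal hand-rolled routines:

```python
from fractions import Fraction as F

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_inv(M):
    """Gauss-Jordan inverse with exact rational arithmetic."""
    n = len(M)
    A = [[F(x) for x in row] + [F(i == j) for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        piv = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]

def cayley(X):
    """A = (I + X)(I - X)^{-1} for a skew-symmetric X."""
    n = len(X)
    IpX = [[F(i == j) + X[i][j] for j in range(n)] for i in range(n)]
    ImX = [[F(i == j) - X[i][j] for j in range(n)] for i in range(n)]
    return mat_mul(IpX, mat_inv(ImX))

# a skew-symmetric matrix from three rational parameters
p, q, r = F(1, 2), F(2, 3), F(-3, 5)
X = [[0, p, q], [-p, 0, r], [-q, -r, 0]]
A = cayley(X)
At = [[A[j][i] for j in range(3)] for i in range(3)]
assert mat_mul(At, A) == [[F(i == j) for j in range(3)] for i in range(3)]
```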

So we have a method to construct orthogonal matrices, provided they do not have the eigenvalue {-1}. What do we do about matrices with eigenvalue {-1}?

We just multiply by a diagonal matrix with entries {\pm 1} to get an orthogonal matrix without eigenvalue {-1} (and multiplying by the same diagonal matrix again recovers the original matrix). How do we know that such a diagonal matrix exists? Not just for orthogonal matrices: any matrix over a field of characteristic not equal to {2} can be multiplied by a {\pm 1} diagonal matrix to get a matrix without the eigenvalue {-1}.

Indeed, if that is not the case, then for every diagonal {\pm 1} matrix {D} there is some {x \neq 0} with {DAx =-x}, that is {(A+D)x=0} (using {D^{-1}=D}), and hence {\det (A+D)=0} for all {2^n} choices of signs in {D}.

Expanding the determinant along the first row, and assuming inductively that the {(n-1) \times (n-1)} submatrix {A_{11}} (obtained by removing the first row and column) satisfies {\det (A_{11}+D_{11}) \neq 0} for some choice of {D_{11}}, we have

\displaystyle \det (A+D) = (a_{11}+ d_{11}) \det (A_{11} +D_{11}) + F(d_{22}, d_{33}, \cdots, d_{nn})

where {F} does not involve {d_{11}}. If this vanished for both choices {d_{11}=1} and {d_{11}=-1}, subtracting the two equations would give {2\det (A_{11}+D_{11})=0}, a contradiction since the characteristic is not {2}.

So in this method we create a skew-symmetric matrix (for {n=3} this is just a choice of three rational numbers), use the Cayley formula to generate an orthogonal matrix, and, if needed, multiply by a diagonal matrix with {\pm 1} entries.
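Continuing the previous sketch (and reusing F, mat_mul, mat_inv and cayley from it), here is a hypothetical helper that searches for a {\pm 1} diagonal matrix {D} such that {DA} has no eigenvalue {-1}, then applies the inverse Cayley map; everything below is my own illustration of the idea, not code from the post.

```python
from itertools import product

def det(M):
    """Determinant over Q by Gaussian elimination."""
    n = len(M)
    A = [[F(x) for x in row] for row in M]
    d = F(1)
    for col in range(n):
        piv = next((r for r in range(col, n) if A[r][col] != 0), None)
        if piv is None:
            return F(0)
        if piv != col:
            A[col], A[piv] = A[piv], A[col]
            d = -d
        d *= A[col][col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return d

def inverse_cayley(A):
    """X = (I + A)^{-1}(A - I); needs -1 not an eigenvalue of A."""
    n = len(A)
    IpA = [[F(i == j) + A[i][j] for j in range(n)] for i in range(n)]
    AmI = [[A[i][j] - F(i == j) for j in range(n)] for i in range(n)]
    return mat_mul(mat_inv(IpA), AmI)

def fix_signs(A):
    """Find a +-1 diagonal D (returned as a tuple of signs) with
    det(I + D A) != 0, i.e. D A has no eigenvalue -1."""
    n = len(A)
    for signs in product((1, -1), repeat=n):
        DA = [[signs[i] * A[i][j] for j in range(n)] for i in range(n)]
        M = [[F(i == j) + DA[i][j] for j in range(n)] for i in range(n)]
        if det(M) != 0:
            return signs, DA

# A is orthogonal (a 180-degree rotation) and has eigenvalue -1:
A = [[F(0), F(1), F(0)], [F(1), F(0), F(0)], [F(0), F(0), F(-1)]]
signs, DA = fix_signs(A)                 # here signs == (1, -1, -1)
X = inverse_cayley(DA)                   # rational and skew-symmetric
assert all(X[i][j] == -X[j][i] for i in range(3) for j in range(3))
D = [[F(signs[i] * (i == j)) for j in range(3)] for i in range(3)]
assert mat_mul(D, cayley(X)) == A        # recover A as D * cayley(X)
```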

Remark:

In the first method we have two parameters for the initial vector, one parameter to choose the second vector, and then a choice of {\pm} sign for the third vector.

But in this second method with skew-symmetric matrices, we set all three parameters at the beginning, when defining the skew-symmetric matrix.

Explicit formula for {n=3}: We could make either method explicit, but it is easier to use the relation between quaternions and rotations to get the formula

\displaystyle \frac{1}{a^{2}+b^{2}+c^{2}+1}\left[\begin{array}{ccc} a^{2}+b^{2}-c^{2}-1 & 2 b c-2 a & 2 a c+2 b \\ 2 a +2 b c & a^{2}-b^{2}+c^{2}-1 & 2 c -2 a b \\ 2 b -2 a c & 2 a b+2 c & a^{2}-b^{2}-c^{2}+1 \end{array}\right]

The formula corresponds to the conjugation action {z\rightarrow qzq^{-1}} on pure quaternions by the element {q=a+bi+cj+k}; that is, we have normalized {d=1} in {q=a+bi+cj+dk}, which we may do since {q} and {\lambda q} give the same rotation.
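A short check of this formula (my own sketch, exact over the rationals; the function name rotation is just a label):

```python
from fractions import Fraction as F

def rotation(a, b, c):
    """Rational rotation matrix from the quaternion q = a + b i + c j + k."""
    N = a*a + b*b + c*c + 1
    M = [[a*a + b*b - c*c - 1, 2*b*c - 2*a,          2*a*c + 2*b],
         [2*a + 2*b*c,         a*a - b*b + c*c - 1,  2*c - 2*a*b],
         [2*b - 2*a*c,         2*a*b + 2*c,          a*a - b*b - c*c + 1]]
    return [[x / N for x in row] for row in M]

R = rotation(F(1, 2), F(2, 3), F(3, 4))
I = [[F(i == j) for j in range(3)] for i in range(3)]
assert [[sum(R[k][i] * R[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)] == I          # R^T R = I, exactly over Q
```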
