Section 5.3 Orthogonal Matrices
In this chapter, we explore matrices that have special properties and their applications to quadratic forms. We will see that symmetric matrices have real eigenvalues and orthogonal eigenvectors, and that we can use these properties to simplify quadratic forms.
Before defining an orthogonal matrix and its corresponding transformation, let's recall a few definitions and theorems about symmetric matrices.
The next theorem is an important result about the eigenvalues of symmetric matrices, and it will be used in the next section to show that the eigenvectors of a symmetric matrix are orthogonal.
Theorem 5.3.2.
The eigenvalues of a symmetric matrix are real.
Proof.
Let \(A\) be a real symmetric matrix, and let \(\lambda\) be an eigenvalue of \(A\) with corresponding eigenvector \(\boldsymbol{v}\text{;}\) that is:
\begin{equation}
A \boldsymbol{v} = \lambda \boldsymbol{v}\tag{5.3.1}
\end{equation}
Left multiply both sides of (5.3.1) by \(\overline{\boldsymbol{v}}^{\intercal}\text{,}\) the conjugate transpose of \(\boldsymbol{v}\text{:}\)
\begin{equation*}
\overline{\boldsymbol{v}}^{\intercal} A \boldsymbol{v} = \lambda \overline{\boldsymbol{v}}^{\intercal} \boldsymbol{v}
\end{equation*}
resulting in
\begin{equation}
\lambda = \frac{\overline{\boldsymbol{v}}^{\intercal} A \boldsymbol{v}}{\overline{\boldsymbol{v}}^{\intercal} \boldsymbol{v}}\tag{5.3.2}
\end{equation}
Next, take the transpose of (5.3.1) and recall that \((AB)^{\intercal} = B^{\intercal} A^{\intercal}\)
\begin{equation*}
\begin{aligned} \boldsymbol{v}^{\intercal} A^{\intercal} \amp = \lambda \boldsymbol{v}^{\intercal} \\ \boldsymbol{v}^{\intercal} A \amp = \lambda \boldsymbol{v}^{\intercal} \\ \end{aligned}
\end{equation*}
where the symmetry of \(A\) has been employed. Take the complex conjugate of the above equation:
\begin{equation*}
\begin{aligned} \overline{\boldsymbol{v}^{\intercal} A} \amp = \overline{\lambda \boldsymbol{v}^{\intercal}} \\ \overline{\boldsymbol{v}}^{\intercal} \overline{A} \amp = \overline{\lambda} \overline{\boldsymbol{v}}^{\intercal} \\ \overline{\boldsymbol{v}}^{\intercal} A \amp = \overline{\lambda} \overline{\boldsymbol{v}}^{\intercal} \end{aligned}
\end{equation*}
where \(\overline{A} = A\) has been used because \(A\) is a real matrix. Right multiply this by \(\boldsymbol{v}\text{:}\)
\begin{equation*}
\overline{\boldsymbol{v}}^{\intercal} A \boldsymbol{v} = \overline{\lambda} \overline{\boldsymbol{v}}^{\intercal} \boldsymbol{v}
\end{equation*}
Solving this for \(\overline{\lambda}\) results in
\begin{equation*}
\overline{\lambda} = \frac{\overline{\boldsymbol{v}}^{\intercal} A \boldsymbol{v}}{ \overline{\boldsymbol{v}}^{\intercal} \boldsymbol{v}}
\end{equation*}
which is identical to (5.3.2), showing that \(\lambda = \overline{\lambda}\text{.}\) A number equal to its own complex conjugate is real, so \(\lambda\) is real.
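To see Theorem 5.3.2 numerically, the following sketch (using Python with NumPy, not part of the original text) builds a symmetric matrix and checks that its eigenvalues have zero imaginary part; the random seed and matrix size are illustrative choices.
```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random symmetric matrix: B + B^T is symmetric for any square B.
B = rng.standard_normal((4, 4))
A = B + B.T

# np.linalg.eig makes no symmetry assumption, so in general it may return
# complex eigenvalues; for a symmetric A the imaginary parts vanish.
eigenvalues = np.linalg.eig(A)[0]
print(eigenvalues)
print(np.allclose(eigenvalues.imag, 0.0))   # True
```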
Theorem 5.3.3.
Eigenvectors of a symmetric matrix corresponding to distinct eigenvalues are orthogonal.
Proof.
Let \(\lambda_1\) and \(\lambda_2\) be two distinct eigenvalues of a symmetric matrix \(A\) with corresponding eigenvectors \(\boldsymbol{v}_1\) and \(\boldsymbol{v}_2\text{.}\) Then
\begin{equation*}
A \boldsymbol{v}_1 = \lambda_1 \boldsymbol{v}_1
\end{equation*}
and
\begin{equation*}
A \boldsymbol{v}_2 = \lambda_2 \boldsymbol{v}_2
\end{equation*}
Left multiply the first equation by \(\boldsymbol{v}_2^{\intercal}\) and the second equation by \(\boldsymbol{v}_1^{\intercal}\) to get
\begin{equation*}
\begin{aligned} \boldsymbol{v}_2^{\intercal} A \boldsymbol{v}_1 \amp = \lambda_1 \boldsymbol{v}_2^{\intercal} \boldsymbol{v}_1 \\ \boldsymbol{v}_1^{\intercal} A \boldsymbol{v}_2 \amp = \lambda_2 \boldsymbol{v}_1^{\intercal} \boldsymbol{v}_2 \\ \end{aligned}
\end{equation*}
Since \(\boldsymbol{v}_2^{\intercal} A \boldsymbol{v}_1\) is a scalar and \(A\) is symmetric, \(\boldsymbol{v}_2^{\intercal} A \boldsymbol{v}_1 = (\boldsymbol{v}_2^{\intercal} A \boldsymbol{v}_1)^{\intercal} = \boldsymbol{v}_1^{\intercal} A^{\intercal} \boldsymbol{v}_2 = \boldsymbol{v}_1^{\intercal} A \boldsymbol{v}_2\text{.}\) Subtracting the two equations above therefore gives
\begin{equation*}
(\lambda_1 - \lambda_2) \boldsymbol{v}_2^{\intercal} \boldsymbol{v}_1 = 0
\end{equation*}
Since the eigenvalues are distinct, \(\lambda_1 - \lambda_2 \neq 0\text{,}\) which implies that \(\boldsymbol{v}_2^{\intercal} \boldsymbol{v}_1 = 0\text{.}\)
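As a numerical illustration of Theorem 5.3.3 (a sketch, not part of the text), the snippet below computes the eigenvectors of a symmetric matrix and checks that they are pairwise orthogonal; the particular matrix is an arbitrary example with distinct eigenvalues.
```python
import numpy as np

# An arbitrary symmetric matrix with distinct eigenvalues (chosen for illustration).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])

# eigh is specialized for symmetric matrices; the columns of V are eigenvectors.
eigenvalues, V = np.linalg.eigh(A)

# Dot products of distinct eigenvectors should be (numerically) zero.
for i in range(V.shape[1]):
    for j in range(i + 1, V.shape[1]):
        print(i, j, V[:, i] @ V[:, j])

# Equivalently, V^T V should be the identity matrix.
print(np.allclose(V.T @ V, np.eye(3)))   # True
```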
Definition 5.3.4.
A square matrix \(Q=[\boldsymbol{q}_1\;\; \boldsymbol{q}_2\;\;\boldsymbol{q}_3\;\; \cdots\;\;\boldsymbol{q}_n] \) is said to be orthogonal if
\begin{equation*}
\boldsymbol{q}_i^{\intercal} \boldsymbol{q}_j = 0
\end{equation*}
for all \(i \neq j\text{.}\) If, in addition, \(\boldsymbol{q}_i^{\intercal}\boldsymbol{q}_i = 1\) for all \(i\text{,}\) then the matrix is also said to be orthonormal.
Note that many texts build the unit-length condition into the definition, so that the term orthogonal matrix already means that each column has length 1.
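A small helper like the one below (a sketch; the function names is_orthogonal and is_orthonormal are hypothetical) tests Definition 5.3.4 directly by examining the pairwise dot products of the columns.
```python
import numpy as np

def is_orthogonal(Q, tol=1e-10):
    """Columns are pairwise orthogonal: q_i . q_j = 0 for i != j."""
    G = Q.T @ Q                            # Gram matrix: entry (i, j) is q_i . q_j
    off_diagonal = G - np.diag(np.diag(G))
    return bool(np.all(np.abs(off_diagonal) < tol))

def is_orthonormal(Q, tol=1e-10):
    """Columns are orthogonal and each has length 1, i.e. Q^T Q = I."""
    return np.allclose(Q.T @ Q, np.eye(Q.shape[1]), atol=tol)

Q = np.array([[2.0, 1.0],
              [-1.0, 2.0]])                # orthogonal columns, each of length sqrt(5)
print(is_orthogonal(Q))                    # True
print(is_orthonormal(Q))                   # False
print(is_orthonormal(Q / np.sqrt(5)))      # True after normalizing the columns
```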
Theorem 5.3.5.
If \(Q\) is an orthonormal matrix, then \(Q^{-1} = Q^{\intercal}\text{.}\)
Proof.
This is the same as showing that \(Q^{\intercal} Q = I\text{.}\) The \((i,j)\) entry of \(Q^{\intercal} Q\) is \(\boldsymbol{q}_i^{\intercal} \boldsymbol{q}_j\text{,}\) which equals 1 when \(i = j\) and 0 when \(i \neq j\) by the definition of an orthonormal matrix. Hence \(Q^{\intercal} Q = I\text{,}\) so \(Q^{\intercal}\) is the inverse of \(Q\text{.}\)
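To check Theorem 5.3.5 numerically (an illustrative sketch), we can compare \(Q^{-1}\) computed directly with \(Q^{\intercal}\text{;}\) here the orthonormal matrix comes from a QR factorization of a random matrix, which is one convenient way to produce orthonormal columns.
```python
import numpy as np

rng = np.random.default_rng(1)

# The Q factor of a QR factorization has orthonormal columns.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))

print(np.allclose(Q.T @ Q, np.eye(4)))      # Q^T Q = I
print(np.allclose(np.linalg.inv(Q), Q.T))   # Q^{-1} = Q^T
```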
The last theorem in this section concerns determinants. We show that the determinant of an orthonormal matrix is either \(1\) or \(-1\text{.}\)
Theorem 5.3.6.
If \(Q\) is an orthonormal matrix, then \(\det(Q) = 1\) or \(\det(Q) = -1\text{.}\)
Proof.
Since \(Q^{\intercal} Q = I\) by Theorem 5.3.5,
\begin{equation*}
\det(Q^{\intercal})\det(Q) = \det(Q^{\intercal}Q) = \det(I) = 1
\end{equation*}
Because \(\det(Q^{\intercal}) = \det(Q)\text{,}\) this gives \(\det(Q)^2 = 1\text{.}\) Since the determinant of a real matrix is a real number, \(\det(Q) = 1\) or \(\det(Q) = -1\text{.}\)
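The determinant property of Theorem 5.3.6 can also be verified numerically; this sketch (illustrative only) reuses a QR-generated orthonormal matrix and a standard reflection matrix.
```python
import numpy as np

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # orthonormal matrix

reflection = np.array([[1.0, 0.0],
                       [0.0, -1.0]])                # reflection across the x-axis

print(np.linalg.det(Q))            # approximately 1 or -1
print(np.linalg.det(reflection))   # exactly -1
```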
Subsection 5.3.1 The Transformation as an Orthogonal Matrix
Recall from Section 4.5 that matrix multiplication defines a linear transformation. Transformations by orthonormal matrices are isometries: they preserve the geometry of the objects they act on. That is, if \(Q\) is an orthonormal matrix, then the following are preserved.
- Length
- The length of the vector stays the same. That is, \(||Q\boldsymbol{v}|| = ||\boldsymbol{v}||\text{.}\)
- Angle
- The angle between two vectors \(\boldsymbol{u}\) and \(\boldsymbol{v}\) stays the same. That is,\begin{equation*} \cos \theta = \frac{\langle \boldsymbol{u}, \boldsymbol{v} \rangle}{||\boldsymbol{u}|| ||\boldsymbol{v}||} = \frac{\langle Q\boldsymbol{u}, Q\boldsymbol{v} \rangle}{||Q\boldsymbol{u}|| ||Q\boldsymbol{v}||}\text{.} \end{equation*}
- Volume
- The volume of a shape is preserved. That is, if \(S\) is a shape in \(\mathbb{R}^n\text{,}\) then the volume of \(Q(S)\) is the same as the volume of \(S\text{.}\)
Proof.
The proof of the length property is shown; the remaining properties are left to the reader. Since lengths are nonnegative, it suffices to compute the square of the length of the vector \(Q\boldsymbol{v}\text{:}\)
\begin{equation*}
\begin{aligned}
||Q\boldsymbol{v}||^2 \amp = \langle Q\boldsymbol{v}, Q\boldsymbol{v}\rangle \\
\amp = (Q \boldsymbol{v})^{\intercal} Q \boldsymbol{v} \\
\amp = \boldsymbol{v}^{\intercal} Q^{\intercal} Q \boldsymbol{v} \\
\amp = \boldsymbol{v}^{\intercal} I \boldsymbol{v} \\
\amp = \boldsymbol{v}^{\intercal} \boldsymbol{v}
\end{aligned}
\end{equation*}
where \(Q^{\intercal}Q=I\) from Theorem 5.3.5 has been used. Since \(\boldsymbol{v}^{\intercal} \boldsymbol{v} = ||\boldsymbol{v}||^2\text{,}\) it follows that \(||Q\boldsymbol{v}|| = ||\boldsymbol{v}||\text{.}\)
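The sketch below verifies the three isometry properties numerically for a particular orthonormal matrix; the rotation angle and the vectors are arbitrary choices, and the helper cos_angle is introduced only for this example.
```python
import numpy as np

theta = 0.7                                # an arbitrary rotation angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

u = np.array([1.0, 2.0])
v = np.array([-3.0, 0.5])

# Length is preserved: ||Qv|| = ||v||
print(np.linalg.norm(Q @ v), np.linalg.norm(v))

# Angle is preserved: compare the cosine of the angle before and after.
def cos_angle(x, y):
    return (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

print(cos_angle(u, v), cos_angle(Q @ u, Q @ v))

# Volume (area in R^2) is scaled by |det Q| = 1.
print(abs(np.linalg.det(Q)))
```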
Subsection 5.3.2 Constructing Orthogonal Matrices
Let's look at \(2 \times 2\) orthonormal matrices. A general \(2 \times 2\) matrix has the form:
\begin{equation*}
Q = \begin{bmatrix}
a \amp b \\ c \amp d
\end{bmatrix}
\end{equation*}
Since each column has a norm of 1 and the columns are orthogonal,
\begin{equation*}
\begin{aligned}
a^2 + c^2 \amp = 1 \\ b^2 + d^2 \amp = 1 \\ ab + cd \amp = 0
\end{aligned}
\end{equation*}
Since \(a^2 + c^2 = 1\text{,}\) we can let \(a = \cos \theta\) and \(c = \sin \theta\text{;}\) similarly, let \(b = \sin \alpha\) and \(d = \cos \alpha\text{.}\) The third equation then shows
\begin{align}
\cos \theta \sin \alpha + \sin \theta \cos \alpha \amp = 0 \notag\\
\sin(\theta + \alpha) \amp = 0 \tag{5.3.3}
\end{align}
One solution is \(\alpha = -\theta\text{,}\) which gives the matrix:
\begin{equation*}
Q = \begin{bmatrix}
\cos \theta \amp \sin (-\theta) \\ \sin \theta \amp \cos (-\theta)
\end{bmatrix} = \begin{bmatrix}
\cos \theta \amp -\sin \theta \\ \sin \theta \amp \cos \theta
\end{bmatrix}
\end{equation*}
which is a rotation matrix: it takes a vector in \(\mathbb{R}^2\) and rotates it counter-clockwise by \(\theta\) radians. An alternative solution to (5.3.3) is \(\theta+\alpha = \pi\text{,}\) which results in the matrix:
\begin{equation*}
Q = \begin{bmatrix}
\cos \theta \amp \sin (\pi - \theta) \\ \sin \theta \amp \cos (\pi-\theta)
\end{bmatrix} = \begin{bmatrix}
\cos \theta \amp \sin \theta \\ \sin \theta \amp -\cos \theta
\end{bmatrix}
\end{equation*}
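This second matrix is a reflection (across the line making angle \(\theta/2\) with the positive \(x\)-axis); its determinant is \(-1\text{,}\) whereas the rotation matrix has determinant \(1\text{,}\) consistent with Theorem 5.3.6. The sketch below builds both families for a sample angle and confirms that each satisfies \(Q^{\intercal}Q = I\text{;}\) the angle value and the helper names rotation and reflection are illustrative.
```python
import numpy as np

def rotation(theta):
    """Counter-clockwise rotation by theta radians."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

def reflection(theta):
    """The second family of 2x2 orthonormal matrices derived above."""
    return np.array([[np.cos(theta),  np.sin(theta)],
                     [np.sin(theta), -np.cos(theta)]])

theta = np.pi / 6                                  # sample angle
R, F = rotation(theta), reflection(theta)

print(np.allclose(R.T @ R, np.eye(2)))             # True: R is orthonormal
print(np.allclose(F.T @ F, np.eye(2)))             # True: F is orthonormal
print(np.linalg.det(R), np.linalg.det(F))          # 1.0 and -1.0
```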
