Orthogonal Matrix and Orthogonal Projection Matrix

In this article, I cover orthogonal transformations and orthogonal matrices in detail. After that, I present the properties of the transpose and how they interact with orthogonality. Towards the end, I examine the orthogonal projection matrix and provide many examples and exercises.

Orthogonal Transformations and Orthogonal Matrices

A linear transformation $T$ from $\mathbb{R}^n$ to $\mathbb{R}^n$ is called an orthogonal transformation if it preserves the length of vectors: $\left|\left|T(x)\right|\right| = \left|\left|x\right|\right|$ for all $x\in \mathbb{R}^n.$ If $T(x)=Ax$ is an orthogonal transformation, we say $A$ is an orthogonal matrix.
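To make the definition concrete, here is a minimal NumPy sketch (my own illustration; the rotation angle and test vector are arbitrary choices) checking that a rotation matrix preserves length:

```python
import numpy as np

# A 2x2 rotation matrix defines an orthogonal transformation,
# so it preserves the length of every vector.
theta = 0.7
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x = np.array([3.0, -1.0])
print(np.linalg.norm(A @ x))   # same as the next line, up to rounding
print(np.linalg.norm(x))
```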

Lemma. (Orthogonal Transformation) Let $T$ be an orthogonal transformation from $\mathbb{R}^n$ to $\mathbb{R}^n.$ If $v, w \in \mathbb{R}^n$ are orthogonal, then $T(v), T(w) \in \mathbb{R}^n$ are orthogonal.

Proof. We want to show $T(v), T(w)$ are orthogonal, and by the Pythagorean theorem, it suffices to show $$ \left|\left| T(v)+T(w)\right|\right|^2= \left|\left| T(v)\right|\right| ^2 + \left|\left|T(w)\right|\right|^2. $$ This equality follows from \begin{align*} \left|\left| T(v)+T(w)\right|\right|^2 & = \left|\left|T(v+w)\right|\right|^2 =\left|\left|v+w\right|\right|^2 \\ & =\left|\left|v\right|\right|^2+\left|\left| w\right|\right|^2 =\left|\left| T(v)\right|\right|^2 + \left|\left| T(w)\right|\right|^2, \end{align*} where the steps use, in order, that $T$ is linear, that $T$ preserves length, that $v$ and $w$ are orthogonal (the Pythagorean theorem), and that $T$ preserves length again.
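The lemma can also be observed numerically; in this sketch (again with an arbitrary rotation matrix and my own choice of test vectors), two orthogonal vectors remain orthogonal after the transformation:

```python
import numpy as np

# An orthogonal transformation sends orthogonal vectors
# to orthogonal vectors.
theta = 1.2
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v = np.array([2.0, 3.0])
w = np.array([-3.0, 2.0])        # v and w are orthogonal: v . w = 0

print(v @ w)                     # 0.0
print((A @ v) @ (A @ w))         # 0.0 up to rounding error
```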

Theorem. A linear transformation $T$ from $\mathbb{R}^n$ to $\mathbb{R}^n$ is orthogonal if and only if the vectors $T(e_1), \ldots, T(e_n)$ form an orthonormal basis.

Proof. If $T$ is an orthogonal transformation, then by definition, the $T(e_i)$ are unit vectors, and also, by Orthogonal Transformation, they are orthogonal. Therefore, $T(e_1),\ldots, T(e_n)$ form an orthonormal basis. Conversely, suppose $T(e_1), \ldots, T(e_n)$ form an orthonormal basis. Consider a vector $x=x_1 e_1+\cdots +x_n e_n.$ Then \begin{align*} \left|\left|T(x)\right|\right|^2 &=\left|\left|T(x_1 e_1+\cdots + x_n e_n)\right|\right|^2 =\left|\left|x_1 T(e_1)+\cdots + x_n T(e_n)\right|\right|^2 \\ &=\left|\left|x_1T(e_1)\right|\right|^2+\cdots + \left|\left|x_nT(e_n)\right|\right|^2 = x_1^2+\cdots + x_n^2 =\left|\left| x\right|\right|^2, \end{align*} where the third equality is the Pythagorean theorem (the vectors $x_iT(e_i)$ are mutually orthogonal) and the fourth uses that each $T(e_i)$ is a unit vector. Taking the square root of both sides shows that $T$ preserves lengths, and therefore $T$ is an orthogonal transformation.

Corollary. An $n \times n$ matrix $A$ is orthogonal if and only if its columns form an orthonormal basis.

Proof. The proof is left for the reader.
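Although the proof is left as an exercise, the statement is easy to sanity-check numerically. One sketch, using NumPy's QR factorization to produce an orthogonal matrix:

```python
import numpy as np

# The Q factor of a QR decomposition is an orthogonal matrix, so its
# columns should form an orthonormal basis and multiplication by Q
# should preserve lengths.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))

print(np.allclose(Q.T @ Q, np.eye(3)))   # True: columns are orthonormal
x = rng.standard_normal(3)
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # True
```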

The transpose $A^T$ of an $n\times n$ matrix $A$ is the $n\times n$ matrix whose $ij$-th entry is the $ji$-th entry of $A.$ We say that a square matrix $A$ is symmetric if $A^T=A$, and $A$ is called skew-symmetric if $A^T=-A.$

Theorem. (Orthogonal and Transpose Properties)

(1) The product of two orthogonal $n\times n$ matrices is orthogonal.

(2) The inverse of an orthogonal matrix is orthogonal.

(3) If the products $(A B)^T$ and $B^T A^T$ are defined then they are equal.

(4) If $A$ is invertible then so is $A^T$, and $(A^T)^{-1}=(A^{-1})^T.$

(5) For any matrix $A$, $\text{rank}\,(A) = \text{rank} \,(A^T).$

(6) If $v$ and $w$ are two column vectors in $\mathbb{R}^n$, then $v \cdot w = v^T w.$

(7) The $n \times n$ matrix $A$ is orthogonal if and only if $A^{-1}=A^T.$

Proof. The proof of each part follows.

  • Suppose $A$ and $B$ are orthogonal matrices, then $AB$ is an orthogonal matrix since $T(x)=AB x$ preserves length because $$ \left|\left|T(x)\right|\right| = \left|\left|AB x\right|\right| = \left|\left|A(B x)\right|\right| = \left|\left|B x\right|\right| = \left|\left|x\right|\right|. $$
  • Suppose $A$ is an orthogonal matrix, then $A^{-1}$ is an orthogonal matrix since $T(x)=A^{-1} x$ preserves length: $\left|\left| A^{-1}x \right|\right| = \left|\left| A(A^{-1}x) \right|\right| = \left|\left| x \right|\right|,$ where the first equality holds because $A$ preserves length.
  • We will compare entries in the matrices $(AB)^T$ and $B^T A^T$ as follows: $$ \begin{array}{rl} ij \text{-th entry of }(AB)^T &= ji \text{-th entry of }AB\\ & = (j \text{-th row of } A) \cdot (i \text{-th column of } B),\\ ij \text{-th entry of }B^TA^T &=(i \text{-th row of } B^T) \cdot (j \text{-th column of } A^T)\\ & = (i \text{-th column of } B) \cdot (j \text{-th row of } A)\\ & = (j \text{-th row of } A) \cdot (i \text{-th column of } B). \end{array} $$ Therefore, the $ij$-th entry of $(AB)^T$ is the same as the $ij$-th entry of $B^T A^T.$
  • Suppose $A$ is invertible, then $A A^{-1}=I_n.$ Taking the transpose of both sides and applying (3) yields $(A A^{-1})^T=(A^{-1})^T A^T=I_n.$ Thus $A^T$ is invertible, and since inverses are unique, it follows that $(A^T)^{-1}=(A^{-1})^T.$
  • Exercise.
  • If $v=\begin{bmatrix}a_1\\ \vdots \\ a_n\end{bmatrix}$ and $w=\begin{bmatrix}b_1 \\ \vdots \\ b_n\end{bmatrix}$, then $$ v \cdot w=\begin{bmatrix}a_1 \\ \vdots \\ a_n \end{bmatrix} \cdot \begin{bmatrix} b_1\\ \vdots\\ b_n\end{bmatrix} = a_1b_1+\cdots +a_n b_n =\begin{bmatrix} a_1 & \cdots & a_n\end{bmatrix} \begin{bmatrix} b_1 \\ \vdots\\ b_n\end{bmatrix} =\begin{bmatrix}a_1 \\ \vdots \\ a_n\end{bmatrix}^T w=v^T w. $$
  • Let’s write $A$ in terms of its columns: $A=\begin{bmatrix}v_1 & \cdots & v_n \end{bmatrix}.$ Then \begin{equation*} A^T A= \begin{bmatrix} v_1^T \\ \vdots \\ v_n^T \end{bmatrix} \begin{bmatrix} v_1 & \cdots & v_n \end{bmatrix} =\begin{bmatrix}v_1 \cdot v_1 & \cdots & v_1 \cdot v_n \\ \vdots & \ddots & \vdots \\ v_n \cdot v_1 & \cdots & v_n \cdot v_n\end{bmatrix}. \end{equation*} Now $A$ is orthogonal if and only if its columns are orthonormal, and by the computation above this holds if and only if $A^TA=I_n$. Therefore, $A$ is orthogonal if and only if $A^{-1}=A^T.$ (A numerical spot-check of these properties appears after this list.)
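Here is the promised spot-check: a short NumPy sketch exercising properties (1)-(7) on random test matrices (an illustration, not a proof):

```python
import numpy as np

# Random test matrices for the transpose properties.
rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

print(np.allclose((A @ B).T, B.T @ A.T))                       # property (3)
print(np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T))     # property (4)
print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T))  # property (5)

v, w = rng.standard_normal(3), rng.standard_normal(3)
print(np.isclose(np.dot(v, w), v.T @ w))                       # property (6)

Q1, _ = np.linalg.qr(A)   # orthogonal matrices for (1), (2), (7)
Q2, _ = np.linalg.qr(B)
P = Q1 @ Q2
print(np.allclose(P.T @ P, np.eye(3)))               # (1): product is orthogonal
Q1inv = np.linalg.inv(Q1)
print(np.allclose(Q1inv.T @ Q1inv, np.eye(3)))       # (2): inverse is orthogonal
print(np.allclose(Q1inv, Q1.T))                      # (7): inverse equals transpose
```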

Theorem. (Orthogonal Projection Matrix)

Let $V$ be a subspace of $\mathbb{R}^n$ with orthonormal basis $u_1, \ldots, u_m.$ The matrix of the orthogonal projection onto $V$ is $Q Q^T$ where $Q= \begin{bmatrix} u_1 & \cdots & u_m \end{bmatrix}.$

Let $V$ be a subspace of $\mathbb{R}^n$ with basis $v_1,\ldots,v_m$ and let $A=\begin{bmatrix}v_1 & \cdots & v_m \end{bmatrix}$; then the orthogonal projection matrix onto $V$ is $A(A^T A)^{-1}A^T.$

Proof. The proof of each part follows.

  • Since $u_1, \ldots, u_m$ is an orthonormal basis of $V$, we can, by Orthogonal Projection, write \begin{align*} \text{proj}_V (x) & =(u_1 \cdot x) u_1 + \cdots + (u_m \cdot x) u_m =u_1 u_1^T x + \cdots +u_m u_m^T x \\ &=(u_1 u_1^T + \cdots +u_m u_m^T) x = \begin{bmatrix} u_1 & \cdots & u_m \end{bmatrix} \begin{bmatrix} u_1^T \\ \vdots \\ u_m^T \end{bmatrix} x =QQ^Tx. \end{align*}
  • Since $v_1,\ldots,v_m$ form a basis of $V$, there exist unique scalars $c_1,\ldots,c_m$ such that $\text{proj}_V(x)=c_1 v_1+\cdots +c_m v_m.$ Since $A=\begin{bmatrix}v_1 & \cdots & v_m \end{bmatrix}$, we can write $\text{proj}_V(x)=A c.$ Consider the system $A^TAc =A^T x$ where $A^TA$ is the coefficient matrix and $c$ is the unknown. Since $c$ is the coordinate vector of $\text{proj}_V(x)$ with respect to the basis $(v_1,\ldots,v_m)$, the system has a unique solution for every $x$. Thus, $A^TA$ must be invertible, and so we can solve for $c$, namely $c=(A^T A)^{-1}A^Tx.$ Therefore, $\text{proj}_V(x)=A c =A (A^T A)^{-1}A^T x$ as desired. Notice it suffices to consider the system $A^TAc =A^T x$, or equivalently $A^T(x-A c)=0$, because $$ A^T(x -A c)=A^T(x-c_1 v_1-\cdots - c_m v_m) $$ is the vector whose $i$-th component is $$ (v_i)^T(x-c_1 v_1-\cdots -c_m v_m)=v_i\cdot(x-c_1v_1-\cdots -c_m v_m), $$ which we know to be zero since $x-\text{proj}_V(x)$ is orthogonal to $V.$ (A numerical comparison of the two formulas appears after this proof.)
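And here is the promised comparison: a NumPy sketch checking, on a sample plane in $\mathbb{R}^3$ of my own choosing, that the two formulas produce the same matrix and that the result is symmetric and idempotent:

```python
import numpy as np

# Columns of A form a (non-orthonormal) basis of a plane V in R^3.
A = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])

P = A @ np.linalg.inv(A.T @ A) @ A.T        # A (A^T A)^{-1} A^T

Q, _ = np.linalg.qr(A)                      # orthonormal basis of the same V
print(np.allclose(P, Q @ Q.T))              # both formulas give the same matrix
print(np.allclose(P, P.T), np.allclose(P, P @ P))  # symmetric and idempotent
```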

Example. Is there an orthogonal transformation $T$ from $\mathbb{R}^3$ to $\mathbb{R}^3$ such that $$ T\begin{bmatrix} 2\\ 3\\ 0\end{bmatrix} =\begin{bmatrix} 3\\ 0\\ 2\end{bmatrix} \qquad \text{and} \qquad T\begin{bmatrix}-3\\ 2\\ 0\end{bmatrix} = \begin{bmatrix} 2\\ -3\\ 0\end{bmatrix}? $$ No, since the vectors $\begin{bmatrix}2\\ 3\\ 0\end{bmatrix}$ and $\begin{bmatrix}-3\\ 2\\ 0\end{bmatrix}$ are orthogonal, whereas $\begin{bmatrix}3\\ 0\\ 2\end{bmatrix}$ and $\begin{bmatrix}2\\ -3\\ 0\end{bmatrix}$ are not, by Orthogonal Transformation.

Example. Find an orthogonal transformation $T$ from $\mathbb{R}^3$ to $\mathbb{R}^3$ such that $$ T\begin{bmatrix}2/3\\ 2/3\\ 1/3\end{bmatrix} = \begin{bmatrix}0 \\ 0\\ 1\end{bmatrix}. $$ Let’s think about the inverse of $T$ first. The inverse of $T$, if it exists, must satisfy $T^{-1}(e_3) = \begin{bmatrix}2/3\\ 2/3\\ 1/3\end{bmatrix} = v_3.$ Furthermore, the vectors $v_1, v_2, v_3$ must form an orthonormal basis of $\mathbb{R}^3$, where $T^{-1}(x)=\begin{bmatrix}v_1 & v_2 & v_3\end{bmatrix} x.$ We require a vector $v_1$ with $v_1\cdot v_3=0$ and $\left|\left|v_1 \right|\right| =1.$ By inspection, we find $v_1=\begin{bmatrix} -2/3\\ 1/3\\ 2/3\end{bmatrix}.$ Then $$
v_2=v_1\times v_3 =\begin{bmatrix} -2/3\\ 1/3\\ 2/3\end{bmatrix} \times \begin{bmatrix} 2/3\\ 2/3\\ 1/3\end{bmatrix} =\begin{bmatrix} 1/9-4/9\\ -(-2/9-4/9)\\ -4/9-2/9 \end{bmatrix} = \begin{bmatrix} -1/3\\ 2/3\\ -2/3\end{bmatrix} $$ does the job since $$ \left|\left| v_1 \right| \right| = \left|\left| v_2 \right| \right| = \left|\left| v_3 \right| \right| =1 $$ and $$ v_1\cdot v_2=v_1\cdot v_3=v_2\cdot v_3=0. $$ In summary, $$ T^{-1}(x)=\begin{bmatrix}-2/3 & -1/3 & 2/3 \\ 1/3 & 2/3 & 2/3 \\ 2/3 & -2/3 & 1/3\end{bmatrix}x. $$ By Orthogonal and Transpose Properties, the matrix of $T^{-1}$ is orthogonal, and the matrix of $T=(T^{-1})^{-1}$ is the transpose of the matrix of $T^{-1}.$ Therefore, it suffices to use $$ T(x)=\begin{bmatrix}-2/3 & -1/3 & 2/3 \\ 1/3 & 2/3 & 2/3 \\ 2/3 & -2/3 & 1/3\end{bmatrix}^Tx=\begin{bmatrix}-2/3 & 1/3 & 2/3 \\ -1/3 & 2/3 & -2/3 \\ 2/3 & 2/3 & 1/3 \end{bmatrix} x. $$
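As a quick check, the following sketch verifies numerically that the matrix found above is orthogonal and sends $v_3$ to $e_3$:

```python
import numpy as np

# The matrix of T found in the example above.
T = np.array([[-2/3,  1/3,  2/3],
              [-1/3,  2/3, -2/3],
              [ 2/3,  2/3,  1/3]])

print(np.allclose(T.T @ T, np.eye(3)))                        # orthogonal
print(np.allclose(T @ np.array([2/3, 2/3, 1/3]), [0, 0, 1]))  # T(v_3) = e_3
```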

Example. Show that a matrix with orthogonal columns need not be an orthogonal matrix. For example, the columns of $A=\begin{bmatrix}4 & -3 \\ 3 & 4 \end{bmatrix}$ are orthogonal, yet $A$ is not an orthogonal matrix: $T(x)=Ax$ does not preserve length, since for $x=\begin{bmatrix}-3\\ 4\end{bmatrix}$ we have $\left|\left| x \right|\right| = 5$ while $\left|\left| Ax \right|\right| = \left|\left| \begin{bmatrix}-24\\ 7\end{bmatrix} \right|\right| = 25.$ The problem is that the columns of $A$, although orthogonal, have length $5$ rather than $1$.
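The failure is easy to see numerically; in this sketch, the columns of $A$ are orthogonal, yet lengths are scaled by a factor of $5$:

```python
import numpy as np

# Orthogonal columns of length 5: A scales lengths instead of
# preserving them.
A = np.array([[4.0, -3.0],
              [3.0,  4.0]])
x = np.array([-3.0, 4.0])

print(A[:, 0] @ A[:, 1])       # 0.0: the columns are orthogonal
print(np.linalg.norm(x))       # 5.0
print(np.linalg.norm(A @ x))   # 25.0: length is not preserved
```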

Example. Find all orthogonal $2\times 2$ matrices. Write $A=\begin{bmatrix}v_1 & v_2\end{bmatrix}.$ The unit vector $v_1$ can be expressed as $v_1=\begin{bmatrix}\cos \theta\\ \sin \theta\end{bmatrix}$, for some $\theta.$ Then $v_2$ will be one of the two unit vectors orthogonal to $v_1$, namely $v_2=\begin{bmatrix}-\sin \theta \\ \cos \theta\end{bmatrix}$ or $v_2=\begin{bmatrix} \sin \theta\\ -\cos \theta\end{bmatrix}.$ Therefore, an orthogonal $2\times 2$ matrix is either of the form $$ A=\begin{bmatrix}\cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{bmatrix}\hspace{1cm} \text{or} \hspace{1cm} A=\begin{bmatrix}\cos \theta & \sin \theta \\ \sin \theta & -\cos \theta \end{bmatrix}, $$ representing a rotation or a reflection, respectively.
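A short numerical check of this classification (at one arbitrary angle): both forms are orthogonal, and the determinant distinguishes the rotation ($+1$) from the reflection ($-1$):

```python
import numpy as np

# One sample from each family of 2x2 orthogonal matrices.
theta = 0.9
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
reflection = np.array([[np.cos(theta),  np.sin(theta)],
                       [np.sin(theta), -np.cos(theta)]])

for M in (rotation, reflection):
    # orthogonality check, then determinant (+1 rotation, -1 reflection)
    print(np.allclose(M.T @ M, np.eye(2)), round(np.linalg.det(M)))
```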

Example. Given $n\times n$ matrices $A$ and $B$, which of the following must be symmetric?

  • $B B^T$
  • $A^T B^TB A$
  • $B(A+A^T)B^T$

The solution to each part follows.

  • By Orthogonal and Transpose Properties, $B B^T$ is symmetric because $$(B B^T)^T=(B^T)^TB^T=B B^T. $$
  • By Orthogonal and Transpose Properties, $A^T B^TB A$ is symmetric because $$ (A^TB^TBA)^T=A^TB^T(B^T)^T(A^T)^T=A^TB^TBA. $$
  • By Orthogonal and Transpose Properties, $B(A+A^T)B^T$ is symmetric because $$ (B(A+A^T)B^T)^T=((A+A^T)B^T)^TB^T=B(A+A^T)^TB^T $$ $$ =B(A^T+A)^TB^T=B((A^T)^T+A^T)B^T=B(A+A^T)B^T. $$
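All three parts can be spot-checked with random matrices, as in this short sketch (an illustration, not a proof):

```python
import numpy as np

# Random test matrices for the three expressions above.
rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

def is_symmetric(M):
    return np.allclose(M, M.T)

print(is_symmetric(B @ B.T))               # part 1
print(is_symmetric(A.T @ B.T @ B @ A))     # part 2
print(is_symmetric(B @ (A + A.T) @ B.T))   # part 3
```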

Example. If the $n\times n$ matrices $A$ and $B$ are symmetric, which of the following must be symmetric as well?

  • $2I_n+3A-4 A^2$,
  • $A B^2 A.$

The solution to each part follows.

  • First note that $(A^2)^T=(A^T)^2=A^2$ for a symmetric matrix $A.$ Now we can use the linearity of the transpose, $$ (2I_n+3A-4 A^2)^T=2I_n^T+3A^T-4 (A^2)^T=2I_n+3A-4 A^2 $$ showing that the matrix $2I_n+3A-4 A^2$ is symmetric.
  • The matrix $A B^2 A$ is symmetric since, $$ (AB^2A)^T=(ABBA)^T=(BA)^T(AB)^T=A^TB^TB^TA^T=AB^2A. $$
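Again, a quick random spot-check (the symmetric test matrices are built as $M+M^T$, which is symmetric by construction):

```python
import numpy as np

# Build random symmetric matrices A and B.
rng = np.random.default_rng(4)
M, N = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
A, B = M + M.T, N + N.T

expr1 = 2 * np.eye(4) + 3 * A - 4 * (A @ A)   # 2I + 3A - 4A^2
expr2 = A @ B @ B @ A                         # A B^2 A
print(np.allclose(expr1, expr1.T))   # True
print(np.allclose(expr2, expr2.T))   # True
```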

Example. Use Orthogonal Projection Matrix to find the matrix $A$ of the orthogonal projection onto $$ W=\mathop{span} \left(\begin{bmatrix} 1\\ 1\\ 1\\ 1\end{bmatrix}, \begin{bmatrix} 1\\ 9\\ -5\\ 3\end{bmatrix}\right). $$ Then find the matrix of the orthogonal projection onto the subspace of $\mathbb{R}^4$ spanned by the vectors $\begin{bmatrix}1\\ 1\\ 1\\ 1\end{bmatrix}$ and $\begin{bmatrix}1\\ 2\\ 3\\ 4\end{bmatrix}.$

First, we apply the Gram-Schmidt Process to $W=\mathop{span}(v_1, v_2)$ to find that the vectors $$ u_1=\frac{v_1}{\left|\left| v_1 \right|\right| } =\begin{bmatrix}1/2 \\ 1/2 \\ 1/2 \\ 1/2\end{bmatrix}, \qquad u_2 =\frac{v_2^\perp}{\left|\left| v_2^\perp \right|\right| } =\frac{v_2-\left(u_1 \cdot v_2\right) u_1}{\left|\left| v_2-\left(u_1 \cdot v_2\right)u_1 \right|\right| } =\begin{bmatrix}-1/10 \\ 7/10 \\ -7/10 \\ 1/10\end{bmatrix} $$ form an orthonormal basis of $W.$ By Orthogonal Projection Matrix, the matrix of the projection onto $W$ is $A=Q Q^T$ where $Q=\begin{bmatrix}u_1 & u_2\end{bmatrix}.$ Therefore, the matrix of the orthogonal projection onto $W$ is $$ A= \begin{bmatrix} 1/2 & -1/10 \\ 1/2 & 7/10 \\ 1/2 & -7/10 \\ 1/2 & 1/10 \end{bmatrix} \begin{bmatrix} 1/2 & 1/2 & 1/2 & 1/2 \\ -1/10 & 7/10 & -7/10 & 1/10 \end{bmatrix} =\frac{1}{100} \begin{bmatrix} 26 & 18 & 32 & 24 \\ 18 & 74 & -24 & 32 \\ 32 & -24 & 74 &18 \\ 24 & 32 & 18 & 26 \end{bmatrix}. $$ For the second subspace, let $B=\begin{bmatrix}1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 4 \end{bmatrix}$; then the orthogonal projection matrix is $$ B(B^TB)^{-1}B^T =\frac{1}{10}\begin{bmatrix}7 & 4 & 1 & -2 \\ 4 & 3 & 2 & 1 \\ 1 & 2 & 3 & 4 \\ -2 & 1 & 4 & 7 \end{bmatrix}. $$
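Finally, this sketch recomputes both projection matrices from the example and confirms the fractions above:

```python
import numpy as np

# First subspace: Gram-Schmidt on v1, v2, then P = Q Q^T.
v1 = np.array([1.0, 1.0, 1.0, 1.0])
v2 = np.array([1.0, 9.0, -5.0, 3.0])

u1 = v1 / np.linalg.norm(v1)
v2_perp = v2 - (u1 @ v2) * u1            # Gram-Schmidt step
u2 = v2_perp / np.linalg.norm(v2_perp)
Q = np.column_stack([u1, u2])

P1 = Q @ Q.T
print(np.allclose(100 * P1, [[26, 18, 32, 24],
                             [18, 74, -24, 32],
                             [32, -24, 74, 18],
                             [24, 32, 18, 26]]))   # True

# Second subspace: P = B (B^T B)^{-1} B^T.
B = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
P2 = B @ np.linalg.inv(B.T @ B) @ B.T
print(np.allclose(10 * P2, [[7, 4, 1, -2],
                            [4, 3, 2, 1],
                            [1, 2, 3, 4],
                            [-2, 1, 4, 7]]))       # True
```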

