# Singular value decomposition

The singular value decomposition (SVD) of a matrix allows us to decompose any (not necessarily square) matrix into a product of three terms:

1. a unitary matrix;

2. a matrix having non-negative entries on its main diagonal and zero entries elsewhere;

3. another unitary matrix.

Easily recognizable subsets of the columns of the two unitary matrices involved in the decomposition have the remarkable property of being orthonormal bases for the column space and the null space of the original matrix.

The singular value decomposition is derived by using results about the diagonalization of a positive definite matrix. We therefore recommend revising the lecture on positive definite matrices before reading this one.

## The decomposition

Here is the decomposition.

Proposition Let $A$ be a $K\times L$ matrix. Then, there exist a $K\times K$ unitary matrix $U$ and an $L\times L$ unitary matrix $V$ such that
$$A = U \Sigma V^{*}$$
where $V^{*}$ denotes the conjugate transpose of $V$, and $\Sigma$ is a $K\times L$ matrix such that $\Sigma_{kl} = 0$ if $k \neq l$ and $\Sigma_{kl} \geq 0$ if $k = l$.
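As a quick numerical check of the proposition, the sketch below (assuming NumPy is available; the example matrix is arbitrary) uses `numpy.linalg.svd`, which returns $U$, the vector of singular values, and $V^{*}$, and verifies that both factors are unitary and that the product reconstructs $A$.

```python
import numpy as np

# An arbitrary rectangular (3 x 2) complex matrix: the SVD exists for any shape.
A = np.array([[1.0 + 1.0j, 2.0],
              [0.0,        3.0 - 1.0j],
              [4.0,        0.0]])

# numpy.linalg.svd returns U (3 x 3), the singular values in descending
# order, and the conjugate transpose V* (2 x 2).
U, s, Vh = np.linalg.svd(A)

# Rebuild the 3 x 2 matrix Sigma with the singular values on its main diagonal.
Sigma = np.zeros((3, 2), dtype=complex)
np.fill_diagonal(Sigma, s)

assert np.allclose(U.conj().T @ U, np.eye(3))    # U is unitary
assert np.allclose(Vh @ Vh.conj().T, np.eye(2))  # V is unitary
assert np.allclose(U @ Sigma @ Vh, A)            # A = U Sigma V*
```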

Proof

We start by observing that the matrix
$$A^{*}A$$
is Hermitian (i.e., equal to its conjugate transpose), hence normal and unitarily diagonalizable (i.e., unitarily similar to a diagonal matrix). Moreover, for any $L\times 1$ vector $x$, we have
$$x^{*}A^{*}Ax = (Ax)^{*}(Ax) = \|Ax\|^{2} \geq 0.$$
As a consequence $A^{*}A$ is positive semi-definite, which implies that its eigenvalues are real non-negative numbers. Thus, we can diagonalize $A^{*}A$ as
$$A^{*}A = VDV^{*}$$
where $V$ is an $L\times L$ unitary matrix and $D$ is a diagonal matrix such that its diagonal entries are real non-negative numbers. When we diagonalize a matrix, we can arbitrarily choose the order in which the eigenvalues appear on the diagonal of $D$. Therefore, we can perform the diagonalization in such a manner that
$$D = \begin{bmatrix} D_{1} & 0 \\ 0 & 0 \end{bmatrix}$$
where the block $D_{1}$ is a diagonal matrix having all the strictly positive eigenvalues of $A^{*}A$ on its main diagonal. Suppose that there are $R$ strictly positive eigenvalues, that is, $D_{1}$ is $R\times R$. Partition $V$ into two blocks:
$$V = \begin{bmatrix} V_{1} & V_{2} \end{bmatrix}$$
where $V_{1}$ is $L\times R$. By using the rules for the multiplication of block matrices, we obtain
$$\begin{bmatrix} V_{1}^{*}A^{*}AV_{1} & V_{1}^{*}A^{*}AV_{2} \\ V_{2}^{*}A^{*}AV_{1} & V_{2}^{*}A^{*}AV_{2} \end{bmatrix} = V^{*}A^{*}AV = V^{*}VDV^{*}V = D = \begin{bmatrix} D_{1} & 0 \\ 0 & 0 \end{bmatrix}.$$
Note that
$$V_{1}^{*}A^{*}AV_{1} = D_{1}, \qquad V_{2}^{*}A^{*}AV_{2} = 0$$
and the last equation implies
$$(AV_{2})^{*}(AV_{2}) = 0.$$
As a consequence, the squared norms of the columns of $AV_{2}$ (which are on the main diagonal of the matrix on the left-hand side of the equation) are all zero. Hence, $AV_{2} = 0$ by the positive definiteness of the norm. Define the matrix
$$U_{1} = AV_{1}D_{1}^{-1/2}$$
where $D_{1}^{1/2}$ is a diagonal matrix whose diagonal entries are equal to the square roots of the diagonal entries of $D_{1}$. Note that
$$U_{1}^{*}U_{1} = D_{1}^{-1/2}V_{1}^{*}A^{*}AV_{1}D_{1}^{-1/2} = D_{1}^{-1/2}D_{1}D_{1}^{-1/2} = I$$
so that the columns of $U_{1}$ are orthonormal. Find any $K\times (K-R)$ matrix $U_{2}$ having orthonormal columns and such that
$$U_{2}^{*}U_{1} = 0.$$
In other words, the columns of $U_{2}$ are chosen so as to "complete the orthonormal basis", that is, so as to form an orthonormal basis together with the columns of $U_{1}$. Stated differently, the columns of $U_{2}$ span the orthogonal complement of the span of the columns of $U_{1}$. Define
$$U = \begin{bmatrix} U_{1} & U_{2} \end{bmatrix}.$$
Thus, $U$ is a $K\times K$ unitary matrix. Then,
$$U^{*}AV = \begin{bmatrix} U_{1}^{*}AV_{1} & U_{1}^{*}AV_{2} \\ U_{2}^{*}AV_{1} & U_{2}^{*}AV_{2} \end{bmatrix} = \begin{bmatrix} D_{1}^{1/2} & 0 \\ 0 & 0 \end{bmatrix} = \Sigma$$
where we have used $AV_{1} = U_{1}D_{1}^{1/2}$ and $AV_{2} = 0$. By pre-multiplying both sides by $U$ and post-multiplying both sides by $V^{*}$, we obtain
$$A = U\Sigma V^{*}.$$
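The construction in the proof can be retraced numerically. The sketch below (assuming NumPy is available; the example matrix is chosen for illustration) diagonalizes $A^{*}A$ with `numpy.linalg.eigh`, builds $U_{1} = AV_{1}D_{1}^{-1/2}$, completes the orthonormal basis via a QR factorization, and checks that $A = U\Sigma V^{*}$.

```python
import numpy as np

# Retrace the proof on a small illustrative matrix.
A = np.array([[2.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])             # K = 3, L = 2
K, L = A.shape

# A*A is Hermitian positive semi-definite; eigh returns its real,
# non-negative eigenvalues (in ascending order) and a unitary V.
eigvals, V = np.linalg.eigh(A.conj().T @ A)

# Reorder so the strictly positive eigenvalues (the block D1) come first.
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]
R = int(np.sum(eigvals > 1e-12))       # size of D1

# U1 = A V1 D1^{-1/2} has orthonormal columns, as shown in the proof.
V1 = V[:, :R]
sqrt_D1 = np.sqrt(eigvals[:R])
U1 = (A @ V1) / sqrt_D1                # divide column k by sqrt(lambda_k)

# Complete the orthonormal basis: the trailing columns of a complete QR
# factorization of U1 span the orthogonal complement of col(U1).
Q, _ = np.linalg.qr(U1, mode="complete")
U = np.hstack([U1, Q[:, R:]])

# Sigma has sqrt(D1) in its top-left block and zeros elsewhere.
Sigma = np.zeros((K, L))
Sigma[:R, :R] = np.diag(sqrt_D1)

assert np.allclose(U.conj().T @ U, np.eye(K))  # U is unitary
assert np.allclose(U @ Sigma @ V.conj().T, A)  # A = U Sigma V*
```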

The diagonal entries of $\Sigma$ (i.e., the entries $\sigma_{k} = \Sigma_{kk}$ for $k = 1, \ldots, \min(K, L)$) are called the singular values of $A$.

Example If
$$A = \begin{bmatrix} 5 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 2 \end{bmatrix}$$
then the singular values are $\sigma_{1} = 5$, $\sigma_{2} = 3$ and $\sigma_{3} = 2$: a diagonal matrix with non-negative real entries coincides with its own $\Sigma$.

Example Let
$$A = \begin{bmatrix} 0 & 0 & 2 \\ 0 & -3 & 0 \\ 1 & 0 & 0 \end{bmatrix}.$$
Then, $A^{*}A = \operatorname{diag}(1, 9, 4)$ and the singular values are $\sigma_{1} = 3$, $\sigma_{2} = 2$ and $\sigma_{3} = 1$: sign changes and permutations of the entries do not affect the singular values.

Example If
$$A = \begin{bmatrix} 2i & 0 & 0 \\ 0 & 1+i & 0 \\ 0 & 0 & 0 \end{bmatrix}$$
then $A^{*}A = \operatorname{diag}(4, 2, 0)$ and the singular values are $\sigma_{1} = 2$, $\sigma_{2} = \sqrt{2}$ and $\sigma_{3} = 0$: even for a complex matrix, the singular values are real and non-negative.
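Singular values can also be computed numerically. The sketch below (assuming NumPy is available; the matrices are chosen for illustration) uses `numpy.linalg.svd` with `compute_uv=False`, which returns only the singular values, in descending order.

```python
import numpy as np

# A diagonal matrix with non-negative entries: the singular values are
# just the diagonal entries, sorted in descending order.
A1 = np.diag([5.0, 3.0, 2.0])

# Sign flips and permutations of the entries do not change the singular values.
A2 = np.array([[0.0,  0.0, 2.0],
               [0.0, -3.0, 0.0],
               [1.0,  0.0, 0.0]])

# Complex entries: the singular values are still real and non-negative.
A3 = np.diag([2.0j, 1.0 + 1.0j, 0.0])

s1 = np.linalg.svd(A1, compute_uv=False)
s2 = np.linalg.svd(A2, compute_uv=False)
s3 = np.linalg.svd(A3, compute_uv=False)

assert np.allclose(s1, [5.0, 3.0, 2.0])
assert np.allclose(s2, [3.0, 2.0, 1.0])
assert np.allclose(s3, [2.0, np.sqrt(2.0), 0.0])
```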

## Uniqueness

As shown in the proof above, the singular value decomposition of $A$ is obtained from the diagonalization of $A^{*}A$. But the diagonalization is not unique (as discussed in the lecture on diagonalization). Therefore, the SVD is not unique either.

## Compact SVD

A more compact form of the SVD, called compact SVD, can be obtained as follows.

Proposition Let $A$ be a $K\times L$ matrix. Let $R$ be the rank of $A$. Then, there exist a $K\times R$ matrix $U_{1}$ and an $L\times R$ matrix $V_{1}$, both having orthonormal columns, such that
$$A = U_{1}SV_{1}^{*}$$
where $V_{1}^{*}$ denotes the conjugate transpose of $V_{1}$ and $S$ is an $R\times R$ diagonal matrix with strictly positive entries on its main diagonal.

Proof

In this proof we use all the matrices defined in the above proof of the non-compact SVD. Note that $R$ (i.e., the number of non-zero singular values) is the rank of $\Sigma$ and, as a consequence, also the rank of $A$ (because multiplication by the full-rank matrices $U$ and $V^{*}$ does not change the rank of $\Sigma$). We have
$$A = U\Sigma V^{*} = \begin{bmatrix} U_{1} & U_{2} \end{bmatrix} \begin{bmatrix} D_{1}^{1/2} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} V_{1}^{*} \\ V_{2}^{*} \end{bmatrix} = U_{1}D_{1}^{1/2}V_{1}^{*} = U_{1}SV_{1}^{*}$$
where we have defined $S = D_{1}^{1/2}$.
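NumPy does not return the compact SVD directly, but it can be obtained by truncating a full SVD to the first $R$ singular triplets. A sketch (assuming NumPy is available; the rank-deficient matrix is chosen for illustration):

```python
import numpy as np

# A rank-deficient 3 x 3 matrix: the second row is twice the first,
# so rank(A) = 2 while min(K, L) = 3.
A = np.array([[1.0, 2.0, 1.0],
              [2.0, 4.0, 2.0],
              [0.0, 1.0, 1.0]])

U, s, Vh = np.linalg.svd(A)

# R = number of strictly positive singular values = rank(A).
R = int(np.sum(s > 1e-12))

# Compact SVD: keep the first R columns of U and V (i.e., rows of V*).
U1 = U[:, :R]
S = np.diag(s[:R])
V1h = Vh[:R, :]

assert R == 2
assert np.allclose(U1.conj().T @ U1, np.eye(R))  # orthonormal columns
assert np.allclose(U1 @ S @ V1h, A)              # A = U1 S V1*
```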

## The real case

If the matrix $A$ is real (i.e., all its entries are real numbers), then the SVD is
$$A = U\Sigma V^{\top}$$
where $U$ and $V$ are real orthogonal matrices (we just need to replace conjugate transposition with simple transposition in the proof of the SVD).
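A quick numerical check (assuming NumPy is available; the matrix is arbitrary): for a real input, `numpy.linalg.svd` returns real factors, and orthogonality replaces unitarity.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
U, s, Vt = np.linalg.svd(A)

assert np.isrealobj(U) and np.isrealobj(Vt)  # real factors
assert np.allclose(U.T @ U, np.eye(3))       # U is orthogonal
assert np.allclose(Vt @ Vt.T, np.eye(2))     # V is orthogonal

# Rebuild the 3 x 2 Sigma and reconstruct A with plain transposition.
Sigma = np.zeros((3, 2))
np.fill_diagonal(Sigma, s)
assert np.allclose(U @ Sigma @ Vt, A)        # A = U Sigma V^T
```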

## Column space

Let $A$ be a $K\times L$ matrix and let $\mathbb{C}^{K}$ be the space of all $K\times 1$ column vectors.

Remember that the column space is the span of the columns of $A$:
$$\operatorname{col}(A) = \operatorname{span}(A_{\cdot 1}, \ldots, A_{\cdot L})$$
where $A_{\cdot l}$ denotes the $l$-th column of $A$.

Proposition Let $A$ be a $K\times L$ matrix. Let $R$ be the rank of $A$. Let
$$A = U\Sigma V^{*}$$
be an SVD of $A$ such that the strictly positive singular values of $A$ are the first $R$ entries on the main diagonal of $\Sigma$. Let $U_{1}$ be the block formed by the first $R$ columns of $U$. Then,
$$\operatorname{col}(A) = \operatorname{col}(U_{1})$$
where $\operatorname{col}(A)$ is the span of the columns of $A$ and $\operatorname{col}(U_{1})$ is the span of the columns of $U_{1}$.

Proof

The proof of this proposition is in Solved Exercise 1 below.

Note that $U_{1}$ is the matrix appearing in the compact SVD above.

Since the columns of $U_{1}$ are orthonormal, they are also linearly independent. Thus, the matrix $U_{1}$ in the compact SVD exposes an orthonormal basis for the column space of $A$.
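Numerically, the claim can be checked by projection: $P = U_{1}U_{1}^{*}$ is the orthogonal projector onto $\operatorname{col}(U_{1})$, so $PA = A$ holds exactly when every column of $A$ lies in that span. A sketch (assuming NumPy is available; the rank-2 matrix is chosen for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0],
              [2.0, 4.0, 2.0],   # row 2 = 2 * row 1, so rank(A) = 2
              [0.0, 1.0, 1.0]])

U, s, Vh = np.linalg.svd(A)
R = int(np.sum(s > 1e-12))
U1 = U[:, :R]                    # first R columns of U

# Orthogonal projector onto col(U1).
P = U1 @ U1.conj().T

# Every column of A lies in col(U1) ...
assert np.allclose(P @ A, A)
# ... and both spaces have dimension R, so col(A) = col(U1).
assert np.linalg.matrix_rank(A) == R
```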

## Null space

Remember that the null space is the kernel of the linear transformation defined by $A$:
$$\mathcal{N}(A) = \{x \in \mathbb{C}^{L} : Ax = 0\}.$$

Proposition Let $A$ be a $K\times L$ matrix. Let $R$ be the rank of $A$. Let
$$A = U\Sigma V^{*}$$
be an SVD of $A$ such that the strictly positive singular values of $A$ are the first $R$ entries on the main diagonal of $\Sigma$. Let $V_{2}$ be the block formed by the last $L-R$ columns of $V$. Then,
$$\mathcal{N}(A) = \operatorname{col}(V_{2})$$
where $\mathcal{N}(A)$ is the null space of $A$ and $\operatorname{col}(V_{2})$ is the span of the columns of $V_{2}$.

Proof

The proof of this proposition is in Solved Exercise 2 below.
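The proposition can likewise be verified numerically: $A$ annihilates the last $L-R$ columns of $V$, and their count matches the dimension $L-R$ of the null space given by the rank-nullity theorem. A sketch (assuming NumPy is available; the rank-2 matrix is chosen for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0],
              [2.0, 4.0, 2.0],   # row 2 = 2 * row 1, so rank(A) = 2
              [0.0, 1.0, 1.0]])  # K = L = 3

U, s, Vh = np.linalg.svd(A)
L = A.shape[1]
R = int(np.sum(s > 1e-12))

# V2 = last L - R columns of V (i.e., last rows of V*, transposed back).
V2 = Vh[R:, :].conj().T

# A V2 = 0: every column of V2 is in the null space of A ...
assert np.allclose(A @ V2, np.zeros((3, L - R)))
# ... and there are L - R = dim N(A) of them (rank-nullity theorem).
assert V2.shape[1] == L - R == 1
```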

## Solved exercises

Below you can find some exercises with explained solutions.

### Exercise 1

Prove the above proposition about the column space of $A$.

Solution

Suppose that $s \in \operatorname{col}(A)$. Then, there exists an $L\times 1$ vector $x$ such that
$$s = Ax = U_{1}SV_{1}^{*}x$$
where we have used the compact SVD of $A$. Thus, $s = U_{1}y$ with $y = SV_{1}^{*}x$, and
$$s \in \operatorname{col}(U_{1}) \tag{1}$$
where $\operatorname{col}(U_{1})$ is the span of the columns of $U_{1}$ (i.e., the column space of $U_{1}$). Now, suppose that $s \in \operatorname{col}(U_{1})$. Then, there exists an $R\times 1$ vector $y$ such that
$$s = U_{1}y = AV_{1}S^{-1}y$$
where again we have used the compact SVD of $A$ (post-multiplying $A = U_{1}SV_{1}^{*}$ by $V_{1}S^{-1}$ and using $V_{1}^{*}V_{1} = I$ gives $U_{1} = AV_{1}S^{-1}$). Hence, $s = Ax$ with $x = V_{1}S^{-1}y$, and
$$s \in \operatorname{col}(A). \tag{2}$$
But (1) and (2) imply
$$\operatorname{col}(A) = \operatorname{col}(U_{1}).$$

### Exercise 2

Prove the above proposition about the null space of $A$.

Solution

Suppose that $x \in \mathcal{N}(A)$, that is, $Ax = 0$. The columns of $V_{1}$ and those of $V_{2}$ together form a basis for the space of $L\times 1$ vectors. As a consequence, there exist an $R\times 1$ vector $a$ and an $(L-R)\times 1$ vector $b$ such that
$$x = V_{1}a + V_{2}b.$$
We can write the equation $Ax = 0$ as
$$U_{1}SV_{1}^{*}(V_{1}a + V_{2}b) = 0$$
by using the compact SVD of $A$. By expanding the product, we obtain
$$U_{1}Sa = 0$$
because $V_{1}^{*}V_{1} = I$ and $V_{1}^{*}V_{2} = 0$, as previously demonstrated. Since $U_{1}$ and $S$ are both full-rank, the last equation implies
$$a = 0$$
which in turn implies
$$x = V_{2}b.$$
Thus, $x \in \operatorname{col}(V_{2})$. We have just demonstrated that
$$\mathcal{N}(A) \subseteq \operatorname{col}(V_{2}). \tag{1}$$
Now, suppose that $x \in \operatorname{col}(V_{2})$. Hence, there exists an $(L-R)\times 1$ vector $b$ such that
$$x = V_{2}b.$$
By pre-multiplying both sides of the equation by $A$, we get
$$Ax = AV_{2}b = 0$$
because $AV_{2} = 0$, as previously demonstrated, which implies that $x \in \mathcal{N}(A)$. Thus,
$$\operatorname{col}(V_{2}) \subseteq \mathcal{N}(A). \tag{2}$$
But (1) and (2) imply
$$\mathcal{N}(A) = \operatorname{col}(V_{2}).$$