Consider a scalar-valued function of a single variable, such as the exponential function or the sine function.
Can we extend the definition of that function in such a way that it takes square matrices as arguments and returns square matrices of the same dimension as outputs?
This lecture deals with the problem of defining such extensions in a useful manner.
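Let $F$ be a field of scalars, such as the set of real numbers $\mathbb{R}$ or the set of complex numbers $\mathbb{C}$.
We will consider functions $f:F\rightarrow F$ and the problem of extending them in such a way that $f(A)$ is a well-defined $K\times K$ output matrix, for any $K\times K$ input matrix $A$.
The simplest way to extend $f$ is to apply it element-wise, by defining
$$[f(A)]_{kl}=f(A_{kl})$$
for $k=1,\ldots,K$ and $l=1,\ldots,K$, where $[f(A)]_{kl}$ is the $(k,l)$-th entry of $f(A)$ and $A_{kl}$ is the $(k,l)$-th entry of $A$.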
This extension, which has numerous applications, is straightforward and we do not need to discuss it further.
The element-wise extension has the problem that it is not compatible with the way in which we have defined matrix polynomials.
Remember that an ordinary polynomial is a function $p:F\rightarrow F$ defined as
$$p(z)=a_0+a_1z+a_2z^2+\ldots+a_mz^m$$
where the coefficients $a_0,a_1,\ldots,a_m$ also belong to the field of scalars $F$.
The extension of $p$ to square matrices, which is called a matrix polynomial, is defined as
$$p(A)=a_0I+a_1A+a_2A^2+\ldots+a_mA^m$$
where $A$ is a $K\times K$ matrix and $I$ is the $K\times K$ identity matrix.
In the definition of a matrix polynomial, the powers $A^j$ are matrix powers obtained by repeatedly multiplying $A$ by itself.
As we know, multiplying two matrices is not the same as taking the element-wise products of their entries.
Therefore, the above definition of an element-wise matrix function is not consistent with the definition of a matrix polynomial.
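The inconsistency can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrix `A` and the polynomial $p(z)=1+2z+z^2$ are hypothetical choices made only for illustration.

```python
import numpy as np

# Hypothetical example: A and p(z) = 1 + 2z + z^2 are illustrative choices.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
I = np.eye(2)

# Matrix polynomial: powers are matrix powers, the constant term multiplies I.
p_matrix = 1.0 * I + 2.0 * A + A @ A

# Element-wise extension: p is applied to each entry separately.
p_elementwise = 1.0 + 2.0 * A + A * A

print(p_matrix)
print(p_elementwise)
print(np.allclose(p_matrix, p_elementwise))
```

With these choices the matrix polynomial evaluates to the matrix with rows (10, 14) and (21, 31), while the element-wise extension gives rows (4, 9) and (16, 25), so the two extensions disagree.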
We are now going to discuss some properties that should be satisfied by a sound definition of matrix function.
We have already discussed consistency with the definition of a matrix polynomial.
In the next sections we will discuss some properties that are desirable when:
$f$ is an analytic function;
$A$ is diagonalizable;
$A$ is a Jordan block;
$A$ is a matrix in Jordan form.
Finally, we will provide a formal definition that satisfies all these desirable properties.
Note: since the next sections only serve as motivation for the definition of a matrix function, the discussion therein is not completely rigorous (e.g., possible issues about the convergence of infinite series are not analyzed rigorously).
Many interesting and useful functions are analytic, that is, they coincide with their (convergent) Taylor series:
$$f(z)=\sum_{j=0}^{\infty}\frac{f^{(j)}(z_0)}{j!}(z-z_0)^j$$
where:
the equality holds for all points $z$ in a neighborhood of a point $z_0$;
$f^{(j)}(z_0)$ is the $j$-th derivative of $f$ at $z_0$;
$j!$ is the factorial of $j$.
Roughly speaking, a Taylor series is an infinite polynomial. Therefore, to be consistent with the definition of a matrix polynomial, the definition of a matrix function should be such that
$$f(A)=\sum_{j=0}^{\infty}\frac{f^{(j)}(z_0)}{j!}(A-z_0I)^j$$
whenever $f$ is analytic.
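Suppose that $f$ is analytic in a neighborhood of $z_0=0$, so that the Taylor expansion above becomes
$$f(z)=\sum_{j=0}^{\infty}\frac{f^{(j)}(0)}{j!}z^j$$
Further assume that $A$ is diagonalizable, that is,
$$A=PDP^{-1}$$
where $P$ is an invertible matrix and $D$ is a diagonal matrix whose diagonal entries are equal to the eigenvalues of $A$, denoted by $\lambda_1,\ldots,\lambda_K$.
Then, we have
$$f(A)=\sum_{j=0}^{\infty}\frac{f^{(j)}(0)}{j!}A^j=\sum_{j=0}^{\infty}\frac{f^{(j)}(0)}{j!}\left(PDP^{-1}\right)^j=\sum_{j=0}^{\infty}\frac{f^{(j)}(0)}{j!}PD^jP^{-1}=P\left(\sum_{j=0}^{\infty}\frac{f^{(j)}(0)}{j!}D^j\right)P^{-1}=P\begin{bmatrix}f(\lambda_1) & & \\ & \ddots & \\ & & f(\lambda_K)\end{bmatrix}P^{-1}$$
provided that the eigenvalues of $A$ are included in the neighborhood over which $f$ is analytic.
This derivation shows another property that a definition of matrix function should have: if $A$ has the diagonalization $A=PDP^{-1}$, then
$$f(A)=Pf(D)P^{-1}$$
where $f(D)$ is obtained by applying $f$ to each diagonal entry of $D$.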
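Remember that a matrix is said to be a Jordan block with eigenvalue $\lambda$ if and only if all its diagonal entries are equal to $\lambda$, all its supradiagonal entries are equal to $1$, and all the remaining entries are equal to $0$.
As in Higham (2008), we focus on $3\times 3$ Jordan blocks, which have the form
$$J=\begin{bmatrix}\lambda & 1 & 0\\ 0 & \lambda & 1\\ 0 & 0 & \lambda\end{bmatrix}$$
We can also write
$$J=\lambda I+N$$
where
$$N=\begin{bmatrix}0 & 1 & 0\\ 0 & 0 & 1\\ 0 & 0 & 0\end{bmatrix}$$
is a nilpotent matrix.
Raising $N$ to integer powers moves the diagonal of $1$s towards the upper-right corner:
$$N^2=\begin{bmatrix}0 & 0 & 1\\ 0 & 0 & 0\\ 0 & 0 & 0\end{bmatrix}$$
and $N^j=0$ for $j\geq 3$.
When $f$ is analytic in a neighborhood of $\lambda$, its Taylor expansion can be written as:
$$f(z)=\sum_{j=0}^{\infty}\frac{f^{(j)}(\lambda)}{j!}(z-\lambda)^j$$
Then,
$$f(J)=\sum_{j=0}^{\infty}\frac{f^{(j)}(\lambda)}{j!}(J-\lambda I)^j=\sum_{j=0}^{\infty}\frac{f^{(j)}(\lambda)}{j!}N^j=f(\lambda)I+f^{(1)}(\lambda)N+\frac{f^{(2)}(\lambda)}{2!}N^2=\begin{bmatrix}f(\lambda) & f^{(1)}(\lambda) & \frac{f^{(2)}(\lambda)}{2!}\\ 0 & f(\lambda) & f^{(1)}(\lambda)\\ 0 & 0 & f(\lambda)\end{bmatrix}$$
By using the same technique, we can show that, for larger Jordan blocks $J$, $f(J)$ has a similar structure.
For example, if $J$ is $4\times 4$, then
$$f(J)=\begin{bmatrix}f(\lambda) & f^{(1)}(\lambda) & \frac{f^{(2)}(\lambda)}{2!} & \frac{f^{(3)}(\lambda)}{3!}\\ 0 & f(\lambda) & f^{(1)}(\lambda) & \frac{f^{(2)}(\lambda)}{2!}\\ 0 & 0 & f(\lambda) & f^{(1)}(\lambda)\\ 0 & 0 & 0 & f(\lambda)\end{bmatrix}$$
This kind of structure for $f(J)$ is another property that our definition of a matrix function should satisfy.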
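Remember that a matrix $J$ is in Jordan form if it is block-diagonal and all its diagonal blocks are Jordan blocks:
$$J=\begin{bmatrix}J_1 & 0 & \cdots & 0\\ 0 & J_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & J_r\end{bmatrix}$$
where $J_1,\ldots,J_r$ are Jordan blocks.
Then, if $f$ is analytic in a sufficiently large neighborhood of a point $z_0$, we have
$$f(J)=\sum_{j=0}^{\infty}\frac{f^{(j)}(z_0)}{j!}(J-z_0I)^j=\begin{bmatrix}f(J_1) & 0 & \cdots & 0\\ 0 & f(J_2) & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & f(J_r)\end{bmatrix}$$
where $f(J_1),\ldots,f(J_r)$ can be computed as in the previous section.
Finally, any square matrix $A$ is similar to a matrix in Jordan form:
$$A=PJP^{-1}$$
where $P$ is an invertible matrix.
Therefore,
$$f(A)=Pf(J)P^{-1}$$
where $f(J)$ has been derived above.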
After this long motivating discussion, we are ready to provide the standard definition of a matrix function.
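Definition Let $A$ be a $K\times K$ matrix having distinct eigenvalues $\lambda_1,\ldots,\lambda_m$. Let
$$A=PJP^{-1}$$
be a Jordan decomposition of $A$, where $J$ is a matrix in Jordan form, with Jordan blocks $J_1,\ldots,J_r$. Denote by $d_i$ the dimension of the largest Jordan block having eigenvalue $\lambda_i$. Let $f:F\rightarrow F$ be a scalar function. Suppose that
$$f^{(j)}(\lambda_i)$$
exists for any $i=1,\ldots,m$ and any $j=0,1,\ldots,d_i-1$. Then, the matrix function $f(A)$ is defined as
$$f(A)=P\begin{bmatrix}f(J_1) & 0 & \cdots & 0\\ 0 & f(J_2) & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & f(J_r)\end{bmatrix}P^{-1}$$
where, for a Jordan block $J_s$ having dimension $d$ and eigenvalue $\lambda$, $f(J_s)$ is defined as
$$f(J_s)=\begin{bmatrix}f(\lambda) & f^{(1)}(\lambda) & \frac{f^{(2)}(\lambda)}{2!} & \cdots & \frac{f^{(d-1)}(\lambda)}{(d-1)!}\\ 0 & f(\lambda) & f^{(1)}(\lambda) & \ddots & \vdots\\ \vdots & \ddots & \ddots & \ddots & \frac{f^{(2)}(\lambda)}{2!}\\ \vdots & & \ddots & f(\lambda) & f^{(1)}(\lambda)\\ 0 & \cdots & \cdots & 0 & f(\lambda)\end{bmatrix}$$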
Since the Jordan decomposition of a matrix is not unique, this definition makes sense only as long as $f(A)$ does not depend on which decomposition we pick. A proof that it does not can be found, for example, in Meyer (2000).
Clearly, this definition satisfies the properties that we have outlined in the motivating discussion about Jordan blocks and matrices in Jordan form.
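Moreover, if $A$ is diagonalizable and its eigenvalues are $\lambda_1,\ldots,\lambda_K$, then $A=PJP^{-1}$, $J$ is diagonal, the Jordan blocks have dimension $1\times 1$, and
$$f(A)=P\begin{bmatrix}f(\lambda_1) & & \\ & \ddots & \\ & & f(\lambda_K)\end{bmatrix}P^{-1}$$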
This is the property previously outlined in the section about diagonalizable matrices.
The last thing to note is that $f$ is not required to be analytic in the definition above. However, it is possible to prove the following property, which completes the list of desirable properties put forward in our motivating discussion.
Proposition Suppose that $f$ is analytic in a neighborhood of a point $z_0$, and that the neighborhood includes all the eigenvalues of $A$. Let $f(A)$ be as in the standard definition above. Then,
$$f(A)=\sum_{j=0}^{\infty}\frac{f^{(j)}(z_0)}{j!}(A-z_0I)^j$$
The definition of a matrix function we have just provided has several useful applications. One of them is to solve linear systems of differential equations.
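Let $x(t)$ be a $K\times 1$ vector whose entries are functions of time $t$.
The $j$-th derivative of $x(t)$ with respect to $t$ is denoted by
$$x^{(j)}(t)$$
Suppose that $x(t)$ satisfies the linear system of differential equations
$$x^{(1)}(t)=Ax(t)$$
with initial condition $x(0)=x_0$, where $A$ is a $K\times K$ matrix and $x_0$ is a $K\times 1$ vector.
By repeatedly differentiating both sides of the equation with respect to $t$, we obtain
$$x^{(j)}(t)=A^jx(t)$$
Thus, an analytic solution of the system of differential equations can be worked out as follows:
$$x(t)=\sum_{j=0}^{\infty}\frac{x^{(j)}(0)}{j!}t^j=\sum_{j=0}^{\infty}\frac{A^jx(0)}{j!}t^j=\left(\sum_{j=0}^{\infty}\frac{(tA)^j}{j!}\right)x_0$$
By using the theory developed above, we can re-write the solution as
$$x(t)=f(tA)x_0$$
where the scalar function $f$ is easily recognized to be the exponential function:
$$f(z)=\sum_{j=0}^{\infty}\frac{z^j}{j!}=\exp(z)$$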
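If $A$ is diagonalizable as $A=PDP^{-1}$ and the two eigenvalues on the diagonal of $D$ are $\lambda_1$ and $\lambda_2$, then
$$\exp(tA)=P\begin{bmatrix}e^{t\lambda_1} & 0\\ 0 & e^{t\lambda_2}\end{bmatrix}P^{-1}$$
and the solution of the system of differential equations is
$$x(t)=P\begin{bmatrix}e^{t\lambda_1} & 0\\ 0 & e^{t\lambda_2}\end{bmatrix}P^{-1}x_0$$
If $A$ is not diagonalizable, it has the Jordan decomposition
$$A=P\begin{bmatrix}\lambda & 1\\ 0 & \lambda\end{bmatrix}P^{-1}$$
Then,
$$\exp(tA)=P\begin{bmatrix}e^{t\lambda} & te^{t\lambda}\\ 0 & e^{t\lambda}\end{bmatrix}P^{-1}$$
and the solution of the system of differential equations is
$$x(t)=P\begin{bmatrix}e^{t\lambda} & te^{t\lambda}\\ 0 & e^{t\lambda}\end{bmatrix}P^{-1}x_0$$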
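The non-diagonalizable case can be sanity-checked with a short numerical experiment. The sketch below is an illustration under stated assumptions: the eigenvalue `lam`, the matrix `P`, the initial condition `x0`, the time `t` and the step `h` are all hypothetical choices. It evaluates the closed-form solution and verifies, by a central finite difference, that its derivative is approximately $Ax(t)$.

```python
import numpy as np

# Hypothetical 2x2 non-diagonalizable example A = P J P^{-1}.
lam = -0.5
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])
P_inv = np.linalg.inv(P)
J = np.array([[lam, 1.0],
              [0.0, lam]])
A = P @ J @ P_inv
x0 = np.array([1.0, 2.0])

def x(t):
    """Closed-form solution x(t) = P exp(tJ) P^{-1} x0."""
    exp_tJ = np.exp(t * lam) * np.array([[1.0, t],
                                         [0.0, 1.0]])
    return P @ exp_tJ @ P_inv @ x0

# Central finite difference as an approximation of the time derivative.
t, h = 1.3, 1e-6
derivative = (x(t + h) - x(t - h)) / (2 * h)

print(np.allclose(derivative, A @ x(t), atol=1e-6))  # differential equation
print(np.allclose(x(0.0), x0))                       # initial condition
```

The tolerance `atol=1e-6` is deliberately loose, because the finite difference is only an approximation of the derivative.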
Below you can find some exercises with explained solutions.
Let
Compute by using the definition of a matrix function.
The matrix is already in Jordan form. It has two Jordan blocks. The scalar function is extended to the matrix as follows:
Let
Find the solution of the system of differential equations
$$x^{(1)}(t)=Ax(t)$$
with initial condition
Since is already in Jordan form, the solution is
Golub, G. H., Van Loan, C. F. (2013) Matrix Computations, Johns Hopkins University Press.
Higham, N. J. (2008) Functions of Matrices: Theory and Computation, SIAM.
Meyer, C. D. (2000) Matrix Analysis and Applied Linear Algebra, SIAM.
Please cite as:
Taboga, Marco (2021). "Matrix function", Lectures on matrix algebra. https://www.statlect.com/matrix-algebra/matrix-function.