Consider a scalar-valued function of a single variable, such as the exponential function or the sine function.
Can we extend the definition of that function in such a way that it takes square matrices as arguments and returns square matrices of the same dimension as outputs?
This lecture deals with the problem of defining such extensions in a useful manner.
Let $F$ be a field of scalars, such as the set of real numbers $\mathbb{R}$ or the set of complex numbers $\mathbb{C}$.
We will consider functions $f: F \to F$ and the problem of extending them in such a way that $f(A)$ is a well-defined $K \times K$ output matrix for any $K \times K$ input matrix $A$.
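The simplest way to extend $f$ is to apply it element-wise, by defining
$$[f(A)]_{kl} = f(A_{kl})$$
for $k = 1, \ldots, K$ and $l = 1, \ldots, K$, where $[f(A)]_{kl}$ is the $(k,l)$-th entry of $f(A)$ and $A_{kl}$ is the $(k,l)$-th entry of $A$.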
This extension, which has numerous applications, is straightforward and we do not need to discuss it further.
The element-wise extension has the problem that it is not compatible with the way in which we have defined matrix polynomials.
Remember that an ordinary polynomial is a function $p: F \to F$ defined as
$$p(z) = a_0 + a_1 z + a_2 z^2 + \ldots + a_n z^n$$
where the coefficients $a_0, a_1, \ldots, a_n$ also belong to the field of scalars $F$.
The extension of $p$ to square matrices, which is called a matrix polynomial, is defined as
$$p(A) = a_0 I + a_1 A + a_2 A^2 + \ldots + a_n A^n$$
where $A$ is a $K \times K$ matrix and $I$ is the $K \times K$ identity matrix.
In the definition of a matrix polynomial, the powers $A^j$ are matrix powers obtained by repeatedly multiplying $A$ by itself.
As we know, multiplying two matrices is not the same as taking the element-wise products of their entries.
Therefore, the above definition of an element-wise matrix function is not consistent with the definition of a matrix polynomial.
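The inconsistency can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `poly_matrix` and `poly_elementwise`, the matrix and the polynomial $p(z) = 1 + z^2$ are illustrative choices, not part of the lecture.

```python
import numpy as np

def poly_matrix(coeffs, A):
    """Evaluate p(A) = a_0 I + a_1 A + ... with true matrix powers (Horner's scheme)."""
    K = A.shape[0]
    result = np.zeros((K, K))
    for a in reversed(coeffs):
        result = result @ A + a * np.eye(K)
    return result

def poly_elementwise(coeffs, A):
    """Apply the scalar polynomial p separately to every entry of A."""
    return sum(a * A**j for j, a in enumerate(coeffs))

# Illustrative data (an assumption made for this example only)
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
coeffs = [1.0, 0.0, 1.0]  # p(z) = 1 + z^2

p_matrix = poly_matrix(coeffs, A)        # I + A @ A
p_entries = poly_elementwise(coeffs, A)  # 1 + A_kl^2, entry by entry
print(p_matrix)
print(p_entries)
```

The two results differ because `A @ A` mixes rows and columns, while `A**2` squares each entry in isolation.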
We are now going to discuss some properties that should be satisfied by a sound definition of matrix function.
We have already discussed consistency with the definition of a matrix polynomial.
In the next sections we will discuss some properties that are desirable when:
$f$ is an analytic function;
$A$ is diagonalizable;
$A$ is a Jordan block;
$A$ is a matrix in Jordan form.
Finally, we will provide a formal definition that satisfies all these desirable properties.
Note: since the next sections only serve as motivation for the definition of a matrix function, the discussion therein is not completely rigorous (e.g., possible issues about the convergence of infinite series are not analyzed rigorously).
Many interesting and useful functions $f$ are analytic, that is, they coincide with their (convergent) Taylor series:
$$f(z) = \sum_{j=0}^{\infty} \frac{f^{(j)}(z_0)}{j!} (z - z_0)^j$$
where:
the equality holds for all points $z$ in a neighborhood of a point $z_0$;
$f^{(j)}(z_0)$ is the $j$-th derivative of $f$ at $z_0$;
$j!$ is the factorial of $j$.
Roughly speaking, a Taylor series is an infinite polynomial. Therefore, to be consistent with the definition of a matrix polynomial, the definition of a matrix function should be such that
$$f(A) = \sum_{j=0}^{\infty} \frac{f^{(j)}(z_0)}{j!} (A - z_0 I)^j$$
whenever $f$ is analytic.
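Suppose that $f$ is analytic in a neighborhood of $z_0 = 0$, so that the Taylor expansion above becomes
$$f(z) = \sum_{j=0}^{\infty} \frac{f^{(j)}(0)}{j!} z^j$$
Further assume that $A$ is diagonalizable, that is,
$$A = P D P^{-1}$$
where $P$ is an invertible matrix and $D$ is a diagonal matrix whose diagonal entries are equal to the eigenvalues of $A$, denoted by $\lambda_1, \ldots, \lambda_K$.
Then, we have
$$\begin{aligned}
f(A) &= \sum_{j=0}^{\infty} \frac{f^{(j)}(0)}{j!} \left( P D P^{-1} \right)^j \\
&= \sum_{j=0}^{\infty} \frac{f^{(j)}(0)}{j!} P D^j P^{-1} \\
&= P \left( \sum_{j=0}^{\infty} \frac{f^{(j)}(0)}{j!} D^j \right) P^{-1} \\
&= P \begin{bmatrix} f(\lambda_1) & & \\ & \ddots & \\ & & f(\lambda_K) \end{bmatrix} P^{-1}
\end{aligned}$$
provided that the eigenvalues of $A$ are included in the neighborhood over which $f$ is analytic.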
This derivation shows another property that a definition of matrix function should have: if $A$ has the diagonalization
$$A = P D P^{-1}$$
then
$$f(A) = P f(D) P^{-1}$$
where $f(D)$ is obtained by applying $f$ to each diagonal entry of $D$.
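The diagonalization property translates directly into a few lines of code. This is a minimal sketch, assuming NumPy and a matrix that is diagonalizable with a well-conditioned eigenvector matrix; the name `funm_diag` and the example matrix are hypothetical. The square root is used as $f$ because the result is easy to verify: squaring it should give back $A$.

```python
import numpy as np

def funm_diag(f, A):
    """Compute P f(D) P^{-1} from a numerical eigendecomposition A = P D P^{-1}."""
    eigvals, P = np.linalg.eig(A)
    return P @ np.diag(f(eigvals)) @ np.linalg.inv(P)

# Illustrative symmetric matrix with eigenvalues 1 and 3
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

S = funm_diag(np.sqrt, A)
print(S @ S)  # should reproduce A up to rounding error
```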
Remember that a $K \times K$ matrix $J$ is said to be a Jordan block with eigenvalue $\lambda$ if and only if all its diagonal entries are equal to $\lambda$, all its supradiagonal entries are equal to $1$, and all the remaining entries are equal to $0$.
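As in Higham (2008), we focus on $3 \times 3$ Jordan blocks, which have the form
$$J = \begin{bmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \\ 0 & 0 & \lambda \end{bmatrix}$$
We can also write
$$J = \lambda I + N$$
where
$$N = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$
is a nilpotent matrix.
Raising $N$ to integer powers moves the diagonal of $1$s towards the upper-right corner:
$$N^2 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$
and $N^j = 0$ for $j \geq 3$.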
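The behavior of the powers of the nilpotent part can be verified in a couple of lines. A small sketch, assuming NumPy and the $3 \times 3$ case; the variable names are illustrative.

```python
import numpy as np

# 3 x 3 matrix with ones on the first superdiagonal and zeros elsewhere
N = np.diag(np.ones(2), k=1)

N2 = N @ N   # the ones move to the second superdiagonal (top-right corner)
N3 = N2 @ N  # nothing is left: the zero matrix

print(N2)
print(N3)
```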
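When $f$ is analytic in a neighborhood of $\lambda$, its Taylor expansion can be written as:
$$f(z) = \sum_{j=0}^{\infty} \frac{f^{(j)}(\lambda)}{j!} (z - \lambda)^j$$
Then,
$$\begin{aligned}
f(J) &= \sum_{j=0}^{\infty} \frac{f^{(j)}(\lambda)}{j!} (J - \lambda I)^j = \sum_{j=0}^{\infty} \frac{f^{(j)}(\lambda)}{j!} N^j \\
&= f(\lambda) I + f^{(1)}(\lambda) N + \frac{f^{(2)}(\lambda)}{2!} N^2 \\
&= \begin{bmatrix} f(\lambda) & f^{(1)}(\lambda) & \frac{f^{(2)}(\lambda)}{2!} \\ 0 & f(\lambda) & f^{(1)}(\lambda) \\ 0 & 0 & f(\lambda) \end{bmatrix}
\end{aligned}$$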
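By using the same technique, we can show that, for larger Jordan blocks $J$, $f(J)$ has a similar structure.
For example, if $J$ is $4 \times 4$, then
$$f(J) = \begin{bmatrix} f(\lambda) & f^{(1)}(\lambda) & \frac{f^{(2)}(\lambda)}{2!} & \frac{f^{(3)}(\lambda)}{3!} \\ 0 & f(\lambda) & f^{(1)}(\lambda) & \frac{f^{(2)}(\lambda)}{2!} \\ 0 & 0 & f(\lambda) & f^{(1)}(\lambda) \\ 0 & 0 & 0 & f(\lambda) \end{bmatrix}$$
This kind of structure for $f(J)$ is another property that our definition of a matrix function should satisfy.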
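Remember that a matrix $J$ is in Jordan form if it is block-diagonal and all its diagonal blocks are Jordan blocks:
$$J = \begin{bmatrix} J_1 & 0 & \cdots & 0 \\ 0 & J_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & J_s \end{bmatrix}$$
where $J_1, \ldots, J_s$ are Jordan blocks.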
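Then, if $f$ is analytic in a sufficiently large neighborhood of a point $z_0$, we have
$$f(J) = \sum_{j=0}^{\infty} \frac{f^{(j)}(z_0)}{j!} (J - z_0 I)^j = \begin{bmatrix} f(J_1) & 0 & \cdots & 0 \\ 0 & f(J_2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & f(J_s) \end{bmatrix}$$
where $f(J_1), \ldots, f(J_s)$ can be computed as in the previous section.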
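Finally, any matrix $A$ is similar to a matrix $J$ in Jordan form:
$$A = P J P^{-1}$$
where $P$ is an invertible matrix.
Therefore,
$$f(A) = \sum_{j=0}^{\infty} \frac{f^{(j)}(z_0)}{j!} (A - z_0 I)^j = P \left( \sum_{j=0}^{\infty} \frac{f^{(j)}(z_0)}{j!} (J - z_0 I)^j \right) P^{-1} = P f(J) P^{-1}$$
where $f(J)$ has been derived above.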
After this long motivating discussion, we are ready to provide the standard definition of a matrix function.
Definition
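Let $A$ be a $K \times K$ matrix having $m$ distinct eigenvalues $\lambda_1, \ldots, \lambda_m$.
Let
$$A = P J P^{-1}$$
be a Jordan decomposition of $A$, where $J$ is a matrix in Jordan form, with Jordan blocks $J_1, \ldots, J_s$.
Denote by $d_i$ the dimension of the largest Jordan block having eigenvalue $\lambda_i$.
Let $f$ be a scalar function. Suppose that $f^{(j)}(\lambda_i)$ exists for any $i = 1, \ldots, m$ and any $j = 0, \ldots, d_i - 1$.
Then, the matrix function $f(A)$ is defined as
$$f(A) = P \begin{bmatrix} f(J_1) & 0 & \cdots & 0 \\ 0 & f(J_2) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & f(J_s) \end{bmatrix} P^{-1}$$
where, for a Jordan block $J_k$ having dimension $d$ and eigenvalue $\lambda$, $f(J_k)$ is defined as
$$f(J_k) = \begin{bmatrix} f(\lambda) & f^{(1)}(\lambda) & \frac{f^{(2)}(\lambda)}{2!} & \cdots & \frac{f^{(d-1)}(\lambda)}{(d-1)!} \\ 0 & f(\lambda) & f^{(1)}(\lambda) & \cdots & \frac{f^{(d-2)}(\lambda)}{(d-2)!} \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & f(\lambda) & f^{(1)}(\lambda) \\ 0 & 0 & \cdots & 0 & f(\lambda) \end{bmatrix}$$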
Since the Jordan decomposition of a matrix is not unique, this definition makes sense only as long as $f(A)$ does not depend on which decomposition we pick. A proof that it does not can be found, for example, in Meyer (2000).
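To make the definition concrete, here is a hedged sketch that applies it to a non-diagonalizable matrix whose Jordan decomposition is known by construction. It assumes NumPy; the helper names `f_jordan_block` and `block_diagonal`, the matrix $P$ and the Jordan blocks are all invented for this example. The function is $f(z) = \sqrt{z}$, so the result can be checked by squaring it.

```python
import math
import numpy as np

def f_jordan_block(derivs):
    """Upper-triangular matrix with f^(j)(lam) / j! on the j-th superdiagonal."""
    d = len(derivs)
    F = np.zeros((d, d))
    for j, value in enumerate(derivs):
        F += (value / math.factorial(j)) * np.eye(d, k=j)
    return F

def block_diagonal(blocks):
    """Place square blocks along the diagonal of a zero matrix."""
    size = sum(b.shape[0] for b in blocks)
    out = np.zeros((size, size))
    start = 0
    for b in blocks:
        d = b.shape[0]
        out[start:start + d, start:start + d] = b
        start += d
    return out

# Hypothetical Jordan decomposition A = P J P^{-1}, chosen for illustration:
# one 2 x 2 block with eigenvalue 4 and one 1 x 1 block with eigenvalue 9
J = block_diagonal([np.array([[4.0, 1.0], [0.0, 4.0]]), np.array([[9.0]])])
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
P_inv = np.linalg.inv(P)
A = P @ J @ P_inv

# f(z) = sqrt(z): f(4) = 2, f'(4) = 1 / (2 * sqrt(4)) = 0.25, f(9) = 3
f_J = block_diagonal([f_jordan_block([2.0, 0.25]), f_jordan_block([3.0])])
f_A = P @ f_J @ P_inv

print(f_A @ f_A - A)  # should be numerically zero
```

In practice the Jordan decomposition is numerically unstable to compute, so this construction is best read as an illustration of the definition rather than as a computational recipe.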
Clearly, this definition satisfies the properties that we have outlined in the motivating discussion about Jordan blocks and matrices in Jordan form.
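Moreover, if $A$ is diagonalizable and its eigenvalues (not necessarily distinct) are $\lambda_1, \ldots, \lambda_K$, then there are $K$ Jordan blocks, $J$ is diagonal, the Jordan blocks have dimension $1 \times 1$, and
$$f(A) = P \begin{bmatrix} f(\lambda_1) & & \\ & \ddots & \\ & & f(\lambda_K) \end{bmatrix} P^{-1}$$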
This is the property previously outlined in the section about diagonalizable matrices.
The last thing to note is that $f$ is not required to be analytic in the definition above. However, it is possible to prove the following property, which completes the list of desirable properties put forward in our motivating discussion.
Proposition
Suppose that $f$ is analytic in a neighborhood of a point $z_0$, and that the neighborhood includes all the eigenvalues of $A$.
Let $f(A)$ be as in the standard definition above.
Then,
$$f(A) = \sum_{j=0}^{\infty} \frac{f^{(j)}(z_0)}{j!} (A - z_0 I)^j$$
The definition of a matrix function we have just provided has several useful applications. One of them is to solve linear systems of differential equations.
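Let $y(t)$ be a $2 \times 1$ vector whose entries are functions of time $t$.
The $j$-th derivative of $y(t)$ with respect to $t$ is denoted by
$$y^{(j)}(t) = \frac{d^j y(t)}{dt^j}$$
Suppose that $y(t)$ satisfies the linear system of differential equations
$$y^{(1)}(t) = A y(t)$$
with initial condition $y(0) = y_0$, where $A$ is a $2 \times 2$ matrix and $y_0$ is a $2 \times 1$ vector.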
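By repeatedly differentiating both sides of the equation with respect to $t$, we obtain
$$y^{(j)}(t) = A^j y(t)$$
Thus, an analytic solution of the system of differential equations can be worked out as follows:
$$y(t) = \sum_{j=0}^{\infty} \frac{y^{(j)}(0)}{j!} t^j = \sum_{j=0}^{\infty} \frac{A^j y_0}{j!} t^j = \left( \sum_{j=0}^{\infty} \frac{(tA)^j}{j!} \right) y_0$$
By using the theory developed above, we can re-write the solution as
$$y(t) = f(tA) \, y_0$$
where the scalar function $f$ is easily recognized to be the exponential function:
$$f(z) = \sum_{j=0}^{\infty} \frac{z^j}{j!} = e^z$$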
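If $A$ is diagonalizable as
$$A = P D P^{-1}$$
and the two eigenvalues on the diagonal of $D$ are $\lambda_1$ and $\lambda_2$, then
$$e^{tA} = P \begin{bmatrix} e^{\lambda_1 t} & 0 \\ 0 & e^{\lambda_2 t} \end{bmatrix} P^{-1}$$
and the solution of the system of differential equations is
$$y(t) = P \begin{bmatrix} e^{\lambda_1 t} & 0 \\ 0 & e^{\lambda_2 t} \end{bmatrix} P^{-1} y_0$$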
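The diagonalizable case can be sketched numerically as follows, assuming NumPy. The matrix `A`, the initial condition `y0` and the function name `solution` are illustrative choices; the matrix has eigenvalues $-1$ and $-2$, so the first component of the solution is known to be $2e^{-t} - e^{-2t}$.

```python
import numpy as np

# Illustrative system y'(t) = A y(t), y(0) = y0 (eigenvalues of A: -1 and -2)
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
y0 = np.array([1.0, 0.0])

eigvals, P = np.linalg.eig(A)
P_inv = np.linalg.inv(P)

def solution(t):
    """y(t) = P diag(exp(lam_1 t), exp(lam_2 t)) P^{-1} y0."""
    return P @ np.diag(np.exp(eigvals * t)) @ P_inv @ y0

# Compare the first component with the closed form 2 e^{-t} - e^{-2t} at t = 1
y_at_1 = solution(1.0)
closed_form = 2 * np.exp(-1.0) - np.exp(-2.0)
print(y_at_1[0], closed_form)
```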
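If $A$ is not diagonalizable, it has the Jordan decomposition
$$A = P \begin{bmatrix} \lambda & 1 \\ 0 & \lambda \end{bmatrix} P^{-1}$$
Then, applying the definition to the scalar function $g(z) = e^{tz}$, whose first derivative at $\lambda$ is $t e^{\lambda t}$, we obtain
$$e^{tA} = P \begin{bmatrix} e^{\lambda t} & t e^{\lambda t} \\ 0 & e^{\lambda t} \end{bmatrix} P^{-1}$$
and the solution of the system of differential equations is
$$y(t) = P \begin{bmatrix} e^{\lambda t} & t e^{\lambda t} \\ 0 & e^{\lambda t} \end{bmatrix} P^{-1} y_0$$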
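For the non-diagonalizable case, the following sketch builds a defective matrix from a chosen Jordan decomposition and checks, by central finite differences, that the closed-form solution satisfies the differential equation. It assumes NumPy; `lam`, `P`, `y0`, the step `h` and the evaluation time `t_check` are arbitrary illustrative values.

```python
import numpy as np

# Hypothetical defective matrix A = P J P^{-1} with a single 2 x 2 Jordan block
lam = -1.0
P = np.array([[1.0, 0.0],
              [1.0, 1.0]])
P_inv = np.linalg.inv(P)
J = np.array([[lam, 1.0],
              [0.0, lam]])
A = P @ J @ P_inv
y0 = np.array([1.0, 2.0])

def solution(t):
    """y(t) = P [[e^{lam t}, t e^{lam t}], [0, e^{lam t}]] P^{-1} y0."""
    e = np.exp(lam * t)
    exp_tJ = np.array([[e, t * e],
                       [0.0, e]])
    return P @ exp_tJ @ P_inv @ y0

# Check y'(t) = A y(t) at one point using a central finite difference
h, t_check = 1e-6, 0.8
lhs = (solution(t_check + h) - solution(t_check - h)) / (2 * h)
rhs = A @ solution(t_check)
print(lhs, rhs)
```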
Below you can find some exercises with explained solutions.
Let
Compute
by using the definition of a matrix function.
The matrix
is already in Jordan form. It has two Jordan
blocks
The
scalar function
is
extended to the matrix
as
follows:
Let
Find the solution of the system of differential equations
$$y^{(1)}(t) = A y(t)$$
with initial condition
Since
is already in Jordan form, the solution
is
Golub, G. H., Van Loan, C. F. (2013) Matrix Computations, Johns Hopkins University Press.
Higham, N. J. (2008) Functions of Matrices: Theory and Computation, SIAM.
Meyer, C. D. (2000) Matrix Analysis and Applied Linear Algebra, SIAM.
Please cite as:
Taboga, Marco (2021). "Matrix function", Lectures on matrix algebra. https://www.statlect.com/matrix-algebra/matrix-function.