Pseudospectrum

This post proposes to discover the concept of pseudospectrum which is an extension of the usual spectrum in linear algebra. Indeed, eigenvalue analysis shows its limits in the case of non-normal matrices. The pseudospectrum takes place as an interesting tool to perform analysis of non-normal matrices. The present discussion is restricted to the finite-dimensional case.

EIGENVALUES

The concept of eigenvalue is very important in linear algebra, it provides a large number of applications in physics, engineering and many sciences in general. Intuitively, an eigenvector is a non zero vector that can be scaled up without change of direction during a linear transform, the corresponding eigenvalue is then the coefficient of the scaling. More precisely :

Eigenvalue : Let A be a square n-by-n matrix of complex numbers. If there exists a non zero vector $X \in \mathbb{C}^n \neq 0$ satisfying the following characteristic equation

$\displaystyle A X = \lambda X$

for some scalar $\lambda \in \mathbb{C}$, then X is called an eigenvector of A and $\lambda$ is the corresponding eigenvalue.

This equation is in fact equivalent to a homogeneous linear system where $I$ is the identity matrix :

$\displaystyle (A-\lambda I)X = 0$

It has non trivial solutions if and only if the determinant vanishes :

$\displaystyle \det(A-\lambda I) = 0$

The set of eigenvalues of A is called the spectrum of A and is denoted by $\sigma(A)$.

Spectrum : Let A be a square n-by-n matrix of complex numbers. The spectrum of A is by definition

$\displaystyle \sigma(A) = \{ \lambda \in \mathbb{C} : \det(A-\lambda I)=0 \}$

NORMALITY

In practice, eigenvalue analysis is well suited with a particular type of matrices that are called normal. Although the definition of a normal matrix looks like a bit technical, it can be completely characterized by the so-called spectral theorem which gives an orthonormal basis of eigenvectors. In practice, normal matrices behave well whereas non-normal matrices can raise problems. The concept of pseudospectrum was elaborated to deal with these problems.

We need some technical definitions.

Diagonalizability : Let A be a square n-by-n matrix of complex numbers. A is said to be diagonalizable if there exists an invertible matrix P such that $D = P^{-1} A P$ is a diagonal matrix. In this case, D and A have the same eigenvalues.

Normal matrix : Let A be a square matrix of complex numbers. Then, A is said to be normal if $A^* A = A A^*$ where $A^*$ is the conjugate transpose of A.

As we can see, it is easy to check whether a matrix is normal. For instance, hermitian matrices are always normal. The fundamental property of normal matrices is that they are completely characterized by the following theorem.

Spectral theorem : Let A be a square matrix of complex numbers. Then, A is normal if and only if it can be made diagonal by a unitary transform.

This means that there exists a unitary matrix U such that $A = U D U^*$ where D is a diagonal matrix. The word unitary means that $U U^* = U^* U = I$. The entries of the diagonal matrix are then the eigenvalues of A and the columns of U are the corresponding eigenvectors.

PSEUDOSPECTRUM

The key question is then:

What does happen to non-normal matrices ?

Let us consider for instance

$M = \left( \begin{array}{cc} 0 & 1\\ 0 & 0 \end{array} \right)$

The only eigenvalue is zero and thus the matrix is non-normal and even non-diagonalizable. However, the problem of this matrix is that it has too many zeros. If we consider a small perturbation

$M_{\epsilon} = M+E = \left( \begin{array}{cc} 0 & 1\\ \epsilon & 0 \end{array} \right)$

then the matrix is still non-normal but it becomes diagonalizable with eigenvalues $\pm \sqrt{\epsilon}$, and this is true for $\epsilon > 0$ as small as desired.

This leads us to consider an alternative to the spectrum of a non-normal matrix. As we explained before, the spectrum is the following set

$\displaystyle \sigma(A) = \{ \lambda \in \mathbb{C} : \det(A-\lambda I) = 0 \}$

We can formulate an equivalent formula by considering the so-called resolvent $(A-\lambda I)^{-1}$ as follows

$\displaystyle \sigma(A) = \{ \lambda \in \mathbb{C} : (A-\lambda I) \textrm{ is not invertible} \}$

Now we can extend the spectrum to the pseudospectrum as follows.

Pseudospectrum : Let A be a square n-by-n matrix of complex numbers and $\epsilon > 0$. The ε-pseudospectrum of A is by definition

$\displaystyle \sigma_{\epsilon}(A) = \{ \lambda \in \mathbb{C} : ||(A-\lambda I)^{-1}|| \geq \frac{1}{\epsilon}\}$

where $||.||$ is a matrix norm. When $\lambda$ is an eigenvalue we put $||(A-\lambda I)^{-1}|| = \infty$ so that the spectrum is contained into the pseudospectrum. There is an equivalent definition in terms of perturbations as suggested previously :

Pseudospectrum : Let A be a square n-by-n matrix of complex numbers and $\epsilon > 0$. The ε-pseudospectrum of A is given by

$\displaystyle \sigma_{\epsilon}(A) = \{ \lambda \in \mathbb{C} : \lambda \in \sigma(A+E) : ||E|| \leq \epsilon \}$

EXAMPLES

We illustrate the concept of pseudospectrum on Toeplitz matrices. These matrices are useful in many applications such as numerical computing. A Toeplitz matrix is a matrix of the following form :

$\displaystyle A = \left( \begin{array}{ccccc} a_{0} & a_{-1} & a_{-2} & \ldots & a_{-n+1} \\ a_{1} & a_{0} & a_{-1} & \ldots & \ldots \\ a_{2} & a_{1} & a_{0} & \ldots & a_{-2} \\ \ldots & \ldots & \ldots & \ldots & \ldots \\ a_{n-1} & \ldots & a_{2} & a_{1} & a_{0} \end{array} \right)$

The coefficients are given by a simple formula :

$\displaystyle A_{ij} = a_{i-j}$

These coefficients can be encoded in a Laurent series which is, roughly speaking, a power series including terms of negative degree :

$\displaystyle L(t) = \sum_{i=-\infty}^{+\infty} a_i t^i$

This series is called the symbol of the Toeplitz matrix and it contains all the information about the coefficients.

All the plots were elaborated with SciLab by using a naive (inefficient) algorithm on matrices of dimension 10. The norm of the resolvent $\lambda \mapsto ||(A-\lambda I)^{-1}||$ is just cross-sectionned at $1/\epsilon = 5$. The pseudospectrum is the set of all $\lambda$ such that the surface is above the level $1/\epsilon$, but only a restricted slice is represented in the contour plots in order to avoid numerical difficulties. Specialized software such as EigTool do exist if one wants to perform efficient calculations.

Example 1 :

The symbol is $L(t) = t$

Eigenvalues

Cross-section at 1/epsilon

Pseudospectrum levels above 1/epsilon

Example 2 :

The symbol is $L(t) = -t+t^{-5}$

Eigenvalues

Cross-section at 1/epsilon

Pseudospectrum levels above 1/epsilon

Example 3 :

The symbol is $L(t) = -t+1+t^{-1}+t^{-2}+t^{-3}$

Eigenvalues

Cross-section at 1/epsilon

Pseudospectrum levels above 1/epsilon

COMPLEMENTS

A book :

Lloyd Trefethen and Mark Embree, Spectra and Pseudospectra : The Behavior of Nonnormal Matrices and Operators, Princeton University Press, 2005

A rich web site about the subject :

Pseudospectra gateway