An algorithm to compute eigenvalues and eigenvectors in Julia

This small notebook shows a naive implementation of an algorithm to compute the eigenvalues and eigenvectors of a matrix, written in the Julia programming language.

Definition

For a real- or complex-valued square matrix $A \in \mathbb{M}_{n,n}(\mathbb{K})$ (over the field $\mathbb{K}=\mathbb{R}$ or $\mathbb{C}$), an eigenvalue $\lambda$ and an associated eigenvector $v \neq 0$ satisfy $$ (A - \lambda I) v = 0.$$ More generally, if $\lambda$ has algebraic multiplicity $k\in\mathbb{N}^*$, a generalized eigenvector $v$ satisfies $$ (A - \lambda I)^k v = 0.$$
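For example, for $A = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}$, the eigenvalue $\lambda = 3$ has eigenvector $v = (1, 1)^{\mathsf{T}}$, since $(A - 3 I) v = \begin{pmatrix} -1 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = 0$.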

Examples

We will need at least two example matrices to test our algorithms.

In [1]:
A = [ 1. 2 3; 4 5 6; 7 8 9 ]
Out[1]:
3×3 Array{Float64,2}:
 1.0  2.0  3.0
 4.0  5.0  6.0
 7.0  8.0  9.0
In [2]:
B = [12 -51 4; 6 167 -68; -4 24 -41]
Out[2]:
3×3 Array{Int64,2}:
 12  -51    4
  6  167  -68
 -4   24  -41

Using the built-in functions

Julia has, of course, a built-in eig function.
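For reference, here is a minimal sketch of how to call it (written for the Julia 0.6 used in this notebook; on Julia ≥ 1.0, eig was replaced by eigen from the LinearAlgebra standard library):

D, V = eig(A)  # D: vector of eigenvalues, V: matrix whose columns are the eigenvectors

# Equivalent on Julia >= 1.0:
# using LinearAlgebra
# F = eigen(A); F.values, F.vectors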

Testing

Based on the definition, it is not difficult to test whether a pair $(\lambda, v)$ is an (eigenvalue, eigenvector) pair.

We can write a small function that checks whether the eigenvalues and eigenvectors are correct.

In [3]:
function check_DV(A, D, V)
    I = eye(size(A)[1])  # identity matrix of the same size as A
    for i = 1:size(A)[1]
        check = (A - D[i] * I) * V[:,i]  # residual, should be numerically zero
        println("For the ", i, "th eigen value and vector, (A - lambda I)^k * v = ", check, "\n\twhich has a L2 norm = ", norm(check))
        assert(norm(check) <= 1e-14)  # tolerance chosen for these small examples
    end
end
Out[3]:
check_DV (generic function with 1 method)

First example

In [4]:
D_A, V_A = eig(A)
Out[4]:
([16.1168, -1.11684, -1.30368e-15], [-0.231971 -0.78583 0.408248; -0.525322 -0.0867513 -0.816497; -0.818673 0.612328 0.408248])
In [5]:
check_DV(A, D_A, V_A)
For the 1th eigen value and vector, (A - lambda I)^k * v = [5.32907e-15, -1.77636e-15, -8.88178e-16]
	which has a L2 norm = 5.687116766346677e-15
For the 2th eigen value and vector, (A - lambda I)^k * v = [-6.66134e-16, -8.88178e-16, -2.66454e-15]
	which has a L2 norm = 2.886579864025407e-15
For the 3th eigen value and vector, (A - lambda I)^k * v = [-2.22045e-16, -1.77636e-15, 4.44089e-16]
	which has a L2 norm = 1.8444410139024814e-15

Second example

A second example. Here the check fails: the residual norm for the first eigenpair is about $2 \times 10^{-14}$, slightly above our (too strict) tolerance of $10^{-14}$.

In [6]:
D_B, V_B = eig(B)
Out[6]:
([156.137, 16.06, -34.1967], [0.328147 -0.990526 0.254758; -0.936881 0.0871754 0.302793; -0.120717 0.106104 0.918376])
In [7]:
check_DV(B, D_B, V_B)
For the 1th eigen value and vector, (A - lambda I)^k * v = [-2.88658e-15, -1.95399e-14, 3.55271e-15]
	which has a L2 norm = 2.0068951041893117e-14
AssertionError: 

Stacktrace:
 [1] assert at ./error.jl:68 [inlined]
 [2] check_DV(::Array{Int64,2}, ::Array{Float64,1}, ::Array{Float64,2}) at ./In[3]:6
 [3] include_string(::String, ::String) at ./loading.jl:515

Computing the largest eigenvalue with power iteration

Let's compute the eigenpair with the largest (in magnitude) eigenvalue, through the power iteration method.

In [8]:
"Power iteration method on A, to find eigenpair with largest eigen value."
function poweriteration(A, maxiter=20, userandomstart=true)
    n = size(A)[1]
    if userandomstart
        X = randn((n, 1))
    else
        X = ones((n, 1))
    end
    for i = 1:maxiter
        next_X = A * X
        X = next_X / norm(next_X)
    end
    # Rayleigh quotient: lambda = (X^* ⋅ A ⋅ X) / (X^* ⋅ X)
    lambda = (transpose(conj(X)) * (A * X)) / (transpose(conj(X)) * X)
    lambda, X
end
Out[8]:
poweriteration

For instance, we correctly obtain the largest eigenpair for our example $A$, up to a $-1$ sign in $v$ (an eigenvector is only defined up to a nonzero scalar factor).

In [9]:
poweriteration(A)
Out[9]:
([16.1168], [0.231971; 0.525322; 0.818673])
In [10]:
D_A, V_A = eig(A)
Out[10]:
([16.1168, -1.11684, -1.30368e-15], [-0.231971 -0.78583 0.408248; -0.525322 -0.0867513 -0.816497; -0.818673 0.612328 0.408248])
In [11]:
(D_A[1], -1 * V_A[:,1])
Out[11]:
(16.116843969807043, [0.231971, 0.525322, 0.818673])

How does the result depend on the number of iterations?

In [12]:
poweriteration
Out[12]:
poweriteration (generic function with 3 methods)
In [13]:
println(poweriteration(A, 1))
println(poweriteration(A, 2))
println(poweriteration(A, 5))
println(poweriteration(A, 10))
([15.3427], [0.130291; 0.494751; 0.859212])
([16.1581], [-0.239638; -0.527385; -0.815131])
([16.1168], [-0.23197; -0.525322; -0.818674])
([16.1168], [0.231971; 0.525322; 0.818673])
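As expected for power iteration, the convergence is geometric: the error decreases roughly like $|\lambda_2 / \lambda_1|^k$ after $k$ iterations, here about $(1.117 / 16.117)^k \simeq 0.07^k$, so a handful of iterations already gives an excellent approximation.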

QR algorithm

The QR method can be used to compute the eigenvalues, and then the inverse iteration method (presented below) can be used to compute an approximation of a corresponding eigenvector for each eigenvalue.

First, we need to implement the QR decomposition, which can be based on the Gram-Schmidt process.
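As a reminder of the link between the two (standard material, stated here for a full-rank $A$ with columns $a_j$): the Gram-Schmidt process produces an orthonormal basis $(e_1, \dots, e_n)$ such that $$ a_j = \sum_{i=1}^{j} \langle e_i, a_j \rangle \, e_i, $$ which is exactly $A = QR$ with $Q = (e_1 | \cdots | e_n)$ orthogonal and $R$ upper triangular, $R_{ij} = \langle e_i, a_j \rangle$ for $i \leq j$.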

Gram-Schmidt process

It requires a generic inner-product function:

In [14]:
"Inner product of two vectors v, w. Can be vertical (n,1) or horizontal (1,n) vectors."
function inner(v, w)
    assert(size(v) == size(w))
    nm = size(v)
    if length(nm) == 1
        return transpose(conj(v)) * w
    elseif nm[1] == 1
        return conj(v) * transpose(w)
    end
end
Out[14]:
inner

It works for both row and column vectors:

In [15]:
inner([1 1 1], [2 3 4])
Out[15]:
1×1 Array{Int64,2}:
 9
In [16]:
inner([1; 1; 1], [2; 3; 4])
Out[16]:
9

Then a projection operator:

In [17]:
"projection(a, e): projection operator of vector a onto base vector e. Uses inner(e, a) and inner(e, e)."
function projection(a, e)
    return (inner(e, a) / inner(e, e)) * e 
end
Out[17]:
projection
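A quick sanity check (a hypothetical example, not from the original notebook): projecting $(1, 1)$ onto the base vector $(1, 0)$ should give $(1, 0)$:

projection([1., 1.], [1., 0.])  # expected: [1.0, 0.0]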

Then the Gram-Schmidt process:

In [18]:
"gramschmidt(A): Gram-Schmidt orthogonalization operator, returns us, es (unormalized matrix, base matrix)."
function gramschmidt(A, verbose=false)
    assert(size(A)[1] == size(A)[2])
    n = size(A)[1]
    if verbose
        println("n = ", n)
    end
    us = zeros((n, n))
    es = zeros((n, n))
    for i = 1:n
        if verbose
            println("i = ", i)
        end
        us[:,i] = A[:,i] 
        for j = 1:i-1
            if verbose
                println("\tj = ", j)
                println("\tus[:,i] = ", us[:,i])
            end
            us[:,i] -= projection(A[:,i], us[:,j]) 
        end
        if verbose
            println("us[:,i] = ", us[:,i])
        end
        es[:,i] = us[:,i] / norm(us[:,i])
        if verbose
            println("es[:,i] = ", es[:,i])
        end
    end
    return us, es
end
Out[18]:
gramschmidt

Example:

In [19]:
A
Out[19]:
3×3 Array{Float64,2}:
 1.0  2.0  3.0
 4.0  5.0  6.0
 7.0  8.0  9.0
In [20]:
us, es = gramschmidt(A)
Out[20]:
([1.0 0.818182 7.99361e-15; 4.0 0.272727 3.44169e-15; 7.0 -0.272727 -7.77156e-16], [0.123091 0.904534 0.914844; 0.492366 0.301511 0.393891; 0.86164 -0.301511 -0.0889431])
In [21]:
us
Out[21]:
3×3 Array{Float64,2}:
 1.0   0.818182   7.99361e-15
 4.0   0.272727   3.44169e-15
 7.0  -0.272727  -7.77156e-16
In [22]:
es
Out[22]:
3×3 Array{Float64,2}:
 0.123091   0.904534   0.914844 
 0.492366   0.301511   0.393891 
 0.86164   -0.301511  -0.0889431

We can check that $a_1 = \langle e_1, a_1 \rangle e_1$, $a_2 = \langle e_1, a_2 \rangle e_1 + \langle e_2, a_2 \rangle e_2$, and so on:

In [23]:
inner(es[:,1], A[:,1]) * es[:,1]
Out[23]:
3-element Array{Float64,1}:
 1.0
 4.0
 7.0
In [24]:
inner(es[:,1], A[:,2]) * es[:,1] + inner(es[:,2], A[:,2]) * es[:,2]
Out[24]:
3-element Array{Float64,1}:
 2.0
 5.0
 8.0

It does not hold for the last column, as $A$ is not full rank: indeed $a_3 = 2 a_2 - a_1$, so the third Gram-Schmidt vector $u_3$ is numerically zero and $e_3$ is meaningless.

In [25]:
rank(A)
Out[25]:
2
In [26]:
inner(es[:,1], A[:,3]) * es[:,1] + inner(es[:,2], A[:,3]) * es[:,2] + inner(es[:,3], A[:,3]) * es[:,3]
Out[26]:
3-element Array{Float64,1}:
 6.94059
 7.69664
 8.61689

$E$ (the matrix es) is indeed (almost) orthogonal; we only check its first two columns, since the third one is meaningless here:

In [27]:
[ norm(es[:,i]) for i = 1:size(es)[1] ]
Out[27]:
3-element Array{Float64,1}:
 1.0
 1.0
 1.0
In [28]:
(es' * es)[1:2,1:2]
Out[28]:
2×2 Array{Float64,2}:
  1.0          -7.77156e-16
 -7.77156e-16   1.0        

Second example:

In [29]:
us, es = gramschmidt(B)
Out[29]:
([12.0 -69.0 -11.6; 6.0 158.0 1.2; -4.0 30.0 -33.0], [0.857143 -0.394286 -0.331429; 0.428571 0.902857 0.0342857; -0.285714 0.171429 -0.942857])
In [30]:
[ norm(es[:,i]) for i = 1:size(es)[1] ]
Out[30]:
3-element Array{Float64,1}:
 1.0
 1.0
 1.0
In [31]:
es * es'
Out[31]:
3×3 Array{Float64,2}:
  1.0          -5.89806e-17   5.55112e-17
 -5.89806e-17   1.0          -6.93889e-17
  5.55112e-17  -6.93889e-17   1.0        

Last example:

In [32]:
V = [3 2; 1 2]
Out[32]:
2×2 Array{Int64,2}:
 3  2
 1  2
In [33]:
us, es = gramschmidt(V)
Out[33]:
([3.0 -0.4; 1.0 1.2], [0.948683 -0.316228; 0.316228 0.948683])
In [34]:
es' * es
Out[34]:
2×2 Array{Float64,2}:
  1.0          -2.77556e-16
 -2.77556e-16   1.0        

QR decomposition

It is easy to obtain from the Gram-Schmidt process:

In [35]:
"QR decomposition, returns (Q, R): Q is orthogonal and R is upper-triangular, such that A = Q R."
function QR(A)
    assert(size(A)[1] == size(A)[2])
    n = size(A)[1]
    us, es = gramschmidt(A)
    Q = copy(es)
    R = zeros((n, n))
    for i = 1:n
        for j = i:n
            R[i,j] = inner(es[:,i], A[:,j])
        end
    end
    return Q, R
end
Out[35]:
QR

First example

In [36]:
Q, R = QR(A)
Out[36]:
([0.123091 0.904534 0.914844; 0.492366 0.301511 0.393891; 0.86164 -0.301511 -0.0889431], [8.12404 9.60114 11.0782; 0.0 0.904534 1.80907; 0.0 0.0 4.30739])
In [37]:
A
Out[37]:
3×3 Array{Float64,2}:
 1.0  2.0  3.0
 4.0  5.0  6.0
 7.0  8.0  9.0
In [38]:
Q
Out[38]:
3×3 Array{Float64,2}:
 0.123091   0.904534   0.914844 
 0.492366   0.301511   0.393891 
 0.86164   -0.301511  -0.0889431
In [39]:
R
Out[39]:
3×3 Array{Float64,2}:
 8.12404  9.60114   11.0782 
 0.0      0.904534   1.80907
 0.0      0.0        4.30739
In [40]:
Q * R
Out[40]:
3×3 Array{Float64,2}:
 1.0  2.0  6.94059
 4.0  5.0  7.69664
 7.0  8.0  8.61689

Second example

In [41]:
Q2, R2 = QR(B)
Out[41]:
([0.857143 -0.394286 -0.331429; 0.428571 0.902857 0.0342857; -0.285714 0.171429 -0.942857], [14.0 21.0 -14.0; 0.0 175.0 -70.0; 0.0 0.0 35.0])
In [42]:
B
Out[42]:
3×3 Array{Int64,2}:
 12  -51    4
  6  167  -68
 -4   24  -41
In [43]:
Q2
Out[43]:
3×3 Array{Float64,2}:
  0.857143  -0.394286  -0.331429 
  0.428571   0.902857   0.0342857
 -0.285714   0.171429  -0.942857 
In [44]:
R2
Out[44]:
3×3 Array{Float64,2}:
 14.0   21.0  -14.0
  0.0  175.0  -70.0
  0.0    0.0   35.0
In [45]:
Q2 * R2
Out[45]:
3×3 Array{Float64,2}:
 12.0  -51.0    4.0
  6.0  167.0  -68.0
 -4.0   24.0  -41.0
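Equivalently, since $Q_2$ has orthonormal columns, we should have $R_2 = Q_2^{\mathsf{T}} B$ up to rounding errors; a quick check (a sketch, not from the original notebook):

norm(R2 - Q2' * B)  # expected to be tiny (rounding errors only)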

Due to rounding errors, $B \neq Q_2 R_2$ exactly, but almost:

In [46]:
B == Q2 * R2
Out[46]:
false
In [47]:
assert(norm(B - (Q2 * R2)) <= 1e-13)

QR method

In [48]:
"Apply the QR method for maxiter steps. Should return a triangular matrix similar to A (same eigenvalues)."
function QR_method(A, maxiter=50)
    Ak = A
    for k = 1:maxiter
        Qk, Rk = QR(Ak)
        Ak = Rk * Qk
    end
    return Ak
end
Out[48]:
QR_method

It should produce matrices which are almost upper triangular! Indeed, each iterate is similar to the previous one: $A_{k+1} = R_k Q_k = Q_k^{-1} (Q_k R_k) Q_k = Q_k^{-1} A_k Q_k$, so all the $A_k$ have the same eigenvalues as $A$.

In [49]:
Ak = QR_method(A)
Out[49]:
3×3 Array{Float64,2}:
 18.0249        -7.91186       -1.81805   
  2.23294e-35   -3.41599       -1.67461   
  2.70327e-164  -4.65436e-129   0.00799986
In [50]:
"Truncate to zero values under the diagonal (have to be smaller than tolerance)"
function truncate_to_zero_below_diag(A, tolerance=1e-12)
    assert(size(A)[1] == size(A)[2])
    n = size(A)[1]
    for j = 1:n
        for i = j+1:n
            assert(norm(A[i,j]) <= tolerance)
            A[i,j] = 0
        end
    end
    return A
end
Out[50]:
truncate_to_zero_below_diag
In [51]:
truncate_to_zero_below_diag(Ak)
Out[51]:
3×3 Array{Float64,2}:
 18.0249  -7.91186  -1.81805   
  0.0     -3.41599  -1.67461   
  0.0      0.0       0.00799986
In [52]:
Ak = QR_method(B)
Out[52]:
3×3 Array{Float64,2}:
 156.137         62.2705       -87.4604
   3.54467e-31  -34.1967        15.8136
  -2.29384e-47    1.53496e-14   16.06  
In [53]:
truncate_to_zero_below_diag(Ak)
Out[53]:
3×3 Array{Float64,2}:
 156.137   62.2705  -87.4604
   0.0    -34.1967   15.8136
   0.0      0.0      16.06  
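A quick sanity check (a sketch, not from the original notebook): similar matrices have the same trace, and zeroing the below-diagonal entries does not change the diagonal, so the diagonal of Ak should sum to $\operatorname{tr}(B) = 12 + 167 - 41 = 138$:

trace(Ak) - trace(B)  # `trace` in Julia 0.6 (renamed `tr` in Julia >= 1.0); expected to be very small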

Finally, we can extract the eigenvalues from the diagonal:

In [54]:
"Extract the diagonal from the QR method."
function QR_eigenvalues(A, maxiter=50)
    return diag(QR_method(A, maxiter))
end
Out[54]:
QR_eigenvalues

It does not work for matrices which are not full rank (as seen above, our Gram-Schmidt-based QR decomposition breaks down on them):

In [55]:
QR_eigenvalues(A)
Out[55]:
3-element Array{Float64,1}:
 18.0249    
 -3.41599   
  0.00799986
In [56]:
eigvals(A)
Out[56]:
3-element Array{Float64,1}:
 16.1168     
 -1.11684    
 -1.30368e-15

But it works fine for a full-rank matrix:

In [57]:
QR_eigenvalues(B)
Out[57]:
3-element Array{Float64,1}:
 156.137 
 -34.1967
  16.06  
In [58]:
eigvals(B)
Out[58]:
3-element Array{Float64,1}:
 156.137 
  16.06  
 -34.1967

Computing the eigenvectors

Now we have a numerical algorithm that gives an approximation of the eigenvalues. For each eigenvalue $\lambda$, we can use the inverse iteration method to find a corresponding eigenvector: it is simply the power iteration applied to $(A - \mu I)^{-1}$ with $\mu \simeq \lambda$, whose eigenvalues are the $1 / (\lambda_i - \mu)$, the dominant one corresponding to the $\lambda_i$ closest to $\mu$.

In [59]:
"Inverse iteration method to find the eigen vector corresponding to a given eigen value."
function inverseIteration(A, val, maxiter=20, userandomstart=true)
    mu_I = val * eye(A)
    inv_A_minus_muI = inv(A - mu_I)
    n = size(A)[1]
    if userandomstart
        X = randn((n, 1))
    else
        X = ones((n, 1))
    end
    for i = 1:maxiter
        next_X = inv_A_minus_muI * X
        X = next_X / norm(next_X)
    end
    X
end
Out[59]:
inverseIteration

Now, putting it all together:

In [60]:
"Approximation of D, V: D contains the eigen values (vector) and V the eigen vectors (column based)."
function QR_eigen(A, maxiter=20, userandomstart=true)
    n = size(A)[1]
    D = QR_eigenvalues(A, maxiter)
    V = zeros((n,n))
    for i = 1:n
        V[:,i] = inverseIteration(A, D[i], maxiter, userandomstart)
    end
    return D, V
end
Out[60]:
QR_eigen
In [61]:
D2, V2 = QR_eigen(B)
Out[61]:
([156.137, -34.1966, 16.06], [0.328147 0.254758 -0.990526; -0.936881 0.302793 0.0871754; -0.120717 0.918376 0.106104])
In [62]:
D2s, V2s = eig(B)
Out[62]:
([156.137, 16.06, -34.1967], [0.328147 -0.990526 0.254758; -0.936881 0.0871754 0.302793; -0.120717 0.106104 0.918376])

Up to a permutation of the eigenvalues, and up to multiplication of some eigenvectors by $-1$, we got the right values!
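We can check this programmatically by comparing the sorted eigenvalues (a sketch, not from the original notebook; the small discrepancy on the eigenvalue $-34.197$ comes from the $20$ QR iterations used by default in QR_eigen):

maximum(abs.(sort(D2) - sort(D2s)))  # expected to be small, of order 1e-4 here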


Conclusion

It is well known that finding eigenpairs and finding roots of a polynomial are equivalent problems (thanks to companion matrices). The Abel-Ruffini theorem states that there is no general solution in radicals for polynomials of degree five or higher, and therefore there is no hope of finding an exact (finite, closed-form) algorithm for the eigenpair problem.
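For instance, the roots of $p(x) = x^3 + c_2 x^2 + c_1 x + c_0$ are exactly the eigenvalues of its companion matrix $$ C(p) = \begin{pmatrix} 0 & 0 & -c_0 \\ 1 & 0 & -c_1 \\ 0 & 1 & -c_2 \end{pmatrix}. $$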

Hence, we focused only on numerical (i.e., iterative) methods.

It is a pity that we do not give more details about more general algorithms for the eigenpair problem.

That's it for today, folks! See here for other notebooks I wrote.