Minimal polynomial (linear algebra)

<h2 id="formal-definition">Formal definition</h2>
Given an <a href="/facts/Endomorphism/s7NNlSea">endomorphism</a> T on a finite-dimensional <a href="/facts/Vector_space/M1pxMLx2">vector space</a> V over a <a href="/facts/Field_(mathematics)/xAjAS4ko">field</a> F, let I<sub>T</sub> be the set defined as

\[ I_{T} = \{ p \in \mathbf{F}[t] \mid p(T) = 0 \}, \]

where F[t] is the space of all polynomials over the field F. I<sub>T</sub> is a <a href="/facts/Ideal_(ring_theory)/vCvytduX">proper ideal</a> of F[t], and it is nonzero because V is finite-dimensional. Since F is a field, F[t] is a <a href="/facts/Principal_ideal_domain/JCwAUnyW">principal ideal domain</a>, thus any ideal is generated by a single polynomial, which is unique up to a <a href="/facts/Unit_(ring_theory)/GJSBu8Y7">unit</a> in F. A particular choice among the generators can be made, since precisely one of the generators is <a href="/facts/Monic_polynomial/4JzuIJ5R">monic</a>. The minimal polynomial is thus defined to be the monic polynomial that generates I<sub>T</sub>. It is the monic polynomial of least degree in I<sub>T</sub>.
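Membership in the ideal can be checked mechanically by evaluating a polynomial at a matrix. The following is a minimal sketch, assuming T is given as a square matrix with rational entries; the helper names `mat_mul`, `poly_at_matrix` and `annihilates`, and the 2 × 2 matrix used, are illustrative choices and not part of the definition.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, T):
    """Evaluate p(T) by Horner's rule; coeffs lists a_0, a_1, ..., a_m (lowest degree first)."""
    n = len(T)
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    result = [[Fraction(0)] * n for _ in range(n)]
    for a in reversed(coeffs):
        # result <- result * T + a * I
        result = mat_mul(result, T)
        result = [[result[i][j] + a * identity[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def annihilates(coeffs, T):
    """Return True when p(T) is the zero matrix, i.e. when p lies in the ideal of T."""
    return all(x == 0 for row in poly_at_matrix(coeffs, T) for x in row)

# Hypothetical example: a single 2 x 2 Jordan block for the eigenvalue 2.
T = [[2, 1], [0, 2]]
print(annihilates([-2, 1], T))          # t - 2        -> False
print(annihilates([4, -4, 1], T))       # (t - 2)^2    -> True
print(annihilates([-8, 12, -6, 1], T))  # (t - 2)^3    -> True
```

For this matrix the monic polynomial of least degree in the ideal is (t − 2)<sup>2</sup>, and (t − 2)<sup>3</sup> lies in the ideal because it is a multiple of that generator.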

<h2 id="applications">Applications</h2>
An <a href="/facts/Endomorphism/s7NNlSea">endomorphism</a> φ of a finite-dimensional vector space over a field F is <a href="/facts/Diagonalizable/y3MkyTCy">diagonalizable</a> <a href="/facts/If_and_only_if/bYSxGJ66">if and only if</a> its minimal polynomial factors completely over F into distinct linear factors. The fact that there is only one factor X − λ for every eigenvalue λ means that the <a href="/facts/Generalized_eigenspace/Lf7MgEGJ">generalized eigenspace</a> for λ is the same as the <a href="/facts/Eigenspace/8TjEoT8u">eigenspace</a> for λ: every <a href="/facts/Jordan_block/RSDz7mn5">Jordan block</a> has size 1. More generally, if φ satisfies a polynomial equation P(φ) = 0 where P factors into distinct linear factors over F, then it will be diagonalizable: its minimal polynomial is a divisor of P and therefore also factors into distinct linear factors. In particular one has:

<ul><li>P = X<sup>k</sup> − 1: finite order endomorphisms of <a href="/facts/Complex_number/2w7mqI7e">complex</a> vector spaces are diagonalizable. For the special case k = 2 of <a href="/facts/Involution_(mathematics)/kKxYrUVa">involutions</a>, this is even true for endomorphisms of vector spaces over any field of <a href="/facts/Characteristic_(algebra)/D2VzcQaG">characteristic</a> other than 2, since X<sup>2</sup> − 1 = (X − 1)(X + 1) is a factorization into distinct factors over such a field. This is a part of <a href="/facts/Representation_theory/3ydLtaGH">representation theory</a> of <a href="/facts/Cyclic_group/JL5zfFuL">cyclic groups</a>.</li>
<li>P = X<sup>2</sup> − X = X(X − 1): endomorphisms satisfying φ<sup>2</sup> = φ are called <a href="/facts/Projection_(linear_algebra)/sElXGkxD">projections</a>, and are always diagonalizable (moreover their only eigenvalues are 0 and 1).</li>
<li>By contrast if μ<sub>φ</sub> = X<sup>k</sup> with k ≥ 2 then φ (a <a href="/facts/Nilpotent/rERm2Pp7">nilpotent</a> endomorphism) is not diagonalizable, since X<sup>k</sup> has a repeated root 0.</li></ul>
These cases can also be <a href="/facts/Mathematical_proof/7ECDrU80">proved</a> directly, but the minimal polynomial gives a unified perspective and proof.
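The criterion can be tried on small matrices. The sketch below assumes rational matrices and a caller-supplied list of candidate roots; the function name `killed_by_distinct_linear_factors` and the three sample matrices are hypothetical. It multiplies the factors (M − rI) over the distinct roots and tests whether the product is zero, which is the condition P(φ) = 0 for a polynomial P with distinct linear factors.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def killed_by_distinct_linear_factors(M, roots):
    """Return True when the product of (M - r*I) over the distinct values in roots is zero."""
    n = len(M)
    product = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for r in set(roots):  # each root is used once, so the factors are distinct
        shifted = [[Fraction(M[i][j]) - (r if i == j else 0) for j in range(n)]
                   for i in range(n)]
        product = mat_mul(product, shifted)
    return all(x == 0 for row in product for x in row)

projection = [[1, 1], [0, 0]]  # satisfies P^2 = P
involution = [[0, 1], [1, 0]]  # satisfies S^2 = I
nilpotent = [[0, 1], [0, 0]]   # satisfies N^2 = 0 but N != 0

print(killed_by_distinct_linear_factors(projection, [0, 1]))   # X(X - 1)       -> True
print(killed_by_distinct_linear_factors(involution, [1, -1]))  # (X - 1)(X + 1) -> True
print(killed_by_distinct_linear_factors(nilpotent, [0]))       # X              -> False
```

The first two matrices are therefore diagonalizable over the rationals. The nilpotent matrix has 0 as its only eigenvalue and is not annihilated by X, so its minimal polynomial X<sup>2</sup> has a repeated root.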

<h2 id="computation">Computation</h2>
For a nonzero vector v in V define:

\[ I_{T,v} = \{ p \in \mathbf{F}[t] \mid p(T)(v) = 0 \}. \]

The set I<sub>T,v</sub> is a proper ideal of F[t]. Let μ<sub>T,v</sub> be the monic polynomial which generates it.

<h3>Properties</h3>
<ul><li>Since I<sub>T,v</sub> contains the minimal polynomial μ<sub>T</sub>, the latter is divisible by μ<sub>T,v</sub>.</li><li>If d is the least <a href="/facts/Natural_number/u65og8JE">natural number</a> such that v, T(v), ..., T<sup>d</sup>(v) are <a href="/facts/Linearly_dependent/8JrHfIIa">linearly dependent</a>, then there exist unique a<sub>0</sub>, a<sub>1</sub>, ..., a<sub>d−1</sub> in F such that

\[ a_{0} v + a_{1} T(v) + \cdots + a_{d-1} T^{d-1}(v) + T^{d}(v) = 0 \]

and for these coefficients one has

\[ \mu_{T,v}(t) = a_{0} + a_{1} t + \ldots + a_{d-1} t^{d-1} + t^{d}. \]
 
</li><li>Let the <a href="/facts/Linear_subspace/2wtKTgHL">subspace</a> W be the image of μ<sub>T,v</sub>(T), which is <a href="/facts/Stability_spectrum/DrJwMgXf">T-stable</a>. Since μ<sub>T,v</sub>(T) annihilates at least the vectors v, T(v), ..., T<sup>d−1</sup>(v), the <a href="/facts/Codimension/AlGab3eJ">codimension</a> of W is at least d.</li><li>The minimal polynomial μ<sub>T</sub> is the product of μ<sub>T,v</sub> and the minimal polynomial Q of the restriction of T to W. In the (likely) case that W has dimension 0 one has Q = 1 and therefore μ<sub>T</sub> = μ<sub>T,v</sub>; otherwise a recursive computation of Q suffices to find μ<sub>T</sub>.</li></ul>
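The second property above suggests a direct procedure for μ<sub>T,v</sub>: collect v, T(v), T<sup>2</sup>(v), ... until the newest vector is a linear combination of the earlier ones, then read off the coefficients. The following is a sketch under stated assumptions: T is a square matrix over the rationals, exact arithmetic uses Python's `fractions.Fraction`, and the helper names `apply`, `solve_combination` and `local_minimal_polynomial` as well as the diagonal test matrix are hypothetical. It does not carry out the recursive step for Q.

```python
from fractions import Fraction

def apply(T, v):
    """Apply the matrix T (list of rows) to the vector v."""
    return [sum(T[i][j] * v[j] for j in range(len(v))) for i in range(len(T))]

def solve_combination(basis, target):
    """Return c with sum(c[k] * basis[k]) == target, or None if target is not in the span.

    Assumes the vectors in basis are linearly independent.
    """
    n, m = len(target), len(basis)
    # Augmented matrix: columns are the basis vectors, last column is target.
    rows = [[Fraction(basis[k][i]) for k in range(m)] + [Fraction(target[i])]
            for i in range(n)]
    pivot_row = 0
    for col in range(m):
        pivot = next((r for r in range(pivot_row, n) if rows[r][col] != 0), None)
        if pivot is None:
            return None  # not reached when the basis is independent
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        lead = rows[pivot_row][col]
        rows[pivot_row] = [x / lead for x in rows[pivot_row]]
        for r in range(n):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    # A zero row with nonzero right-hand side means the system is inconsistent.
    if any(rows[r][m] != 0 for r in range(pivot_row, n)):
        return None
    return [rows[k][m] for k in range(m)]

def local_minimal_polynomial(T, v):
    """Coefficients [a_0, ..., a_{d-1}, 1] of mu_{T,v}, lowest degree first."""
    if all(x == 0 for x in v):
        raise ValueError("v must be nonzero")
    krylov = [list(v)]  # v, T(v), ..., kept linearly independent
    while True:
        nxt = apply(T, krylov[-1])
        c = solve_combination(krylov, nxt)
        if c is not None:
            # T^d(v) = sum c_k T^k(v), hence a_k = -c_k.
            return [-x for x in c] + [Fraction(1)]
        krylov.append(nxt)

# Hypothetical example: T = diag(1, 2).
T = [[1, 0], [0, 2]]
print(local_minimal_polynomial(T, [1, 0]))  # coefficients of t - 1
print(local_minimal_polynomial(T, [1, 1]))  # coefficients of t^2 - 3t + 2
```

For v = (1, 0) the result t − 1 is a proper divisor of μ<sub>T</sub> = (t − 1)(t − 2), while v = (1, 1) already yields μ<sub>T</sub> itself, which illustrates why the choice of v matters.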
<h3>Example</h3>
Define T to be the endomorphism of R<sup>3</sup> with matrix, on the <a href="/facts/Canonical_basis/I3dJq0Q3">canonical basis</a>,

\[ \begin{pmatrix} 1 & -1 & -1 \\ 1 & -2 & 1 \\ 0 & 1 & -3 \end{pmatrix}. \]

Taking the first canonical <a href="/facts/Basis_vector/89IPoN6c">basis vector</a> e<sub>1</sub> and its repeated images by T one obtains

\[ e_{1} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad T \cdot e_{1} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad T^{2} \cdot e_{1} = \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix} \quad \text{and} \quad T^{3} \cdot e_{1} = \begin{bmatrix} 0 \\ 3 \\ -4 \end{bmatrix} \]

of which the first three are easily seen to be <a href="/facts/Linearly_independent/8JrHfIIa">linearly independent</a>, and therefore <a href="/facts/Linear_span/vPjclEtb">span</a> all of R<sup>3</sup>. The last one then necessarily is a linear combination of the first three, in fact

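T<sup>3</sup> ⋅ e<sub>1</sub> = −4T<sup>2</sup> ⋅ e<sub>1</sub> − T ⋅ e<sub>1</sub> + e<sub>1</sub>,
so that:

μ<sub>T,e<sub>1</sub></sub> = X<sup>3</sup> + 4X<sup>2</sup> + X − 1.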
This is in fact also the minimal polynomial μ<sub>T</sub> and the characteristic polynomial χ<sub>T</sub>: indeed μ<sub>T,e<sub>1</sub></sub> divides μ<sub>T</sub> which divides χ<sub>T</sub>, and since the first and last are of <a href="/facts/Degree_of_a_polynomial/Bf8vEIhf">degree</a> 3 and all are monic, they must all be the same. Another reason is that in general if any polynomial in T annihilates a vector v, then it also annihilates T ⋅ v (just apply T to the equation that says that it annihilates v), and therefore by iteration it annihilates the entire space generated by the iterated images by T of v; in the current case we have seen that for v = e<sub>1</sub> that space is all of R<sup>3</sup>, so μ<sub>T,e<sub>1</sub></sub>(T) = 0. Indeed one verifies for the full matrix that T<sup>3</sup> + 4T<sup>2</sup> + T − I<sub>3</sub> is the <a href="/facts/Zero_matrix/uqbGFFww">zero matrix</a>:

\[ \begin{bmatrix} 0 & 1 & -3 \\ 3 & -13 & 23 \\ -4 & 19 & -36 \end{bmatrix} + 4 \begin{bmatrix} 0 & 0 & 1 \\ -1 & 4 & -6 \\ 1 & -5 & 10 \end{bmatrix} + \begin{bmatrix} 1 & -1 & -1 \\ 1 & -2 & 1 \\ 0 & 1 & -3 \end{bmatrix} + \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]
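The arithmetic of this example can be re-checked with a few lines of integer matrix code. The sketch below uses only the matrix and vectors stated in the example; the helper names `mat_mul` and `mat_vec` and the variable names are illustrative.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(A, v):
    """Apply the matrix A to the column vector v."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

T = [[1, -1, -1],
     [1, -2, 1],
     [0, 1, -3]]
e1 = [1, 0, 0]

# Iterated images of e1 under T.
Te1 = mat_vec(T, e1)
T2e1 = mat_vec(T, Te1)
T3e1 = mat_vec(T, T2e1)
print(Te1, T2e1, T3e1)  # [1, 1, 0] [0, -1, 1] [0, 3, -4]

# The relation T^3 e1 = -4 T^2 e1 - T e1 + e1, checked componentwise.
combo = [-4 * a - b + c for a, b, c in zip(T2e1, Te1, e1)]
print(combo == T3e1)  # True

# The full matrix identity T^3 + 4 T^2 + T - I = 0.
T2 = mat_mul(T, T)
T3 = mat_mul(T2, T)
residual = [[T3[i][j] + 4 * T2[i][j] + T[i][j] - (1 if i == j else 0)
             for j in range(3)] for i in range(3)]
print(residual)  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```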

<h2 id="see-also">See also</h2>
<ul><li>Annihilating polynomial</li></ul>

