Covariant transformation

<h2 id="examples-of-covariant-transformation">Examples of covariant transformation</h2>
<h3>The derivative of a function transforms covariantly</h3>
The explicit form of a covariant transformation is best introduced with the transformation properties of the derivative of a function. Consider a scalar function f (like the temperature at a location in a space) defined on a set of points p, identifiable in a given coordinate system \(x^{i},\; i=0,1,\dots\) (such a collection is called a <a href="/facts/Manifold/rXPERJtu">manifold</a>). If we adopt a new coordinate system \({x'}^{j},\; j=0,1,\dots\), then for each i the original coordinate \(x^{i}\) can be expressed as a function of the new coordinates, so \(x^{i}\left({x'}^{j}\right),\; j=0,1,\dots\). One can express the derivative of f in the old coordinates in terms of the new coordinates, using the <a href="/facts/Chain_rule/aMLYxP0x">chain rule</a> of the derivative, as

\[ \frac{\partial f}{\partial x^{i}} = \frac{\partial f}{\partial {x'}^{j}}\;\frac{\partial {x'}^{j}}{\partial x^{i}} \]

This is the explicit form of the covariant transformation rule; summation over the repeated index j is implied (Einstein summation convention). The ordinary partial derivative with respect to the coordinates is sometimes written with a comma, as follows

\[ f_{,i}\ \stackrel{\mathrm{def}}{=}\ \frac{\partial f}{\partial x^{i}} \]

where the index i is placed as a lower index, because of the covariant transformation.
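The chain-rule form of the covariant rule can be checked numerically. The following is a minimal sketch, not a definitive implementation: it assumes Cartesian coordinates (x, y) as the old system, plane polar coordinates (r, θ) as the new system, and the sample field f = x²y. These choices, the helper names and the finite-difference step are illustrative assumptions that do not come from the text above.

```python
import math

def partial(func, point, index, h=1e-6):
    """Central-difference estimate of the partial derivative of func at point."""
    forward = list(point)
    backward = list(point)
    forward[index] += h
    backward[index] -= h
    return (func(*forward) - func(*backward)) / (2 * h)

# Old coordinates x^i = (x, y); new coordinates x'^j = (r, theta) (assumed example).
def new_from_old(x, y):
    return (math.hypot(x, y), math.atan2(y, x))

def old_from_new(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))

# The same scalar field, written once in each coordinate system.
def f_old(x, y):
    return x * x * y

def f_new(r, theta):
    return f_old(*old_from_new(r, theta))

def gradient_via_chain_rule(x, y):
    """Return [df/dx^i] computed as (df/dx'^j) (dx'^j/dx^i), summed over j."""
    new_point = new_from_old(x, y)
    result = []
    for i in range(2):
        total = 0.0
        for j in range(2):
            df_dnew_j = partial(f_new, new_point, j)
            dnew_j_dold_i = partial(
                lambda a, b, j=j: new_from_old(a, b)[j], (x, y), i
            )
            total += df_dnew_j * dnew_j_dold_i
        result.append(total)
    return result

point = (1.0, 2.0)
direct = [partial(f_old, point, i) for i in range(2)]  # df/dx and df/dy directly
chained = gradient_via_chain_rule(*point)              # the same through (r, theta)
print(direct, chained)
```

Both lists should agree up to finite-difference error, which is what the transformation rule asserts for this particular pair of coordinate systems.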

<h3>Basis vectors transform covariantly</h3>
A vector can be expressed in terms of basis vectors. For a certain coordinate system, we can choose the vectors tangent to the coordinate grid. This basis is called the coordinate basis.
To illustrate the transformation properties, consider again the set of points p, identifiable in a given coordinate system \(x^{i}\) where \(i=0,1,\dots\) (a <a href="/facts/Manifold/rXPERJtu">manifold</a>). A scalar function f, which assigns a real number to every point p in this space, is a function of the coordinates, \(f\left(x^{0},x^{1},\dots\right)\). A curve is a one-parameter collection of points c, say with curve parameter λ, written c(λ). A tangent vector v to the curve is the derivative \(dc/d\lambda\) along the curve, taken at the point p under consideration. Note that we can see the <a href="/facts/Tangent_vector/UEq5yffm">tangent vector</a> v as an operator (the <a href="/facts/Directional_derivative/Fjc3JSVn">directional derivative</a>) which can be applied to a function

\[ \mathbf{v}[f]\ \stackrel{\mathrm{def}}{=}\ \frac{df}{d\lambda} = \frac{d}{d\lambda} f(c(\lambda)) \]

The parallel between the tangent vector and the operator can also be worked out in coordinates

\[ \mathbf{v}[f] = \frac{dx^{i}}{d\lambda}\frac{\partial f}{\partial x^{i}} \]

or in terms of operators \(\partial/\partial x^{i}\)

\[ \mathbf{v} = \frac{dx^{i}}{d\lambda}\frac{\partial}{\partial x^{i}} = \frac{dx^{i}}{d\lambda}\mathbf{e}_{i} \]

where we have written \(\mathbf{e}_{i}=\partial/\partial x^{i}\), the tangent vectors to the coordinate curves, which together make up the coordinate grid itself.
If we adopt a new coordinate system \({x'}^{i},\; i=0,1,\dots\), then for each i the old coordinate \(x^{i}\) can be expressed as a function of the new system, so \(x^{i}\left({x'}^{j}\right),\; j=0,1,\dots\)

Let \(\mathbf{e}'_{i}=\partial/\partial {x'}^{i}\) be the basis of tangent vectors in this new coordinate system. We can relate the new basis to the old basis \(\mathbf{e}_{i}\) by applying the <a href="/facts/Chain_rule/aMLYxP0x">chain rule</a> on x. As a function of coordinates we find the following transformation

\[ \mathbf{e}'_{i} = \frac{\partial}{\partial {x'}^{i}} = \frac{\partial x^{j}}{\partial {x'}^{i}}\frac{\partial}{\partial x^{j}} = \frac{\partial x^{j}}{\partial {x'}^{i}}\mathbf{e}_{j} \]

which indeed is the same as the covariant transformation for the derivative of a function.
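As a worked illustration of the rule for basis vectors, take the old system to be Cartesian coordinates (x, y) in the plane and the new system to be polar coordinates (r, θ). This specific choice is an assumption made here for the example and is not part of the text above.

```latex
% Assumed example: old coordinates (x, y), new coordinates (r, \theta)
\begin{align}
x &= r\cos\theta, \qquad y = r\sin\theta \\
\mathbf{e}'_{r} &= \frac{\partial x}{\partial r}\,\mathbf{e}_{x}
                 + \frac{\partial y}{\partial r}\,\mathbf{e}_{y}
                 = \cos\theta\,\mathbf{e}_{x} + \sin\theta\,\mathbf{e}_{y} \\
\mathbf{e}'_{\theta} &= \frac{\partial x}{\partial \theta}\,\mathbf{e}_{x}
                      + \frac{\partial y}{\partial \theta}\,\mathbf{e}_{y}
                      = -r\sin\theta\,\mathbf{e}_{x} + r\cos\theta\,\mathbf{e}_{y}
\end{align}
```

In this example the second new basis vector has length r rather than 1, which shows that a coordinate basis need not be normalized.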

<h2 id="contravariant-transformation">Contravariant transformation</h2>
The components of a (tangent) vector transform in a different way, called contravariant transformation. Consider a tangent vector v and call its components \(v^{i}\) on a basis \(\mathbf{e}_{i}\). On another basis \(\mathbf{e}'_{i}\) we call the components \({v'}^{i}\), so

\[ \mathbf{v} = v^{i}\mathbf{e}_{i} = {v'}^{i}\mathbf{e}'_{i} \]

in which

\[ v^{i} = \frac{dx^{i}}{d\lambda} \quad\text{and}\quad {v'}^{i} = \frac{d{x'}^{i}}{d\lambda} \]

If we express the new components in terms of the old ones, then

\[ {v'}^{i} = \frac{d{x'}^{i}}{d\lambda} = \frac{\partial {x'}^{i}}{\partial x^{j}}\frac{dx^{j}}{d\lambda} = \frac{\partial {x'}^{i}}{\partial x^{j}}\,v^{j} \]

This is the explicit form of the contravariant transformation. It differs from the covariant rule in that it uses the inverse of the matrix of partial derivatives that appears there. To distinguish contravariant components from covariant quantities such as the basis (tangent) vectors, their index is placed on top.
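The contravariant rule for vector components can be checked numerically along a curve. The sketch below is illustrative only: the curve, the choice of Cartesian coordinates as the old system and polar coordinates as the new one, and all function names are assumptions introduced for the example.

```python
import math

# Illustrative curve c(lam) in the old (Cartesian) coordinates x^i = (x, y).
def curve_old(lam):
    return (1.0 + lam, 2.0 + 3.0 * lam * lam)

# New coordinates x'^i = (r, theta) as functions of the old ones (assumed example).
def new_from_old(x, y):
    return (math.hypot(x, y), math.atan2(y, x))

def derivative(func, t, h=1e-6):
    """Central-difference derivative of a tuple-valued function of one variable."""
    ahead = func(t + h)
    behind = func(t - h)
    return [(a - b) / (2 * h) for a, b in zip(ahead, behind)]

def jacobian_new_wrt_old(x, y, h=1e-6):
    """Matrix J[i][j] = d x'^i / d x^j, estimated by central differences."""
    columns = []
    for j in range(2):
        ahead = [x, y]
        behind = [x, y]
        ahead[j] += h
        behind[j] -= h
        new_ahead = new_from_old(*ahead)
        new_behind = new_from_old(*behind)
        columns.append([(a - b) / (2 * h) for a, b in zip(new_ahead, new_behind)])
    # columns[j][i] holds d x'^i / d x^j; transpose to get J[i][j].
    return [[columns[j][i] for j in range(2)] for i in range(2)]

lam0 = 0.5
x0, y0 = curve_old(lam0)

# Old components v^j = dx^j/dlam.
v_old = derivative(curve_old, lam0)

# Contravariant rule: v'^i = (dx'^i/dx^j) v^j.
J = jacobian_new_wrt_old(x0, y0)
v_new_rule = [sum(J[i][j] * v_old[j] for j in range(2)) for i in range(2)]

# Direct computation: v'^i = dx'^i/dlam along the same curve.
v_new_direct = derivative(lambda lam: new_from_old(*curve_old(lam)), lam0)

print(v_old, v_new_rule, v_new_direct)
```

The components obtained from the transformation rule and those obtained by differentiating the curve directly in the new coordinates should agree up to finite-difference error.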

<h3>Basis differential forms transform contravariantly</h3>
An example of a contravariant transformation is given by a <a href="/facts/Differential_form/T9gyHtMv">differential form</a> df. For f as a function of coordinates \(x^{i}\), df can be expressed in terms of the basis \(dx^{i}\). The differentials dx transform according to the contravariant rule since

\[ d{x'}^{i} = \frac{\partial {x'}^{i}}{\partial x^{j}}\,dx^{j} \]
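For a concrete instance of this rule, one may again assume Cartesian coordinates (x, y) as the old system and polar coordinates (r, θ) as the new one. The choice is illustrative, and the formula for θ is taken on the half-plane x > 0.

```latex
% Assumed example: new coordinates (r, \theta) as functions of old coordinates (x, y)
\begin{align}
r &= \sqrt{x^{2}+y^{2}}, \qquad \theta = \arctan\frac{y}{x} \quad (x>0) \\
dr &= \frac{\partial r}{\partial x}\,dx + \frac{\partial r}{\partial y}\,dy
    = \frac{x}{r}\,dx + \frac{y}{r}\,dy \\
d\theta &= \frac{\partial \theta}{\partial x}\,dx + \frac{\partial \theta}{\partial y}\,dy
         = -\frac{y}{r^{2}}\,dx + \frac{x}{r^{2}}\,dy
\end{align}
```

The coefficients are the partial derivatives of the new coordinates with respect to the old ones, exactly as in the contravariant rule for vector components.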

<h2 id="dual-properties">Dual properties</h2>
Entities that transform covariantly (like basis vectors) and the ones that transform contravariantly (like components of a vector and differential forms) are "almost the same" and yet they are different. They have "dual" properties.
Behind this lies what is mathematically known as the <a href="/facts/Dual_space/4Uw4Knz1">dual space</a>, which always accompanies a given linear <a href="/facts/Vector_space/M1pxMLx2">vector space</a>.
Take any vector space T. A function f on T is called linear if, for any vectors v, w and scalar α:

\[ \begin{aligned} f(\mathbf{v}+\mathbf{w}) &= f(\mathbf{v}) + f(\mathbf{w}) \\ f(\alpha\mathbf{v}) &= \alpha f(\mathbf{v}) \end{aligned} \]

A simple example is the function which assigns to a vector the value of one of its components (called a projection function). It takes a vector as its argument and returns a real number, the value of that component.
All such scalar-valued linear functions together form a vector space, called the dual space of T. The sum f+g is again a linear function for linear f and g, and the same holds for scalar multiplication αf.
Given a basis \(\mathbf{e}_{i}\) for T, we can define a basis for the dual space, called the dual basis, in a natural way by taking the set of linear functions mentioned above: the projection functions. Each projection function (denoted ω and indexed by i) produces the number 1 when applied to one of the basis vectors \(\mathbf{e}_{i}\) and zero when applied to the others. For example, \(\omega^{0}\) gives 1 on \(\mathbf{e}_{0}\) and zero elsewhere. Applying this linear function \(\omega^{0}\) to a vector \(\mathbf{v}=v^{i}\mathbf{e}_{i}\) gives (using its linearity)

\[ \omega^{0}(\mathbf{v}) = \omega^{0}(v^{i}\mathbf{e}_{i}) = v^{i}\,\omega^{0}(\mathbf{e}_{i}) = v^{0} \]

so just the value of the first coordinate. For this reason it is called the projection function.
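The projection property of the dual basis can be made concrete in two dimensions. The following sketch assumes a particular non-orthogonal basis of the plane and represents each dual basis element as a row of coefficients. The basis, the helper names and the representation are illustrative assumptions, not part of the text above.

```python
# A deliberately non-orthogonal basis e_0, e_1 of the plane, written in standard
# Cartesian components (an illustrative choice).
basis = [(2.0, 0.0), (1.0, 1.0)]

def dual_basis(e0, e1):
    """Return the dual basis (omega^0, omega^1) as rows of coefficients.

    With the basis vectors as the columns of a matrix E, the rows of the
    inverse of E satisfy omega^i(e_j) = 1 if i == j and 0 otherwise.
    """
    a, c = e0  # first column of E
    b, d = e1  # second column of E
    det = a * d - b * c
    if det == 0:
        raise ValueError("the vectors are linearly dependent and form no basis")
    return [(d / det, -b / det), (-c / det, a / det)]

def evaluate(omega, vector):
    """Apply the linear function omega to a vector (both in Cartesian components)."""
    return sum(w * x for w, x in zip(omega, vector))

omega = dual_basis(*basis)

# Table of omega^i(e_j): the identity matrix.
pairings = [[evaluate(omega[i], basis[j]) for j in range(2)] for i in range(2)]

# Build v = v^i e_i from chosen components, then recover them by projection.
components = (3.0, -4.0)
v = tuple(sum(components[i] * basis[i][k] for i in range(2)) for k in range(2))
recovered = [evaluate(w, v) for w in omega]

print(pairings, recovered)
```

Applying each dual basis element to v returns the corresponding component of v on the chosen basis, which is the sense in which these functions are projections.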
There are as many dual basis vectors \(\omega^{i}\) as there are basis vectors \(\mathbf{e}_{i}\), so the dual space has the same dimension as the linear space itself. It is "almost the same space", except that the components of elements of the dual space (called dual vectors) transform covariantly and the components of elements of the tangent vector space transform contravariantly.
Sometimes an extra notation is introduced where the real value of a linear function σ on a tangent vector u is given as

\[ \sigma[\mathbf{u}] := \langle\sigma,\mathbf{u}\rangle \]

where \(\langle\sigma,\mathbf{u}\rangle\) is a real number. This notation emphasizes the bilinear character of the form. It is linear in u since σ is a linear function, and it is linear in σ since sums and scalar multiples of linear functions are defined pointwise.
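That this real number does not depend on the coordinate system can be seen in components. The short derivation below is a sketch under stated assumptions: σ is expanded as σ_i ω^i on the dual basis and u as u^i e_i, the components σ_i transform covariantly, and the components u^i transform contravariantly. The component symbols σ_i and u^i are notation introduced here for the illustration.

```latex
% Assumed component notation: \sigma = \sigma_i \omega^i, \mathbf{u} = u^i \mathbf{e}_i
\begin{align}
\langle\sigma,\mathbf{u}\rangle
  &= \sigma_{i}\,u^{j}\,\omega^{i}(\mathbf{e}_{j}) = \sigma_{i}\,u^{i} \\
{\sigma'}_{i}\,{u'}^{i}
  &= \left(\frac{\partial x^{j}}{\partial {x'}^{i}}\,\sigma_{j}\right)
     \left(\frac{\partial {x'}^{i}}{\partial x^{k}}\,u^{k}\right)
   = \delta^{j}{}_{k}\,\sigma_{j}\,u^{k}
   = \sigma_{j}\,u^{j}
\end{align}
```

The two transformation matrices are inverse to each other, so they cancel and the pairing has the same value in both systems.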

<h2 id="co--and-contravariant-tensor-components">Co- and contravariant tensor components</h2>
<h3>Without coordinates</h3>
A <a href="/facts/Tensor/pNjFo5tV">tensor</a> of <a href="/facts/Type_of_a_tensor/I5H8r3a2">type (r, s)</a> may be defined as a real-valued multilinear function of r dual vectors and s vectors. Since vectors and dual vectors may be defined without dependence on a coordinate system, a tensor defined in this way is independent of the choice of a coordinate system.
The notation of a tensor is

\[ \begin{aligned} &T\left(\sigma,\ldots,\rho,\mathbf{u},\ldots,\mathbf{v}\right) \\ \equiv{} &{T^{\sigma\ldots\rho}}_{\mathbf{u}\ldots\mathbf{v}} \end{aligned} \]

for dual vectors (differential forms) ρ, σ and tangent vectors \(\mathbf{u},\mathbf{v}\). In the second notation the distinction between vectors and differential forms is more obvious.

<h3>With coordinates</h3>
Because a tensor depends linearly on its arguments, it is completely determined if one knows the values on a basis \(\omega^{i}\ldots\omega^{j}\) and \(\mathbf{e}_{k}\ldots\mathbf{e}_{l}\)

\[ T(\omega^{i},\ldots,\omega^{j},\mathbf{e}_{k},\ldots,\mathbf{e}_{l}) = {T^{i\ldots j}}_{k\ldots l} \]

The numbers \({T^{i\ldots j}}_{k\ldots l}\) are called the components of the tensor on the chosen basis.
If we choose another basis (whose vectors are linear combinations of the original basis vectors), we can use the linearity of the tensor to find that the tensor components in the upper indices transform like the dual basis vectors (so contravariantly), whereas the lower indices transform like the basis of tangent vectors and are thus covariant. For a tensor of rank 2, we can verify that

\[ {A'}_{ij} = \frac{\partial x^{l}}{\partial {x'}^{i}}\frac{\partial x^{m}}{\partial {x'}^{j}}\,A_{lm} \qquad \text{(covariant tensor)} \]

\[ {A'}^{ij} = \frac{\partial {x'}^{i}}{\partial x^{l}}\frac{\partial {x'}^{j}}{\partial x^{m}}\,A^{lm} \qquad \text{(contravariant tensor)} \]
For a mixed co- and contravariant tensor of rank 2

\[ {A'}^{i}{}_{j} = \frac{\partial {x'}^{i}}{\partial x^{l}}\frac{\partial x^{m}}{\partial {x'}^{j}}\,A^{l}{}_{m} \qquad \text{(mixed co- and contravariant tensor)} \]
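The three rank-2 rules can be tried out with a linear change of coordinates, for which the partial derivatives are constant matrices. The sketch below assumes a specific 2×2 matrix M with x'^i = M[i][l] x^l, along with sample components. All names and numbers are illustrative assumptions rather than anything prescribed by the text.

```python
# Linear change of coordinates x'^i = M[i][l] x^l (an illustrative choice), so
# dx'^i/dx^l = M[i][l] and dx^l/dx'^i = M_inv[l][i].
M = [[2.0, 1.0], [0.0, 1.0]]
M_inv = [[0.5, -0.5], [0.0, 1.0]]
R = range(2)

def transform_covariant(A):
    """A'_{ij} = (dx^l/dx'^i) (dx^m/dx'^j) A_{lm}."""
    return [[sum(M_inv[l][i] * M_inv[m][j] * A[l][m] for l in R for m in R)
             for j in R] for i in R]

def transform_contravariant(A):
    """A'^{ij} = (dx'^i/dx^l) (dx'^j/dx^m) A^{lm}."""
    return [[sum(M[i][l] * M[j][m] * A[l][m] for l in R for m in R)
             for j in R] for i in R]

def transform_mixed(A):
    """A'^i_j = (dx'^i/dx^l) (dx^m/dx'^j) A^l_m."""
    return [[sum(M[i][l] * M_inv[m][j] * A[l][m] for l in R for m in R)
             for j in R] for i in R]

def transform_vector(v):
    """Contravariant components: v'^i = (dx'^i/dx^j) v^j."""
    return [sum(M[i][j] * v[j] for j in R) for i in R]

def contract(A_cov, a, b):
    """The number A_{ij} a^i b^j."""
    return sum(A_cov[i][j] * a[i] * b[j] for i in R for j in R)

def trace(B):
    """The contraction B^i_i of a mixed tensor."""
    return sum(B[i][i] for i in R)

A = [[1.0, 2.0], [3.0, 4.0]]  # sample components, reused for each index type
u = [1.0, -1.0]
v = [2.0, 5.0]
u_new = transform_vector(u)
v_new = transform_vector(v)

# A covariant tensor applied to two vectors gives a coordinate-independent number.
scalar_old = contract(A, u, v)
scalar_new = contract(transform_covariant(A), u_new, v_new)

# The product u^i v^j is a contravariant tensor: transforming it agrees with
# forming the product of the transformed vectors.
outer_old = [[u[i] * v[j] for j in R] for i in R]
outer_new = transform_contravariant(outer_old)
outer_direct = [[u_new[i] * v_new[j] for j in R] for i in R]

# The trace of a mixed tensor is unchanged by the transformation.
trace_old = trace(A)
trace_new = trace(transform_mixed(A))

print(scalar_old, scalar_new, trace_old, trace_new)
```

In each check the factors of M and its inverse cancel in pairs, which is why contractions of an upper index with a lower index give coordinate-independent numbers.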
<h2 id="see-also">See also</h2>
<ul><li><a href="/facts/Covariance_and_contravariance_of_vectors/s3HGMkhz">Covariance and contravariance of vectors</a></li>
<li><a href="/facts/General_covariance/KzTG2t9g">General covariance</a></li>
<li><a href="/facts/Lorentz_covariance/K5AEuGKg">Lorentz covariance</a></li></ul>

