COVARIANCE AND CONTRAVARIANCE

First we
will explain the distinction between the covariant and contravariant
components of vectors, thinking of vector-fields where a vector is
defined at a point rather than as a position vector. This extends
naturally to the components of higher order tensors. Strictly
speaking, despite usage to the contrary, there is no such thing as a
“covariant vector” or a “contravariant vector”. A vector is
a vector is a vector. However, it may be handled in two ways. Firstly,
we may handle it by means of its components parallel to the coordinate
directions, which form the sides of a parallelogram in the
two-dimensional case, in the same way that dx and dy are
defined as the sides of the parallelogram related to an infinitesimal
displacement ds. These
components are referred to as its contravariant components.
Secondly we may handle it by means of its resolved parts along
the coordinate directions, which are its covariant components.
The latter are the inner products of the vector with the coordinate
unit vectors. The distinction is
important e.g. when finding inner products such as F.s
for the work done by a force F
producing a displacement s. We will follow that up later. We
will work with vectors in two dimensions to
illustrate the principles involved. We will use non-orthogonal
Cartesian coordinates, i.e. coordinates defined relative to
non-orthogonal axes. However, tensors are especially concerned with
the use of curvilinear coordinates, where vectors and tensors are
referred to curved coordinate lines which approach linearity at
infinitesimal distances. The coordinate axes used below should be
regarded as the tangents to such coordinate lines in such cases, and
vectors as directed magnitudes at an origin O which is a local point
in a field. The coordinate directions thus vary as O is varied. This
covers cases where both coordinates are of the same type
(polar coordinates in two dimensions are an example where they are
not).
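The two kinds of component described above can be illustrated numerically. The following is a minimal sketch, assuming unit vectors along two non-orthogonal axes OX and OY; the function names, the angle between the axes (`axis_angle`) and the sample vector are illustrative choices, not taken from the text. The contravariant components are found by solving for the sides of the parallelogram, and the covariant components by taking inner products with the unit vectors along the axes.

```python
import math

def unit(angle_deg):
    """Unit vector at the given angle, in an orthonormal reference frame."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))

def contravariant(v, e1, e2):
    """Parallelogram components: solve c1*e1 + c2*e2 = v (Cramer's rule)."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    c1 = (v[0] * e2[1] - v[1] * e2[0]) / det
    c2 = (e1[0] * v[1] - e1[1] * v[0]) / det
    return (c1, c2)

def covariant(v, e1, e2):
    """Resolved parts: inner products of v with the unit vectors along the axes."""
    return (v[0] * e1[0] + v[1] * e1[1], v[0] * e2[0] + v[1] * e2[1])

# Assumed example: OY at 60 degrees to OX, vector of length 2 at 20 degrees to OX.
axis_angle, psi, r = 60.0, 20.0, 2.0
e1, e2 = unit(0.0), unit(axis_angle)
v = (r * math.cos(math.radians(psi)), r * math.sin(math.radians(psi)))

contra = contravariant(v, e1, e2)
co = covariant(v, e1, e2)
print("contravariant components:", contra)
print("covariant components:   ", co)
```

When the axes are orthogonal (`axis_angle = 90.0`) the two sets of components coincide, which is why the distinction goes unnoticed in ordinary Cartesian work.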
Contravariant Components

The components of a vector in two dimensions are
defined in
the literature in relation to a change of coordinates from (x,y) to
(x',y'), say. The contravariant components are those which transform as follows, e.g. for the new
coordinate x' in terms of the old (x,y):
$$x' = \frac{\partial x'}{\partial x}\,x + \frac{\partial x'}{\partial y}\,y \qquad (1)$$
and
similarly for y'. This is far from obvious at first sight, so we
will show how the partial derivatives relate to the geometry.
The vector
at O is represented by OV and the parallelogram-component on the axis
OX is OA, where VA is parallel to the axis OY. We will only
illustrate the situation for the x-components. If we change
coordinates to OX', OY' then the new x-component is OA' where VA' is
parallel to OY'. Now we join A to P on OX' such that AP is parallel
to OY'. Using the sine rule we get
where γ=φ+β-α
and μ=180-φ-β. Noting that
OA'=x', OA=x and AV=y, partial differentiation of
this
with respect to x gives
from
triangle VQA', holding x
constant, giving from (2)
as
required. A similar argument holds for the new y coordinate. The
generalised version of (1) for more than two dimensions, using
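overlines instead of primes, is
$$\bar{x}^j = \sum_k \frac{\partial \bar{x}^j}{\partial x^k}\,x^k$$
or, using the repeated-index summing convention for k,
$$\bar{x}^j = \frac{\partial \bar{x}^j}{\partial x^k}\,x^k \qquad (3)$$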
For the
contravariant components it is customary to use superscripts for the indices
such as j and k. Useful expressions
for the contravariant coordinates of
OV are, using the sine rule, (4)

Covariant Components

The
covariant components of a vector are defined by the transformation
$$\bar{x}_j = \frac{\partial x^k}{\partial \bar{x}^j}\,x_k \qquad (5)$$
using subscripts for the indices in the covariant
case. For
the x-coordinate in two dimensions this is
where the partial derivatives are "inverted"
compared with the contravariant case.
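The two transformation laws can be checked numerically for a linear change between two sets of oblique axes. This is a minimal sketch under stated assumptions: the axis angles, the sample vector and all function names are illustrative, and the coordinate change is taken to be linear so that the partial derivatives are constants. Here `J[j][k]` stands for the derivative of the new coordinate j with respect to the old coordinate k, and `K[k][j]` for the "inverted" derivative of the old coordinate k with respect to the new coordinate j.

```python
import math

def unit(angle_deg):
    """Unit vector at the given angle, in an orthonormal reference frame."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))

def components(v, b1, b2):
    """Parallelogram (contravariant) components of v along axes b1, b2."""
    det = b1[0] * b2[1] - b1[1] * b2[0]
    return ((v[0] * b2[1] - v[1] * b2[0]) / det,
            (b1[0] * v[1] - b1[1] * v[0]) / det)

def dot(u, w):
    return u[0] * w[0] + u[1] * w[1]

# Assumed old axes (OX, OY) and new axes (OX', OY'), both non-orthogonal.
e = (unit(0.0), unit(60.0))
f = (unit(10.0), unit(85.0))
v = (1.5, 0.8)  # an arbitrary vector at O

# J[j][k] = d(new coordinate j)/d(old coordinate k): new components of old axis k.
cols = [components(e[k], f[0], f[1]) for k in range(2)]
J = [[cols[k][j] for k in range(2)] for j in range(2)]

# K[k][j] = d(old coordinate k)/d(new coordinate j): old components of new axis j.
cols_inv = [components(f[j], e[0], e[1]) for j in range(2)]
K = [[cols_inv[j][k] for j in range(2)] for k in range(2)]

old_contra = components(v, e[0], e[1])
old_co = (dot(v, e[0]), dot(v, e[1]))

# Apply the transformation laws to the old components.
new_contra_law = [sum(J[j][k] * old_contra[k] for k in range(2)) for j in range(2)]
new_co_law = [sum(K[k][j] * old_co[k] for k in range(2)) for j in range(2)]

# Compare with components measured directly against the new axes.
new_contra_direct = components(v, f[0], f[1])
new_co_direct = (dot(v, f[0]), dot(v, f[1]))
print(new_contra_law, new_contra_direct)
print(new_co_law, new_co_direct)
```

Each printed pair should agree to rounding error: the parallelogram components follow the law with the derivatives of new with respect to old, and the resolved parts follow the law with the inverted derivatives.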
Figure 3

Then
(7)
Solving for θ gives
which by (7)
is which by (8)
is
We now encounter a subtlety of the meaning of the
"inverted" partial derivatives, for they refer to the coordinates which
are contravariant, so we must relate this back to them as follows:

Figure 4

If OX'=δx',
OX=δx and OY=δy then using the sine rule
in the infinitesimal case we get
showing
that (9) is the same as (6), as required. For more than two
dimensions the principle is the same but OV is no longer necessarily
in a coordinate plane. We have thus exhibited
how the geometrical interpretation of covariance and contravariance
relates to the formal definitions when the components are of the same
type.

The distinction between contravariance and covariance is important, e.g. when finding inner products such as F.s for the work W done by a force F producing a displacement s. We take the inner product of the two vectors, which usually means resolving F along the direction of s. The actual evaluation of W amounts to summing the products of the coordinate-system components of s and the resolved parts of F. That is, we sum the products of the contravariant components of s and the covariant components of F, as for an inner product of vectors. To use instead the contravariant components of F (which are perfectly respectable quantities) would, with non-orthogonal axes, give the wrong result for W. However, we may instead use the covariant components of s multiplied by the contravariant ones of F and get the correct result, but it seems an unnatural way to handle the problem. It is more natural to handle F by means of its covariant components, which is perhaps why the loose description of a force as a “covariant vector” has crept in. Similarly, s is most naturally handled by means of its parallelogram components.

We will now show
how this works
explicitly. Applying
(4) to the vector s
represented by OV of length s
as in Figure 2, but at an angle ψ to
OX, gives
The covariant
components of F represented by OV
as in Figure 3 are:
and combining the two gives
the inner product in tensor
form:
which is the standard expression for the inner product. If we change the coordinate system then the covariant components of F will change such that the above inner product remains invariant (and valid!). This may explain the use of the term "covariant" for such components.

Generally a tensor is characterised by a set of functions defining how its components vary with the coordinates. A set of functions constitutes a tensor if its components transform according to (3) or (5). Another test is to multiply a set of functions by a tensor, and if the result is a tensor then so are those functions. To find out whether the functions are the simplest possible for a tensor is more difficult, remembering that the tensor is an entity that is described by the functions, just as a velocity is an independent physical entity that may be described in various ways.

Such an entity exists independently of the coordinates used to describe it, since any equations involving it will, in view of (3) and (5), be the same in any coordinate system, e.g. work done expressed by an inner product. However, the functions may prove to be simpler in one coordinate system than another, e.g. a radial electric field is better described in polar coordinates than in Cartesian coordinates.
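The invariance of the work can be checked with a short numerical sketch. The force, displacement, axis angles and function names below are assumed purely for illustration. For each pair of axes the work is formed as the sum of the contravariant components of s multiplied by the covariant components of F, and compared with the ordinary inner product in an orthonormal reference frame.

```python
import math

def unit(angle_deg):
    """Unit vector at the given angle, in an orthonormal reference frame."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))

def contravariant(v, e1, e2):
    """Parallelogram components: solve c1*e1 + c2*e2 = v."""
    det = e1[0] * e2[1] - e1[1] * e2[0]
    return ((v[0] * e2[1] - v[1] * e2[0]) / det,
            (e1[0] * v[1] - e1[1] * v[0]) / det)

def covariant(v, e1, e2):
    """Resolved parts: inner products with the unit vectors along the axes."""
    return (v[0] * e1[0] + v[1] * e1[1], v[0] * e2[0] + v[1] * e2[1])

def work(F, s, e1, e2):
    """Sum of (contravariant s) times (covariant F) over the two axes."""
    s_up = contravariant(s, e1, e2)
    F_dn = covariant(F, e1, e2)
    return s_up[0] * F_dn[0] + s_up[1] * F_dn[1]

def naive(F, s, e1, e2):
    """Contravariant components of both vectors: wrong for oblique axes."""
    s_up = contravariant(s, e1, e2)
    F_up = contravariant(F, e1, e2)
    return s_up[0] * F_up[0] + s_up[1] * F_up[1]

# Assumed force and displacement, given in the orthonormal reference frame.
F = (3.0, 1.0)
s = (2.0, 2.0)
W_ref = F[0] * s[0] + F[1] * s[1]

# Assumed axis directions (degrees): one orthogonal pair, two oblique pairs.
axis_pairs = [(0.0, 90.0), (0.0, 60.0), (15.0, 120.0)]
results = [(work(F, s, unit(a1), unit(a2)), naive(F, s, unit(a1), unit(a2)))
           for a1, a2 in axis_pairs]
for (a1, a2), (w, n) in zip(axis_pairs, results):
    print(f"axes at {a1}, {a2} deg: W = {w:.6f}, naive sum = {n:.6f}, reference = {W_ref}")
```

The mixed sum agrees with the reference value for every choice of axes, whereas the sum built from two sets of contravariant components agrees only when the axes are orthogonal.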