Mathematical Appendix
Vectors
Physical effects involve things acting on other things to produce a change of position, tension etc. These effects usually depend upon the strength, angle of contact, separation etc. of the interacting things rather than on any absolute reference frame. It is therefore useful to describe the rules that govern the interactions in terms of the relative positions and lengths of the interacting things rather than in terms of any fixed viewpoint or coordinate system. Vectors were introduced in physics to allow such relative descriptions.
The use of vectors in elementary physics often avoids any real understanding of what they are. They are a new concept, as unique as numbers themselves, which have been related to the rest of mathematics and geometry by a series of formulae such as linear combinations, scalar products etc.
Vectors are defined as "directed line segments", which means they are lines drawn in a particular direction. The introduction of time as a geometric entity makes this definition rather archaic; a better definition might be that a vector is information arranged as a continuous succession of points in space and time. Vectors have length and direction, the direction being from earlier to later.
Vectors are represented by lines terminated with arrow symbols to show the direction. A point that moves from the left to the right for about three centimetres can be represented as:
------------------->
If a vector is represented within a coordinate system it has components along each of the axes of the system. These components do not normally start at the origin of the coordinate system.
The vector represented by the bold arrow has components a, b and c which are lengths on the coordinate axes. If the vector starts at the origin the components become simply the coordinates of the end point of the vector and the vector is known as the position vector of the end point.
Addition of Vectors
If two vectors are connected so that the end point of one is the start of the next the sum of the two vectors is defined as a third vector drawn from the start of the first to the end of the second:
c is the sum of a and b:
c = a + b
If the components of a are a, b, c and the components of b are d, e, f then the components of the sum of the two vectors are (a+d), (b+e) and (c+f). In other words, when vectors are added it is the components that add numerically rather than the lengths of the vectors themselves.
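The component rule can be sketched in a few lines of Python. This is only an illustration: the helper names `add_vectors` and `length` and the sample vectors are assumptions made for the example, not part of the original text.

```python
import math

def add_vectors(u, v):
    """Add two vectors given as equal-length sequences of components."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return tuple(x + y for x, y in zip(u, v))

def length(u):
    """Length of a vector from its components (Pythagoras' theorem)."""
    return math.sqrt(sum(x * x for x in u))

# Hypothetical sample vectors chosen so that the lengths are whole numbers.
a = (1.0, 2.0, 2.0)   # length 3
b = (3.0, 0.0, 4.0)   # length 5
c = add_vectors(a, b)

print(c)                      # (4.0, 2.0, 6.0)
print(length(a) + length(b))  # 8.0
print(length(c))              # about 7.48, not 8: lengths do not simply add
```

The components add numerically, whereas the length of the sum is smaller than the sum of the lengths unless the two vectors point in the same direction.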
Rules of Vector Addition
1. Commutativity a + b = b + a
2. Associativity (a + b) + c = a + (b + c)
If the zero vector (which has no length) is labelled as 0
3. a + (-a) = 0
4. a + 0 = a
Rules of Vector Multiplication by a Scalar
The discussion of components and vector addition shows that if vector a has components a, b, c then qa has components qa, qb, qc. The meaning of multiplying a vector by a scalar is shown below:
The bottom vector c is added three times which is equivalent to multiplying it by 3.
1. Distributive laws q(a + b) = qa + qb and (q + p)a = qa + pa
2. Associativity q(pa) = (qp)a
Also 1 a = a
If the rules of vector addition and multiplication by a scalar apply to a set of elements they are said to define a vector space.
Linear Combinations and Linear Dependence
An element of the form:
q1a1 + q2a2 + q3a3 +.... + qmam
is called a linear combination of the vectors.
The set of all linear combinations of a given set of vectors is called the span of those vectors. The word span is used because the scalars (q) can have any value, which means that every point in the subset of the vector space defined by the span can be reached by some linear combination of the vectors.
Suppose there is a set of vectors (a1, a2, ..., am). If it is possible to express one of these vectors as a linear combination of the others then the set is said to be linearly dependent. If it is not possible to express any one of the vectors as a linear combination of the others, the set is said to be linearly independent.
In other words, if there are values of the scalars such that:
(1). a1 = q2a2 + q3a3 +.... + qmam
the set is said to be linearly dependent.
There is a way of determining linear dependence. From (1) it can be seen that if q1 is set to minus one then:
q1a1 + q2a2 + q3a3 +.... + qmam = 0
So in general, if a linear combination in which the scalars are not all zero sums to the zero vector then the set of vectors (a1, a2, ..., am) is linearly dependent.
If two vectors are linearly dependent then they lie along the same line (wherever a and b lie on the line, scalars can be found to produce a linear combination which is a zero vector). If three vectors are linearly dependent they lie on the same line or on a plane (collinear or coplanar).
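One way to test a set of vectors for linear dependence numerically is to count how many independent rows remain after Gaussian elimination. The sketch below is an assumption-laden illustration rather than a definitive method: the function names `rank` and `is_linearly_dependent` and the tolerance `tol` are introduced only for this example.

```python
def rank(vectors, tol=1e-12):
    """Number of linearly independent vectors, found by Gaussian elimination."""
    rows = [[float(x) for x in v] for v in vectors]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    r = 0  # index of the next pivot row
    for col in range(n_cols):
        if r == n_rows:
            break
        # Choose the remaining row with the largest entry in this column.
        pivot = max(range(r, n_rows), key=lambda i: abs(rows[i][col]))
        if abs(rows[pivot][col]) <= tol:
            continue  # no usable pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            for j in range(col, n_cols):
                rows[i][j] -= factor * rows[r][j]
        r += 1
    return r

def is_linearly_dependent(vectors):
    """True if some vector is a linear combination of the others."""
    return rank(vectors) < len(vectors)

print(is_linearly_dependent([(1, 2, 3), (2, 4, 6)]))             # True: same line
print(is_linearly_dependent([(1, 0, 0), (0, 1, 0), (1, 1, 0)]))  # True: same plane
print(is_linearly_dependent([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))  # False
```

The first two cases correspond to the collinear and coplanar situations described above.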
Dimension
If a vector space contains a set of n linearly independent vectors but every set of n+1 vectors is linearly dependent, the space is said to have a dimension of n. Such a set of n linearly independent vectors is said to be a basis of the vector space.
Scalar Product
Also known as the 'dot product' or 'inner product'. The scalar product is a way of removing the problem of angular measures from the relationship between vectors and, as Weyl put it, a way of comparing the lengths of vectors that are arbitrarily inclined to each other.
Consider two vectors with a common origin:
The projection of a on b is:
P = | a | cos θ
Where | a | is the length of a.
The scalar product is defined as:
(2) a . b = | a | | b | cos θ
Notice that cos θ is zero if a and b are perpendicular. This means that if the scalar product of two non-zero vectors is zero the vectors are orthogonal (perpendicular to each other).
(2) also allows cos θ to be defined as:
cos θ = a . b / ( | a | | b |)
The definition of the scalar product also allows a definition of the length of a vector in terms of the concept of a vector itself. The scalar product of a vector with itself is:
a . a = | a | | a | cos 0
cos 0 (the cosine of zero) is one so:
a . a = | a |²
which is our first direct relationship between vectors and scalars. This can be expressed as:
(3) | a | = (a . a)^(1/2)
where | a | is the length of a.
Properties:
1. Linearity [Ga + Hb].c = Ga.c + Hb.c
2. symmetry a.b = b.a
3. Positive definiteness a.a is greater than or equal to 0
4. Distributivity for vector addition (a + b).c = a.c + b.c
5. Schwarz inequality | a.b | ≤ | a | | b |
6. Parallelogram equality | a + b |² + | a - b |² = 2( | a |² + | b |²)
From the point of view of vector physics the most important property of the scalar product is the expression of the scalar product in terms of coordinates.
7. a.b = a1b1 + a2b2 + a3b3
This gives us the length of a vector in terms of coordinates (Pythagoras' theorem) from:
8. a.a = | a |² = (a1)² + (a2)² + (a3)²
The derivation of 7 is:
a = a1i + a2j + a3k
where i, j, k are unit vectors along the coordinate axes. From property 4 (distributivity):
a.b = (a1i + a2j + a3k) .b = a1i.b + a2j.b + a3k.b
but b = b1i + b2j + b3k
so:
a.b = b1a1i .i + b2a1i .j + b3a1i .k + b1a2j.i + b2a2j.j + b3a2j.k + b1a3k.i + b2a3k.j + b3a3k.k
i .j, i .k, j .k, etc. are all zero because the vectors are orthogonal, also i .i, j.j and k.k are all one (these are unit vectors defined to be 1 unit in length).
Using these results:
a.b = a1b1 + a2b2 + a3b3
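The coordinate form of the scalar product, the length formula and the expression for cos θ can be tried out directly. The following is a minimal sketch; the function names `dot`, `length` and `angle` and the sample vectors are assumptions made for illustration.

```python
import math

def dot(a, b):
    """Scalar product from components: a1*b1 + a2*b2 + a3*b3."""
    return sum(x * y for x, y in zip(a, b))

def length(a):
    """Length of a vector as the square root of its scalar product with itself."""
    return math.sqrt(dot(a, a))

def angle(a, b):
    """Angle between two non-zero vectors, in radians."""
    return math.acos(dot(a, b) / (length(a) * length(b)))

# Hypothetical sample vectors.
a = (1.0, 2.0, 2.0)
b = (3.0, 0.0, 4.0)

print(dot(a, b))                    # 11.0
print(length(a), length(b))         # 3.0 5.0
print(math.degrees(angle(a, b)))    # the angle between a and b in degrees

# Unit vectors along different axes are orthogonal, so their product is zero.
print(dot((1, 0, 0), (0, 1, 0)))    # 0
```

The Schwarz inequality can be checked on the same data: the magnitude of `dot(a, b)` is 11, which does not exceed `length(a) * length(b)`, which is 15.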
Matrices
Matrices are sets of numbers arranged in a rectangular array. They are especially important in linear algebra because they can be used to represent the elements of linear equations.
11a + 2b = c
5a + 7b = d
The constants in the equation above can be represented as a matrix:
$$A = \begin{pmatrix} 11 & 2 \\ 5 & 7 \end{pmatrix}$$
The elements of matrices are usually denoted symbolically using lower case letters:
$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$
Matrices are said to be equal if all of the corresponding elements are equal.
ie: if aij = bij for every i and j
then A = B
Matrix Addition
Matrices are added by adding the individual elements of one matrix to the corresponding elements of the other matrix.
cij= aij + bij
or C = A + B
Matrix addition has the following properties:
1. Commutativity A + B = B + A
2. Associativity (A + B) + C = A + (B + C)
and
3. A + (-A) = 0
4. A + 0 = A
From matrix addition it can be seen that the product of a matrix A and a number p is simply pA where every element of the matrix is multiplied individually by p.
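Element-wise addition and multiplication by a number can be sketched as follows, with matrices stored as lists of rows. The helper names `mat_add` and `scalar_mul` and the second matrix `B` are assumptions introduced for the example; `A` reuses the coefficient matrix given earlier.

```python
def mat_add(A, B):
    """Add two matrices of the same shape element by element."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same shape")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(p, A):
    """Multiply every element of the matrix A by the number p."""
    return [[p * a for a in row] for row in A]

A = [[11, 2], [5, 7]]
B = [[1, 0], [-5, 3]]   # hypothetical second matrix

print(mat_add(A, B))     # [[12, 2], [0, 10]]
print(scalar_mul(3, A))  # [[33, 6], [15, 21]]

# Adding A to itself three times gives the same result as multiplying by 3.
print(mat_add(mat_add(A, A), A) == scalar_mul(3, A))  # True
```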
Transpose of a Matrix
A matrix is transposed when the rows and columns are interchanged:
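$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$
$$A^T = \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}$$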
Notice that the principal diagonal elements stay the same after transposition.
A matrix is symmetric if it is equal to its transpose, ie: akj = ajk.
It is skew symmetric if A^T = -A, ie: akj = -ajk. The principal diagonal of a skew symmetric matrix is composed of elements that are zero.
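A short sketch of transposition and of the two symmetry tests follows. The function names and the sample matrices `S` and `K` are assumptions chosen for illustration.

```python
def transpose(A):
    """Interchange the rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def is_symmetric(A):
    """True if A equals its transpose."""
    return transpose(A) == A

def is_skew_symmetric(A):
    """True if the transpose of A equals -A."""
    return transpose(A) == [[-x for x in row] for row in A]

# Hypothetical sample matrices.
S = [[1, 2, 3],
     [2, 5, 6],
     [3, 6, 9]]
K = [[0, 2, -1],
     [-2, 0, 4],
     [1, -4, 0]]

print(transpose([[1, 2, 3], [4, 5, 6]]))  # [[1, 4], [2, 5], [3, 6]]
print(is_symmetric(S))                    # True
print(is_skew_symmetric(K))               # True
print([K[i][i] for i in range(3)])        # [0, 0, 0]
```

The last line shows the zero principal diagonal that a skew symmetric matrix must have, since each diagonal element has to equal its own negative.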
Other Types of Matrix
Diagonal matrix: all elements above and below the principal diagonal are zero.
4 0 0
0 -1 0
0 0 2
Unit matrix: denoted by I, is a diagonal matrix where all elements of the principal diagonal are 1.
1 0 0
0 1 0
0 0 1
Matrix Multiplication
Matrix multiplication is defined in terms of the problem of determining the coefficients in linear transformations.
Consider a set of linear transformations between 2 coordinate systems that share a common origin and are related to each other by a rotation of the coordinate axes.
Two Coordinate Systems Rotated Relative to Each Other
If there are 3 coordinate systems, x, y, and z these can be transformed from one to another:
x1 = a11y1 + a12y2
x2 = a21y1 + a22y2
y1 = b11z1 + b12z2
y2 = b21z1 + b22z2
x1 = c11z1 + c12z2
x2 = c21z1 + c22z2
By substitution:
x1 = a11(b11z1 + b12z2) + a12(b21z1 + b22z2)
x2 = a21(b11z1 + b12z2) + a22(b21z1 + b22z2)
x1 = (a11b11 + a12b21)z1 + (a11b12 + a12b22)z2
x2 = (a21b11 + a22b21)z1 + (a21b12 + a22b22)z2
Therefore:
c11 = (a11b11 + a12b21) c12 = (a11b12 + a12b22)
c21 = (a21b11 + a22b21) c22 = (a21b12 + a22b22)
The coefficient matrices are:
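$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$
$$B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}$$
$$C = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}$$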
From the linear transformation the product of A and B is defined as:
$$C = AB = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{pmatrix}$$
In the discussion of scalar products it was shown that, for a plane the scalar product is calculated as: a.b = a1b1 + a2b2 where a and b are the coordinates of the vectors a and b.
Now mathematicians use a sleight of hand that probably deserves closer inspection; they define the rows and columns of a matrix as vectors:
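A column vector is
$$\mathbf{b} = \begin{pmatrix} b_{11} \\ b_{21} \end{pmatrix}$$
and a row vector is
$$\mathbf{a} = \begin{pmatrix} a_{11} & a_{12} \end{pmatrix}$$
Matrices can be described as vectors, eg:
$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \begin{pmatrix} \mathbf{a}_1 \\ \mathbf{a}_2 \end{pmatrix}$$
and
$$B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} \mathbf{b}_1 & \mathbf{b}_2 \end{pmatrix}$$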
Matrix multiplication is then defined as the scalar products of the vectors so that:
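$$C = \begin{pmatrix} \mathbf{a}_1 \cdot \mathbf{b}_1 & \mathbf{a}_1 \cdot \mathbf{b}_2 \\ \mathbf{a}_2 \cdot \mathbf{b}_1 & \mathbf{a}_2 \cdot \mathbf{b}_2 \end{pmatrix}$$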
From the definition of the scalar product a1.b1 = a11b11 + a12b21 etc.
In the general case:
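$$C = \begin{pmatrix} \mathbf{a}_1 \cdot \mathbf{b}_1 & \mathbf{a}_1 \cdot \mathbf{b}_2 & \cdots & \mathbf{a}_1 \cdot \mathbf{b}_n \\ \mathbf{a}_2 \cdot \mathbf{b}_1 & \mathbf{a}_2 \cdot \mathbf{b}_2 & \cdots & \mathbf{a}_2 \cdot \mathbf{b}_n \\ \vdots & \vdots & & \vdots \\ \mathbf{a}_m \cdot \mathbf{b}_1 & \cdots & \cdots & \mathbf{a}_m \cdot \mathbf{b}_n \end{pmatrix}$$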
This is described as the multiplication of rows into columns (eg: row vectors into column vectors). The first matrix must have the same number of columns as there are rows in the second matrix or the multiplication is undefined.
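After matrix multiplication the product matrix has the same number of rows as the first matrix and the same number of columns as the second matrix. For example, the following product has 2 rows and 1 column:
$$\begin{pmatrix} 1 & 3 & 4 \\ 6 & 3 & 2 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \\ 7 \end{pmatrix} = \begin{pmatrix} 39 \\ 35 \end{pmatrix}$$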
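The "rows into columns" rule can be written as a short function in which each element of the product is the scalar product of a row of the first matrix with a column of the second. This is a sketch only; the name `mat_mul` is an assumption, and the data are the 2 by 3 and 3 by 1 matrices of the example above.

```python
def mat_mul(A, B):
    """Multiply matrices stored as lists of rows, rows of A into columns of B."""
    if len(A[0]) != len(B):
        raise ValueError("columns of the first matrix must equal rows of the second")
    columns_of_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns_of_B]
            for row in A]

A = [[1, 3, 4],
     [6, 3, 2]]
B = [[2],
     [3],
     [7]]

print(mat_mul(A, B))  # [[39], [35]]
```

Calling `mat_mul(B, B)` raises a `ValueError` because a matrix with 1 column cannot be multiplied into a matrix with 3 rows, which is the undefined case mentioned above.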
Properties of Matrix Multiplication
1. Not commutative: in general AB ≠ BA
2. Associative A(BC) = (AB)C
(kA)B = k(AB) = A(kB)
3. Distributive for matrix addition
(A + B)C = AC + BC
Matrix multiplication is not commutative so C(A + B) = CA + CB is a separate case.
4. The cancellation law is not always true:
AB = 0 does not mean A=0 or B=0
There is a case where matrix multiplication is commutative. This involves the scalar matrix, where the values of the principal diagonal are all equal and all other elements are zero. Eg:
k 0 0
S = 0 k 0
0 0 k
In this case AS = SA = kA. If the scalar matrix is the unit matrix: AI = IA = A.
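These properties are easy to check on small examples. In the sketch below the function `mat_mul` and all of the sample matrices are assumptions chosen for illustration.

```python
def mat_mul(A, B):
    """Multiply matrices stored as lists of rows, rows of A into columns of B."""
    columns_of_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns_of_B]
            for row in A]

# Hypothetical sample matrices.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

# Not commutative: swapping the order changes the result.
print(mat_mul(A, B))  # [[2, 1], [4, 3]]
print(mat_mul(B, A))  # [[3, 4], [1, 2]]

# No cancellation law: neither P nor Q is zero, yet their product is.
P = [[1, 1], [2, 2]]
Q = [[1, -1], [-1, 1]]
print(mat_mul(P, Q))  # [[0, 0], [0, 0]]

# A scalar matrix commutes with A and multiplies every element by k = 5.
S = [[5, 0], [0, 5]]
print(mat_mul(A, S) == mat_mul(S, A))  # True
print(mat_mul(A, S))                   # [[5, 10], [15, 20]]
```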
Linear Transformations
A simple linear transformation such as:
x1 = a11y1 + a12y2
x2 = a21y1 + a22y2
can be expressed as:
x = Ay
eg:
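$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$$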
and
y1 = b11z1 + b12z2
y2 = b21z1 + b22z2
as: y = Bz
Using the associative law:
x = A(Bz) = ABz = Cz
and so:
$$C = AB = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{pmatrix}$$
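The statement x = A(Bz) = (AB)z can be checked numerically for two rotations. The sketch below assumes the usual counter-clockwise rotation matrix; the function names, the angles of 30 and 45 degrees and the vector `z` are illustrative choices, not taken from the text.

```python
import math

def rotation(angle):
    """2 x 2 rotation matrix for an angle in radians (assumed sign convention)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]

def mat_mul(A, B):
    """Multiply matrices stored as lists of rows."""
    columns_of_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns_of_B]
            for row in A]

def mat_vec(A, v):
    """Apply the matrix A to the column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = rotation(math.radians(30))
B = rotation(math.radians(45))
C = mat_mul(A, B)            # single matrix for the combined transformation

z = [1.0, 2.0]
x_two_steps = mat_vec(A, mat_vec(B, z))   # x = A(Bz)
x_one_step = mat_vec(C, z)                # x = Cz

print(x_two_steps)
print(x_one_step)   # agrees with the line above to rounding error
```

For rotations, C also agrees with `rotation(math.radians(75))` to rounding error, since two successive rotations about a common origin combine into one.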
Indicial Notation
Consider a simple rotation of coordinates:
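$x^\mu$ is $x^1, x^2$
$x^\nu$ is $x'^1, x'^2$
The scalar product can be written as:
$$\mathbf{s} \cdot \mathbf{s} = g_{\mu\nu}\, x^\mu x^\nu$$
Where:
$$g_{\mu\nu} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
and is called the metric for this 2D space.
$$\mathbf{s} \cdot \mathbf{s} = g_{11}\, x^1 x'^1 + g_{12}\, x^1 x'^2 + g_{21}\, x^2 x'^1 + g_{22}\, x^2 x'^2$$
Now, $g_{11} = 1$, $g_{12} = 0$, $g_{21} = 0$, $g_{22} = 1$ so:
$$\mathbf{s} \cdot \mathbf{s} = x^1 x'^1 + x^2 x'^2$$
If there is no rotation of coordinates the scalar product is:
$$\mathbf{s} \cdot \mathbf{s} = x^1 x^1 + x^2 x^2$$
$$s^2 = (x^1)^2 + (x^2)^2$$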
Which is Pythagoras' theorem.
The Summation Convention
Indexes that appear as both subscripts and superscripts are summed over.
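$$g_{\mu\nu}\, x^\mu x^\nu = g_{11}\, x^1 x'^1 + g_{12}\, x^1 x'^2 + g_{21}\, x^2 x'^1 + g_{22}\, x^2 x'^2$$
$$g_\mu{}^\nu\, x^\mu x^\nu = g_{11}\, x^1 x'^1 + g_{21}\, x^2 x'^1 \quad \text{where } \nu = 1$$
By promoting ν to a superscript it is taken out of the summation.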
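A summed index behaves like a loop variable, which can make the convention easier to follow. The sketch below is an illustration under stated assumptions: the function names `contract` and `contract_fixed_nu` and the sample coordinates are invented for the example, and Python indexes from 0 where the text counts from 1.

```python
# Metric for the 2D space described above.
g = [[1, 0],
     [0, 1]]

def contract(g, x, y):
    """Sum over both indices: every pair of index values contributes a term."""
    return sum(g[m][n] * x[m] * y[n]
               for m in range(len(x))
               for n in range(len(y)))

def contract_fixed_nu(g, x, y, n):
    """Sum over the first index only, with the second index held at n."""
    return sum(g[m][n] * x[m] * y[n] for m in range(len(x)))

x = [3, 4]   # hypothetical coordinates
y = [1, 2]   # hypothetical primed coordinates

print(contract(g, x, x))              # 25, the square of the length of x
print(contract(g, x, y))              # 11
print(contract_fixed_nu(g, x, y, 0))  # 3, the two terms with the second index equal to 1
```

With the metric equal to the unit matrix, `contract(g, x, x)` reduces to Pythagoras' theorem for the coordinates 3 and 4.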
Matrix Multiplication in Indicial Notation
Consider:
Columns times rows:
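$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \begin{pmatrix} y_1 & y_2 \end{pmatrix} = \begin{pmatrix} x_1 y_1 & x_1 y_2 \\ x_2 y_1 & x_2 y_2 \end{pmatrix}$$
Matrix product $XY = x_i y_j$ where i = 1, 2 and j = 1, 2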
There being no summation the indexes are both subscripts.
Rows times columns:
$$\begin{pmatrix} x_1 & x_2 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = x_1 y_1 + x_2 y_2$$
Matrix product $XY = \delta_{ij}\, x^i y^j$
Where $\delta_{ij}$ is known as the Kronecker delta and has the value 0 when i ≠ j and 1 when i = j. It is the indicial equivalent of the unit matrix:
1 0
0 1
There being summation one value of i is a subscript and the other a superscript.
A matrix in general can be specified by any of:
$M_{ij}$, $M^{ij}$, $M^i{}_j$, $M_i{}^j$ depending on which subscript or superscript is being summed over.
Vectors in Indicial Notation
A vector can be expressed as a sum of basis vectors.
$$\mathbf{x} = a^1 \mathbf{e}_1 + a^2 \mathbf{e}_2 + a^3 \mathbf{e}_3$$
In indicial notation this is: $\mathbf{x} = a^i \mathbf{e}_i$
Linear Transformations
Consider x = Ay where A is a coefficient matrix and x and y are coordinate matrices.
In indicial notation this is:
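$$x^\mu = A^\mu{}_\nu\, x^\nu$$
which becomes:
$$\begin{aligned} x^1 &= a_{11}\, x'^1 + a_{12}\, x'^2 + a_{13}\, x'^3 \\ x^2 &= a_{21}\, x'^1 + a_{22}\, x'^2 + a_{23}\, x'^3 \\ x^3 &= a_{31}\, x'^1 + a_{32}\, x'^2 + a_{33}\, x'^3 \end{aligned}$$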
The Scalar Product
In indicial notation the scalar product is:
$$\mathbf{x} \cdot \mathbf{y} = \delta_{ij}\, x^i y^j$$
In matrix notation the scalar product of two column vectors is:
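$$X = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \qquad Y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}$$
$$\mathbf{x} \cdot \mathbf{y} = X^T Y = \delta_{ij}\, x^i y^j$$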
Eigenvalues and Eigenfunctions
Important classes of second order differential equations are examples of the "Sturm-Liouville equation". These consist of the differential equation plus boundary conditions over some interval. A typical example is:
$$\frac{d^2 y}{dx^2} + \lambda y = 0$$
with the boundary conditions y(0) = 0 and y(π) = 0.
For positive values of λ the solution is obtained by substituting λ = ν² so that
y(x) = A cos νx + B sin νx
A can be found to be zero from the first boundary condition where y(0) = 0, therefore:
y(x) = B sin νx
The second boundary condition y(π)=0 gives, when x= π:
B sin ν π = 0
This applies when ν = 0, ±1, ±2, ±3 etc.
Now λ = ν², so λ = 1, 4, 9 etc. (ν = 0 results in y = 0, which is not a permitted solution.) When B equals 1:
y(x) = sin νx
The function sin νx is known as a characteristic function, or eigenfunction, of this problem and the characteristic values, or eigenvalues, are the values of λ. In other words an eigenfunction is a solution of the equation that satisfies the boundary conditions and the eigenvalues are the values of the constant in the Sturm-Liouville equation for which such solutions exist. These characteristic properties are very important when considering wave equations. In the case of a vibrating string the eigenvalues specify resonant modes of vibration, ν = 1 being a half cycle, ν = 2 a full cycle and so on, because the boundary values incorporate no displacement at the ends where the string is fixed.
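The claim that sin νx solves the problem only for whole-number ν can be checked numerically. The sketch below is an illustration, not a solver: it estimates the second derivative with a central finite difference, and the function names, the step size `h` and the test point x = 1 are assumptions made for the example.

```python
import math

def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of the second derivative of f at x."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def residual(nu, x):
    """Left-hand side of the equation for y = sin(nu x) with lambda = nu squared."""
    lam = nu * nu
    y = lambda t: math.sin(nu * t)
    return second_derivative(y, x) + lam * y(x)

def boundary_values(nu):
    """Values of y = sin(nu x) at the two ends of the interval, x = 0 and x = pi."""
    return math.sin(0.0), math.sin(nu * math.pi)

for nu in (1, 2, 3):
    y0, y_pi = boundary_values(nu)
    # The residual and both boundary values are zero to within rounding error.
    print(nu, nu * nu, residual(nu, 1.0), y0, y_pi)

# A value of nu that is not a whole number still satisfies the differential
# equation but fails the boundary condition at x = pi.
print(boundary_values(1.5))   # the second value is close to -1, not 0
```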
Eigenvalues have many uses, including solving simultaneous differential equations, ie: equations such as:
d²y1/dt² = -5y1 + 2y2
d²y2/dt² = 2y1 - 2y2
This sort of equation is common in the description of combinations of harmonic motions. Mathematicians solve this sort of equation by converting it into a "vector equation".
The right hand side, -5y1 + 2y2 and 2y1 - 2y2, is expressed as two matrices:
$$\mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \qquad A = \begin{pmatrix} -5 & 2 \\ 2 & -2 \end{pmatrix}$$
And the second differential coefficients are expressed as:
$$Y = \begin{pmatrix} d^2 y_1 / dt^2 \\ d^2 y_2 / dt^2 \end{pmatrix}$$
So the overall vector equation is Y = Ay (those familiar with matrix multiplication should be able to easily reconstitute the original simultaneous equations, see above).
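The solution of second order differential equations is usually accomplished by substituting an exponential function. In this case this is achieved by substituting $\mathbf{x} e^{\omega t}$ for $\mathbf{y}$:
$$\mathbf{y} = \mathbf{x} e^{\omega t}$$
So that
$$Y = A \mathbf{x} e^{\omega t}$$
And hence, since differentiating $\mathbf{x} e^{\omega t}$ twice with respect to t gives $\omega^2 \mathbf{x} e^{\omega t}$,
$$\omega^2 \mathbf{x} e^{\omega t} = A \mathbf{x} e^{\omega t}$$
So
$$A \mathbf{x} = \omega^2 \mathbf{x}$$
And
$$A \mathbf{x} = \lambda \mathbf{x} \quad \text{where } \lambda = \omega^2$$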
This means that the solution of the simultaneous equations must conform to solutions of Ax = λx. Remarkably, this equation can be solved if the values that compose A are known. If x and λ can be found then, from $\mathbf{y} = \mathbf{x} e^{\omega t}$, y can be found at any t and the equation solved. The vector, x, is known as an eigenvector of A and the constant λ is known as an eigenvalue of A. How x and λ are found from A is discussed below.
This vector equation can be written out in full as:
$$\begin{aligned} a_{11} x_1 + \dots + a_{1n} x_n &= \lambda x_1 \\ &\;\;\vdots \\ a_{n1} x_1 + \dots + a_{nn} x_n &= \lambda x_n \end{aligned}$$
Transferring the right hand side to the left, this becomes Bx = 0, where B = A - λI, ie:
$$\begin{aligned} (a_{11} - \lambda) x_1 + \dots + a_{1n} x_n &= 0 \\ &\;\;\vdots \\ a_{n1} x_1 + \dots + (a_{nn} - \lambda) x_n &= 0 \end{aligned}$$
This set of simultaneous linear equations is solved using determinants. In this particular case, where there are n linear equations in n unknowns, all with a right hand side of zero (ie: homogeneous), the equations only have non-trivial solutions if the determinant of the coefficients is zero.
$$D(\lambda) = \begin{vmatrix} a_{11} - \lambda & a_{12} & \dots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} - \lambda \end{vmatrix}$$
The condition is D(λ) = 0, and D(λ) is known as the characteristic determinant.
Returning to the problem of solving Ax = λx, suppose A has the following values:
$$A = \begin{pmatrix} 5 & 4 \\ 1 & 2 \end{pmatrix}$$
The characteristic determinant is created by subtracting λ from the diagonal elements:
$$D(\lambda) = \begin{vmatrix} 5 - \lambda & 4 \\ 1 & 2 - \lambda \end{vmatrix}$$
Resolving the determinant:
D(λ) = (5 - λ)(2 - λ) - 4 = λ² - 7λ + 6 = 0
The polynomial D(λ) is known as the characteristic polynomial. The roots in this case are: λ1 = 6 and λ2 = 1. Evaluating Bx = 0 with λ1 = 6 gives:
-x1 + 4x2 = 0
x1 - 4x2 = 0
so x1 = 4x2
The eigenvector of A, x1, is, using the eigenvalue λ1 = 6 (any constant multiple is also an eigenvector):
$$\mathbf{x}_1 = \begin{pmatrix} 4 \\ 1 \end{pmatrix}$$
Another eigenvector of A, x2, is found in the same way using the eigenvalue λ2 = 1, for which Bx = 0 gives 4x1 + 4x2 = 0 and so x1 = -x2:
$$\mathbf{x}_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$
The values of the eigenvectors and eigenvalues can be substituted into the original equations, such as Y = Ay, to allow them to be solved.
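The worked example can be verified with a short script that solves the characteristic polynomial of a 2 by 2 matrix and confirms that Ax = λx holds for each eigenvector. This is a sketch under stated assumptions: the function names are invented for the example, and the approach is limited to 2 by 2 matrices with real eigenvalues.

```python
import math

def eigenvalues_2x2(A):
    """Roots of the characteristic polynomial of a 2 x 2 matrix.

    For A = [[a, b], [c, d]] the polynomial is
    lambda^2 - (a + d) * lambda + (a * d - b * c).
    """
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4.0 * det
    if disc < 0:
        raise ValueError("complex eigenvalues are outside the scope of this sketch")
    root = math.sqrt(disc)
    return (trace + root) / 2.0, (trace - root) / 2.0

def mat_vec(A, x):
    """Apply the matrix A to the column vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[5, 4],
     [1, 2]]

lam1, lam2 = eigenvalues_2x2(A)
print(lam1, lam2)            # 6.0 1.0

x1 = [4, 1]                  # eigenvector for lambda = 6
x2 = [1, -1]                 # eigenvector for lambda = 1
print(mat_vec(A, x1))        # [24, 6], which is 6 times x1
print(mat_vec(A, x2))        # [1, -1], which is 1 times x2
```

Multiplying either eigenvector by a non-zero constant gives another eigenvector for the same eigenvalue, which is why only the ratio x1 = 4x2 is fixed by Bx = 0.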