A list of length n is (x1, x2, ⋯, xn). A list must be finite, because it has a length. Many mathematicians call a list of length n an n-tuple.
Definition 1.A.1 Fn; Definition 1.A.2 the list 0
Fn is the set of all lists of length n of elements of F. Let 0 denote the list of length n whose coordinates are all 0: (0,⋯,0).
1.B Definition of Vector Space
Definition1vector space
A vector space is a set V, along with an addition on V and a scalar multiplication on V, such that the following properties hold:
• commutativity: u+v=v+u for all u,v∈V
• associativity: (u+v)+w=u+(v+w) and (ab)v=a(bv) for all u,v,w∈V and a,b∈F
• additive identity: there exists an element 0∈V such that v+0=v for all v∈V
• additive inverse: for every v∈V, there exists w∈V such that v+w=0
• multiplicative identity: 1v=v for all v∈V
• distributive properties: a(u+v)=au+av and (a+b)v=av+bv for all a,b∈F and u,v∈V
An example of an infinite-dimensional vector space: F∞ is defined to be the set of all sequences of elements of F: F∞={(x1,x2,⋯):xj∈F for j=1,2,⋯}
Definition2Notation FS
• If S is a set, then FS denotes the set of functions from S to F.
• For f,g∈FS, the sum f+g∈FS is the function defined by (f+g)(x)=f(x)+g(x) for all x∈S.
• For 𝜆∈F and f∈FS, the product 𝜆f∈FS is the function defined by (𝜆f)(x)=𝜆f(x) for all x∈S.
For example, R[0,1] is the set of real-valued functions on the interval [0,1]. But, wait!
Example2.1FS is a vector space
We just need to define:
• The additive identity of FS is the function 0:S→F defined by 0(x)=0 for all x∈S.
• The additive inverse of f is the function −f:S→F defined by (−f)(x):=−f(x) for all x∈S.
1.C Subspaces
Definition3subspaces
A subset U of V is called a subspace of V if U is also a vector space, using the same addition and scalar multiplication as on V.
Sums of Subspaces
Definition4sum of subsets
Suppose U1,⋯,Um are subsets of V. The sum of U1,⋯Um , denoted U1+⋯+Um, is the set of all possible sums of elements of U1,⋯,Um. More precisely,U1+⋯+Um:={u1+⋯+um:u1∈U1,⋯,um∈Um}
Example4.1
Suppose U is the set of all elements of F3 whose second and third coordinates equal 0, and W is the set of all elements of F3 whose first and third coordinates equal 0:U={(x, 0, 0)∈F3:x∈F} and W={(0, y, 0)∈F3:y∈F}ThenU+W={(x,y,0)∈F3: x,y∈F}
Theorem5Sum of subspaces is the smallest containing subspace
Suppose U1,⋯,Um are subspaces of V. Then U1+⋯+Um is the smallest subspace of V containing U1,⋯,Um.
Proof:
• First, it is easy to check that U1+⋯+Um is a subspace of V.
• Then, every subspace of V containing U1,⋯,Um must contain U1+⋯+Um, because subspaces are closed under addition and hence contain all finite sums of their elements; so if a subspace contains U1 and U2, it must contain every element of U1+U2, and likewise for more summands.
Now, what about the situation in which each vector in U1+⋯+Um can be represented in the form above in only one way?
Definition6Direct Sum
Suppose U1,⋯,Um are subspaces of V.
• Then U1+⋯+Um is called a direct sum if each element of U1+⋯+Um can be written in only one way as a sum u1+⋯+um, where each uj is in Uj.
• If U1+⋯+Um is a direct sum, then U1⊕⋯⊕Um denotes U1+⋯+Um, with the ⊕ notation serving as an indication that this is a direct sum.
Example6.1direct sum
Suppose U is the subspace of F3 of those vectors whose last coordinate equals 0, and W is the subspace of F3 of those vectors whose first two coordinates equal 0: U={(x,y,0)∈F3:x,y∈F} and W={(0,0,z)∈F3:z∈F}. Then F3=U⊕W.
The definition of direct sum requires that every vector in the sum have a unique representation as an appropriate sum. The next result shows that when deciding whether a sum is direct, we need only consider whether 0 can be uniquely written as an appropriate sum.
Theorem7The condition of direct sum
Suppose U1,⋯,Um are subspaces of V. Then U1+⋯+Um is a direct sum if and only if the only way to write 0 as a sum u1+⋯+um, where each uj is in Uj, is by taking each uj equal to 0.
Proof
• First suppose U1+⋯+Um is a direct sum. Then by definition 0 can be represented in only one way. Because each Uj is a subspace of V, we have 0∈Uj, so 0=0+⋯+0 is one such representation; hence it is the only one.
• Conversely, suppose 0 can be written only in the trivial way. If some vector had two representations v=u1+⋯+um=𝜈1+⋯+𝜈m with uj,𝜈j∈Uj, then subtracting these two equations gives 0=(u1−𝜈1)+⋯+(um−𝜈m), with each uj−𝜈j∈Uj. By hypothesis each uj−𝜈j=0, so u1=𝜈1, u2=𝜈2, ⋯ □
The next result gives a simple condition for testing which pairs of subspaces give a direct sum.
Theorem8Direct sum of two subspaces
Suppose U and W are subspaces of V. Then U+W is a direct sum if and only if U∩W={0}.
Proof
Suppose U+W is a direct sum, and suppose U∩W contained some nonzero vector s. Because subspaces are closed under additive inverses, −s∈W as well, so 0=s+(−s) would be a second way, besides 0=0+0, to write 0 — contradicting the previous theorem. Conversely, if U∩W={0} and 0=u+w with u∈U, w∈W, then u=−w∈U∩W={0}, so u=w=0; thus the condition of the theorem "Condition of direct sum" is satisfied. □
More information
The result above deals only with the case of two subspaces. When asking about a possible direct sum of more than two subspaces, it is not enough to test pairwise intersections, e.g. whether U1∩U2=U1∩U3=U2∩U3={0}.
Chapter 2. Finite-Dimensional Vector Spaces
2.A Span and Linear Independence
To avoid confusion, we will usually write lists of vectors without surrounding parentheses. For example: (4,1,6), (9,5,7) is a list of length 2 of vectors in R3.
Linear Combinations and Span
Adding up scalar multiples of vectors in a list gives what is called a linear combination of the list:
Definition9Linear Combination
A linear combination of a list v1,⋯,vm of vectors in V is a vector of the form a1v1+⋯+amvm, where a1,⋯,am∈F.
Definition10Span
(Some mathematicians also call it linear span)
The set of all linear combinations of a list of vectors v1,⋯,vm in V is called the span of v1,⋯,vm, denoted span(v1,⋯,vm). In other words, span(v1,⋯,vm)={a1v1+⋯+amvm:a1,⋯,am∈F}. The span of the empty list ( ) is defined to be {0}. Besides, we introduce a verb: if span(v1,⋯,vm) equals V, we say that v1,⋯,vm spans V. A numerical way to test span membership is sketched below.
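A quick numerical sketch of span membership (my own example, not from the text): in Fn=R3, a vector v lies in span(v1,⋯,vm) exactly when appending v to the list does not increase the rank of the matrix whose columns are the list.

```python
# Span membership check in R^3; the vectors are made-up example data.
import numpy as np

v1, v2 = np.array([1., 0., 1.]), np.array([0., 1., 1.])
v = np.array([2., 3., 5.])    # equals 2*v1 + 3*v2, so it lies in span(v1, v2)

rank_before = np.linalg.matrix_rank(np.column_stack([v1, v2]))
rank_after = np.linalg.matrix_rank(np.column_stack([v1, v2, v]))
assert rank_after == rank_before    # v is in span(v1, v2)
```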
Theorem11Span is the smallest containing subspace
The span of a list of vectors in V is the smallest subspace of V containing all the vectors in the list.
Proof
First we show that span(v1,⋯,vm) is a subspace of V:
• The additive identity is in it: set all aj=0 in the formula above.
• It is closed under addition.
• It is closed under scalar multiplication.
Then we show that the span is the smallest such subspace:
• Each vj is a linear combination of v1,⋯,vm, thus is in the span; so the span contains all the vectors in the list.
• Conversely, any subspace containing all the vectors in the list is closed under addition and scalar multiplication, so it must contain every linear combination of them — that is, it must contain the span. □
Now we can make one of the key definitions in linear algebra!
Definition12finite-dimensional vector space
A vector space is called finite-dimensional if some list of vectors in it spans the space.
Recall that by definition every list has finite length.
Definition13polynomial, P(F)
• A function p:F→F is called a polynomial with coefficients in F if there exist a0,⋯,am∈F such that p(z)=a0+a1z+a2z²+⋯+amzᵐ for all z∈F.
• P(F) is the set of all polynomials with coefficients in F.
With the usual operations of addition and scalar multiplication, P(F) is a vector space over F. In other words, P(F) is a subspace of FF, the vector space of functions from F to F.If a polynomial is represented by two sets of coefficients, then subtracting one representation of the polynomial from the other produces a polynomial that is identically zero as a function on F and hence has all zero coefficients. Conclusion: the coefficients of a polynomial are uniquely determined by the polynomial.
Definition14degree of a polynomial: deg p
A polynomial p is said to have degree m if there exist a0,⋯,am∈F with am≠0 such that p(z)=a0+a1z+⋯+amzᵐ for all z∈F. If p has degree m, we write deg p=m.
★ The polynomial that is identically 0 is said to have degree −∞.
In the next definition, we use the convention that −∞<m, which means that the polynomial 0 is in every set of polynomials Pm(F).
Definition15Pm(F)
For m a non-negative integer, Pm(F) denotes the set of all polynomials with coefficients in F and degree at most m.
Example15.1Pm(F) is a finite-dimensional vector space.
Pm(F) is a finite-dimensional vector space for each non-negative integer m. Note that Pm(F)=span(1,z,⋯,zᵐ); here we are slightly abusing notation by letting zᵏ denote a function, so zᵏ∈FF, and the functions 1,z,⋯,zᵐ together span the subspace Pm(F) of the vector space FF.
However, P(F) is infinite-dimensional
Example15.2P(F) is an infinite-dimensional vector space.
Consider any list of elements of P(F), and let m denote the highest degree appearing in the list. Then the span of the list cannot contain the polynomial zᵐ⁺¹.
Linear Independence
Suppose v1,⋯,vm∈V and v∈span(v1,⋯,vm). If 0 can be written as a linear combination of v1,⋯,vm in only the trivial way, 0v1+0v2+⋯+0vm, then each vector in the span has a unique representation. This situation is so important that we give it a special name: linear independence.
Definition16Linearly independent
A list v1,⋯,vm of vectors in V is called linearly independent if the only choice of a1,...,am∈F that makes a1v1+⋯+amvm equal 0 is a1=⋯=am=0.★ The empty list ( ) is also declared to be linearly independent.
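The definition can be restated computationally (a sketch with made-up data, for vectors in R3): v1,⋯,vm is linearly independent exactly when the matrix with columns v1,⋯,vm has rank m, since rank m means a1v1+⋯+amvm=0 forces all aj=0.

```python
# Linear independence test via matrix rank; example vectors are my own.
import numpy as np

vs = [np.array([1., 0., 0.]), np.array([1., 1., 0.]), np.array([1., 1., 1.])]
A = np.column_stack(vs)
print(np.linalg.matrix_rank(A) == len(vs))   # True: the list is linearly independent
```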
The reasoning above shows that v1,⋯,vm is linearly independent if and only if each vector in span(v1,⋯,vm) has only one representation as a linear combination of v1,⋯,vm. Now we define linear dependence.
Definition17Linearly dependent
A list v1,⋯,vm of vectors in V is called linearly dependent if it is not linearly independent. In other words, a list of vectors is linearly dependent if there exist a1,⋯,am∈F, not all 0, such that a1v1+⋯+amvm=0.
Lemma17.1Linear Dependence Lemma
Suppose v1,⋯,vm is a linearly dependent list in V. Then there exists j∈{1,2,⋯,m} such that the following hold:
★ vj∈span(v1,⋯,vj-1)
★ if the jth term is removed from v1,⋯,vm, the span of the remaining list equals span(v1,⋯,vm).
The proof is easy: take a relation 0=a1v1+⋯+amvm with not all aj zero, let j be the largest index with aj≠0, and solve for vj in terms of the other vectors in the list.
The second bullet above is very useful. It means that in many arguments we can add a vector to a spanning list and then remove another vector without changing the span:
Theorem18Length of linearly independent list ⩽ length of spanning list
In a finite-dimensional vector space, the length of every linearly independent list of vectors is less than or equal to the length of every spanning list of vectors.
Proof
Denote the vectors in the linearly independent list by u1,⋯,um and those in the spanning list by w1,⋯,wn. We do the following steps:
(1) At each step, adjoin the next uj to the spanning list; the new list is linearly dependent, because the old list already spanned V.
(2) By the Linear Dependence Lemma, some vector in the new list is in the span of the previous ones and can be removed without changing the span; the removed vector must be one of the w's, because u1,⋯,uj are linearly independent.
Each step trades one w for one u and keeps the list spanning with length n. If we had m>n, the process would run out of w's before placing all the u's, leaving some uj in the span of the others — contradicting independence. Thus m⩽n. □
We can use this theorem to prove some results that look "trivially" true:
Theorem19Finite-dimensional subspaces
Every subspace of a finite-dimensional vector space is finite-dimensional.
Proof
Let U be a subspace of the finite-dimensional space V. Carry out the following process, starting at step j=1: if U=span(v1,⋯,vj-1) (where the empty span is {0}), then U is finite-dimensional and we stop; otherwise, choose a vector vj∈U such that vj∉span(v1,⋯,vj-1). After each step, as long as the process continues, we have constructed a list of vectors such that no vector in it is in the span of the previous vectors; thus the list is linearly independent. A linearly independent list cannot be longer than a spanning list of V, so the process must terminate, which means U is finite-dimensional. □
2.B Bases
In the last section, we discussed linearly independent lists and spanning lists. Now we bring these concepts together.
Definition20basis
A basis of V is a list of vectors in V that is linearly independent and spans V.
Theorem21Criterion for basis
A list v1,⋯vn of vectors in V is a basis of V if and only if every v∈V can be written uniquely in the formv=a1v1+⋯+anvnwhere a1,⋯,an∈F.
Theorem22spanning list contains a basis
Every spanning list in a vector space can be reduced to a basis of the vector space.
Proof
★ If the spanning list is linearly independent, it is a basis.
★ Otherwise, by the Linear Dependence Lemma, remove a vector that is in the span of the previous ones (this does not change the span); repeat until the list becomes linearly independent. □
Theorem23Basis of finite-dimensional vector space
Any finite-dimensional vector space has a basis: by definition some list spans it, and every spanning list contains a basis.
Our next theorem is in some sense a dual of Theorem "spanning list contains a basis"
Theorem24Linearly independent list extends to a basis
Every linearly independent list of vectors in a finite-dimensional vector space can be extended to a basis of the vector space.
Proof
Informal idea: keep adjoining vectors that preserve linear independence until the list spans the space.
Formally: adjoin a basis of the space to the end of the linearly independent list v1,⋯,vm, then reduce as in the theorem "spanning list contains a basis": remove each vector that is in the span of the previous ones. Because v1,⋯,vm is linearly independent, none of the vj is removed, so the result is a basis extending v1,⋯,vm. □
As an application of the results above, we now show that every subspace of a finite-dimensional vector space can be paired with another subspace to form a direct sum of the whole space
Theorem25Every subspace of V is part of a direct sum equal to V
Suppose V is finite-dimensional and U is a subspace of V. Then there is a subspace W of V such that V=U⊕W. (Proof sketch: extend a basis of U to a basis of V and let W be the span of the added vectors.)
2.C Dimension
Although we have been discussing finite-dimensional vector spaces, we have not yet defined the dimension of such an object. A reasonable definition should force the dimension of Fn to equal n. Notice that the standard basis (1,0,⋯,0),⋯,(0,⋯,0,1) of Fn has length n. Thus we try to define the dimension as the length of a basis. However, before doing this, can we promise that every basis of a vector space has equal length?
Theorem26Basis length does not depend on basis
Any two bases of a finite-dimensional vector space have the same length.
Proof
Suppose V is finite-dimensional. Let B1 and B2 be two bases of V. Then B1 is a linearly independent list and B2 is a spanning list, so the length of B1 is at most the length of B2. Interchanging the roles gives the reverse inequality, so the two lengths are equal. □
Now we can formally define the dimension of such spaces:
Definition27dimension, dim V
The dimension of a finite-dimensional vector space is the length of any basis of the vector space.
Example27.1dimPm(F)=m+1
The vector space Pm(F) has dimension dim Pm(F)=m+1, because 1,z,⋯,zᵐ is a basis of it.
To check that a list of vectors in V is a basis of V , we must, according to the definition, show that the list in question satisfies two properties. However, sometimes it's easier than this:
Theorem28Linearly independent list of right length is a basis
Suppose V is finite-dimensional. Then every linearly independent list of vectors in V with length dim V is a basis of V.
Proof
Suppose dim V=n and v1,⋯,vn is linearly independent in V. The list v1,⋯,vn can be extended to a basis of V; but every basis of V has length n, so nothing is added and the list itself is already a basis. □
Example28.1Show that list (5,7),(4,3) is a basis of F2
The list has length 2=dim F2, and it is linearly independent because neither vector is a scalar multiple of the other; by the theorem above it is a basis. (Just remember that two vectors separated by a comma form a list.)
Similarly, we have
Theorem29Spanning list of right length is a basis
Suppose V is finite-dimensional. Then every spanning list of vectors in V with length dim V is a basis of V.
Proof
Every spanning list contains a basis, so some sublist is a basis. But all bases have length dim V, which is the length of the whole list; thus nothing is removed and the list itself is a basis. □
The next result gives a formula for the dimension of the sum of two subspaces of a finite-dimensional vector space. The formula is analogous to a familiar counting formula: the number of elements in the union of two sets equals the number of elements in the first set, plus the number of elements in the second set, minus the number of elements in the intersection of the two sets.
Theorem30Dimension of a sum
If U1 and U2 are subspaces of a finite-dimensional vector space, then dim(U1+U2)=dim U1+dim U2−dim(U1∩U2).
Proof
Let u1,⋯,um be a basis of U1∩U2. Because u1,⋯,um are linearly independent vectors in both U1 and U2, we can extend them to a basis u1,⋯,um,v1,⋯,vn of U1 and to a basis u1,⋯,um,w1,⋯,wk of U2. We will prove that u1,⋯,um,v1,⋯,vn,w1,⋯,wk is a basis of U1+U2, which gives the dimension formula (m+n)+(m+k)−m=m+n+k.
★ First, it is easy to show that u1,⋯,um,v1,⋯,vn,w1,⋯,wk spans U1+U2, because its span contains both U1 and U2.
★ For linear independence, suppose a1u1+⋯+amum+b1v1+⋯+bnvn+c1w1+⋯+ckwk=0. Then c1w1+⋯+ckwk=−(a1u1+⋯+amum+b1v1+⋯+bnvn)∈U1; since the w's lie in U2, this vector lies in U1∩U2, so it can be written in terms of u1,⋯,um alone. Because u1,⋯,um,w1,⋯,wk is linearly independent, all the cj equal 0. The equation then reduces to a linear combination of the basis u1,⋯,um,v1,⋯,vn of U1 equal to 0, so all the remaining coefficients are 0 as well.
★ So u1,⋯,um,v1,⋯,vn,w1,⋯,wk is a linearly independent spanning list, i.e. a basis. □
A numerical check of the formula is sketched below.
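Here is a small numerical check of the formula (a sketch, with subspaces of R3 of my own choosing, given as column spans of matrices):

```python
# Verify dim(U1+U2) = dim U1 + dim U2 - dim(U1 ∩ U2) on example data.
import numpy as np
from scipy.linalg import null_space

A = np.array([[1., 0.], [0., 1.], [0., 0.]])   # U1 = span{e1, e2}
B = np.array([[0., 0.], [1., 0.], [0., 1.]])   # U2 = span{e2, e3}

dim_U1 = np.linalg.matrix_rank(A)
dim_U2 = np.linalg.matrix_rank(B)
dim_sum = np.linalg.matrix_rank(np.hstack([A, B]))   # columns of [A B] span U1+U2

# A vector is in U1 ∩ U2 iff it equals A@a = B@b, i.e. (a, b) is in the
# null space of the block matrix [A, -B].
N = null_space(np.hstack([A, -B]))
dim_int = np.linalg.matrix_rank(A @ N[:A.shape[1], :])

assert dim_sum == dim_U1 + dim_U2 - dim_int          # 3 == 2 + 2 - 1
```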
Chapter 3. Linear Maps
Here comes something!
★ Fundamental Theorem of Linear Maps
★ the matrix of a linear map with respect to given bases
★ isomorphic vector spaces
★ product spaces
★ quotient spaces
★ the dual space of a vector space and the dual of a linear map
3.A The Vector Space of Linear Maps
Definition and Examples of Linear Maps
Now we are ready for one of the key definitions in linear algebra.
Definition31linear map
A linear map from V to W is a function T:V→W with the following properties:
★ additivity: T(u+v)=Tu+Tv for all u,v∈V;
★ homogeneity: T(𝜆v)=𝜆T(v) for all 𝜆∈F and all v∈V.
The set of all linear maps from V to W is denoted L(V,W).
More information
★ Some mathematicians use the term linear transformation to mean the same as linear map.
★ For linear maps we often use the notation Tv as well as the more standard functional notation T(v).
Example31.1Some linear maps
★ zero map★ identity map★ differential★ integral★ multiplication by x2: T∈L(P(R),P(R)),(Tp)(x):=x2p(x)★ backward shift: T∈L(F∞,F∞), T(x1,x2,x3,⋯):=(x2,x3,⋯)
The existence part of the next result means we can find a linear map that takes on whatever values we wish on the vectors in a basis. The uniqueness part means that a linear map is completely determined by its values on a basis.
Theorem32Linear maps and basis of domain
Suppose v1,⋯,vn is a basis of V and w1,⋯,wn∈W. Then there exists a unique linear map T:V→W such that Tvj=wj for each j=1,⋯,n.
Proof
We can define a map that sends each vj to wj: T(a1v1+⋯+anvn):=a1w1+⋯+anwn. It is easy to check that T is linear. For uniqueness, if S∈L(V,W) also maps each vj to wj, then S and T agree on the basis, and by linearity they agree on every vector of V; hence S=T, so there is only one such map. □
Algebraic Operations on L(V,W)
We begin by defining addition and scalar multiplication on L(V,W).
Definition33addition and scalar multiplication on L(V,W)
Suppose S,T∈L(V,W) and 𝜆∈F. The sumS+T and the product 𝜆T are the linear maps from V to W defined by(S+T)(v):=Sv+Tv(𝜆T)(v):=𝜆(Tv)
The next result should not be a surprise.
Theorem34L(V,W) is a vector space
With the operations of addition and scalar multiplication defined above, L(V,W) is a vector space.
Usually it makes no sense to multiply together two elements of a vector space, but for some pairs of linear maps a useful product exists. We will need a third vector space, so for the rest of this section suppose U is a vector space over F
Definition35Product of Linear Maps
If T∈L(U,V) and S∈L(V,W), then the product ST∈L(U,W) is defined by (ST)(u):=S(Tu). This is just notation for mapping twice: it is the usual composition S∘T of two functions. Note that ST is defined only when T maps into the domain of S.
3.B Null Spaces and Ranges
Null Spaces and Injectivity
In this section we will learn about two subspaces that are intimately connected with each linear map.
Definition36null space null T
For T∈L(V,W), the null space of T, denoted null T, is the subset of V consisting of those vectors that T maps to 0:null T={v∈V: Tv=0}
Example36.1null space
★ For the zero map T∈L(V,W), null T=V.
★ Suppose 𝜑∈L(C3,C) is defined by 𝜑(z1,z2,z3)=z1+2z2+3z3. Then null 𝜑={(z1,z2,z3)∈C3: z1+2z2+3z3=0}.
Some mathematicians use the term kernel instead of null space.
Theorem37The null space is a subspace
Suppose T∈L(V,W). Then null T is a subspace of V.
Proof
0∈null T because T0=0, and null T is closed under addition and scalar multiplication because T is linear: if Tu=Tv=0, then T(u+v)=0 and T(𝜆u)=0. □
As we will soon see, for a linear map the next definition is closely connected to the null space.
Definition38injective
A function T is called injective (one-to-one) if Tu=Tv implies u=v.
Theorem39The null space and injectivity
Let T∈L(V,W). Then T is injective if and only if null T={0}.
Range and Surjectivity
Definition40Range
For T a function from V to W, the range of T is the subset of W consisting of those vectors that are of the form Tv for some v∈V: range T={Tv:v∈V}. Recall that V is called the domain of T here. The range is also called the image.
Theorem41The range is a subspace
Suppose T∈L(V,W). Then range T is a subspace of W.
Proof
0=T0∈range T, and range T is closed under addition and scalar multiplication because T is linear: Tu+Tv=T(u+v) and 𝜆Tv=T(𝜆v). □
Definition42surjective
A function T:V→W is called surjective if its range equals W.
Fundamental Theorem of Linear Maps
The next result is so important that it gets a dramatic name.
Theorem43Fundamental Theorem of Linear Maps
Suppose V is finite-dimensional and T∈L(V,W). Then range T is finite-dimensional and dim V=dim null T+dim range T.
Proof
Suppose V has a basis v1,v2,⋯,vm of length m, and set wj=Tvj. Clearly w1,⋯,wm spans range T, so range T is finite-dimensional; remove vectors until the remaining list is a basis of range T, and relabel so that w1,⋯,wk is that basis. Thus dim range T=k, and each wj with j⩾k+1 can be written wj=a1w1+⋯+akwk (with coefficients depending on j).
Now we prove that dim null T=m−k. For each j⩾k+1, set uj=vj−(a1v1+⋯+akvk); then Tuj=wj−(a1w1+⋯+akwk)=0, so uk+1,⋯,um∈null T. This list is linearly independent: expanded in the basis v1,⋯,vm, each uj contains vj with coefficient 1 and no other vi with i⩾k+1. It also spans null T: suppose Tv=0 and expand v=c1v1+⋯+cmvm; replacing each vj with j⩾k+1 by uj plus a combination of v1,⋯,vk and collecting coefficients gives v=c1′v1+⋯+ck′vk+ck+1uk+1+⋯+cmum. Applying T kills the u-terms, so 0=Tv=c1′w1+⋯+ck′wk; since w1,⋯,wk is a basis of range T, all the cj′ equal 0. Hence every vector of null T is a combination of uk+1,⋯,um.
Therefore dim null T=m−k, and dim V=m=(m−k)+k=dim null T+dim range T. □
A numerical check is sketched below.
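A numerical sanity check of the theorem (a sketch with a made-up matrix, thinking of T:R5→R3 as multiplication by M):

```python
# Rank-nullity check: dim V = dim null T + dim range T.
import numpy as np
from scipy.linalg import null_space

M = np.array([[1., 2., 0., 1., 0.],
              [0., 1., 1., 0., 0.],
              [1., 3., 1., 1., 0.]])

dim_V = M.shape[1]                        # dimension of the domain
dim_range = np.linalg.matrix_rank(M)      # dim range T
dim_null = null_space(M).shape[1]         # dim null T

assert dim_V == dim_null + dim_range      # 5 == 3 + 2
```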
Now we can show that no linear map from a finite-dimensional space to a "smaller" space is injective, where "smaller" is measured by dimension
Theorem44A map to a smaller dimensional space is not injective
Proof
By the Fundamental Theorem of Linear Maps, dim null T=dim V−dim range T⩾dim V−dim W>0, so null T≠{0} and T is not injective. □
Theorem45A map to a larger dimensional space is not surjective
Proof
Similarly, dim range T=dim V−dim null T⩽dim V<dim W, so range T≠W and T is not surjective. □
The term "homogeneous" below means that all the constant value on the right of the equation system equal 0.
Example45.1Rephrase in terms of a linear map the question of whether a homogeneous system of linear equations has a nonzero solution
The equation T(x1,⋯,xn)=0 means that (x1,⋯,xn) is in the null space of the corresponding linear map T. Thus after rephrasing, the question becomes: is the linear map T not injective? A nonzero solution exists if and only if null T≠{0}.
Theorem46Homogeneous system of linear equations
A homogeneous system of linear equations with more variables than equations has nonzero solutions.
Proof
More variables than equations means the corresponding linear map goes from Fn to Fm with n>m, i.e. into a smaller-dimensional space; such a map is not injective, so the null space contains nonzero vectors. □
Example46.1Whether an inhomogeneous system of linear equations has no solutions for some choice of the constant terms
If range T is a proper subspace of W, then there are choices of the constant terms for which the equations have no solution. So the question can be rephrased as: is the linear map T surjective?
Theorem47Inhomogeneous system of linear equations
An inhomogeneous system of linear equations with fewer variables than equations has no solution for some choice of the constant terms.
Proof
Fewer variables than equations means the corresponding linear map goes from Fn to Fm with n<m, i.e. into a larger-dimensional space; such a map is not surjective, and any choice of constant terms outside range T yields a system with no solution. □
3.C Matrices
Representing a Linear Map by a Matrix
Definition48Matrix of a linear map, M(T)
Suppose T∈L(V,W) and v1,⋯,vn is a basis of V and w1,⋯,wm is a basis of W. The matrix of T with respect to these bases is the m-by-n matrix M(T) whose entries Aj,k are defined by Tvk=A1,kw1+⋯+Am,kwm. If the bases are not clear from the context, then the notation M(T,(v1,⋯,vn),(w1,⋯,wm)) is used.
The matrix M(T) of a linear map T∈L(V,W) depends on the basis v1,⋯,vn of V and the basis w1,⋯,wm of W, as well as on T. However, the bases should be clear from the context, and thus they are often not included in the notation.
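To make the definition concrete, here is a sketch (my own example, not from the text) that builds M(T) column by column for the differentiation map T=d/dx from P3(R) to P2(R), with bases 1,x,x²,x³ and 1,x,x²:

```python
# Build M(T) for T = d/dx: the k-th column holds the coefficients of T(v_k)
# expanded in the basis of W.
import sympy as sp

x = sp.symbols('x')
basis_V = [1, x, x**2, x**3]
basis_W = [1, x, x**2]

cols = []
for v in basis_V:
    Tv = sp.Poly(sp.diff(v, x), x)                      # apply T to a basis vector
    cols.append([Tv.coeff_monomial(w) for w in basis_W])

M = sp.Matrix(cols).T        # the m-by-n (3-by-4) matrix of T
print(M)                     # Matrix([[0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 3]])
```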
Definition49Notation Fm,n
For m and n positive integers, the set of all m-by-n matrices with entries in F is denoted by Fm,n.
Theorem50Linear combination of columns
Suppose A is an m-by-n matrix and c is the n-by-1 matrix with entries c1,⋯,cn. Then
Ac=c1A.,1+⋯+cnA.,n
In other words, Ac is a linear combination of the columns of A, with the scalars that multiply the columns coming from c. This is sketched numerically below.
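A quick numerical check (with arbitrary example data of my own):

```python
# A @ c equals the linear combination of the columns of A with scalars from c.
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])
c = np.array([2., -1., 3.])

combo = c[0] * A[:, 0] + c[1] * A[:, 1] + c[2] * A[:, 2]
assert np.allclose(A @ c, combo)
```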
3.D Invertibility and Isomorphic Vector Spaces
Invertible Linear Maps
Definition51invertible, inverse
A linear map T∈L(V,W) is called invertible if there exists a linear map S∈L(W,V) such that ST equals the identity map on V and TS equals the identity map on W. A linear map S∈L(W,V) satisfying ST=I and TS=I is called an inverse of T (note that the first I is the identity on V and the second I is the identity on W).
Theorem52Inverse is unique
An invertible linear map has a unique inverse.
Proof
Suppose S1 and S2 are both inverses of T. Then S1=S1I=S1(TS2)=(S1T)S2=IS2=S2. Thus S1=S2. □
Theorem53Invertibility is equivalent to injectivity and surjectivity
A linear map is invertible if and only if it is injective and surjective.
Proof
★ The "only if" direction is easy to prove.
★ Conversely, suppose T is injective and surjective. Define S:W→V by letting S(w) be the unique v∈V with Tv=w (surjectivity gives existence, injectivity gives uniqueness), so that T(S(w))=w.
★ Clearly T∘S=I on W. Then what about S∘T? For each v∈V we have T(S(Tv))=Tv, and injectivity of T gives S(Tv)=v.
★ To complete the proof, one checks that S is linear as well: apply T to S(w1+w2)−S(w1)−S(w2) and use injectivity of T. □
Isomorphic Vector Spaces
The next definition captures the idea of two vector spaces that are essentially the same, except for the names of the elements of the vector spaces.
Definition54isomorphic, isomorphism
★ An isomorphism is an invertible linear map★ Two vector spaces are called isomorphic if there is an isomorphism from one vector space onto the other one.
More information
The Greek word isos means equal; the Greek word morph means shape. Thus isomorphic literally means equal shape. "Isomorphism" means the same thing as "invertible linear map"; use "isomorphism" when you want to emphasize that the two spaces are essentially the same.
Theorem55Dimension shows whether vector spaces are isomorphic
Two finite-dimensional vector spaces over F are isomorphic if and only if they have the same dimension.
Proof
First, suppose the vector spaces V and W are isomorphic, and let T:V→W be an isomorphism. Invertibility forces null T={0} and range T=W, so by the Fundamental Theorem of Linear Maps, dim W=dim range T=dim V.
To complete the proof in the other direction, suppose the two finite-dimensional spaces have the same dimension. Choose bases v1,⋯,vn of V and w1,⋯,wn of W, and construct the linear map that preserves the coefficients in these bases: T(c1v1+⋯+cnvn)=c1w1+⋯+cnwn. It is injective and surjective, hence an isomorphism. □
The previous result implies that each finite-dimensional vector space V is isomorphic to Fn, where n=dim V.If v1,⋯,vn is a basis of V and w1,⋯,wm is a basis of W, then for each T∈L(V,W), we have a matrix M(T)∈Fm,n. In other words, once bases have been fixed for V and W, M becomes a function from L(V,W) to Fm,n. Notice M is a linear map, and it's actually invertible, as we will show.
Theorem56L(V,W) and Fm,n are isomorphic
Suppose v1,⋯,vn is a basis of V and w1,⋯,wm is a basis of W, Then M is an isomorphism between L(V,W) and Fm,n.
Theorem57dimL(V,W)=(dim V )(dim W )
M is an isomorphism between L(V,W) and Fm,n, so L(V,W) has the same dimension as Fm,n, and dim Fm,n=mn. Therefore dim L(V,W)=(dim V)(dim W).
Linear Maps Thought of as Matrix Multiplication
Previously we defined the matrix of a linear map. Now we define the matrix of a vector.
Definition58matrix of a vector, M(v)
Suppose v∈V and v1,⋯,vn is a basis of V. For v=c1v1+⋯+cnvn, define M(v) to be the n-by-1 matrix with entries c1,⋯,cn. The map v↦M(v) is an isomorphism of V onto Fn,1.
Theorem59M(T).,k=M(Tvk)
Suppose T∈L(V,W) and v1,⋯,vn is a basis of V and w1,⋯,wm is a basis of W. Write A=M(T), so that Tvk=A1,kw1+⋯+Am,kwm. Then M(Tvk) is the m-by-1 matrix with entries A1,k,⋯,Am,k, which is exactly the kth column A.,k of M(T).
It means that, for a linear map, the kth column of its matrix equals the matrix of the image of the kth basis vector.
Theorem60Linear maps act like matrix multiplication
Suppose v=a1v1+⋯+anvn. Then
M(Tv)=a1M(Tv1)+a2M(Tv2)+⋯+anM(Tvn)
     =a1M(T).,1+⋯+anM(T).,n
     =M(T)M(v)
Because the result above allows us to think (via isomorphism) of each linear map as multiplication of Fn,1 by some matrix A, keep in mind that the specific matrix A depends not only on the linear map but also on the choice of bases. One of the themes of many of the most important results in later chapters will be the choice of a basis that makes the matrix A as simple as possible.
Operators
Definition61operator, L(V)
A linear map from a vector space to itself is called an operator. We write L(V)=L(V,V).
Recall that injectivity together with surjectivity implies invertibility. We now show that for an operator on a finite-dimensional vector space, injectivity and surjectivity are equivalent to each other.
Theorem62injectivity is equivalent to surjectivity in finite-dimensional operator
Suppose V is finite-dimensional and T∈L(V). Then T is injective if and only if T is surjective.
Proof
Use the Fundamental Theorem of Linear Maps. Injectivity means null T={0}, so dim range T=dim V−0=dim V, hence range T=V and T is surjective. Conversely, surjectivity means dim range T=dim V, so dim null T=0 and T is injective. □
3.E Products and Quotients of Vector Spaces
Products of Vector Spaces
As usual when dealing with more than one vector space, all the vector spaces in use should be over the same field.
Definition63product of vector spaces
Suppose V1,⋯,Vm are vector spaces over F.
★ The product V1×⋯×Vm is defined by V1×⋯×Vm={(v1,⋯,vm):v1∈V1,⋯,vm∈Vm}.
★ Addition on V1×⋯×Vm is defined by (u1,⋯,um)+(v1,⋯,vm)=(u1+v1,⋯,um+vm).
★ Scalar multiplication on V1×⋯×Vm is defined similarly.
Example63.1R2×R3 is isomorphic to R5
In this case, the isomorphism is so natural that we should think of it as a relabeling. Some people would even informally say that R2×R3=R5, which is not technically correct but which captures the spirit of identification via relabeling.
Example63.2Find a basis of P2(R)×R2
Recall that the dimension of a product is the sum of the dimensions (here 3+2=5). One basis is: (1,(0,0)); (x,(0,0)); (x2,(0,0)); (0,(1,0)); (0,(0,1)).
Theorem64Dimension of a product is the sum of dimensions
dim(V1×⋯×Vm)=dim V1+⋯+dim Vm
Products and Direct Sums
In the next result, the map Γ is surjective by the definition of U1+⋯+Um. Thus the last word in the result below could be changed from "injective" to "invertible".
Theorem65Products and direct sums
Suppose that U1,⋯,Um are subspaces of V. Define a linear map Γ:U1×⋯×Um→U1+⋯+Um by Γ(u1,⋯,um)=u1+⋯+um. Then U1+⋯+Um is a direct sum if and only if Γ is injective.
Proof
If Γ is injective, then for each vector u1+⋯+um in the sum, the tuple (u1,⋯,um) producing it is unique, which is exactly the definition of a direct sum; and conversely. □
Theorem66A sum is a direct sum if and only if dimensions add up
Suppose U1,⋯,Um are finite-dimensional subspaces of V. Then U1+⋯+Um is a direct sum if and only if dim(U1+⋯+Um)=dim U1+⋯+dim Um.
Proof
The product U1×⋯×Um has dimension dim U1+⋯+dim Um. The sum is a direct sum if and only if Γ is injective, i.e. (since Γ is already surjective) if and only if Γ is invertible. A linear map between finite-dimensional spaces can be an isomorphism only if the dimension of the range equals the dimension of the domain, so this happens exactly when dim(U1+⋯+Um)=dim U1+⋯+dim Um. □
Quotients of Vector Spaces
We begin our approach to quotient spaces by defining the sum of a vector and a subspace.
Definition67v+U
Suppose v∈V and U is a subspace of V. Then v+U is the subset of V defined by v+U={v+u:u∈U}.
Definition68affine subset, parallel
★ An affine subset is a subset of V of the form v+U for some v∈V and some subspace U of V.★ For v∈V and U a subspace of V, the affine subset v+U is said to be parallel to U
Definition69quotient space, V/U
Suppose U is a subspace of V. Then the quotient space V/U is the set of all affine subsets of V parallel to U. In other words,V/U={v+U:v∈V}.
Our next goal is to make V/U into a vector space. To do this, we will need the following result.
Theorem70Two affine subsets parallel to U are equal or disjoint
Suppose U is a subspace of V and v,w∈V. Then the following are equivalent:
(1) v−w∈U
(2) v+U=w+U
(3) (v+U)∩(w+U)≠∅
Proof
If (1) holds, then for every u∈U we have v+u=w+((v−w)+u)∈w+U, so v+U⊂w+U; by symmetry, w+U⊂v+U, so (2) holds. (2) implies (3), because v+U is nonempty. Finally, if (3) holds, there exist u1,u2∈U such that v+u1=w+u2; thus v−w=u2−u1∈U, which is (1). □
Now we can define addition and scalar multiplication on V/U
Definition71addition and scalar multiplication on V/U
Defined as followed:(v+U)+(w+U):=(v+w)+U𝜆(v+U):=𝜆v+U
Theorem72Quotient Space is a Vector Space
Suppose U is a subspace of V. Then V/U, with the operations of addition and scalar multiplication as defined above, is a vector space.The additive identity of V/U is 0+U and that the additive inverse of v+U is (-v)+U.
Fig1: Quotient space — V, a subspace U of V, and V/U.
The next concept gives us an easy way to compute the dimension of V/U.
Definition73quotient map, 𝜋
Suppose U is a subspace of V. The quotient map 𝜋 is the linear map 𝜋:V→V/U defined by 𝜋(v):=v+U for v∈V. Note that 𝜋 depends on U as well as on V, although the notation does not show this.
Theorem74Dimension of a quotient space
Suppose V is finite-dimensional and U is a subspace of V. Then dim V/U=dim V−dim U.
Proof
Use the Fundamental Theorem of Linear Maps on the quotient map 𝜋: its null space is U itself and its range is V/U, so dim V=dim U+dim V/U, which gives the desired result. □
Each linear map T on V induces a linear map T̃ on V/(null T), which we now define.
Definition75T̃
Suppose T∈L(V,W). Define T̃:V/(null T)→W by T̃(v+null T)=Tv. The point is that all vectors in an affine subset parallel to null T have the same image under T.
To show that the definition of T̃ makes sense, suppose u,v∈V are such that u+null T=v+null T. Then, by the theorem on parallel affine subsets, u−v∈null T. Thus T(u−v)=0, hence Tu=Tv. The definition of T̃ indeed makes sense.
Theorem76Null space and range of T̃
Suppose T∈L(V,W). Then
(1) T̃ is a linear map from V/(null T) to W;
(2) T̃ is injective;
(3) range T̃=range T;
(4) V/(null T) is isomorphic to range T.
Proof
(1) omitted.
(2) We need to prove that null T̃={0}. Suppose T̃(v+null T)=0. Then Tv=0, so v∈null T, and hence v+null T=0+null T, which is the additive identity of the quotient space V/(null T). Thus null T̃={0}.
(3) follows directly from the definition of T̃, and (4) follows from (2) and (3): T̃ is an injective map of V/(null T) onto range T. □
3.F Duality
The Dual Space and the Dual Map
Linear maps into the scalar field F play a special role in linear algebra, and thus they get a special name:
Definition77linear functional
A linear functional on V is a linear map from V to F. In other words, a linear functional is an element of L(V,F).
Example77.1linear functionals
★ Define 𝜑:R3→R by 𝜑(x,y,z):=4x−5y+2z. Then 𝜑 is a linear functional on R3.
★ Fix (c1,⋯,cn)∈Fn. Define 𝜑:Fn→F by 𝜑(x1,⋯,xn)=c1x1+⋯+cnxn.
★ Define 𝜑:P(R)→R by 𝜑(p)=∫₀¹ p(x)dx.
★ Define 𝜑:P(R)→R by 𝜑(p)=3p″(5)+7p(4).
The vector space L(V,F) also gets a special name and special notation:
Definition78dual space, V'
The dual space of V, denoted V', is the vector space of all linear functionals on V. In other words, V'=L(V,F).
Theorem79dim V'=dim V
Suppose V is finite-dimensional. Then dim V'=dim V.
Proof
dim V'=dim L(V,F)=(dim V)(dim F)=dim V. □
Definition80dual basis
If v1,⋯,vn is a basis of V, then the dual basis of v1,⋯,vn is the list 𝜑1,⋯,𝜑n of elements of V', where each 𝜑j is the linear functional on V such that 𝜑j(vk)=1 if k=j, and 𝜑j(vk)=0 if k≠j. A concrete sketch for Fn appears below.
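For V=Fn=R3 the dual basis can be computed concretely (a sketch of my own, identifying each functional on Fn with a row vector): if the columns of B are the basis v1,⋯,vn, the rows of B⁻¹ represent 𝜑1,⋯,𝜑n, because (B⁻¹B)j,k=𝜑j(vk).

```python
# Dual basis in R^3 as the rows of the inverse of the basis matrix.
import numpy as np

B = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [0., 0., 1.]])       # columns are the basis v1, v2, v3
Phi = np.linalg.inv(B)             # row j represents the functional phi_j

assert np.allclose(Phi @ B, np.eye(3))   # phi_j(v_k) = 1 if j == k else 0
```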
The next result shows that the dual basis is indeed a basis.
Theorem81Dual basis is a basis of the dual space
Suppose V is finite-dimensional. Then the dual basis of a basis of V is a basis of V'.
Proof
Suppose a1𝜑1+⋯+an𝜑n=0. Applying both sides to vk gives ak=0 for each k, so 𝜑1,⋯,𝜑n is linearly independent. Since dim V'=dim V=n, a linearly independent list of length n is a basis. □
Definition82dual map, T'
If T∈L(V,W), then the dual map of T is the linear map T'∈L(W',V') defined by T'(𝜑)=𝜑∘T for 𝜑∈W'. Note that 𝜑∈W', so 𝜑∘T is a linear functional on V, i.e. T'(𝜑)∈V'. The dual map takes a linear functional in the dual space of W and finds the corresponding linear functional in V': the one that sends any v to 𝜑(Tv).
Fig2: the dual map (T:V→W induces T':W'→V' on linear functionals).
Example82.1dual map
It is like mapping twice. When you map v∈V into W, a linear functional on W then maps the result into F; composing gives a linear functional on V, and you want these two functionals to be connected. For example, suppose you choose bases that make T very simple: T(a1v1+⋯+anvn)=a1w1+⋯+anwn. If 𝜓j is the jth dual basis element of W', then the dual map produces v'=𝜓j∘T∈V', and v'(a1v1+⋯+anvn)=𝜓j(a1w1+⋯+anwn)=aj; that is, v' is the jth dual basis element of V'.
Example82.2an image of the dual map
(Diagram: a 2-by-3 matrix with entries a11,⋯,a23 maps (x1,x2,x3) to (y1,y2); the dual map, represented by a 3-by-2 matrix with entries b11,⋯,b23, sends functionals on the target space back to functionals on the domain.)
Theorem83Algebraic properties of dual maps
★ (S+T)'=S'+T' for all S,T∈L(V,W)
★ (𝜆T)'=𝜆T' for all T∈L(V,W)
★ (ST)'=T'S' for all T∈L(U,V) and all S∈L(V,W)
Proof
★ (S+T)'𝜑=𝜑∘(S+T)=𝜑∘S+𝜑∘T=S'𝜑+T'𝜑.
★ Suppose 𝜑∈W'. Then (ST)'𝜑=𝜑∘(ST)=(𝜑∘S)∘T. Now 𝜑∘S is a linear functional on V, so by the definition of the dual map, (𝜑∘S)∘T=T'(𝜑∘S)=T'(S'𝜑). Hence (ST)'=T'S'. □
The Null Space and Range of the Dual of a Linear Map
Our goal in this subsection is to describe null T' and range T' in terms of range T and null T.
Definition84annihilator, U0
For U⊂V, the annihilator of U, denoted U0, is defined by U0={𝜑∈V':𝜑(u)=0 for all u∈U}.
One way to see how U0 arises from a map: let i:U→V be the inclusion map, i(u)=u. Its dual i':V'→U' sends each functional on V to its restriction to U. A functional 𝜑∈V' belongs to U0 exactly when its restriction to U is the zero functional — in other words, U0 is the null space of i'.
Example84.1 the annihilator of P(R)
Suppose U is the subspace of P(R) consisting of all polynomial multiples of x2. If 𝜑 is the linear functional on P(R) defined by 𝜑(p)=p'(0), then 𝜑∈U0.
For U⊂V, the annihilator U0 is a subset of the dual space V'. Thus U0 depends on the vector space containing U, so a notation such as U0V would be more precise.
Example84.2
Let e1,e2,⋯,e5 denote the standard basis of R5, and let 𝜑1,𝜑2,⋯,𝜑5 denote the dual basis of (R5)'. Suppose U=span(e1,e2)={(x1,x2,0,0,0)∈R5:x1,x2∈R}. Show that U0=span(𝜑3,𝜑4,𝜑5).
Solution: omitted.
Theorem85The annihilator is a subspace
Suppose U⊂V. Then U0 is a subspace of V'.
Proof
Clearly 0∈U0, and U0 is closed under addition and scalar multiplication: if 𝜑 and 𝜓 vanish on U, so do 𝜑+𝜓 and 𝜆𝜑. □
Theorem86Dimension of the annihilator
Suppose V is a finite-dimensional vector space and U is a subspace of V. Then dim U+dim U0=dim V.
Proof
Informally: extend a basis of U to a basis of V; the dual basis vectors corresponding to the added vectors span U0, so dim U0 equals the number of basis vectors "absent" from U, namely dim V−dim U.
Formally: let i∈L(U,V) be the inclusion map, i(u)=u, so that i'∈L(V',U'). By the Fundamental Theorem of Linear Maps,
dim null i'+dim range i'=dim V'=dim V.
Now null i'={𝜑∈V':𝜑(u)=0 for all u∈U}=U0, and range i'=U' (every functional on U extends to one on V), so dim range i'=dim U'=dim U. Hence dim U0+dim U=dim V. □
We have computed the null space of i', the dual of the inclusion of a subspace U into V. Now, what is the null space of the dual of a general linear map T?
Theorem87The null space of T'
Suppose V and W are finite-dimensional and T∈L(V,W). Then
★ null T'=(range T)0
★ dim null T'=dim null T+dim W−dim V
Proof
Recall that T'𝜑:=𝜑∘T. A functional 𝜑∈W' is in null T' exactly when (T'𝜑)(v)=𝜑(Tv)=0 for all v∈V, i.e. exactly when 𝜑 vanishes on range T, i.e. 𝜑∈(range T)0. This proves the first bullet.
For the second, use the dimension formula for the annihilator:
dim null T'=dim (range T)0=dim W−dim range T.
By the Fundamental Theorem, dim range T=dim V−dim null T, so
dim null T'=dim null T+dim W−dim V. □
Fig3: relationships between null T, null T', and (range T)0.
Theorem88T surjective is equivalent to T' injective
Suppose V and W are finite-dimensional and T∈L(V,W). Then T is surjective if and only if T' is injective.
Proof
T is surjective iff range T=W iff (range T)0={0} iff null T'={0} (by the previous theorem) iff T' is injective. □
Theorem89The range of T'
Suppose V and W are finite-dimensional and T∈L(V,W). Then
★ dim range T'=dim range T
★ range T'=(null T)0
Proof
For the dimensions:
dim range T'=dim W'−dim null T'=dim W−(dim null T+dim W−dim V)=dim V−dim null T=dim range T.
For range T': every 𝜓∈range T' has the form 𝜓=T'𝜑=𝜑∘T for some 𝜑∈W'. If v∈null T, then 𝜓(v)=𝜑(Tv)=𝜑(0)=0. So every 𝜓∈range T' maps every vector of null T to 0, i.e. range T'⊆(null T)0. Moreover, the computation above together with the annihilator dimension formula gives dim range T'=dim range T=dim V−dim null T=dim (null T)0, so the inclusion is an equality. □
Theorem90T injective is equivalent to T' surjective
Suppose V and W are finite-dimensional and T∈L(V,W). Then T is injective if and only if T' is surjective.
Proof
T is injective iff null T={0} iff (null T)0=V'. By the previous theorem, range T'=(null T)0, so this holds iff range T'=V', i.e. iff T' is surjective. □
The Matrix of the Dual Map of a Linear Map
We now define the transpose of a matrix.
Definition91Transpose
(At)k,j:=Aj,k
Theorem92The transpose of the product of matrices
(AC)t=CtAt
Proof
((AC)t)k,j=(AC)j,k=∑i Aj,iCi,k=∑i (Ct)k,i(At)i,j=(CtAt)k,j □
The setting for the next result is the assumption that we have a basis v1,⋯,vn of V, along with its dual basis 𝜑1,⋯,𝜑n of V'.
Theorem93The matrix of T' is the transpose of the matrix of T
Suppose T∈L(V,W). Then M(T')=(M(T))t.
Proof
Let A=M(T) with respect to bases v1,⋯,vn of V and w1,⋯,wm of W, and let 𝜓1,⋯,𝜓m and 𝜑1,⋯,𝜑n be the corresponding dual bases. Since Tvj=∑i Ai,jwi, applying 𝜓i gives 𝜓i(Tvj)=Ai,j. Take any w'=∑i ci𝜓i∈W'. Then
(T'w')(vj)=w'(Tvj)=∑i ci𝜓i(Tvj)=∑i Ai,jci.
On the other hand, expanding T'w'=∑j dj𝜑j in the dual basis of V' gives (T'w')(vj)=dj. Hence dj=∑i (At)j,i ci, i.e. M(T'w')=At·M(w') for every w'. Therefore M(T')=At=(M(T))t. □
The Rank of a Matrix
Definition94row rank & column rank
Suppose A is an m-by-n matrix with entries in F. The row rank of A is the dimension of the span of the rows of A in F1,n, and the column rank of A is the dimension of the span of the columns of A in Fm,1.
Theorem95Dimension of range T equals column rank of M(T)
Suppose V and W are finite-dimensional and T∈L(V,W). Then dim range T equals the column rank of M(T).
Proof
Suppose v1,⋯,vn is a basis of V and w1,⋯,wm is a basis of W. The function that takes w∈span(Tv1,⋯,Tvn) to M(w) is easily seen to be an isomorphism from span(Tv1,⋯,Tvn) onto span(M(Tv1),⋯,M(Tvn)), and the dimension of the latter span equals the column rank of M(T), because M(Tvk) is the kth column of M(T) (multiplying M(T) by the column with 1 in the kth slot and 0 elsewhere picks out the kth column). It is easy to see that range T=span(Tv1,⋯,Tvn). Thus we have
dim range T=dim span(Tv1,⋯,Tvn)=the column rank of M(T). □
Theorem96Row rank equals column rank
Suppose A∈Fm,n. Then the row rank of A equals the column rank of A.
Proof
Define T:Fn,1→Fm,1 by Tx:=Ax, so that M(T)=A with respect to the standard bases. Now
column rank of A = column rank of M(T) = dim range T = dim range T' = column rank of M(T') = column rank of At = row rank of A. □
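A one-line numerical sanity check (arbitrary example data of my own):

```python
# Row rank equals column rank: rank(A) == rank(A^t).
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [0., 1., 1.]])
assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T)   # both equal 2
```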
Chapter 4. Polynomials (this chapter is mostly omitted)
4.A Polynomials
The Division Algorithm for Polynomials
Theorem97Division Algorithm for Polynomials
Suppose that p,s∈P(F), with s≠0. Then there exist unique polynomials q,r∈P(F) such that p=sq+r and deg r<deg s.
Theorem98Fundamental Theorem of Algebra
Every non-constant polynomial with complex coefficients has a zero.
⋯⋯
Chapter 5. Eigenvalues, Eigenvectors, and Invariant Subspaces
Linear maps from one vector space to another vector space were the objects of study in Chapter 3. Now we begin our investigation of linear maps from a finite-dimensional vector space to itself. Their study constitutes the most important part of linear algebra.
5.A Invariant Subspaces
In this chapter we develop the tools that will help us understand the structure of operators. Recall that an operator is a linear map from a vector space to itself, and that we denote the set of operators on V by L(V).
Let's see how we might better understand what an operator looks like. Suppose T∈L(V). If we have a direct sum decomposition V=U1⊕U2⊕⋯⊕Um, where each Uj is a proper subspace of V, then to understand the behavior of T, we need only understand the behavior of each T|Uj; here T|Uj denotes the restriction of T to the smaller domain Uj. Dealing with T|Uj should be easier than dealing with T because Uj is a smaller vector space than V.
However, if we intend to apply tools useful in the study of operators (such as taking powers), then we have a problem: T|Uj may not map Uj into itself. Thus we are led to consider only decompositions of V of the form above where T maps each Uj into itself.
Definition99invariant subspace
Suppose T∈L(V). A subspace U of V is called invariant under T if u∈U implies Tu∈U.
Example99.1 invariant subspace
Suppose T∈L(V). Show that each of the following subspaces of V is invariant under T:
★ {0};★ V;★ null T;★ range T;
Eigenvalues and Eigenvectors
Now we turn to an investigation of the simplest possible nontrivial invariant subspaces: invariant subspaces with dimension 1.
Take any v∈V with v≠0 and let U equal the set of all scalar multiples of v: U={𝜆v:𝜆∈F}=span(v). Then U is a 1-dimensional subspace of V, and every 1-dimensional subspace of V is of this form for an appropriate choice of v. If U is invariant under an operator T∈L(V), then Tv∈U, and hence there is a scalar 𝜆∈F such that Tv=𝜆v. Conversely, if Tv=𝜆v for some 𝜆∈F, then span(v) is a 1-dimensional subspace of V invariant under T. The equation Tv=𝜆v, which we have just seen is intimately connected with 1-dimensional invariant subspaces, is important enough that the vectors v and scalars 𝜆 satisfying it are given special names.
Definition100eigenvalue and eigenvector
Suppose T∈L(V). A number 𝜆∈F is called an eigenvalue of T if there exists v∈V such that v≠0 and Tv=𝜆v. Such a vector v is then called an eigenvector of T corresponding to 𝜆.
Now we show that eigenvectors corresponding to distinct eigenvalues are linearly independent.
Theorem101Linearly independent eigenvectors
Let T∈L(V). Suppose 𝜆1,⋯,𝜆m are distinct eigenvalues of T and v1,⋯,vm are corresponding eigenvectors. Then v1,⋯,vm is linearly independent.
Proof
Suppose v1,⋯,vm is linearly dependent. Let k be the smallest positive integer such that vk∈span(v1,⋯,vk-1); then vk=a1v1+⋯+ak-1vk-1. Applying T gives
Tvk=a1𝜆1v1+⋯+ak-1𝜆k-1vk-1,
while also Tvk=𝜆kvk=𝜆k(a1v1+⋯+ak-1vk-1).
Subtracting the two expressions:
0=a1(𝜆k−𝜆1)v1+⋯+ak-1(𝜆k−𝜆k-1)vk-1.
Because we chose k to be the smallest such integer, v1,⋯,vk-1 are linearly independent, so aj(𝜆k−𝜆j)=0 for each j. The eigenvalues are distinct, so every aj=0, which makes vk=0 — but eigenvectors are nonzero, a contradiction. Therefore our assumption that v1,⋯,vm is linearly dependent was false. □
From the sketch one can see that if the eigenvalues differ (here 𝜆1=4 and 𝜆2=6), a vector v3 with components along both eigenvectors satisfies 𝜆3v3≠Tv3 for every scalar 𝜆3.
Fig4: eigenvectors of different eigenvalues.
Theorem102Number of eigenvalues
Suppose V is finite-dimensional. Then each operator on V has at most dim V distinct eigenvalues.
Proof
Eigenvectors corresponding to distinct eigenvalues are linearly independent, and a linearly independent list in V has length at most dim V. □
Restriction and Quotient Operators
If T∈L(V) and U is a subspace of V invariant under T, then U determines two other operators T|U∈L(U) and T/U∈L(V/U) in a natural way, as defined below.
Definition103T|U and T/U
Suppose T∈L(V), and U is a subspace of V invariant under T.
★ The restriction operator T|U∈L(U) is defined by T|U(u):=Tu for u∈U.
★ The quotient operator T/U∈L(V/U) is defined by (T/U)(v+U)=Tv+U for v∈V.
Example103.1
Define an operator T∈L(F2) by T(x,y):=(y,0). Let U={(x,0):x∈F}. Show that:
(1) T|U is the 0 operator on U.
Solution: T|U(x,0)=T(x,0)=(0,0).
(2) There does not exist a subspace W of F2 that is invariant under T and such that F2=U⊕W.
Solution: if such a W existed, then dim W=dim F2−dim U=1, so W would be spanned by an eigenvector of T. But the eigenvectors of T are exactly the vectors (x,0) with x≠0 (with eigenvalue 0), and these already lie in U, contradicting U∩W={0}.
(3) T/U is the 0 operator on F2/U.
Solution: (T/U)(v+U)=Tv+U, and Tv∈U for every v∈F2, so Tv+U=U, which is the 0 of the quotient space F2/U.
5.B Eigenvectors and Upper-Triangular Matrices
Polynomials Applied to Operators
This is a new use of the symbol p, because we are applying it to operators, not just elements of F.
Definition104p(T)
Suppose T∈L(V), and p∈P(F) is a polynomial given byp(z)=a0+a1z+⋯+amzmfor z∈F. Then p(T) is the operator defined byp(T)=a0I+a1T+a2T2+⋯+amTm.
If we fix an operator T∈L(V) , then the function from P(F) to L(V) given by p↦p(T) is linear.
Definition105product of polynomials
If p,q∈P(F), then pq∈P(F) is the polynomial defined by (pq)(z)=p(z)q(z) for z∈F. Any two polynomials of the same operator commute: (pq)(T)=p(T)q(T)=q(T)p(T).
Existence of Eigenvalues
Theorem106Operators on complex vector spaces have an eigenvalue
Every operator on a finite-dimensional, nonzero, complex vector space has an eigenvalue.
Proof
Suppose V is a complex vector space with dimension n and T∈L(V). Choose v∈V with v≠0. Then
v,Tv,T²v,⋯,Tⁿv
is not linearly independent, because this list has length n+1>n. Thus there exist complex numbers a0,⋯,an, not all 0, such that
0=a0v+a1Tv+⋯+anTⁿv.
(Note that not all of a1,⋯,an are 0, since otherwise a0v=0 would force a0=0 too.) Make the a's the coefficients of a polynomial, which by the Fundamental Theorem of Algebra has a factorization
a0+a1z+⋯+anzⁿ=c(z−𝜆1)⋯(z−𝜆m),
where c is a nonzero complex number, m⩾1, each 𝜆j is in C, and the equation holds for all z∈C. We then have
0=a0v+a1Tv+⋯+anTⁿv=(a0I+a1T+⋯+anTⁿ)v=c(T−𝜆1I)⋯(T−𝜆mI)v.
Thus at least one of the operators T−𝜆jI is not injective; that 𝜆j is an eigenvalue of T. □
Upper-Triangular Matrices
Now that we are studying operators, which map a vector space to itself, the emphasis is on using only one basis.
Definition107Matrix of Operator
Suppose T∈L(V) and v1,⋯,vn is a basis of V. The matrix of T with respect to this basis is the n-by-n matrix whose entries Aj,k are defined by Tvk=A1,kv1+⋯+An,kvn.
Definition108Diagonal of a Matrix
The diagonal of a matrix consists of the entries along the line from the upper left corner to the bottom right corner.
Definition109Upper-triangular matrix
A matrix is called upper triangular if all the entries below the diagonal equal 0.
Theorem110Conditions for upper-triangular matrix
Suppose T∈L(V) and v1,⋯,vn is a basis of V. Then the following are equivalent:
★ the matrix of T with respect to v1,⋯,vn is upper triangular;
★ Tvj∈span(v1,⋯,vj) for each j=1,⋯,n;
★ span(v1,⋯,vj) is invariant under T for each j=1,⋯,n.
Theorem111Over C, every operator has an upper-triangular matrix
Suppose V is a finite-dimensional complex vector space and T∈L(V). Then T has an upper-triangular matrix with respect to some basis of V.
Proof
Use induction on the dimension of V. Clearly the desired result holds if dim V=1.
Suppose now that dim V>1 and the desired result holds for all complex vector spaces whose dimension is less than the dimension of V. Let 𝜆 be any eigenvalue of T (one exists, by the previous theorem). Let
U=range(T−𝜆I).
Because 𝜆 is an eigenvalue, T−𝜆I is not injective; since injectivity and surjectivity are equivalent for operators on a finite-dimensional space, T−𝜆I is not surjective, and thus dim U<dim V. Furthermore, U is invariant under T: suppose u∈U; then
Tu=(T−𝜆I)u+𝜆u,
and both terms on the right are in U ((T−𝜆I)u∈range(T−𝜆I)=U, and 𝜆u∈U), so Tu∈U.
Thus T|U is an operator on U. By our induction hypothesis, there is a basis u1,⋯,um of U with respect to which T|U has an upper-triangular matrix; equivalently,
Tuj=(T|U)(uj)∈span(u1,⋯,uj) for each j.
Extend u1,⋯,um to a basis u1,⋯,um,v1,⋯,vn of V. For each k we have the obvious relation
Tvk=(T−𝜆I)vk+𝜆vk.
The definition of U shows that (T−𝜆I)vk∈U=span(u1,⋯,um). Thus the equation above shows that
Tvk∈span(u1,⋯,um,vk)⊆span(u1,⋯,um,v1,⋯,vk).
By the conditions for an upper-triangular matrix, T has an upper-triangular matrix with respect to this basis. □
To see the proof above more clearly: the key idea is to pass to an invariant subspace of smaller dimension so that induction applies. We might first try U=range T, since every vector of range T must be expressible in the basis we are building; but range T need not have smaller dimension than V. Choosing an eigenvalue 𝜆 and setting U=range(T−𝜆I) fixes this: T−𝜆I is not injective, hence not surjective, so dim U<dim V, and the same argument goes through.
Theorem112Determination of invertibility from upper-triangular matrix
Suppose T∈L(V) has an upper-triangular matrix with respect to some basis of V. Then T is invertible if and only if all the entries on the diagonal of that upper-triangular matrix are nonzero.
Proof
To prove invertibility, it suffices to prove surjectivity (or injectivity). Suppose the upper-triangular matrix has 𝜆1,⋯,𝜆n on its diagonal, with respect to the basis v1,⋯,vn.
If every 𝜆j≠0: then T(v1/𝜆1)=v1, and T(v2/𝜆2)=av1+v2 for some scalar a, so v2∈range T (because v1∈range T); continuing up the list in this way shows that each vj∈range T. Hence range T=V and T is invertible.
For the other direction, suppose T is invertible. Then 𝜆1≠0, because otherwise we would have Tv1=0. Suppose 𝜆j=0 for some j with 1<j⩽n. Then Tvj∈span(v1,⋯,vj-1), so T maps span(v1,⋯,vj) into span(v1,⋯,vj-1); a map from a j-dimensional space into a (j−1)-dimensional space is not injective, so T is not injective and hence not invertible — a contradiction. □
Now we have a way to read off the eigenvalues from an upper-triangular matrix:
Theorem113Determination of eigenvalues from upper-triangular matrix
Suppose T∈L(V) has an upper-triangular matrix with respect to some basis of V. Then the eigenvalues of T are precisely the entries on the diagonal of the upper-triangular matrix.
Proof
Apply the previous theorem to T−𝜆I, whose upper-triangular matrix has diagonal entries 𝜆1−𝜆,⋯,𝜆n−𝜆: the operator T−𝜆I is not invertible (i.e. 𝜆 is an eigenvalue) exactly when 𝜆 equals some diagonal entry 𝜆j. □
A numerical illustration via the Schur form is sketched below.
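Here is a sketch, assuming scipy is available and with a made-up matrix: the complex Schur form of any square matrix is upper triangular with the eigenvalues on its diagonal, matching the theorem.

```python
# Schur form: A = Q T Q*, with T upper triangular and eigenvalues on diag(T).
import numpy as np
from scipy.linalg import schur

A = np.array([[2., 1., 0.],
              [0., 3., 1.],
              [1., 0., 4.]])
T, Q = schur(A, output='complex')

assert np.allclose(np.tril(T, k=-1), 0)                    # upper triangular
assert np.allclose(np.sort_complex(np.diag(T)),
                   np.sort_complex(np.linalg.eigvals(A)))  # diagonal = eigenvalues
```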
5.C Eigenspaces and Diagonal Matrices
Definition114diagonal matrix
A matrix is called diagonal if all the entries off the diagonal equal 0.
Definition115eigenspace, E(𝜆,T)
Suppose T∈L(V) and 𝜆∈F. The eigenspace of T corresponding to 𝜆, denoted E(𝜆,T), is defined byE(𝜆,T)=null(T-𝜆I)In other words, E(𝜆,T) is the set of all eigenvectors of T corresponding to eigenvalue 𝜆, along with the 0 vector.
Theorem116Sum of eigenspaces is a direct sum
Suppose V is finite-dimensional and T∈L(V). Suppose also that 𝜆1,⋯,𝜆m are distinct eigenvalues of T. Then E(𝜆1,T)+⋯+E(𝜆m,T) is a direct sum. Furthermore, dim E(𝜆1,T)+⋯+dim E(𝜆m,T)⩽dim V.
Definition117diagonalizable
An operator T∈L(V) is called diagonalizable if the operator has a diagonal matrix with respect to some basis of V.
Theorem118Conditions equivalent to diagonalizability
Suppose V is finite-dimensional and T∈L(V). Suppose also that 𝜆1,⋯,𝜆m are distinct eigenvalues of T. Then the following are equivalent:
(1) T is diagonalizable;
(2) V has a basis consisting of eigenvectors of T;
(3) there exist 1-dimensional subspaces U1,⋯,Un of V, each invariant under T, such that V=U1⊕⋯⊕Un;
(4) V=E(𝜆1,T)⊕⋯⊕E(𝜆m,T);
(5) dim V=dim E(𝜆1,T)+⋯+dim E(𝜆m,T).
Proof
An operator T∈L(V) has a diagonal matrix with respect to a basis v1,⋯,vn of V if and only if Tvj=𝜆jvj for each j. Thus (1) and (2) are equivalent.
Suppose (2) holds; thus V has a basis v1,⋯,vn consisting of eigenvectors of T. For each j, let Uj=span(vj). Then V=U1⊕⋯⊕Un, so (3) holds.
Suppose now that (3) holds. Choosing a nonzero vector in each Uj gives a basis of V consisting of eigenvectors of T; grouping these eigenvectors by eigenvalue shows that V=E(𝜆1,T)⊕⋯⊕E(𝜆m,T), so (4) holds. And (4) implies (5), because the dimension of a direct sum is the sum of the dimensions.
Finally, suppose (5) holds. Choose a basis of each E(𝜆j,T); put all these bases together to form a list v1,⋯,vn of eigenvectors of T, where n=dim V. This list is linearly independent, because the sum of the eigenspaces is direct (eigenvectors belonging to different eigenvalues are linearly independent); hence it is a basis of V consisting of eigenvectors, and (2) holds. □
If T∈L(V) has dim V distinct eigenvalues, then T is diagonalizable: the corresponding eigenvectors are linearly independent and there are dim V of them, so they form a basis. A numerical check is sketched below.
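A quick numerical check of this corollary (made-up matrix with distinct eigenvalues):

```python
# With dim V distinct eigenvalues, the eigenvectors form a basis and
# a change of basis diagonalizes the matrix.
import numpy as np

A = np.array([[2., 1.],
              [0., 5.]])            # distinct eigenvalues 2 and 5
vals, vecs = np.linalg.eig(A)       # columns of vecs are eigenvectors

assert np.linalg.matrix_rank(vecs) == A.shape[0]   # eigenvectors form a basis
D = np.linalg.inv(vecs) @ A @ vecs                 # diagonal in the eigenvector basis
assert np.allclose(D, np.diag(vals))
```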
Chapter 6. Inner Product Spaces
In making the definition of a vector space, we generalized the linear structure (addition and scalar multiplication) of R2 and R3. We ignored other important features, such as the notions of length and angle. These ideas are embedded in the concept we now investigate: inner products.
6.A Inner Products and Norms
Inner Products
Definition120dot product
For x,y∈Rn, the dot product of x and y, denoted x⋅y, is defined byx⋅y=x1y1+⋯+xnyn,where x=(x1,⋯,xn) and y=(y1,⋯,yn).
Note that the dot product of two vectors in Rn is a number, not a vector. Obviously x⋅x=‖x‖² for all x∈Rn. The dot product on Rn has the following properties:
★ x⋅x⩾0 for all x∈Rn;
★ x⋅x=0 if and only if x=0;
★ for y∈Rn fixed, the map from Rn to R that sends x∈Rn to x⋅y is linear;
★ x⋅y=y⋅x for all x,y∈Rn.
An inner product is a generalization of the dot product. At this point you may be tempted to guess that an inner product is defined by abstracting the properties of the dot product discussed in the last paragraph. For real vector spaces, that guess is correct. However, so that we can make a definition that will be useful for both real and complex vector spaces, we need to examine the complex case before making the definition. For z∈C, the absolute value is |z|=√((ℜz)²+(ℑz)²), so |z|²=z·conj(z), where conj(z) denotes the complex conjugate.
Definition121inner product
An inner product on V is a function that takes each ordered pair (u,v) of elements of V to a number ⟨u,v⟩∈F and has the following properties:
★ positivity: ⟨v,v⟩⩾0 for all v∈V;
★ definiteness: ⟨v,v⟩=0 if and only if v=0;
★ additivity in first slot: ⟨u+v,w⟩=⟨u,w⟩+⟨v,w⟩ for all u,v,w∈V;
★ homogeneity in first slot: ⟨𝜆u,v⟩=𝜆⟨u,v⟩ for all 𝜆∈F and all u,v∈V;
★ conjugate symmetry: ⟨u,v⟩=conj⟨v,u⟩ for all u,v∈V.
Example121.1
An inner product can be defined on P(R) by ⟨p,q⟩=∫₀^∞ p(x)q(x)e⁻ˣ dx.
Definition122inner product space
An inner product space is a vector space V along with an inner product on V.
For the rest of this chapter, V denotes an inner product space over F.
Theorem123basic properties of an inner product
★ For each fixed u∈V, the function that takes v to ⟨v,u⟩ is a linear map from V to F.
★ But fixing the first slot does not in general give a linear map in the second slot (see the last bullet)!
★ ⟨0,u⟩=⟨u,0⟩=0.
★ ⟨u,v+w⟩=⟨u,v⟩+⟨u,w⟩.
★ ⟨u,𝜆v⟩=conj(𝜆)⟨u,v⟩.
Norms
Now we see that each inner product determines a norm.
Definition124norm, ‖v‖
For v∈V, the norm of v is defined by ‖v‖=√⟨v,v⟩.
Theorem125basic properties of the norm
Suppose v∈V.
(1) ‖v‖=0 if and only if v=0.
(2) ‖𝜆v‖=|𝜆| ‖v‖ for all 𝜆∈F.
Definition126orthogonal
Two vectors u,v∈V are called orthogonal if ⟨u,v⟩=0.
You can think of the word orthogonal as a fancy word meaning perpendicular. We begin our study of orthogonality with an easy result.
Theorem127Orthogonal and 0
★ 0 is orthogonal to every vector in V.★ 0 is the only vector in V that is orthogonal to itself.
Theorem128Pythagorean Theorem
Suppose u,v are orthogonal vectors in V. Then ‖u+v‖²=‖u‖²+‖v‖².
Proof
‖u+v‖²=⟨u+v,u+v⟩=⟨u,u⟩+⟨u,v⟩+⟨v,u⟩+⟨v,v⟩=‖u‖²+‖v‖², because ⟨u,v⟩=⟨v,u⟩=0. □
The next result is called the parallelogram equality because of its geometric interpretation: in every parallelogram, the sum of the squares of the lengths of the diagonals equals the sum of the squares of the lengths of the four sides:
‖u+v‖²+‖u−v‖²=2(‖u‖²+‖v‖²).
6.B Orthonormal Bases
A list of vectors is called orthonormal if each vector in the list has norm 1 and is orthogonal to all the other vectors in the list. In other words, a list e1,⋯,em of vectors in V is orthonormal if ⟨ej,ek⟩=1 when j=k and ⟨ej,ek⟩=0 when j≠k.
Theorem134An orthonormal list is linearly independent
Every orthonormal list of vectors is linearly independent.
Proof
Suppose a1e1+⋯+amem=0. Taking the inner product of both sides with ej gives aj=0 for each j. □
How do we go about finding orthonormal bases? The algorithm used in the next proof is called the Gram-Schmidt Procedure. It gives a method for turning a linearly independent list into an orthonormal list with the same span as the original list.
Theorem135Gram-Schmidt Procedure
Suppose v1,⋯,vm is a linearly independent list of vectors in V. Let e1=v1/‖v1‖. For j=2,⋯,m, define ej inductively by
ej=(vj−⟨vj,e1⟩e1−⋯−⟨vj,ej-1⟩ej-1) / ‖vj−⟨vj,e1⟩e1−⋯−⟨vj,ej-1⟩ej-1‖.
Then e1,⋯,em is an orthonormal list of vectors in V such that span(v1,⋯,vj)=span(e1,⋯,ej) for j=1,⋯,m.
Proof
It is easy to verify that ⟨ej,ei⟩=0 for i<j, and each ej has norm 1 by construction. To think of this intuitively: each step removes the components of vj projected onto the previous vectors, then normalizes. A sketch in code follows.
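Here is a minimal sketch of the procedure for vectors in Rn (my own code; the input list is assumed linearly independent):

```python
# Gram-Schmidt: orthonormalize a linearly independent list, preserving the
# span of every initial segment.
import numpy as np

def gram_schmidt(vs):
    es = []
    for v in vs:
        w = v - sum(np.dot(v, e) * e for e in es)   # subtract projections onto previous e's
        es.append(w / np.linalg.norm(w))            # normalize
    return es

vs = [np.array([1., 1., 0.]), np.array([1., 0., 1.])]
e1, e2 = gram_schmidt(vs)
assert np.isclose(np.dot(e1, e2), 0)                # orthogonal
assert np.isclose(np.linalg.norm(e1), 1) and np.isclose(np.linalg.norm(e2), 1)
```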
Theorem136Existence of orthonormal basis
Every finite-dimensional inner product space has an orthonormal basis. Just apply the Gram-Schmidt Procedure to a basis of the space.
Theorem137Orthonormal lists extend to orthonormal basis
Suppose V is finite-dimensional. Then every orthonormal list of vectors in V can be extended to an orthonormal basis of V.
Proof: extend the orthonormal list to a basis of V and apply the Gram-Schmidt Procedure to it. Because the Gram-Schmidt Procedure only involves the basis vectors before the particular vector, the vectors that were already orthonormal are left unchanged, and the result is an orthonormal basis. □
Theorem138Upper-triangular matrix with respect to orthonormal basis
Suppose T∈L(V). If T has an upper-triangular matrix with respect to some basis of V, then T has an upper-triangular matrix with respect to some orthonormal basis of V.
The next result is an important application of the result above.
Theorem139Schur's Theorem
Suppose V is a finite-dimensional complex vector space and T∈L(V). Then T has an upper-triangular matrix with respect to some orthonormal basis of V.
Linear Functionals on Inner Product Spaces
Recall our definition of a linear functional. If u∈V, then the map that sends v to ⟨v,u⟩ is a linear functional on V. The next result shows that every linear functional on V is of this form.
Theorem140Riesz Representation Theorem
Suppose V is finite-dimensional and 𝜑 is a linear functional on V. Then there is a unique vector u∈V such that
𝜑(v)=⟨v,u⟩ for every v∈V.
Proof: First we show there exists a vector u∈V such that 𝜑(v)=⟨v,u⟩. Let e1,⋯,en be an orthonormal basis of V. Then
𝜑(v) = 𝜑(⟨v,e1⟩e1+⋯+⟨v,en⟩en)
= ⟨v,e1⟩𝜑(e1)+⋯+⟨v,en⟩𝜑(en)
= ⟨v, conj(𝜑(e1))e1⟩+⋯+⟨v, conj(𝜑(en))en⟩
= ⟨v,u⟩,
where
u = conj(𝜑(e1))e1+⋯+conj(𝜑(en))en.
Then we show u is unique: if 𝜑(v)=⟨v,u1⟩=⟨v,u2⟩ for every v∈V, then subtracting gives 0=⟨v,u1−u2⟩ for every v∈V. Taking v=u1−u2 shows ‖u1−u2‖²=0, so u1=u2. □
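A quick numerical illustration of the construction of u (a sketch assuming numpy; the coefficients defining 𝜑 are made up for the example):

```python
import numpy as np

a = np.array([2 - 1j, 0.5, 3j])   # phi(v) = a_1 v_1 + a_2 v_2 + a_3 v_3
phi = lambda v: a @ v              # a linear functional on C^3

e = np.eye(3, dtype=complex)       # standard orthonormal basis of C^3
u = sum(np.conj(phi(e[j])) * e[j] for j in range(3))  # u = sum conj(phi(e_j)) e_j

v = np.array([1.0 + 2j, -1j, 4.0])
# <v,u> = sum v_j conj(u_j), which numpy computes as np.vdot(u, v).
print(np.isclose(phi(v), np.vdot(u, v)))  # True: phi(v) = <v,u>
```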
6.C Orthogonal Complements and Minimization Problems
Orthogonal Complements
Definition141orthogonal complement, U⊥
If U is a subset of V, then the orthogonal complement of U, denoted U⊥, is the set of all vectors in V that are orthogonal to every vector in U:
U⊥={v∈V : ⟨v,u⟩=0 for every u∈U}.
Theorem142Basic properties of orthogonal complement
(1) If U is a subset of V, then U⊥ is a subspace of V.
(2) {0}⊥=V.
(3) V⊥={0}.
(4) If U is a subset of V, then U∩U⊥⊂{0}.
(5) If U and W are subsets of V and U⊂W, then W⊥⊂U⊥.
Theorem143Direct sum of a subspace and its orthogonal complement
Suppose U is a finite-dimensional subspace of V. Then
V=U⊕U⊥.
Proof: Let e1,⋯,em be an orthonormal basis of U. For v∈V, write
v = (⟨v,e1⟩e1+⋯+⟨v,em⟩em) + (v − ⟨v,e1⟩e1 − ⋯ − ⟨v,em⟩em),
and call the two summands u and w, so that v=u+w with u∈U. To show w∈U⊥, compute for each j:
⟨w,ej⟩ = ⟨v,ej⟩ − ⟨v,ej⟩⟨ej,ej⟩ = 0,
so w is orthogonal to each ej and hence to every vector of U=span(e1,⋯,em). Thus V=U+U⊥. Because U∩U⊥={0}, the sum is direct:
V=U⊕U⊥. □
Theorem144Dimension of the orthogonal complement
Suppose V is finite-dimensional and U is a subspace of V. Then
dim U⊥ = dim V − dim U.
Proof: this follows from the direct sum V=U⊕U⊥ of the previous theorem, because the dimension of a direct sum is the sum of the dimensions. □
The next result is an important consequence of the direct sum decomposition V=U⊕U⊥.
Theorem145The orthogonal complement of the orthogonal complement
Suppose U is a finite-dimensional subspace of V. Then U=(U⊥)⊥.
We now define an operator PU for each finite-dimensional subspace of V.
Definition146orthogonal projection, PU
Suppose U is a finite-dimensional subspace of V. The orthogonal projection of V onto U is the operator PU∈L(V) defined as follows:For v∈V, write v=u+w, where u∈U and w∈U⊥. Then PUv=u.
Theorem147Properties of the orthogonal projection PU
(1) PUu=u for every u∈U;
(2) PUw=0 for every w∈U⊥;
(3) range PU=U;
(4) null PU=U⊥;
(5) v−PUv∈U⊥ for every v∈V;
(6) PU²=PU;
(7) ‖PUv‖⩽‖v‖ for every v∈V;
(8) for every orthonormal basis e1,⋯,em of U,
PUv=⟨v,e1⟩e1+⋯+⟨v,em⟩em.
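Property (8) gives a direct way to compute P_U; a minimal numpy sketch (project is our own helper name):

```python
import numpy as np

def project(v, ortho_basis):
    """Orthogonal projection P_U v, given an orthonormal basis of U (rows):
    P_U v = <v,e_1> e_1 + ... + <v,e_m> e_m (property (8) above)."""
    return sum(np.vdot(e, v) * e for e in ortho_basis)

# U = the xy-plane in R^3, with orthonormal basis e_1, e_2.
U = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
v = np.array([3.0, 4.0, 5.0])
Pv = project(v, U)
print(Pv)                      # [3. 4. 0.]
print(np.vdot(v - Pv, U[0]))   # 0.0: v - P_U v lies in U-perp (property (5))
```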
Minimization Problems
The following problem often arises: given a subspace U of V and a point v∈V, find a point u∈U such that ‖v−u‖ is as small as possible. The next proposition shows that this minimization problem is solved by taking u=PUv.
Theorem148Minimizing the distance to a subspace
Suppose U is a finite-dimensional subspace of V, v∈V, and u∈U. Then
‖v−PUv‖⩽‖v−u‖.
Furthermore, the inequality above is an equality if and only if u=PUv.
Proof:
‖v−PUv‖² ⩽ ‖v−PUv‖² + ‖PUv−u‖²
= ‖(v−PUv)+(PUv−u)‖²
= ‖v−u‖²,
where the second line uses the Pythagorean Theorem: v−PUv∈U⊥ and PUv−u∈U are orthogonal. The inequality is an equality exactly when ‖PUv−u‖=0, that is, when u=PUv. □
Example148.1approximate sin(x)
Find a polynomial u with real coefficients and degree at most 5 that approximates sin x as well as possible on the interval [−π,π], in the sense that
∫_(−π)^π |sin x − u(x)|² dx
is as small as possible.
Solution: Let v∈C_R[−π,π] be the function defined by v(x)=sin x. Let U denote the subspace of C_R[−π,π] consisting of the polynomials with real coefficients and degree at most 5. Our problem can now be reformulated as follows: find u∈U such that ‖v−u‖ is as small as possible. To compute the solution, first apply the Gram-Schmidt Procedure to the basis 1,x,x²,x³,x⁴,x⁵ of U, producing an orthonormal basis e1,⋯,e6 of U. Then, again using the inner product ⟨f,g⟩=∫_(−π)^π f(x)g(x)dx, compute PUv. Doing this shows that
u(x)=0.987862x−0.155271x³+0.00564312x⁵,
where the π's that appear in the exact answer have been replaced with a good decimal approximation.
Fig5:sinx and u(x)
Another good approximation is the Taylor polynomial x − x³/3! + x⁵/5!. To see how good this approximation is, we plot the functions together.
Fig6:sinx and u(x)
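The coefficients above can be reproduced numerically. Rather than carrying out Gram-Schmidt symbolically, the sketch below (assuming numpy and scipy are available) solves the equivalent normal equations for the best degree-5 approximation:

```python
import numpy as np
from scipy.integrate import quad

# Minimize the integral of |sin x - u(x)|^2 over [-pi, pi] for deg(u) <= 5.
# Writing u(x) = sum c_k x^k, the minimizer solves G c = b, where
# G_{jk} = <x^j, x^k> and b_j = <sin, x^j> in the L^2 inner product.
n = 6
G = np.array([[quad(lambda x, j=j, k=k: x**(j + k), -np.pi, np.pi)[0]
               for k in range(n)] for j in range(n)])
b = np.array([quad(lambda x, j=j: np.sin(x) * x**j, -np.pi, np.pi)[0]
              for j in range(n)])
c = np.linalg.solve(G, b)
print(np.round(c, 6))  # odd coefficients ~ 0.987862, -0.155271, 0.00564312
```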
Chapter 7. Operators on Inner Product Spaces
The deepest results related to inner product spaces deal with the subject to which we now turn — operators on inner product spaces. By exploiting properties of the adjoint, we will develop a detailed description of several important classes of operators on inner product spaces.
7.A Self-Adjoint and Normal Operators
Adjoints
Definition149adjoint, T*
Suppose T∈L(V,W). The adjoint of T is the function T*: W→V such that
⟨Tv,w⟩=⟨v,T*w⟩
for every v∈V and every w∈W.
To see why this definition makes sense, fix w∈W and consider the map v ↦ ⟨Tv,w⟩, which is a linear functional on V. By the Riesz Representation Theorem, there exists a unique vector w′∈V such that this linear functional is given by v ↦ ⟨v,w′⟩. We define T*w to be this w′.
More information
The word adjoint has another meaning in linear algebra. In case you encounter the second meaning for adjoint elsewhere, be warned that the two meanings for adjoint are unrelated to each other.
Example149.1Find T*
(1) T:R³→R², T(x1,x2,x3)=(x2+3x3, 2x1). Here
⟨(x1,x2,x3), T*(y1,y2)⟩ = ⟨T(x1,x2,x3), (y1,y2)⟩
= y1(x2+3x3)+2x1y2
= ⟨(x1,x2,x3), (2y2,y1,3y1)⟩.
Thus T*(y1,y2)=(2y2,y1,3y1). □
(2) Fix u∈V and x∈W. Define T∈L(V,W) by Tv=⟨v,u⟩x. Fix w∈W. Then for every v∈V we have
⟨v,T*w⟩ = ⟨Tv,w⟩ = ⟨⟨v,u⟩x, w⟩ = ⟨v,u⟩⟨x,w⟩ = ⟨v, ⟨w,x⟩u⟩.
Thus T*w=⟨w,x⟩u. □
Theorem150The adjoint is a linear map
If T∈L(V,W), then T*∈L(W,V).
Theorem151Properties of the adjoint
(1) (S+T)*=S*+T* for all S,T∈L(V,W);
(2) (𝜆T)*=𝜆̄T* for all 𝜆∈F and T∈L(V,W);
(3) (T*)*=T for all T∈L(V,W), because
⟨w,(T*)*v⟩ = ⟨T*w,v⟩ = conj(⟨v,T*w⟩) = conj(⟨Tv,w⟩) = ⟨w,Tv⟩;
(4) I*=I;
(5) (ST)*=T*S* for all T∈L(V,W) and S∈L(W,U), because
⟨v,(ST)*u⟩ = ⟨(ST)v,u⟩ = ⟨S(Tv),u⟩ = ⟨Tv,S*u⟩ = ⟨v,T*S*u⟩.
Theorem152Null space and range T*
Suppose T∈L(V,W). Then
(1) null T* = (range T)⊥;
(2) range T* = (null T)⊥;
(3) null T = (range T*)⊥ (replace T with T* in (1));
(4) range T = (null T*)⊥.
Definition153conjugate transpose
The conjugate transpose of an m-by-n matrix is the n-by-m matrix obtained by interchanging the rows and columns and then taking the complex conjugate of each entry.
The next result shows how to compute the matrix of T* from the matrix of T.
Theorem154The matrix of T*
Let T∈L(V,W). Suppose e1,⋯,en is an orthonormal basis of V and f1,⋯,fm is an orthonormal basis of W. Then
M(T*,(f1,⋯,fm),(e1,⋯,en))
is the conjugate transpose of M(T,(e1,⋯,en),(f1,⋯,fm)).
Proof:
Recall that we obtain the kth column of M(T) by writing Tek as a linear combination of the fj's; the scalars used in this linear combination then become the kth column of M(T). So we have
For example, for a linear map from a space U with basis u1,u2 to a space V with basis v1,v2,v3, the matrix acts by
⎡A11 A12⎤ ⎛u1⎞   ⎛A11u1+A12u2⎞
⎢A21 A22⎥ ⎝u2⎠ = ⎜A21u1+A22u2⎟
⎣A31 A32⎦        ⎝A31u1+A32u2⎠
The entry Ajk shows how the kth basis vector of U contributes to the jth component in V: the jth row gives the coefficients on the jth basis vector of V, and the kth column shows where the kth basis vector of U goes under the linear map.
Tek = ⟨Tek,f1⟩f1+⋯+⟨Tek,fm⟩fm, so M(T)j,k=⟨Tek,fj⟩. Replacing T with T* and interchanging the roles played by the e's and f's, we see that the entry in row j, column k, of M(T*) is ⟨T*fk,ej⟩, which equals ⟨fk,Tej⟩, which equals conj(⟨Tej,fk⟩), which equals the complex conjugate of the entry in row k, column j, of M(T). □
Caution
Remember that the result above applies only when we are dealing with orthonormal bases. With respect to nonorthonormal bases, the matrix of T* does not necessarily equal the conjugate transpose of the matrix of T.
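A small numerical check of the theorem with the standard (orthonormal) bases, assuming numpy:

```python
import numpy as np

A = np.array([[1 + 2j, 0, 3],
              [4j, 5, -1]])     # M(T) for some T : C^3 -> C^2
A_star = A.conj().T             # M(T*): the conjugate transpose

rng = np.random.default_rng(0)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
w = rng.standard_normal(2) + 1j * rng.standard_normal(2)

# <a,b> = sum a_j conj(b_j), which numpy computes as np.vdot(b, a).
# The defining identity <Tv,w> = <v,T*w> holds:
print(np.isclose(np.vdot(w, A @ v), np.vdot(A_star @ w, v)))  # True
```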
Definition155self-adjoint
An operator T∈L(V) is called self-adjoint if T=T*. In other words, T∈L(V) is self-adjoint if and only if ⟨Tv,w⟩=⟨v,Tw⟩ for all v,w∈V.
More Information
It's also called Hermitian.
If F=R, then by definition every eigenvalue is real, so the next result is interesting only when F=C.
Theorem156Eigenvalues of self-adjoint operators are real
Every eigenvalue of a self-adjoint operator is real.
Proof: Suppose Tv=𝜆v with v≠0. Then
⟨Tv,v⟩ = ⟨v,T*v⟩ = ⟨v,Tv⟩,
so
𝜆⟨v,v⟩ = ⟨𝜆v,v⟩ = ⟨v,𝜆v⟩ = 𝜆̄⟨v,v⟩.
Because ⟨v,v⟩≠0, we get 𝜆=𝜆̄, so 𝜆 is real. □
The next result is false for real inner product spaces. As an example, consider the operator T∈L(R²) that is counterclockwise rotation by 90° around the origin; thus T(x,y)=(−y,x). Obviously Tv is orthogonal to v for every v∈R², even though T≠0.
Theorem157Over C, Tv is orthogonal to all v only for the 0 operator
Suppose V is a complex inner product space and T∈L(V). Suppose
⟨Tv,v⟩=0 for all v∈V.
Then T=0.
Proof: Every ⟨Tu,w⟩ can be written in terms of inner products of the form ⟨Tv,v⟩:
⟨Tu,w⟩ = (⟨T(u+w),u+w⟩ − ⟨T(u−w),u−w⟩)/4 + (⟨T(u+ⅈw),u+ⅈw⟩ − ⟨T(u−ⅈw),u−ⅈw⟩)ⅈ/4.
So if ⟨Tv,v⟩=0 for all v∈V, then every term on the right equals 0, hence ⟨Tu,w⟩=0 for all u,w∈V. Taking w=Tu gives ‖Tu‖²=0, so Tu=0 for every u, that is, T=0. □
The next result is false for real inner product spaces, as shown by considering any operator on a real inner product space that is not self-adjoint. And this theorem also appears in quantum physics.
Theorem158Over C, ⟨Tv,v⟩ is real for all v only for self-adjoint operators
Suppose V is a complex inner product space and T∈L(V). Then T is self-adjoint if and only if ⟨Tv,v⟩∈R for every v∈V.
Proof: Let v∈V. Then
⟨Tv,v⟩ − conj(⟨Tv,v⟩) = ⟨Tv,v⟩ − ⟨v,Tv⟩ = ⟨Tv,v⟩ − ⟨T*v,v⟩ = ⟨(T−T*)v,v⟩.
If ⟨Tv,v⟩∈R for every v∈V, then the left side is 0, so ⟨(T−T*)v,v⟩=0 for every v∈V; by the previous result T−T*=0, that is, T=T*. Conversely, if T=T*, then the right side is 0, so ⟨Tv,v⟩ equals its own conjugate and hence is real. □
On a real inner product space V, a nonzero operator T might satisfy ⟨Tv,v⟩=0 for all v∈V. However, the next result shows that this cannot happen for a self-adjoint operator.
Theorem159If T=T* and ⟨Tv,v⟩=0 for all v, then T=0
Suppose T=T* and ⟨Tv,v⟩=0 for all v∈V. Then T=0. Note that this result, unlike the previous one, also holds on real inner product spaces.
Proof: We use another transformation:
⟨Tu,w⟩ = (⟨T(u+w),u+w⟩ − ⟨T(u−w),u−w⟩)/4,
which is valid only when ⟨Tw,u⟩=⟨w,Tu⟩=⟨Tu,w⟩, that is, when T is self-adjoint. With it, ⟨Tv,v⟩=0 for all v gives ⟨Tu,w⟩=0 for all u,w∈V, hence T=0. □
Normal Operators
Definition160normal
An operator T∈L(V) on an inner product space is called normal if it commutes with its adjoint. In other words,
TT* = T*T.
Obviously, every self-adjoint operator is normal.
Theorem161T is normal if and only if ‖Tv‖=‖T*v‖ for all v
Proof:
T is normal
⟺ T*T−TT* = 0
⟺ ⟨(T*T−TT*)v,v⟩ = 0 for all v∈V
⟺ ⟨T*Tv,v⟩ = ⟨TT*v,v⟩ for all v∈V
⟺ ⟨Tv,Tv⟩ = ⟨T*v,T*v⟩ for all v∈V
⟺ ‖Tv‖² = ‖T*v‖² for all v∈V.
The second equivalence holds because T*T−TT* is self-adjoint, and by the result above a self-adjoint operator S with ⟨Sv,v⟩=0 for all v equals 0. □
It can be proved that the eigenvalues of the adjoint of each operator are equal (as a set) to the complex conjugates of the eigenvalues of the operator. But an operator and its adjoint may have different eigenvectors. However, a normal operator and its adjoint have the same eigenvectors.
Theorem162For T normal, T and T* have the same eigenvectors
Suppose T∈L(V) is normal and v∈V is an eigenvector of T with eigenvalue 𝜆. Then v is also an eigenvector of T* with eigenvalue 𝜆̄.
Proof: Suppose Tv=𝜆v, so (T−𝜆I)v=0. Because T−𝜆I is also normal, the previous theorem gives
‖(T−𝜆I)v‖ = ‖(T−𝜆I)*v‖ = ‖(T*−𝜆̄I)v‖ = 0.
A vector with norm 0 is the 0 vector, so (T*−𝜆̄I)v=0, that is, T*v=𝜆̄v. □
7.B The Spectral Theorem
The nicest operators on V are those for which there is an orthonormal basis of V with respect to which the operator has a diagonal matrix. (This is more special than mere diagonalizability: the eigenvectors can be chosen orthonormal.) These are precisely the operators T∈L(V) such that there is an orthonormal basis of V consisting of eigenvectors of T. Our goal in this section is to prove the Spectral Theorem, which characterizes these operators as the normal operators when F=C and as the self-adjoint operators when F=R. The Spectral Theorem is probably the most useful tool in the study of operators on inner product spaces. Because the conclusion of the Spectral Theorem depends on F, we will break it into two pieces, called the Complex Spectral Theorem and the Real Spectral Theorem. As is often the case in linear algebra, complex vector spaces are easier to deal with than real vector spaces, so we present the Complex Spectral Theorem first.
The Complex Spectral Theorem
The key part of the Complex Spectral Theorem states that if F=C and T∈L(V) is normal, then T has a diagonal matrix with respect to some orthonormal basis of V.
Theorem163Complex Spectral Theorem
Suppose F=C and T∈L(V). Then the following are equivalent:
(1) T is normal;
(2) V has an orthonormal basis consisting of eigenvectors of T;
(3) T has a diagonal matrix with respect to some orthonormal basis of V.
Proof: We have already shown (2)⟺(3). First suppose (3) holds, so T has a diagonal matrix with respect to some orthonormal basis. The matrix of T* is obtained by taking the conjugate transpose of the matrix of T; hence T* also has a diagonal matrix with respect to that basis. Any two diagonal matrices commute; thus T is normal.
Now suppose (1) holds, so T is normal. By Schur's Theorem there is an orthonormal basis e1,⋯,en of V with respect to which T has an upper-triangular matrix
⎡a1,1 ⋯ a1,n⎤
⎢      ⋱  ⋮ ⎥
⎣0       an,n⎦
We will show that this matrix is actually diagonal. From the matrix above,
‖Te1‖² = |a1,1|²  and  ‖T*e1‖² = |a1,1|²+|a1,2|²+⋯+|a1,n|².
Because T is normal, ‖Te1‖=‖T*e1‖, so the two equations give
|a1,2|=⋯=|a1,n|=0.
Now
‖Te2‖² = |a2,2|²  and  ‖T*e2‖² = |a2,2|²+|a2,3|²+⋯+|a2,n|²,
so |a2,3|=⋯=|a2,n|=0. Continuing in this fashion, we see that all the nondiagonal entries in the matrix equal 0. □
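The proof can be watched numerically: Schur's Theorem produces an upper-triangular matrix with respect to an orthonormal basis, and for a normal matrix that triangular matrix comes out diagonal (a sketch assuming scipy is available):

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[0.0, -1.0],
              [1.0, 0.0]])      # normal (A A* = A* A) but not self-adjoint
T, Z = schur(A.astype(complex), output='complex')  # A = Z T Z*, Z unitary
print(np.round(T, 10))          # diagonal matrix with entries i and -i
print(np.allclose(Z @ T @ Z.conj().T, A))  # True
```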
The Real Spectral Theorem
The following result is the key step toward the Real Spectral Theorem:
Theorem164Invertible quadratic expressions
Suppose T∈L(V) is self-adjoint and b,c∈R are such that b²<4c. Then
T²+bT+cI
is invertible.
Proof: Let v be a nonzero vector in V. Then
⟨(T²+bT+cI)v,v⟩ = ⟨T²v,v⟩ + b⟨Tv,v⟩ + c⟨v,v⟩
= ⟨Tv,Tv⟩ + b⟨Tv,v⟩ + c‖v‖²
⩾ ‖Tv‖² − |b|‖Tv‖‖v‖ + c‖v‖²
= (‖Tv‖ − |b|‖v‖/2)² + (c − b²/4)‖v‖²
> 0,
where the inequality uses Cauchy-Schwarz and the last line uses b²<4c. In particular ⟨(T²+bT+cI)v,v⟩≠0, so (T²+bT+cI)v≠0. Thus the null space of T²+bT+cI is {0}, so it is injective and hence invertible. □
We know that every operator, self-adjoint or not, on a finite-dimensional nonzero complex vector space has an eigenvalue.
Theorem165Self-adjoint operators have eigenvalues
Suppose T∈L(V) is self-adjoint. Then T has an eigenvalue.
Proof: We can assume that V is a real inner product space, as we have already noted. Let n=dim V and choose v∈V with v≠0. Then the n+1 vectors
v, Tv, ⋯, Tⁿv
cannot be linearly independent, so there exist real numbers a0,⋯,an, not all 0, with
0 = a0v + a1Tv + ⋯ + anTⁿv.
Make the a's the coefficients of a polynomial, which can be written in factored form as
a0 + a1x + ⋯ + anxⁿ = c(x²+b1x+c1)⋯(x²+bMx+cM)(x−𝜆1)⋯(x−𝜆m),
where c is a nonzero real number, each bj, cj, and 𝜆j is real, each bj²<4cj, m+M⩾1, and the equation holds for all real x. We then have
0 = a0v + ⋯ + anTⁿv = (a0I + a1T + ⋯ + anTⁿ)v = c(T²+b1T+c1I)⋯(T²+bMT+cMI)(T−𝜆1I)⋯(T−𝜆mI)v.
By the last theorem, each T²+bjT+cjI is invertible. Recall also that c≠0. Thus the equation above implies that m>0 and
0 = (T−𝜆1I)⋯(T−𝜆mI)v.
Hence T−𝜆jI is not injective for at least one j. In other words, T has an eigenvalue. □
The next result shows that if U is a subspace of V that is invariant under a self-adjoint operator T, then U⊥ is also invariant under T.
Theorem166Self-adjoint operators and invariant subspaces
Suppose T∈L(V) is self-adjoint and U is a subspace of V that is invariant under T. Then
(1) U⊥ is invariant under T;
(2) T|U∈L(U) is self-adjoint;
(3) T|U⊥∈L(U⊥) is self-adjoint.
Proof: For (1), suppose v∈U⊥ and u∈U. Because U is invariant under T, we have Tu∈U, so
⟨Tv,u⟩ = ⟨v,Tu⟩ = 0.
Thus Tv∈U⊥. For (2) (and (3) similarly), note that for u1,u2∈U,
⟨(T|U)u1,u2⟩ = ⟨Tu1,u2⟩ = ⟨u1,Tu2⟩ = ⟨u1,(T|U)u2⟩. □
Theorem167Real Spectral Theorem
Suppose F=R and T∈L(V). Then the following are equivalent:
(1) T is self-adjoint;
(2) V has an orthonormal basis consisting of eigenvectors of T;
(3) T has a diagonal matrix with respect to some orthonormal basis of V.
Proof: First suppose (3) holds, so T has a diagonal (real) matrix with respect to some orthonormal basis. That matrix equals its conjugate transpose, hence T=T*. The equivalence (2)⟺(3) is as before.
We now prove (1) implies (2) by induction on dim V. If dim V=1 the result clearly holds. Suppose dim V⩾2. Because T is self-adjoint, it has an eigenvalue and hence an eigenvector u. Let U=span(u), which is invariant under T. By the previous result, U⊥ is invariant under T and T|U⊥ is self-adjoint, so by the induction hypothesis U⊥ has an orthonormal basis consisting of eigenvectors of T|U⊥. Appending u/‖u‖ to this basis gives an orthonormal basis of V consisting of eigenvectors of T. □
7.C Positive Operators and Isometries
Positive Operators
Definition168Positive operator
An operator T∈L(V) is called positive if T is self-adjoint and
⟨Tv,v⟩⩾0 for all v∈V.
Definition169square root
An operator R is called a square root of an operator T if R2=T
The characterizations of the positive operators in the next result correspond to characterizations of the nonnegative numbers among C. Specifically, a complex number z is nonnegative if and only if it has a nonnegative square root. Also, z is nonnegative if and only if it has a real square root, corresponding to condition (4). Finally, z is nonnegative if and only if there exists a complex number w such that z=⏨ww, corresponding to condition (5)
Theorem170Characterization of positive operators
Let T∈L(V). Then the following are equivalent:
(1) T is positive;
(2) T is self-adjoint and all the eigenvalues of T are nonnegative;
(3) T has a positive square root;
(4) T has a self-adjoint square root;
(5) there exists an operator R∈L(V) such that T=R*R.
Proof: We will prove that (1)⇒(2)⇒(3)⇒(4)⇒(5)⇒(1).
First suppose (1) holds; then T is self-adjoint, and if Tv=𝜆v with v≠0, then 0⩽⟨Tv,v⟩=𝜆‖v‖², so 𝜆⩾0, giving (2).
If (2) holds, the Spectral Theorem gives an orthonormal basis e1,⋯,en with Tej=𝜆jej and each 𝜆j⩾0; define R by Rej=√𝜆j ej. Then R is positive and R²=T, giving (3). A positive square root is in particular self-adjoint, giving (4); and a self-adjoint R with R²=T satisfies T=R*R, giving (5).
If (5) holds, then T*=(R*R)*=R*R=T and
⟨Tv,v⟩=⟨R*Rv,v⟩=⟨Rv,Rv⟩⩾0,
so (1) holds. □
Theorem171Each positive operator has only one positive square root
Proof: Let R be a positive square root of T. We will prove that if Tv=𝜆v, then Rv=√𝜆 v. This will imply that the behavior of R on the eigenvectors of T is uniquely determined; because there is a basis of V consisting of eigenvectors of T, R is then uniquely determined.
To prove that Rv=√𝜆 v, note that the Spectral Theorem gives an orthonormal basis e1,⋯,en of V consisting of eigenvectors of R. Because R is a positive operator, all its eigenvalues are nonnegative; thus there exist nonnegative numbers 𝜆1,⋯,𝜆n such that Rej=𝜆jej for j=1,⋯,n. Write v=a1e1+⋯+anen. Applying R twice,
𝜆v = Tv = R²v = a1𝜆1²e1+⋯+an𝜆n²en,
so aj(𝜆j²−𝜆)=0 for each j. Hence v is a linear combination of those ej with 𝜆j=√𝜆, and therefore Rv=√𝜆 v. □
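The construction in the proof — an orthonormal eigenbasis together with the nonnegative square roots of the eigenvalues — is easy to carry out numerically (a sketch assuming numpy; positive_sqrt is our own helper name):

```python
import numpy as np

def positive_sqrt(T):
    """The positive square root of a positive operator (Theorem 171):
    diagonalize with an orthonormal eigenbasis and take square roots of
    the (nonnegative) eigenvalues."""
    w, Q = np.linalg.eigh(T)     # T = Q diag(w) Q*, with Q unitary
    return Q @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ Q.conj().T

T = np.array([[2.0, 1.0],
              [1.0, 2.0]])       # positive: eigenvalues 1 and 3
R = positive_sqrt(T)
print(np.allclose(R @ R, T))     # True: R is a square root of T
```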
Isometries (unitary operator)
Theorem172isometry
An operator S∈L(V) is called an isometry if it preserves norms:
‖Sv‖=‖v‖ for all v∈V.
The following statements are equivalent:
(1) S is an isometry;
(2) ⟨Su,Sv⟩=⟨u,v⟩ for all u,v∈V;
(3) Se1,⋯,Sen is orthonormal for every orthonormal list of vectors e1,⋯,en in V;
(4) there exists an orthonormal basis e1,⋯,en of V such that Se1,⋯,Sen is orthonormal;
(5) S*S=I;
(6) SS*=I;
(7) S* is an isometry;
(8) S is invertible and S⁻¹=S*.
Proof: (1)⟺(2) because the inner product can be recovered from the norm (polarization); (2)⇒(3) and (3)⇒(4) are immediate. Now suppose (4) holds. Let e1,⋯,en be an orthonormal basis of V such that Se1,⋯,Sen is orthonormal. Then
⟨S*Sej,ek⟩ = ⟨Sej,Sek⟩ = ⟨ej,ek⟩.
All vectors u,v∈V can be written as linear combinations of e1,⋯,en, and thus the equation above implies that ⟨S*Su,v⟩=⟨u,v⟩. Hence ⟨(S*S−I)u,v⟩=0 for all u,v, so S*S=I and (5) holds. Because V is finite-dimensional, S*S=I implies S is invertible with S⁻¹=S*, so SS*=I and (6) holds. Then
‖S*v‖² = ⟨SS*v,v⟩ = ‖v‖²,
so (7) holds, and (5) together with (6) gives (8). Finally, if (8) holds, then ⟨Sv,Sv⟩=⟨S*Sv,v⟩=⟨v,v⟩, thus (1) holds. □
Theorem173Description of isometries when F=C
Suppose V is a complex inner product space and S∈L(V). Then the following are equivalent:
(1) S is an isometry;
(2) there is an orthonormal basis of V consisting of eigenvectors of S whose corresponding eigenvalues all have absolute value 1.
Proof: It is easy to show that (2) implies (1). For the other direction, suppose (1) holds, so S is an isometry. An isometry satisfies S*S=SS*=I, so S is normal; by the Complex Spectral Theorem, there is an orthonormal basis e1,⋯,en of V consisting of eigenvectors of S. For j∈{1,⋯,n}, let 𝜆j be the eigenvalue corresponding to ej. Then
|𝜆j| = ‖𝜆jej‖ = ‖Sej‖ = ‖ej‖ = 1.
Thus each eigenvalue of S has absolute value 1. □
7.D Polar Decomposition and Singular Value Decomposition
Polar Decomposition
We have developed an analogy between C and L(V). Continuing with it, note that each complex number z except 0 can be written in the form
z = (z/|z|)|z| = (z/|z|)√(z̄z),
where z/|z| has absolute value 1. Our analogy leads us to guess that each operator T∈L(V) can be written as an isometry times √(T*T).
Definition174√T
If T is a positive operator, then √T denotes the unique positive square root of T.
Note that T*T is a positive operator for every T∈L(V) (it is self-adjoint and ⟨T*Tv,v⟩=‖Tv‖²⩾0), so √(T*T) makes sense and the next theorem is reasonable.
Theorem175Polar Decomposition
Suppose T∈L(V). Then there exists an isometry S∈L(V) such that
T = S√(T*T).
Proof: If v∈V, then
‖Tv‖² = ⟨Tv,Tv⟩ = ⟨T*Tv,v⟩ = ⟨√(T*T)v, √(T*T)v⟩ = ‖√(T*T)v‖².
So T and √(T*T) assign the same norm to each vector. Thus we can try to define an isometry between their ranges:
S1(√(T*T)v) := Tv.
First we must check that S1 is well defined. Suppose v1,v2∈V are such that √(T*T)v1=√(T*T)v2. For the definition above to make sense, we must show Tv1=Tv2:
‖Tv1−Tv2‖ = ‖T(v1−v2)‖ = ‖√(T*T)(v1−v2)‖ = ‖√(T*T)v1−√(T*T)v2‖ = 0.
Thus Tv1=Tv2, so S1 is a well-defined, norm-preserving (hence injective) linear map from range √(T*T) onto range T. However, this only defines the operator on a subspace. What about S outside range √(T*T)? Because S1 is injective, the Fundamental Theorem of Linear Maps gives
dim range √(T*T) = dim range T, and hence dim (range √(T*T))⊥ = dim (range T)⊥.
Choose orthonormal bases of (range √(T*T))⊥ and of (range T)⊥, and let S2 be the linear map sending the first to the second (preserving coefficients); S2 is an isometry between these subspaces. Then let S equal S1 on range √(T*T) and S2 on (range √(T*T))⊥. This S is an isometry of V and T=S√(T*T). □
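When T is invertible, the isometry is forced: S = T(√(T*T))⁻¹. A numerical sketch (assuming numpy):

```python
import numpy as np

T = np.array([[1.0, 2.0],
              [0.0, 3.0]])                     # an invertible operator on R^2
w, Q = np.linalg.eigh(T.conj().T @ T)          # T*T is positive
root = Q @ np.diag(np.sqrt(w)) @ Q.conj().T    # sqrt(T*T)
S = T @ np.linalg.inv(root)                    # the isometry of the theorem
print(np.allclose(S @ root, T))                # True: T = S sqrt(T*T)
print(np.allclose(S.conj().T @ S, np.eye(2)))  # True: S*S = I
```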
Singular Value Decomposition
Definition176singular values
Suppose T∈L(V). The singular values of T are the eigenvalues of √(T*T), with each eigenvalue 𝜆 repeated dim E(𝜆,√(T*T)) times. In practice, we can compute them from the eigenvalues of T*T: the singular values are the square roots of those eigenvalues.
For example, if √(T*T) has three distinct eigenvalues 3, 2, 0 with dim E(3,√(T*T))=2, then the singular values of T are 3, 3, 2, 0.
The next result shows that every operator on V has a clean description in terms of its singular values and two orthonormal bases of V.
Theorem177Singular Value Decomposition
Suppose T∈L(V) has singular values s1,⋯,sn. Then there exist orthonormal bases e1,⋯,en and f1,⋯,fn of V such that
Tv = s1⟨v,e1⟩f1 + ⋯ + sn⟨v,en⟩fn
for every v∈V.
Proof: By the Spectral Theorem applied to √(T*T), there is an orthonormal basis e1,⋯,en of V such that √(T*T)ej=sjej for j=1,⋯,n. We have
v = ⟨v,e1⟩e1 + ⋯ + ⟨v,en⟩en
for every v∈V. Apply √(T*T) to both sides of this equation, getting
√(T*T)v = s1⟨v,e1⟩e1 + ⋯ + sn⟨v,en⟩en.
By the Polar Decomposition, there is an isometry S∈L(V) such that T=S√(T*T). Let fj=Sej; because S is an isometry, f1,⋯,fn is an orthonormal basis of V. Applying S to the equation above now gives
Tv = s1⟨v,e1⟩f1 + ⋯ + sn⟨v,en⟩fn
for every v∈V. □
Fig7:Singular Value Decomposition
The Singular Value Decomposition allows us a rare opportunity to make good use of two different bases for the matrix of an operator. To do this, suppose T∈L(V). Let s1,⋯,sn denote the singular values of T, and let e1,⋯,en and f1,⋯,fn be orthonormal bases of V such that the Singular Value Decomposition holds. Because Tej=sjfj for each j, we have
M(T,(e1,⋯,en),(f1,⋯,fn)) = ⎡s1    0⎤
                            ⎢   ⋱   ⎥
                            ⎣0    sn⎦
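numpy's SVD returns exactly this pair of orthonormal bases: it writes T = U·diag(s)·V*, so the e's are the columns of V and the f's are the columns of U (a small sketch):

```python
import numpy as np

T = np.array([[0.0, -2.0],
              [1.0, 0.0]])
U, s, Vh = np.linalg.svd(T)       # T = U diag(s) Vh
e = Vh.conj().T                    # columns: the orthonormal basis e_1, e_2
f = U                              # columns: the orthonormal basis f_1, f_2
for j in range(2):
    print(np.allclose(T @ e[:, j], s[j] * f[:, j]))  # True: T e_j = s_j f_j
```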
Chapter 8. Operators on Complex Vector Spaces
8.A Generalized Eigenvectors and Nilpotent Operators
Null Spaces of Powers of an Operator
We begin this chapter with a study of null spaces of powers of an operator.
Suppose T∈L(V) and m is a nonnegative integer such that null Tᵐ=null Tᵐ⁺¹. Then
null Tᵐ = null Tᵐ⁺¹ = ⋯ = null Tᵐ⁺ᵏ = ⋯
Proof: Suppose v∈null Tᵐ⁺ᵏ⁺¹. Then
Tᵐ⁺¹(Tᵏv) = Tᵐ⁺ᵏ⁺¹v = 0,
hence Tᵏv∈null Tᵐ⁺¹=null Tᵐ. Therefore Tᵐ⁺ᵏv = Tᵐ(Tᵏv) = 0, that is, v∈null Tᵐ⁺ᵏ. This shows null Tᵐ⁺ᵏ⁺¹⊂null Tᵐ⁺ᵏ; the opposite inclusion is automatic, so the null spaces are equal for every k. □
Theorem180Null spaces stop growing
Suppose T∈L(V). Let n=dim V. Then
null Tⁿ = null Tⁿ⁺¹ = null Tⁿ⁺² = ⋯
Proof: We need only prove that null Tⁿ=null Tⁿ⁺¹. Suppose this is not true. Then, by the previous result,
{0} = null T⁰ ⊊ null T¹ ⊊ ⋯ ⊊ null Tⁿ ⊊ null Tⁿ⁺¹.
At each of the strict inclusions in the chain above, the dimension increases by at least 1. Thus dim null Tⁿ⁺¹⩾n+1, a contradiction because a subspace of V cannot have dimension larger than n. □
Unfortunately, it is not true that V=null T⊕range T for each T∈L(V). However, the following result is a useful substitute.
Theorem181V is the direct sum of null T^(dim V) and range T^(dim V)
Suppose T∈L(V). Let n=dim V. Then
V = null Tⁿ ⊕ range Tⁿ.
Proof: First we show that (null Tⁿ)∩(range Tⁿ)={0}. Suppose v∈(null Tⁿ)∩(range Tⁿ). Then Tⁿv=0, and there exists u∈V such that v=Tⁿu. Applying Tⁿ to both sides of the last equation shows that Tⁿv=T²ⁿu. Hence T²ⁿu=0, so u∈null T²ⁿ=null Tⁿ (null spaces stop growing). Thus v=Tⁿu=0.
The Fundamental Theorem of Linear Maps gives dim null Tⁿ + dim range Tⁿ = dim V, and a sum of two subspaces with intersection {0} is direct, with dimension equal to the sum of the dimensions. Thus
V = null Tⁿ ⊕ range Tⁿ. □
Generalized Eigenvectors
Some operators do not have enough eigenvectors to lead to a good description. Thus in this subsection we introduce the concept of generalized eigenvectors.
To understand why we need more than eigenvectors, let's examine the question of describing an operator by decomposing its domain into invariant subspaces. Fix T∈L(V). We seek to describe T by finding a "nice" direct sum decomposition
V = U1⊕⋯⊕Um,
where each Uj is a subspace of V invariant under T. The simplest possible nonzero invariant subspaces are 1-dimensional. A decomposition as above where each Uj is 1-dimensional is possible if and only if V has a basis consisting of eigenvectors of T. This happens if and only if V has an eigenspace decomposition
V = E(𝜆1,T)⊕⋯⊕E(𝜆m,T).
The Spectral Theorem in the previous chapter shows that if V is an inner product space, then a decomposition of this form holds for every normal operator if F=C and for every self-adjoint operator if F=R, because operators of those types have enough eigenvectors to form a basis of V. But some operators do not have enough eigenvectors. Generalized eigenvectors and generalized eigenspaces, which we now introduce, will remedy this situation.
Definition182generalized eigenvector
Suppose T∈L(V) and 𝜆 is an eigenvalue of T. A vector v∈V is called a generalized eigenvector of T corresponding to 𝜆 if v≠0 and
(T−𝜆I)ʲv=0
for some positive integer j.
Definition183generalized eigenspaces G(𝜆,T)
Suppose T∈L(V) and 𝜆∈F. The generalized eigenspace of T corresponding to 𝜆, denoted G(𝜆,T), is defined to be the set of all generalized eigenvectors of T corresponding to 𝜆, along with the 0 vector.
Because every eigenvector of T is a generalized eigenvector of T, each eigenspace is contained in the corresponding generalized eigenspace. In other words, E(𝜆,T)⊂G(𝜆,T)The next result implies that if T∈L(V) and 𝜆∈F, then G(𝜆,T) is a subspace of V
Theorem184Description of generalized eigenspaces
Suppose T∈L(V) and 𝜆∈F. Then
G(𝜆,T) = null (T−𝜆I)^(dim V).
Proof: If v∈null (T−𝜆I)^(dim V), then v∈G(𝜆,T) by definition (take j=dim V). Conversely, suppose v∈G(𝜆,T), so (T−𝜆I)ʲv=0 for some positive integer j. Because null spaces of powers of T−𝜆I stop growing by the (dim V)th power, v∈null (T−𝜆I)^(dim V). □
In particular G(𝜆,T), being a null space, is a subspace of V.
Example184.1
Define T∈L(C³) by T(z1,z2,z3)=(4z2,0,5z3).
(a) Find all eigenvalues of T, the corresponding eigenspaces, and the corresponding generalized eigenspaces.
(b) Show that C³ is the direct sum of generalized eigenspaces corresponding to the distinct eigenvalues of T.
(a) The eigenvalues are 𝜆1=0, with eigenspace E(0,T)={(z1,0,0):z1∈C}, and 𝜆2=5, with eigenspace E(5,T)={(0,0,z3):z3∈C}. These eigenspaces do not span C³, so there must be more generalized eigenvectors. We have T³(z1,z2,z3)=(0,0,125z3), so null T³={(z1,z2,0):z1,z2∈C}; in particular (0,z2,0) is a generalized eigenvector corresponding to 0. Thus
G(0,T)={(z1,z2,0):z1,z2∈C} and G(5,T)={(0,0,z3):z3∈C}.
(b) G(0,T)⊕G(5,T)=C³.
One of our major goals in this chapter is to show that the result in part (b) of the example above holds in general for operators on finite-dimensional complex vector spaces.
Suppose 𝜆1,⋯,𝜆m are distinct eigenvalues of T and v1,⋯,vm are corresponding generalized eigenvectors. Then v1,⋯,vm is linearly independent.
Proof: Suppose a1,⋯,am are complex numbers such that
0 = a1v1 + ⋯ + amvm.
Let k be the largest nonnegative integer such that (T−𝜆1I)ᵏv1≠0, and let
w = (T−𝜆1I)ᵏv1.
Then (T−𝜆1I)w = (T−𝜆1I)ᵏ⁺¹v1 = 0, and hence Tw=𝜆1w. Thus (T−𝜆I)w=(𝜆1−𝜆)w for every 𝜆∈F, and hence
(T−𝜆I)ⁿw = (𝜆1−𝜆)ⁿw
for every 𝜆∈F, where n=dim V. Apply the operator
(T−𝜆1I)ᵏ(T−𝜆2I)ⁿ⋯(T−𝜆mI)ⁿ
to both sides of the first displayed equation. The terms with j⩾2 vanish because (T−𝜆jI)ⁿvj=0, and the first term becomes a1(𝜆1−𝜆2)ⁿ⋯(𝜆1−𝜆m)ⁿw, which therefore equals 0. Because the 𝜆's are distinct and w≠0, this forces a1=0. In a similar fashion, aj=0 for each j, so v1,⋯,vm is linearly independent. □
Nilpotent Operators
Definition186nilpotent
An operator is called nilpotent if some power of it equals 0.
Example186.1nilpotent
A. The operator N∈L(F⁴) defined by N(z1,z2,z3,z4)=(z3,z4,0,0) is nilpotent (N²=0).
B. The operator of differentiation on Pₘ(R) is nilpotent.
Theorem187Nilpotent operator raised to dimension of domain is 0
Suppose N∈L(V) is nilpotent. Then N^(dim V)=0.
Proof: Because N is nilpotent, G(0,N)=V, so null N^(dim V)=V by the description of generalized eigenspaces; that is, N^(dim V) is the 0 map. □
Theorem188Matrix of a nilpotent operator
Suppose N∈L(V) is nilpotent. Then there is a basis of V with respect to which the matrix of N has the form
⎡0    *⎤
⎢  ⋱   ⎥
⎣0    0⎦
where all entries on and below the diagonal are 0 (the * denotes arbitrary entries above the diagonal).
Proof: First choose a basis of null N. Then extend this to a basis of null N², then to a basis of null N³, and continue in this fashion, eventually getting a basis of V.
Now consider the matrix of N with respect to this basis. The first group of columns, corresponding to the basis of null N, consists of all 0's, because those basis vectors are sent to 0. The next group of columns corresponds to the vectors extending to a basis of null N²; N maps each of these into null N, that is, into the span of the strictly earlier basis vectors, so in those columns only the rows above the diagonal can be nonzero. Continuing in this fashion, every basis vector from the group for null Nᵏ is mapped into the span of the strictly earlier basis vectors, so all entries on and below the diagonal equal 0. □
8.B Decomposition of an Operator
Description of Operators on Complex Vector Spaces
We will see that every operator on a finite-dimensional complex vector space has enough generalized eigenvectors to provide a decomposition.
Theorem189The null space and range of p(T) are invariant under T
Suppose T∈L(V) and p∈P(F). Then null p(T) and range p(T) are invariant under T.
Proof: The key is that T commutes with p(T): p(T)(Tu)=T(p(T)u). Hence if p(T)u=0 then p(T)(Tu)=T(p(T)u)=0, and if u∈range p(T), say u=p(T)w, then Tu=p(T)(Tw)∈range p(T). □
The following major result shows that every operator on a complex vector space can be thought of as composed of pieces, each of which is a nilpotent operator plus a scalar multiple of the identity.
Theorem190Description of operators on complex vector spaces
Suppose V is a complex vector space and T∈L(V). Let 𝜆1,⋯,𝜆m be the distinct eigenvalues of T. Then
(1) V = G(𝜆1,T)⊕⋯⊕G(𝜆m,T);
(2) each G(𝜆j,T) is invariant under T;
(3) each (T−𝜆jI)|G(𝜆j,T) is nilpotent.
Proof: Let n=dim V. Recall that G(𝜆j,T)=null (T−𝜆jI)ⁿ for each j. Thus each G(𝜆j,T) is the null space of a polynomial in T and hence is invariant under T, giving (2); and (3) holds because (T−𝜆jI)ⁿ vanishes on G(𝜆j,T).
We will prove (1) by induction on n. The desired result holds if n=1. Thus assume n>1 and that the desired result holds on all vector spaces of smaller dimension. Because V is a complex vector space, T has an eigenvalue; thus m⩾1. We can decompose
V = G(𝜆1,T) ⊕ U,
where U=range (T−𝜆1I)ⁿ; this is the decomposition V = null (T−𝜆1I)ⁿ ⊕ range (T−𝜆1I)ⁿ from 8.A. U is the range of a polynomial in T, so it is invariant under T. Because G(𝜆1,T)≠{0}, we have dim U<n, so we can apply our induction hypothesis to T|U.
None of the generalized eigenvectors of T|U correspond to the eigenvalue 𝜆1, because they would lie in G(𝜆1,T)∩U={0}. Thus each eigenvalue of T|U is in {𝜆2,⋯,𝜆m}, and by induction U=G(𝜆2,T|U)⊕⋯⊕G(𝜆m,T|U). It remains to show that G(𝜆k,T|U)=G(𝜆k,T) for k=2,⋯,m. Fix k∈{2,⋯,m}. The inclusion G(𝜆k,T|U)⊂G(𝜆k,T) is clear; the reverse inclusion follows by decomposing a vector of G(𝜆k,T) according to V=G(𝜆1,T)⊕U and checking that its G(𝜆1,T) component is 0. □
G(𝜆,T) cannot always be decomposed into eigenvectors. For example, take
T = ⎡6 3 4⎤
    ⎢0 6 2⎥
    ⎣0 0 7⎦
with eigenvalue 𝜆=6. Then
T−6I = ⎡0 3 4⎤      (T−6I)² = ⎡0 0 10⎤
       ⎢0 0 2⎥               ⎢0 0 2 ⎥
       ⎣0 0 1⎦               ⎣0 0 1 ⎦
The eigenspace E(6,T) is spanned by (1,0,0) alone, but the generalized eigenvectors corresponding to 6 are spanned by (1,0,0) and (0,1,0), so dim G(6,T)=2.
Theorem191A basis of generalized eigenvectors
Suppose V is a complex vector space and T∈L(V). Then there is a basis of V consisting of generalized eigenvectors of T.
Multiplicity of an Eigenvalue
If V is a complex vector space and T∈L(V), then the generalized eigenspace decomposition of V above can be a powerful tool. The dimensions of the subspaces involved in this decomposition are sufficiently important to get a name.
Definition192multiplicity
Suppose T∈L(V). The multiplicity of an eigenvalue 𝜆 of T is defined to be the dimension of the corresponding generalized eigenspace G(𝜆,T)
★ algebraic multiplicity of 𝜆 = dim null (T−𝜆I)^(dim V) = dim G(𝜆,T)
★ geometric multiplicity of 𝜆 = dim null (T−𝜆I) = dim E(𝜆,T)
Block Diagonal Matrices
To interpret our results in matrix form, we make the following definition, generalizing the notion of a diagonal matrix.
Definition194block diagonal matrix
A block diagonal matrix is a square matrix of the form
⎡A1    0⎤
⎢   ⋱   ⎥
⎣0    Am⎦
where A1,⋯,Am are square matrices lying along the diagonal and all the other entries of the matrix equal 0.
Theorem195Block diagonal matrix with upper-triangular blocks
Suppose V is a complex vector space and T∈L(V), with distinct eigenvalues 𝜆1,⋯,𝜆m. Then we can choose a basis of V with respect to which T has a block diagonal matrix in which each block Aj is an upper-triangular matrix whose diagonal entries all equal 𝜆j.
Square Roots
Not every operator on a complex vector space has a square root.
Theorem196Identity plus nilpotent has a square root
Suppose N∈L(V) is nilpotent. Then I+N has a square root.
Proof: Consider the Taylor series for the function √(1+x):
√(1+x) = 1 + a1x + a2x² + ⋯.
Because N is nilpotent, Nᵐ=0 for some positive integer m. We guess that there is a square root of I+N of the form
I + a1N + a2N² + ⋯ + a_{m−1}N^(m−1).
Having this guess, we can compute
(I + a1N + ⋯ + a_{m−1}N^(m−1))(I + a1N + ⋯ + a_{m−1}N^(m−1))
and choose the coefficients so that the resulting coefficients of I, N, N², N³, ⋯ are 1, 1, 0, 0, ⋯. □
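The coefficient matching in the proof is exactly the binomial series for √(1+x), truncated at the nilpotency index; a sketch assuming numpy (sqrt_I_plus_N is our own helper name):

```python
import numpy as np

def sqrt_I_plus_N(N):
    """Square root of I + N for nilpotent N (Theorem 196): sum the binomial
    series for sqrt(1 + x) with x = N; powers N^k with k >= n vanish."""
    n = N.shape[0]
    R = np.zeros_like(N, dtype=float)
    power = np.eye(n)                 # current power N^k
    c = 1.0                           # binomial coefficient C(1/2, k)
    for k in range(n):
        R += c * power
        power = power @ N
        c *= (0.5 - k) / (k + 1)      # C(1/2, k+1) from C(1/2, k)
    return R

N = np.array([[0.0, 1.0, 2.0],
              [0.0, 0.0, 3.0],
              [0.0, 0.0, 0.0]])       # nilpotent: N^3 = 0
R = sqrt_I_plus_N(N)
print(np.allclose(R @ R, np.eye(3) + N))  # True
```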
Theorem197Over C, invertible operators have square roots
Suppose V is a complex vector space and T∈L(V) is invertible. Then T has a square root.
Proof: Let 𝜆1,⋯,𝜆m be the distinct eigenvalues of T; invertibility means that every 𝜆j is nonzero. On each generalized eigenspace G(𝜆j,T), the operator Nj=(T−𝜆jI)|G(𝜆j,T) is nilpotent, so
T|G(𝜆j,T) = 𝜆j(I + Nj/𝜆j),
where I+Nj/𝜆j has a square root Rj by the previous result; then (√𝜆j Rj)²=T|G(𝜆j,T), where √𝜆j is any complex square root of 𝜆j. Because V=G(𝜆1,T)⊕⋯⊕G(𝜆m,T), defining the square root on each summand defines a square root of T on all of V. □
8.C Characteristic and Minimal Polynomial
The Cayley-Hamilton Theorem
The next definition associates a polynomial with each operator on V if F=C.
Definition198characteristic polynomial
Suppose V is a complex vector space and T∈L(V). Let 𝜆1,⋯,𝜆m denote the distinct eigenvalues of T, with multiplicities d1,⋯,dm. The polynomial (z-𝜆1)d1⋯(z-𝜆m)dmis called the characteristic polynomial of T.
Theorem199Cayley-Hamilton Theorem
Suppose V is a complex vector space and T∈L(V). Let q denote the characteristic polynomial of T. Then q(T)=0.
Proof: Every vector in V is a sum of vectors in G(𝜆1,T),⋯,G(𝜆m,T). Thus we only need to prove that q(T) vanishes on each G(𝜆j,T). Because q(T)=(T−𝜆1I)^d1⋯(T−𝜆mI)^dm and the factors commute, we can move (T−𝜆jI)^dj to act first. By the description of generalized eigenspaces, G(𝜆j,T)=null (T−𝜆jI)^dj (note that dj=dim G(𝜆j,T)), so (T−𝜆jI)^dj vanishes on G(𝜆j,T), and hence so does q(T). □
The Minimal Polynomial
Definition200monic polynomial
A monic polynomial is a polynomial whose highest-degree coefficient equals 1.
Theorem201Minimal polynomial
Suppose T∈L(V). Then there is a unique monic polynomial p of smallest degree such that p(T)=0.
Proof: Let n=dim V. Then the list of n²+1 operators
I, T, T², ⋯, T^(n²)
is linearly dependent, because dim L(V)=n². Let m be the smallest positive integer such that
I, T, T², ⋯, Tᵐ
is linearly dependent; then Tᵐ is a linear combination of the lower powers, so there are scalars a0,⋯,a_{m−1} with
a0I + a1T + ⋯ + a_{m−1}T^(m−1) + Tᵐ = 0.
Define a monic polynomial p∈P(F) by
p(z) = a0 + a1z + ⋯ + a_{m−1}z^(m−1) + zᵐ.
Then p(T)=0. For uniqueness: if two monic polynomials of the same smallest degree both annihilated T, their difference would be a nonzero polynomial of smaller degree annihilating T, contradicting minimality. □
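The proof suggests an algorithm: flatten I, T, T², ⋯ into vectors and stop at the first power that is a linear combination of the previous ones. A numerical sketch assuming numpy (minimal_polynomial is our own helper name):

```python
import numpy as np

def minimal_polynomial(T, tol=1e-10):
    """Monic minimal polynomial of T (Theorem 201), returned as coefficients
    [a_0, ..., a_{m-1}, 1] meaning a_0 I + a_1 T + ... + T^m = 0."""
    n = T.shape[0]
    powers = [np.eye(n).ravel()]            # flattened I, T, T^2, ...
    for m in range(1, n * n + 1):
        target = np.linalg.matrix_power(T, m).ravel()
        A = np.stack(powers, axis=1)        # columns: I, ..., T^{m-1}
        coeffs = np.linalg.lstsq(A, target, rcond=None)[0]
        if np.linalg.norm(A @ coeffs - target) < tol:
            return np.append(-coeffs, 1.0)  # T^m - sum coeffs_k T^k = 0
        powers.append(target)

T = np.array([[2.0, 1.0],
              [0.0, 2.0]])
print(minimal_polynomial(T))   # [ 4. -4.  1.], i.e. (z - 2)^2
```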
Definition202minimal polynomial
A minimal polynomial of T is the unique monic polynomial p of smallest degree such thatp(T)=0
The proof of the last result shows that the degree of the minimal polynomial of each operator on V is at most (dim V)2. The Cayley-Hamilton Theorem tells us that if V is a complex vector space, then the minimal polynomial of each operator on V has degree at most dim V. This remarkable improvement also holds on real vector spaces, as we will see in the next chapter.
Theorem203q(T)=0 implies q is a multiple of the minimal polynomial
Suppose T∈L(V) and q∈P(F). Then q(T)=0 if and only if q is a polynomial multiple of the minimal polynomial of T.
Proof: The "if" direction is easy. For the other direction, let p be the minimal polynomial of T. By the Division Algorithm for Polynomials, there exist polynomials s,r∈P(F) such that
q = ps + r
and deg r < deg p. Then 0 = q(T) = p(T)s(T) + r(T) = r(T). If r were not the zero polynomial, dividing it by its highest-degree coefficient would produce a monic polynomial of degree smaller than deg p that annihilates T, contradicting the minimality of p. Thus r=0 and q=ps. □
Theorem204Characteristic polynomial is a multiple of minimal polynomial
Suppose T∈L(V), F=C. Then the characteristic polynomial of T is a polynomial multiple of the minimal polynomial of T.
Theorem205Eigenvalues are the zeros of the minimal polynomial
Suppose T∈L(V). Then the zeros of the minimal polynomial of T are precisely the eigenvalues of T.
8.D Jordan Form
We know that if V is a complex vector space, then for every T∈L(V) there is a basis of V with respect to which T has a nice upper-triangular matrix. In this section we will see that we can do even better — there is a basis of V with respect to which the matrix of T contains 0's everywhere except possibly on the diagonal and the line directly above the diagonal.
Theorem206Basis corresponding to a nilpotent operator
Suppose N∈L(V) is nilpotent. Then there exist vectors v1,⋯,vn∈V and nonnegative integers m1,⋯,mn such that
(1) N^(m1)v1,⋯,Nv1,v1,⋯,N^(mn)vn,⋯,Nvn,vn is a basis of V;
(2) N^(m1+1)v1=⋯=N^(mn+1)vn=0.
Proof: We use induction on dim V. When dim V=1, N must be the 0 map; let m1=0 and let v1 be any nonzero vector.
Now suppose dim V>1 and the theorem holds on all spaces of smaller dimension. We may assume N≠0 (otherwise take each mj=0 and any basis). Then null N≠{0}, so dim range N<dim V, and the induction hypothesis applied to N restricted to range N gives vectors v1,⋯,vn∈range N and integers m1,⋯,mn such that
(A) N^(m1)v1,⋯,v1,⋯,N^(mn)vn,⋯,vn is a basis of range N, and
(B) N^(m1+1)v1=⋯=N^(mn+1)vn=0.
Because each vj∈range N, there exists uj∈V with vj=Nuj; note that Nᵏuj=N^(k−1)vj for k⩾1. We will show in two steps that the list
(C) N^(m1+1)u1,⋯,u1,⋯,N^(mn+1)un,⋯,un
can be extended to a basis of the required form.
First, (C) is linearly independent. Suppose some linear combination of (C) equals 0, and apply N to it. Under N, each vector Nᵏuj with k⩽mj becomes Nᵏvj, a vector of the basis (A), while each N^(mj+1)uj becomes N^(mj+1)vj=0 by (B). Because (A) is linearly independent, the coefficients of all the vectors that landed in (A) must be 0. The combination therefore involves only the vectors N^(mj+1)uj=N^(mj)vj, which are themselves part of the basis (A) and hence linearly independent, so their coefficients are 0 as well.
Second, extend (C) to a basis
N^(m1+1)u1,⋯,u1,⋯,N^(mn+1)un,⋯,un,w1,⋯,wp
of V. Each Nwj is in range N and hence is in the span of (A); each vector in (A) equals N applied to some vector in the span of (C). Thus there exists xj in the span of (C) such that Nwj=Nxj. Now let
un+j = wj − xj,
so that Nun+j=0. Furthermore,
N^(m1+1)u1,⋯,Nu1,u1,⋯,N^(mn+1)un,⋯,Nun,un,un+1,⋯,un+p
spans V, because its span contains each xj and each un+j and hence each wj. The spanning list above is thus a basis of V, because it has the same length as the extended basis. This basis has the required form, with mn+j=0 for j=1,⋯,p. □
Definition207Jordan basis
Suppose T∈L(V). A basis of V is called a Jordan basis for T if with respect to this basis T has a block diagonal matrix
⎡A1    0⎤
⎢   ⋱   ⎥
⎣0    Ap⎦
where each Aj is an upper-triangular matrix of the form
Aj = ⎡𝜆j 1      ⎤
     ⎢    ⋱  ⋱ ⎥
     ⎢       ⋱ 1⎥
     ⎣0       𝜆j⎦
Theorem208Jordan Form
Suppose V is a complex vector space. If T∈L(V), then there is a basis of V that is a Jordan basis for T.
Proof: First consider a nilpotent operator N. The basis constructed in the previous theorem, taken in the order N^(mj)vj,⋯,Nvj,vj for each j, gives N a block diagonal matrix in which each block on the diagonal has the form
⎡0 1      ⎤
⎢   ⋱  ⋱ ⎥
⎢      ⋱ 1⎥
⎣0       0⎦
because N sends each basis vector in a block to the next one and sends the first to 0.
Now let T∈L(V) with distinct eigenvalues 𝜆1,⋯,𝜆m. Each (T−𝜆jI)|G(𝜆j,T) is nilpotent, so G(𝜆j,T) has a basis that is a Jordan basis for (T−𝜆jI)|G(𝜆j,T); with respect to it, T|G(𝜆j,T)=𝜆jI+(T−𝜆jI)|G(𝜆j,T) has a block diagonal matrix whose blocks have the form
⎡𝜆j 1      ⎤
⎢    ⋱  ⋱ ⎥
⎢       ⋱ 1⎥
⎣0       𝜆j⎦
By the generalized eigenspace decomposition
V = G(𝜆1,T)⊕⋯⊕G(𝜆m,T),
putting these bases together gives a basis of V with respect to which T has a block diagonal matrix built from such Jordan blocks. □
Chapter 9. Operators on Real Vector Spaces
9.A Complexification of a Vector Space
As we will soon see, a real vector space V can be embedded, in a natural way, in a complex vector space called the complexification of V.
Definition209complexification of V, VC
Suppose V is a real vector space.
★ The complexification of V, denoted VC, equals V×V. An element of VC is an ordered pair (u,v), where u,v∈V, but we will write this as u+ⅈv.
★ Addition on VC is defined by (u1+ⅈv1)+(u2+ⅈv2)=(u1+u2)+ⅈ(v1+v2).
★ Complex scalar multiplication on VC is defined by (a+bⅈ)(u+ⅈv)=(au−bv)+ⅈ(av+bu) for a,b∈R and u,v∈V.
We think of V as a subset of VC by identifying u∈V with u+ⅈ0. The construction of VC from V can then be thought of as generalizing the construction of Cⁿ from Rⁿ.
Theorem210basis of V is basis of VC
Suppose V is a real vector space.
★ If v1,⋯,vn is a basis of V (as a real vector space, where the coefficients are real), then v1,⋯,vn is also a basis of VC (as a complex vector space, where the coefficients are complex).
★ In particular, dim VC = dim V.
Complexification of an Operator
Definition211complexification of T, TC
Suppose V is a real vector space and T∈L(V). The complexification of T, denoted TC, is the operator TC∈L(VC) defined byTC(u+ⅈv)=Tu+ⅈTvfor u,v∈V.
Theorem212Matrix of TCequals matrix of T
Suppose V is a real vector space with basis v1,⋯,vn and T∈L(V). Then M(T)=M(TC), where both matrices are with respect to the same basis
We know that every operator on a nonzero finite-dimensional complex vector space has an eigenvalue and thus has a 1-dimensional invariant subspace. But an operator on a nonzero finite-dimensional real vector space may have no eigenvalues and thus no 1-dimensional invariant subspaces. However, we now show that an invariant subspace of dimension 1 or 2 always exists.
Theorem213Every operator has an invariant subspace of dimension 1 or 2.
Suppose V is a real vector space and T∈L(V). Then T has an invariant subspace of dimension 1 or 2.
Proof: The complexification TC has an eigenvalue a+bⅈ, where a,b∈R. Thus there exist u,v∈V, not both 0, such that TC(u+ⅈv)=(a+bⅈ)(u+ⅈv). Using the definition of TC, the last equation can be rewritten as
Tu+ⅈTv=(au−bv)+(av+bu)ⅈ.
Thus Tu=au−bv and Tv=av+bu. Then U=span(u,v) is invariant under T, and it has dimension 1 or 2. □
The Minimal Polynomial of the Complexification
Suppose V is a real vector space and T∈L(V). Then the minimal polynomial of TC equals the minimal polynomial of T.
Eigenvalues of the Complexification
An eigenvalue of TC is real if and only if it is also an eigenvalue of T.
Theorem214TC−𝜆I and TC−𝜆̄I
Suppose V is a real vector space, T∈L(V), 𝜆∈C, j is a positive integer, and u,v∈V. Then
(TC−𝜆I)ʲ(u+ⅈv)=0 if and only if (TC−𝜆̄I)ʲ(u−ⅈv)=0.
As a consequence, the nonreal eigenvalues of TC come in conjugate pairs.
Theorem215Multiplicity of 𝜆 equals multiplicity of ⏨𝜆
Suppose V is a real vector space, T∈L(V), and 𝜆∈C is an eigenvalue of TC. Then the multiplicity of 𝜆 as an eigenvalue of TC equals the multiplicity of 𝜆̄ as an eigenvalue of TC. (This follows from the previous result: the map u+ⅈv ↦ u−ⅈv carries a basis of G(𝜆,TC) to a basis of G(𝜆̄,TC).)
Chapter 10. Trace and Determinant
10.A Trace
Change of Basis
★ identity matrix I; invertible, inverse, A⁻¹.
★ The matrix of the product of linear maps: M(ST,(u1,⋯,un),(w1,⋯,wn)) = M(S,(v1,⋯,vn),(w1,⋯,wn)) M(T,(u1,⋯,un),(v1,⋯,vn)).
★ M(I,(u1,⋯,un),(v1,⋯,vn))'s inverse is M(I,(v1,⋯,vn),(u1,⋯,un)).
★ Change of basis formula: let A=M(I,(u1,⋯,un),(v1,⋯,vn)); then M(T,(u1,⋯,un)) = A⁻¹ M(T,(v1,⋯,vn)) A.
Trace: A Connection Between Operators and Matrices
★ The trace of T is the sum of the eigenvalues of T (if F=C) or of TC (if F=R).
★ trace T equals the negative of the coefficient of z^(n−1) in the characteristic polynomial of T.
★ The trace of a square matrix is the sum of its diagonal entries.
★ trace(AB)=trace(BA).
★ The trace of the matrix of an operator does not depend on the basis: if T1=A⁻¹T2A, then trace T1=trace T2.
★ The trace of an operator equals the trace of its matrix.
★ Trace is additive.
10.B Determinant
Determinant of an operator
★ det T is the product of the eigenvalues of T (or of TC), with each eigenvalue repeated according to its multiplicity.
★ det T equals (−1)ⁿ times the constant term of the characteristic polynomial of T.
★ T is invertible if and only if det T≠0.
★ The characteristic polynomial of T equals det(zI−T).
Determinant of a matrix
★ det A = Σ_{(m1,⋯,mn)∈perm n} (sign(m1,⋯,mn)) A_{m1,1}⋯A_{mn,n}.
★ The determinant is multiplicative.
★ The determinant of an operator equals the determinant of its matrix.
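A few of the bullet points above, checked numerically (a sketch assuming numpy):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
eig = np.linalg.eigvals(A)
print(np.isclose(np.trace(A), eig.sum()))            # trace = sum of eigenvalues
print(np.isclose(np.linalg.det(A), eig.prod()))      # det = product of eigenvalues

B = np.array([[0.0, 1.0],
              [1.0, 0.0]])
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))  # trace(AB) = trace(BA)
```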
Acknowledgment
All the typesetting is completed using Math. Math really kicks ass!