A list of length n is (x1, x2, ⋯, xn). A list must be finite, because it has a length. Many mathematicians call a list of length n an n-tuple.
Definition 1.A.1 Fn; Definition 1.A.2 the list 0
Fn is the set of all lists of length n of elements of F. Let 0 denote the list of length n whose coordinates are all 0: (0,⋯,0).
1.B Definition of Vector Space
Definition1vector space
A vector space is a set V, along with an addition on V and a scalar multiplication on V, such that the following properties hold:
• commutativity: u+v=v+u for all u,v∈V
• associativity: (u+v)+w=u+(v+w) and (ab)v=a(bv) for all u,v,w∈V and a,b∈F
• additive identity: there exists an element 0∈V such that v+0=v for all v∈V
• additive inverse: for every v∈V, there exists w∈V such that v+w=0
• multiplicative identity: 1v=v for all v∈V
• distributive properties: a(u+v)=au+av and (a+b)v=av+bv for all a,b∈F and u,v∈V
An example of an infinite-dimensional vector space: F∞ is defined to be the set of all sequences of elements of F: F∞={(x1,x2,⋯):xj∈F for j=1,2,⋯}
Definition2Notation FS
• If S is a set, then FS denotes the set of functions from S to F.
• For f,g∈FS, the sum f+g∈FS is the function defined by (f+g)(x)=f(x)+g(x) for all x∈S.
• For 𝜆∈F and f∈FS, the product 𝜆f∈FS is the function defined by (𝜆f)(x)=𝜆f(x) for all x∈S.
For example, R[0,1] is the set of real-valued functions on the interval [0,1]. But, wait!
Example2.1FS is a vector space
We just need to define:
• The additive identity of FS is the function 0:S→F defined by 0(x)=0 for all x∈S.
• The additive inverse of f is the function −f:S→F defined by (−f)(x):=−f(x) for all x∈S.
1.C Subspaces
Definition3subspaces
A subset U of V is called a subspace of V if U is also a vector space, using the same addition and scalar multiplication as on V.
Sums of Subspaces
Definition4sum of subsets
Suppose U1,⋯,Um are subsets of V. The sum of U1,⋯Um , denoted U1+⋯+Um, is the set of all possible sums of elements of U1,⋯,Um. More precisely,U1+⋯+Um:={u1+⋯+um:u1∈U1,⋯,um∈Um}
Example4.1
Suppose U is the set of all elements of F3 whose second and third coordinates equal 0, and W is the set of all elements of F3 whose first and third coordinates equal 0:U={(x, 0, 0)∈F3:x∈F} and W={(0, y, 0)∈F3:y∈F}ThenU+W={(x,y,0)∈F3: x,y∈F}
Theorem5Sum of subspaces is the smallest containing subspace
Suppose U1,⋯,Um are subspaces of V. Then U1+⋯+Um is the smallest subspace of V containing U1,⋯,Um.
Proof:
• First, it is easy to check that U1+⋯+Um is a subspace of V.
• Then, every subspace of V containing U1,⋯,Um must contain U1+⋯+Um, because subspaces are closed under addition and hence contain all finite sums of their elements; so if a subspace contains U1 and U2, it must contain every element of U1+U2, and likewise for more summands.
Now, what about the situation in which each vector in U1+⋯+Um can be represented in the form above in only one way?
Definition6Direct Sum
Suppose U1,⋯,Um are subspaces of V.
• Then U1+⋯+Um is called a direct sum if each element of U1+⋯+Um can be written in only one way as a sum u1+⋯+um, where each uj is in Uj.
• If U1+⋯+Um is a direct sum, then U1⊕⋯⊕Um denotes U1+⋯+Um, with the ⊕ notation serving as an indication that this is a direct sum.
Example6.1direct sum
Suppose U is the subspace of F3 of those vectors whose last coordinate equals 0, and W is the subspace of F3 of those vectors whose first two coordinates equal 0: U={(x,y,0)∈F3:x,y∈F} and W={(0,0,z)∈F3:z∈F}. Then F3=U⊕W.
The definition of direct sum requires that every vector in the sum have a unique representation as an appropriate sum. The next result shows that when deciding whether a sum is direct, we need only consider whether 0 can be uniquely written as an appropriate sum.
Theorem7The condition of direct sum
Suppose U1,⋯,Um are subspaces of V. Then U1+⋯+Um is a direct sum if and only if the only way to write 0 as a sum u1+⋯+um, where each uj is in Uj, is by taking each uj equal to 0.
Proof
• First suppose U1+⋯+Um is a direct sum. Then by definition 0 can be represented in only one way. Because each Uj is a subspace of V, we have 0∈Uj, so 0=0+⋯+0 is one such representation; hence it is the only one.
• Conversely, suppose 0 can be written only in the trivial way. If some vector had two representations v=u1+⋯+um=𝜈1+⋯+𝜈m with uj,𝜈j∈Uj, then subtracting these two equations gives 0=(u1−𝜈1)+⋯+(um−𝜈m), with each uj−𝜈j∈Uj. By hypothesis each uj−𝜈j=0, so u1=𝜈1, u2=𝜈2, ⋯ □
The next result gives a simple condition for testing which pairs of subspaces give a direct sum.
Theorem8Direct sum of two subspaces
Suppose U and W are subspaces of V. Then U+W is a direct sum if and only if U∩W={0}.
Proof
Suppose U+W is a direct sum, and suppose U∩W contained some nonzero vector s. Because subspaces are closed under additive inverses, −s∈W as well, so 0=s+(−s) would be a second way, besides 0=0+0, to write 0 — contradicting the previous theorem. Conversely, if U∩W={0} and 0=u+w with u∈U, w∈W, then u=−w∈U∩W={0}, so u=w=0; thus the condition of the theorem "Condition of direct sum" is satisfied. □
More information
The result above deals only with the case of two subspaces. When asking about a possible direct sum of more than two subspaces, it is not enough to test pairwise intersections, e.g. whether U1∩U2=U1∩U3=U2∩U3={0}.
Chapter 2. Finite-Dimensional Vector Spaces
2.A Span and Linear Independence
To avoid confusion, we will usually write lists of vectors without surrounding parentheses. For example: (4,1,6), (9,5,7) is a list of length 2 of vectors in R3.
Linear Combinations and Span
Adding up scalar multiples of vectors in a list gives what is called a linear combination of the list:
Definition9Linear Combination
A linear combination of a list v1,⋯,vm of vectors in V is a vector of the form a1v1+⋯+amvm, where a1,⋯,am∈F.
Definition10Span
(Some mathematicians also call it linear span)
The set of all linear combinations of a list of vectors v1,⋯,vm in V is called the span of v1,⋯,vm, denoted span(v1,⋯,vm). In other words, span(v1,⋯,vm)={a1v1+⋯+amvm:a1,⋯,am∈F}. The span of the empty list ( ) is defined to be {0}. Besides, we introduce a verb: if span(v1,⋯,vm) equals V, we say that v1,⋯,vm spans V. A numerical way to test span membership is sketched below.
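A quick numerical sketch of span membership (my own example, not from the text): in Fn=R3, a vector v lies in span(v1,⋯,vm) exactly when appending v to the list does not increase the rank of the matrix whose columns are the list.

```python
# Span membership check in R^3; the vectors are made-up example data.
import numpy as np

v1, v2 = np.array([1., 0., 1.]), np.array([0., 1., 1.])
v = np.array([2., 3., 5.])    # equals 2*v1 + 3*v2, so it lies in span(v1, v2)

rank_before = np.linalg.matrix_rank(np.column_stack([v1, v2]))
rank_after = np.linalg.matrix_rank(np.column_stack([v1, v2, v]))
assert rank_after == rank_before    # v is in span(v1, v2)
```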
Theorem11Span is the smallest containing subspace
The span of a list of vectors in V is the smallest subspace of V containing all the vectors in the list.
Proof
First we show that span(v1,⋯,vm) is a subspace of V:
• The additive identity is in it: set all aj=0 in the formula above.
• It is closed under addition.
• It is closed under scalar multiplication.
Then we show that the span is the smallest such subspace:
• Each vj is a linear combination of v1,⋯,vm, thus is in the span; so the span contains all the vectors in the list.
• Conversely, any subspace containing all the vectors in the list is closed under addition and scalar multiplication, so it must contain every linear combination of them — that is, it must contain the span. □
Now we can make one of the key definitions in linear algebra!
Definition12finite-dimensional vector space
A vector space is called finite-dimensional if some list of vectors in it spans the space.
Recall that by definition every list has finite length.
Definition13polynomial, P(F)
• A function p:F→F is called a polynomial with coefficients in F if there exist a0,⋯,am∈F such that p(z)=a0+a1z+a2z²+⋯+amzᵐ for all z∈F.
• P(F) is the set of all polynomials with coefficients in F.
With the usual operations of addition and scalar multiplication, P(F) is a vector space over F. In other words, P(F) is a subspace of FF, the vector space of functions from F to F.If a polynomial is represented by two sets of coefficients, then subtracting one representation of the polynomial from the other produces a polynomial that is identically zero as a function on F and hence has all zero coefficients. Conclusion: the coefficients of a polynomial are uniquely determined by the polynomial.
Definition14degree of a polynomial: deg p
A polynomial p is said to have degree m if there exist a0,⋯,am∈F with am≠0 such that p(z)=a0+a1z+⋯+amzᵐ for all z∈F. If p has degree m, we write deg p=m.
★ The polynomial that is identically 0 is said to have degree −∞.
In the next definition, we use the convention that −∞<m, which means that the polynomial 0 is in every set of polynomials Pm(F).
Definition15Pm(F)
For m a non-negative integer, Pm(F) denotes the set of all polynomials with coefficients in F and degree at most m.
Example15.1Pm(F) is a finite-dimensional vector space.
Pm(F) is a finite-dimensional vector space for each non-negative integer m. Note that Pm(F)=span(1,z,⋯,zᵐ); here we are slightly abusing notation by letting zᵏ denote a function, so zᵏ∈FF, and the functions 1,z,⋯,zᵐ together span the subspace Pm(F) of the vector space FF.
However, P(F) is infinite-dimensional
Example15.2P(F) is an infinite-dimensional vector space.
Consider any list of elements of P(F), and let m denote the highest degree appearing in the list. Then the span of the list cannot contain the polynomial zᵐ⁺¹.
Linear Independence
Suppose v1,⋯,vm∈V and v∈span(v1,⋯,vm). If 0 can be written as a linear combination of v1,⋯,vm in only the trivial way, 0v1+0v2+⋯+0vm, then each vector in the span has a unique representation. This situation is so important that we give it a special name: linear independence.
Definition16Linearly independent
A list v1,⋯,vm of vectors in V is called linearly independent if the only choice of a1,...,am∈F that makes a1v1+⋯+amvm equal 0 is a1=⋯=am=0.★ The empty list ( ) is also declared to be linearly independent.
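The definition can be restated computationally (a sketch with made-up data, for vectors in R3): v1,⋯,vm is linearly independent exactly when the matrix with columns v1,⋯,vm has rank m, since rank m means a1v1+⋯+amvm=0 forces all aj=0.

```python
# Linear independence test via matrix rank; example vectors are my own.
import numpy as np

vs = [np.array([1., 0., 0.]), np.array([1., 1., 0.]), np.array([1., 1., 1.])]
A = np.column_stack(vs)
print(np.linalg.matrix_rank(A) == len(vs))   # True: the list is linearly independent
```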
The reasoning above shows that v1,⋯,vm is linearly independent if and only if each vector in span(v1,⋯,vm) has only one representation as a linear combination of v1,⋯,vm. Now we define linear dependence.
Definition17Linearly dependent
A list v1,⋯,vm of vectors in V is called linearly dependent if it is not linearly independent. In other words, a list of vectors is linearly dependent if there exist a1,⋯,am∈F, not all 0, such that a1v1+⋯+amvm=0.
Lemma17.1Linear Dependence Lemma
Suppose v1,⋯,vm is a linearly dependent list in V. Then there exists j∈{1,2,⋯,m} such that the following hold:
★ vj∈span(v1,⋯,vj-1)
★ if the jth term is removed from v1,⋯,vm, the span of the remaining list equals span(v1,⋯,vm).
The proof is easy: take a relation 0=a1v1+⋯+amvm with not all aj zero, let j be the largest index with aj≠0, and solve for vj in terms of the other vectors in the list.
The second bullet above is very useful. It means that in many arguments we can add a vector to a spanning list and then remove another vector without changing the span:
Theorem18Length of linearly independent list ⩽ length of spanning list
In a finite-dimensional vector space, the length of every linearly independent list of vectors is less than or equal to the length of every spanning list of vectors.
Proof
Denote the vectors in the linearly independent list by u1,⋯,um and those in the spanning list by w1,⋯,wn. We do the following steps:
(1) At each step, adjoin the next uj to the spanning list; the new list is linearly dependent, because the old list already spanned V.
(2) By the Linear Dependence Lemma, some vector in the new list is in the span of the previous ones and can be removed without changing the span; the removed vector must be one of the w's, because u1,⋯,uj are linearly independent.
Each step trades one w for one u and keeps the list spanning with length n. If we had m>n, the process would run out of w's before placing all the u's, leaving some uj in the span of the others — contradicting independence. Thus m⩽n. □
We can use this theorem to prove some results that look "trivially" true:
Theorem19Finite-dimensional subspaces
Every subspace of a finite-dimensional vector space is finite-dimensional.
Proof
Let U be a subspace of the finite-dimensional space V. Carry out the following process, starting at step j=1: if U=span(v1,⋯,vj-1) (where the empty span is {0}), then U is finite-dimensional and we stop; otherwise, choose a vector vj∈U such that vj∉span(v1,⋯,vj-1). After each step, as long as the process continues, we have constructed a list of vectors such that no vector in it is in the span of the previous vectors; thus the list is linearly independent. A linearly independent list cannot be longer than a spanning list of V, so the process must terminate, which means U is finite-dimensional. □
2.B Bases
In the last section, we discussed linearly independent lists and spanning lists. Now we bring these concepts together.
Definition20basis
A basis of V is a list of vectors in V that is linearly independent and spans V.
Theorem21Criterion for basis
A list v1,⋯vn of vectors in V is a basis of V if and only if every v∈V can be written uniquely in the formv=a1v1+⋯+anvnwhere a1,⋯,an∈F.
Theorem22spanning list contains a basis
Every spanning list in a vector space can be reduced to a basis of the vector space.
Proof
★ If the spanning list is linearly independent, it is a basis.
★ Otherwise, by the Linear Dependence Lemma, remove a vector that is in the span of the previous ones (this does not change the span); repeat until the list becomes linearly independent. □
Theorem23Basis of finite-dimensional vector space
Any finite-dimensional vector space has a basis: by definition some list spans it, and every spanning list contains a basis.
Our next theorem is in some sense a dual of Theorem "spanning list contains a basis"
Theorem24Linearly independent list extends to a basis
Every linearly independent list of vectors in a finite-dimensional vector space can be extended to a basis of the vector space.
Proof
Informal idea: keep adjoining vectors that preserve linear independence until the list spans the space.
Formally: adjoin a basis of the space to the end of the linearly independent list v1,⋯,vm, then reduce as in the theorem "spanning list contains a basis": remove each vector that is in the span of the previous ones. Because v1,⋯,vm is linearly independent, none of the vj is removed, so the result is a basis extending v1,⋯,vm. □
As an application of the results above, we now show that every subspace of a finite-dimensional vector space can be paired with another subspace to form a direct sum of the whole space
Theorem25Every subspace of V is part of a direct sum equal to V
Suppose V is finite-dimensional and U is a subspace of V. Then there is a subspace W of V such that V=U⊕W. (Proof sketch: extend a basis of U to a basis of V and let W be the span of the added vectors.)
2.C Dimension
Although we have been discussing finite-dimensional vector spaces, we have not yet defined the dimension of such an object. A reasonable definition should force the dimension of Fn to equal n. Notice that the standard basis (1,0,⋯,0),⋯,(0,⋯,0,1) of Fn has length n. Thus we try to define the dimension as the length of a basis. However, before doing this, can we promise that every basis of a vector space has equal length?
Theorem26Basis length does not depend on basis
Any two bases of a finite-dimensional vector space have the same length.
Proof
Suppose V is finite-dimensional. Let B1 and B2 be two bases of V. Then B1 is a linearly independent list and B2 is a spanning list, so the length of B1 is at most the length of B2. Interchanging the roles gives the reverse inequality, so the two lengths are equal. □
Now we can formally define the dimension of such spaces:
Definition27dimension, dim V
The dimension of a finite-dimensional vector space is the length of any basis of the vector space.
Example27.1dimPm(F)=m+1
The vector space Pm(F) has dimension dim Pm(F)=m+1, because 1,z,⋯,zᵐ is a basis of it.
To check that a list of vectors in V is a basis of V , we must, according to the definition, show that the list in question satisfies two properties. However, sometimes it's easier than this:
Theorem28Linearly independent list of right length is a basis
Suppose V is finite-dimensional. Then every linearly independent list of vectors in V with length dim V is a basis of V.
Proof
Suppose dim V=n and v1,⋯,vn is linearly independent in V. The list v1,⋯,vn can be extended to a basis of V; but every basis of V has length n, so nothing is added and the list itself is already a basis. □
Example28.1Show that list (5,7),(4,3) is a basis of F2
The list has length 2=dim F2, and it is linearly independent because neither vector is a scalar multiple of the other; by the theorem above it is a basis. (Just remember that two vectors separated by a comma form a list.)
Similarly, we have
Theorem29Spanning list of right length is a basis
Suppose V is finite-dimensional. Then every spanning list of vectors in V with length dim V is a basis of V.
Proof
Every spanning list contains a basis, so some sublist is a basis. But all bases have length dim V, which is the length of the whole list; thus nothing is removed and the list itself is a basis. □
The next result gives a formula for the dimension of the sum of two subspaces of a finite-dimensional vector space. The formula is analogous to a familiar counting formula: the number of elements in the union of two sets equals the number of elements in the first set, plus the number of elements in the second set, minus the number of elements in the intersection of the two sets.
Theorem30Dimension of a sum
If U1 and U2 are subspaces of a finite-dimensional vector space, then dim(U1+U2)=dim U1+dim U2−dim(U1∩U2).
Proof
Let u1,⋯,um be a basis of U1∩U2. Because u1,⋯,um are linearly independent vectors in both U1 and U2, we can extend them to a basis u1,⋯,um,v1,⋯,vn of U1 and to a basis u1,⋯,um,w1,⋯,wk of U2. We will prove that u1,⋯,um,v1,⋯,vn,w1,⋯,wk is a basis of U1+U2, which gives the dimension formula (m+n)+(m+k)−m=m+n+k.
★ First, it is easy to show that u1,⋯,um,v1,⋯,vn,w1,⋯,wk spans U1+U2, because its span contains both U1 and U2.
★ For linear independence, suppose a1u1+⋯+amum+b1v1+⋯+bnvn+c1w1+⋯+ckwk=0. Then c1w1+⋯+ckwk=−(a1u1+⋯+amum+b1v1+⋯+bnvn)∈U1; since the w's lie in U2, this vector lies in U1∩U2, so it can be written in terms of u1,⋯,um alone. Because u1,⋯,um,w1,⋯,wk is linearly independent, all the cj equal 0. The equation then reduces to a linear combination of the basis u1,⋯,um,v1,⋯,vn of U1 equal to 0, so all the remaining coefficients are 0 as well.
★ So u1,⋯,um,v1,⋯,vn,w1,⋯,wk is a linearly independent spanning list, i.e. a basis. □
A numerical check of the formula is sketched below.
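Here is a small numerical check of the formula (a sketch, with subspaces of R3 of my own choosing, given as column spans of matrices):

```python
# Verify dim(U1+U2) = dim U1 + dim U2 - dim(U1 ∩ U2) on example data.
import numpy as np
from scipy.linalg import null_space

A = np.array([[1., 0.], [0., 1.], [0., 0.]])   # U1 = span{e1, e2}
B = np.array([[0., 0.], [1., 0.], [0., 1.]])   # U2 = span{e2, e3}

dim_U1 = np.linalg.matrix_rank(A)
dim_U2 = np.linalg.matrix_rank(B)
dim_sum = np.linalg.matrix_rank(np.hstack([A, B]))   # columns of [A B] span U1+U2

# A vector is in U1 ∩ U2 iff it equals A@a = B@b, i.e. (a, b) is in the
# null space of the block matrix [A, -B].
N = null_space(np.hstack([A, -B]))
dim_int = np.linalg.matrix_rank(A @ N[:A.shape[1], :])

assert dim_sum == dim_U1 + dim_U2 - dim_int          # 3 == 2 + 2 - 1
```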
Chapter 3. Linear Maps
Here comes something!
★ Fundamental Theorem of Linear Maps
★ the matrix of a linear map with respect to given bases
★ isomorphic vector spaces
★ product spaces
★ quotient spaces
★ the dual space of a vector space and the dual of a linear map
3.A The Vector Space of Linear Maps
Definition and Examples of Linear Maps
Now we are ready for one of the key definitions in linear algebra.
Definition31linear map
A linear map from V to W is a function T:V→W with the following properties:
★ additivity: T(u+v)=Tu+Tv for all u,v∈V;
★ homogeneity: T(𝜆v)=𝜆T(v) for all 𝜆∈F and all v∈V.
The set of all linear maps from V to W is denoted L(V,W).
More information
★ Some mathematicians use the term linear transformation to mean the same as linear map.
★ For linear maps we often use the notation Tv as well as the more standard functional notation T(v).
Example31.1Some linear maps
★ zero map★ identity map★ differential★ integral★ multiplication by x2: T∈L(P(R),P(R)),(Tp)(x):=x2p(x)★ backward shift: T∈L(F∞,F∞), T(x1,x2,x3,⋯):=(x2,x3,⋯)
The existence part of the next result means we can find a linear map that takes on whatever values we wish on the vectors in a basis. The uniqueness part means that a linear map is completely determined by its values on a basis.
Theorem32Linear maps and basis of domain
Suppose v1,⋯,vn is a basis of V and w1,⋯,wn∈W. Then there exists a unique linear map T:V→W such that Tvj=wj for each j=1,⋯,n.
Proof
We can define a map that sends each vj to wj: T(a1v1+⋯+anvn):=a1w1+⋯+anwn. It is easy to check that T is linear. For uniqueness, if S∈L(V,W) also maps each vj to wj, then S and T agree on the basis, and by linearity they agree on every vector of V; hence S=T, so there is only one such map. □
Algebraic Operations on L(V,W)
We begin by defining addition and scalar multiplication on L(V,W).
Definition33addition and scalar multiplication on L(V,W)
Suppose S,T∈L(V,W) and 𝜆∈F. The sumS+T and the product 𝜆T are the linear maps from V to W defined by(S+T)(v):=Sv+Tv(𝜆T)(v):=𝜆(Tv)
The next result should not be a surprise.
Theorem34L(V,W) is a vector space
With the operations of addition and scalar multiplication defined above, L(V,W) is a vector space.
Usually it makes no sense to multiply together two elements of a vector space, but for some pairs of linear maps a useful product exists. We will need a third vector space, so for the rest of this section suppose U is a vector space over F
Definition35Product of Linear Maps
If T∈L(U,V) and S∈L(V,W), then the product ST∈L(U,W) is defined by (ST)(u):=S(Tu). This is just notation for mapping twice: it is the usual composition S∘T of two functions. Note that ST is defined only when T maps into the domain of S.
3.B Null Spaces and Ranges
Null Spaces and Injectivity
In this section we will learn about two subspaces that are intimately connected with each linear map.
Definition36null space null T
For T∈L(V,W), the null space of T, denoted null T, is the subset of V consisting of those vectors that T maps to 0:null T={v∈V: Tv=0}
Example36.1null space
★ For the zero map T∈L(V,W), null T=V.
★ Suppose 𝜑∈L(C3,C) is defined by 𝜑(z1,z2,z3)=z1+2z2+3z3. Then null 𝜑={(z1,z2,z3)∈C3: z1+2z2+3z3=0}.
Some mathematicians use the term kernel instead of null space.
Theorem37The null space is a subspace
Suppose T∈L(V,W). Then null T is a subspace of V.
Proof
0∈null T because T0=0, and null T is closed under addition and scalar multiplication because T is linear: if Tu=Tv=0, then T(u+v)=0 and T(𝜆u)=0. □
As we will soon see, for a linear map the next definition is closely connected to the null space.
Definition38injective
A function T is called injective (one-to-one) if Tu=Tv implies u=v.
Theorem39The null space and injectivity
Let T∈L(V,W). Then T is injective if and only if null T={0}.
Range and Surjectivity
Definition40Range
For T a function from V to W, the range of T is the subset of W consisting of those vectors that are of the form Tv for some v∈V: range T={Tv:v∈V}. Recall that V is called the domain of T here. The range is also called the image.
Theorem41The range is a subspace
Suppose T∈L(V,W). Then range T is a subspace of W.
Proof
0=T0∈range T, and range T is closed under addition and scalar multiplication because T is linear: Tu+Tv=T(u+v) and 𝜆Tv=T(𝜆v). □
Definition42surjective
A function T:V→W is called surjective if its range equals W.
Fundamental Theorem of Linear Maps
The next result is so important that it gets a dramatic name.
Theorem43Fundamental Theorem of Linear Maps
Suppose V is finite-dimensional and T∈L(V,W). Then range T is finite-dimensional and dim V=dim null T+dim range T.
Proof
Suppose V has a basis v1,v2,⋯,vm of length m, and set wj=Tvj. Clearly w1,⋯,wm spans range T, so range T is finite-dimensional; remove vectors until the remaining list is a basis of range T, and relabel so that w1,⋯,wk is that basis. Thus dim range T=k, and each wj with j⩾k+1 can be written wj=a1w1+⋯+akwk (with coefficients depending on j).
Now we prove that dim null T=m−k. For each j⩾k+1, set uj=vj−(a1v1+⋯+akvk); then Tuj=wj−(a1w1+⋯+akwk)=0, so uk+1,⋯,um∈null T. This list is linearly independent: expanded in the basis v1,⋯,vm, each uj contains vj with coefficient 1 and no other vi with i⩾k+1. It also spans null T: suppose Tv=0 and expand v=c1v1+⋯+cmvm; replacing each vj with j⩾k+1 by uj plus a combination of v1,⋯,vk and collecting coefficients gives v=c1′v1+⋯+ck′vk+ck+1uk+1+⋯+cmum. Applying T kills the u-terms, so 0=Tv=c1′w1+⋯+ck′wk; since w1,⋯,wk is a basis of range T, all the cj′ equal 0. Hence every vector of null T is a combination of uk+1,⋯,um.
Therefore dim null T=m−k, and dim V=m=(m−k)+k=dim null T+dim range T. □
A numerical check is sketched below.
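A numerical sanity check of the theorem (a sketch with a made-up matrix, thinking of T:R5→R3 as multiplication by M):

```python
# Rank-nullity check: dim V = dim null T + dim range T.
import numpy as np
from scipy.linalg import null_space

M = np.array([[1., 2., 0., 1., 0.],
              [0., 1., 1., 0., 0.],
              [1., 3., 1., 1., 0.]])

dim_V = M.shape[1]                        # dimension of the domain
dim_range = np.linalg.matrix_rank(M)      # dim range T
dim_null = null_space(M).shape[1]         # dim null T

assert dim_V == dim_null + dim_range      # 5 == 3 + 2
```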
Now we can show that no linear map from a finite-dimensional space to a "smaller" space is injective, where "smaller" is measured by dimension
Theorem44A map to a smaller dimensional space is not injective
Proof
By the Fundamental Theorem of Linear Maps, dim null T=dim V−dim range T⩾dim V−dim W>0, so null T≠{0} and T is not injective. □
Theorem45A map to a larger dimensional space is not surjective
Proof
Similarly, dim range T=dim V−dim null T⩽dim V<dim W, so range T≠W and T is not surjective. □
The term "homogeneous" below means that all the constant value on the right of the equation system equal 0.
Example45.1Rephrase in terms of a linear map the question of whether a homogeneous system of linear equations has a nonzero solution
The equation T(x1,⋯,xn)=0 means that (x1,⋯,xn) is in the null space of the corresponding linear map T. Thus after rephrasing, the question becomes: is the linear map T not injective? A nonzero solution exists if and only if null T≠{0}.
Theorem46Homogeneous system of linear equations
A homogeneous system of linear equations with more variables than equations has nonzero solutions.
Proof
More variables than equations means the corresponding linear map goes from Fn to Fm with n>m, i.e. into a smaller-dimensional space; such a map is not injective, so the null space contains nonzero vectors. □
Example46.1Whether an inhomogeneous system of linear equations has no solutions for some choice of the constant terms
If range T is a proper subspace of W, then there are choices of the constant terms for which the equations have no solution. So the question can be rephrased as: is the linear map T surjective?
Theorem47Inhomogeneous system of linear equations
An inhomogeneous system of linear equations with fewer variables than equations has no solution for some choice of the constant terms.
Proof
Fewer variables than equations means the corresponding linear map goes from Fn to Fm with n<m, i.e. into a larger-dimensional space; such a map is not surjective, and any choice of constant terms outside range T yields a system with no solution. □
3.C Matrices
Representing a Linear Map by a Matrix
Definition48Matrix of a linear map, M(T)
Suppose T∈L(V,W) and v1,⋯,vn is a basis of V and w1,⋯,wm is a basis of W. The matrix of T with respect to these bases is the m-by-n matrix M(T) whose entries Aj,k are defined by Tvk=A1,kw1+⋯+Am,kwm. If the bases are not clear from the context, then the notation M(T,(v1,⋯,vn),(w1,⋯,wm)) is used.
The matrix M(T) of a linear map T∈L(V,W) depends on the basis v1,⋯,vn of V and the basis w1,⋯,wm of W, as well as on T. However, the bases should be clear from the context, and thus they are often not included in the notation.
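To make the definition concrete, here is a sketch (my own example, not from the text) that builds M(T) column by column for the differentiation map T=d/dx from P3(R) to P2(R), with bases 1,x,x²,x³ and 1,x,x²:

```python
# Build M(T) for T = d/dx: the k-th column holds the coefficients of T(v_k)
# expanded in the basis of W.
import sympy as sp

x = sp.symbols('x')
basis_V = [1, x, x**2, x**3]
basis_W = [1, x, x**2]

cols = []
for v in basis_V:
    Tv = sp.Poly(sp.diff(v, x), x)                      # apply T to a basis vector
    cols.append([Tv.coeff_monomial(w) for w in basis_W])

M = sp.Matrix(cols).T        # the m-by-n (3-by-4) matrix of T
print(M)                     # Matrix([[0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 3]])
```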
Definition49Notation Fm,n
For m and n positive integers, the set of all m-by-n matrices with entries in F is denoted by Fm,n.
Theorem50Linear combination of columns
Suppose A is an m-by-n matrix and c is the n-by-1 matrix with entries c1,⋯,cn. Then
Ac=c1A.,1+⋯+cnA.,n
In other words, Ac is a linear combination of the columns of A, with the scalars that multiply the columns coming from c. This is sketched numerically below.
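A quick numerical check (with arbitrary example data of my own):

```python
# A @ c equals the linear combination of the columns of A with scalars from c.
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])
c = np.array([2., -1., 3.])

combo = c[0] * A[:, 0] + c[1] * A[:, 1] + c[2] * A[:, 2]
assert np.allclose(A @ c, combo)
```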
3.D Invertibility and Isomorphic Vector Spaces
Invertible Linear Maps
Definition51invertible, inverse
A linear map T∈L(V,W) is called invertible if there exists a linear map S∈L(W,V) such that ST equals the identity map on V and TS equals the identity map on W. A linear map S∈L(W,V) satisfying ST=I and TS=I is called an inverse of T (note that the first I is the identity on V and the second I is the identity on W).
Theorem52Inverse is unique
An invertible linear map has a unique inverse.
Proof
Suppose S1 and S2 are both inverses of T. Then S1=S1I=S1(TS2)=(S1T)S2=IS2=S2. Thus S1=S2. □
Theorem53Invertibility is equivalent to injectivity and surjectivity
A linear map is invertible if and only if it is injective and surjective.
Proof
★ The "only if" direction is easy to prove.
★ Conversely, suppose T is injective and surjective. Define S:W→V by letting S(w) be the unique v∈V with Tv=w (surjectivity gives existence, injectivity gives uniqueness), so that T(S(w))=w.
★ Clearly T∘S=I on W. Then what about S∘T? For each v∈V we have T(S(Tv))=Tv, and injectivity of T gives S(Tv)=v.
★ To complete the proof, one checks that S is linear as well: apply T to S(w1+w2)−S(w1)−S(w2) and use injectivity of T. □
Isomorphic Vector Spaces
The next definition captures the idea of two vector spaces that are essentially the same, except for the names of the elements of the vector spaces.
Definition54isomorphic, isomorphism
★ An isomorphism is an invertible linear map★ Two vector spaces are called isomorphic if there is an isomorphism from one vector space onto the other one.
More information
The Greek word isos means equal; the Greek word morph means shape. Thus isomorphic literally means equal shape. "Isomorphism" means the same thing as "invertible linear map"; use "isomorphism" when you want to emphasize that the two spaces are essentially the same.
Theorem55Dimension shows whether vector spaces are isomorphic
Two finite-dimensional vector spaces over F are isomorphic if and only if they have the same dimension.
Proof
First, suppose the vector spaces V and W are isomorphic, and let T:V→W be an isomorphism. Invertibility forces null T={0} and range T=W, so by the Fundamental Theorem of Linear Maps, dim W=dim range T=dim V.
To complete the proof in the other direction, suppose the two finite-dimensional spaces have the same dimension. Choose bases v1,⋯,vn of V and w1,⋯,wn of W, and construct the linear map that preserves the coefficients in these bases: T(c1v1+⋯+cnvn)=c1w1+⋯+cnwn. It is injective and surjective, hence an isomorphism. □
The previous result implies that each finite-dimensional vector space V is isomorphic to Fn, where n=dim V.If v1,⋯,vn is a basis of V and w1,⋯,wm is a basis of W, then for each T∈L(V,W), we have a matrix M(T)∈Fm,n. In other words, once bases have been fixed for V and W, M becomes a function from L(V,W) to Fm,n. Notice M is a linear map, and it's actually invertible, as we will show.
Theorem56L(V,W) and Fm,n are isomorphic
Suppose v1,⋯,vn is a basis of V and w1,⋯,wm is a basis of W, Then M is an isomorphism between L(V,W) and Fm,n.
Theorem57dimL(V,W)=(dim V )(dim W )
M is an isomorphism between L(V,W) and Fm,n, so L(V,W) has the same dimension as Fm,n, and dim Fm,n=mn. Therefore dim L(V,W)=(dim V)(dim W).
Linear Maps Thought of as Matrix Multiplication
Previously we defined the matrix of a linear map. Now we define the matrix of a vector.
Definition58matrix of a vector, M(v)
Suppose v∈V and v1,⋯,vn is a basis of V. For v=c1v1+⋯+cnvn, define M(v) to be the n-by-1 matrix with entries c1,⋯,cn. The map v↦M(v) is an isomorphism of V onto Fn,1.
Theorem59M(T).,k=M(Tvk)
Suppose T∈L(V,W) and v1,⋯,vn is a basis of V and w1,⋯,wm is a basis of W. Write A=M(T), so that Tvk=A1,kw1+⋯+Am,kwm. Then M(Tvk) is the m-by-1 matrix with entries A1,k,⋯,Am,k, which is exactly the kth column A.,k of M(T).
It means that, for a linear map, the kth column of its matrix equals the matrix of the image of the kth basis vector.
Theorem60Linear maps act like matrix multiplication
Suppose v=a1v1+⋯+anvn. Then
M(Tv)=a1M(Tv1)+a2M(Tv2)+⋯+anM(Tvn)
     =a1M(T).,1+⋯+anM(T).,n
     =M(T)M(v)
Because the result above allows us to think (via isomorphism) of each linear map as multiplication of Fn,1 by some matrix A, keep in mind that the specific matrix A depends not only on the linear map but also on the choice of bases. One of the themes of many of the most important results in later chapters will be the choice of a basis that makes the matrix A as simple as possible.
Operators
Definition61operator, L(V)
A linear map from a vector space to itself is called an operator. We write L(V)=L(V,V).
Recall that injectivity together with surjectivity implies invertibility. We now show that for an operator on a finite-dimensional vector space, injectivity and surjectivity are equivalent to each other.
Theorem62injectivity is equivalent to surjectivity in finite-dimensional operator
Suppose V is finite-dimensional and T∈L(V). Then T is injective if and only if T is surjective.
Proof
Use the Fundamental Theorem of Linear Maps. Injectivity means null T={0}, so dim range T=dim V−0=dim V, hence range T=V and T is surjective. Conversely, surjectivity means dim range T=dim V, so dim null T=0 and T is injective. □
3.E Products and Quotients of Vector Spaces
Products of Vector Spaces
As usual when dealing with more than one vector space, all the vector spaces in use should be over the same field.
Definition63product of vector spaces
Suppose V1,⋯,Vm are vector spaces over F.
★ The product V1×⋯×Vm is defined by V1×⋯×Vm={(v1,⋯,vm):v1∈V1,⋯,vm∈Vm}.
★ Addition on V1×⋯×Vm is defined by (u1,⋯,um)+(v1,⋯,vm)=(u1+v1,⋯,um+vm).
★ Scalar multiplication on V1×⋯×Vm is defined similarly.
Example63.1R2×R3 is isomorphic to R5
In this case, the isomorphism is so natural that we should think of it as a relabeling. Some people would even informally say that R2×R3=R5, which is not technically correct but which captures the spirit of identification via relabeling.
Example63.2Find a basis of P2(R)×R2
Recall that the dimension of a product is the sum of the dimensions (here 3+2=5). One basis is: (1,(0,0)); (x,(0,0)); (x2,(0,0)); (0,(1,0)); (0,(0,1)).
Theorem64Dimension of a product is the sum of dimensions
dim(V1×⋯×Vm)=dim V1+⋯+dim Vm
Products and Direct Sums
In the next result, the map Γ is surjective by the definition of U1+⋯+Um. Thus the last word in the result below could be changed from "injective" to "invertible".
Theorem65Products and direct sums
Suppose that U1,⋯,Um are subspaces of V. Define a linear map Γ:U1×⋯×Um→U1+⋯+Um by Γ(u1,⋯,um)=u1+⋯+um. Then U1+⋯+Um is a direct sum if and only if Γ is injective.
Proof
If Γ is injective, then for each vector u1+⋯+um in the sum, the tuple (u1,⋯,um) producing it is unique, which is exactly the definition of a direct sum; and conversely. □
Theorem66A sum is a direct sum if and only if dimensions add up
Suppose U1,⋯,Um are finite-dimensional subspaces of V. Then U1+⋯+Um is a direct sum if and only if dim(U1+⋯+Um)=dim U1+⋯+dim Um.
Proof
The product U1×⋯×Um has dimension dim U1+⋯+dim Um. The sum is a direct sum if and only if Γ is injective, i.e. (since Γ is already surjective) if and only if Γ is invertible. A linear map between finite-dimensional spaces can be an isomorphism only if the dimension of the range equals the dimension of the domain, so this happens exactly when dim(U1+⋯+Um)=dim U1+⋯+dim Um. □
Quotients of Vector Spaces
We begin our approach to quotient spaces by defining the sum of a vector and a subspace.
Definition67v+U
Suppose v∈V and U is a subspace of V. Then v+U is the subset of V defined by v+U={v+u:u∈U}.
Definition68affine subset, parallel
★ An affine subset is a subset of V of the form v+U for some v∈V and some subspace U of V.★ For v∈V and U a subspace of V, the affine subset v+U is said to be parallel to U
Definition69quotient space, V/U
Suppose U is a subspace of V. Then the quotient space V/U is the set of all affine subsets of V parallel to U. In other words,V/U={v+U:v∈V}.
Our next goal is to make V/U into a vector space. To do this, we will need the following result.
Theorem70Two affine subsets parallel to U are equal or disjoint
Suppose U is a subspace of V and v,w∈V. Then the following are equivalent:
(1) v−w∈U
(2) v+U=w+U
(3) (v+U)∩(w+U)≠∅
Proof
If (1) holds, then for every u∈U we have v+u=w+((v−w)+u)∈w+U, so v+U⊂w+U; by symmetry, w+U⊂v+U, so (2) holds. (2) implies (3), because v+U is nonempty. Finally, if (3) holds, there exist u1,u2∈U such that v+u1=w+u2; thus v−w=u2−u1∈U, which is (1). □
Now we can define addition and scalar multiplication on V/U
Definition71addition and scalar multiplication on V/U
Defined as followed:(v+U)+(w+U):=(v+w)+U𝜆(v+U):=𝜆v+U
Theorem72Quotient Space is a Vector Space
Suppose U is a subspace of V. Then V/U, with the operations of addition and scalar multiplication as defined above, is a vector space.The additive identity of V/U is 0+U and that the additive inverse of v+U is (-v)+U.
Fig1: Quotient space — V, a subspace U of V, and V/U.
The next concept gives us an easy way to compute the dimension of V/U.
Definition73quotient map, 𝜋
Suppose U is a subspace of V. The quotient map 𝜋 is the linear map 𝜋:V→V/U defined by 𝜋(v):=v+U for v∈V. Note that 𝜋 depends on U as well as on V, although the notation does not show this.
Theorem74Dimension of a quotient space
Suppose V is finite-dimensional and U is a subspace of V. Then dim V/U=dim V−dim U.
Proof
Use the Fundamental Theorem of Linear Maps on the quotient map 𝜋: its null space is U itself and its range is V/U, so dim V=dim U+dim V/U, which gives the desired result. □
Each linear map T on V induces a linear map T̃ on V/(null T), which we now define.
Definition75T̃
Suppose T∈L(V,W). Define T̃:V/(null T)→W by T̃(v+null T)=Tv. The point is that all vectors in an affine subset parallel to null T have the same image under T.
To show that the definition of T̃ makes sense, suppose u,v∈V are such that u+null T=v+null T. Then, by the theorem on parallel affine subsets, u−v∈null T. Thus T(u−v)=0, hence Tu=Tv. The definition of T̃ indeed makes sense.
Theorem76Null space and range of T̃
Suppose T∈L(V,W). Then
(1) T̃ is a linear map from V/(null T) to W;
(2) T̃ is injective;
(3) range T̃=range T;
(4) V/(null T) is isomorphic to range T.
Proof
(1) omitted.
(2) We need to prove that null T̃={0}. Suppose T̃(v+null T)=0. Then Tv=0, so v∈null T, and hence v+null T=0+null T, which is the additive identity of the quotient space V/(null T). Thus null T̃={0}.
(3) follows directly from the definition of T̃, and (4) follows from (2) and (3): T̃ is an injective map of V/(null T) onto range T. □
3.F Duality
The Dual Space and the Dual Map
Linear maps into the scalar field F play a special role in linear algebra, and thus they get a special name:
Definition77linear functional
A linear functional on V is a linear map from V to F. In other words, a linear functional is an element of L(V,F).
Example77.1linear functionals
★ Define 𝜑:R3→R by 𝜑(x,y,z):=4x−5y+2z. Then 𝜑 is a linear functional on R3.
★ Fix (c1,⋯,cn)∈Fn. Define 𝜑:Fn→F by 𝜑(x1,⋯,xn)=c1x1+⋯+cnxn.
★ Define 𝜑:P(R)→R by 𝜑(p)=∫₀¹ p(x)dx.
★ Define 𝜑:P(R)→R by 𝜑(p)=3p″(5)+7p(4).
The vector space L(V,F) also gets a special name and special notation:
Definition78dual space, V'
The dual space of V, denoted V', is the vector space of all linear functionals on V. In other words, V'=L(V,F).
Theorem79dim V'=dim V
Suppose V is finite-dimensional. Then dim V'=dim V.
Proof
dim V'=dim L(V,F)=(dim V)(dim F)=dim V. □
Definition80dual basis
If v1,⋯,vn is a basis of V, then the dual basis of v1,⋯,vn is the list 𝜑1,⋯,𝜑n of elements of V', where each 𝜑j is the linear functional on V such that 𝜑j(vk)=1 if k=j, and 𝜑j(vk)=0 if k≠j. A concrete sketch for Fn appears below.
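For V=Fn=R3 the dual basis can be computed concretely (a sketch of my own, identifying each functional on Fn with a row vector): if the columns of B are the basis v1,⋯,vn, the rows of B⁻¹ represent 𝜑1,⋯,𝜑n, because (B⁻¹B)j,k=𝜑j(vk).

```python
# Dual basis in R^3 as the rows of the inverse of the basis matrix.
import numpy as np

B = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [0., 0., 1.]])       # columns are the basis v1, v2, v3
Phi = np.linalg.inv(B)             # row j represents the functional phi_j

assert np.allclose(Phi @ B, np.eye(3))   # phi_j(v_k) = 1 if j == k else 0
```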
The next result shows that the dual basis is indeed a basis.
Theorem81Dual basis is a basis of the dual space
Suppose V is finite-dimensional. Then the dual basis of a basis of V is a basis of V'.
Proof
Suppose a1𝜑1+⋯+an𝜑n=0. Applying both sides to vk gives ak=0 for each k, so 𝜑1,⋯,𝜑n is linearly independent. Since dim V'=dim V=n, a linearly independent list of length n is a basis. □
Definition82dual map, T'
If T∈L(V,W), then the dual map of T is the linear map T'∈L(W',V') defined by T'(𝜑)=𝜑∘T for 𝜑∈W'. Note that 𝜑∈W', so 𝜑∘T is a linear functional on V, i.e. T'(𝜑)∈V'. The dual map takes a linear functional in the dual space of W and finds the corresponding linear functional in V': the one that sends any v to 𝜑(Tv).
Fig2: the dual map (T:V→W induces T':W'→V' on linear functionals).
Example82.1dual map
It is like mapping twice. When you map v∈V into W, a linear functional on W then maps the result into F; composing gives a linear functional on V, and you want these two functionals to be connected. For example, suppose you choose bases that make T very simple: T(a1v1+⋯+anvn)=a1w1+⋯+anwn. If 𝜓j is the jth dual basis element of W', then the dual map produces v'=𝜓j∘T∈V', and v'(a1v1+⋯+anvn)=𝜓j(a1w1+⋯+anwn)=aj; that is, v' is the jth dual basis element of V'.
Example82.2an image of the dual map
(Diagram: a 2-by-3 matrix with entries a11,⋯,a23 maps (x1,x2,x3) to (y1,y2); the dual map, represented by a 3-by-2 matrix with entries b11,⋯,b23, sends functionals on the target space back to functionals on the domain.)
Theorem83Algebraic properties of dual maps
★ (S+T)'=S'+T' for all S,T∈L(V,W)
★ (𝜆T)'=𝜆T' for all T∈L(V,W)
★ (ST)'=T'S' for all T∈L(U,V) and all S∈L(V,W)
Proof
★ (S+T)'𝜑=𝜑∘(S+T)=𝜑∘S+𝜑∘T=S'𝜑+T'𝜑.
★ Suppose 𝜑∈W'. Then (ST)'𝜑=𝜑∘(ST)=(𝜑∘S)∘T. Now 𝜑∘S is a linear functional on V, so by the definition of the dual map, (𝜑∘S)∘T=T'(𝜑∘S)=T'(S'𝜑). Hence (ST)'=T'S'. □
The Null Space and Range of the Dual of a Linear Map
Our goal in this subsection is to describe null T' and range T' in terms of range T and null T.
Definition84annihilator, U0
For U⊂V, the annihilator of U, denoted U0, is defined by U0={𝜑∈V':𝜑(u)=0 for all u∈U}.
One way to see how U0 arises from a map: let i:U→V be the inclusion map, i(u)=u. Its dual i':V'→U' sends each functional on V to its restriction to U. A functional 𝜑∈V' belongs to U0 exactly when its restriction to U is the zero functional — in other words, U0 is the null space of i'.
Example84.1 the annihilator of P(R)
Suppose U is the subspace of P(R) consisting of all polynomial multiples of x2. If 𝜑 is the linear functional on P(R) defined by 𝜑(p)=p'(0), then 𝜑∈U0.
For U⊂V, the annihilator U0 is a subset of the dual space V'. Thus U0 depends on the vector space containing U, so a notation such as U0V would be more precise.
Example84.2
Let e1,e2,⋯,e5 denote the standard basis of R5, and let 𝜑1,𝜑2,⋯,𝜑5 denote the dual basis of (R5)'. Suppose U=span(e1,e2)={(x1,x2,0,0,0)∈R5:x1,x2∈R}. Show that U0=span(𝜑3,𝜑4,𝜑5).
Solution: omitted.
Theorem85The annihilator is a subspace
Suppose U⊂V. Then U0 is a subspace of V'.
Proof
Clearly 0∈U0, and U0 is closed under addition and scalar multiplication: if 𝜑 and 𝜓 vanish on U, so do 𝜑+𝜓 and 𝜆𝜑. □
Theorem86Dimension of the annihilator
Suppose V is a finite-dimensional vector space and U is a subspace of V. Then dim U+dim U0=dim V.
Proof
Informally: extend a basis of U to a basis of V; the dual basis vectors corresponding to the added vectors span U0, so dim U0 equals the number of basis vectors "absent" from U, namely dim V−dim U.
Formally: let i∈L(U,V) be the inclusion map, i(u)=u, so that i'∈L(V',U'). By the Fundamental Theorem of Linear Maps,
dim null i'+dim range i'=dim V'=dim V.
Now null i'={𝜑∈V':𝜑(u)=0 for all u∈U}=U0, and range i'=U' (every functional on U extends to one on V), so dim range i'=dim U'=dim U. Hence dim U0+dim U=dim V. □
We have computed the null space of i', the dual of the inclusion of a subspace U into V. Now, what is the null space of the dual of a general linear map T?
Theorem87The null space of T'
Suppose V and W are finite-dimensional and T∈L(V,W). Then
★ null T'=(range T)0
★ dim null T'=dim null T+dim W−dim V
Proof
Recall that T'𝜑:=𝜑∘T. A functional 𝜑∈W' is in null T' exactly when (T'𝜑)(v)=𝜑(Tv)=0 for all v∈V, i.e. exactly when 𝜑 vanishes on range T, i.e. 𝜑∈(range T)0. This proves the first bullet.
For the second, use the dimension formula for the annihilator:
dim null T'=dim (range T)0=dim W−dim range T.
By the Fundamental Theorem, dim range T=dim V−dim null T, so
dim null T'=dim null T+dim W−dim V. □
Fig3: relationships between null T, null T', and (range T)0.
Theorem88T surjective is equivalent to T' injective
Suppose V and W are finite-dimensional and T∈L(V,W). Then T is surjective if and only if T' is injective.
Proof
T is surjective iff range T=W iff (range T)0={0} iff null T'={0} (by the previous theorem) iff T' is injective. □
Theorem89The range of T'
Suppose V and W are finite-dimensional and T∈L(V,W). Then
★ dim range T'=dim range T
★ range T'=(null T)0
Proof
For the dimensions:
dim range T'=dim W'−dim null T'=dim W−(dim null T+dim W−dim V)=dim V−dim null T=dim range T.
For range T': every 𝜓∈range T' has the form 𝜓=T'𝜑=𝜑∘T for some 𝜑∈W'. If v∈null T, then 𝜓(v)=𝜑(Tv)=𝜑(0)=0. So every 𝜓∈range T' maps every vector of null T to 0, i.e. range T'⊆(null T)0. Moreover, the computation above together with the annihilator dimension formula gives dim range T'=dim range T=dim V−dim null T=dim (null T)0, so the inclusion is an equality. □
Theorem90T injective is equivalent to T' surjective
Suppose V and W are finite-dimensional and T∈L(V,W). Then T is injective if and only if T' is surjective.
Proof
T is injective iff null T={0} iff (null T)0=V'. By the previous theorem, range T'=(null T)0, so this holds iff range T'=V', i.e. iff T' is surjective. □
The Matrix of the Dual Map of a Linear Map
We now define the transpose of a matrix.
Definition91Transpose
(At)k,j:=Aj,k
Theorem92The transpose of the product of matrices
(AC)t=CtAt
Proof
((AC)t)k,j=(AC)j,k=∑i Aj,iCi,k=∑i (Ct)k,i(At)i,j=(CtAt)k,j □
The setting for the next result is the assumption that we have a basis v1,⋯,vn of V, along with its dual basis 𝜑1,⋯,𝜑n of V'.
Theorem93The matrix of T' is the transpose of the matrix of T
Suppose T∈L(V,W). Then M(T')=(M(T))t.
Proof
Let A=M(T) with respect to bases v1,⋯,vn of V and w1,⋯,wm of W, and let 𝜓1,⋯,𝜓m and 𝜑1,⋯,𝜑n be the corresponding dual bases. Since Tvj=∑i Ai,jwi, applying 𝜓i gives 𝜓i(Tvj)=Ai,j. Take any w'=∑i ci𝜓i∈W'. Then
(T'w')(vj)=w'(Tvj)=∑i ci𝜓i(Tvj)=∑i Ai,jci.
On the other hand, expanding T'w'=∑j dj𝜑j in the dual basis of V' gives (T'w')(vj)=dj. Hence dj=∑i (At)j,i ci, i.e. M(T'w')=At·M(w') for every w'. Therefore M(T')=At=(M(T))t. □
The Rank of a Matrix
Definition94row rank & column rank
Suppose A is an m-by-n matrix with entries in F. The row rank of A is the dimension of the span of the rows of A in F1,n, and the column rank of A is the dimension of the span of the columns of A in Fm,1.
Theorem95Dimension of range T equals column rank of M(T)
Suppose V and W are finite-dimensional and T∈L(V,W). Then dim range T equals the column rank of M(T).
Proof
Suppose v1,⋯,vn is a basis of V and w1,⋯,wm is a basis of W. The function that takes w∈span(Tv1,⋯,Tvn) to M(w) is easily seen to be an isomorphism from span(Tv1,⋯,Tvn) onto span(M(Tv1),⋯,M(Tvn)), and the dimension of the latter span equals the column rank of M(T), because M(Tvk) is the kth column of M(T) (multiplying M(T) by the column with 1 in the kth slot and 0 elsewhere picks out the kth column). It is easy to see that range T=span(Tv1,⋯,Tvn). Thus we have
dim range T=dim span(Tv1,⋯,Tvn)=the column rank of M(T). □
Theorem96Row rank equals column rank
Suppose A∈Fm,n. Then the row rank of A equals the column rank of A.
Proof
Define T:Fn,1→Fm,1 by Tx:=Ax, so that M(T)=A with respect to the standard bases. Now
column rank of A = column rank of M(T) = dim range T = dim range T' = column rank of M(T') = column rank of At = row rank of A. □
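A one-line numerical sanity check (arbitrary example data of my own):

```python
# Row rank equals column rank: rank(A) == rank(A^t).
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.],
              [0., 1., 1.]])
assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T)   # both equal 2
```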
Chapter 4. Polynomials (this chapter is mostly omitted)
4.A Polynomials
The Division Algorithm for Polynomials
Theorem97Division Algorithm for Polynomials
Suppose that p,s∈P(F), with s≠0. Then there exist unique polynomials q,r∈P(F) such that p=sq+r and deg r<deg s.
Theorem98Fundamental Theorem of Algebra
Every non-constant polynomial with complex coefficients has a zero.
⋯⋯
Chapter 5. Eigenvalues, Eigenvectors, and Invariant Subspaces
Linear maps from one vector space to another vector space were the objects of study in Chapter 3. Now we begin our investigation of linear maps from a finite-dimensional vector space to itself. Their study constitutes the most important part of linear algebra.
5.A Invariant Subspaces
In this chapter we develop the tools that will help us understand the structure of operators. Recall that an operator is a linear map from a vector space to itself, and that we denote the set of operators on V by L(V).
Let's see how we might better understand what an operator looks like. Suppose T∈L(V). If we have a direct sum decomposition V=U1⊕U2⊕⋯⊕Um, where each Uj is a proper subspace of V, then to understand the behavior of T, we need only understand the behavior of each T|Uj; here T|Uj denotes the restriction of T to the smaller domain Uj. Dealing with T|Uj should be easier than dealing with T because Uj is a smaller vector space than V.
However, if we intend to apply tools useful in the study of operators (such as taking powers), then we have a problem: T|Uj may not map Uj into itself. Thus we are led to consider only decompositions of V of the form above where T maps each Uj into itself.
Definition99invariant subspace
Suppose T∈L(V). A subspace U of V is called invariant under T if u∈U implies Tu∈U.
Example99.1 invariant subspace
Suppose T∈L(V). Show that each of the following subspaces of V is invariant under T:
★ {0};★ V;★ null T;★ range T;
Eigenvalues and Eigenvectors
Now we turn to an investigation of the simplest possible nontrivial invariant subspaces: invariant subspaces with dimension 1.
Take any v∈V with v≠0 and let U equal the set of all scalar multiples of v: U={𝜆v:𝜆∈F}=span(v). Then U is a 1-dimensional subspace of V, and every 1-dimensional subspace of V is of this form for an appropriate choice of v. If U is invariant under an operator T∈L(V), then Tv∈U, and hence there is a scalar 𝜆∈F such that Tv=𝜆v. Conversely, if Tv=𝜆v for some 𝜆∈F, then span(v) is a 1-dimensional subspace of V invariant under T. The equation Tv=𝜆v, which we have just seen is intimately connected with 1-dimensional invariant subspaces, is important enough that the vectors v and scalars 𝜆 satisfying it are given special names.
Definition100eigenvalue and eigenvector
Suppose T∈L(V). A number 𝜆∈F is called an eigenvalue of T if there exists v∈V such that v≠0 and Tv=𝜆v. Such a vector v is then called an eigenvector of T corresponding to 𝜆.
Now we show that eigenvectors corresponding to distinct eigenvalues are linearly independent.
Theorem101Linearly independent eigenvectors
Let T∈L(V). Suppose 𝜆1,⋯,𝜆m are distinct eigenvalues of T and v1,⋯,vm are corresponding eigenvectors. Then v1,⋯,vm is linearly independent.
Proof
Suppose v1,⋯,vm is linearly dependent. Let k be the smallest positive integer such that vk∈span(v1,⋯,vk-1); then vk=a1v1+⋯+ak-1vk-1. Applying T gives
Tvk=a1𝜆1v1+⋯+ak-1𝜆k-1vk-1,
while also Tvk=𝜆kvk=𝜆k(a1v1+⋯+ak-1vk-1).
Subtracting the two expressions:
0=a1(𝜆k−𝜆1)v1+⋯+ak-1(𝜆k−𝜆k-1)vk-1.
Because we chose k to be the smallest such integer, v1,⋯,vk-1 are linearly independent, so aj(𝜆k−𝜆j)=0 for each j. The eigenvalues are distinct, so every aj=0, which makes vk=0 — but eigenvectors are nonzero, a contradiction. Therefore our assumption that v1,⋯,vm is linearly dependent was false. □
From the sketch one can see that if the eigenvalues differ (here 𝜆1=4 and 𝜆2=6), a vector v3 with components along both eigenvectors satisfies 𝜆3v3≠Tv3 for every scalar 𝜆3.
Fig4: eigenvectors of different eigenvalues.
Theorem102Number of eigenvalues
Suppose V is finite-dimensional. Then each operator on V has at most dim V distinct eigenvalues.
Proof
Eigenvectors corresponding to distinct eigenvalues are linearly independent, and a linearly independent list in V has length at most dim V. □
Restriction and Quotient Operators
If T∈L(V) and U is a subspace of V invariant under T, then U determines two other operators T|U∈L(U) and T/U∈L(V/U) in a natural way, as defined below.
Definition103T|U and T/U
Suppose T∈L(V), and U is a subspace of V invariant under T.
★ The restriction operator T|U∈L(U) is defined by T|U(u):=Tu for u∈U.
★ The quotient operator T/U∈L(V/U) is defined by (T/U)(v+U)=Tv+U for v∈V.
Example103.1
Define an operator T∈L(F2) by T(x,y):=(y,0). Let U={(x,0):x∈F}. Show that:
(1) T|U is the 0 operator on U.
Solution: T|U(x,0)=T(x,0)=(0,0).
(2) There does not exist a subspace W of F2 that is invariant under T and such that F2=U⊕W.
Solution: if such a W existed, then dim W=dim F2−dim U=1, so W would be spanned by an eigenvector of T. But the eigenvectors of T are exactly the vectors (x,0) with x≠0 (with eigenvalue 0), and these already lie in U, contradicting U∩W={0}.
(3) T/U is the 0 operator on F2/U.
Solution: (T/U)(v+U)=Tv+U, and Tv∈U for every v∈F2, so Tv+U=U, which is the 0 of the quotient space F2/U.
5.B Eigenvectors and Upper-Triangular Matrices
Polynomials Applied to Operators
This is a new use of the symbol p, because we are applying it to operators, not just elements of F.
Definition104p(T)
Suppose T∈L(V), and p∈P(F) is a polynomial given byp(z)=a0+a1z+⋯+amzmfor z∈F. Then p(T) is the operator defined byp(T)=a0I+a1T+a2T2+⋯+amTm.
If we fix an operator T∈L(V) , then the function from P(F) to L(V) given by p↦p(T) is linear.
Definition105product of polynomials
If p,q∈P(F), then pq∈P(F) is the polynomial defined by (pq)(z)=p(z)q(z) for z∈F. Any two polynomials of the same operator commute: (pq)(T)=p(T)q(T)=q(T)p(T).
Existence of Eigenvalues
Theorem106Operators on complex vector spaces have an eigenvalue
Every operator on a finite-dimensional, nonzero, complex vector space has an eigenvalue.
Proof
Suppose V is a complex vector space with dimension n and T∈L(V). Choose v∈V with v≠0. Then
v,Tv,T²v,⋯,Tⁿv
is not linearly independent, because this list has length n+1>n. Thus there exist complex numbers a0,⋯,an, not all 0, such that
0=a0v+a1Tv+⋯+anTⁿv.
(Note that not all of a1,⋯,an are 0, since otherwise a0v=0 would force a0=0 too.) Make the a's the coefficients of a polynomial, which by the Fundamental Theorem of Algebra has a factorization
a0+a1z+⋯+anzⁿ=c(z−𝜆1)⋯(z−𝜆m),
where c is a nonzero complex number, m⩾1, each 𝜆j is in C, and the equation holds for all z∈C. We then have
0=a0v+a1Tv+⋯+anTⁿv=(a0I+a1T+⋯+anTⁿ)v=c(T−𝜆1I)⋯(T−𝜆mI)v.
Thus at least one of the operators T−𝜆jI is not injective; that 𝜆j is an eigenvalue of T. □
Upper-Triangular Matrices
Now that we are studying operators, which map a vector space to itself, the emphasis is on using only one basis.
Definition107Matrix of Operator
Suppose T∈L(V) and v1,⋯,vn is a basis of V. The matrix of T with respect to this basis is the n-by-n matrix whose entries Aj,k are defined by Tvk=A1,kv1+⋯+An,kvn.
Definition108Diagonal of a Matrix
The diagonal of a matrix consists of the entries along the line from the upper left corner to the bottom right corner.
Definition109Upper-triangular matrix
A matrix is called upper triangular if all the entries below the diagonal equal 0.
Theorem110Conditions for upper-triangular matrix
Suppose T∈L(V) and v1,⋯,vn is a basis of V. Then the following are equivalent:
★ the matrix of T with respect to v1,⋯,vn is upper triangular;
★ Tvj∈span(v1,⋯,vj) for each j=1,⋯,n;
★ span(v1,⋯,vj) is invariant under T for each j=1,⋯,n.
Theorem111Over C, every operator has an upper-triangular matrix
Suppose V is a finite-dimensional complex vector space and T∈L(V). Then T has an upper-triangular matrix with respect to some basis of V.
Proof
Use induction on the dimension of V. Clearly the desired result holds if dim V=1.
Suppose now that dim V>1 and the desired result holds for all complex vector spaces whose dimension is less than the dimension of V. Let 𝜆 be any eigenvalue of T (one exists, by the previous theorem). Let
U=range(T−𝜆I).
Because 𝜆 is an eigenvalue, T−𝜆I is not injective; since injectivity and surjectivity are equivalent for operators on a finite-dimensional space, T−𝜆I is not surjective, and thus dim U<dim V. Furthermore, U is invariant under T: suppose u∈U; then
Tu=(T−𝜆I)u+𝜆u,
and both terms on the right are in U ((T−𝜆I)u∈range(T−𝜆I)=U, and 𝜆u∈U), so Tu∈U.
Thus T|U is an operator on U. By our induction hypothesis, there is a basis u1,⋯,um of U with respect to which T|U has an upper-triangular matrix; equivalently,
Tuj=(T|U)(uj)∈span(u1,⋯,uj) for each j.
Extend u1,⋯,um to a basis u1,⋯,um,v1,⋯,vn of V. For each k we have the obvious relation
Tvk=(T−𝜆I)vk+𝜆vk.
The definition of U shows that (T−𝜆I)vk∈U=span(u1,⋯,um). Thus the equation above shows that
Tvk∈span(u1,⋯,um,vk)⊆span(u1,⋯,um,v1,⋯,vk).
By the conditions for an upper-triangular matrix, T has an upper-triangular matrix with respect to this basis. □
To see the proof above more clearly: the key idea is to pass to an invariant subspace of smaller dimension so that induction applies. We might first try U=range T, since every vector of range T must be expressible in the basis we are building; but range T need not have smaller dimension than V. Choosing an eigenvalue 𝜆 and setting U=range(T−𝜆I) fixes this: T−𝜆I is not injective, hence not surjective, so dim U<dim V, and the same argument goes through.
Theorem112Determination of invertibility from upper-triangular matrix
Suppose T∈L(V) has an upper-triangular matrix with respect to some basis of V. Then T is invertible if and only if all the entries on the diagonal of that upper-triangular matrix are nonzero.
Proof
To prove invertibility, it suffices to prove surjectivity (or injectivity). Suppose the upper-triangular matrix has 𝜆1,⋯,𝜆n on its diagonal, with respect to the basis v1,⋯,vn.
If every 𝜆j≠0: then T(v1/𝜆1)=v1, and T(v2/𝜆2)=av1+v2 for some scalar a, so v2∈range T (because v1∈range T); continuing up the list in this way shows that each vj∈range T. Hence range T=V and T is invertible.
For the other direction, suppose T is invertible. Then 𝜆1≠0, because otherwise we would have Tv1=0. Suppose 𝜆j=0 for some j with 1<j⩽n. Then Tvj∈span(v1,⋯,vj-1), so T maps span(v1,⋯,vj) into span(v1,⋯,vj-1); a map from a j-dimensional space into a (j−1)-dimensional space is not injective, so T is not injective and hence not invertible — a contradiction. □
Now we have a way to read off the eigenvalues from an upper-triangular matrix:
Theorem113Determination of eigenvalues from upper-triangular matrix
Suppose T∈L(V) has an upper-triangular matrix with respect to some basis of V. Then the eigenvalues of T are precisely the entries on the diagonal of the upper-triangular matrix.
Proof
Apply the previous theorem to T−𝜆I, whose upper-triangular matrix has diagonal entries 𝜆1−𝜆,⋯,𝜆n−𝜆: the operator T−𝜆I is not invertible (i.e. 𝜆 is an eigenvalue) exactly when 𝜆 equals some diagonal entry 𝜆j. □
A numerical illustration via the Schur form is sketched below.
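Here is a sketch, assuming scipy is available and with a made-up matrix: the complex Schur form of any square matrix is upper triangular with the eigenvalues on its diagonal, matching the theorem.

```python
# Schur form: A = Q T Q*, with T upper triangular and eigenvalues on diag(T).
import numpy as np
from scipy.linalg import schur

A = np.array([[2., 1., 0.],
              [0., 3., 1.],
              [1., 0., 4.]])
T, Q = schur(A, output='complex')

assert np.allclose(np.tril(T, k=-1), 0)                    # upper triangular
assert np.allclose(np.sort_complex(np.diag(T)),
                   np.sort_complex(np.linalg.eigvals(A)))  # diagonal = eigenvalues
```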
5.C Eigenspaces and Diagonal Matrices
Definition114diagonal matrix
A matrix is called diagonal if all the entries off the diagonal equal 0.
Definition115eigenspace, E(𝜆,T)
Suppose T∈L(V) and 𝜆∈F. The eigenspace of T corresponding to 𝜆, denoted E(𝜆,T), is defined byE(𝜆,T)=null(T-𝜆I)In other words, E(𝜆,T) is the set of all eigenvectors of T corresponding to eigenvalue 𝜆, along with the 0 vector.
Theorem116Sum of eigenspaces is a direct sum
Suppose V is finite-dimensional and T∈L(V). Suppose also that 𝜆1,⋯,𝜆m are distinct eigenvalues of T. Then E(𝜆1,T)+⋯+E(𝜆m,T) is a direct sum. Furthermore, dim E(𝜆1,T)+⋯+dim E(𝜆m,T)⩽dim V.
Definition117diagonalizable
An operator T∈L(V) is called diagonalizable if the operator has a diagonal matrix with respect to some basis of V.
Theorem118Conditions equivalent to diagonalizability
Suppose V is finite-dimensional and T∈L(V). Suppose also that 𝜆1,⋯,𝜆m are distinct eigenvalues of T. Then the following are equivalent:
(1) T is diagonalizable;
(2) V has a basis consisting of eigenvectors of T;
(3) there exist 1-dimensional subspaces U1,⋯,Un of V, each invariant under T, such that V=U1⊕⋯⊕Un;
(4) V=E(𝜆1,T)⊕⋯⊕E(𝜆m,T);
(5) dim V=dim E(𝜆1,T)+⋯+dim E(𝜆m,T).
Proof
An operator T∈L(V) has a diagonal matrix with respect to a basis v1,⋯,vn of V if and only if Tvj=𝜆jvj for each j. Thus (1) and (2) are equivalent.
Suppose (2) holds; thus V has a basis v1,⋯,vn consisting of eigenvectors of T. For each j, let Uj=span(vj). Then V=U1⊕⋯⊕Un, so (3) holds.
Suppose now that (3) holds. Choosing a nonzero vector in each Uj gives a basis of V consisting of eigenvectors of T; grouping these eigenvectors by eigenvalue shows that V=E(𝜆1,T)⊕⋯⊕E(𝜆m,T), so (4) holds. And (4) implies (5), because the dimension of a direct sum is the sum of the dimensions.
Finally, suppose (5) holds. Choose a basis of each E(𝜆j,T); put all these bases together to form a list v1,⋯,vn of eigenvectors of T, where n=dim V. This list is linearly independent, because the sum of the eigenspaces is direct (eigenvectors belonging to different eigenvalues are linearly independent); hence it is a basis of V consisting of eigenvectors, and (2) holds. □
If T∈L(V) has dim V distinct eigenvalues, then T is diagonalizable: the corresponding eigenvectors are linearly independent and there are dim V of them, so they form a basis. A numerical check is sketched below.
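A quick numerical check of this corollary (made-up matrix with distinct eigenvalues):

```python
# With dim V distinct eigenvalues, the eigenvectors form a basis and
# a change of basis diagonalizes the matrix.
import numpy as np

A = np.array([[2., 1.],
              [0., 5.]])            # distinct eigenvalues 2 and 5
vals, vecs = np.linalg.eig(A)       # columns of vecs are eigenvectors

assert np.linalg.matrix_rank(vecs) == A.shape[0]   # eigenvectors form a basis
D = np.linalg.inv(vecs) @ A @ vecs                 # diagonal in the eigenvector basis
assert np.allclose(D, np.diag(vals))
```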
Chapter 6. Inner Product Spaces
In making the definition of a vector space, we generalized the linear structure (addition and scalar multiplication) of R2 and R3. We ignored other important features, such as the notions of length and angle. These ideas are embedded in the concept we now investigate: inner products.
6.A Inner Products and Norms
Inner Products
Definition120dot product
For x,y∈Rn, the dot product of x and y, denoted x⋅y, is defined byx⋅y=x1y1+⋯+xnyn,where x=(x1,⋯,xn) and y=(y1,⋯,yn).
Note that the dot product of two vectors in Rn is a number, not a vector. Obviously x⋅x=‖x‖² for all x∈Rn. The dot product on Rn has the following properties:
★ x⋅x⩾0 for all x∈Rn;
★ x⋅x=0 if and only if x=0;
★ for y∈Rn fixed, the map from Rn to R that sends x∈Rn to x⋅y is linear;
★ x⋅y=y⋅x for all x,y∈Rn.
An inner product is a generalization of the dot product. At this point you may be tempted to guess that an inner product is defined by abstracting the properties of the dot product discussed in the last paragraph. For real vector spaces, that guess is correct. However, so that we can make a definition that will be useful for both real and complex vector spaces, we need to examine the complex case before making the definition. For z∈C, the absolute value is |z|=√((ℜz)²+(ℑz)²), so |z|²=z·conj(z), where conj(z) denotes the complex conjugate.
Definition121inner product
An inner product on V is a function that takes each ordered pair (u,v) of elements of V to a number ⟨u,v⟩∈F and has the following properties:
★ positivity: ⟨v,v⟩⩾0 for all v∈V;
★ definiteness: ⟨v,v⟩=0 if and only if v=0;
★ additivity in first slot: ⟨u+v,w⟩=⟨u,w⟩+⟨v,w⟩ for all u,v,w∈V;
★ homogeneity in first slot: ⟨𝜆u,v⟩=𝜆⟨u,v⟩ for all 𝜆∈F and all u,v∈V;
★ conjugate symmetry: ⟨u,v⟩=conj⟨v,u⟩ for all u,v∈V.
Example121.1
An inner product can be defined on P(R) by ⟨p,q⟩=∫₀^∞ p(x)q(x)e⁻ˣ dx.
Definition122inner product space
An inner product space is a vector space V along with an inner product on V.
For the rest of this chapter, V denotes an inner product space over F.
Theorem123basic properties of an inner product
★ For each fixed u∈V, the function that takes v to ⟨v,u⟩ is a linear map from V to F.
★ But fixing the first slot does not in general give a linear map in the second slot (see the last bullet)!
★ ⟨0,u⟩=⟨u,0⟩=0.
★ ⟨u,v+w⟩=⟨u,v⟩+⟨u,w⟩.
★ ⟨u,𝜆v⟩=conj(𝜆)⟨u,v⟩.
Norms
Now we see that each inner product determines a norm.
Definition124norm, ‖v‖
For v∈V, the norm of v is defined by ‖v‖=√⟨v,v⟩.
Theorem125basic properties of the norm
Suppose v∈V.
(1) ‖v‖=0 if and only if v=0.
(2) ‖𝜆v‖=|𝜆| ‖v‖ for all 𝜆∈F.
Definition126orthogonal
Two vectors u,v∈V are called orthogonal if ⟨u,v⟩=0.
You can think of the word orthogonal as a fancy word meaning perpendicular. We begin our study of orthogonality with an easy result.
Theorem127Orthogonal and 0
★ 0 is orthogonal to every vector in V.★ 0 is the only vector in V that is orthogonal to itself.
Theorem128Pythagorean Theorem
Suppose u,v are orthogonal vectors in V. Then ‖u+v‖²=‖u‖²+‖v‖².
Proof
‖u+v‖²=⟨u+v,u+v⟩=⟨u,u⟩+⟨u,v⟩+⟨v,u⟩+⟨v,v⟩=‖u‖²+‖v‖², because ⟨u,v⟩=⟨v,u⟩=0. □
The next result is called the parallelogram equality because of its geometric interpretation: in every parallelogram, the sum of the squares of the lengths of the diagonals equals the sum of the squares of the lengths of the four sides:
‖u+v‖²+‖u−v‖²=2(‖u‖²+‖v‖²).
6.B Orthonormal Bases
A list of vectors is called orthonormal if each vector in the list has norm 1 and is orthogonal to all the other vectors in the list. In other words, a list e1,⋯,em of vectors in V is orthonormal if ⟨ej,ek⟩=1 when j=k and ⟨ej,ek⟩=0 when j≠k.
Theorem134An orthonormal list is linearly independent
Every orthonormal list of vectors is linearly independent.
Proof
Suppose a1e1+⋯+amem=0. Taking the inner product of both sides with ej gives aj=0 for each j. □
How do we go about finding orthonormal bases? The algorithm used in the next proof is called the Gram-Schmidt Procedure. It gives a method for turning a linearly independent list into an orthonormal list with the same span as the original list.
Theorem135Gram-Schmidt Procedure
Suppose v1,⋯,vm is a linearly independent list of vectors in V. Let e1=v1/‖v1‖. For j=2,⋯,m, define ej inductively by
ej=(vj−⟨vj,e1⟩e1−⋯−⟨vj,ej-1⟩ej-1) / ‖vj−⟨vj,e1⟩e1−⋯−⟨vj,ej-1⟩ej-1‖.
Then e1,⋯,em is an orthonormal list of vectors in V such that span(v1,⋯,vj)=span(e1,⋯,ej) for j=1,⋯,m.
Proof
It is easy to verify that ⟨ej,ei⟩=0 for i<j, and each ej has norm 1 by construction. To think of this intuitively: each step removes the components of vj projected onto the previous vectors, then normalizes. A sketch in code follows.
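Here is a minimal sketch of the procedure for vectors in Rn (my own code; the input list is assumed linearly independent):

```python
# Gram-Schmidt: orthonormalize a linearly independent list, preserving the
# span of every initial segment.
import numpy as np

def gram_schmidt(vs):
    es = []
    for v in vs:
        w = v - sum(np.dot(v, e) * e for e in es)   # subtract projections onto previous e's
        es.append(w / np.linalg.norm(w))            # normalize
    return es

vs = [np.array([1., 1., 0.]), np.array([1., 0., 1.])]
e1, e2 = gram_schmidt(vs)
assert np.isclose(np.dot(e1, e2), 0)                # orthogonal
assert np.isclose(np.linalg.norm(e1), 1) and np.isclose(np.linalg.norm(e2), 1)
```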
Theorem136Existence of orthonormal basis
Every finite-dimensional inner product space has an orthonormal basis. Just apply the Gram-Schmidt Procedure to a basis of the space.
Theorem137Orthonormal lists extend to orthonormal basis
Suppose V is finite-dimensional. Then every orthonormal list of vectors in V can be extended to an orthonormal basis of V.
Proof: extend the orthonormal list to a basis of V and apply the Gram-Schmidt Procedure to it. Because the Gram-Schmidt Procedure only involves the basis vectors before the particular vector, the vectors that were already orthonormal are left unchanged, and the result is an orthonormal basis. □
Theorem138Upper-triangular matrix with respect to orthonormal basis
Suppose T∈L(V). If T has an upper-triangular matrix with respect to some basis of V, then T has an upper-triangular matrix with respect to some orthonormal basis of V.
The next result is an important application of the result above.
Theorem139Schur's Theorem
Suppose V is a finite-dimensional complex vector space and T∈L(V). Then T has an upper-triangular matrix with respect to some orthonormal basis of V.
Linear Functionals on Inner Product Spaces
Recall our definition of a linear functional. If u∈V, then the map that sends v to ⟨v,u⟩ is a linear functional on V. The next result shows that every linear functional on V is of this form.
Theorem140Riesz Representation Theorem
Suppose V is finite-dimensional and 𝜑 is a linear functional on V. Then there is a unique vector u∈V such that
𝜑(v)=⟨v,u⟩ for every v∈V.
Proof: First we show there exists a vector u∈V such that 𝜑(v)=⟨v,u⟩. Let e1,⋯,en be an orthonormal basis of V. Then
𝜑(v) = 𝜑(⟨v,e1⟩e1+⋯+⟨v,en⟩en)
= ⟨v,e1⟩𝜑(e1)+⋯+⟨v,en⟩𝜑(en)
= ⟨v, conj(𝜑(e1))e1⟩+⋯+⟨v, conj(𝜑(en))en⟩
= ⟨v,u⟩,
where
u = conj(𝜑(e1))e1+⋯+conj(𝜑(en))en.
Then we show u is unique: if 𝜑(v)=⟨v,u1⟩=⟨v,u2⟩ for every v∈V, then subtracting gives 0=⟨v,u1−u2⟩ for every v∈V. Taking v=u1−u2 shows ‖u1−u2‖²=0, so u1=u2. □
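A quick numerical illustration of the construction of u (a sketch assuming numpy; the coefficients defining 𝜑 are made up for the example):

```python
import numpy as np

a = np.array([2 - 1j, 0.5, 3j])   # phi(v) = a_1 v_1 + a_2 v_2 + a_3 v_3
phi = lambda v: a @ v              # a linear functional on C^3

e = np.eye(3, dtype=complex)       # standard orthonormal basis of C^3
u = sum(np.conj(phi(e[j])) * e[j] for j in range(3))  # u = sum conj(phi(e_j)) e_j

v = np.array([1.0 + 2j, -1j, 4.0])
# <v,u> = sum v_j conj(u_j), which numpy computes as np.vdot(u, v).
print(np.isclose(phi(v), np.vdot(u, v)))  # True: phi(v) = <v,u>
```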
6.C Orthogonal Complements and Minimization Problems
Orthogonal Complements
Definition141orthogonal complement, U⊥
If U is a subset of V, then the orthogonal complement of U, denoted U⊥, is the set of all vectors in V that are orthogonal to every vector in U:
U⊥={v∈V : ⟨v,u⟩=0 for every u∈U}.
Theorem142Basic properties of orthogonal complement
(1) If U is a subset of V, then U⊥ is a subspace of V.
(2) {0}⊥=V.
(3) V⊥={0}.
(4) If U is a subset of V, then U∩U⊥⊂{0}.
(5) If U and W are subsets of V and U⊂W, then W⊥⊂U⊥.
Theorem143Direct sum of a subspace and its orthogonal complement
Suppose U is a finite-dimensional subspace of V. Then
V=U⊕U⊥.
Proof: Let e1,⋯,em be an orthonormal basis of U. For v∈V, write
v = (⟨v,e1⟩e1+⋯+⟨v,em⟩em) + (v − ⟨v,e1⟩e1 − ⋯ − ⟨v,em⟩em),
and call the two summands u and w, so that v=u+w with u∈U. To show w∈U⊥, compute for each j:
⟨w,ej⟩ = ⟨v,ej⟩ − ⟨v,ej⟩⟨ej,ej⟩ = 0,
so w is orthogonal to each ej and hence to every vector of U=span(e1,⋯,em). Thus V=U+U⊥. Because U∩U⊥={0}, the sum is direct:
V=U⊕U⊥. □
Theorem144Dimension of the orthogonal complement
Suppose V is finite-dimensional and U is a subspace of V. Then
dim U⊥ = dim V − dim U.
Proof: this follows from the direct sum V=U⊕U⊥ of the previous theorem, because the dimension of a direct sum is the sum of the dimensions. □
The next result is an important consequence of the direct sum decomposition V=U⊕U⊥.
Theorem145The orthogonal complement of the orthogonal complement
Suppose U is a finite-dimensional subspace of V. Then U=(U⊥)⊥.
We now define an operator PU for each finite-dimensional subspace of V.
Definition146orthogonal projection, PU
Suppose U is a finite-dimensional subspace of V. The orthogonal projection of V onto U is the operator PU∈L(V) defined as follows:For v∈V, write v=u+w, where u∈U and w∈U⊥. Then PUv=u.
Theorem147Properties of the orthogonal projection PU
(1) PUu=u for every u∈U;
(2) PUw=0 for every w∈U⊥;
(3) range PU=U;
(4) null PU=U⊥;
(5) v−PUv∈U⊥ for every v∈V;
(6) PU²=PU;
(7) ‖PUv‖⩽‖v‖ for every v∈V;
(8) for every orthonormal basis e1,⋯,em of U,
PUv=⟨v,e1⟩e1+⋯+⟨v,em⟩em.
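Property (8) gives a direct way to compute P_U; a minimal numpy sketch (project is our own helper name):

```python
import numpy as np

def project(v, ortho_basis):
    """Orthogonal projection P_U v, given an orthonormal basis of U (rows):
    P_U v = <v,e_1> e_1 + ... + <v,e_m> e_m (property (8) above)."""
    return sum(np.vdot(e, v) * e for e in ortho_basis)

# U = the xy-plane in R^3, with orthonormal basis e_1, e_2.
U = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
v = np.array([3.0, 4.0, 5.0])
Pv = project(v, U)
print(Pv)                      # [3. 4. 0.]
print(np.vdot(v - Pv, U[0]))   # 0.0: v - P_U v lies in U-perp (property (5))
```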
Minimization Problems
The following problem often arises: given a subspace U of V and a point v∈V, find a point u∈U such that ‖v−u‖ is as small as possible. The next proposition shows that this minimization problem is solved by taking u=PUv.
Theorem148Minimizing the distance to a subspace
Suppose U is a finite-dimensional subspace of V, v∈V, and u∈U. Then
‖v−PUv‖⩽‖v−u‖.
Furthermore, the inequality above is an equality if and only if u=PUv.
Proof:
‖v−PUv‖² ⩽ ‖v−PUv‖² + ‖PUv−u‖²
= ‖(v−PUv)+(PUv−u)‖²
= ‖v−u‖²,
where the second line uses the Pythagorean Theorem: v−PUv∈U⊥ and PUv−u∈U are orthogonal. The inequality is an equality exactly when ‖PUv−u‖=0, that is, when u=PUv. □
Example148.1approximate sin(x)
Find a polynomial u with real coefficients and degree at most 5 that approximates sin x as well as possible on the interval [−π,π], in the sense that
∫_(−π)^π |sin x − u(x)|² dx
is as small as possible.
Solution: Let v∈C_R[−π,π] be the function defined by v(x)=sin x. Let U denote the subspace of C_R[−π,π] consisting of the polynomials with real coefficients and degree at most 5. Our problem can now be reformulated as follows: find u∈U such that ‖v−u‖ is as small as possible. To compute the solution, first apply the Gram-Schmidt Procedure to the basis 1,x,x²,x³,x⁴,x⁵ of U, producing an orthonormal basis e1,⋯,e6 of U. Then, again using the inner product ⟨f,g⟩=∫_(−π)^π f(x)g(x)dx, compute PUv. Doing this shows that
u(x)=0.987862x−0.155271x³+0.00564312x⁵,
where the π's that appear in the exact answer have been replaced with a good decimal approximation.
Fig5:sinx and u(x)
Another good approximation is the Taylor polynomial x − x³/3! + x⁵/5!. To see how good this approximation is, we plot the functions together.
Fig6:sinx and u(x)
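The coefficients above can be reproduced numerically. Rather than carrying out Gram-Schmidt symbolically, the sketch below (assuming numpy and scipy are available) solves the equivalent normal equations for the best degree-5 approximation:

```python
import numpy as np
from scipy.integrate import quad

# Minimize the integral of |sin x - u(x)|^2 over [-pi, pi] for deg(u) <= 5.
# Writing u(x) = sum c_k x^k, the minimizer solves G c = b, where
# G_{jk} = <x^j, x^k> and b_j = <sin, x^j> in the L^2 inner product.
n = 6
G = np.array([[quad(lambda x, j=j, k=k: x**(j + k), -np.pi, np.pi)[0]
               for k in range(n)] for j in range(n)])
b = np.array([quad(lambda x, j=j: np.sin(x) * x**j, -np.pi, np.pi)[0]
              for j in range(n)])
c = np.linalg.solve(G, b)
print(np.round(c, 6))  # odd coefficients ~ 0.987862, -0.155271, 0.00564312
```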
Chapter 7. Operators on Inner Product Spaces
The deepest results related to inner product spaces deal with the subject to which we now turn — operators on inner product spaces. By exploiting properties of the adjoint, we will develop a detailed description of several important classes of operators on inner product spaces.
7.A Self-Adjoint and Normal Operators
Adjoints
Definition149adjoint, T*
Suppose T∈L(V,W). The adjoint of T is the function T*: W→V such that
⟨Tv,w⟩=⟨v,T*w⟩
for every v∈V and every w∈W.
To see why this definition makes sense, fix w∈W and consider the map v ↦ ⟨Tv,w⟩, which is a linear functional on V. By the Riesz Representation Theorem, there exists a unique vector w′∈V such that this linear functional is given by v ↦ ⟨v,w′⟩. We define T*w to be this w′.
More information
The word adjoint has another meaning in linear algebra. In case you encounter the second meaning for adjoint elsewhere, be warned that the two meanings for adjoint are unrelated to each other.
Example149.1Find T*
(1) T:R³→R², T(x1,x2,x3)=(x2+3x3, 2x1). Here
⟨(x1,x2,x3), T*(y1,y2)⟩ = ⟨T(x1,x2,x3), (y1,y2)⟩
= y1(x2+3x3)+2x1y2
= ⟨(x1,x2,x3), (2y2,y1,3y1)⟩.
Thus T*(y1,y2)=(2y2,y1,3y1). □
(2) Fix u∈V and x∈W. Define T∈L(V,W) by Tv=⟨v,u⟩x. Fix w∈W. Then for every v∈V we have
⟨v,T*w⟩ = ⟨Tv,w⟩ = ⟨⟨v,u⟩x, w⟩ = ⟨v,u⟩⟨x,w⟩ = ⟨v, ⟨w,x⟩u⟩.
Thus T*w=⟨w,x⟩u. □
Theorem150The adjoint is a linear map
If T∈L(V,W), then T*∈L(W,V).
Theorem151Properties of the adjoint
(1) (S+T)*=S*+T* for all S,T∈L(V,W);
(2) (𝜆T)*=𝜆̄T* for all 𝜆∈F and T∈L(V,W);
(3) (T*)*=T for all T∈L(V,W), because
⟨w,(T*)*v⟩ = ⟨T*w,v⟩ = conj(⟨v,T*w⟩) = conj(⟨Tv,w⟩) = ⟨w,Tv⟩;
(4) I*=I;
(5) (ST)*=T*S* for all T∈L(V,W) and S∈L(W,U), because
⟨v,(ST)*u⟩ = ⟨(ST)v,u⟩ = ⟨S(Tv),u⟩ = ⟨Tv,S*u⟩ = ⟨v,T*S*u⟩.
Theorem152Null space and range T*
Suppose T∈L(V,W). Then
(1) null T* = (range T)⊥;
(2) range T* = (null T)⊥;
(3) null T = (range T*)⊥ (replace T with T* in (1));
(4) range T = (null T*)⊥.
Definition153conjugate transpose
The conjugate transpose of an m-by-n matrix is the n-by-m matrix obtained by interchanging the rows and columns and then taking the complex conjugate of each entry.
The next result shows how to compute the matrix of T* from the matrix of T.
Theorem154The matrix of T*
Let T∈L(V,W). Suppose e1,⋯,en is an orthonormal basis of V and f1,⋯,fm is an orthonormal basis of W. Then
M(T*,(f1,⋯,fm),(e1,⋯,en))
is the conjugate transpose of M(T,(e1,⋯,en),(f1,⋯,fm)).
Proof:
Recall that we obtain the kth column of M(T) by writing Tek as a linear combination of the fj's; the scalars used in this linear combination then become the kth column of M(T). So we have
For example, for a linear map from a space U with basis u1,u2 to a space V with basis v1,v2,v3, the matrix acts by
⎡A11 A12⎤ ⎛u1⎞   ⎛A11u1+A12u2⎞
⎢A21 A22⎥ ⎝u2⎠ = ⎜A21u1+A22u2⎟
⎣A31 A32⎦        ⎝A31u1+A32u2⎠
The entry Ajk shows how the kth basis vector of U contributes to the jth component in V: the jth row gives the coefficients on the jth basis vector of V, and the kth column shows where the kth basis vector of U goes under the linear map.
Tek = ⟨Tek,f1⟩f1+⋯+⟨Tek,fm⟩fm, so M(T)j,k=⟨Tek,fj⟩. Replacing T with T* and interchanging the roles played by the e's and f's, we see that the entry in row j, column k, of M(T*) is ⟨T*fk,ej⟩, which equals ⟨fk,Tej⟩, which equals conj(⟨Tej,fk⟩), which equals the complex conjugate of the entry in row k, column j, of M(T). □
Caution
Remember that the result above applies only when we are dealing with orthonormal bases. With respect to nonorthonormal bases, the matrix of T* does not necessarily equal the conjugate transpose of the matrix of T.
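A small numerical check of the theorem with the standard (orthonormal) bases, assuming numpy:

```python
import numpy as np

A = np.array([[1 + 2j, 0, 3],
              [4j, 5, -1]])     # M(T) for some T : C^3 -> C^2
A_star = A.conj().T             # M(T*): the conjugate transpose

rng = np.random.default_rng(0)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
w = rng.standard_normal(2) + 1j * rng.standard_normal(2)

# <a,b> = sum a_j conj(b_j), which numpy computes as np.vdot(b, a).
# The defining identity <Tv,w> = <v,T*w> holds:
print(np.isclose(np.vdot(w, A @ v), np.vdot(A_star @ w, v)))  # True
```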
Definition155self-adjoint
An operator T∈L(V) is called self-adjoint if T=T*. In other words, T∈L(V) is self-adjoint if and only if ⟨Tv,w⟩=⟨v,Tw⟩ for all v,w∈V.
More Information
It's also called Hermitian.
If F=R, then by definition every eigenvalue is real, so the next result is interesting only when F=C.
Theorem156Eigenvalues of self-adjoint operators are real
Every eigenvalue of a self-adjoint operator is real.
Proof: Suppose Tv=𝜆v with v≠0. Then
⟨Tv,v⟩ = ⟨v,T*v⟩ = ⟨v,Tv⟩,
so
𝜆⟨v,v⟩ = ⟨𝜆v,v⟩ = ⟨v,𝜆v⟩ = 𝜆̄⟨v,v⟩.
Because ⟨v,v⟩≠0, we get 𝜆=𝜆̄, so 𝜆 is real. □
The next result is false for real inner product spaces. As an example, consider the operator T∈L(R²) that is counterclockwise rotation by 90° around the origin; thus T(x,y)=(−y,x). Obviously Tv is orthogonal to v for every v∈R², even though T≠0.
Theorem157Over C, Tv is orthogonal to all v only for the 0 operator
Suppose V is a complex inner product space and T∈L(V). Suppose
⟨Tv,v⟩=0 for all v∈V.
Then T=0.
Proof: Every ⟨Tu,w⟩ can be written in terms of inner products of the form ⟨Tv,v⟩:
⟨Tu,w⟩ = (⟨T(u+w),u+w⟩ − ⟨T(u−w),u−w⟩)/4 + (⟨T(u+ⅈw),u+ⅈw⟩ − ⟨T(u−ⅈw),u−ⅈw⟩)ⅈ/4.
So if ⟨Tv,v⟩=0 for all v∈V, then every term on the right equals 0, hence ⟨Tu,w⟩=0 for all u,w∈V. Taking w=Tu gives ‖Tu‖²=0, so Tu=0 for every u, that is, T=0. □
The next result is false for real inner product spaces, as shown by considering any operator on a real inner product space that is not self-adjoint. And this theorem also appears in quantum physics.
Theorem158Over C, ⟨Tv,v⟩ is real for all v only for self-adjoint operators
Suppose V is a complex inner product space and T∈L(V). Then T is self-adjoint if and only if ⟨Tv,v⟩∈R for every v∈V.
Proof: Let v∈V. Then
⟨Tv,v⟩ − conj(⟨Tv,v⟩) = ⟨Tv,v⟩ − ⟨v,Tv⟩ = ⟨Tv,v⟩ − ⟨T*v,v⟩ = ⟨(T−T*)v,v⟩.
If ⟨Tv,v⟩∈R for every v∈V, then the left side is 0, so ⟨(T−T*)v,v⟩=0 for every v∈V; by the previous result T−T*=0, that is, T=T*. Conversely, if T=T*, then the right side is 0, so ⟨Tv,v⟩ equals its own conjugate and hence is real. □
On a real inner product space V, a nonzero operator T might satisfy ⟨Tv,v⟩=0 for all v∈V. However, the next result shows that this cannot happen for a self-adjoint operator.
Theorem159If T=T* and ⟨Tv,v⟩=0 for all v, then T=0
Suppose T=T* and ⟨Tv,v⟩=0 for all v∈V. Then T=0. Note that this result, unlike the previous one, also holds on real inner product spaces.
Proof: We use another transformation:
⟨Tu,w⟩ = (⟨T(u+w),u+w⟩ − ⟨T(u−w),u−w⟩)/4,
which is valid only when ⟨Tw,u⟩=⟨w,Tu⟩=⟨Tu,w⟩, that is, when T is self-adjoint. With it, ⟨Tv,v⟩=0 for all v gives ⟨Tu,w⟩=0 for all u,w∈V, hence T=0. □
Normal Operators
Definition160normal
An operator T∈L(V) on an inner product space is called normal if it commutes with its adjoint. In other words,
TT* = T*T.
Obviously, every self-adjoint operator is normal.
Theorem161T is normal if and only if ‖Tv‖=‖T*v‖ for all v
Proof:
T is normal
⟺ T*T−TT* = 0
⟺ ⟨(T*T−TT*)v,v⟩ = 0 for all v∈V
⟺ ⟨T*Tv,v⟩ = ⟨TT*v,v⟩ for all v∈V
⟺ ⟨Tv,Tv⟩ = ⟨T*v,T*v⟩ for all v∈V
⟺ ‖Tv‖² = ‖T*v‖² for all v∈V.
The second equivalence holds because T*T−TT* is self-adjoint, and by the result above a self-adjoint operator S with ⟨Sv,v⟩=0 for all v equals 0. □
It can be proved that the eigenvalues of the adjoint of each operator are equal (as a set) to the complex conjugates of the eigenvalues of the operator. But an operator and its adjoint may have different eigenvectors. However, a normal operator and its adjoint have the same eigenvectors.
Theorem162For T normal, T and T* have the same eigenvectors
Suppose T∈L(V) is normal and v∈V is an eigenvector of T with eigenvalue 𝜆. Then v is also an eigenvector of T* with eigenvalue 𝜆̄.
Proof: Suppose Tv=𝜆v, so (T−𝜆I)v=0. Because T−𝜆I is also normal, the previous theorem gives
‖(T−𝜆I)v‖ = ‖(T−𝜆I)*v‖ = ‖(T*−𝜆̄I)v‖ = 0.
A vector with norm 0 is the 0 vector, so (T*−𝜆̄I)v=0, that is, T*v=𝜆̄v. □
7.B The Spectral Theorem
The nicest operators on V are those for which there is an orthonormal basis of V with respect to which the operator has a diagonal matrix. (This is more special than mere diagonalizability: the eigenvectors can be chosen orthonormal.) These are precisely the operators T∈L(V) such that there is an orthonormal basis of V consisting of eigenvectors of T. Our goal in this section is to prove the Spectral Theorem, which characterizes these operators as the normal operators when F=C and as the self-adjoint operators when F=R. The Spectral Theorem is probably the most useful tool in the study of operators on inner product spaces. Because the conclusion of the Spectral Theorem depends on F, we will break it into two pieces, called the Complex Spectral Theorem and the Real Spectral Theorem. As is often the case in linear algebra, complex vector spaces are easier to deal with than real vector spaces, so we present the Complex Spectral Theorem first.
The Complex Spectral Theorem
The key part of the Complex Spectral Theorem states that if F=C and T∈L(V) is normal, then T has a diagonal matrix with respect to some orthonormal basis of V.
Theorem163Complex Spectral Theorem
Suppose F=C and T∈L(V). Then the following are equivalent:
(1) T is normal;
(2) V has an orthonormal basis consisting of eigenvectors of T;
(3) T has a diagonal matrix with respect to some orthonormal basis of V.
Proof: We have already shown (2)⟺(3). First suppose (3) holds, so T has a diagonal matrix with respect to some orthonormal basis. The matrix of T* is obtained by taking the conjugate transpose of the matrix of T; hence T* also has a diagonal matrix with respect to that basis. Any two diagonal matrices commute; thus T is normal.
Now suppose (1) holds, so T is normal. By Schur's Theorem there is an orthonormal basis e1,⋯,en of V with respect to which T has an upper-triangular matrix
⎡a1,1 ⋯ a1,n⎤
⎢      ⋱  ⋮ ⎥
⎣0       an,n⎦
We will show that this matrix is actually diagonal. From the matrix above,
‖Te1‖² = |a1,1|²  and  ‖T*e1‖² = |a1,1|²+|a1,2|²+⋯+|a1,n|².
Because T is normal, ‖Te1‖=‖T*e1‖, so the two equations give
|a1,2|=⋯=|a1,n|=0.
Now
‖Te2‖² = |a2,2|²  and  ‖T*e2‖² = |a2,2|²+|a2,3|²+⋯+|a2,n|²,
so |a2,3|=⋯=|a2,n|=0. Continuing in this fashion, we see that all the nondiagonal entries in the matrix equal 0. □
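The proof can be watched numerically: Schur's Theorem produces an upper-triangular matrix with respect to an orthonormal basis, and for a normal matrix that triangular matrix comes out diagonal (a sketch assuming scipy is available):

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[0.0, -1.0],
              [1.0, 0.0]])      # normal (A A* = A* A) but not self-adjoint
T, Z = schur(A.astype(complex), output='complex')  # A = Z T Z*, Z unitary
print(np.round(T, 10))          # diagonal matrix with entries i and -i
print(np.allclose(Z @ T @ Z.conj().T, A))  # True
```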
The Real Spectral Theorem
The following result is the key step toward the Real Spectral Theorem:
Theorem164Invertible quadratic expressions
Suppose T∈L(V) is self-adjoint and b,c∈R are such that b²<4c. Then
T²+bT+cI
is invertible.
Proof: Let v be a nonzero vector in V. Then
⟨(T²+bT+cI)v,v⟩ = ⟨T²v,v⟩ + b⟨Tv,v⟩ + c⟨v,v⟩
= ⟨Tv,Tv⟩ + b⟨Tv,v⟩ + c‖v‖²
⩾ ‖Tv‖² − |b|‖Tv‖‖v‖ + c‖v‖²
= (‖Tv‖ − |b|‖v‖/2)² + (c − b²/4)‖v‖²
> 0,
where the inequality uses Cauchy-Schwarz and the last line uses b²<4c. In particular ⟨(T²+bT+cI)v,v⟩≠0, so (T²+bT+cI)v≠0. Thus the null space of T²+bT+cI is {0}, so it is injective and hence invertible. □
We know that every operator, self-adjoint or not, on a finite-dimensional nonzero complex vector space has an eigenvalue.
Theorem165Self-adjoint operators have eigenvalues
Suppose T∈L(V) is self-adjoint. Then T has an eigenvalue.
Proof: We can assume that V is a real inner product space, as we have already noted. Let n=dim V and choose v∈V with v≠0. Then the n+1 vectors
v, Tv, ⋯, Tⁿv
cannot be linearly independent, so there exist real numbers a0,⋯,an, not all 0, with
0 = a0v + a1Tv + ⋯ + anTⁿv.
Make the a's the coefficients of a polynomial, which can be written in factored form as
a0 + a1x + ⋯ + anxⁿ = c(x²+b1x+c1)⋯(x²+bMx+cM)(x−𝜆1)⋯(x−𝜆m),
where c is a nonzero real number, each bj, cj, and 𝜆j is real, each bj²<4cj, m+M⩾1, and the equation holds for all real x. We then have
0 = a0v + ⋯ + anTⁿv = (a0I + a1T + ⋯ + anTⁿ)v = c(T²+b1T+c1I)⋯(T²+bMT+cMI)(T−𝜆1I)⋯(T−𝜆mI)v.
By the last theorem, each T²+bjT+cjI is invertible. Recall also that c≠0. Thus the equation above implies that m>0 and
0 = (T−𝜆1I)⋯(T−𝜆mI)v.
Hence T−𝜆jI is not injective for at least one j. In other words, T has an eigenvalue. □
The next result shows that if U is a subspace of V that is invariant under a self-adjoint operator T, then U⊥ is also invariant under T.
Theorem166Self-adjoint operators and invariant subspaces
Suppose T∈L(V) is self-adjoint and U is a subspace of V that is invariant under T. Then
(1) U⊥ is invariant under T;
(2) T|U∈L(U) is self-adjoint;
(3) T|U⊥∈L(U⊥) is self-adjoint.
Proof: For (1), suppose v∈U⊥ and u∈U. Because U is invariant under T, we have Tu∈U, so
⟨Tv,u⟩ = ⟨v,Tu⟩ = 0.
Thus Tv∈U⊥. For (2) (and (3) similarly), note that for u1,u2∈U,
⟨(T|U)u1,u2⟩ = ⟨Tu1,u2⟩ = ⟨u1,Tu2⟩ = ⟨u1,(T|U)u2⟩. □
Theorem167Real Spectral Theorem
Suppose F=R and T∈L(V). Then the following are equivalent:
(1) T is self-adjoint;
(2) V has an orthonormal basis consisting of eigenvectors of T;
(3) T has a diagonal matrix with respect to some orthonormal basis of V.
Proof: First suppose (3) holds, so T has a diagonal (real) matrix with respect to some orthonormal basis. That matrix equals its conjugate transpose, hence T=T*. The equivalence (2)⟺(3) is as before.
We now prove (1) implies (2) by induction on dim V. If dim V=1 the result clearly holds. Suppose dim V⩾2. Because T is self-adjoint, it has an eigenvalue and hence an eigenvector u. Let U=span(u), which is invariant under T. By the previous result, U⊥ is invariant under T and T|U⊥ is self-adjoint, so by the induction hypothesis U⊥ has an orthonormal basis consisting of eigenvectors of T|U⊥. Appending u/‖u‖ to this basis gives an orthonormal basis of V consisting of eigenvectors of T. □
7.C Positive Operators and Isometries
Positive Operators
Definition168Positive operator
An operator T∈L(V) is called positive if T is self-adjoint and
⟨Tv,v⟩⩾0 for all v∈V.
Definition169square root
An operator R is called a square root of an operator T if R2=T
The characterizations of the positive operators in the next result correspond to characterizations of the nonnegative numbers among C. Specifically, a complex number z is nonnegative if and only if it has a nonnegative square root. Also, z is nonnegative if and only if it has a real square root, corresponding to condition (4). Finally, z is nonnegative if and only if there exists a complex number w such that z=⏨ww, corresponding to condition (5)
Theorem170Characterization of positive operators
Let T∈L(V). Then the following are equivalent:
(1) T is positive;
(2) T is self-adjoint and all the eigenvalues of T are nonnegative;
(3) T has a positive square root;
(4) T has a self-adjoint square root;
(5) there exists an operator R∈L(V) such that T=R*R.
Proof: We will prove that (1)⇒(2)⇒(3)⇒(4)⇒(5)⇒(1).
First suppose (1) holds; then T is self-adjoint, and if Tv=𝜆v with v≠0, then 0⩽⟨Tv,v⟩=𝜆‖v‖², so 𝜆⩾0, giving (2).
If (2) holds, the Spectral Theorem gives an orthonormal basis e1,⋯,en with Tej=𝜆jej and each 𝜆j⩾0; define R by Rej=√𝜆j ej. Then R is positive and R²=T, giving (3). A positive square root is in particular self-adjoint, giving (4); and a self-adjoint R with R²=T satisfies T=R*R, giving (5).
If (5) holds, then T*=(R*R)*=R*R=T and
⟨Tv,v⟩=⟨R*Rv,v⟩=⟨Rv,Rv⟩⩾0,
so (1) holds. □
Theorem171Each positive operator has only one positive square root
Proof: Let R be a positive square root of T. We will prove that if Tv=𝜆v, then Rv=√𝜆 v. This will imply that the behavior of R on the eigenvectors of T is uniquely determined; because there is a basis of V consisting of eigenvectors of T, R is then uniquely determined.
To prove that Rv=√𝜆 v, note that the Spectral Theorem gives an orthonormal basis e1,⋯,en of V consisting of eigenvectors of R. Because R is a positive operator, all its eigenvalues are nonnegative; thus there exist nonnegative numbers 𝜆1,⋯,𝜆n such that Rej=𝜆jej for j=1,⋯,n. Write v=a1e1+⋯+anen. Applying R twice,
𝜆v = Tv = R²v = a1𝜆1²e1+⋯+an𝜆n²en,
so aj(𝜆j²−𝜆)=0 for each j. Hence v is a linear combination of those ej with 𝜆j=√𝜆, and therefore Rv=√𝜆 v. □
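The construction in the proof — an orthonormal eigenbasis together with the nonnegative square roots of the eigenvalues — is easy to carry out numerically (a sketch assuming numpy; positive_sqrt is our own helper name):

```python
import numpy as np

def positive_sqrt(T):
    """The positive square root of a positive operator (Theorem 171):
    diagonalize with an orthonormal eigenbasis and take square roots of
    the (nonnegative) eigenvalues."""
    w, Q = np.linalg.eigh(T)     # T = Q diag(w) Q*, with Q unitary
    return Q @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ Q.conj().T

T = np.array([[2.0, 1.0],
              [1.0, 2.0]])       # positive: eigenvalues 1 and 3
R = positive_sqrt(T)
print(np.allclose(R @ R, T))     # True: R is a square root of T
```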
Isometries (unitary operator)
Theorem172isometry
An operator S∈L(V) is called an isometry if it preserves norms:
‖Sv‖=‖v‖ for all v∈V.
The following statements are equivalent:
(1) S is an isometry;
(2) ⟨Su,Sv⟩=⟨u,v⟩ for all u,v∈V;
(3) Se1,⋯,Sen is orthonormal for every orthonormal list of vectors e1,⋯,en in V;
(4) there exists an orthonormal basis e1,⋯,en of V such that Se1,⋯,Sen is orthonormal;
(5) S*S=I;
(6) SS*=I;
(7) S* is an isometry;
(8) S is invertible and S⁻¹=S*.
Proof: (1)⟺(2) because the inner product can be recovered from the norm (polarization); (2)⇒(3) and (3)⇒(4) are immediate. Now suppose (4) holds. Let e1,⋯,en be an orthonormal basis of V such that Se1,⋯,Sen is orthonormal. Then
⟨S*Sej,ek⟩ = ⟨Sej,Sek⟩ = ⟨ej,ek⟩.
All vectors u,v∈V can be written as linear combinations of e1,⋯,en, and thus the equation above implies that ⟨S*Su,v⟩=⟨u,v⟩. Hence ⟨(S*S−I)u,v⟩=0 for all u,v, so S*S=I and (5) holds. Because V is finite-dimensional, S*S=I implies S is invertible with S⁻¹=S*, so SS*=I and (6) holds. Then
‖S*v‖² = ⟨SS*v,v⟩ = ‖v‖²,
so (7) holds, and (5) together with (6) gives (8). Finally, if (8) holds, then ⟨Sv,Sv⟩=⟨S*Sv,v⟩=⟨v,v⟩, thus (1) holds. □
Theorem173Description of isometries when F=C
Suppose V is a complex inner product space and S∈L(V). Then the following are equivalent:
(1) S is an isometry;
(2) there is an orthonormal basis of V consisting of eigenvectors of S whose corresponding eigenvalues all have absolute value 1.
Proof: It is easy to show that (2) implies (1). For the other direction, suppose (1) holds, so S is an isometry. An isometry satisfies S*S=SS*=I, so S is normal; by the Complex Spectral Theorem, there is an orthonormal basis e1,⋯,en of V consisting of eigenvectors of S. For j∈{1,⋯,n}, let 𝜆j be the eigenvalue corresponding to ej. Then
|𝜆j| = ‖𝜆jej‖ = ‖Sej‖ = ‖ej‖ = 1.
Thus each eigenvalue of S has absolute value 1. □
7.D Polar Decomposition and Singular Value Decomposition
Polar Decomposition
We have developed an analogy between C and L(V). Continuing with it, note that each complex number z except 0 can be written in the form
z = (z/|z|)|z| = (z/|z|)√(z̄z),
where z/|z| has absolute value 1. Our analogy leads us to guess that each operator T∈L(V) can be written as an isometry times √(T*T).
Definition174√T
If T is a positive operator, then √T denotes the unique positive square root of T.
Note that T*T is a positive operator for every T∈L(V) (it is self-adjoint and ⟨T*Tv,v⟩=‖Tv‖²⩾0), so √(T*T) makes sense and the next theorem is reasonable.
Theorem175Polar Decomposition
Suppose T∈L(V). Then there exists an isometry S∈L(V) such that
T = S√(T*T).
Proof: If v∈V, then
‖Tv‖² = ⟨Tv,Tv⟩ = ⟨T*Tv,v⟩ = ⟨√(T*T)v, √(T*T)v⟩ = ‖√(T*T)v‖².
So T and √(T*T) assign the same norm to each vector. Thus we can try to define an isometry between their ranges:
S1(√(T*T)v) := Tv.
First we must check that S1 is well defined. Suppose v1,v2∈V are such that √(T*T)v1=√(T*T)v2. For the definition above to make sense, we must show Tv1=Tv2:
‖Tv1−Tv2‖ = ‖T(v1−v2)‖ = ‖√(T*T)(v1−v2)‖ = ‖√(T*T)v1−√(T*T)v2‖ = 0.
Thus Tv1=Tv2, so S1 is a well-defined, norm-preserving (hence injective) linear map from range √(T*T) onto range T. However, this only defines the operator on a subspace. What about S outside range √(T*T)? Because S1 is injective, the Fundamental Theorem of Linear Maps gives
dim range √(T*T) = dim range T, and hence dim (range √(T*T))⊥ = dim (range T)⊥.
Choose orthonormal bases of (range √(T*T))⊥ and of (range T)⊥, and let S2 be the linear map sending the first to the second (preserving coefficients); S2 is an isometry between these subspaces. Then let S equal S1 on range √(T*T) and S2 on (range √(T*T))⊥. This S is an isometry of V and T=S√(T*T). □
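When T is invertible, the isometry is forced: S = T(√(T*T))⁻¹. A numerical sketch (assuming numpy):

```python
import numpy as np

T = np.array([[1.0, 2.0],
              [0.0, 3.0]])                     # an invertible operator on R^2
w, Q = np.linalg.eigh(T.conj().T @ T)          # T*T is positive
root = Q @ np.diag(np.sqrt(w)) @ Q.conj().T    # sqrt(T*T)
S = T @ np.linalg.inv(root)                    # the isometry of the theorem
print(np.allclose(S @ root, T))                # True: T = S sqrt(T*T)
print(np.allclose(S.conj().T @ S, np.eye(2)))  # True: S*S = I
```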
Singular Value Decomposition
Definition176singular values
Suppose T∈L(V). The singular values of T are the eigenvalues of √(T*T), with each eigenvalue 𝜆 repeated dim E(𝜆,√(T*T)) times. In practice, we can compute them from the eigenvalues of T*T: the singular values are the square roots of those eigenvalues.
For example, if √(T*T) has three distinct eigenvalues 3, 2, 0 with dim E(3,√(T*T))=2, then the singular values of T are 3, 3, 2, 0.
The next result shows that every operator on V has a clean description in terms of its singular values and two orthonormal bases of V.
Theorem177Singular Value Decomposition
Suppose T∈L(V) has singular values s1,⋯,sn. Then there exist orthonormal bases e1,⋯,en and f1,⋯,fn of V such that
Tv = s1⟨v,e1⟩f1 + ⋯ + sn⟨v,en⟩fn
for every v∈V.
Proof: By the Spectral Theorem applied to √(T*T), there is an orthonormal basis e1,⋯,en of V such that √(T*T)ej=sjej for j=1,⋯,n. We have
v = ⟨v,e1⟩e1 + ⋯ + ⟨v,en⟩en
for every v∈V. Apply √(T*T) to both sides of this equation, getting
√(T*T)v = s1⟨v,e1⟩e1 + ⋯ + sn⟨v,en⟩en.
By the Polar Decomposition, there is an isometry S∈L(V) such that T=S√(T*T). Let fj=Sej; because S is an isometry, f1,⋯,fn is an orthonormal basis of V. Applying S to the equation above now gives
Tv = s1⟨v,e1⟩f1 + ⋯ + sn⟨v,en⟩fn
for every v∈V. □
Fig7:Singular Value Decomposition
The Singular Value Decomposition allows us a rare opportunity to make good use of two different bases for the matrix of an operator. To do this, suppose T∈L(V). Let s1,⋯,sn denote the singular values of T, and let e1,⋯,en and f1,⋯,fn be orthonormal bases of V such that the Singular Value Decomposition holds. Because Tej=sjfj for each j, we have
M(T,(e1,⋯,en),(f1,⋯,fn)) = ⎡s1    0⎤
                            ⎢   ⋱   ⎥
                            ⎣0    sn⎦
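numpy's SVD returns exactly this pair of orthonormal bases: it writes T = U·diag(s)·V*, so the e's are the columns of V and the f's are the columns of U (a small sketch):

```python
import numpy as np

T = np.array([[0.0, -2.0],
              [1.0, 0.0]])
U, s, Vh = np.linalg.svd(T)       # T = U diag(s) Vh
e = Vh.conj().T                    # columns: the orthonormal basis e_1, e_2
f = U                              # columns: the orthonormal basis f_1, f_2
for j in range(2):
    print(np.allclose(T @ e[:, j], s[j] * f[:, j]))  # True: T e_j = s_j f_j
```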
Chapter 8. Operators on Complex Vector Spaces
8.A Generalized Eigenvectors and Nilpotent Operators
Null Spaces of Powers of an Operator
We begin this chapter with a study of null spaces of powers of an operator.
Suppose T∈L(V) and m is a nonnegative integer such that null Tᵐ=null Tᵐ⁺¹. Then
null Tᵐ = null Tᵐ⁺¹ = ⋯ = null Tᵐ⁺ᵏ = ⋯
Proof: Suppose v∈null Tᵐ⁺ᵏ⁺¹. Then
Tᵐ⁺¹(Tᵏv) = Tᵐ⁺ᵏ⁺¹v = 0,
hence Tᵏv∈null Tᵐ⁺¹=null Tᵐ. Therefore Tᵐ⁺ᵏv = Tᵐ(Tᵏv) = 0, that is, v∈null Tᵐ⁺ᵏ. This shows null Tᵐ⁺ᵏ⁺¹⊂null Tᵐ⁺ᵏ; the opposite inclusion is automatic, so the null spaces are equal for every k. □
Theorem180Null spaces stop growing
Suppose T∈L(V). Let n=dim V. Then
null Tⁿ = null Tⁿ⁺¹ = null Tⁿ⁺² = ⋯
Proof: We need only prove that null Tⁿ=null Tⁿ⁺¹. Suppose this is not true. Then, by the previous result,
{0} = null T⁰ ⊊ null T¹ ⊊ ⋯ ⊊ null Tⁿ ⊊ null Tⁿ⁺¹.
At each of the strict inclusions in the chain above, the dimension increases by at least 1. Thus dim null Tⁿ⁺¹⩾n+1, a contradiction because a subspace of V cannot have dimension larger than n. □
Unfortunately, it is not true that V=null T⊕range T for each T∈L(V). However, the following result is a useful substitute.
Theorem181V is the direct sum of null T^(dim V) and range T^(dim V)
Suppose T∈L(V). Let n=dim V. Then
V = null Tⁿ ⊕ range Tⁿ.
Proof: First we show that (null Tⁿ)∩(range Tⁿ)={0}. Suppose v∈(null Tⁿ)∩(range Tⁿ). Then Tⁿv=0, and there exists u∈V such that v=Tⁿu. Applying Tⁿ to both sides of the last equation shows that Tⁿv=T²ⁿu. Hence T²ⁿu=0, so u∈null T²ⁿ=null Tⁿ (null spaces stop growing). Thus v=Tⁿu=0.
The Fundamental Theorem of Linear Maps gives dim null Tⁿ + dim range Tⁿ = dim V, and a sum of two subspaces with intersection {0} is direct, with dimension equal to the sum of the dimensions. Thus
V = null Tⁿ ⊕ range Tⁿ. □
Generalized Eigenvectors
Some operators do not have enough eigenvectors to lead to a good description. Thus in this subsection we introduce the concept of generalized eigenvectors.
To understand why we need more than eigenvectors, let's examine the question of describing an operator by decomposing its domain into invariant subspaces. Fix T∈L(V). We seek to describe T by finding a "nice" direct sum decomposition
V = U1⊕⋯⊕Um,
where each Uj is a subspace of V invariant under T. The simplest possible nonzero invariant subspaces are 1-dimensional. A decomposition as above where each Uj is 1-dimensional is possible if and only if V has a basis consisting of eigenvectors of T. This happens if and only if V has an eigenspace decomposition
V = E(𝜆1,T)⊕⋯⊕E(𝜆m,T).
The Spectral Theorem in the previous chapter shows that if V is an inner product space, then a decomposition of this form holds for every normal operator if F=C and for every self-adjoint operator if F=R, because operators of those types have enough eigenvectors to form a basis of V. But some operators do not have enough eigenvectors. Generalized eigenvectors and generalized eigenspaces, which we now introduce, will remedy this situation.
Definition182generalized eigenvector
Suppose T∈L(V) and 𝜆 is an eigenvalue of T. A vector v∈V is called a generalized eigenvector of T corresponding to 𝜆 if v≠0 and
(T−𝜆I)ʲv=0
for some positive integer j.
Definition183generalized eigenspaces G(𝜆,T)
Suppose T∈L(V) and 𝜆∈F. The generalized eigenspace of T corresponding to 𝜆, denoted G(𝜆,T), is defined to be the set of all generalized eigenvectors of T corresponding to 𝜆, along with the 0 vector.
Because every eigenvector of T is a generalized eigenvector of T, each eigenspace is contained in the corresponding generalized eigenspace. In other words, E(𝜆,T)⊂G(𝜆,T)The next result implies that if T∈L(V) and 𝜆∈F, then G(𝜆,T) is a subspace of V
Theorem184Description of generalized eigenspaces
Suppose T∈L(V) and 𝜆∈F. Then
G(𝜆,T) = null (T−𝜆I)^(dim V).
Proof: If v∈null (T−𝜆I)^(dim V), then v∈G(𝜆,T) by definition (take j=dim V). Conversely, suppose v∈G(𝜆,T), so (T−𝜆I)ʲv=0 for some positive integer j. Because null spaces of powers of T−𝜆I stop growing by the (dim V)th power, v∈null (T−𝜆I)^(dim V). □
In particular G(𝜆,T), being a null space, is a subspace of V.
Example184.1
Define T∈L(C³) by T(z1,z2,z3)=(4z2,0,5z3).
(a) Find all eigenvalues of T, the corresponding eigenspaces, and the corresponding generalized eigenspaces.
(b) Show that C³ is the direct sum of generalized eigenspaces corresponding to the distinct eigenvalues of T.
(a) The eigenvalues are 𝜆1=0, with eigenspace E(0,T)={(z1,0,0):z1∈C}, and 𝜆2=5, with eigenspace E(5,T)={(0,0,z3):z3∈C}. These eigenspaces do not span C³, so there must be more generalized eigenvectors. We have T³(z1,z2,z3)=(0,0,125z3), so null T³={(z1,z2,0):z1,z2∈C}; in particular (0,z2,0) is a generalized eigenvector corresponding to 0. Thus
G(0,T)={(z1,z2,0):z1,z2∈C} and G(5,T)={(0,0,z3):z3∈C}.
(b) G(0,T)⊕G(5,T)=C³.
One of our major goals in this chapter is to show that the result in part (b) of the example above holds in general for operators on finite-dimensional complex vector spaces.
Suppose 𝜆1,⋯,𝜆m are distinct eigenvalues of T and v1,⋯,vm are corresponding generalized eigenvectors. Then v1,⋯,vm is linearly independent.
Proof: Suppose a1,⋯,am are complex numbers such that
0 = a1v1 + ⋯ + amvm.
Let k be the largest nonnegative integer such that (T−𝜆1I)ᵏv1≠0, and let
w = (T−𝜆1I)ᵏv1.
Then (T−𝜆1I)w = (T−𝜆1I)ᵏ⁺¹v1 = 0, and hence Tw=𝜆1w. Thus (T−𝜆I)w=(𝜆1−𝜆)w for every 𝜆∈F, and hence
(T−𝜆I)ⁿw = (𝜆1−𝜆)ⁿw
for every 𝜆∈F, where n=dim V. Apply the operator
(T−𝜆1I)ᵏ(T−𝜆2I)ⁿ⋯(T−𝜆mI)ⁿ
to both sides of the first displayed equation. The terms with j⩾2 vanish because (T−𝜆jI)ⁿvj=0, and the first term becomes a1(𝜆1−𝜆2)ⁿ⋯(𝜆1−𝜆m)ⁿw, which therefore equals 0. Because the 𝜆's are distinct and w≠0, this forces a1=0. In a similar fashion, aj=0 for each j, so v1,⋯,vm is linearly independent. □
Nilpotent Operators
Definition186nilpotent
An operator is called nilpotent if some power of it equals 0.
Example186.1nilpotent
A. The operator N∈L(F⁴) defined by N(z1,z2,z3,z4)=(z3,z4,0,0) is nilpotent (N²=0).
B. The operator of differentiation on Pₘ(R) is nilpotent.
Theorem187Nilpotent operator raised to dimension of domain is 0
Suppose N∈L(V) is nilpotent. Then N^(dim V)=0.
Proof: Because N is nilpotent, G(0,N)=V, so null N^(dim V)=V by the description of generalized eigenspaces; that is, N^(dim V) is the 0 map. □
Theorem188Matrix of a nilpotent operator
Suppose N∈L(V) is nilpotent. Then there is a basis of V with respect to which the matrix of N has the form
⎡0    *⎤
⎢  ⋱   ⎥
⎣0    0⎦
where all entries on and below the diagonal are 0 (the * denotes arbitrary entries above the diagonal).
Proof: First choose a basis of null N. Then extend this to a basis of null N², then to a basis of null N³, and continue in this fashion, eventually getting a basis of V.
Now consider the matrix of N with respect to this basis. The first group of columns, corresponding to the basis of null N, consists of all 0's, because those basis vectors are sent to 0. The next group of columns corresponds to the vectors extending to a basis of null N²; N maps each of these into null N, that is, into the span of the strictly earlier basis vectors, so in those columns only the rows above the diagonal can be nonzero. Continuing in this fashion, every basis vector from the group for null Nᵏ is mapped into the span of the strictly earlier basis vectors, so all entries on and below the diagonal equal 0. □
8.B Decomposition of an Operator
Description of Operators on Complex Vector Spaces
We will see that every operator on a finite-dimensional complex vector space has enough generalized eigenvectors to provide a decomposition.
Theorem189The null space and range of p(T) are invariant under T
Suppose T∈L(V) and p∈P(F). Then null p(T) and range p(T) are invariant under T.
Proof: The key is that T commutes with p(T): p(T)(Tu)=T(p(T)u). Hence if p(T)u=0 then p(T)(Tu)=T(p(T)u)=0, and if u∈range p(T), say u=p(T)w, then Tu=p(T)(Tw)∈range p(T). □
The following major result shows that every operator on a complex vector space can be thought of as composed of pieces, each of which is a nilpotent operator plus a scalar multiple of the identity.
Theorem190Description of operators on complex vector spaces
Suppose V is a complex vector space and T∈L(V). Let 𝜆1,⋯,𝜆m be the distinct eigenvalues of T. Then
(1) V = G(𝜆1,T)⊕⋯⊕G(𝜆m,T);
(2) each G(𝜆j,T) is invariant under T;
(3) each (T−𝜆jI)|G(𝜆j,T) is nilpotent.
Proof: Let n=dim V. Recall that G(𝜆j,T)=null (T−𝜆jI)ⁿ for each j. Thus each G(𝜆j,T) is the null space of a polynomial in T and hence is invariant under T, giving (2); and (3) holds because (T−𝜆jI)ⁿ vanishes on G(𝜆j,T).
We will prove (1) by induction on n. The desired result holds if n=1. Thus assume n>1 and that the desired result holds on all vector spaces of smaller dimension. Because V is a complex vector space, T has an eigenvalue; thus m⩾1. We can decompose
V = G(𝜆1,T) ⊕ U,
where U=range (T−𝜆1I)ⁿ; this is the decomposition V = null (T−𝜆1I)ⁿ ⊕ range (T−𝜆1I)ⁿ from 8.A. U is the range of a polynomial in T, so it is invariant under T. Because G(𝜆1,T)≠{0}, we have dim U<n, so we can apply our induction hypothesis to T|U.
None of the generalized eigenvectors of T|U correspond to the eigenvalue 𝜆1, because they would lie in G(𝜆1,T)∩U={0}. Thus each eigenvalue of T|U is in {𝜆2,⋯,𝜆m}, and by induction U=G(𝜆2,T|U)⊕⋯⊕G(𝜆m,T|U). It remains to show that G(𝜆k,T|U)=G(𝜆k,T) for k=2,⋯,m. Fix k∈{2,⋯,m}. The inclusion G(𝜆k,T|U)⊂G(𝜆k,T) is clear; the reverse inclusion follows by decomposing a vector of G(𝜆k,T) according to V=G(𝜆1,T)⊕U and checking that its G(𝜆1,T) component is 0. □
G(𝜆,T) cannot always be decomposed into eigenvectors. For example, take
T = ⎡6 3 4⎤
    ⎢0 6 2⎥
    ⎣0 0 7⎦
with eigenvalue 𝜆=6. Then
T−6I = ⎡0 3 4⎤      (T−6I)² = ⎡0 0 10⎤
       ⎢0 0 2⎥               ⎢0 0 2 ⎥
       ⎣0 0 1⎦               ⎣0 0 1 ⎦
The eigenspace E(6,T) is spanned by (1,0,0) alone, but the generalized eigenvectors corresponding to 6 are spanned by (1,0,0) and (0,1,0), so dim G(6,T)=2.
Theorem191A basis of generalized eigenvectors
Suppose V is a complex vector space and T∈L(V). Then there is a basis of V consisting of generalized eigenvectors of T.
Multiplicity of an Eigenvalue
If V is a complex vector space and T∈L(V), then the generalized eigenspace decomposition of V above can be a powerful tool. The dimensions of the subspaces involved in this decomposition are sufficiently important to get a name.
Definition192multiplicity
Suppose T∈L(V). The multiplicity of an eigenvalue 𝜆 of T is defined to be the dimension of the corresponding generalized eigenspace G(𝜆,T)
★ algebraic multiplicity of 𝜆 = dim null (T−𝜆I)^(dim V) = dim G(𝜆,T)
★ geometric multiplicity of 𝜆 = dim null (T−𝜆I) = dim E(𝜆,T)
Block Diagonal Matrices
To interpret our results in matrix form, we make the following definition, generalizing the notion of a diagonal matrix.
Definition194block diagonal matrix
A block diagonal matrix is a square matrix of the form
⎡A1    0⎤
⎢   ⋱   ⎥
⎣0    Am⎦
where A1,⋯,Am are square matrices lying along the diagonal and all the other entries of the matrix equal 0.
Theorem195Block diagonal matrix with upper-triangular blocks
Suppose V is a complex vector space and T∈L(V), with distinct eigenvalues 𝜆1,⋯,𝜆m. Then we can choose a basis of V with respect to which T has a block diagonal matrix in which each block Aj is an upper-triangular matrix whose diagonal entries all equal 𝜆j.
Square Roots
Not every operator on a complex vector space has a square root.
Theorem196Identity plus nilpotent has a square root
Suppose N∈L(V) is nilpotent. Then I+N has a square root.
Proof: Consider the Taylor series for the function √(1+x):
√(1+x) = 1 + a1x + a2x² + ⋯.
Because N is nilpotent, Nᵐ=0 for some positive integer m. We guess that there is a square root of I+N of the form
I + a1N + a2N² + ⋯ + a_{m−1}N^(m−1).
Having this guess, we can compute
(I + a1N + ⋯ + a_{m−1}N^(m−1))(I + a1N + ⋯ + a_{m−1}N^(m−1))
and choose the coefficients so that the resulting coefficients of I, N, N², N³, ⋯ are 1, 1, 0, 0, ⋯. □
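The coefficient matching in the proof is exactly the binomial series for √(1+x), truncated at the nilpotency index; a sketch assuming numpy (sqrt_I_plus_N is our own helper name):

```python
import numpy as np

def sqrt_I_plus_N(N):
    """Square root of I + N for nilpotent N (Theorem 196): sum the binomial
    series for sqrt(1 + x) with x = N; powers N^k with k >= n vanish."""
    n = N.shape[0]
    R = np.zeros_like(N, dtype=float)
    power = np.eye(n)                 # current power N^k
    c = 1.0                           # binomial coefficient C(1/2, k)
    for k in range(n):
        R += c * power
        power = power @ N
        c *= (0.5 - k) / (k + 1)      # C(1/2, k+1) from C(1/2, k)
    return R

N = np.array([[0.0, 1.0, 2.0],
              [0.0, 0.0, 3.0],
              [0.0, 0.0, 0.0]])       # nilpotent: N^3 = 0
R = sqrt_I_plus_N(N)
print(np.allclose(R @ R, np.eye(3) + N))  # True
```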
Theorem197Over C, invertible operators have square roots
Suppose V is a complex vector space and T∈L(V) is invertible. Then T has a square root.
Proof: Let 𝜆1,⋯,𝜆m be the distinct eigenvalues of T; invertibility means that every 𝜆j is nonzero. On each generalized eigenspace G(𝜆j,T), the operator Nj=(T−𝜆jI)|G(𝜆j,T) is nilpotent, so
T|G(𝜆j,T) = 𝜆j(I + Nj/𝜆j),
where I+Nj/𝜆j has a square root Rj by the previous result; then (√𝜆j Rj)²=T|G(𝜆j,T), where √𝜆j is any complex square root of 𝜆j. Because V=G(𝜆1,T)⊕⋯⊕G(𝜆m,T), defining the square root on each summand defines a square root of T on all of V. □
8.C Characteristic and Minimal Polynomial
The Cayley-Hamilton Theorem
The next definition associates a polynomial with each operator on V if F=C.
Definition198characteristic polynomial
Suppose V is a complex vector space and T∈L(V). Let 𝜆1,⋯,𝜆m denote the distinct eigenvalues of T, with multiplicities d1,⋯,dm. The polynomial (z-𝜆1)d1⋯(z-𝜆m)dmis called the characteristic polynomial of T.
Theorem199Cayley-Hamilton Theorem
Suppose V is a complex vector space and T∈L(V). Let q denote the characteristic polynomial of T. Then q(T)=0.
Proof: Every vector in V is a sum of vectors in G(𝜆1,T),⋯,G(𝜆m,T). Thus we only need to prove that q(T) vanishes on each G(𝜆j,T). Because q(T)=(T−𝜆1I)^d1⋯(T−𝜆mI)^dm and the factors commute, we can move (T−𝜆jI)^dj to act first. By the description of generalized eigenspaces, G(𝜆j,T)=null (T−𝜆jI)^dj (note that dj=dim G(𝜆j,T)), so (T−𝜆jI)^dj vanishes on G(𝜆j,T), and hence so does q(T). □
The Minimal Polynomial
Definition200monic polynomial
A monic polynomial is a polynomial whose highest-degree coefficient equals 1.
Theorem201Minimal polynomial
Suppose T∈L(V). Then there is a unique monic polynomial p of smallest degree such that p(T)=0.
Proof: Let n=dim V. Then the list of n²+1 operators
I, T, T², ⋯, T^(n²)
is linearly dependent, because dim L(V)=n². Let m be the smallest positive integer such that
I, T, T², ⋯, Tᵐ
is linearly dependent; then Tᵐ is a linear combination of the lower powers, so there are scalars a0,⋯,a_{m−1} with
a0I + a1T + ⋯ + a_{m−1}T^(m−1) + Tᵐ = 0.
Define a monic polynomial p∈P(F) by
p(z) = a0 + a1z + ⋯ + a_{m−1}z^(m−1) + zᵐ.
Then p(T)=0. For uniqueness: if two monic polynomials of the same smallest degree both annihilated T, their difference would be a nonzero polynomial of smaller degree annihilating T, contradicting minimality. □
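The proof suggests an algorithm: flatten I, T, T², ⋯ into vectors and stop at the first power that is a linear combination of the previous ones. A numerical sketch assuming numpy (minimal_polynomial is our own helper name):

```python
import numpy as np

def minimal_polynomial(T, tol=1e-10):
    """Monic minimal polynomial of T (Theorem 201), returned as coefficients
    [a_0, ..., a_{m-1}, 1] meaning a_0 I + a_1 T + ... + T^m = 0."""
    n = T.shape[0]
    powers = [np.eye(n).ravel()]            # flattened I, T, T^2, ...
    for m in range(1, n * n + 1):
        target = np.linalg.matrix_power(T, m).ravel()
        A = np.stack(powers, axis=1)        # columns: I, ..., T^{m-1}
        coeffs = np.linalg.lstsq(A, target, rcond=None)[0]
        if np.linalg.norm(A @ coeffs - target) < tol:
            return np.append(-coeffs, 1.0)  # T^m - sum coeffs_k T^k = 0
        powers.append(target)

T = np.array([[2.0, 1.0],
              [0.0, 2.0]])
print(minimal_polynomial(T))   # [ 4. -4.  1.], i.e. (z - 2)^2
```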
Definition202minimal polynomial
A minimal polynomial of T is the unique monic polynomial p of smallest degree such thatp(T)=0
The proof of the last result shows that the degree of the minimal polynomial of each operator on V is at most (dim V)2. The Cayley-Hamilton Theorem tells us that if V is a complex vector space, then the minimal polynomial of each operator on V has degree at most dim V. This remarkable improvement also holds on real vector spaces, as we will see in the next chapter.
Theorem203q(T)=0 implies q is a multiple of the minimal polynomial
Suppose T∈L(V) and q∈P(F). Then q(T)=0 if and only if q is a polynomial multiple of the minimal polynomial of T.
Proof: The "if" direction is easy. For the other direction, let p be the minimal polynomial of T. By the Division Algorithm for Polynomials, there exist polynomials s,r∈P(F) such that
q = ps + r
and deg r < deg p. Then 0 = q(T) = p(T)s(T) + r(T) = r(T). If r were not the zero polynomial, dividing it by its highest-degree coefficient would produce a monic polynomial of degree smaller than deg p that annihilates T, contradicting the minimality of p. Thus r=0 and q=ps. □
Theorem204Characteristic polynomial is a multiple of minimal polynomial
Suppose T∈L(V), F=C. Then the characteristic polynomial of T is a polynomial multiple of the minimal polynomial of T.
Theorem205Eigenvalues are the zeros of the minimal polynomial
Suppose T∈L(V). Then the zeros of the minimal polynomial of T are precisely the eigenvalues of T.
8.D Jordan Form
We know that if V is a complex vector space, then for every T∈L(V) there is a basis of V with respect to which T has a nice upper-triangular matrix. In this section we will see that we can do even better — there is a basis of V with respect to which the matrix of T contains 0's everywhere except possibly on the diagonal and the line directly above the diagonal.
Theorem206Basis corresponding to a nilpotent operator
Suppose N∈L(V) is nilpotent. Then there exist vectors v1,⋯,vn∈V and nonnegative integers m1,⋯,mn such that
(1) N^(m1)v1,⋯,Nv1,v1,⋯,N^(mn)vn,⋯,Nvn,vn is a basis of V;
(2) N^(m1+1)v1=⋯=N^(mn+1)vn=0.
Proof: We use induction on dim V. When dim V=1, N must be the 0 map; let m1=0 and let v1 be any nonzero vector.
Now suppose dim V>1 and the theorem holds on all spaces of smaller dimension. We may assume N≠0 (otherwise take each mj=0 and any basis). Then null N≠{0}, so dim range N<dim V, and the induction hypothesis applied to N restricted to range N gives vectors v1,⋯,vn∈range N and integers m1,⋯,mn such that
(A) N^(m1)v1,⋯,v1,⋯,N^(mn)vn,⋯,vn is a basis of range N, and
(B) N^(m1+1)v1=⋯=N^(mn+1)vn=0.
Because each vj∈range N, there exists uj∈V with vj=Nuj; note that Nᵏuj=N^(k−1)vj for k⩾1. We will show in two steps that the list
(C) N^(m1+1)u1,⋯,u1,⋯,N^(mn+1)un,⋯,un
can be extended to a basis of the required form.
First, (C) is linearly independent. Suppose some linear combination of (C) equals 0, and apply N to it. Under N, each vector Nᵏuj with k⩽mj becomes Nᵏvj, a vector of the basis (A), while each N^(mj+1)uj becomes N^(mj+1)vj=0 by (B). Because (A) is linearly independent, the coefficients of all the vectors that landed in (A) must be 0. The combination therefore involves only the vectors N^(mj+1)uj=N^(mj)vj, which are themselves part of the basis (A) and hence linearly independent, so their coefficients are 0 as well.
Second, extend (C) to a basis
N^(m1+1)u1,⋯,u1,⋯,N^(mn+1)un,⋯,un,w1,⋯,wp
of V. Each Nwj is in range N and hence is in the span of (A); each vector in (A) equals N applied to some vector in the span of (C). Thus there exists xj in the span of (C) such that Nwj=Nxj. Now let
un+j = wj − xj,
so that Nun+j=0. Furthermore,
N^(m1+1)u1,⋯,Nu1,u1,⋯,N^(mn+1)un,⋯,Nun,un,un+1,⋯,un+p
spans V, because its span contains each xj and each un+j and hence each wj. The spanning list above is thus a basis of V, because it has the same length as the extended basis. This basis has the required form, with mn+j=0 for j=1,⋯,p. □
Definition207Jordan basis
Suppose T∈L(V). A basis of V is called a Jordan basis for T if with respect to this basis T has a block diagonal matrix
⎡A1    0⎤
⎢   ⋱   ⎥
⎣0    Ap⎦
where each Aj is an upper-triangular matrix of the form
Aj = ⎡𝜆j 1      ⎤
     ⎢    ⋱  ⋱ ⎥
     ⎢       ⋱ 1⎥
     ⎣0       𝜆j⎦
Theorem208Jordan Form
Suppose V is a complex vector space. If T∈L(V), then there is a basis of V that is a Jordan basis for T.
Proof: First consider a nilpotent operator N. The basis constructed in the previous theorem, taken in the order N^(mj)vj,⋯,Nvj,vj for each j, gives N a block diagonal matrix in which each block on the diagonal has the form
⎡0 1      ⎤
⎢   ⋱  ⋱ ⎥
⎢      ⋱ 1⎥
⎣0       0⎦
because N sends each basis vector in a block to the next one and sends the first to 0.
Now let T∈L(V) with distinct eigenvalues 𝜆1,⋯,𝜆m. Each (T−𝜆jI)|G(𝜆j,T) is nilpotent, so G(𝜆j,T) has a basis that is a Jordan basis for (T−𝜆jI)|G(𝜆j,T); with respect to it, T|G(𝜆j,T)=𝜆jI+(T−𝜆jI)|G(𝜆j,T) has a block diagonal matrix whose blocks have the form
⎡𝜆j 1      ⎤
⎢    ⋱  ⋱ ⎥
⎢       ⋱ 1⎥
⎣0       𝜆j⎦
By the generalized eigenspace decomposition
V = G(𝜆1,T)⊕⋯⊕G(𝜆m,T),
putting these bases together gives a basis of V with respect to which T has a block diagonal matrix built from such Jordan blocks. □
Chapter 9. Operators on Real Vector Spaces
9.A Complexification of a Vector Space
As we will soon see, a real vector space V can be embedded, in a natural way, in a complex vector space called the complexification of V.
Definition209complexification of V, VC
Suppose V is a real vector space.
★ The complexification of V, denoted VC, equals V×V. An element of VC is an ordered pair (u,v), where u,v∈V, but we will write this as u+ⅈv.
★ Addition on VC is defined by (u1+ⅈv1)+(u2+ⅈv2)=(u1+u2)+ⅈ(v1+v2).
★ Complex scalar multiplication on VC is defined by (a+bⅈ)(u+ⅈv)=(au−bv)+ⅈ(av+bu) for a,b∈R and u,v∈V.
We think of V as a subset of VC by identifying u∈V with u+ⅈ0. The construction of VC from V can then be thought of as generalizing the construction of Cⁿ from Rⁿ.
Theorem210basis of V is basis of VC
Suppose V is a real vector space.
★ If v1,⋯,vn is a basis of V (as a real vector space, where the coefficients are real), then v1,⋯,vn is also a basis of VC (as a complex vector space, where the coefficients are complex).
★ In particular, dim VC = dim V.
Complexification of an Operator
Definition211complexification of T, TC
Suppose V is a real vector space and T∈L(V). The complexification of T, denoted TC, is the operator TC∈L(VC) defined byTC(u+ⅈv)=Tu+ⅈTvfor u,v∈V.
Theorem212Matrix of TCequals matrix of T
Suppose V is a real vector space with basis v1,⋯,vn and T∈L(V). Then M(T)=M(TC), where both matrices are with respect to the same basis
We know that every operator on a nonzero finite-dimensional complex vector space has an eigenvalue and thus has a 1-dimensional invariant subspace. But an operator on a nonzero finite-dimensional real vector space may have no eigenvalues and thus no 1-dimensional invariant subspaces. However, we now show that an invariant subspace of dimension 1 or 2 always exists.
Theorem213Every operator has an invariant subspace of dimension 1 or 2.
Suppose V is a real vector space and T∈L(V). Then T has an invariant subspace of dimension 1 or 2.
Proof: The complexification TC has an eigenvalue a+bⅈ, where a,b∈R. Thus there exist u,v∈V, not both 0, such that TC(u+ⅈv)=(a+bⅈ)(u+ⅈv). Using the definition of TC, the last equation can be rewritten as
Tu+ⅈTv=(au−bv)+(av+bu)ⅈ.
Thus Tu=au−bv and Tv=av+bu. Then U=span(u,v) is invariant under T, and it has dimension 1 or 2. □
The Minimal Polynomial of the Complexification
Suppose V is a real vector space and T∈L(V). Then the minimal polynomial of TC equals the minimal polynomial of T.
Eigenvalues of the Complexification
An eigenvalue of TC is real if and only if it is also an eigenvalue of T.
Theorem214TC−𝜆I and TC−𝜆̄I
Suppose V is a real vector space, T∈L(V), 𝜆∈C, j is a positive integer, and u,v∈V. Then
(TC−𝜆I)ʲ(u+ⅈv)=0 if and only if (TC−𝜆̄I)ʲ(u−ⅈv)=0.
As a consequence, the nonreal eigenvalues of TC come in conjugate pairs.
Theorem215Multiplicity of 𝜆 equals multiplicity of ⏨𝜆
Suppose V is a real vector space, T∈L(V), and 𝜆∈C is an eigenvalue of TC. Then the multiplicity of 𝜆 as an eigenvalue of TC equals the multiplicity of 𝜆̄ as an eigenvalue of TC. (This follows from the previous result: the map u+ⅈv ↦ u−ⅈv carries a basis of G(𝜆,TC) to a basis of G(𝜆̄,TC).)
Chapter 10. Trace and Determinant
10.A Trace
Change of Basis
★ identity matrix I; invertible, inverse, A⁻¹.
★ The matrix of the product of linear maps: M(ST,(u1,⋯,un),(w1,⋯,wn)) = M(S,(v1,⋯,vn),(w1,⋯,wn)) M(T,(u1,⋯,un),(v1,⋯,vn)).
★ M(I,(u1,⋯,un),(v1,⋯,vn))'s inverse is M(I,(v1,⋯,vn),(u1,⋯,un)).
★ Change of basis formula: let A=M(I,(u1,⋯,un),(v1,⋯,vn)); then M(T,(u1,⋯,un)) = A⁻¹ M(T,(v1,⋯,vn)) A.
Trace: A Connection Between Operators and Matrices
★ The trace of T is the sum of the eigenvalues of T (if F=C) or of TC (if F=R).
★ trace T equals the negative of the coefficient of z^(n−1) in the characteristic polynomial of T.
★ The trace of a square matrix is the sum of its diagonal entries.
★ trace(AB)=trace(BA).
★ The trace of the matrix of an operator does not depend on the basis: if T1=A⁻¹T2A, then trace T1=trace T2.
★ The trace of an operator equals the trace of its matrix.
★ Trace is additive.
10.B Determinant
Determinant of an operator
★ det T is the product of the eigenvalues of T (or of TC), with each eigenvalue repeated according to its multiplicity.
★ det T equals (−1)ⁿ times the constant term of the characteristic polynomial of T.
★ T is invertible if and only if det T≠0.
★ The characteristic polynomial of T equals det(zI−T).
Determinant of a matrix
★ det A = Σ_{(m1,⋯,mn)∈perm n} (sign(m1,⋯,mn)) A_{m1,1}⋯A_{mn,n}.
★ The determinant is multiplicative.
★ The determinant of an operator equals the determinant of its matrix.
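A few of the bullet points above, checked numerically (a sketch assuming numpy):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
eig = np.linalg.eigvals(A)
print(np.isclose(np.trace(A), eig.sum()))            # trace = sum of eigenvalues
print(np.isclose(np.linalg.det(A), eig.prod()))      # det = product of eigenvalues

B = np.array([[0.0, 1.0],
              [1.0, 0.0]])
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))  # trace(AB) = trace(BA)
```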
Acknowledgment
All the typesetting is completed using Math. Math really kicks ass!