Longest Path Search

Fun with polynomials and linear algebra; or, slight abstract nonsense


This is mostly a bunch of notes to myself (with some slight expansion) and is a combination/extension/simplification of theorems/ideas/constructions from a bunch of texts, including Wisbauer’s “Foundations of Module and Ring Theory” and Fuhrmann’s “A Polynomial Approach to Linear Algebra”, along with others that at this point I don’t recall.

While most of the things here are pretty standard (indeed, many of these are just slightly generalized definitions given in introductions to linear algebra over arbitrary fields!) there is some stuff here that might be weird and surprising unless you’re already quite well-versed in the language of module theory. (We won’t be discussing modules here, though, even if many statements have natural module-theoretic generalizations.)

This document/post is more of an exercise to see just how many “standard” constructions can be done purely in the linear-algebraic language. In a sense, this document is an “any% speedrun” of some theorems (Chinese remainder theorem, e.g.) and tidbits that most of us associate with different parts of math. It will also have some small exercises littered throughout, which are fun one-or-two-liners for those interested.

Anyways, onwards.

Basic facts

Let V be a vector space over a field 𝐅 in what follows. If V′ is a vector space over the same field, we say

V ≅ V′

if there exists any invertible linear map between V and V′. It is pretty easy to see that this is an equivalence relation.

Given a linear map T: V → V′, note that T is injective (one-to-one) exactly when Tx = 0 implies that x = 0 (why?). Equivalently, it is when the kernel ker(T) = {x | Tx = 0} is trivial, i.e., ker(T) = {0}. These are, of course, identical statements, but the latter is sometimes nicer when dealing with quotients, which we define later.

Now, let W be a subspace of V, that is, W ⊆ V and W is itself a vector space. We say that W is finite-dimensional and that dim(W) is the dimension of the subspace if

W ≅ 𝐅^{dim(W)}.

That is, if there exists some invertible linear map W → 𝐅^{dim(W)}. (For those familiar with linear algebra, this is identical to saying that there exists a finite number of w_1, …, w_{dim(W)} ∈ W which span the space and are linearly independent; this is easy to see via the map ϕ: 𝐅^{dim(W)} → W with ϕ(e_i) = w_i.)

Note that dim(W) is unique (if it exists) and so does not depend on the mapping. This means that if we have another subspace U ⊆ V with dim(U) = dim(W), then U ≅ W since U ≅ 𝐅^{dim(U)} = 𝐅^{dim(W)} ≅ W.

In general, given two finite-dimensional vector spaces U and W, then U ≅ W exactly when dim(U) = dim(W). The former often conveys slightly more information in practice since the mappings between U and W are fairly natural when constructed, but if we only know that dim(U) = dim(W) then the map U → W may be fairly arbitrary and potentially complicated (though it will be, of course, linear).

It is also pretty easy to see from this definition that dim(U × W) = dim(U) + dim(W) when U and W are finite-dimensional. (Consider the “obvious” map built from the isomorphisms U ≅ 𝐅^{dim(U)} and similarly for W.)

One useful observation is that if n = dim(U) = dim(W) with both quantities defined, then any linear map T: U → W that is either injective or surjective (onto) must also be invertible. The “non-gentleman’s proof”1 in the injective case is by noting that, if T is injective, any basis u_1, …, u_n of U maps to a basis Tu_1, …, Tu_n of W. (Note that this is not true in the non-finite case; why?) This is helpful since, as we will soon see, it allows us to take “natural” maps T which are easily seen to be injective and immediately show that they are invertible by dimension-counting.

Finally, it’s useful to note that for a linear map T: U → W, the preimage of this map, T⁻¹(w) = {u ∈ U | Tu = w}, can be written as T⁻¹(w) = u + ker(T) for any u ∈ U satisfying Tu = w.
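As a quick sanity check of this preimage structure, here is a toy numerical example (my own illustration, with a made-up 2×3 matrix T; none of this is from the original text): shifting a particular solution u by kernel elements never changes Tx.

```python
# Hypothetical check of T^{-1}(w) = u + ker(T): T is a 2x3 matrix,
# u is one particular solution of T u = w, and k spans ker(T).
def apply(T, x):
    """Apply the matrix T (a list of rows) to the vector x."""
    return [sum(t * xi for t, xi in zip(row, x)) for row in T]

T = [[1.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
u = [2.0, 1.0, 5.0]           # a particular solution: T u = (3, 5)
w = apply(T, u)
k = [1.0, -1.0, 0.0]          # T k = 0, and k spans ker(T)

assert apply(T, k) == [0.0, 0.0]
# every element of u + ker(T) maps to w, so the preimage is the coset u + ker(T)
for t in [-2.0, 0.5, 3.0]:
    assert apply(T, [ui + t * ki for ui, ki in zip(u, k)]) == w
```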

Quotients by spaces

Given some subspace W ⊆ V, then (much like in standard abstract algebra) we can set

V/W = {v + W | v ∈ V},

where v + W is defined elementwise; i.e., v + W = {v + w | w ∈ W}. The notion of a coset for vector spaces is identical to the one for Abelian groups, except that we work with subspaces, which are the natural analogue of subgroups in the standard setting.

Note that V/W is itself a vector space under the standard operations (again, applied elementwise) with the zero element 0 equivalent to 0 + W. (In what follows, it will be worth it to use 0 as 0 + W and vice versa, but the “type” of zero should be clear from context.) To see that V/W is a vector space, let v + W and v′ + W be any “vectors” in V/W and let α, β ∈ 𝐅 be any elements of the field which V is over; then

α(v + W) + β(v′ + W) = αv + βv′ + W ∈ V/W.

As an exercise, it is worth parsing the left-hand side very carefully. This means that, like with any vector space, there is a notion of dimension whenever the space V/W is finite-dimensional.

Now, let W, U ⊆ V be subspaces of V with dim(U) and dim(V/W) defined; then note that if

dim(U) = dim(V/W)

and U ∩ W = {0}, then V = U ⊕ W. Since dim(U) and dim(V/W) are finite, then V/W ≅ U is equivalent to dim(U) = dim(V/W).

The proof of this is pretty simple: let ϕ: U → V/W be given by ϕ(u) = u + W. Then ϕ(u) = 0 implies that u + W = 0 + W, i.e., u ∈ W, so u = 0 since W ∩ U = {0}, and so ϕ is injective. But an injective linear map between two spaces of the same dimension is invertible. Take v ∈ V, then note that ϕ⁻¹(v + W) = u for some u ∈ U. Finally, applying ϕ to both sides gives u + W = v + W, so u − v ∈ W, which means that there is some w ∈ W such that v = u + w. (Another way of stating this is that ϕ defines a natural projection from v ∈ V to U via v ↦ ϕ⁻¹(v + W).)
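Here is a tiny worked instance of this projection (my own toy example with made-up subspaces, not from the text): in V = 𝐅², take W = span{(0, 1)} and U = span{(1, 1)}, so dim(U) = dim(V/W) = 1 and U ∩ W = {0}, and every v splits uniquely as u + w.

```python
from fractions import Fraction as Fr

# Toy decomposition V = U ⊕ W in F^2 (illustrative sketch only):
# W = span{(0, 1)}, U = span{(1, 1)}.  The coset v + W is determined by the
# first coordinate of v, so phi^{-1}(v + W) picks out u = (v1, v1) in U.
def decompose(v):
    u = [v[0], v[0]]                 # the unique element of U in the coset v + W
    w = [Fr(0), v[1] - v[0]]         # the remainder, which lies in W
    return u, w

v = [Fr(3), Fr(7)]
u, w = decompose(v)
assert [u[0] + w[0], u[1] + w[1]] == v   # v = u + w
assert u[0] == u[1]                      # u ∈ U = span{(1, 1)}
assert w[0] == 0                         # w ∈ W = span{(0, 1)}
```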

Polynomial fun

Consider the vector space of polynomials over some field, and denote it V. (It may be useful to consider V instead to be the set of polynomials of degree ≤ N for some N, in which case V would have finite dimension, but what we’ll talk about here holds in general, even when V is not finite-dimensional.) Let p ∈ V be a polynomial over the same field and set W_p ⊆ V to be the set of polynomials divisible by p.

Note that W_p is a perfectly reasonable subspace of V, since adding polynomials divisible by p, or multiplying them by constants, does not change the fact that they are divisible by p.

This view leads to a bunch of nice results.

For example, it is reasonable to ask: what do the vector subspaces W_p + W_q correspond to, for polynomials p and q? This is just the set of polynomials that can be written as ap + bq for a, b ∈ V, by definition of W_p and W_q. Using Bezout’s lemma, this is the same as W_(p,q), where (p,q) is the (polynomial) greatest common divisor of p and q. Similarly, you should work out what W_p ∩ W_q is. (If you’re familiar with ideals, this is, for now, just a rewriting of the standard theorems there, hence the use of parenthesis notation!)
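To make the identity W_p + W_q = W_(p,q) concrete, the following sketch (my own code; the helper names like polygcdex are made up, and polynomials are coefficient lists over ℚ, lowest degree first) runs the extended Euclidean algorithm: it exhibits gcd(p, q) as ap + bq, i.e., as an element of W_p + W_q, and checks that a combination ap + bq leaves zero remainder modulo the gcd.

```python
from fractions import Fraction as Fr

# Polynomials as coefficient lists over Q, lowest degree first
# (my own illustrative sketch; helper names are made up, not from the post).
def trim(f):
    while len(f) > 1 and f[-1] == 0:
        f.pop()
    return f

def polyadd(f, g):
    out = [Fr(0)] * max(len(f), len(g))
    for i, c in enumerate(f): out[i] += c
    for i, c in enumerate(g): out[i] += c
    return trim(out)

def polymul(f, g):
    out = [Fr(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)

def polydiv(f, g):
    """Return (a, r) with f = a*g + r and deg(r) < deg(g)."""
    f, g = trim(list(f)), trim(list(g))
    a = [Fr(0)] * max(1, len(f) - len(g) + 1)
    while len(f) >= len(g) and f != [Fr(0)]:
        shift, c = len(f) - len(g), Fr(f[-1]) / Fr(g[-1])
        a[shift] += c
        for i, gi in enumerate(g):
            f[shift + i] -= c * gi
        trim(f)
    return trim(a), f

def polygcdex(p, q):
    """Extended Euclid: (g, a, b) with a*p + b*q = g, g the monic gcd."""
    r0, a0, b0 = trim(list(p)), [Fr(1)], [Fr(0)]
    r1, a1, b1 = trim(list(q)), [Fr(0)], [Fr(1)]
    while r1 != [Fr(0)]:
        s, r = polydiv(r0, r1)
        r0, r1 = r1, r
        a0, a1 = a1, polyadd(a0, polymul([Fr(-1)], polymul(s, a1)))
        b0, b1 = b1, polyadd(b0, polymul([Fr(-1)], polymul(s, b1)))
    lead = Fr(r0[-1])
    return ([Fr(c) / lead for c in r0], [Fr(c) / lead for c in a0],
            [Fr(c) / lead for c in b0])

# p = (x-1)(x-2) and q = (x-1)(x+3) share the factor x - 1:
p, q = [2, -3, 1], [-3, 2, 1]
g, a, b = polygcdex(p, q)
assert g == [Fr(-1), Fr(1)]                         # monic gcd is x - 1
assert polyadd(polymul(a, p), polymul(b, q)) == g   # g = a*p + b*q ∈ W_p + W_q
# ...and, conversely, any combination of p and q is divisible by the gcd:
_, r = polydiv(polyadd(polymul([1, 2], p), polymul([Fr(3)], q)), g)
assert r == [Fr(0)]
```

Normalizing the gcd to be monic just picks a canonical generator for W_(p,q); any nonzero scalar multiple generates the same subspace.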

Additionally, note that the set of vectors in V modulo W_p (in the linear-algebraic sense) is exactly the same as the set of polynomials modulo p (in the polynomial sense). We will show that, indeed, quotienting V by the space W_p is the same as quotienting general polynomials by the polynomial p, in that

V/W_p ≅ R_p,

where R_p ⊆ V is the set of polynomials with degree < deg(p). That is, the set of vectors in V, modulo W_p (or, equivalently, polynomials modulo the polynomial p) is identical to the set of polynomials of degree < deg(p). From our little structure theorem above, this immediately shows that the polynomials decompose uniquely into W_p and a remainder part in R_p:

V = W_p ⊕ R_p.

In this sense, the vector space W_p captures a lot of the structure of the polynomial p.

Proof

Ok, to start, note that a polynomial q ∈ V lies in the coset r + W_p, for some polynomial r in V, if and only if q = ap + r for some polynomial a ∈ V. This should already be quite reminiscent of the division theorem.

It is not hard to show that dim(V/W_p) ≤ deg(p) for p nonzero. In fact, this is the only polynomial fact we need to prove the theorem: let q be a polynomial with deg(q) ≥ deg(p); then q + W_p = q − ap + W_p for any polynomial a ∈ V. It is easy to choose a such that deg(q − ap) < deg(q) (why?), so we may assume in general that deg(q) < deg(p) by applying this observation repeatedly. This implies that the monomial cosets {1, x, …, x^{deg(p)−1}} + W_p span V/W_p, so dim(V/W_p) ≤ deg(p), as required. It is easy to see that dim(V/W_p) ≥ deg(p) by comparing degrees, which gives the final claim that

dim(V/W_p) = deg(p).

Of course, this exactly matches the degree bound for polynomial division and is also the dimension of R_p, hence R_p ≅ V/W_p. Finally, since R_p ∩ W_p = {0} (why?) and dim(R_p) = deg(p), we have the result V = W_p ⊕ R_p by the mini-structure theorem above.
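Since V = W_p ⊕ R_p is just polynomial division with remainder in disguise, here is a minimal sketch of that split (my own code, with made-up helper names; polynomials are coefficient lists over ℚ, lowest degree first): f decomposes uniquely into a·p ∈ W_p plus r ∈ R_p.

```python
from fractions import Fraction as Fr

# Polynomials as coefficient lists over Q, lowest degree first (my sketch).
def trim(f):
    while len(f) > 1 and f[-1] == 0:
        f.pop()
    return f

def polyadd(f, g):
    out = [Fr(0)] * max(len(f), len(g))
    for i, c in enumerate(f): out[i] += c
    for i, c in enumerate(g): out[i] += c
    return trim(out)

def polymul(f, g):
    out = [Fr(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)

def polydiv(f, g):
    """Return (a, r) with f = a*g + r and deg(r) < deg(g)."""
    f, g = trim(list(f)), trim(list(g))
    a = [Fr(0)] * max(1, len(f) - len(g) + 1)
    while len(f) >= len(g) and f != [Fr(0)]:
        shift, c = len(f) - len(g), Fr(f[-1]) / Fr(g[-1])
        a[shift] += c
        for i, gi in enumerate(g):
            f[shift + i] -= c * gi
        trim(f)
    return trim(a), f

# f = x^4 + 1 against p = x^2 + 1: the unique split f = a*p + r
# with a*p ∈ W_p and r ∈ R_p (so deg(r) < deg(p) = 2).
f, p = [1, 0, 0, 0, 1], [1, 0, 1]
a, r = polydiv(f, p)
assert a == [Fr(-1), Fr(0), Fr(1)]          # a = x^2 - 1
assert r == [Fr(2)]                         # remainder 2 ∈ R_p
assert polyadd(polymul(a, p), r) == f       # reconstruct: f = a*p + r
```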

Abstract nonsense-ish Chinese remainder theorem

Now, we can get to a result that generalizes the case of ideals.

An interesting observation from Wisbauer’s 1991 textbook is the following very lovely construction (let’s say over vector spaces, but it holds in general over modules with a slightly different condition).

Let W_1, …, W_n ⊆ V be vector subspaces of V such that V/∩_i W_i is finite-dimensional; then

V/∩_i W_i ≅ ∏_i (V/W_i),

exactly when

dim(V/∩_i W_i) = ∑_i dim(V/W_i).

This just follows from our definition of “dimension” and the surrounding discussion.

In other words, it is possible to decompose a quotient over an intersection of subspaces into a product of individual quotients with each subspace, exactly when the dimensions match.

Canonical map

In fact, the dimension equality implies the canonical map V/∩_i W_i → ∏_i (V/W_i) is also invertible. To see this, let π_i: V → V/W_i be the map for the ith component, such that π = π_1 × ⋯ × π_n is the “obvious” map. That is,

π_i(x) = x + W_i,

for x ∈ V. Now, note that π = g∘f where f: V → V/ker π and g: V/ker π → ∏_i (V/W_i), where f is the canonical map f(x) = x + ker π and g is the (induced!) injective map g(x + ker π) = π(x). (You should check that this is, indeed, injective.) It is clear that ker π = ∩_i ker π_i = ∩_i W_i, and the dimension equality then tells us that the space g maps from, V/ker π = V/∩_i W_i, has the same dimension as the output space ∏_i (V/W_i), and hence g is not just injective, but invertible.

As a special case, if V is finite-dimensional and ∩_i W_i = {0}, then we get that there is a natural invertible map

V → ∏_i (V/W_i),

exactly when dim(V) = ∑_i dim(V/W_i).

Back to the theorem

After all of this, we might ask: why is this the Chinese remainder theorem?

As before, let V be the vector space of polynomials over some field 𝐅 and let p_1, …, p_n be polynomials with coefficients in 𝐅 that are mutually coprime, such that a least common multiple of them is p_1 p_2 ⋯ p_n. Then recall that V/W_{p_i} is exactly the set of remainders modulo p_i (in the polynomial sense).

By the division argument above, we have that dim(V/W_{p_i}) = deg(p_i), while ∩_i W_{p_i} is W_q, where q is any least common multiple of the p_i. So, again from above, dim(V/∩_i W_{p_i}) = deg(p_1 p_2 ⋯ p_n) = deg(p_1) + ⋯ + deg(p_n), so the left-hand side and right-hand side match, and we can decompose any polynomial f modulo p_1 p_2 ⋯ p_n into the polynomial f modulo each p_i uniquely, and vice versa, which is exactly the Chinese remainder theorem.
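A small self-contained check of this statement (my own sketch, not the post’s code; polynomials are coefficient lists over ℚ, lowest degree first): with coprime p_1 = x − 1 and p_2 = x + 1, the residue of f modulo q = p_1 p_2 carries exactly the same information as the pair of residues modulo each p_i, and the degrees (dimensions) add up.

```python
from fractions import Fraction as Fr

# Self-contained sketch (mine): coefficient lists over Q, lowest degree first.
def trim(f):
    while len(f) > 1 and f[-1] == 0:
        f.pop()
    return f

def polymul(f, g):
    out = [Fr(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)

def polymod(f, g):
    """Remainder of f modulo g."""
    f, g = trim(list(f)), trim(list(g))
    while len(f) >= len(g) and f != [Fr(0)]:
        shift, c = len(f) - len(g), Fr(f[-1]) / Fr(g[-1])
        for i, gi in enumerate(g):
            f[shift + i] -= c * gi
        trim(f)
    return f

# Coprime moduli p1 = x - 1 and p2 = x + 1, with q = p1*p2 = x^2 - 1.
p1, p2 = [-1, 1], [1, 1]
q = polymul(p1, p2)
f = [5, 2, 0, 1]                     # f = x^3 + 2x + 5
fq = polymod(f, q)                   # the residue of f modulo q
# reducing f mod q, then mod p_i, agrees with reducing f mod p_i directly:
assert polymod(fq, p1) == polymod(f, p1)
assert polymod(fq, p2) == polymod(f, p2)
# dimension count: deg(q) = deg(p1) + deg(p2), matching the CRT isomorphism
assert (len(q) - 1) == (len(p1) - 1) + (len(p2) - 1)
```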

Indeed, the map above lets us do the decomposition in a straightforward way since it tells us how the two spaces are related via the π_i (and g).

Given the product description π = π_1 × ⋯ × π_n, it is natural to consider elements of the form (0, …, x_j + W_{p_j}, …, 0) for some index j with x_j ∈ V. Since any element in ∏_i (V/W_{p_i}) can be written as the sum of elements of this form, finding an inverse for this type of element suffices to find inverses for any element, because g (and therefore g⁻¹) is linear.

Now, since g is invertible, let y + ∩_i W_{p_i} = g⁻¹((0, …, x_j + W_{p_j}, …, 0)) for some y ∈ V and, by definition of g, we have y − x_j ∈ W_{p_j}. But note that y ∈ W_{p_k} for each k ≠ j, by definition, so y ∈ ∩_{k≠j} W_{p_k}. Since we know that ∩_{k≠j} W_{p_k} = W_{q_j}, where q_j = q/p_j = p_1 ⋯ p_{j−1} p_{j+1} ⋯ p_n, then this tells us exactly what we need to find the inverse: find some polynomial y such that y has the same remainder as x_j (modulo p_j) but is a multiple of the polynomial q_j, then take the remainder over the big polynomial q.

If we want to go back to “polynomial land,” then this follows directly from Bezout’s lemma, using the fact that q_j and p_j are coprime. Of course, we can give a linear-algebraic description here too, using the definitions and discussion above regarding the equivalence between polynomials, greatest common divisors, and the vector subspaces generated by them.

To see this, note that we have W_{p_j} + W_{q_j} = W_{(p_j, q_j)} = W_1 = V, so any x_j ∈ V can be written in the form x_j = z + y with z ∈ W_{p_j} and y ∈ W_{q_j}. (Again, this is Bezout’s lemma in a funny hat, but the point is that it’s all just vector spaces once you have the equivalence.) It is then pretty immediate that (a) y + W_{p_j} = x_j + W_{p_j}, (b) by definition, y is a multiple of q_j since y ∈ W_{q_j}, and, finally, (c) we may take y + ∩_i W_{p_i} as required.
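The recipe above can be carried out directly; the following is my own sketch of the inverse map (made-up helper names; polynomials are coefficient lists over ℚ, lowest degree first): for each residue x_j, Bezout coefficients for the coprime pair (p_j, q_j) produce y_j = b·q_j·x_j, which is a multiple of q_j agreeing with x_j modulo p_j; summing these and reducing modulo q recovers the polynomial.

```python
from fractions import Fraction as Fr

# Self-contained sketch of the CRT inverse described above (my own code,
# not the post's).  Polynomials are coefficient lists, lowest degree first.
def trim(f):
    while len(f) > 1 and f[-1] == 0:
        f.pop()
    return f

def polyadd(f, g):
    out = [Fr(0)] * max(len(f), len(g))
    for i, c in enumerate(f): out[i] += c
    for i, c in enumerate(g): out[i] += c
    return trim(out)

def polymul(f, g):
    out = [Fr(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)

def polydiv(f, g):
    """Return (a, r) with f = a*g + r and deg(r) < deg(g)."""
    f, g = trim(list(f)), trim(list(g))
    a = [Fr(0)] * max(1, len(f) - len(g) + 1)
    while len(f) >= len(g) and f != [Fr(0)]:
        shift, c = len(f) - len(g), Fr(f[-1]) / Fr(g[-1])
        a[shift] += c
        for i, gi in enumerate(g):
            f[shift + i] -= c * gi
        trim(f)
    return trim(a), f

def polygcdex(p, q):
    """Extended Euclid: (g, a, b) with a*p + b*q = g, g the monic gcd."""
    r0, a0, b0 = trim(list(p)), [Fr(1)], [Fr(0)]
    r1, a1, b1 = trim(list(q)), [Fr(0)], [Fr(1)]
    while r1 != [Fr(0)]:
        s, r = polydiv(r0, r1)
        r0, r1 = r1, r
        a0, a1 = a1, polyadd(a0, polymul([Fr(-1)], polymul(s, a1)))
        b0, b1 = b1, polyadd(b0, polymul([Fr(-1)], polymul(s, b1)))
    lead = Fr(r0[-1])
    return ([Fr(c) / lead for c in r0], [Fr(c) / lead for c in a0],
            [Fr(c) / lead for c in b0])

def crt(residues, moduli):
    """Given residues x_j modulo coprime moduli p_j, find f modulo q = prod p_j."""
    q = [Fr(1)]
    for p_j in moduli:
        q = polymul(q, p_j)
    total = [Fr(0)]
    for x_j, p_j in zip(residues, moduli):
        q_j, _ = polydiv(q, p_j)          # q_j = q / p_j
        g, a, b = polygcdex(p_j, q_j)     # 1 = a*p_j + b*q_j since coprime
        assert g == [Fr(1)]
        # y_j = b*q_j*x_j is a multiple of q_j with y_j ≡ x_j (mod p_j)
        total = polyadd(total, polymul(polymul(b, q_j), x_j))
    _, r = polydiv(total, q)              # finally, reduce modulo q
    return r

# residues 8 (mod x - 1) and 2 (mod x + 1), coming from f = x^3 + 2x + 5:
assert crt([[Fr(8)], [Fr(2)]], [[-1, 1], [1, 1]]) == [Fr(5), Fr(3)]  # 3x + 5
```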

The basis view

In the case that V is finite-dimensional, the equivalence has a very simple interpretation in terms of stacked matrices.

For now, take the subspaces to satisfy ∩_i W_i = {0}, as it will make our lives slightly easier.

Since V is finite-dimensional, say of dimension m, then V ≅ 𝐅^m, so we have an invertible map ϕ: 𝐅^m → V. This also means that V/W_i is of finite dimension (why?), call it m_i = dim(V/W_i). Now let T_i: 𝐅^m → 𝐅^{m_i} be any linear map (representable as a matrix of size m_i × m) satisfying ϕ(ker(T_i)) = W_i. In other words, the nullspace of T_i is W_i in the basis given by ϕ. Then the condition

dim(V) = ∑_i dim(V/W_i),

is exactly the condition that the dimensions of the matrices T_i add up: m = m_1 + ⋯ + m_n, and the condition that ∩_i W_i = {0} is similarly the condition that they share no elements that are mapped to zero simultaneously. Equivalently, it is the condition that the stacked matrix

T = [T_1; T_2; …; T_n]

(the blocks T_i stacked vertically) has ker(T) = {0} or is, in other words, invertible.

There are all sorts of additional simple consequences of this view. For example, this means that the rows of the T_i must all be linearly independent; or, given a system with outputs over V, the linear measurements T_1, …, T_n can reconstruct the output unambiguously.
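As a tiny concrete version of this measurement picture (my own example, not from the text): two 1×2 measurement rows whose kernels intersect only at zero stack into an invertible 2×2 matrix, so the pair of measurements determines x uniquely.

```python
from fractions import Fraction as Fr

# Illustration (mine): two 1x2 "measurement" maps T1, T2 whose kernels
# intersect only at 0.  Stacking them gives an invertible 2x2 matrix T,
# so the pair of measurements determines x uniquely.
T1 = [Fr(1), Fr(1)]    # ker T1 = span{(1, -1)}
T2 = [Fr(1), Fr(-1)]   # ker T2 = span{(1, 1)}; the intersection is {0}

def measure(row, x):
    return sum(r * xi for r, xi in zip(row, x))

x = [Fr(3), Fr(-2)]
y1, y2 = measure(T1, x), measure(T2, x)

# invert the stacked matrix T = [T1; T2] explicitly (2x2 case, Cramer's rule)
det = T1[0] * T2[1] - T1[1] * T2[0]
assert det != 0                      # ker T = {0}  <=>  T invertible
x_rec = [(T2[1] * y1 - T1[1] * y2) / det,
         (T1[0] * y2 - T2[0] * y1) / det]
assert x_rec == x                    # the measurements reconstruct x
```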

As a final exercise: what’s the interpretation of T and the dimension conditions in the case that the subspaces do not satisfy ∩_i W_i = {0}?


Footnotes

  1. A “gentleman’s proof” is one which is not basis-dependent. There is such a proof of this statement: if TU = {Tu | u ∈ U} were a proper subspace of W, it would have at least one (nonzero) element x ∈ W ∖ TU, and this element generates a subspace of dimension at least 1, making TU ⊕ span{x} ⊆ W, so dim(U) = dim(TU) < dim(W) since T is injective.