Longest Path Search

The Chinese remainder theorem and the Fourier transform

Posted

I’d recommend skimming the previous post if you have not yet done so, which discusses the Chinese remainder theorem in a purely linear-algebraic light, along with a bunch of notions we use. It’s neat, though we don’t say anything that is terribly surprising.

That’s, instead, what this post is for.

As a quick reminder, the somewhat abstract-nonsense-y Chinese remainder theorem is the following: let V be a vector space over a field 𝐅 and let W1,,WkV be subspaces of V. If, dimensionally, we have

dim(V/iWi)=idim(V/Wi),

(where all dimensions are defined) then we can decompose V in the following way

V/iWii(V/Wi),

where the invertible map π:V/iWii(V/Wi) is the product π=π1××πk of natural maps into V/Wi:

πj(x+iWi)=x+Wj,

for j=1,,k.

The abstract Fourier transform

Ok, with that, we can start with the “general” Fourier transform over fields.

Let V be the vector space of polynomials over some field 𝐅 and let pV be any nonzero polynomial which decomposes into p=q1q2qk with q1,,qkV mutually coprime. Define Wp to be the set of all polynomials in V that are divisible by p. This means, from the previous post, that we have the equivalence:

V/WpiV/Wqi

where the coordinate-wise map is the “obvious” one:

πi(x+Wp)=x+Wqi.

which is (again!) the Chinese remainder theorem over the {qi}.

Now, if p splits over 𝐅 and has no repeated root, that is, if p satisfies

p(x)=α0(xα1)(xα2)(xαn)

for α0,,αn𝐅 with α1,,αn distinct, then we can set qi(x)=xαi for i=1,,n. (Easy check: what must n be here, in terms of p?) Note that the qi are indeed mutually coprime and iWqi=Wp, so the conditions are satisfied and, from before, we have

V/WpiV/Wqi,

via the simple map above. Finally, note that dim(V/Wqi)=deg(qi)=1 for each i, so V/Wqi can be described by exactly one field element 𝐅. Indeed, from the previous post, one such description is that fV/Wqi is equivalent to taking the remainder of f modulo qi, which is simply the value of f at αi.1

In other words, there is a simple invertible linear mapping between a polynomial (modulo p) and its evaluation on points α1,,αn, which are the roots of p. That is to say for each polynomial f modulo p, there is an equivalence ff^ between the polynomial modulo p, f+Wp, and its evaluations f(α1),,f(αn).

This leads us to the last appetizer course.

The abstract convolution theorem

This is almost an immediate application of the above. In particular, let f,gV be two polynomials, then we have that

fg+Wp((fg)(α1),,(fg)(αn))=(f(α1)g(α1),,f(αn)g(αn)).

Or, in polynomial notation:

fgmodp(f(α1)g(α1),,f(αn)g(αn)),

and the map is exactly the evaluation of the product at the points α1,,αn.

Again, note on the left hand side we are doing multiplication as polynomials. That is to say, we are convolving the coefficients of f and g and then reducing them modulo p, whereas on the right, we are doing pointwise multiplication of the evaluations of f and g over the points αi.

Indeed, this is exactly where the structure of p splitting over 𝐅 is useful. Note that the original abstract Chinese remainder theorem requires that iWqi=Wp. Now the dimensions of the map V/WpiV/Wqi, of course, make sense without this requirement when deg(p)=n, so we in general have a relationship between f modulo p and its evaluations over the αi, but the “natural” quotient map x+Wpx+Wqi only makes sense when WpWqi, or, equivalently, when qi divides p. So, if we were to take fg+Wp, then the output of such a map will not necessarily be of the form fg+Wqi, so the abstract convolution theorem above would not hold.

The usual discrete Fourier transform (DFT)

Ok, now let’s finally get to the “usual” DFT. Let α1,,αn be the nth roots of unity over some field 𝐅, such that they are the roots of the polynomial p(x)=xn1. That is to say, the polynomial p splits over 𝐅, with the roots of unity {αi} as its factors. (This goes both ways: αi is an nth root of unity over some field if, and only if, it is a root of xn1 over this field.) We can also take any nth primitive root of unity ω (that is, one in which ωi1 when i=1,,n1) and write αi=ωi for i=1,n up to relabelling of the indices.

Note that the field 𝐅 can be anything that has nth roots of unity—we have made no assumptions about the field anywhere other than 01, implicitly, in our proof. For example, taking our field 𝐅=𝐂, the complex numbers, we could have, for any n, αi=exp(𝐢2πi/n), where 𝐢2=1. On the other hand, in the Fermat prime field |𝐅|=216+1, we can take αi=3imod216+1 and n=216. Any of these (and many more, of course!) are perfectly valid options.

Now, let’s go back to our equivalence.

We previously said, taking p(x)=xn1, that

fmod(xn1)(f(α1),,f(αn)),

is invertible over 𝐅. This is the usual (discrete) Fourier transform. In particular, we may take fmod(xn1) to be of degree n1 since any element of degree mn will have xm=xmmodn. The mapping is just the “forward” Fourier transform:

f(αi)=j=1nfjαij1,

where f1,,fn are the n coefficients of fmod(xn1). In terms of a primitive nth root ω, we have that αi=ωi (again, up to relabeling of the {αi}) so

f(ωi)=j=1nωi(j1)fj.

This is the standard DFT equation.

Circular convolution to products

Now, let f,gV both be polynomials, then we also have, from our more abstract version of the convolution theorem above, that

fgmod(xn1)(f(α1)g(α1),,f(αn)g(αn)).

As mentioned before, the multiplication fg is multiplication as polynomials. That is, the coefficients of the polynomial fg are the coefficients of the polynomial f convolved with those of g, then reduced by xn1. But this reduction simply maps all terms xmxmmodn. In other words, the coefficients of fgmod(xn1) are exactly the coefficients resulting from a circular convolution of the coefficients of f with those of g !

Equivalently, this gives the “vectorial form”, when f and g are assumed to be of degree n1:

fgf^g^.

Here, we write fg for the circular convolution of the coefficients of f and g and write f^i=f(αi) and similarly for g, with being the elementwise (Hadamard) product. From before, the forward map between the convolution and the product is the evaluation map ff^. (A simple exercise is to write out what the matrix F𝐅n×n such that f^=Ff looks like!)

The “additive” Fourier transform

Finally, we note that there are a bunch of other possible types of Fourier transform, including the so-called “additive” Fourier transform, which works over fields of characteristic two. (It really works for any finite field of nonzero characteristic, but small-characteristic fields are the most practical; we’ll go through the characteristic-two case here and leave the rest as an exercise.) This type of transform is incredibly useful in a number of succinct proofs, see, e.g., Binius, ZODA, or Ligerito.

Let 𝐅 be a finite field with |𝐅|=2m for some positive integer m, then note that 𝐅 is a vector space over 𝐅2={0,1} of dimension m. (Why?) Let e1,,em be any basis of this vector space and let nm. In what follows, 2n will be the number of evaluation points of the polynomial f modulo a particular polynomial p we will choose carefully.

With that, let p be a polynomial that vanishes on any x which is in the span of e1,,en (not m) over 𝐅2. Call this the 𝐅2-vector space A. Explicitly, p can be, for example

p(x)=β{0,1}n(xβ1e1βnen)=αA(xα).

so kerp=A. Note that p splits exactly over all points in A (and therefore splits over 𝐅) satisfying the conditions for the abstract Fourier transform above.

From our abstract Fourier transform theorem above, this has the immediate implication that there is an invertible map, for any polynomial f over 𝐅 between

f+Wp(f(α))αA,

and this map is the obvious one: f+Wpf(α) for each αA. (Quick one: why is this injective, again?) This looks just like a simple application of the above, but it has some very special structure.

A teensy dessert

As a little bit of a cliffhanger, it is not hard to show by induction that p satisfies p(x+y)=p(x)+p(y) for any x,y𝐅, as does any polynomial of that form.

We can then split A into two disjoint cosets A=A(u+A), for some u𝐅 and some AA which also a vector space over 𝐅2. Set q to be the polynomial that vanishes over A, in the same way as p above, then p(x)=q(x)q(xu)=q(x)(q(x)+q(u)), since ab=a+b in characteristic two. From before, q and q+q(u) split over 𝐅, yet share no roots since they vanish over A and u+A respectively, which share no common elements. This means they are mutually coprime and, by the abstract Chinese remainder theorem above, we have

f+Wp(f+Wq,f+Wq+q(u)).

Of course, we can continue decomposing A and therefore q itself into the product of two polynomials, mutually coprime, and so on…

Which suggests, maybe, that there’s a faster algorithm for evaluating (f(α))αA, than just naively evaluating f at every point of A.

And, maybe, just maybe, a very similar observation also holds for the “usual” Fourier transform.

Until next time!


Footnotes

  1. There is a “nearly-purely-linear-algebraic proof” instead by noting that the evaluation-at-αi map V𝐅 has kernel containing Wqi since any fWqi evaluates to zero under this map (why?). We can therefore consider it as a map V/Wqi𝐅, but the map is also not zero everywhere (since obviously the constant polynomial 1V is not zero at αi) and so must be surjective—and therefore invertible—by dimension counting since dim(V/Wqi)=1=dim(𝐅).