Understanding Fast Fourier Transform from scratch – to solve Polynomial Multiplication.

Fast Fourier Transform is a widely used algorithm in Computer Science. It is also generally regarded as difficult to understand. I have spent the last few days trying to understand the algorithm – how it works and why. In the process I digressed to various other mathematical topics to build a complete understanding ground up. Here, I have tried to collate and document everything I have learnt. In this blog, we will use FFT (Fast Fourier Transform) to solve the problem of quickly multiplying two polynomials. I must warn that this is a long (and tedious?) post. But, you should be able to read and understand everything from start to end without having to look anything up.

PrerequisitesYou do not need to know Fourier Transform to understand this blog. However, a basic understanding of the following is required –

  • Polynomials
  • Complex Numbers
  • Trigonometry
  • Matrix Multiplication
  • Algorithm Complexity Analysis (Big O notation) – You are free to skip these parts and it shouldn’t affect the understanding of working of the algorithm.

Problem – Given two polynomials- A(x) =a0+a1x+a2x2+… anxn  and B(x) =b0+b1x+b2x2+… bnxn, find the polynomial C(x) = A(x)*B(x).

E.g., multiply (x^2 + 2x + 3)(2x^2 + 5) = 

			2x^2 + 0x + 5
			1x^2 + 2x + 3
                        -------------
                        6      0   15
                 4      0      10
      2     0      5
         ----------------------------
          2x^4 + 4x^3 + 11x^2 +10x+15

More generally, when C(x) = A(x)*B(x) and C(x) =c0+c1x+c2x2+… c2nx2n .

                   ci = a0*bi + a1*bi-1 + ... + ai*b0.

If we think of A and B as vectors, then C vector is called ‘Convolution‘ of A and B (represented as {A⊗B}). Make sure that the polynomials A and B are appropriately appended with 0s to make each of their degree equal to 2n (degree of C).  Straightforward computation is O(n^2) time

Fourier Transform and Convolution Theorm

Fourier analysis is the decomposition of any mathematical function into a series of sine and cosine waves (sinusoids). If you wish to learn more, here are some links that help build the intuition – a quora answer on fft, an interactive guide to Fourier Transform.

We generally represent a function in its Fourier Transform form, as it is easier to work with.

The Convolution Theorm  : states that under suitable conditions the Fourier transform of a convolution (denoted by ‘⊗’) is the point-wise product of Fourier transforms. Let f and be two functions with convolution  {f⊗ g}. Fourier transform be denoted by the operator ‘ζ’ so ζ(f) and ζ(g) are the Fourier transforms of f and g, respectively. –

                 ζ(f⊗g) = ζ(f).ζ(g)

The convolution operation is converted to point-wise multiplication (we will discuss what point-wise multiplication means).

Representation of Polynomials

The representation of Polynomial we saw above is called the coefficient representation. Another way of representing a polynomial is called point-value representation. A polynomial of degree n-1 can be uniquely represented by its values at at-least n different points. This is a result of the Fundamental Theorm of Algebra that states – “A degree n-1 polynomial A(x) is uniquely specified by its evaluation at n distinct values of x”.

A polynomial A(x) =a0+a1x+a2x2+… anxn can be represented as

A set of n pairs {(x0, y0),(x1, y1),...,(xn, yn)} such that 
• for all i != j, xi != xj. ie, the points are unique. 
• for every k, yk = A(xk);

In point-value form, multiplication 𝐶(𝑥) = 𝐴(𝑥)𝐵(𝑥) is given by 𝐶(𝑥𝑘) = 𝐴(𝑥𝑘) .𝐵(𝑥𝑘) for any point (𝑥𝑘).

If 𝐴 and 𝐵 are of degree-bound 'n', then 𝐶 is of degree-bound '2n'.
Need to start with “extended” point-value forms for 𝐴 and 𝐵 consisting of 
2𝑛 point-value pairs each. – 

𝐴: (𝑥0, 𝑦0) , (𝑥1, 𝑦1) , … ,( 𝑥2𝑛−1, 𝑦2𝑛−1) 
𝐵: (𝑥0, 𝑦0')  , (𝑥1, 𝑦1 ′) , … , (𝑥2𝑛−1, 𝑦2𝑛−1 ′) 
𝐶: (𝑥0, 𝑦0𝑦0 ′) ,( 𝑥1, 𝑦1𝑦1 ′) , … ,( 𝑥2𝑛−1, 𝑦2𝑛−1𝑦2𝑛−1 ′)

For example,
𝐴(𝑥) = 𝑥 3 − 2𝑥 + 1
𝐵 𝑥 = 𝑥 3 + 𝑥 2 + 1
𝑥𝑘 = (−3, −2, −1, 0, 1, 2, 3) We need 7 coefficients!
𝐴: (−3,−17) ,( −2,−3) ,( −1,1) ,( 0, 1) ,( 1, 0) , (2, 5) , (3, 22)
𝐵:  (−3,−20) ,(−2,−3) , (−1, 2) ,( 0, 1) ,( 1, 3) ,( 2, 13) ,( 3, 37)
𝐶: (−3,340) , (−2,9) , (−1,2) ,( 0, 1) ,( 1, 0) ,( 2, 65) ,( 3, 814)

This is point-wise multiplication. The time to multiply two polynomials of degree-bound n in point-value form is Θ(n).The inverse of evaluation–determining the coefficient form of a polynomial from a point-value representation–is called interpolation.  This is what the Convolution Theorm states with respect to Fourier Transform of a function.

The Road so far

screen-shot-2017-02-23-at-12-18-22-pm

Now the efficiency of our algorithm depends on the efficiency of conversion between the two representation – screen-shot-2017-02-23-at-12-20-02-pm
This is where we use the FFT algorithm.

An Overview of the Algorithm

Let m = 2n-1. [so degree of C is less than m]
1. Pick m points x_0, x_1, ..., x_{m-1} chosen cleverly.
2. Evaluate A at each of the points: A(x_0),..., A(x_{m-1}).
3. Same for B.
4. Now compute C(x_0),..., C(x_{m-1}), where C is A(x)*B(x)
5. Interpolate to get the coefficients of C.

At a glance, it looks like points 2 and 3 from the Algorithm takes O(n2) time. However, the FFT will allow us to quickly move from coefficient representation of polynomial to the point-value representation, and back, for our cleverly chosen set of m points. (Doesn’t work for *arbitrary* m points. The special points will turn out to be roots of unity).

A Detour to Roots of Unity

The word “root” in the term refers to square roots, cube roots, and any other roots you might happen to need. For any integer n, the nth root of a number k is a number that, when multiplied by itself n times, yields k. The word “unity,” means “one.” So a root of unity is any number which, when multiplied by itself some number of times, yields 1. To get any root of unity, other than 1, we need to delve into Complex Numbers.

We can think of the set of complex numbers as a 2-dimensional plane. We specify a complex number with two coordinates the same way we would on a graph: the point (1,1) refers to the number 1+i. If we were standing on the point (0,0), we probably wouldn’t want to walk one unit over and then one unit up to get to the point (1,1). Instead, we’d take a diagonal by walking √2 units at an angle of 45 degrees to the x-axis. Notationally, when we use this radial distance and direction, we write a complex number as rei θ, where r is the distance and θ is the direction, usually written in radians rather than degrees. There are 2π radians in a whole circle, so the number 1 is written ei 2π. The angle 45 degrees is π/4 radians, so the point (1,1) that we found above would be written as √2ei π/4.

screen-shot-2017-02-23-at-12-41-40-pm
This method of specifying a point with a length and a direction is called using polar coordinates.
If a point is represented as re, the x-co-ordinate(real part) is given by rcosθ, and the y-co-ordinate (imaginary part) is given by rsinθ. This follows from basic trigonometric identities.

  polar co-ordinates : re
  cartesian co-ordinates :(rcosθ, rsinθ)

Multiplication with the polar notation of re, is really easy: you multiply the distances together and add the angles. So 5eiπ/6× 2eiπ/3=10 ei π/2. So multiplying involves both expansion or contraction (that’s the distance part) and rotation (that’s the angle part).

If we have a number written rei θ which, when multiplied by itself a certain number of times, yields 1, the distance r itself must be 1. If r were larger than 1, say 2, then as we multiplied the number by itself more and more times, its distance from (0,0) would go from 2 to 4 to 8 and on and on, spiralling out to infinity. If r were smaller than 1, say 1/2, the point would spiral in to 0: 1/2, 1/4, 1/8, and so on. 1 is the only radial distance that will stay perfectly balanced, just moving around the circle as we multiply the numbers together.

To make this more concrete, I happen to know that ei2π/7 is a root of unity. When I raise it to the 7th power, I get ei2π, which is 1. Each time I multiply ei2π/7 by itself (product =ei(2*2π/7)) , I rotate 1/7 of the way around the circle. In fact, as I multiply ei2π/7 by itself successively, I get this picture –

Screen Shot 2017-02-23 at 12.48.04 pm.png

In fact, there are seven 7th roots of unity, and each gold disc in that picture is one of them. We can get an nth root of unity for any number n by replacing the 7 in ei 2π/7 by n. The pictures for other roots of unity look a lot like that diagram above, they just have a different number of gold discs.

There are exactly n complex nth root of unity e2𝜋𝑖𝑘/𝑛 for k = 0, 1, … , 𝑛 − 1 where
e𝑖𝑢 = cos (u) + 𝑖 sin(u). Here ‘u’ represents an angle in radians.

Visualising 8th roots of unity-
screen-shot-2017-02-23-at-1-13-21-pm

Some Interesting Properties of Roots of Unity

  1. Principal nth Root of Unity
    • The value ω𝑛 = e2𝜋𝑖/𝑛 is called the principal nth root of unity.
    • All of the other complex nth roots of unity are powers of 𝜔𝑛.
    • The n complex nth roots of unity, 𝜔𝑛0 , 𝜔𝑛1 , … , ,𝜔𝑛n-1  form a group under multiplication that has the same structure as (ℤn, +) modulo n.
    • 𝜔𝑛𝑛 = 𝜔𝑛0 = 1 implies
      – 𝜔𝑛𝑗𝜔𝑛𝑘 = 𝜔𝑛𝑗+𝑘 = 𝜔𝑛𝑗+𝑘 mod 𝑛
      𝜔𝑛-1 = 𝜔𝑛n-1
  2. Cancellation Lemma 
    • For any integers n ≥ 0, k ≥ 0, and b > 0, 𝜔𝑑𝑛𝑑𝑘 = 𝜔𝑛𝑘.
    • Proof : 𝜔𝑑𝑛𝑑𝑘 = (𝑒2𝜋𝑖/𝑑𝑛)𝑑𝑘 = 𝑒2𝜋𝑖/n𝑘 = 𝜔𝑛𝑘
    • For any even integer n > 0, 𝜔𝑛𝑛/2 = 𝜔2 = −1.
    • Example 𝜔246 = 𝜔4
      – 𝜔246 = (𝑒2𝜋𝑖/24)6 = 𝑒2𝜋𝑖*6/24 = 𝑒2𝜋𝑖/4 = 𝜔4
  3. Halving Lemma

    • If n > 0 is even, then the squares of the complex nth roots of unity are the n/2 complex n/2th roots of unity.
    • Proof :  By the cancellation lemma, we have( 𝜔𝑛 𝑘) 2 = 𝜔𝑛/2 𝑘 for any non-negative integer, 𝑘.
    • If we square all of the complex nth roots of unity, then each n/2th root of unity is obtained exactly twice .
      – (𝜔𝑛𝑘+𝑛/ 2)2 = 𝜔𝑛2𝑘+𝑛 = 𝜔𝑛 2𝑘𝜔𝑛𝑛 = 𝜔𝑛2𝑘 = (𝜔𝑛 𝑘)2
      –Thus, 𝜔𝑛𝑘 and 𝜔𝑛𝑘+𝑛/2 have the same square.
As we will see, the fast Fourier transform algorithm cleverly makes 
use of the following properties about ωn:
   𝜔𝑛n =1
   𝜔𝑛n+𝑘 = 𝜔𝑛𝑘
    𝜔𝑛n/2 = -1
   𝜔𝑛n/2+𝑘 = -𝜔𝑛𝑘

Discrete Fourier Transform

Since we have broken up the problem, our goal now is to evaluate a given polynomial, A (x)(degree <m) at m points of our choosing in total time O(m log m). Assume m is a power
of 2.

Let’s first develop it through an example.

Say m=8, so we have a polynomial
A(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5 + a_6x^6 + a_7x^7.
(as a vector, A = [a_0, a_1, …, a_7])

And we want to evaluate at eight points of our choosing. Here is an idea. Split A into two pieces, but instead of left and right, have them be even and odd. So, as vectors,
A_even = [a_0, a_2, a_4, a_6]
A_odd = [a_1, a_3, a_5, a_7]

or, as polynomials:

A_even(x) = a_0 + a_2 x + a_4 x^2 + a_6 x^3
A_odd(x) = a_1 + a_3 x + a_5 x^2 + a_7 x^3.

Each has degree < m/2. How can we write A(x) in terms of A_even and A_odd?
A(x) = A_even(x^2) + x A_odd(x^2).
A(-x) = A_even(x^2) – x A_odd(x^2).

Now, instead of calculating the value for 1 polynomial of degree m, we now need to calculate 2 polynomials of degree (m/2) and then combine them.

Let T(m) = time to evaluate a degree-m polynomial at our special set of m points. We’re doing this by evaluating two degree-m/2 polynomials at m/2 points each (the squares), and then doing O(m) work to combine the results. This is great because the recurrence T(m) = 2T(m/2) + O(m) solves to O(m log m).

What’s nice is that the effort spent computing A(x) will give us A(-x) almost for free. So, we need to specially choose m points that will have the property:

The 2nd half of points are the negative of the 1st half   (***)

E.g., {1,2,3,4,-1,-2,-3,-4}.

But, we’re deluding ourselves by saying “just do it recursively”. Why is that? The problem is that recursively, our special points (now {1, 4, 9, 16}) have to satisfy property (***). E.g., they should really look like {1, 4, -1, -4}. BUT THESE ARE SQUARES!! How to fix? Just use complex numbers! E.g., if these are the squares, what do the original points look like?

{1, 2, i, 2i, -1, -2, -i, -2i}

so then their squares are: 1, 4, -1, -4
and their squares are: 1, 16

But, at the next level we again need property (***). So, we want to have {1, -1} there. This means we want the level before that to be {1, i, -1, -i}, which is the same as {1, i, i2, i3}.

So, for the original level, let ω = the primitive 8th root of unity, and then our original set of points will be:

1,ω ,ω 234 (= -1),ω5 (= -ω ),ω6 (= -ω2),ω7 (= -ω3)

so that the squares are: 1, i, i^2 (= -1), i^3 (= -i)
and *their* squares are: 1, -1
and *their* squares are: 1

In general, the mth primitive root of unity is the vector
ω  = cos(2*pi/m) + i*sin(2*pi/m)

 Fast Fourier Transform (FFT) 

• The problem of evaluating 𝐴(𝑥) at 𝜔𝑛0 , 𝜔𝑛1 , … , 𝜔𝑛𝑛−1 reduces to 
   1. evaluating the degree-bound 𝑛/2 polynomials 𝐴even(𝑥) and 𝐴odd(𝑥) at the 
      points (𝜔𝑛0)2 ,(𝜔𝑛1)2 , … , (𝜔𝑛𝑛−1)2. 
   2. combining the results by 𝐴(𝑥) = 𝐴even(𝑥2) + 𝑥𝐴odd(𝑥2). 

• Why bother? 

– The list (𝜔𝑛0)2 ,(𝜔𝑛1)2 , … , (𝜔𝑛𝑛−1)2 does not contain 
  𝑛 distinct values, but 𝑛/2 complex 𝑛/2th roots of unity. 
– Polynomials 𝐴even and 𝐴odd are recursively evaluated at the 𝑛/2 complex 
  𝑛/2th roots of unity. 
– Subproblems have exactly the same form as the original problem, but are 
  half the size

Example
A(x) = 3+2x+3x2+4x3
A(ω40) = A(1) = 3+2+3+4=12
A(ω41) = A(i) = 3+2i-3+4i=6i
A(ω42) = A(-1) = 3-2+3-4=0
A(ω43) = A(-i) = 3-2i-3+4i=2i

Using A(x)= Aeven(x2) + xAodd(x2)
Aeven(x) = 3+3x
Aodd(x) = 2+4x
A(ω40)= A(1) = Aeven(1)+1.Aodd(1)= 3+3+2+4=12
A(ω41)= A(i) = Aeven(-1)+i.Aodd(-1)=3-3+2i+4i=6i
A(ω42)= A(-1) = Aeven(1)-1.Aodd(1)= 3+3-2-4=0
A(ω43)= A(-i)= Aeven(-1)-i.Aodd(-1)=3-3-2i+4i= 2i

Algorithm
Here is the general algorithm in pseudo-C:

Let A be array of length m, w be primitive mth root of unity.
Goal: produce DFT F(A): evaluation of A at 1, w, w^2,...,w^{m-1}.
FFT(A, m, w)
{
  if (m==1) return vector (a_0)
  else {
    A_even = (a_0, a_2, ..., a_{m-2})
    A_odd  = (a_1, a_3, ..., a_{m-1})
    F_even = FFT(A_even, m/2, w^2)//w^2 is a m/2-th root of unity
    F_odd = FFT(A_odd, m/2, w^2)
    F = new vector of length m
    x = 1
    for (j=0; j < m/2; ++j) {
      F[j] = F_even[j] + x*F_odd[j]
      F[j+m/2] = F_even[j] - x*F_odd[j]
      x = x * w
  }
  return F
}

Note : To apply this algorithm, ‘m’ should be a power of 2.We should pad
extra zeros to the polynomial to ensure that length of the array is a
power of 2. This algorithm returns an array of complex numbers which is
then used for point-wise multiplication.
Here is the tree of input vectors to the recursive calls of the FFT procedure. The initial invocation is for n = 8.

screen-shot-2017-02-23-at-5-25-53-pm

Inverse Fourier Transform

Once we perform point-wise multiplication on the Fourier Transforms of A and B and get C (an array of Complex Numbers), we need to convert it back to coefficient form to get the final answer.

We use the Inverse Fourier Transform to get the coefficient form of C (Don’t worry about the imaginary part of the complex number. Once the Inverse Fourier Transform is performed, we get an array of Complex Numbers with the imaginary part almost equal to 0. Hence ,it can be ignored.

What we have actually done so far (computing A(x) at 1, ω,ω2, …,ω{m-1}) can be represented  as-
Screen Shot 2017-02-23 at 7.04.09 pm.png
This can be represented in the matrix form as –

screen-shot-2017-02-23-at-7-01-17-pm

Lets use F to denote the matrix of primitive roots. In order to retrieve the coefficients, we need to get F-1(inverse of matrix F) , and multiply it with the y- matrix. It turns out that F-1 looks the same. In place of ω, the inverse Fourier Matrix uses ω-1.

if ω = a+ib   => ω-1 = a-ib
if ω = e2Πi/k  => ω-1 = e-2Πi/k

so, F-1
screen-shot-2017-02-23-at-7-09-33-pm

Based on the above observation, we can still apply FFT by replacing a with y, y with a, ωn with ω n−1 (that is, ωnn-1 ), and scaling the result by 1/n .

So, the final algorithm is:

    Let F_A = FFT(A, m, ω)                        // time O(n log n)
    Let F_B = FFT(B, m, ω)                        // time O(n log n)
    For i=1 to m, let F_C[i] = F_A[i]*F_B[i]      // time O(n)
    Output C = 1/m * FFT(F_C, m, ω-1). // time O(n log n)

I will be adding the full Java code for polynomial multiplication in a couple of hours.

References : 
           http://www.cs.cmu.edu/afs/cs/academic/class/15451-s10/www/lectures/lect0423.txt
           https://www.cs.princeton.edu/~wayne/teaching/fft.pdf
           https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-046j-design-and-analysis-of-algorithms-spring-2012/lecture-notes/MIT6_046JS12_lec05.pdf
           http://web.cs.iastate.edu/~cs577/handouts/polymultiply.pdf
           http://web.cecs.pdx.edu/~maier/cs584/Lectures/lect07b-11-MG.pdf
           https://blogs.scientificamerican.com/roots-of-unity/what-are-roots-of-unity/

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s