In this Jupyter notebook, I will try to understand this phenomenon.
import numpy as np
np.version.full_version
Let us consider the polynomials $P(X) = 1 + X + X^3$ and $Q(X) = X^2 + X^5$, of degrees $n=3$ and $m=5$. Coefficients are stored in NumPy's convention, highest degree first:
P = [ 1, 0, 1, 1]
n = len(P) - 1
P, n
Q = [1, 0, 0, 1, 0, 0]
m = len(Q) - 1
Q, m
Their product is $(PQ)(X)$, of degree $n+m=8$: $$ \begin{align*}(PQ)(X) &= (1 + X + X^3) (X^2 + X^5) \\ &= X^2 + X^5 + X^3 + X^6 + X^5 + X^8 \\ &= X^2 + X^3 + 2 X^5 + X^6 + X^8 \end{align*} $$
PQ = np.polymul(P, Q)
d = len(PQ) - 1
PQ, d
If we evaluate both $P(X)$ and $Q(X)$ on $n+m+1$ different points $\lambda_i \in \mathbb{R}$ (or $\in\mathbb{N}$), then we can fit a polynomial of degree $n+m = \deg(PQ)$ on these sampling points, and by uniqueness of the interpolating polynomial (two polynomials of degree at most $n+m$ that agree on $n+m+1$ points are equal, since their difference would otherwise have more roots than its degree), it will be equal to $(PQ)(X)$.
The fit can be obtained, for instance, by Lagrange interpolation, which is not so efficient but easy to implement.
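For illustration, here is a minimal sketch of such a Lagrange interpolation (a hypothetical helper of mine, lagrange_interp, not used in the rest of this notebook), following NumPy's highest-degree-first convention:
def lagrange_interp(xs, ys):
    # Coefficients (highest degree first) of the unique polynomial
    # of degree < len(xs) through the points (xs[i], ys[i])
    d = len(xs) - 1
    coeffs = np.zeros(d + 1)
    for i in range(d + 1):
        # i-th Lagrange basis polynomial: prod_{j != i} (X - xs[j]) / (xs[i] - xs[j])
        basis = np.poly([xs[j] for j in range(d + 1) if j != i])
        denom = np.prod([xs[i] - xs[j] for j in range(d + 1) if j != i])
        coeffs += ys[i] * basis / denom
    return coeffs
lagrange_interp([0, 1, 2], [1, 2, 5])  # recovers X^2 + 1, i.e. [1., 0., 1.]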
Here, I will simply use the numpy.polyfit function.
assert d == n + m
Let us consider the points $\lambda_i = i$, $i=0,\dots,n+m$.
lambdas = np.arange(0, d + 1)
lambdas
values_P = np.polyval(P, lambdas)
values_P
values_Q = np.polyval(Q, lambdas)
values_Q
values_PQ = values_P * values_Q
values_PQ
PQ_sampled = np.polyfit(lambdas, values_PQ, d)
PQ_sampled
PQ
np.asarray(np.round(PQ_sampled), dtype=int)
Ok, at least it seems to work!
But we saw that even with very small degrees ($n=3, m=5$), floating point errors were not so small on these poorly chosen points $\lambda_i = i$, $i=0,\dots,n+m$. The largest "should be 0" (i.e., $\simeq 0$) value was:
np.max(np.abs(PQ_sampled)[np.abs(PQ_sampled) < 0.9])
Let us consider the points $\lambda_k = \cos\left(\frac{2k-1}{2(d+1)} \pi\right)$, $k=1,\dots,d+1=n+m+1$. These are called the Chebyshev nodes.
lambdas = np.cos(np.pi * (2 * np.arange(1, d + 2) - 1) / (2 * (d + 1)))
lambdas
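As a quick sanity check (my addition): these $d+1$ nodes must all be distinct, otherwise the interpolation problem would be ill-posed.
assert len(np.unique(lambdas)) == len(lambdas)  # d + 1 distinct sample points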
values_P = np.polyval(P, lambdas)
values_P
values_Q = np.polyval(Q, lambdas)
values_Q
values_PQ = values_P * values_Q
values_PQ
PQ_sampled2 = np.polyfit(lambdas, values_PQ, d)
PQ_sampled2
PQ
np.asarray(np.round(PQ_sampled2), dtype=int)
Ok, at least it seems to work!
But we saw that even with very small degrees ($n=3, m=5$), floating point errors were not so small on these points: the largest "should be 0" (i.e., $\simeq 0$) value was:
np.max(np.abs(PQ_sampled2)[np.abs(PQ_sampled2) < 0.9])
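To compare the two choices of sampling points directly (my addition), we can look at the largest coefficient-wise error of each fit against the exact product PQ:
np.max(np.abs(PQ_sampled - PQ)), np.max(np.abs(PQ_sampled2 - PQ))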
Out of curiosity, let us check whether our naive implementation of $(P, Q) \mapsto PQ$ by evaluation-and-interpolation is more or less efficient than numpy.polymul:
def mypolymul(P, Q):
    n = len(P) - 1
    m = len(Q) - 1
    d = n + m
    # Evaluate P and Q on d + 1 distinct Chebyshev nodes
    lambdas = np.cos(np.pi * (2 * np.arange(1, d + 2) - 1) / (2 * (d + 1)))
    values_P = np.polyval(P, lambdas)
    values_Q = np.polyval(Q, lambdas)
    values_PQ = values_P * values_Q
    # Interpolate a polynomial of degree d through the sampled products
    PQ_sampled = np.polyfit(lambdas, values_PQ, d)
    # return PQ_sampled
    # Rounding assumes P and Q have integer coefficients
    return np.asarray(np.round(PQ_sampled), dtype=int)
np.polymul(P, Q)
mypolymul(P, Q)
import warnings
warnings.simplefilter('ignore', np.RankWarning)
%timeit np.polymul(P, Q)
%timeit mypolymul(P, Q)
Of course, our implementation is slower. But on small polynomials, it is not that much slower.
What about larger polynomials?
def random_polynomial(d=10, maxcoef=1):
    # degree-d polynomial with random integer coefficients in [-maxcoef, maxcoef]
    return np.random.randint(low=-maxcoef, high=maxcoef + 1, size=d + 1)
P = random_polynomial()
Q = random_polynomial()
P, Q
%timeit np.polymul(P, Q)
np.polymul(P, Q)
%timeit mypolymul(P, Q)
mypolymul(P, Q)
assert np.all(np.polymul(P, Q) == mypolymul(P, Q))
On a larger example:
d = 100
maxcoef = 1
%timeit np.polymul(random_polynomial(d=d, maxcoef=maxcoef), random_polynomial(d=d, maxcoef=maxcoef))
%timeit mypolymul(random_polynomial(d=d, maxcoef=maxcoef), random_polynomial(d=d, maxcoef=maxcoef))
P, Q = random_polynomial(d=d, maxcoef=maxcoef), random_polynomial(d=d, maxcoef=maxcoef)
assert np.all(np.polymul(P, Q) == mypolymul(P, Q))
d = 10
maxcoef = 3
%timeit np.polymul(random_polynomial(d=d, maxcoef=maxcoef), random_polynomial(d=d, maxcoef=maxcoef))
%timeit mypolymul(random_polynomial(d=d, maxcoef=maxcoef), random_polynomial(d=d, maxcoef=maxcoef))
P, Q = random_polynomial(d=d, maxcoef=maxcoef), random_polynomial(d=d, maxcoef=maxcoef)
assert np.all(np.polymul(P, Q) == mypolymul(P, Q))
d = 10
maxcoef = 50
%timeit np.polymul(random_polynomial(d=d, maxcoef=maxcoef), random_polynomial(d=d, maxcoef=maxcoef))
%timeit mypolymul(random_polynomial(d=d, maxcoef=maxcoef), random_polynomial(d=d, maxcoef=maxcoef))
P, Q = random_polynomial(d=d, maxcoef=maxcoef), random_polynomial(d=d, maxcoef=maxcoef)
assert np.all(np.polymul(P, Q) == mypolymul(P, Q))
Our method is slower. And it can even be wrong: np.polyfit solves an ill-conditioned linear system in floating point, so for larger degrees or larger coefficients the rounded result can differ from the exact product.
That's sad.
Naively implementing the multiplication of 1D polynomials by evaluation-and-interpolation does not work.
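As a side note, here is a minimal sketch (my addition) of the non-naive version of the same idea: if the sample points are the complex roots of unity, both the evaluation and the interpolation steps become FFTs, and the method is exact up to tiny rounding; this is the classical FFT-based polynomial multiplication. The final rounding assumes integer coefficients.
def fft_polymul(P, Q):
    # The product's coefficients are the convolution of the coefficient
    # arrays (np.polymul is np.convolve); a length-(d + 1) circular
    # convolution via FFTs computes it exactly, since the product
    # has exactly d + 1 coefficients.
    size = (len(P) - 1) + (len(Q) - 1) + 1  # d + 1 coefficients in the product
    values_P = np.fft.fft(P, size)  # zero-pads P to length `size`
    values_Q = np.fft.fft(Q, size)
    coeffs = np.fft.ifft(values_P * values_Q).real
    return np.asarray(np.round(coeffs), dtype=int)  # rounding assumes integer coefficients
fft_polymul([1, 0, 1, 1], [1, 0, 0, 1, 0, 0])  # same result as np.polymul on the first example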
It is better (and faster) to simply use numpy.polymul in Python with NumPy. Thanks for reading!
See this repo on GitHub for more notebooks, or on nbviewer.jupyter.org.
That's it for this demo! See you, folks!