Firstly, if you turn on the cache before you main loop, the interfaces work largely as expected:

```
pyfftw.interfaces.cache.enable()
pyfftw.interfaces.cache.set_keepalive_time(30)
```

It’s interesting that despite wisdom that should be stored, the construction of the `pyfftw`

objects is still rather slow when the cache is off. No matter, this is exactly the purpose of the cache. In your case you need to make the cache keep-alive time quite long because your loop is very long.

Secondly, it’s not a fair comparison to include the construction time of the `fft_object`

in the final test. If you move it outside the timer, then calling `fft_object`

is a better measure.

Thirdly, it’s also interesting to see that even with cache turned on, the call to `numpy_fft`

is slower than the call to `scipy_fft`

. Since there is no obvious difference in the code path, I suggest that is caching issue. This is the sort of issue that `timeit`

seeks to mitigate. Here’s my proposed timing code which is more meaningful:

```
import numpy
import scipy
import pyfftw
import timeit
n1 = 144
n2 = 522720
pyfftw.config.NUM_THREADS = 4
pyfftw.config.PLANNER_EFFORT = 'FFTW_MEASURE'
Q_1 = pyfftw.empty_aligned([n1, n2], dtype='float64')
pyfftw.interfaces.cache.enable()
pyfftw.interfaces.cache.set_keepalive_time(30)
times = timeit.repeat(lambda: numpy.fft.fft(Q_1, axis=0), repeat=5, number=1)
print('NUMPY fastest time = ', min(times))
times = timeit.repeat(lambda: scipy.fft.fft(Q_1, axis=0), repeat=5, number=1)
print('SCIPY fastest time = ', min(times))
times = timeit.repeat(
lambda: pyfftw.interfaces.numpy_fft.fft(Q_1, axis=0), repeat=5, number=1)
print('PYFFTW NUMPY fastest time = ', min(times))
times = timeit.repeat(
lambda: pyfftw.interfaces.scipy_fftpack.fft(Q_1, axis=0), repeat=5, number=1)
print('PYFFTW SCIPY fastest time = ', min(times))
fft_object = pyfftw.builders.fft(Q_1, axis=0)
times = timeit.repeat(lambda: fft_object(Q_1), repeat=5, number=1)
print('FFTW PURE fastest time = ', min(times))
```

On my machine this gives an output like:

```
NUMPY fastest time = 0.6622681759763509
SCIPY fastest time = 0.6572431400418282
PYFFTW NUMPY fastest time = 0.4003451430471614
PYFFTW SCIPY fastest time = 0.40362057799939066
FFTW PURE fastest time = 0.324020683998242
```

You can do a bit better if you don’t force it to copy the input array into a complex data type by changing `Q_1`

to be `complex128`

:

```
NUMPY fastest time = 0.6483533839927986
SCIPY fastest time = 0.847397351055406
PYFFTW NUMPY fastest time = 0.3237176960101351
PYFFTW SCIPY fastest time = 0.3199474769644439
FFTW PURE fastest time = 0.2546963169006631
```

That interesting `scipy`

slow-down is repeatable.

That said, if your input is real, you should be doing a real transform (for >50% speed-up with `pyfftw`

) and manipulating the resultant complex output.

What’s interesting about this example is (I think) how important the cache is in the results (which I suggest is why switching to a real transform is so effective in speeding things up). You see something dramatic also when you use change the array size to 524288 (the next power of two, which you think might perhaps speed things up, but not slow it down dramatically). In this case everything slows down quite a bit, `scipy`

particularly. It feels to me that `scipy`

is more cache sensitive, which would explain the slow down with changing the input to `complex128`

(522720 is quite a nice number for FFTing though, so perhaps we should expect a slowdown).

Finally, if speed is secondary to accuracy, you can always use 32-bit floats as the data type. If you combine that with doing a real transform, you get a better than factor of 10 speed-up over the initial `numpy`

best given above:

```
PYFFTW NUMPY fastest time = 0.09026529802940786
PYFFTW SCIPY fastest time = 0.1701313250232488
FFTW PURE fastest time = 0.06202622700948268
```

(numpy and scipy don’t change much as I think they use 64-bit floats internally).

Edit: I forgot that the Scipy’s `fftpack`

real FFTs have a weird output structure, which `pyfftw`

replicates with some slowdown. This is changed to be more sensible in the new FFT module.

The new FFT interface is implemented in pyFFTW and should be preferred. There was unfortunately a problem with the docs being rebuilt so the docs were a long time out of date and didn’t show the new interface – hopefully that is fixed now.

CLICK HERE to find out more related problems solutions.