As per the central limit theorem one can approximate Gaussian with a repeated convolution with any function, box blur being most obvious candidate here. And box blur can be computed quickly with a summed area table.
> a repeated convolution
I really wonder what's the field of reference of "quickly" there. To me convolution is one of the last resort techniques in signal processing given how expensive it is (O(size of input data * size of convolution kernel)). It's of course still much faster than gaussian blur which is still non-trivial to manage at a barely decent 120fps even on huge Nvidia GPUs but still.
How are we supposed to think about SIMD in Big-O? Because this is still linear time if the kernel width is less than the max SIMD width (which is 16 I think on x64?)