Sesse__ 8 hours ago

> In my experience multi-run benchmarking frameworks which use non-parametric statistics should be the default tool of choice unless you know the particular benchmark is exceptionally well behaved.

Agreed. Do you have any suggestions? :-)

janwas 4 hours ago

I like taking the trimmed mean of 10-20 runs or, if a run is quick, the (half-sample) mode of more runs. See robust_statistics.h.
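
For anyone who wants to see roughly what those two estimators do, here is a minimal Python sketch. It is not the code from robust_statistics.h: the function names, the 20% default trim fraction, and the sample timings are all assumptions made for illustration. The half-sample mode here follows the usual recipe of repeatedly keeping the narrowest contiguous window that holds half of the sorted samples, then averaging the last one or two points.

```python
def trimmed_mean(samples, trim=0.2):
    """Mean after dropping the `trim` fraction from EACH end of the sorted data.

    `trim` is a hypothetical parameter name; 0.2 means 20% off each tail.
    """
    if not samples:
        raise ValueError("need at least one sample")
    if not 0.0 <= trim < 0.5:
        raise ValueError("trim must be in [0, 0.5)")
    xs = sorted(samples)
    k = int(len(xs) * trim)      # number of points dropped per tail
    kept = xs[k:len(xs) - k]     # never empty because trim < 0.5
    return sum(kept) / len(kept)


def half_sample_mode(samples):
    """Estimate the mode by shrinking to the densest half-sample repeatedly."""
    if not samples:
        raise ValueError("need at least one sample")
    xs = sorted(samples)
    while len(xs) > 2:
        h = (len(xs) + 1) // 2   # window size: ceil(n / 2)
        # Start index of the narrowest window of h consecutive sorted points.
        # Ties resolve to the first (lowest) window.
        best = min(range(len(xs) - h + 1), key=lambda i: xs[i + h - 1] - xs[i])
        xs = xs[best:best + h]
    return sum(xs) / len(xs)     # one or two points remain


if __name__ == "__main__":
    # Made-up run times in microseconds, with two slow outliers.
    runs_us = [100, 101, 102, 102, 102, 103, 104, 150, 300, 99]
    print("trimmed mean:", trimmed_mean(runs_us))
    print("half-sample mode:", half_sample_mode(runs_us))
```

Both estimators ignore the 150 and 300 outliers that would drag a plain mean upward. The trimmed mean needs only a handful of runs to be stable, while the mode estimate benefits from many samples, which matches the "if a run is quick" condition above.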