feat: use histogram samples for t-test analysis #155
Justification:
Each sample in the histogram is a durationPerOp value: the cumulative time of a batch of iterations of the function under test, divided by the number of iterations in that batch. Each sample is therefore already the average execution time over a batch. opsSec and opsSecPerRun are in turn averages over those samples, so they are averages of averages.
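A minimal sketch of the distinction, not the library's actual code: the helper names, the batch shape, and the numbers below are hypothetical, and only the term durationPerOp comes from this PR.

```typescript
// Hypothetical helper: average time per operation for one batch of iterations.
function durationPerOp(totalTimeNs: number, iterations: number): number {
  return totalTimeNs / iterations;
}

// Arithmetic mean of a list of numbers.
function mean(values: number[]): number {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Made-up batches: [cumulative elapsed time in ns, iterations executed].
const batches: Array<[number, number]> = [
  [1000, 10],
  [2400, 20],
  [900, 10],
];

// One histogram sample per batch. Each sample is already an average.
const samples: number[] = batches.map(([t, n]) => durationPerOp(t, n));

// A single summary number: an average of averages.
// This is the kind of value that is not a valid t-test observation.
const averageOfAverages: number = mean(samples);

console.log(samples, averageOfAverages);
```

The array `samples` is what a t-test needs as input; `averageOfAverages` collapses it to one value per run.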
Therefore, running the t-test on opsSecPerRun inaccurately applies the calculation to an average of averages. The test is meant to be applied to a set of averages totalling at least 30 samples, preferably 40.
In other words, it is each histogram entry that represents a valid t-test sample.
When repeatSuite > 1, the additional samples accumulate in the histogram.
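As an illustration of feeding histogram samples into a t-test, here is a sketch that assumes Welch's unequal-variance variant (the PR text does not say which variant is used). The function names `poolRuns` and `welchTTest` are hypothetical, and pooling by simple concatenation is an assumption about how repeated runs accumulate.

```typescript
// Hypothetical: concatenate the samples of repeated runs into one sample set.
function poolRuns(runs: number[][]): number[] {
  return ([] as number[]).concat(...runs);
}

function sampleMean(xs: number[]): number {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Unbiased sample variance (divides by n - 1).
function sampleVariance(xs: number[]): number {
  const m = sampleMean(xs);
  return xs.reduce((sum, x) => sum + (x - m) * (x - m), 0) / (xs.length - 1);
}

// Welch's t statistic and Welch-Satterthwaite degrees of freedom.
function welchTTest(a: number[], b: number[]): { t: number; df: number } {
  const va = sampleVariance(a) / a.length;
  const vb = sampleVariance(b) / b.length;
  const t = (sampleMean(a) - sampleMean(b)) / Math.sqrt(va + vb);
  const df =
    ((va + vb) * (va + vb)) /
    ((va * va) / (a.length - 1) + (vb * vb) / (b.length - 1));
  return { t, df };
}

// Tiny made-up inputs to keep the arithmetic readable. A real comparison
// would use at least 30 samples per side, as stated above.
const baseline = poolRuns([[1, 2, 3], [4, 5]]);
const candidate = poolRuns([[2, 3, 4], [5, 6]]);
const result = welchTTest(baseline, candidate);
console.log(result);
```

Each array element here stands for one durationPerOp sample; the per-run averages never enter the calculation.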
This change also replaces the forced override of the input options with a warning when the sample size is too small. Users can choose whether they want minSamples: 30 or repeatSuite: 3. The code already supports omitting the significance data if the minimum sample count is not met.
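The warn-instead-of-override behaviour could look roughly like the sketch below. Only the option names minSamples and repeatSuite and the threshold of 30 come from this PR. The function name, the warning text, and the assumption that the expected sample count is minSamples multiplied by repeatSuite are all hypothetical.

```typescript
// Minimum sample count for a meaningful t-test, per the justification above.
const MIN_T_TEST_SAMPLES = 30;

interface SuiteOptions {
  minSamples: number;
  repeatSuite: number;
}

// Hypothetical check: returns a warning message, or null if the sample size
// is sufficient. It never mutates the caller's options.
function checkTTestSampleSize(options: SuiteOptions): string | null {
  // Assumption: each repeat contributes at least minSamples histogram entries.
  const expected = options.minSamples * options.repeatSuite;
  if (expected < MIN_T_TEST_SAMPLES) {
    return (
      `t-test needs at least ${MIN_T_TEST_SAMPLES} samples, ` +
      `but only ${expected} are expected; ` +
      `consider minSamples: 30 or repeatSuite: 3`
    );
  }
  return null;
}

const tooSmall = checkTTestSampleSize({ minSamples: 10, repeatSuite: 1 });
const viaRepeat = checkTTestSampleSize({ minSamples: 10, repeatSuite: 3 });
const viaMinSamples = checkTTestSampleSize({ minSamples: 30, repeatSuite: 1 });
if (tooSmall !== null) console.warn(tooSmall);
```

Returning a message rather than rewriting the options leaves the decision with the user, which is the intent described above.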