Conversation

Owner
@RafaelGSS RafaelGSS commented Dec 2, 2025

  • Add ttest option to Suite that automatically sets repeatSuite=30
  • Implement Welch's t-test for comparing benchmark results
  • Display significance stars (*, **, ***) based on p-values

The t-test compares 30 independent runs of each benchmark to determine if performance differences are statistically significant, helping identify real improvements vs. random variance.
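Welch's t-test is the unequal-variance variant, which suits benchmark timings since the two functions under comparison rarely share a variance. A minimal sketch of the statistic, the Welch–Satterthwaite degrees of freedom, and a star mapping is below. This is an illustration, not the code merged in this PR; the star thresholds here use the large-sample normal approximation of the two-sided t critical values.

```javascript
// Welch's t-test sketch: compares two samples of benchmark timings.
// Illustrative only -- not the implementation merged in this PR.
function welchTTest(a, b) {
  const mean = (xs) => xs.reduce((s, x) => s + x, 0) / xs.length;
  // Unbiased sample variance (divide by n - 1).
  const variance = (xs, m) =>
    xs.reduce((s, x) => s + (x - m) ** 2, 0) / (xs.length - 1);

  const ma = mean(a), mb = mean(b);
  const va = variance(a, ma) / a.length; // squared standard error, sample a
  const vb = variance(b, mb) / b.length; // squared standard error, sample b

  const t = (ma - mb) / Math.sqrt(va + vb);
  // Welch-Satterthwaite approximation of the degrees of freedom.
  const df = (va + vb) ** 2 /
    (va ** 2 / (a.length - 1) + vb ** 2 / (b.length - 1));

  // Two-sided critical values from the normal approximation
  // (reasonable once df is around 30, as with repeatSuite=30).
  const abs = Math.abs(t);
  const stars = abs > 3.291 ? '***' : abs > 2.576 ? '**' : abs > 1.96 ? '*' : '';
  return { t, df, stars };
}
```

With two samples of 30 runs each, df lands near 58, where the normal approximation to the t distribution is close enough for the thresholds above.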

[screenshot: reporter output showing the t-test significance stars]

text += styleText(["blue", "bold"], `${result.timeFormatted} total time`);
}

// TODO: produce confidence on stddev
Collaborator
Should this todo be cleared?

@@ -0,0 +1,218 @@
// Welch's t-test implementation for benchmark comparison
Collaborator
This might warrant a bibliography entry or three. What reference materials did you use?

@RafaelGSS (Owner, Author) Dec 16, 2025
I thought about including some references, but that material is widely available; it's mostly a matter of context.

Collaborator

Well, my challenge to you is to keep in the back of your head if you ever see a particularly good article or video to come back here and add a link.

Or maybe just the Wikipedia link if nothing else fits the bill.

- Add ttest option to Suite that automatically sets repeatSuite=30
- Implement Welch's t-test for comparing benchmark results
- Display significance stars (*, **, ***) based on p-values
- Add T-Test Mode indicator in reporter output
- Update TypeScript definitions with ttest and ReporterOptions
- Add comprehensive tests for t-test utilities
- Add statistical-significance example demonstrating the feature
- Update documentation with usage and interpretation guide

The t-test compares 30 independent runs of each benchmark to determine
if performance differences are statistically significant, helping identify
real improvements vs. random variance.

Signed-off-by: RafaelGSS <rafael.nunu@hotmail.com>
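The star display mentioned in the bullets presumably follows the conventional p-value cutoffs; a small sketch of that mapping is below (the exact thresholds used by the PR live in its source, so treat these as assumed values):

```javascript
// Conventional significance-star mapping (assumed thresholds;
// check the merged code for the exact cutoffs it uses).
function significanceStars(p) {
  if (p < 0.001) return '***'; // very strong evidence of a real difference
  if (p < 0.01) return '**';   // strong evidence
  if (p < 0.05) return '*';    // moderate evidence
  return '';                   // not statistically significant
}
```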
@RafaelGSS
Owner Author

PTAL @jdmarshall

@RafaelGSS
Owner Author

PTAL @H4ad @jdmarshall - Planning to land it today.

@H4ad (Collaborator) left a comment

Is it the same implementation as the one we have for Node? If so, great to have it here.

@RafaelGSS
Owner Author

Is it the same implementation as the one we have for Node? If so, great to have it here.

Pretty much. But instead of comparing binaries, we compare `benchmark.fn`.

@RafaelGSS RafaelGSS merged commit 53e20aa into main Dec 17, 2025
6 checks passed
@RafaelGSS RafaelGSS deleted the add-ttest-feature branch December 17, 2025 16:25
@jdmarshall
Collaborator

Oops. LGTM

@jdmarshall
Collaborator

Holy cow is this a lot slower. Should I be adjusting the run count?


4 participants