SLEP025: Losing Accuracy in Scikit-Learn #96
Conversation
Thank you for writing the SLEP. While discussing the rectification of this design choice is appropriate (I really think it is valuable), I find the title a bit aggressive or at least detrimental to the intent of the proposal (improving scikit-learn's theoretical consistency): how about "Redefining default metrics"? I think this could be implemented as part of scikit-learn 2.0.

@lorentzenchr How about: "Removing the accuracy metric in scikit-learn scoring"?
> The fact that different scoring metrics focus on different things, i.e. ``predict``
> vs. ``predict_proba``, and not all classifiers provide ``predict_proba`` complicates
> a unified choice.
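As a concrete illustration of that point (a minimal sketch; `SVC` is just one example of a classifier that does not expose `predict_proba` by default):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Not every classifier exposes predict_proba, so a default metric
# that needs probabilities would not apply uniformly:
print(hasattr(LogisticRegression(), "predict_proba"))  # True
print(hasattr(SVC(), "predict_proba"))  # False unless probability=True
```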
Do we need to choose the same metric for all classifiers?

I think the answer is yes, because people will use the results of `est1.score(X, y)` and `est2.score(X, y)` to decide which one is the better estimator. It seems very hard to educate people that they can't compare scores from different estimators.

(This is almost a rhetorical question, but I wanted to double-check my thinking.)
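A minimal sketch of that concern (the dataset, imbalance, and estimator choices are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

est1 = LogisticRegression().fit(X_train, y_train)
est2 = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)

# Both .score calls return plain accuracy, so they look directly
# comparable; yet on a 95/5 problem the majority-class dummy
# already "scores" about 0.95.
print(est1.score(X_test, y_test))
print(est2.score(X_test, y_test))
```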
Given your assumption that users will continue to compare score results of different estimators, and given that a generally satisfying metric does not exist, the conclusion is to remove the score method.

My current best choice for a general classifier metric is the skill-score (R²) variant of the Brier score. Classifiers and regressors would then have the same metric, which is nice.
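For concreteness, a minimal sketch for the binary case, assuming the reference model predicts the empirical base rate; `brier_skill_score` here is a hypothetical helper, not an existing scikit-learn API:

```python
import numpy as np
from sklearn.metrics import brier_score_loss

def brier_skill_score(y_true, y_prob):
    """Hypothetical helper (not a scikit-learn API): 1 - BS / BS_ref,
    with the reference always predicting the empirical base rate of
    y_true. Reads like R^2: 1.0 is perfect, 0.0 matches the trivial
    base-rate model, and negative values are worse than it.
    """
    bs = brier_score_loss(y_true, y_prob)
    ref = np.full_like(np.asarray(y_prob, dtype=float), np.mean(y_true))
    bs_ref = brier_score_loss(y_true, ref)
    return 1.0 - bs / bs_ref
```

This is the same construction that makes R² a skill score over the mean predictor in regression, which is what would give classifiers and regressors an analogous default.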
What

This SLEP is for finding a consensus on how to remove accuracy as the default metric for classifiers, which is currently what `classifier.score` returns.

Why
Because accuracy has many severe weaknesses, and we even baked a probability threshold of 50% into it.
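A minimal sketch of what "baked in" means here, assuming a binary probabilistic classifier:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = make_classification(random_state=0)
clf = LogisticRegression().fit(X, y)

# The default classifier score is plain accuracy of hard predictions ...
assert clf.score(X, y) == accuracy_score(y, clf.predict(X))

# ... and for a binary probabilistic classifier the hard predictions are
# just predict_proba thresholded at the hard-coded 50% (ties aside).
proba = clf.predict_proba(X)[:, 1]
assert np.array_equal(clf.predict(X), (proba > 0.5).astype(int))
```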
What else

I hope the irony of the title is not lost, despite its passive aggressiveness. This SLEP is based on @amueller's proposal scikit-learn/scikit-learn#28995.
@scikit-learn/core-devs @scikit-learn/communication-team @scikit-learn/contributor-experience-team @scikit-learn/documentation-team ping