
Conversation

@Mattdl (Collaborator) commented Jan 9, 2026

Model evaluation is ideal for checking whether a new contribution reproduces the results from a paper. However, we don't want every PR to trigger a full evaluation of all models, so it is now disabled by default and documented in the Model contributing documentation.
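
A minimal sketch of one way "disabled by default" could be wired up, assuming the test suite uses pytest; the `reproduce` marker and `--run-reproduce` flag are illustrative names, not necessarily what the repository actually uses:

```python
# conftest.py -- hypothetical sketch; the repository may use a different mechanism
import pytest


def pytest_addoption(parser):
    # Opt-in flag that the manually triggered workflow can pass;
    # regular PR runs omit it, so reproduction tests are skipped.
    parser.addoption(
        "--run-reproduce",
        action="store_true",
        default=False,
        help="run model-reproduction tests (slow: evaluates models on the benchmark)",
    )


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line(
        "markers", "reproduce: full model-evaluation reproduction test"
    )


def pytest_collection_modifyitems(config, items):
    if config.getoption("--run-reproduce"):
        return  # reproduction explicitly requested: run everything
    skip_marker = pytest.mark.skip(reason="reproduction tests are disabled by default")
    for item in items:
        if "reproduce" in item.keywords:
            item.add_marker(skip_marker)
```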

Reproducing model results on a benchmark now lives in a separate GitHub workflow that can be triggered manually. It is excluded by default so the test suite does not keep growing with every model that is added.
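
The manually triggered workflow would then opt back in, e.g. by passing the flag sketched above. A hedged, illustrative example of what a reproduction test carrying that marker might look like (the metric values are placeholders, not from any real paper):

```python
# test_reproduce_results.py -- hypothetical example, not the repository's actual test
import pytest


@pytest.mark.reproduce  # skipped unless pytest is invoked with --run-reproduce
def test_model_reproduces_paper_results():
    # Placeholder for the real evaluation: load the model, run it on the
    # benchmark, and compare against the metric reported in the paper.
    reported_accuracy = 0.87   # illustrative value only
    measured_accuracy = 0.87   # stand-in for an actual evaluation run
    assert abs(measured_accuracy - reported_accuracy) < 0.01
```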
@Mattdl merged commit 44dbe7e into main on Jan 9, 2026
2 checks passed
@Mattdl deleted the refactor-contributions branch on January 9, 2026 at 11:54