Skip to content

Conversation

@phoevos
Copy link
Member

@phoevos phoevos commented Dec 2, 2025

Add the following model version tags when logging a model to MLflow:

  • model_uri: The URI of the model artifact
  • model_type: The type of the model (e.g. 'medcat_snomed')
  • validation_status: The validation status of the model (e.g. 'pending')

Add the following model version tags when logging a model to MLflow:
* model_uri: The URI of the model artifact
* model_type: The type of the model (e.g. 'medcat_snomed')
* validation_status: The validation status of the model (e.g. 'pending')

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
@phoevos phoevos added the enhancement New feature or request label Dec 2, 2025
Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
Copy link
Member

@baixiac baixiac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just make sure that versions[0] will always return the most recently saved model version.

(versions[0] had been used prior to this PR and should have been tested)

model_name: str,
model_manager: ModelManager,
validation_status: str = "pending",
model_type: Optional[str] = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All CMS models have the ModelType, hence there's no need to make the argument optional and mlflow.set_tag() will not set a None value.

@phoevos
Copy link
Member Author

phoevos commented Dec 5, 2025

@baixiac so search_model_versions does happen to return the versions in descending order, but I'm not sure that's guaranteed since the behaviour is not explicitly documented on the client side. It seems there's an order_by param we could use to explicitly request descending ordering by version. The syntax for it is super unintuitive and they have no examples anywhere, but I think I figured it out with a lot of trial and error. I'll add it for the pre-existing call too

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
Copilot AI review requested due to automatic review settings December 5, 2025 18:06
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds version tags to registered models in MLflow to improve model tracking and discovery. The changes introduce three new tags: model_uri, model_type, and validation_status that are attached to model versions when logging models.

Key changes:

  • Added _set_model_version_tags helper method to standardize tag setting across model registration flows
  • Updated save_model and save_pretrained_model methods to accept model_type parameter and set version tags
  • Modified all trainer implementations to pass model type information when saving models

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
app/management/tracker_client.py Added _set_model_version_tags static method and updated save_model/save_pretrained_model to set version tags including model_uri, model_type, and validation_status
app/trainers/metacat_trainer.py Updated save_model call to include model type from model service
app/trainers/medcat_trainer.py Updated save_model calls (2 locations) to include model type from model service
app/trainers/medcat_deid_trainer.py Updated save_model call to include model type from model service
app/trainers/huggingface_ner_trainer.py Updated save_model calls (2 locations) to include model type from model service
app/trainers/huggingface_llm_trainer.py Updated save_model call to include model type from model service
app/cli/cli.py Updated save_pretrained_model call to pass model_type parameter
tests/app/monitoring/test_tracker_client.py Enhanced tests to verify version tags are set correctly; added mock setup for pretrained model test

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 183 to 190
mlflow.set_tag.has_calls(
[
call("training.output.package", "file.zip"),
call("training.output.model_uri", artifact_uri),
call("training.output.model_type", "model_type"),
],
any_order=False,
)
Copy link

Copilot AI Dec 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect assertion method. Should be assert_has_calls instead of has_calls. The current code will not actually perform the assertion, allowing the test to pass even if the calls were not made.

Copilot uses AI. Check for mistakes.
@phoevos phoevos merged commit 19eb765 into main Dec 8, 2025
7 checks passed
@phoevos phoevos deleted the feature-phoevos-model-version-tags branch December 8, 2025 12:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants