This project is a detailed, comprehensive tutorial on the definition of each component 🤗, and also a practical guide to using pretrained models from Hugging Face 🤗. It includes two models, each with a full code implementation and a PDF tutorial:
- A self-defined model using a ResNet-50 backbone, the DETR processor, and a customized multi-task head and loss for classification, bounding-box prediction, and segmentation (24.7M parameters).
- A fine-tuned model using detr-resnet-50-panoptic to do the same tasks (42.9M parameters).
Both fully trained models can be found on 🤗:
- https://huggingface.co/FriedParrot/fish-segmentation-model
- https://huggingface.co/FriedParrot/fish-segmentation-simple
Both models are trained on A Large Scale Fish Dataset.
Note that, for convenience, I also made a copy of this dataset available on Hugging Face.
Note
At first I just wanted to fine-tune the DETR model for this task, but while studying its components I also built a version from the ResNet-50 backbone up (I admit this complicated the problem 😅). So in the end I implemented both 😊.
The first approach may seem more complicated, but it provides a comprehensive view of the different components available in Hugging Face. By exploring these details, the guide becomes more in-depth and accessible for newcomers (including myself) who want to understand the full workflow and flexibility of the library; it is also friendly for those who are used to building models in the PyTorch ecosystem.
If you are looking to fine-tune the DETR model directly, or just to use the pretrained model, you can find the fine-tuning code in the Fish_Simple folder.
- Fish_Pretrain: the first model, which uses a ResNet-50 backbone with a fully customized model and multi-task head.
- Fish_Finetune: the second model, which fine-tunes detr-resnet-50-panoptic on the fish dataset.
- tutorials: all PDF tutorials for both models.
If you have CUDA 12.8, you can install the requirements with:

```shell
pip install -r requirements.txt
```

Otherwise, adjust the packages listed in requirements.txt to match your CUDA version.
Log in to the Hugging Face Hub:

```shell
hf auth login
```

For the kagglehub token, go to https://www.kaggle.com/<your_account>/account and create a new API token. Then put the downloaded kaggle.json file into the ~/.kaggle/ folder.
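A quick local check that the Kaggle credentials landed where kagglehub expects them (the path follows the step above):

```python
from pathlib import Path

# kagglehub reads credentials from ~/.kaggle/kaggle.json by default.
kaggle_json = Path.home() / ".kaggle" / "kaggle.json"
status = "found" if kaggle_json.exists() else "missing"
print(f"kaggle.json {status}: {kaggle_json}")
```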
Run Fish_Pretrain > model_building.py to train the first model, and Fish_Finetune > model_building.py to train the second model.
Note: you may want to change the upload repo_id and private mode before calling the backup_model_to_hub() function.
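For example, the two values you would edit look roughly like this (the variable names and the backup_model_to_hub() signature here are assumptions based on the note above, not the exact repo code):

```python
# Hypothetical sketch: point the upload at your own Hub repo before training.
repo_id = "<your_username>/fish-segmentation-model"  # your own Hub repo id
private_mode = True  # keep the uploaded model private on the Hub

# backup_model_to_hub(model, repo_id=repo_id, private=private_mode)  # assumed signature
```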
Run Fish_Pretrain > model_evaluation.py or Fish_Finetune > model_evaluation.py to evaluate a model and get image results.
Since the first model reduces the mask resolution, its masks are not very accurate:
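Low-resolution mask logits can be upsampled back to the input size before thresholding; a minimal sketch with torch.nn.functional.interpolate (the tensor shapes are illustrative assumptions, not the repo's actual ones):

```python
import torch
import torch.nn.functional as F

# Assume the head predicts one mask-logit map per image at 1/4 resolution,
# e.g. (batch, 1, 56, 56) for a 224x224 input -- shapes are illustrative.
mask_logits = torch.randn(2, 1, 56, 56)

# Bilinearly upsample the logits to the input resolution, then threshold.
upsampled = F.interpolate(
    mask_logits, size=(224, 224), mode="bilinear", align_corners=False
)
binary_mask = upsampled.sigmoid() > 0.5  # (2, 1, 224, 224) boolean mask
```

Interpolating the logits before applying sigmoid and thresholding generally gives smoother boundaries than upsampling an already-binarized mask.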
Training platform: RTX 4060, 8 GB VRAM, cu126
Since this model is not aimed at very high accuracy, I did not evaluate it in great detail. In fact, the first model performs poorly at classification, segmentation, and bounding-box prediction.
(Note: you can increase the classification loss weight for better classification results.)
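A common way to re-balance the tasks is a weighted sum of the per-task losses; a minimal sketch (the weight values, loss choices, and tensor shapes here are illustrative assumptions, not the repo's defaults):

```python
import torch
import torch.nn as nn

# Illustrative weights: raise w_cls if classification lags behind the other tasks.
w_cls, w_box, w_mask = 2.0, 1.0, 1.0

cls_criterion = nn.CrossEntropyLoss()     # class logits vs. class index
box_criterion = nn.L1Loss()               # normalized box coordinates
mask_criterion = nn.BCEWithLogitsLoss()   # mask logits vs. binary mask

# Dummy predictions and targets just to show how the losses combine
# (batch of 4, e.g. 9 fish classes -- both numbers are assumptions).
cls_logits, cls_target = torch.randn(4, 9), torch.randint(0, 9, (4,))
box_pred, box_target = torch.rand(4, 4), torch.rand(4, 4)
mask_logits, mask_target = torch.randn(4, 1, 56, 56), torch.rand(4, 1, 56, 56).round()

total_loss = (
    w_cls * cls_criterion(cls_logits, cls_target)
    + w_box * box_criterion(box_pred, box_target)
    + w_mask * mask_criterion(mask_logits, mask_target)
)
```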
Reference classification accuracy: 79.33% (714/900). (Note that the set used for this evaluation is not strictly a held-out test set.)
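The reference numbers here and below are plain top-1 accuracy, i.e.:

```python
def top1_accuracy(correct: int, total: int) -> float:
    """Classification accuracy as a percentage: correct predictions over total samples."""
    return 100.0 * correct / total

print(f"{top1_accuracy(714, 900):.2f}%")  # -> 79.33%
```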
Training platform: RTX 4090, 48 GB VRAM, cu128
This model is fine-tuned from the detr-resnet-50-panoptic model, so its performance is very good.
Reference classification accuracy: 100.00% (450/450)

