Segmentation model with resnet50-backbone and fine-tuned DETR model on A Large Fish Dataset (Code + pdf tutorial)

FRIEDparrot/fish-segmentation

Fish Segmentation Task

Introduction

This project is a detailed tutorial on defining each model component with Hugging Face 🤗, and also a practical tutorial on using pretrained models from Hugging Face 🤗. It includes two models, each with a full code implementation and a PDF tutorial:

  1. A custom model that uses a ResNet-50 backbone, the DETR processor, and a customized multi-task head and loss to perform classification, bounding box prediction, and segmentation (24.7M parameters).
  2. A fine-tuned model based on detr-resnet-50-panoptic that performs the same tasks (42.9M parameters).

Both fully trained models are available on 🤗:

  1. https://huggingface.co/FriedParrot/fish-segmentation-model
  2. https://huggingface.co/FriedParrot/fish-segmentation-simple

Both models are trained on A Large Scale Fish Dataset.

For convenience, I also made a copy of this dataset available on Hugging Face.

Note

At first I only wanted to fine-tune the DETR model for this task, but while studying the components I ended up building a model on the ResNet-50 backbone instead (I admit this complicated the problem 😅). In the end I implemented both 😊.

The first approach may seem more complicated, but it gives a comprehensive view of the components available in Hugging Face. Exploring these details makes the guide more in-depth and accessible for newcomers (including myself) who want to understand the full workflow and flexibility of the library. It is also friendly to those who are used to building models in the PyTorch ecosystem.

If you just want to fine-tune the DETR model directly, or only use a pretrained model, you can find the DETR fine-tuning code in the Fish_Simple folder.

Project Structure

  • Fish_Pretrain : The first model, which uses a ResNet-50 backbone with a fully customized model and multi-task head.
  • Fish_Finetune : The second model, which fine-tunes detr-resnet-50-panoptic on the fish dataset.
  • tutorials : PDF tutorials for both models.

How to run the code

1. Install the requirements

If you have CUDA 12.8, you can install the requirements with:

pip install -r requirements.txt

Otherwise, install the packages listed in requirements.txt in versions that match your CUDA version.
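To pick matching package versions, it can help to check which CUDA build of PyTorch (if any) is already installed. The following is a minimal sketch, assuming PyTorch is the CUDA-dependent requirement; the helper names `installed_cuda_version` and `cuda_wheel_tag` are hypothetical and not part of this repository.

```python
def installed_cuda_version():
    """Return the CUDA version PyTorch was built with, or None."""
    try:
        import torch  # assumption: PyTorch is the CUDA-dependent package
    except ImportError:
        return None
    return torch.version.cuda  # also None for CPU-only builds


def cuda_wheel_tag(cuda_version):
    """Convert a version such as '12.8' into a tag such as 'cu128'."""
    major, minor = cuda_version.split(".")[:2]
    return f"cu{major}{minor}"


if __name__ == "__main__":
    version = installed_cuda_version()
    if version is None:
        print("No CUDA-enabled PyTorch build found")
    else:
        print(f"PyTorch was built for CUDA {version} ({cuda_wheel_tag(version)})")
```

The resulting tag uses the same naming as the cu126 and cu128 labels mentioned in the training platform notes.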

2. Log in to your Hugging Face account and make sure you have a valid kagglehub token

hf auth login 

To get a kagglehub token, go to https://www.kaggle.com/<your_account>/account and create a new API token. Then put the downloaded kaggle.json file into the ~/.kaggle/ folder.
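Before starting a long training run, you may want to confirm that the token file is in place. This is a small sketch using only the standard library; the function name `kaggle_token_ok` is hypothetical, and it only checks that the file exists and contains the `username` and `key` fields, not that Kaggle accepts them.

```python
import json
from pathlib import Path


def kaggle_token_ok(path=Path.home() / ".kaggle" / "kaggle.json"):
    """Return True if the file exists and holds 'username' and 'key' fields."""
    path = Path(path)
    if not path.is_file():
        return False
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError):
        return False
    # Both fields must be present and non-empty.
    return isinstance(data, dict) and all(data.get(k) for k in ("username", "key"))


if __name__ == "__main__":
    print("kaggle.json looks usable:", kaggle_token_ok())
```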

3. Train the model

Run Fish_Pretrain > model_building.py to train the first model. Run Fish_Finetune > model_building.py to train the second model.

Note that you may want to change the upload repo_id and private_mode before calling the backup_model_to_hub() function.

Run Fish_Pretrain > model_evaluation.py or Fish_Finetune > model_evaluation.py to evaluate the model and generate result images.
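For readers curious what an upload step of this kind can look like, here is a hedged sketch built on `huggingface_hub`. It is not the repository's backup_model_to_hub() implementation: the function name `push_folder_to_hub` and its parameters are assumptions for illustration, and the `private` argument here plays the role that private_mode plays in the project.

```python
def push_folder_to_hub(folder_path, repo_id, private=True):
    """Create the repo if needed, then upload a local folder to it."""
    # This sketch insists on an explicit '<user>/<name>' form, which is
    # stricter than the Hub itself (the Hub also accepts a bare name).
    if repo_id.count("/") != 1 or not all(repo_id.split("/")):
        raise ValueError("repo_id should look like '<user>/<model-name>'")

    # Imported lazily so the check above works without the library installed.
    from huggingface_hub import HfApi

    api = HfApi()  # uses the token stored by `hf auth login`
    api.create_repo(repo_id=repo_id, private=private, exist_ok=True)
    api.upload_folder(folder_path=folder_path, repo_id=repo_id)
```

Setting `private=True` by default keeps a first upload out of public view until you decide to publish it.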

Example of inference

Example of First model

Because the first model reduces the mask resolution, its masks are not very accurate:

Training platform: RTX 4060 (8GB VRAM) + cu126

batch0_sample0.png
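The effect of a reduced mask resolution can be illustrated with a toy example. The sketch below downsamples a binary mask, scales it back up, and compares it with the original using intersection over union (IoU). It assumes nearest-neighbour resizing and a factor of 4 purely for illustration; the resizing actually used by the first model may differ.

```python
def downsample(mask, factor):
    """Keep every `factor`-th pixel (nearest-neighbour downsampling)."""
    return [row[::factor] for row in mask[::factor]]


def upsample(mask, factor):
    """Repeat every pixel `factor` times in both directions."""
    return [
        [value for value in row for _ in range(factor)]
        for row in mask
        for _ in range(factor)
    ]


def iou(a, b):
    """Intersection over union of two binary masks of equal size."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    union = sum(x | y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return inter / union if union else 1.0


if __name__ == "__main__":
    # Toy 8x8 mask: a rectangle that does not align with the 4x4 grid.
    mask = [[1 if 1 <= r <= 5 and 2 <= c <= 6 else 0 for c in range(8)]
            for r in range(8)]
    restored = upsample(downsample(mask, 4), 4)
    print(f"IoU after round trip: {iou(mask, restored):.3f}")
```

Object boundaries that fall between the coarse grid cells are lost in the round trip, which is why fine details such as fins tend to suffer most.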

Since this model is not intended for high-accuracy tasks, I did not evaluate its accuracy in detail.

In fact, the first model performs poorly at classification, segmentation, and bounding box prediction.

(Note that you can adjust the classification loss weight for better results.)
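As a rough illustration of what adjusting a loss weight means in a multi-task setup, the sketch below combines per-task losses into a weighted sum. The task names, loss values, and weights are hypothetical, and plain floats stand in for the tensors a training loop would use; the project's actual loss may be structured differently.

```python
def total_loss(losses, weights):
    """Weighted sum of per-task losses; tasks missing from `weights` get 1.0."""
    return sum(weights.get(task, 1.0) * value for task, value in losses.items())


if __name__ == "__main__":
    # Hypothetical per-task losses from a single training step.
    losses = {"classification": 0.9, "bbox": 0.4, "mask": 0.6}

    equal = total_loss(losses, {})
    cls_heavy = total_loss(losses, {"classification": 2.0})
    print(f"equal weights: {equal:.2f}, classification x2: {cls_heavy:.2f}")
```

Raising one weight makes that task contribute more to the gradient, usually at some cost to the other tasks, so the weights are best tuned against a validation set.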

Reference Classification Accuracy : 79.33% (714/900) (the evaluation set is not made up entirely of test-set samples)

Example of Second model

Training platform: RTX 4090 + 48GB VRAM + cu128

This model is fine-tuned from the detr-resnet-50-panoptic model, so its performance is very good.

img.png

Reference Classification Accuracy : 100.00% (450/450)
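The reference accuracies reported for both models are simply the share of correctly classified samples. A minimal sketch of that arithmetic follows; the function name `format_accuracy` is hypothetical and not taken from the evaluation scripts.

```python
def format_accuracy(correct, total):
    """Format accuracy as a percentage with the raw counts in parentheses."""
    if total <= 0:
        raise ValueError("total must be positive")
    return f"{100 * correct / total:.2f}% ({correct}/{total})"


if __name__ == "__main__":
    print(format_accuracy(714, 900))  # first model
    print(format_accuracy(450, 450))  # second model
```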
