There are several ways to adapt the built-in API to different situations. I tried the approaches below (all except the second one, Faster R-CNN) until the results satisfied me.
- Default API only
- Default API + Faster R-CNN (not included)
- Default API + YOLOv8
- Default Model + Custom model
- Default Model + Custom model + Image Preprocessing
- FINAL MODEL --- Default Model + Image Preprocessing
Test videos with different purposes (downloadable sources)
The strength of the API lies in handling scale variations, occlusion and complex backgrounds. I chose test videos that demonstrate those capabilities.
- people.mp4 >> small object detection
- market.mp4 >> occlusion + cluttered background
- race_car.mp4 >> perspective change + rapid motion
- boat.mp4 >> scaling
- smallest.mp4 >> scaling (optimized by image preprocessing)
- rabbit.MOV >> similarity with the background, occlusion (can only be tracked by the final model)
- The built-in API performs well in most situations. However, it failed to track my bunny in the self-made rabbit.MOV, which led me to research possible upgrades.
- The most commonly researched combination for the CSRT tracker is the default tracker + Faster R-CNN. It is left out of this project because the model is heavy and requires training.
- Coupling the tracker with a pre-trained YOLOv8 model showed no significant improvement (this combination did not pass rabbit.MOV and people.mp4), and it slowed down the frame rate.
- Using a custom model with adjusted parameters gave unsatisfactory results. My custom model tracked the rabbit in rabbit.MOV well, but it failed to track the phone in people.mp4 and locked onto the background scene instead.
- Swapping between the default model and the custom model did not perform as expected either, because the main problem is that the model does not recognize when the object is lost.
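
The swapping idea in the last point can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `track_with_fallback` and the labels `"default"` and `"custom"` are hypothetical names. The only assumption about the trackers is that each is already initialized and exposes `update(frame)` returning `(ok, bbox)`, which is the shape OpenCV trackers use.

```python
def track_with_fallback(primary, fallback, frames):
    """Yield (tracker_label, bbox_or_None) for each frame.

    Assumption: both trackers are already initialized and expose
    update(frame) -> (ok, bbox), like OpenCV tracker objects.
    """
    active = ("default", primary)
    standby = ("custom", fallback)
    for frame in frames:
        ok, bbox = active[1].update(frame)
        if not ok:
            # The active tracker reported a loss: swap and retry on this frame.
            active, standby = standby, active
            ok, bbox = active[1].update(frame)
        yield active[0], (bbox if ok else None)
```

The swap only fires when `update` reports failure. If a tracker drifts onto the background but keeps reporting success, as described above, the fallback is never used, which may explain why this approach did not help.
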
- Histogram normalization enhances contrast and reduces sensitivity to lighting variations, ensuring consistent features for the CSRT tracker.
- Edge detection highlights object boundaries, improving initial target definition and distinguishing the object from background clutter.
- Together, these preprocessing steps improve the tracker's accuracy, reduce drift, and ensure better feature extraction, enabling more robust and reliable object tracking.