Open
Description
I collected some videos and ran the project on them for generation, but the results were not very good. The YOLO model mislocalizes objects in some frames, and there doesn't seem to be a way to correct them manually, or perhaps I just haven't found the option. I recently came across an open-source model called SAM3, which does an excellent job at image segmentation. However, its inference is slow and it can't run in real time: on an RTX 4070 Ti it takes about 0.2 seconds per image, which is roughly 5 frames per second. We could instead extract keyframes and run prediction only on those. Bro, do you think this would be feasible?
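To make the keyframe idea concrete, here is a rough sketch of what I have in mind, not how the project does it today. It runs the slow model only on every `stride`-th frame and fills in the frames in between by linear interpolation of the boxes, which is my own simplification and will drift on fast or non-linear motion. The names `keyframe_indices`, `interpolate_boxes`, `predict_with_keyframes`, and the `segment` callable are all hypothetical; `segment` stands in for whatever SAM3 or YOLO call returns one `(x1, y1, x2, y2)` box per frame. Assuming a 30 fps clip and the 0.2 s per image I measured, a stride of 6 means about 5 keyframes per second of video, so roughly 1 s of compute per second of footage.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def keyframe_indices(num_frames: int, stride: int) -> List[int]:
    """Pick every `stride`-th frame, plus the last frame so the whole clip is covered."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    if num_frames <= 0:
        return []
    idx = list(range(0, num_frames, stride))
    if idx[-1] != num_frames - 1:
        idx.append(num_frames - 1)
    return idx


def interpolate_boxes(keyed: Dict[int, Box], num_frames: int) -> List[Box]:
    """Linearly interpolate boxes between consecutive keyframes."""
    if not keyed or num_frames <= 0:
        return []
    keys = sorted(keyed)
    # Frames outside the keyed range fall back to the first keyframe's box.
    boxes: List[Box] = [keyed[keys[0]]] * num_frames
    for a, b in zip(keys, keys[1:]):
        for f in range(a, b + 1):
            t = (f - a) / (b - a)  # 0.0 at keyframe a, 1.0 at keyframe b
            boxes[f] = tuple(
                (1 - t) * p + t * q for p, q in zip(keyed[a], keyed[b])
            )
    return boxes


def predict_with_keyframes(
    frames: Sequence[object],
    segment: Callable[[object], Box],  # hypothetical slow model call
    stride: int,
) -> List[Box]:
    """Run `segment` on keyframes only and interpolate the remaining frames."""
    keys = keyframe_indices(len(frames), stride)
    keyed = {i: segment(frames[i]) for i in keys}
    return interpolate_boxes(keyed, len(frames))
```

With this layout the expensive call happens once per keyframe, so a manual correction would only need to touch the keyframe boxes and the frames in between would follow. A fixed stride is the simplest choice; picking keyframes by scene change or motion would probably work better but needs more code.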