I think you can try SAM3

I collected some videos and ran them through the project, but the actual results were not particularly good. The YOLO model mislocalizes objects in certain frames, and there doesn't seem to be a way to correct those detections manually (or perhaps I just haven't found the option). Recently I noticed an open-source model called SAM3, which does an excellent job at image segmentation. However, its inference speed isn't very high and it can't run in real time: on an RTX 4070 Ti it takes about 0.2 seconds per image. We could instead extract keyframes and run prediction only on those. Bro, do you think this would be feasible?
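To make the keyframe idea concrete, here is a rough sketch in plain Python. It assumes a fixed stride derived from the 0.2 s per image figure, and it assumes boxes on the skipped frames can be filled in by linear interpolation between keyframes; both are my assumptions, not something the project or SAM3 provides. The function names (`keyframe_stride`, `keyframe_indices`, `interpolate_boxes`) are hypothetical, and the actual SAM3 call is left out entirely.

```python
import math


def keyframe_stride(video_fps, seconds_per_image):
    """Smallest stride at which per-keyframe inference keeps pace with the video."""
    # round() guards against float noise before taking the ceiling.
    return max(1, math.ceil(round(video_fps * seconds_per_image, 6)))


def keyframe_indices(num_frames, stride):
    """Every `stride`-th frame, plus the last frame so the clip is fully covered."""
    indices = list(range(0, num_frames, stride))
    if num_frames > 0 and indices[-1] != num_frames - 1:
        indices.append(num_frames - 1)
    return indices


def interpolate_boxes(keyframe_boxes, num_frames):
    """Fill in per-frame boxes by linear interpolation between keyframes.

    keyframe_boxes maps a frame index to an (x1, y1, x2, y2) box that would
    come from the segmentation model (not called here). Frames outside the
    first/last keyframe stay None.
    """
    keys = sorted(keyframe_boxes)
    boxes = [None] * num_frames
    for a, b in zip(keys, keys[1:]):
        box_a, box_b = keyframe_boxes[a], keyframe_boxes[b]
        for i in range(a, b):
            t = (i - a) / (b - a)
            boxes[i] = tuple(pa + t * (pb - pa) for pa, pb in zip(box_a, box_b))
    if keys:
        boxes[keys[-1]] = tuple(float(v) for v in keyframe_boxes[keys[-1]])
    return boxes


if __name__ == "__main__":
    stride = keyframe_stride(video_fps=30, seconds_per_image=0.2)
    print("stride:", stride)
    print("keyframes:", keyframe_indices(20, stride))
```

With a 30 fps video this means running the model on roughly one frame in six. Linear interpolation will drift on fast or non-linear motion, so a real version would probably want a tracker or a scene-change check between keyframes.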

I think you can try SAM3 #97

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

I think you can try SAM3 #97

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions