Segment Anything 2: Demo-First Model Development
Don't bother keeping absolutely still: This vision model has memory now! Covering SAM 2 with Nikhila Ravi of Facebook AI Research, and special returning guest host Joseph Nelson of Roboflow
[00:23:53] Joseph Nelson: But then for querying those embeddings, we do that client side, in the browser, so that the user can very quickly, you know, you can move your mouse over and you get the proposed candidate masks that SAM found for that region of the image.
[00:27:27] Joseph Nelson: Here, I have a bunch of images, and there's a number of ways that I could annotate things, like I could prompt a large multimodal model with, like, grounding capabilities, you know, you could outsource it, or I can do manual labeling.
[00:52:43] Joseph Nelson: The way I kind of see the field continuing to progress, the problem statement of computer vision is making sense of visual input.