Meta Takes a Strong Step Forward in the AI Race, Unveiling Its SAM 2 Video Editor

Meta’s new model, an evolution of the 2023 version, allows you to detect and track any element in a photo or video.

SAM 2, Meta's new tool to edit videos

Meta has unveiled the Segment Anything Model 2 (SAM 2), its new AI model for identifying and segmenting elements in images and videos.

Why it matters. Meta’s new tool, an evolution of the original SAM released in 2023, improves how users interact with photos and videos before uploading them to Instagram or WhatsApp.

Context.

  • SAM was already available in Instagram features like Backdrop and Cutouts.
  • The updated version identifies and tracks objects in a video in real time, which makes selecting and following them much easier and faster.
  • Meta has released SAM 2 under an open-source license, allowing developers and businesses to use it to build their own apps.

Key features.

  • Accurate identification. SAM 2 can identify any object in a photo or video with a single click.
  • Real-time tracking. Once SAM 2 identifies the object, the model follows it throughout the video, even if it temporarily disappears from the frame.
  • Adaptability. It works with objects and scenes it has never seen before, even if they aren’t in its training data, making it very versatile.
  • Interactivity. It allows the user to refine the results with additional prompts, increasing the level of control.
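
To make the click-then-track idea concrete, here is a toy sketch in plain Python. It is not SAM 2's method or API: the functions segment_from_click and track are hypothetical names, the "video" is a tiny grid of 0s and 1s, and a simple flood fill stands in for the neural network. It only illustrates the workflow described above: one click selects an object, the selection follows the object from frame to frame, and a remembered mask lets the tracker pick the object up again after it briefly disappears.

```python
from collections import deque


def segment_from_click(frame, click):
    """Return the set of (row, col) pixels connected to the clicked pixel
    that share its value (4-connected flood fill). Toy stand-in for a model."""
    rows, cols = len(frame), len(frame[0])
    target = frame[click[0]][click[1]]
    mask = {click}
    queue = deque([click])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in mask and frame[nr][nc] == target:
                mask.add((nr, nc))
                queue.append((nr, nc))
    return mask


def track(frames, click):
    """Segment the clicked object in frame 0, then follow it frame by frame."""
    target = frames[0][click[0]][click[1]]
    memory = segment_from_click(frames[0], click)  # last mask where the object was seen
    masks = [memory]
    for frame in frames[1:]:
        # Re-seed from any remembered pixel that still shows the object.
        seeds = [p for p in sorted(memory) if frame[p[0]][p[1]] == target]
        if seeds:
            memory = segment_from_click(frame, seeds[0])
            masks.append(memory)
        else:
            masks.append(set())  # object not visible; keep the old memory
    return masks


# Four 4x5 frames: the object (1s) moves right, vanishes, then reappears.
empty = [[0] * 5 for _ in range(4)]
start = [[0, 0, 0, 0, 0], [0, 1, 1, 0, 0], [0, 1, 1, 0, 0], [0, 0, 0, 0, 0]]
moved = [[0, 0, 0, 0, 0], [0, 0, 1, 1, 0], [0, 0, 1, 1, 0], [0, 0, 0, 0, 0]]
frames = [start, moved, empty, moved]

masks = track(frames, click=(1, 1))
print([len(m) for m in masks])  # → [4, 4, 0, 4]
```

In this toy version, the empty third mask corresponds to the object leaving the frame, and the fourth mask shows it being recovered from the remembered position. A real model handles appearance changes, occlusion, and large motion, which this sketch does not attempt.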

Watch the video below to see SAM 2 in action. Pay attention to how the model selects the bike, even when the rider partially covers it.

Potential uses. There are several use cases that Meta had in mind when introducing this model:

  • Video editing: Remove backgrounds or add special effects to objects more easily.
  • Medical: Analyze medical images or videos of surgery.
  • Marine research: SAM has helped segment sonar images of coral reefs.
  • Security: Improve surveillance and threat detection.
  • Mixed reality: Enhance interactive experiences like those offered by Quest 3.

The bottom line. SAM 2 represents a quantum leap in computer vision. It promises to democratize complex video processing and visual analysis tasks. Because of its open-source nature, we’ll see a flood of creative apps based on this technology.

This article was written by Javier Lacort and originally published in Spanish on Xataka.

Image | Meta
