Meta launches its first open-source multimodal AI model, Llama 3.2


Meta has introduced Llama 3.2, its first open-source AI model capable of processing both images and text. The release comes just two months after the company shipped its previous major model, Llama 3.1. The new multimodal capabilities are expected to help developers build more advanced AI applications.


Llama 3.2 Can Process Both Images and Text

The model’s capabilities include real-time video understanding, visual search engines that categorize images based on content, and document analysis tools that summarize long text passages. Meta has designed Llama 3.2 to be developer-friendly, with minimal setup required.

Meta’s VP of Generative AI, Ahmad Al-Dahle, stated that developers only need to "integrate this new multimodality and allow Llama to communicate using images."
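To make that integration concrete, here is a minimal sketch of sending an image together with a text prompt to one of the Llama 3.2 vision models through Hugging Face's transformers library. The model identifier, the Mllama model class, and the example image URL are assumptions about that ecosystem rather than details from Meta's announcement, and the weights themselves are gated behind Meta's license.

```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

# Assumed Hugging Face id for the 11B vision-instruct variant (gated access).
model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"

model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# Any local or remote image works; this URL is a placeholder.
image = Image.open(requests.get("https://example.com/receipt.png", stream=True).raw)

# The chat template inserts the special image token ahead of the text prompt.
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Summarize the contents of this document."},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, add_special_tokens=False, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```

This mirrors the document-analysis use case described above: one image, one instruction, one generated summary.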

This feature brings Meta on par with competitors like OpenAI and Google, who introduced their multimodal models last year. The addition of vision support in Llama 3.2 is a strategic move as Meta continues to enhance AI capabilities in devices like its Ray-Ban Meta glasses.

The Llama 3.2 family comprises two vision models, with 11 billion and 90 billion parameters respectively, and two lightweight text-only models with 1 billion and 3 billion parameters. The smaller models are designed to run on Qualcomm, MediaTek, and other Arm-based hardware.
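For the lightweight text-only variants, a short sketch of how a developer might try one locally with the transformers text-generation pipeline is shown below; the 1B model id is an assumption about the Hugging Face release, and access is again gated behind Meta's license.

```python
from transformers import pipeline

# Assumed Hugging Face id for the 1B instruct variant (gated access).
generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.2-1B-Instruct",
    device_map="auto",
)

# A model this small targets on-device tasks such as short summaries or rewriting.
prompt = "Summarize in one sentence: Llama 3.2 adds vision support to Meta's open models."
result = generator(prompt, max_new_tokens=60, do_sample=False)
print(result[0]["generated_text"])
```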

Despite the launch of Llama 3.2, its predecessor Llama 3.1, released in July, still has a place in Meta's lineup: its 405 billion-parameter version theoretically offers stronger text generation than any of the new models.
