Meta wants the open-source Llama 3.1 405B to compete with heavyweights like GPT-4 and Claude 3.5 Sonnet
Credit: Meta

Llama 3.1 405B is likely the largest open LLM released in recent times. Recommended for applications like synthetic data generation and model distillation, the 405B model is complemented by a refresh to the 70B and 8B models with a 128K-token context window and updated capabilities.

by Ellie Ramirez-Camara

As part of its continued commitment to openly accessible AI, Meta has released Llama 3.1 405B under a permissive license. With this release, Meta has made one of the largest open models in recent times broadly available. The model is available for download on Hugging Face and llama.meta.com, and for access through the cloud service providers in Meta's network of partner platforms, including AWS, NVIDIA, Google Cloud, Microsoft Azure, and more.

Given the hardware requirements for deploying a model of this size, Meta advises developers that Llama 3.1 405B is best suited for tasks such as model distillation and synthetic data generation. Meta even modified its model license to allow model training using Llama models' outputs. General-purpose workloads, including chatbots and assistants, are proposed as ideal use cases for Meta's smaller models, which got a refresh that enhances them with some of Llama 3.1 405B's novel features. Meta's 8B and 70B models share the 405B model's 128K-token context window and its updated multilingual, tool use, and reasoning capabilities. The 8B and 70B models are available on the same channels as Llama 3.1 405B.

Llama 3.1 405B was trained using a refined training dataset built from the one used to train the previous iteration of the Llama models. The company also claims it has balanced the model's training with synthetic data, although it has declined to share more details. The model was evaluated against GPT-4, GPT-4o, and Claude 3.5 Sonnet on several benchmarks testing general knowledge, mathematics, coding, tool use, reasoning, and long-context and multilingual capabilities, with mixed results. Overall, it comes across as a performant model practically on par with its closed-source competitors, consistent with Meta's statements about the model.

Along with the Llama 3.1 405B model launch, Meta continues to advance its generative AI strategy by building a full-fledged platform around its models, much like Anthropic and OpenAI have done recently. Meta's Llama System centers on the idea of responsible AI development. In its current iteration, it is a reference system that includes sample applications and safety tools, such as the multilingual safety model Llama Guard 3 and the prompt injection filter Prompt Guard. However noble the goal of democratizing access to open AI may be, Meta is still a for-profit entity, much like the other top contenders in the market. As such, it's in the company's best interest that open access to AI becomes synonymous with the Meta brand name.
