Nova is Amazon's new family of multimodal foundation models
At re:Invent 2024, Amazon launched the Nova family of foundation models, which spans a text-only model, multimodal understanding models, and image and video generation models. Some of the models support fine-tuning and model distillation, and the family is aimed at enterprise use and AI-powered advertising.
Amazon Web Services (AWS) launched Nova on Tuesday at the re:Invent 2024 conference. The release includes:
- Amazon Nova Micro, a text-only model;
- Amazon Nova Lite, Pro, and Premier—three foundation models that can process multimodal inputs (text, image, video) and output text;
- Amazon Nova Canvas, an image generation model; and
- Amazon Nova Reel, a video generation model.
All the Nova models are available in Amazon Bedrock except Premier, which is scheduled for a Q1 2025 launch. Nova Lite and Pro feature a 300K-token context window, while Micro supports 128K-token inputs; all models can generate up to 5K tokens as outputs. Moreover, these models reportedly support over 200 languages, although they are optimized for English, German, Spanish, French, Italian, Japanese, Korean, Arabic, Simplified Chinese, Russian, Hindi, Portuguese, Dutch, Turkish, and Hebrew.
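The reported limits can be captured in a small lookup to check which models could accept a given request. This is a minimal sketch, not an official API: the model labels and the helper `models_that_fit` are hypothetical names chosen for illustration (they are not Bedrock model identifiers), and it assumes that "128K", "300K", and "5K" mean 128,000, 300,000, and 5,000 tokens and that the input limit is counted separately from the output limit.

```python
# Limits as reported in the article. The keys are illustrative labels,
# NOT official Amazon Bedrock model identifiers (assumption).
INPUT_LIMIT_TOKENS = {
    "nova-micro": 128_000,  # assumes "128K" means 128,000 tokens
    "nova-lite": 300_000,   # assumes "300K" means 300,000 tokens
    "nova-pro": 300_000,
}
MAX_OUTPUT_TOKENS = 5_000   # assumes "5K" means 5,000 tokens


def models_that_fit(input_tokens: int, output_tokens: int = 0) -> list[str]:
    """Return the labels of models whose reported limits cover the request.

    Hypothetical helper: it only compares token counts against the
    figures quoted in the article and does not call any service.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if output_tokens > MAX_OUTPUT_TOKENS:
        return []  # no model is reported to generate this much output
    return [
        name
        for name, limit in INPUT_LIMIT_TOKENS.items()
        if input_tokens <= limit
    ]


if __name__ == "__main__":
    # A 200,000-token input exceeds Micro's reported limit but fits Lite and Pro.
    print(models_that_fit(200_000, output_tokens=1_000))
```

Token counts depend on each model's tokenizer, so a real application would need the provider's own counts rather than an estimate.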
Micro, Lite, and Pro can be fine-tuned on customer data to improve their performance. They also support model distillation, in which a larger, more capable model transfers specific knowledge to a smaller, faster, and more affordable model. With enterprise use cases in mind, Amazon highlights the multimodal Nova models' strength in retrieval-augmented generation. The company claims these models are the fastest in their classes and significantly more affordable than the leading models in the relevant categories.
The Canvas and Reel models already power Amazon's AI-based advertising solutions, which are meant to make image- and video-based advertising campaigns accessible to all advertisers. The company claims that businesses using them advertise five times more products, with twice as many images, as those not using AI-powered ads.
In addition to announcing the Nova model family, Amazon said it plans to release two more models during 2025: a speech-to-speech model intended to achieve nuanced understanding of spoken language and enable conversational interactions, and an any-to-any model that takes text, images, audio, and video as inputs and produces them as outputs.