Vertex AI positions itself as an enterprise-ready generative AI solution with new models and capabilities
Google Cloud's Vertex AI platform has enhanced its enterprise-ready generative AI offering with new models like Gemini 1.5 Flash and Imagen 3, third-party model integrations, cost-saving features, and advanced data grounding options to boost performance.
The latest update to Vertex AI introduces powerful new models and features designed to meet the needs of businesses across industries, solidifying Vertex AI's position as a leading enterprise-ready generative AI solution.
Leading the model announcements is the general availability of Gemini 1.5 Flash, a compact yet performant model that boasts a 1 million-token context window, low latency, and competitive pricing. Reportedly, Gemini 1.5 Flash processes 10,000-character inputs 40% faster than GPT-3.5 Turbo and is 4 times more affordable when context caching is enabled for inputs larger than 32,000 characters. Early adopters such as Uber Eats, Ipsos, and Jasper.ai report impressive performance gains after experimenting with Gemini 1.5 Flash. Joining Gemini 1.5 Flash is Gemini 1.5 Pro, a model with a boundary-pushing 2 million-token context window that enables analysis of extensive code bases, research libraries, and multimedia content.
On the media generation front, Google will preview Imagen 3, its latest image generation model, in Vertex AI. Imagen 3 delivers high-quality generations and improves on Imagen 2 with 40% faster image generation, better prompt understanding and instruction following, more realistic renderings of challenging concepts such as groups of people, and improved text rendering within images. Imagen 3 supports multiple languages and several aspect ratios and ships with safety features including SynthID digital watermarking. Rounding out the model offerings in Vertex AI, Anthropic's Claude 3.5 Sonnet is now available on the platform. Mistral Small, Mistral Large, and Codestral will be added to the Vertex AI Model Garden later, as will Gemma 2 9B and 27B.
Enterprise-focused features include context caching, which can reduce input costs by 75%, and provisioned throughput for predictable performance at scale. Google is also expanding its data residency guarantees to meet stringent data sovereignty requirements. Of all the enterprise-focused updates, the general availability of Grounding with Google Search is the standout: it lets businesses augment Gemini outputs with fresh, high-quality information from Google Search.
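To put the context-caching discount in concrete terms, here is a minimal back-of-the-envelope sketch. The 75% reduction is the figure from the announcement; the per-token price used below is made up for illustration, and real billing also includes a cache storage fee that this sketch omits.

```python
def input_cost(tokens: int, price_per_million: float, cached: bool = False) -> float:
    """Estimate input cost in dollars for a single request.

    Assumes context caching applies a flat 75% discount to input tokens,
    per the announced figure. The price_per_million value is hypothetical;
    cache storage fees are not modeled here.
    """
    cost = tokens / 1_000_000 * price_per_million
    return cost * 0.25 if cached else cost

# Hypothetical price of $0.50 per million input tokens on a 500K-token prompt:
full = input_cost(500_000, 0.50)                      # $0.25
discounted = input_cost(500_000, 0.50, cached=True)   # $0.0625
```

At large context sizes, where a long shared prefix (a code base, a document corpus) is resent on every request, this discount is what makes repeated long-context calls economical.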
This capability, together with the upcoming third-party data grounding service that lets users ground their AI agents with data from providers including Moody's, MSCI, Thomson Reuters, and ZoomInfo, aims to maximize factuality and minimize hallucinations in AI-generated content. Finally, grounding with high-fidelity mode, currently in experimental preview, unlocks proprietary-data grounding use cases such as summarization across multiple documents, data extraction against a set corpus of financial data, and processing across a predefined set of documents. The feature is powered by a fine-tuned version of Gemini 1.5 Flash designed to rely on customer-provided data only.
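For orientation, enabling Grounding with Google Search amounts to attaching a search-retrieval tool to a Gemini generateContent request. The sketch below shows the approximate shape of such a request body; the exact field names and response schema are assumptions to verify against the current Vertex AI REST reference, and the prompt text is purely illustrative.

```python
# Assumed shape of a Vertex AI generateContent request body that enables
# Grounding with Google Search; check field names against the REST docs.
grounded_request = {
    "contents": [
        {
            "role": "user",
            "parts": [{"text": "Summarize this week's updates to Gemini 1.5 Flash."}],
        }
    ],
    "tools": [
        # An empty googleSearchRetrieval config enables search grounding
        # with default settings (field name assumed from public docs).
        {"googleSearchRetrieval": {}}
    ],
}
```

A grounded response is expected to carry grounding metadata listing the web sources the model drew on, which is what lets applications surface citations alongside the generated text.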
With these enhancements, Vertex AI is positioning itself as a comprehensive, scalable, and trustworthy platform for organizations looking to harness the full potential of generative AI while maintaining control, efficiency, and data sovereignty in their AI initiatives.