This Week in AI: February 25 - March 2
Every weekend, we bring together a selection of the week's most relevant headlines on AI. Each weekly selection covers startups, market trends, regulation, debates, technology, and other trending topics in the industry.
Glean secured $200M at a $2.2B valuation
The company restated its commitment to remaining an industry-leading enterprise generative AI solution as it vowed to invest the raised funds in developing "the most secure, comprehensive, and intuitive enterprise AI platform of the future." Glean's AI-powered assistant offers enterprise users customized answers to their queries, obtained through the company's search and RAG technologies and grounded in each organization's unique knowledge graph. Glean also reported record levels of user engagement, stating that Glean Assistant users perform about 14 queries daily and save 2 to 3 hours weekly on average. Glean's funding round was led by Kleiner Perkins and Lightspeed Venture Partners, with participation from a selection of new, strategic, and existing investors.
Adobe announces experimental generative AI music generation and editing tool
Adobe's Project Music GenAI Control will enable users to create music from text prompts and then edit the generated audio to their exact specifications. The first step is to specify a text prompt such as "powerful rock," "happy dance," or "sad jazz" for the tool to generate music. Then, fine-grained editing tools let users transform the generated audio by adjusting the tempo, intensity, and structure. Users can also build loops, extend the length of an existing clip, and even remix sections of an existing piece of music. Adobe is developing Project Music GenAI Control in collaboration with research teams from the University of California, San Diego, and Carnegie Mellon University's School of Computer Science.
SambaNova introduced Samba-1, one of the first trillion-parameter generative AI models
Samba-1 was built using a composition-of-experts architecture: it combines a number of smaller expert models that are only queried when a prompt is relevant to them, making inference faster and more cost-effective. This is similar to the structure of the next-generation Gemini 1.5 models and, allegedly, to GPT-4's architecture. The model is available as part of the SambaNova Suite platform, allowing for a fast and flexible deployment that addresses any concerns about data privacy, transparency, and security that enterprise customers may have when looking into generative AI solutions.
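SambaNova has not published Samba-1's routing internals, but the general idea behind a composition of experts can be sketched in a few lines: a lightweight router inspects the prompt and forwards it only to the relevant expert models, so most of the system stays idle for any given request. The snippet below is a purely illustrative sketch with hypothetical experts and a toy keyword router, not SambaNova's implementation.

```python
# Illustrative composition-of-experts sketch (not SambaNova's implementation).
# Each "expert" stands in for a smaller model that is invoked only when the
# router deems the prompt relevant to it.
from typing import Callable, Dict, List

EXPERTS: Dict[str, Callable[[str], str]] = {
    "legal":   lambda prompt: f"[legal expert] answer to: {prompt}",
    "finance": lambda prompt: f"[finance expert] answer to: {prompt}",
    "code":    lambda prompt: f"[code expert] answer to: {prompt}",
}

# Toy keyword router; a real system would use a learned classifier or
# embedding similarity to pick experts.
ROUTING_KEYWORDS = {
    "legal":   ["contract", "clause", "liability"],
    "finance": ["revenue", "forecast", "invoice"],
    "code":    ["python", "function", "bug"],
}

def route(prompt: str) -> List[str]:
    """Return the names of experts whose keywords appear in the prompt."""
    lowered = prompt.lower()
    matches = [name for name, words in ROUTING_KEYWORDS.items()
               if any(w in lowered for w in words)]
    return matches or ["finance"]  # arbitrary fallback expert

def answer(prompt: str) -> str:
    # Only the selected experts run, which is what keeps inference cheap.
    return "\n".join(EXPERTS[name](prompt) for name in route(prompt))

print(answer("Write a Python function that parses an invoice"))
```

In this toy example, the invoice-parsing prompt activates both the code and finance experts while the legal expert never runs, which is the cost-saving behavior the architecture is built around.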
Lightricks launches LTX Studio, an AI-powered filmmaking studio
LTX Studio is a holistic platform that assists filmmakers throughout the creation cycle, from ideation to final editing. Currently, the web-based tool can be accessed via waitlist registration only, although the company plans to make it publicly available free of charge soon. LTX Studio can suggest a script, storyboard, and characters in response to any prompt containing the user's idea. The suggested storyboard shows the different scenes cut into shots. Scenes can be modified via prompting to customize features such as style, weather, or location, while individual shots can be tailored using the incorporated shot editor. Characters can be modified, added, and removed using the dedicated tab. The LTX Studio launch is one of Lightricks' first steps in shifting from a consumer-only company to one catering to media creation professionals.
Jamix raised $3M to join the enterprise AI assistant landscape
The pre-seed funding round was led by Audacious Ventures, with participation from several angel investors. As motivation, the startup cites the issues stemming from workplace ChatGPT adoption, which have led multiple businesses and enterprises to consider banning the tool. Jamix boldly recognizes that some big tech companies are already developing enterprise-grade AI assistants but is also confident that competition is still possible in this market. According to the startup, its ability to connect to apps, data, and APIs, which ChatGPT cannot do (at least not out of the box), and its model-agnostic design are essential differentiators from the competition. It is somewhat difficult to see these features as true differentiators, given the number of companies and startups advertising similar solutions. However, Jamix may stand its ground if it succeeds at offering a more affordable platform than the competition.
Elon Musk sues OpenAI citing a "breach of contract" due to the company's profit pursuits
In short, Musk claims that OpenAI has betrayed its "Founding Agreement" by pursuing profit, especially given its close relationship with Microsoft. The alleged "Founding Agreement" is not an explicit contract, nor does it exist as a standalone document. Musk's legal team is arguing that the Agreement is memorialized in several places, such as OpenAI's Certificate of Incorporation and written communications between the founding parties, and that it somehow commits OpenAI to remain a non-profit and to open-source just about every piece of technology it develops. A widely circulated email from 2015 between Musk and Altman is being cited as evidence for the case, even though it makes no reference to OpenAI having to remain a non-profit, and it explicitly states that OpenAI's members would jointly decide what should be open-sourced and what shouldn't.
In another highly dramatic claim, Musk states that GPT-4's capabilities have been downplayed to prevent the model from being classified as an artificial general intelligence (AGI). By claiming that GPT-4 is an AGI, Musk is looking for the court to rule that GPT-4 falls beyond the OpenAI and Microsoft licensing agreement, and given that OpenAI's intended mission was to put AGI at the service of the common good, that OpenAI must open-source the technology behind GPT-4. Musk is also looking for restitution and damages, which seems unexpected for money granted as a donation rather than an investment.
Hugging Face, ServiceNow, and NVIDIA have released StarCoder 2, a family of open-access code-generation LLMs
StarCoder 2 is developed through a partnership between Hugging Face, ServiceNow, and, most recently, NVIDIA. Two StarCoder 2 models can run on several modern consumer GPUs: a 3B-parameter model trained by ServiceNow and a 7B model trained by Hugging Face. NVIDIA also built and trained a 15B-parameter model. StarCoder 2 was trained on over 600 programming languages and can be quickly fine-tuned to create applications such as coding chatbots and personal assistants that generate source code and workflows or summarize text, among several other tasks. StarCoder 2 is available under the BigCode Open RAIL-M license, enabling royalty-free access and use. The license has drawn criticism because it restricts some applications and is therefore not a truly open license.
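For readers who want to try the smaller checkpoints locally, the open weights can be loaded with Hugging Face's transformers library. The sketch below assumes the bigcode/starcoder2-3b model ID on the Hugging Face Hub and a recent transformers release with StarCoder 2 support; it is a minimal example, not an official quickstart.

```python
# Minimal sketch: code completion with the 3B StarCoder 2 checkpoint.
# Assumes `pip install transformers accelerate torch` and the
# "bigcode/starcoder2-3b" model ID on the Hugging Face Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bigcode/starcoder2-3b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # halves memory use on a consumer GPU
    device_map="auto",          # falls back to CPU if no GPU is available
)

prompt = "def fibonacci(n: int) -> int:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```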
GitHub Copilot Enterprise is now generally available
GitHub Copilot Enterprise is available via a $39 per user monthly subscription and has three core features as its selling points. First, Copilot Enterprise makes it easier for developers to navigate and understand code, which leads to several benefits, such as faster feature implementation, issue resolution, and code modernization. The assistant also integrates natively into github.com, letting users ask questions about the codebase in natural language while Copilot guides them to relevant documentation and possible resolutions. Finally, Copilot Enterprise can generate pull request summaries and analyze the differences between pull requests, saving developers time. A beta feature also integrates Bing directly into the assistant's chat interface, enabling developers to complement their organization's internal knowledge base with outside information. Copilot Enterprise became available on February 27 alongside the existing Copilot plans.
AI image editor Photoroom raised $43M at a $500M valuation
Initially, Paris-based Photoroom targeted professionals editing product pictures. However, the editor has started to gain traction amongst casual users, emerging as a competitor to consumer-focused editors such as Picsart and Pixelcut. Photoroom is available via mobile and web apps and can be accessed via API. The company reports that the mobile app boasts 150 million downloads and processes around 5 billion images yearly. The editor offers a suite of AI-powered tools, including a background remover and generator, an AI image enhancer, a profile picture maker, and several other tools. Balderton Capital led the funding round alongside new investor Aglaé, returning investor Y Combinator, and other undisclosed investors. Photoroom plans to use the funding to grow its team and further its research and development efforts.
Writer launched a trio of proprietary LLMs: Palmyra Small, Base, and Large
The Palmyra models are trained for business writing and marketing data tasks and can be fine-tuned with company data, including branding and style guidelines. The models can ingest several file formats and perform web searches to assist users in research, analysis, and data transformation tasks. Palmyra Large (20B) is already powering the AI generation tools for Writer users with Enterprise plans. Enterprise customers can also integrate Palmyra Large's capabilities into over 100 third-party applications using the Writer API. Customers with Teams plans can leverage Palmyra Small (128M) and Base (5B) free of charge; these two models are also open-source and can be downloaded from Hugging Face. The Palmyra models are SOC 2 Type II, PCI, and HIPAA certified, and Writer neither stores proprietary data nor uses it for model training.
Mistral AI releases Mistral Large, its new flagship model, and announces beta access for its chat-based assistant
Mistral Large is an advanced multilingual text generation model that can be used for text understanding, transformation, and code generation, among other tasks. Mistral Large is fluent in English, French, Spanish, German, and Italian; it boasts a 32K-token context window, instruction-following capabilities, and native function calling. Mistral Large is available via La Plateforme, Mistral's European access point, and Azure. A second, smaller model, Mistral Small, was released alongside Mistral Large. Mistral Small is optimized for latency and cost, outperforming Mixtral 8x7B and positioning itself as an intermediate option between the flagship and the open-weight models.
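For developers, access through La Plateforme boils down to a chat completions call. The sketch below uses a plain HTTPS request rather than an SDK; the endpoint path and the mistral-large-latest model name follow Mistral's public API documentation, and the API key is read from an environment variable as a placeholder.

```python
# Minimal sketch of a chat completion against Mistral Large via La Plateforme.
# Endpoint and model name follow Mistral's public API docs; the API key is a placeholder.
import os
import requests

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-large-latest",
        "messages": [
            {"role": "user", "content": "Summarize this contract clause in French: ..."}
        ],
        "temperature": 0.3,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```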
The Le Chat conversational interface was launched alongside Mistral Large and Small to showcase their capabilities. The assistant can be powered by Mistral Large, Small, or Next (a prototype model designed to be brief and concise). The interface is self-moderating, with the assistant pushing a warning whenever it detects that the conversation may lead to sensitive or controversial content. Mistral is also working on an enterprise-ready version of the assistant, Le Chat Enterprise. Interested users can gain access to Le Chat via La Plateforme.
Tenstorrent was selected by Japan's Leading-edge Semiconductor Technology Center (LSTC) to build a new AI accelerator
Tenstorrent will license its RISC-V and chiplet IP and collaborate with LSTC to co-design the chip powering the cutting-edge AI accelerator. Rapidus Corporation will also contribute to the project by providing its Rapid and Unified Manufacturing Service (RUMS) to shorten turnaround times for the semiconductor manufacturing process. Tenstorrent will co-develop the AI accelerator's chiplet using its Ascalon RISC-V CPU core technology. Given Tenstorrent's and Rapidus' trajectories as industry leaders in their respective fields, it is no surprise that both companies were LSTC's top picks for the project.
NVIDIA launches AI workload-capable RTX 500 and 1000 Professional Ada Generation Laptop GPUs
The new GPUs are based on the NVIDIA Ada Lovelace architecture and will be available this spring in thin, highly portable workstations from NVIDIA partners Dell, HP, Lenovo, and MSI. Workstations equipped with an Ada Generation GPU will pair a neural processing unit (NPU) built into the CPU with an NVIDIA RTX GPU featuring Tensor Cores. The NPU will help offload lighter AI tasks, while the GPU powers more demanding workflows. Media creation professionals, along with video conferencing and streaming users, can rely on the RTX 500 and 1000 to streamline daily tasks and take advantage of the high-quality features available in conferencing and streaming platforms. Users looking to deploy heavier workloads, such as researchers, data scientists, and creatives needing to perform advanced renders, should look into the RTX 2000, 3000, 3500, 4000, and 5000 GPUs, as they offer enough compute power to, among other things, experiment with data science and AI model training and tuning.
Ideogram announces Ideogram 1.0 availability and Series A funding round
The new text-to-image model was trained from scratch and delivers remarkable photorealism, prompt adherence, and everyone's new favorite feature: state-of-the-art text rendering with a lower error rate than DALL-E's. Additionally, a feature named "Magic Prompt" embellishes users' simple text prompts to help them generate creative images. The announcement showcases an impressive selection of generated images; however, upon closer inspection, even those are not exempt from odd fabric drapings, a two-stemmed pumpkin, or a four-eared cat. Regardless, these flaws are subtle and do not overshadow Ideogram's text rendering capabilities, the true star of the show.
Ideogram also announced that it secured $80 million in Series A funding led by Andreessen Horowitz. Existing investor Index Ventures and new investors Redpoint Ventures, Pear VC, and SV Angel also participated in the Series A round. As part of the negotiations, Martin Casado, General Partner at Andreessen Horowitz, will join Ideogram's board of directors.
Sridhar Ramaswamy is the new Snowflake CEO
Former CEO Frank Slootman decided to retire from this role but remains Chairman of Snowflake's Board of Directors. Ramaswamy, formerly Snowflake's Senior Vice President of AI, is now the Chief Executive Officer and a member of the Board of Directors. He has pledged to help Snowflake customers "leverage AI to deliver massive business value" as he ushers in the company's next stage. Sridhar Ramaswamy joined Snowflake in May 2023 with the Neeva acquisition and has since been at the forefront of the company's AI strategy, leading the launch of Snowflake Cortex, a fully managed service that lets customers analyze data and build AI applications within Snowflake. Before joining Snowflake, Ramaswamy co-founded Neeva in 2019.
Stack Overflow launches its Overflow API in partnership with Google
Stack Overflow plans to grant AI companies access to its knowledge base via the Overflow API. Its launch partner for this service is Google, which will conveniently be able to enrich Gemini with the Stack Overflow corpus and display Stack Overflow-verified answers in the Google Cloud console. In return, Stack Overflow will be able to further its AI-powered offerings after embarking on the AI path with the OverflowAI launch; as with other Google partnerships, this will take shape by harnessing Gemini via Vertex AI. The companies plan to show the fruits of their collaboration at the upcoming Cloud Next conference. Google may even harness the partnership to keep developing its coding assistant, Codey.
Alibaba's Institute for Intelligent Computing introduced EMO, an AI system that generates realistic talking and singing videos from a single portrait
The AI system EMO, short for Emote Portrait Alive, turns reference portraits into videos where the subject realistically sings or talks. The research team has made the paper describing EMO available as a preprint. EMO can generate facial expressions and head positions congruent with a provided reference audio track. As a result, EMO represents a groundbreaking development in audio-driven video generation. Like many other breakthroughs in synthetic media generation, the EMO system is powered by a diffusion model trained on a substantial dataset of talking head videos extracted from several sources. Since it is a direct audio-to-video generation system, EMO bypasses the 3D model and shape blending stages. Researchers credit this feature as the key to capturing subtle nuances and quirks that result in more natural-looking videos. To address potential misuse of the technology, the researchers are looking into methods to detect synthetically generated videos.
Figure AI raised $675M at a $2.6B valuation and signed a collaboration agreement with OpenAI
Speculation about Figure's funding talks arose a month ago when a source close to the negotiation reported that Figure AI was looking to raise $500 million from several investors, including Microsoft and OpenAI. At $675 million, the final amount seems to have exceeded Figure's expectations. Moreover, the investors involved in the deal include some of the industry's key players, including NVIDIA, Jeff Bezos (via Bezos Expeditions), and Intel (via Intel Capital). The collaboration agreement between Figure and OpenAI will enable the development of the next generation of language-processing humanoid robots by combining Figure's robotics hardware and software expertise with OpenAI's research. The collaboration will be supported by Microsoft Azure, which will cover Figure AI's needs for AI training and infrastructure.
Brave launches Leo, its privacy-protecting in-browser assistant, on Android devices
Brave Leo started as a desktop-based assistant with a chat interface built into the browser and the possibility to choose which model powers the experience. Users can leverage Leo for daily tasks such as question answering, translation, summarization, content creation, code generation, and more. Now, Brave is announcing the availability of the same experience across Android devices, where users can safely interact with AI to maximize their productivity. Leo supports Mixtral 8x7B, Claude Instant, and Llama 2 13B out-of-the-box. It is also a multilingual assistant fluent in English, French, German, Italian, and Spanish.
Because of its impressive features, Brave ships Leo with Mixtral as the default model, but users can always change the model or upgrade to Leo Premium's $14.99 subscription for higher rate limits. Leo guarantees users' privacy by proxying their queries, discarding conversations (for Brave-hosted models), not requiring an account for its free tier, and issuing unlinkable tokens so subscription information cannot be tied to usage patterns. Brave has also announced the upcoming availability of Leo for iOS devices.
Groq leverages Definitive Intelligence acquisition to launch GroqCloud
Groq is the company behind the LPU™ Inference Engine, the fastest language processing accelerator on the market and an end-to-end solution for running LLMs and other generative AI applications more efficiently. Groq recently acquired Definitive Intelligence, a company developing automated agents that perform deep analyses of customer data to unlock actionable insights. GroqCloud provides access to the LPU Inference Engine via a self-serve playground so customers can deploy AI applications while benefiting from Groq's speed. The Definitive Intelligence acquisition will also enable the creation of a Groq Systems business unit centered on the needs of the public sector and of customers that require Groq's hardware for AI computing centers.
Enkrypt AI secured $2.35 million to continue its mission to facilitate safe, reliable, and compliant generative AI adoption
Enkrypt AI is building Sentry, a comprehensive solution that leads to reliable and compliant LLM adoption by looking out for common threats, such as unreliable outputs, prompt injections and related vulnerabilities, and data privacy issues. Research results and product updates are forthcoming, but Enkrypt AI has publicly acknowledged that none of its progress would have been possible without the support of an impressive list of investors, which includes BoldCap, Berkeley SkyDeck, and Kubera Venture Capital, among other participants.
FlowGPT, the platform supporting an open community of AI application creators, secured $10M in Pre-Series A funding
The funding round was led by Goodwater Capital, with participation from previous investor DCM, which helped FlowGPT come out of stealth. Launched in January 2023, the platform has since attracted millions of monthly users from 110 countries, and its community has developed over 100,000 AI applications leveraging a variety of LLMs, from GPT-4 and PaLM to lesser-known open-source models such as Pygmalion. FlowGPT's founders were driven to pursue their vision of an all-in-one AI app store and community after recognizing that LLMs would democratize software development thanks to their ease of use and the ability to drive them with natural language prompts. FlowGPT has stated that it will invest the funding in growing its engineering and research teams and in fostering the further development of its community.
NLX raised $12M in Series A funding round
NLX's Series A was led by Cercano, with participation from Thayer Ventures, HL Ventures, and other existing investors. NLX provides an enterprise-ready end-to-end AI platform that delivers customized multimodal conversational applications. Brands can deploy the applications throughout their organization and benefit from industry-leading models from Amazon, Google, Microsoft, and the open-source community. NLX's conversational applications already deliver exceptional customer experiences for several enterprise customers, including Red Bull and Copa Airlines.
Google is bringing AI to data analysis by connecting BigQuery and Gemini via Vertex AI
Vertex AI integration in BigQuery means users can now access Gemini 1.0 Pro using simple SQL statements or the embedded DataFrame API without leaving the BigQuery console. The integration enables the creation of data pipelines combining structured and unstructured data with generative AI models. Moreover, Google will expand support for the Gemini 1.0 Pro Vision model, meaning users can use SQL queries to analyze multimodal data, such as videos and images.
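As a rough illustration of what "Gemini through SQL" looks like in practice, the sketch below registers Gemini as a remote BigQuery ML model over a Vertex AI connection and then calls ML.GENERATE_TEXT over a table, all from Python. The project, dataset, table, and connection names are hypothetical placeholders, and the exact options should be checked against the BigQuery ML documentation.

```python
# Minimal sketch: calling Gemini 1.0 Pro from BigQuery via BigQuery ML.
# Project, dataset, table, and connection names are hypothetical; the SQL
# follows the documented CREATE MODEL ... REMOTE / ML.GENERATE_TEXT pattern.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project ID

# One-time setup: register Gemini as a remote model over a Vertex AI connection.
client.query("""
CREATE OR REPLACE MODEL `my_dataset.gemini_pro`
REMOTE WITH CONNECTION `us.my_vertex_connection`
OPTIONS (ENDPOINT = 'gemini-pro')
""").result()

# Generate text for every row of a table with a single SQL statement.
rows = client.query("""
SELECT ml_generate_text_llm_result AS summary
FROM ML.GENERATE_TEXT(
  MODEL `my_dataset.gemini_pro`,
  (SELECT CONCAT('Summarize this support ticket: ', ticket_text) AS prompt
   FROM `my_dataset.tickets`),
  STRUCT(0.2 AS temperature, 256 AS max_output_tokens, TRUE AS flatten_json_output))
""").result()

for row in rows:
    print(row.summary)
```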
BigLake already unifies data lakes and warehouses under one framework, enabling users to store, share, and perform several tasks on unstructured data. The recent Vertex AI integration will unlock even more insights from that data by making Vertex AI's document processing and speech-to-text APIs available within BigLake. These capabilities enable the creation of generative AI applications for tasks such as content generation, sentiment analysis, entity extraction, and more. Finally, BigQuery vector search is coming out of preview, unlocking safe and compliant use cases such as retrieval-augmented generation (RAG), semantic search, text clustering, and summarization.