This week in AI: January 21-26
Every Saturday, we bring together a selection of the week's most relevant headlines on AI. Each weekly selection covers startups, market trends, regulation, debates, technology, and other trending topics in the industry.
The Federal Trade Commission (FTC) issued 6(b) Orders to Alphabet, Amazon, Anthropic, Microsoft, and OpenAI
The FTC will examine the companies' corporate partnerships and investments to better understand how those relationships may affect the competitive landscape. The compulsory 6(b) Orders require the five companies to provide information on recent investments and partnerships involving generative AI companies and cloud service providers. New technologies always bring the potential for innovation, healthy market growth, and competition. However, when technology and resources are concentrated in the hands of a few dominant players, innovation can be stunted and fair competition undermined. The FTC's inquiry was motivated by the multi-billion-dollar partnership between Microsoft and OpenAI and by Google's and Amazon's investments in Anthropic.
OpenDialog raised over $8M to keep bringing generative AI automation to regulated industries
OpenDialog recently closed a Series A funding round of over $8 million, led by AlbionVC with support from Dowgate Capital. The investment will enable OpenDialog to continue developing its conversational AI platform for the healthcare and insurance industries. In the longer term, OpenDialog plans to become "the go-to platform for companies navigating generative AI in regulated industries."
DXwand, a startup developing a conversational AI platform that serves the MENA region, raised $4M in Series A funding
DXwand develops conversational AI tools that help businesses in the Middle East and North Africa region deploy automated customer service and employee assistance solutions. The company recently completed a Series A funding round led by Shorooq Partners (UAE) and Algebra Ventures (Cairo), with the participation of the Dubai Future District Fund. DXwand CEO Ahmed Mahmoud formerly worked at Microsoft, where he realized that Silicon Valley's AI providers didn't cater to Arabic or the other languages of the MENA region. He decided to bridge this gap and launched DXwand. The startup's chatbot platform currently serves over 40 users in the MENA region from sectors including healthcare, e-commerce, fintech, telecom, government, and legal. The startup plans to put the funding toward expanding its research efforts, developing new partnerships, and expanding into the African and Saudi Arabian markets.
Lumiere: a space-time diffusion model that generates videos featuring coherent, realistic, and diverse motion
Researchers from Google, the Weizmann Institute of Science, and Tel Aviv University introduced Lumiere, a space-time diffusion model for video generation. Lumiere supports text-to-video and image-to-video generation, and it can generate video in the style of a single reference image. Moreover, it can be used for text-based video editing, animating the contents of a user-specified region, and video inpainting from a masked source. One of the innovations powering Lumiere is the Space-Time U-Net architecture, which generates the full temporal extent of the video in a single pass. This approach yields better accuracy and temporal consistency than existing methods that first generate distant keyframes and then run temporal super-resolution models to fill in the gaps between them.
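To make the Space-Time U-Net idea more concrete, here is a minimal PyTorch sketch (not Lumiere's actual code; the module and layer choices are illustrative assumptions) of a block that downsamples a video in both space and time, which is what lets the network process the whole clip in one pass rather than keyframe by keyframe.

```python
# Conceptual sketch only -- not Google's Lumiere implementation. It illustrates
# the core idea of a Space-Time U-Net block: downsampling a video in BOTH
# space and time so the network handles the full clip in a single pass.
import torch
import torch.nn as nn

class SpaceTimeDownBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        # Factorized convolutions: a spatial 1x3x3 conv followed by a temporal 3x1x1 conv.
        self.spatial = nn.Conv3d(in_ch, out_ch, kernel_size=(1, 3, 3), padding=(0, 1, 1))
        self.temporal = nn.Conv3d(out_ch, out_ch, kernel_size=(3, 1, 1), padding=(1, 0, 0))
        # Stride-2 pooling over time AND space, unlike spatial-only U-Nets.
        self.down = nn.MaxPool3d(kernel_size=2, stride=2)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, frames, height, width)
        x = self.act(self.spatial(x))
        x = self.act(self.temporal(x))
        return self.down(x)

block = SpaceTimeDownBlock(in_ch=8, out_ch=16)
video = torch.randn(1, 8, 16, 64, 64)   # 16 frames of 64x64 features
print(block(video).shape)                # torch.Size([1, 16, 8, 32, 32])
```

A full model would stack such blocks with matching upsampling stages, but the shape change from 16 frames down to 8 shows the temporal compression that distinguishes this design from U-Nets that only downsample spatially.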
ElevenLabs closes $80M Series B round and releases new AI-powered voice products
The startup's $80M Series B round was co-led by Andreessen Horowitz, Nat Friedman, and Daniel Gross, with participation from Sequoia Capital, Smash Capital, SV Angel, BroadLight Capital, and Credo Ventures. ElevenLabs is counting on the Series B funding to cement its position as an industry leader in AI voice research and product development. The company also unveiled three new AI voice products: the Dubbing Studio workflow, the Voice Library marketplace, and the Mobile App reader. The Dubbing Studio gives users who need to dub entire video files fine-grained control over transcripts, translations, and timecodes. The Voice Library lets users earn money by offering AI versions of their voices on a secure platform. The Mobile App reader instantly converts text and URLs into audio, letting users consume content in a different medium when needed. The app comes with an introductory three-month free trial, and users can already register for early access.
AiDash raised $50M to continue making critical infrastructure climate-resilient and sustainable
AiDash is an enterprise SaaS company that uses satellites and AI to make essential infrastructure more weather-resistant and sustainable. The oversubscribed Series C round brought the company's total funding to $83 million. The round was led by Lightrock, whose partner Ashish (Ash) Puri will join the AiDash board of directors as part of the deal; SE Ventures and the company's previous investors also participated. AiDash is currently experiencing a surge in demand for its services, including its Intelligent Vegetation Management System™, which helps utilities optimize grid maintenance cycles and prevent damage by identifying vegetation risks, and its Intelligent Sustainability Management System™ (ISMS™), a tool that helps companies comply with the UK's Biodiversity Net Gain rules, mandatory legislation introduced in the Environment Act 2021. AiDash will use the funding to double its team over the next two years, establish its European headquarters, and meet growing international demand.
Send AI hopes to end manual data entry work with the help of the €2.2M raised from Gradient Ventures
Send AI is an Amsterdam-based startup founded by Thom Trentelman (CEO) and Philip Weijschede (COO). The startup specializes in services that enable companies to accurately and automatically scan documents and extract relevant information, even from lengthy, complex documents and low-resolution images or scans. To power its solutions, Send AI favors small, locally trained open-source models over the usual proprietary giants such as GPT-4. This choice strengthens data privacy since it keeps sensitive customer data on-premises. Open-source models also enable Send AI to offer a fast, cost-effective service and let customers train their own models. The startup will use the funding to strengthen its team, continue developing its models, and prepare for international expansion.
Oracle finally revealed its Cloud Infrastructure Generative AI service
The Oracle Cloud Infrastructure (OCI) Generative AI service is now generally available for enterprises looking to harness the latest generative AI technologies. OCI Generative AI is a fully managed, API-accessible service that integrates Meta's Llama 2 and Cohere's LLMs into diverse business use cases. The service supports over 100 languages, features an upgraded GPU cluster management experience, and offers adjustable fine-tuning options alongside support for retrieval-augmented generation (RAG) techniques. Oracle also announced the beta of its OCI Generative AI Agents service, which lets customers query their enterprises' data sources in natural language without additional specialist skills. While the initial beta works with OCI OpenSearch, Oracle will expand the service's capabilities and add support for several other data search and aggregation tools.
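Since the Agents service is built around retrieval-augmented generation, a toy sketch helps illustrate the pattern. The snippet below is not the OCI SDK; the sample documents, the bag-of-words "embedding," and the prompt assembly are all stand-ins for what a managed service would provide (dense embeddings, a search index such as OCI OpenSearch, and a hosted LLM call).

```python
# Minimal, self-contained sketch of the retrieval-augmented generation (RAG)
# pattern: retrieve the most relevant document, then ground the model's
# prompt in it. All data and helper functions here are hypothetical.
from collections import Counter
import math

documents = [
    "Q3 revenue grew 12% year over year, driven by cloud services.",
    "The employee handbook allows 25 days of paid leave per year.",
    "Our data retention policy keeps customer records for 7 years.",
]

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real service returns dense vectors.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def answer(question: str) -> str:
    # 1. Retrieve the document most similar to the question.
    q_vec = embed(question)
    best_doc = max(documents, key=lambda d: cosine(q_vec, embed(d)))
    # 2. Ground the prompt in that document; a hosted LLM would complete it.
    return f"Answer using this context:\n{best_doc}\n\nQuestion: {question}"

print(answer("How many days of paid leave do employees get?"))
```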
Google enhanced its ads solutions with a Gemini-powered chat that helps customers deploy better Search campaigns
Gemini, Google's natively multimodal LLM, is powering the Google Ads conversational experience as the first of several upcoming Gemini integrations across the company's ecosystem. The Google Ads conversational experience is currently in beta for English-language customers in the US and UK, and it will roll out internationally in the coming weeks. Advertisers can now rely on the Gemini-powered experience to help them build effective Search campaigns, complete with creatives and keywords. In the coming months, the experience will also suggest images based on those already featured on advertisers' landing pages. AI-generated material will be watermarked using SynthID, and its metadata will clearly label the images as AI-generated. According to Google's data, customers using the new conversational experience are 42% more likely to publish campaigns rated "Good" or "Excellent" according to Google's metric.
An MIT study concluded that AI is too expensive to replace human workers at the moment
The study looked at how cost-effective it would be to replace humans with AI by evaluating the cost and performance of computer vision systems on 1,000 visually assisted tasks drawn from 800 occupations, including teaching, baking, and property appraisal. The results show that, at current costs, automating these vision tasks would only be cost-effective for about 23% of the wages paid for them. The researchers also concluded that broadly cost-effective computer vision is still years away, even assuming a 20% yearly drop in prices. The study's timely release coincides with increased concern about whether AI is out to steal people's jobs. It was not long ago, for instance, that Sam Altman and Marc Benioff weighed in on the topic in a panel at the 2024 World Economic Forum meeting. Although the two shared vastly different predictions for the future, neither Altman nor Benioff touched on how costly (and resource-intensive) it would be for AI to replace humans.
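For intuition, the study's core economic comparison can be reduced to a back-of-the-envelope check: automate a task only if the system's annual cost falls below the wages attributable to that task. The numbers below are entirely hypothetical and only illustrate how the 20% yearly cost decline cited above changes the answer over time.

```python
# Toy cost-effectiveness check with made-up numbers; the MIT study models
# this trade-off in far more detail across 1,000 tasks.
def worth_automating(annual_system_cost: float, wage_share_for_task: float) -> bool:
    return annual_system_cost < wage_share_for_task

annual_system_cost = 150_000.0   # hypothetical cost to build and run a vision system
wage_share_for_task = 90_000.0   # hypothetical wages paid for the automatable task

years = 0
while not worth_automating(annual_system_cost, wage_share_for_task) and years < 30:
    annual_system_cost *= 0.8    # assume a 20% yearly decline in system costs
    years += 1

print(f"Cost-effective after {years} years of 20% annual cost declines.")
```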
RagaAI emerged from stealth by announcing the public release of its comprehensive platform for AI testing
The RagaAI testing platform focuses on improving AI quality and safety, which the company views as two essential roadblocks to overcome on the path to the AI revolution. The platform offers over 300 tests that diagnose a battery of model, data, and operational issues. It is powered by RagaAI DNA, the company's foundation models purpose-built for AI testing. The platform was built to be multimodal and can perform tasks as diverse as debugging LLM-based AI applications and evaluating computer vision tasks across several sectors. A snapshot of the platform's capabilities can be found here.
MLflow 2.10.0 has been released
MLflow 2.10 introduces several new features and improvements, including enhanced support for current and future deep learning use cases, broader GenAI application support, and quality-of-life improvements for the MLflow Deployments Server. MLflow has also revamped its website, which now brings together more content in one place. The complete list of noteworthy upgrades, bug fixes, and documentation updates can be found here.
MLCommons announced the MLPerf Client Working Group, its new effort to build ML benchmarks
The MLPerf™ Client working group will build ML benchmarks for client hardware systems (laptops, desktops, and workstations) running Microsoft Windows and other operating systems. The first working group benchmark targets the Llama 2 models, using MLCommons' past work on the challenges of incorporating Llama 2-based workloads as a springboard. Initial participants include representatives from industry-leading companies such as AMD, Arm, ASUSTeK, Dell Technologies, Intel, Lenovo, Microsoft, NVIDIA, and Qualcomm Technologies. The group's co-chairs, Ramesh Jaladi (Senior Director of Engineering, IP Performance group, Intel), Yannis Minadakis (Partner GM, Software Development, Microsoft), and Jani Joki (Director of Performance Benchmarking, NVIDIA), were drawn from this pool of participants, as was Vinesh Sukumar (Senior Director, AI/ML Product Management, Qualcomm), who will lead a benchmark development task force within the working group.
Hugging Face and Google will join forces in open AI collaboration
Hugging Face has announced a partnership with Google Cloud that will enable companies to build their own AI applications using the latest models from Hugging Face and Google Cloud's cutting-edge cloud and hardware offerings. Hugging Face hosts over 1 million models, datasets, and AI applications, many of which build on Google's contributions to AI, including the original Transformer architecture. Google has also released groundbreaking open-source tools such as TensorFlow and JAX. Building on these contributions, the partnership will let Google Cloud customers take advantage of the platform's hardware by training and deploying Hugging Face models within Google Kubernetes Engine (GKE) and Vertex AI. Likewise, Google Cloud will enable additional experiences on the Hugging Face Hub throughout 2024, including the ability to deploy models for production on Google Cloud with Inference Endpoints.
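For readers unfamiliar with the Hub, "Hugging Face models" in practice means anything loadable with a few lines of the transformers library, as in the sketch below (distilgpt2 is just a small example model, not one named in the announcement); under the partnership, the same models can instead be trained and served on Google Cloud via Vertex AI or GKE.

```python
# Minimal example of pulling a model from the Hugging Face Hub and running it
# locally with the transformers library.
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
result = generator("Hugging Face and Google Cloud are partnering to", max_new_tokens=20)
print(result[0]["generated_text"])
```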
OpenAI announces API updates and a new generation of embedding models
In a recent announcement, OpenAI launched a new generation of models, including two new embedding models, a GPT-4 Turbo preview model update, an updated GPT-3.5 Turbo model, and an updated text-moderation model. OpenAI also pointed out that, by default, data sent to its API will not be used for training and improvement purposes. The two embedding models feature enhanced performance and lower pricing: text-embedding-3-large is now OpenAI's best-performing embedding model, while text-embedding-3-small is much more efficient than the previous-generation text-embedding-ada-002 model and is priced accordingly at $0.00002 per 1K tokens. The GPT-3.5 Turbo update also comes with a price adjustment: a 50% reduction for input tokens and a 25% reduction for output tokens. While the GPT-4 Turbo preview model was also updated, the general availability of the awaited GPT-4 Turbo with vision is still in the works. The API updates also let developers assign permissions to API keys from the API keys page and use the usage dashboard and usage export function to view metrics at the level of individual tracking-enabled API keys.
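As a quick illustration, the new embedding models are called through the same embeddings endpoint as before. The sketch below uses the official openai Python package (v1.x) and assumes an OPENAI_API_KEY environment variable is set.

```python
# Minimal call to the new embedding models with the openai Python package.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.embeddings.create(
    model="text-embedding-3-small",      # or "text-embedding-3-large"
    input=["This week in AI: January 21-26"],
)

vector = response.data[0].embedding
print(len(vector))  # dimensionality of the returned embedding
```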