Weekly AI Highlights Review: October 15–22
Two AI voice startups closed funding rounds, Manifest is using AI to combat the loneliness epidemic, Galileo raised $45M in Series B funding, the creators of Outlines started dottxt, Google upgraded its Shopping and NotebookLM products, Mistral launched two tiny models, and more.
AI-powered audio processing and generation sometimes seem to receive less attention than other AI applications. Audio may suffer from an unfortunate combination: it lacks the wow factor that AI-generated video enjoys, yet it brings its own set of difficulties, including processing speech accurately at (near) real-time speed and generating natural-sounding speech, especially across several languages.
This week, two startups, one working on AI-powered speech processing and the other on AI-powered speech generation, announced successful funding rounds. Gladia, a startup working on audio transcription and analytics, announced it raised $16 million in a Series A round led by XAnge with participation from investors including Illuminate Financial, XTX Ventures, and Athletico Ventures. In parallel, Gladia launched its Real-Time API, a product capable of automating transcription in over 100 languages with ultra-low latency (under 300 ms). Optionally, the Real-Time API can enhance transcriptions with sentiment analysis, named entity recognition, and summarization, all in under a second.
According to Gladia, the Real-Time API is a significant first step toward eliminating the compromise between speed and accuracy, and the need to leverage multiple solutions depending on whether the focus is on speed or accurate transcription.
Neuphonic, meanwhile, is tackling the AI speech challenge from the opposite direction. The company is working on an ultra-low latency engine that can generate natural-sounding, language-agnostic, AI-generated speech. Neuphonic has secured €3.5 million in pre-seed funding from Moonfire VC, with participation from Tiny VC, Salica's Oryx Fund, and Cur8 Capital. Instead of waiting until an LLM is done generating text to start generating speech, Neuphonic's engine works incrementally, resulting in latency under 25 milliseconds.
The startup recently launched two models for speech generation, one focused on speed, featuring ultra-low latency, and another aimed at balancing higher-quality generations and speed. With its offerings, Neuphonic aims to improve customer experience in voice AI applications such as gaming, digital avatars, conversational AI, and real-time translation.
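To make the incremental idea concrete, here is a minimal Python sketch of how text streamed from an LLM could be handed to a speech synthesizer chunk by chunk instead of all at once. It is not Neuphonic's engine or API: the boundary characters, `speakable_chunks`, `fake_llm_stream`, and `fake_synthesize` are all hypothetical names and assumptions chosen for illustration only.

```python
from typing import Iterable, Iterator

# Assumption: punctuation marks are treated as safe points to start speaking.
# A real engine may segment text very differently.
BOUNDARIES = {".", "!", "?", ",", ";", ":"}


def speakable_chunks(tokens: Iterable[str]) -> Iterator[str]:
    """Yield a text chunk as soon as a clause boundary arrives,
    rather than waiting for the whole stream to finish."""
    buffer = []
    for token in tokens:
        buffer.append(token)
        stripped = token.strip()
        if stripped and stripped[-1] in BOUNDARIES:
            chunk = "".join(buffer).strip()
            if chunk:
                yield chunk
            buffer = []
    # Flush whatever is left once the stream ends.
    tail = "".join(buffer).strip()
    if tail:
        yield tail


def fake_llm_stream() -> Iterator[str]:
    """Hypothetical stand-in for a streaming LLM response."""
    yield from ["Hello", ",", " how", " can", " I", " help", "?", " Just", " ask"]


def fake_synthesize(text: str) -> bytes:
    """Hypothetical placeholder for a text-to-speech call; returns dummy bytes."""
    return text.encode("utf-8")


if __name__ == "__main__":
    for chunk in speakable_chunks(fake_llm_stream()):
        audio = fake_synthesize(chunk)
        print(f"synthesized {len(audio)} bytes for: {chunk!r}")
```

Because `speakable_chunks` is a generator, the first chunk can be sent to synthesis while the LLM is still producing the rest of the response, which is the general reason an incremental pipeline can start speaking sooner than one that waits for the full text.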
Gladia and Neuphonic were two of several startups announcing new funding this week. Another was Shift Bioscience, which shared that it secured $16 million in seed funding to advance its AI-powered cell simulation platform. The platform supports the startup's mission to identify gene activations that can safely trigger cell rejuvenation and serve as the basis for treatments for age-related diseases. Other funding-related news this week includes:
- Manifest, an AI-powered wellness app designed to combat loneliness among young adults, offers empathetic responses, personalized affirmations, and bite-sized meditation sessions. Manifest has emerged from stealth with $3.4 million in funding as it scales operations to meet growing demand.
- Galileo, a leading AI evaluation platform, has secured $45 million in Series B funding to expand its Evaluation Intelligence Platform, which aims to improve AI accuracy and trustworthiness for teams worldwide.
- Lidwave has closed a successful $10 million seed funding round that will enable it to continue developing its FCR™ technology, a novel approach to LiDAR based on light's coherence. With its FCR™ technology, Lidwave expects to unlock scalable production that makes LiDAR a mainstream technology.
- dottxt, founded by the creators of the popular Outlines library, has raised $11.9 million. dottxt focuses on developing advanced structured generation solutions for large language models, which ensure more reliable and predictable AI outputs.
- Lightmatter secured $400 million in Series D funding to accelerate the deployment of its Passage product. Passage uses photonics technology to overcome chip interconnection bottlenecks in AI and HPC workloads, potentially revolutionizing data center performance and efficiency.
In addition to funding rounds, there were several interesting product launches and updates:
- Just a week after Amazon announced it would enhance its shopping experience with AI-powered Shopping Guides, Google has revamped its Shopping product, infusing it with AI-powered features, and launching integrations with other related experiences, such as its virtual try-on tool.
- Mistral AI has launched two new compact models, Ministral 3B and 8B (Les Ministraux), which outperform comparably sized models, feature long context windows, and offer competitive pricing for several edge and on-device applications ranging from on-device translation and offline assistants to robotics.
- Google recently announced that it no longer considers NotebookLM an 'Experimental' product. The company updated NotebookLM to incorporate controls that enable users to customize the Audio Overviews based on their Notebooks. Google also announced an upcoming NotebookLM Business product.
- Lastly, although not exactly a product launch in the traditional sense, X has updated its Terms and Privacy Policy, suggesting the platform is considering licensing user content to third parties, including for AI model training. The updated Terms of Service also commit anyone who accepts them to paying liquidated damages if caught scraping X's content.
Closing this week's roundup are three pieces of news about research and compliance:
- LatticeFlow launched COMPL-AI, an open-source framework for EU AI Act compliance, offering technical interpretations and evaluations of major AI models while addressing the gap between regulatory requirements and practical implementation. The framework was launched with an accompanying technical report detailing how LatticeFlow mapped the regulatory requirements into technical benchmarks, along with a series of evaluation results for some of the most popular open- and closed-source foundation models.
- Google DeepMind researchers developed the Habermas Machine (HM), an AI system designed to mediate group discussions by generating consensus statements. The research team found strong endorsement of the AI-generated statements and a shift in the groups' opinion to the majority view after the AI-mediated caucus process.
- Meta's FAIR has released several new open-source AI models and research tools, including SAM 2.1, an updated checkpoint for its popular image segmentation model, Spirit LM for seamless speech-text integration, and some additional tools for LLM acceleration, cryptography, materials discovery, and cross-lingual processing.