Using memory with LLM applications in production
This talk will review how you can use Metal to give your LLM chatbots both short- and long-term memory.
When you implement memory for LLM chatbots, they can recall user interactions and learn from them over time. But getting this to work in production can be challenging. In this session, we'll review how you can use Metal to give your LLM chatbots both short- and long-term memory, allowing them to support more complex queries and more powerful retrieval augmentation in production.
Speaker:
Sergio Prada is the co-founder and CTO at Metal. Before Metal, Sergio worked in machine learning at Meta and in developer tools and engineering at Datadog, and he has spent 10+ years in enterprise software.
Copyright © 2026 Data Phoenix. Published with Ghost and Data Phoenix.