Reducing NLP inference costs through model specialisation
This talk will discuss ways to reduce costs for NLP inference through a better choice of model, hardware, and model compression techniques.
NLP inference can be very expensive, requiring access to powerful GPUs. In this talk, Meryem discusses ways to reduce this cost by over 90% through a better choice of model, hardware, and model compression techniques. It is an essential talk for anyone looking to put NLP models into production.
Meryem Arik
Meryem is the co-founder of TitanML, an NLP development platform that focuses on the deployability of LLMs, allowing businesses to build smaller and cheaper deployments of language models. The TitanML platform automates much of the difficult MLOps and inference optimisation science so that businesses can build and deploy state-of-the-art language models with ease.
Data Phoenix is a live media platform for AI and Data professionals, covering technologies under the hood, best practices, and live demos from the builders shaping the industry, via original shows.
Copyright © 2026 Data Phoenix. Published with Ghost and Data Phoenix.