Reducing NLP inference costs through model specialisation
This talk will discuss ways to reduce costs for NLP inference through a better choice of model, hardware, and model compression techniques.
NLP inference can be very expensive, often requiring access to powerful GPUs. In this talk, Meryem discusses ways to reduce this cost by over 90% through a better choice of model, hardware, and model compression techniques. It is an essential talk for anyone looking to put NLP models into production.
Meryem Arik
Meryem is the co-founder of TitanML, an NLP development platform focused on the deployability of LLMs. The platform automates much of the difficult MLOps and inference optimisation work, allowing businesses to build and deploy smaller, cheaper state-of-the-art language models with ease.
SF Bay Area media and education platform focused on AI and data. As a voice of the AI industry, Data Phoenix delivers news and practical knowledge, and helps companies be heard in the community.
Copyright © 2026 Data Phoenix. Published with Ghost and Data Phoenix.