Reducing NLP Inference costs through model specialisation

This talk will discuss ways to reduce costs for NLP inference through a better choice of model, hardware, and model compression techniques.

Dmitry Spodarets

· Jul 10, 2023

NLP inference can be very expensive, often requiring access to powerful GPUs. In this talk, Meryem discusses ways to reduce this cost by over 90% through a better choice of model, hardware, and model compression techniques. It is an essential talk for anyone looking to put NLP models into production.

Meryem Arik
Meryem is the co-founder of TitanML, an NLP development platform focused on the deployability of LLMs that allows businesses to build smaller and cheaper deployments of language models. The TitanML platform automates much of the difficult MLOps and inference optimisation science so that businesses can build and deploy state-of-the-art language models with ease.
