Reducing NLP Inference costs through model specialisation

This talk will discuss ways to reduce costs for NLP inference through a better choice of model, hardware, and model compression techniques.

Dmitry Spodarets

Jul 10, 2023

NLP inference can be very expensive, requiring access to powerful GPUs. In this talk, Meryem discusses ways to reduce this cost by over 90% through a better choice of model and hardware, combined with model compression techniques. The talk is essential for anyone looking to put NLP models into production.

Meryem Arik
Meryem is the co-founder of TitanML, an NLP development platform focused on the deployability of LLMs. It allows businesses to build smaller and cheaper deployments of language models. The TitanML platform automates much of the difficult MLOps and inference optimisation science, so businesses can build and deploy state-of-the-art language models with ease.
