The Art of RAG Evaluation
Join AI Makerspace for a special event: a deep dive into the art of evaluating complex LLM applications that leverage Retrieval Augmented Generation (RAG), the technique that aims to ground the outputs of LLMs in fact-checkable information.
With the advent of open-source evaluation tools like RAG Assessment (RAGAS) and the arrival of built-in evaluation tooling within emerging LLM Ops platforms like LangSmith, it’s getting easier to measure how correct and useful LLM application outputs actually are. Further, with Metrics-Driven Development (MDD), we can use these quantitative measures to systematically improve our applications in a chosen direction.
While it remains a bit of a black art today, best practices are beginning to come into focus for teams building production LLM applications.
We will begin by building a simple RAG system using the latest from LangChain v0.1.0 before baselining our system’s performance with the RAGAS framework and its metrics. We will explore each calculation used to estimate performance during retrieval (Context Recall, Context Precision, and Context Relevancy), during generation (Faithfulness, Answer Relevancy), and across the entire RAG pipeline (Answer Semantic Similarity, Answer Correctness).
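To make this concrete, below is a minimal sketch (not the event’s actual notebook) of what such a baseline might look like: a small LCEL chain built on LangChain v0.1.0-era packages, scored with RAGAS. The corpus, question, ground truth, and model choices are illustrative placeholders, and column and metric names vary somewhat across RAGAS versions. As one example of what these metrics compute, RAGAS estimates Faithfulness as the fraction of claims in the generated answer that are supported by the retrieved context.

```python
# A minimal, hypothetical baseline: a LangChain v0.1.0-style LCEL RAG chain
# scored with RAGAS. Assumes `pip install langchain langchain-openai
# langchain-community ragas datasets faiss-cpu` and an OPENAI_API_KEY.
from datasets import Dataset
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from ragas import evaluate
from ragas.metrics import (
    answer_correctness,
    answer_relevancy,
    answer_similarity,  # Answer Semantic Similarity
    context_precision,
    context_recall,
    faithfulness,
)

# 1. Build a toy vector store and retriever (placeholder corpus).
corpus = [
    "RAGAS scores RAG pipelines on both retrieval and generation quality.",
    "LangChain v0.1.0 stabilized the core library around LCEL chains.",
]
retriever = FAISS.from_texts(corpus, OpenAIEmbeddings()).as_retriever()

# 2. Compose a simple RAG chain with LangChain Expression Language (LCEL).
prompt = ChatPromptTemplate.from_template(
    "Answer using only the context.\n\nContext:\n{context}\n\nQuestion: {question}"
)
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# 3. Collect question / answer / contexts / ground-truth rows for RAGAS.
question = "What does RAGAS evaluate?"
contexts = [doc.page_content for doc in retriever.get_relevant_documents(question)]
row = {
    "question": [question],
    "answer": [rag_chain.invoke(question)],
    "contexts": [contexts],
    "ground_truth": ["RAGAS scores RAG pipelines on retrieval and generation quality."],
}

# 4. Baseline the metrics discussed above (Context Relevancy is also
#    importable in some RAGAS releases; names shift between versions).
results = evaluate(
    Dataset.from_dict(row),
    metrics=[
        context_precision,
        context_recall,
        faithfulness,
        answer_relevancy,
        answer_similarity,
        answer_correctness,
    ],
)
print(results)
```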
We will then focus on improving key retrieval metrics by making advanced retrieval upgrades to our system (one possible upgrade is sketched below). Finally, we’ll discuss the important tradeoffs that come with such improvements in any production AI product development process, and the limitations of quantitative metrics and AI-based evaluation approaches!
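As one illustration of such an upgrade (the event may cover different techniques), LangChain’s MultiQueryRetriever has an LLM rephrase the user’s question several ways and merges the retrieved documents, which often helps recall. Swapping it into the chain above and re-running the RAGAS evaluation shows whether Context Precision and Recall actually move.

```python
# One possible retrieval upgrade (illustrative, not necessarily the event's
# choice): MultiQueryRetriever generates several rephrasings of the question
# and takes the union of the documents retrieved for each.
from langchain.retrievers.multi_query import MultiQueryRetriever

multi_query_retriever = MultiQueryRetriever.from_llm(
    retriever=retriever,  # base vector-store retriever from the sketch above
    llm=llm,              # the same chat model writes the query variants
)

# Re-point the chain at the upgraded retriever, then re-run `evaluate(...)`
# to compare Context Precision / Recall against the baseline.
upgraded_chain = (
    {"context": multi_query_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
```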
Special thanks to LangChain and RAGAS for partnering with us on this event!
In this event, you'll learn:
- How to build and improve a RAG system in LangChain v0.1.0
- How to leverage the RAGAS framework for Metrics-Driven Development
- The limitations of current RAG evaluation techniques and what to watch out for!
Speakers:
- Dr. Greg Loughnane is the Co-Founder & CEO of AI Makerspace, where he serves as an instructor for its AI Engineering Bootcamp. Since 2021, he has built and led industry-leading Machine Learning education programs. Previously, he worked as an AI product manager, a university professor teaching AI, an AI consultant and startup advisor, and an ML researcher. He loves trail running and is based in Dayton, Ohio.
- Chris Alexiuk is the Co-Founder & CTO at AI Makerspace, where he serves as an instructor for its AI Engineering Bootcamp. Previously, he’s held roles as a Founding Machine Learning Engineer, Data Scientist, and ML curriculum developer and instructor. He’s a YouTube content creator whose motto is “Build, build, build!” He loves Dungeons & Dragons and is based in Toronto, Canada.
Follow AI Makerspace on LinkedIn & YouTube to stay updated with workshops, new courses, and opportunities for corporate training.