The Art of RAG Evaluation

Join AI Makerspace for a special event on RAG! Deep dive into the art of evaluating complex LLM applications that leverage Retrieval Augmented Generation (RAG), the technique that aims to ground the outputs of LLMs in fact-checkable information.

by Sarah DeSouza

Updated January 31, 2024

With the advent of open-source evaluation tools like RAG ASsessment (RAGAS) and built-in evaluation tooling in new LLM Ops platforms like LangSmith, it’s getting easier to measure how good, right, or correct LLM application outputs are. Further, with Metrics-Driven Development (MDD), we can use these quantitative measures to steer improvements to our applications systematically.

While RAG evaluation remains a bit of a black art today, we are beginning to get real clarity on best practices for teams looking to build production LLM applications.

In this event, we’ll do a deep dive into the art of evaluating complex LLM applications that leverage Retrieval Augmented Generation (RAG), the technique that aims to ground the outputs of LLMs in fact-checkable information.

We will begin by building a simple RAG system using the latest from LangChain v0.1.0 before baselining the performance of our system with the RAGAS framework and metrics. We will explore each calculation used to estimate performance during Retrieval (Context Recall, Precision, and Relevancy), Generation (Faithfulness, Answer Relevance), and throughout our entire RAG pipeline (Answer Semantic Similarity, Answer Correctness).
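To give a feel for what a few of these calculations look like, here is a minimal Python sketch of three of the ratio-style metrics: Context Recall, Context Precision, and Faithfulness. It is an illustration under stated assumptions, not the RAGAS implementation. In practice the per-item judgments (is this statement supported? is this chunk relevant?) are typically produced by an LLM judge; here they are assumed to be given as booleans. The function names are hypothetical and are not part of the RAGAS API, and the formulas follow the commonly described definitions of these metrics, which the library may refine.

```python
from typing import Sequence


def context_recall(statement_supported: Sequence[bool]) -> float:
    """Fraction of ground-truth statements that the retrieved context supports.

    Assumption: each boolean is a judgment already made elsewhere
    (for example by an LLM judge); this function only aggregates them.
    """
    if not statement_supported:
        return 0.0
    return sum(statement_supported) / len(statement_supported)


def context_precision(chunk_relevant: Sequence[bool]) -> float:
    """Mean of precision@k taken over the ranks k that hold a relevant chunk.

    The input is ordered by retrieval rank, so relevant chunks that appear
    earlier in the list contribute more to the score.
    """
    hits = 0
    precision_sum = 0.0
    for k, relevant in enumerate(chunk_relevant, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / k  # precision@k at this relevant rank
    return precision_sum / hits if hits else 0.0


def faithfulness(claim_supported: Sequence[bool]) -> float:
    """Fraction of claims in the generated answer that the context supports."""
    if not claim_supported:
        return 0.0
    return sum(claim_supported) / len(claim_supported)


if __name__ == "__main__":
    # Hypothetical judgments for a single question, for illustration only.
    print(context_recall([True, True, False, True]))
    print(context_precision([True, False, True]))
    print(faithfulness([True, False]))
```

Keeping the judgments separate from the aggregation makes it easier to see where each score comes from: the arithmetic is simple, and most of the uncertainty lives in how the individual judgments are made.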

We will then focus on improving key retrieval metrics by making advanced retrieval upgrades to our system. Finally, we’ll discuss important tradeoffs that come with improvements during any production AI product development process and the limitations of using quantitative metrics and AI evaluation approaches!

Special thanks to LangChain and RAGAS for partnering with us on this event!

In this event, you'll learn:

How to build and improve a RAG system in LangChain v0.1.0
How to leverage the RAGAS framework for Metrics-Driven Development
The limitations of current RAG evaluation techniques and what to watch out for!

Speakers:

Dr. Greg Loughnane is the Co-Founder & CEO of AI Makerspace, where he serves as an instructor for their AI Engineering Bootcamp. Since 2021 he has built and led industry-leading Machine Learning education programs. Previously, he worked as an AI product manager, a university professor teaching AI, an AI consultant and startup advisor, and an ML researcher. He loves trail running and is based in Dayton, Ohio.
Chris Alexiuk is the Co-Founder & CTO of AI Makerspace, where he serves as an instructor for their AI Engineering Bootcamp. Previously, he’s held roles as a Founding Machine Learning Engineer, Data Scientist, and ML curriculum developer and instructor. He’s a YouTube content creator whose motto is “Build, build, build!” He loves Dungeons & Dragons and is based in Toronto, Canada.

Follow AI Makerspace on LinkedIn & YouTube to stay updated with workshops, new courses, and opportunities for corporate training.
