Webinar "Large Language Model Evaluations

Webinar "Large Language Model Evaluations - What and Why"

Join us for an insightful talk as we delve deep into the intricacies of assessing the performance and quality of LLMs and discover the best practices to ensure the reliability and accuracy of your LLM applications.

Data Phoenix team invites you all to our upcoming "AI Project Spotlight" webinar that’s going to take place on July 20 at 7 pm CET / 10 am PST.

Topic: LLM Evaluations - What and Why
Speaker: Sourabh Agrawal (CEO & Founder, UpTrain)
Participation: free (but you’ll be required to register)

ABOUT THE SPEAKER AND TOPIC
Large language models are trained on billions of data points and perform exceptionally well across a wide range of tasks. However, one aspect where these models often fall short is their lack of determinism. While building a prototype of an LLM application has become remarkably easy, transforming that prototype into a fully-fledged product is equally challenging. Even with carefully crafted prompts, the model can exhibit problematic behavior such as hallucinations, incorrect output structures, toxic or biased responses, or irrelevant replies for certain inputs. The potential error modes can be extensive.

This is where a robust LLM evaluation tool like UpTrain comes to the rescue which empowers you to:

Validate and correct the model's responses before presenting them to end-users.
Obtain quantitative measures for experimenting with multiple prompts, model providers, and more.
Conduct unit testing to ensure that no faulty prompts or code make their way into your production environment.

Sourabh Agrawal
Sourabh is a 2X founder in the AI/ML space. He started his career working at Goldman Sachs where he built ML models for financial markets. Post that he joined the autonomous driving team at Bosch/Mercedes, building state-of-the-art CV modules for scene understanding. He started his entrepreneurial journey in 2020 and founded an AI-powered fitness startup that he scaled to 150K+ users. During his past experiences, he encountered a frequent source of frustration due to the lack of tools to evaluate these models- a problem even more pronounced in the case of Generative AI models. To solve this, he is building UpTrain - an open-source tool to evaluate, prompt test, and monitor LLM applications.

Subscribe

Webinar "Large Language Model Evaluations - What and Why"

Comments

Read Next

Prometheus raises $12B to build an AI to automate physical manufacturing processes

Niteshift raises $7M to build the cloud infrastructure layer for AI coding agents

PhysicsX raises $300M Series C at $2.4B valuation to scale AI for engineering and manufacturing

Suno raised a $400M Series D at a $5.4B valuation despite ongoing lawsuits

Codex now boasts plugins for white-collar work and other new features for Enterprise users