Data Phoenix Digest - ISSUE 9.2023
Upcoming Data Phoenix community webinars: LLM evaluations, multilingual semantic search, rise in the use of synthetic data for regulated industries, how to use LLMs to interface with multiple data sources, best practices for building LLM-based applications, leveraging LLMs for enterprise usage.
Hey folks,
Welcome to this week's edition of Data Phoenix Digest! Today we will introduce you to an exciting list of Data Phoenix community webinars that our team has prepared for you.
Be active in our community and join our Slack to discuss the latest news of our community, top research papers, articles, events, jobs, and more...
Upcoming Data Phoenix webinars
LLM Evaluations - What and Why
Large language models are trained on billions of data points and perform exceptionally well across a wide range of tasks. However, one aspect where these models often fall short is determinism. While building a prototype of an LLM application has become remarkably easy, transforming that prototype into a fully-fledged product remains challenging. Even with carefully crafted prompts, the model can exhibit problematic behavior such as hallucinations, incorrect output structures, toxic or biased responses, or irrelevant replies for certain inputs. The potential error modes can be extensive.
This is where a robust LLM evaluation tool like UpTrain comes to the rescue, empowering you to:
- Validate and correct the model's responses before presenting them to end-users.
- Obtain quantitative measures for experimenting with multiple prompts, model providers, and more.
- Conduct unit testing to ensure that no faulty prompts or code make their way into your production environment.
Join us for an insightful talk as we delve deep into the intricacies of assessing the performance and quality of LLMs and discover the best practices to ensure the reliability and accuracy of your LLM applications.
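To make the idea of validating responses and unit testing them concrete, here is a minimal sketch in plain Python. It does not use UpTrain or any other evaluation library; the function name `validate_response` and the expected keys `answer` and `sources` are hypothetical, chosen only for illustration. It checks one of the error modes mentioned above, an incorrect output structure.

```python
import json


def validate_response(raw: str, required_keys: set) -> list:
    """Return a list of problems found in a raw model response (empty if none).

    Hypothetical helper for illustration: it only checks that the response
    is a JSON object containing the expected keys.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["response is not valid JSON"]
    if not isinstance(data, dict):
        return ["response is not a JSON object"]
    # Report every expected key that the model left out, in a stable order.
    missing = required_keys - set(data)
    return [f"missing key: {key}" for key in sorted(missing)]


# A well-formed response passes; a malformed one is caught before it
# would reach an end user.
good = '{"answer": "42", "sources": []}'
bad = '{"answer": "42"}'
print(validate_response(good, {"answer", "sources"}))
print(validate_response(bad, {"answer", "sources"}))
```

Checks like this can run inside an ordinary unit test suite, so a prompt change that breaks the output format fails the build instead of reaching production.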
Multilingual Semantic Search
Connecting Large Language Models with embeddings and semantic search on your own data has become widely popular. But how does this work in other languages and across languages? Join me for this talk on why multilingual semantic search is amazing, how the respective models are trained, and the new use cases this unlocks.
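As a rough illustration of the retrieval step behind semantic search, the sketch below ranks documents by cosine similarity to a query vector. The vectors are toy values written by hand, not the output of any real model; in practice they would come from a multilingual embedding model that maps texts with similar meaning to nearby vectors regardless of language. The names `cosine` and `search` are assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, docs, top_k=1):
    """Return the top_k document texts most similar to the query vector."""
    ranked = sorted(docs, key=lambda text: cosine(query_vec, docs[text]), reverse=True)
    return ranked[:top_k]


# Toy, hand-written vectors standing in for multilingual embeddings.
docs = {
    "Passwort vergessen: was nun?": [0.9, 0.1, 0.0],
    "Horaires d'ouverture du magasin": [0.1, 0.9, 0.1],
    "Shipping costs for international orders": [0.0, 0.2, 0.9],
}

# Pretend embedding of the English query "how do I reset my password".
query_vec = [0.8, 0.2, 0.1]
print(search(query_vec, docs))
```

With these toy values the English query retrieves the German document, which is the cross-lingual behavior a multilingual embedding model is meant to provide.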
Rise in the use of synthetic data for regulated industries
Synthetic data is evolving and becoming extremely important for organizations. This session will uncover facts about synthetic data. It will also talk about some of the most impactful use cases associated with it, along with challenges that companies face while harnessing its power.
How to use LLMs to Interface with Multiple Data Sources
Following emerging Large Language Model Operations (LLM Ops) best practices in the industry, you’ll learn about the key technologies that enable Generative AI practitioners like you to build complex LLM applications. Specifically, we’ll deep dive on “data frameworks” like LlamaIndex, and we’ll demonstrate how to create state-of-the-art hierarchical indexes from different data sources. During the event, we will also show you how another commonly known LLM Ops framework (LangChain) underlies much of the functionality of LlamaIndex. All demo code will be provided via GitHub links during and after the event!
Best practices for building LLM-based applications
Many businesses have started incorporating Large Language Models into their applications. There are, however, several challenges that may impact such systems, and it helps to be aware of them before you start. During the talk, we will review the existing tools and see how to move from development to production without a headache.
Leveraging Large Language Models for Enterprise Usage
Organizations worldwide are still trying to understand how to leverage generative AI models and put them into practical use. To enable them, NVIDIA developed a full-stack approach, from the hardware to develop and serve these models, to the variety of customizable SDKs and services to assist research and industry alike. However, LLMs, like any other technology, are not perfect and require guardrails to address shortcomings such as hallucination, inherited bias, and toxicity. By providing toolsets and mechanisms to mitigate these limitations, we hope that on the road ahead generative AI will open up new horizons and bring about a positive revolution. Join this talk to learn about foundation and ChatGPT-style models, generative AI and LLM technology at NVIDIA, shortcomings and proposed guardrails, and the road ahead.
Video recordings of past Data Phoenix webinars
Building production ready LLMs with specialisation
LLM inference can be very expensive, requiring access to powerful GPUs. In this talk, Meryem discusses ways to reduce this cost by over 90% through better choice of model, hardware, and model compression techniques. This is an essential talk for anyone looking to put LLMs into production.
Unlocking Data Value with Large Language Models
Large Language Models, or Foundation Models (FMs), are the models that power Generative AI applications. FMs challenge classical Machine Learning with a paradigm shift towards Prompt Engineering, which is the new way of building ML applications for businesses. In this talk we will discuss how businesses can leverage FMs using Prompt Engineering and build Generative AI applications in the cloud. We will also go over the architectural components, resources on how to get started, and how much it costs.