GPT on a Leash: Evaluating LLM-based Apps & Mitigating Their Risks

Testing and evaluating AI systems is extremely challenging, especially when they work with text and other unstructured data. For LLM-based applications, the challenge is magnified because there is no single correct answer and because external constraints apply, such as topics that shouldn't be discussed.

Speaker:
Philip is the co-founder and CEO of Deepchecks. An experienced data scientist, he previously led a top-tier ML research group that tackled difficult problems across various disciplines (NLP, computer vision, signal processing, etc.). Philip holds a B.Sc. in Physics from the Hebrew University, which he obtained as part of the Talpiot excellence program, and an M.Sc. in Electrical Engineering from Tel Aviv University (thesis in ML, accepted to IJCAI 2019). He was selected as a featured honoree in the Israeli Forbes 30 Under 30 list, class of 2021.