Root Signals is a Helsinki- and Palo Alto-based startup tackling the challenge of automating LLM evaluations with LLM-as-a-judge techniques. We often hear that LLMs are shipped with guardrails to prevent undesirable behaviors. However, once these models are deployed as the core of an application, like a chatbot or an assistant, we commonly find that unexpected behavior still occurs, or that users with malicious intentions can still coax the LLMs into delivering unsafe outputs.
Even when there are no cases of explicitly malicious behavior, critical enterprise-grade applications often demand accurate outputs and cannot tolerate a high hallucination rate. These issues can be traced to the same origin: LLM performance is difficult to predict, as is the diversity of use cases that different users will want the model to solve. Guardrails and other measures are shipped as risk mitigation features, which leaves those in charge of developing and deploying generative AI applications with the burden of constantly running evaluations to ensure the applications perform as expected.
To help resolve these model evaluation challenges, Root Signals has developed a platform that automates evaluations with LLM-as-a-judge techniques. By using LLMs as evaluators, the platform enables users to test for over 30 metrics, including faithfulness, answer relevance, confidentiality, harmlessness, and policy adherence. Additionally, Root Signals supports the creation of custom evaluators tailored to specific use cases and targets. The service ranges from a free single-seat plan to a team (5-seat, $179 monthly) or scalable subscription.
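As a rough illustration of the general LLM-as-a-judge idea (not Root Signals' actual API or implementation), the sketch below asks a "judge" model to rate a response on a named metric and normalizes the reply to a score between 0 and 1. Everything here is an assumption made for illustration: the prompt wording, the 0 to 10 scale, the `judge_score` function, and `stub_judge`, a hypothetical stand-in for a real model call.

```python
import re
from typing import Callable

# Hypothetical prompt template; the wording and 0-10 scale are assumptions.
JUDGE_PROMPT = (
    "You are an evaluator. Rate the RESPONSE for {metric} "
    "on a scale from 0 to 10. Reply with the number only.\n\n"
    "CONTEXT:\n{context}\n\nRESPONSE:\n{response}\n"
)


def judge_score(
    judge: Callable[[str], str],
    metric: str,
    context: str,
    response: str,
) -> float:
    """Ask a judge model to rate `response` and return a score in [0, 1].

    `judge` is any callable that takes a prompt string and returns the
    judge model's text reply (hypothetical; plug in a real LLM client).
    """
    prompt = JUDGE_PROMPT.format(metric=metric, context=context, response=response)
    reply = judge(prompt)

    # Take the first number in the reply; judge models do not always
    # follow the "number only" instruction exactly.
    match = re.search(r"\d+(?:\.\d+)?", reply)
    if match is None:
        raise ValueError(f"Judge reply contained no score: {reply!r}")

    raw = float(match.group())
    # Clamp to the expected range, then normalize to [0, 1].
    return min(max(raw, 0.0), 10.0) / 10.0


def stub_judge(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; always answers 8."""
    return "8"


if __name__ == "__main__":
    score = judge_score(
        stub_judge,
        metric="faithfulness",
        context="The office is open Monday to Friday.",
        response="The office is open on weekdays.",
    )
    print(f"faithfulness score: {score:.2f}")
```

In practice the stub would be replaced by a call to an actual model, and one such evaluator would be run per metric (faithfulness, answer relevance, and so on) across a set of test cases.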
Root Signals recently announced it secured a $2.8M funding round led by Angular Ventures, with participation from Business Finland. The startup plans to use the funds to continue building its platform, develop proprietary models, and boost its sales and marketing efforts.