COMPL-AI is the first evaluation framework for EU AI Act compliance
Credit: COMPL-AI

LatticeFlow launched COMPL-AI, an open-source framework for EU AI Act compliance, offering technical interpretations and evaluations of major AI models while addressing the gap between regulatory requirements and practical implementation.

by Ellie Ramirez-Camara

LatticeFlow, an AI safety and compliance platform that spun out of ETH Zurich, has launched COMPL-AI, the first open-source framework focused on compliance with the EU AI Act, which officially entered into force in August 2024. Developed in collaboration with ETH Zurich and INSAIT, the framework aims to bridge the gap between regulatory requirements and technical implementations. Two companion pieces were released alongside it: a technical interpretation of the AI Act that maps regulatory requirements to technical ones, and evaluation results for several publicly available foundation models from vendors including OpenAI, Meta, Google, Anthropic, and Alibaba.

As LatticeFlow notes, although the AI Act has already come into force, it mainly describes high-level requirements without offering concrete technical guidance on how to satisfy them. The European Commission plans to publish a Code of Practice that will "facilitate the proper application of the AI Act's rules for general-purpose AI models," but the kick-off plenary for the Code of Practice took place only on September 30, and a final draft of the document is not expected until April 2025. In the meantime, LatticeFlow's technical interpretation is intended to support both the working groups developing the Code of Practice and any organization looking for a starting point on its path towards compliance with the AI Act.

The core of the framework, though, is a suite of 27 state-of-the-art benchmarks based on LatticeFlow's technical interpretation, which evaluate how well models satisfy critical technical requirements as these map to regulatory requirements. Currently, model evaluations can be requested directly through COMPL-AI, or run locally by checking out the platform's GitHub repository, running the evaluations to obtain a report, and uploading that report to COMPL-AI's website.

Finally, the evaluation reports let users see the framework in action while providing insights into the state of current foundation models, in what is perhaps the first large-scale effort to test models for compliance rather than performance. Early findings reveal that many models would fail to meet some regulatory requirements, at least as encoded by LatticeFlow's technical interpretation. While most models fare well on harmful content and toxicity evaluations, many scored below the 50% mark on fairness and cybersecurity tests.

Moreover, the reports reveal that additional work needs to be done on privacy and copyright benchmarking, given that current efforts primarily rely on testing model memorization, resulting in a difficult and one-sided evaluation process. More generally, Prof. Martin Vechev, Full Professor at ETH Zurich and Founder and Scientific Director of INSAIT in Sofia, Bulgaria, acknowledged that COMPL-AI is far from being a definitive evaluation framework and urged the community to contribute to its development and expansion.
