EU compliance tests expose flaws in tech giants’ AI models
LONDON, UNITED KINGDOM — The European Union’s new AI Act is putting pressure on tech giants as a recently developed compliance checker reveals shortcomings in some of the most prominent artificial intelligence models.
Known as the Large Language Model (LLM) Checker, the tool was designed by Swiss startup LatticeFlow AI in collaboration with ETH Zurich and Bulgaria’s INSAIT. It scored prominent models from tech giants such as Meta, OpenAI, and Alibaba, flagging potential compliance gaps in cybersecurity resilience and discriminatory output.
Discriminatory output and cybersecurity concerns
Reuters reported that the checker uses a scoring system ranging from 0 to 1, evaluating AI models across several categories, including technical robustness and safety.
According to a leaderboard published by LatticeFlow, most models tested scored around 0.75 or higher. However, results varied considerably, with certain models falling below acceptable benchmarks, particularly concerning discrimination and cybersecurity vulnerabilities.
Key findings include:
- OpenAI’s GPT-3.5 Turbo scored 0.46 in discriminatory output
- Alibaba Cloud’s Qwen1.5 72B Chat model received only 0.37 in the same category
- Meta’s Llama 2 13B Chat model scored 0.42 in prompt hijacking resistance
- Mistral’s 8x7B Instruct model received 0.38 in the same category
- Anthropic’s Claude 3 Opus model stood out with the highest average score of 0.89
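To give a rough sense of how a 0-to-1 leaderboard average could be assembled from per-category results like those above, here is a minimal sketch. The category names, the individual scores, and the 0.75 benchmark are assumptions for illustration only and do not reflect LatticeFlow’s actual methodology or weighting.

```python
# Illustrative only: combining per-category scores in the 0-1 range into a
# leaderboard-style average and flagging categories below a benchmark.
# Category names, values, and the 0.75 threshold are hypothetical.

CATEGORY_SCORES = {
    "technical_robustness_and_safety": 0.80,
    "discriminatory_output": 0.46,        # e.g. the score reported for GPT-3.5 Turbo
    "prompt_hijacking_resistance": 0.70,
    "cybersecurity_resilience": 0.65,
}

THRESHOLD = 0.75  # hypothetical "acceptable" benchmark


def average_score(scores: dict[str, float]) -> float:
    """Unweighted mean of the per-category scores, each in [0, 1]."""
    return sum(scores.values()) / len(scores)


def flag_weak_categories(scores: dict[str, float], threshold: float) -> list[str]:
    """Return the categories that fall below the benchmark."""
    return [name for name, score in scores.items() if score < threshold]


if __name__ == "__main__":
    print(f"average: {average_score(CATEGORY_SCORES):.2f}")
    print("flagged:", flag_weak_categories(CATEGORY_SCORES, THRESHOLD))
```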
Implications for tech companies
The EU AI Act, which will be implemented in stages over the next two years, carries significant consequences for non-compliance. Companies failing to meet the regulations could face fines of up to 35 million euros (US$38 million) or 7% of their global annual turnover.
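To make the penalty ceiling concrete, the short sketch below computes the upper bound implied by the two figures the article cites. Treating the bound as the larger of the fixed cap and the turnover-based cap is an assumption about how the cap applies, and the turnover figure used is purely hypothetical.

```python
# Illustrative sketch of the penalty ceiling described above: up to
# 35 million euros or 7% of global annual turnover. Taking the larger
# of the two is an assumption; the turnover figure below is hypothetical.

FIXED_CAP_EUR = 35_000_000
TURNOVER_SHARE = 0.07


def max_fine_eur(global_annual_turnover_eur: float) -> float:
    """Upper bound of the fine for a given global annual turnover (in euros)."""
    return max(FIXED_CAP_EUR, TURNOVER_SHARE * global_annual_turnover_eur)


if __name__ == "__main__":
    # Hypothetical company with 10 billion euros in annual turnover.
    print(f"{max_fine_eur(10_000_000_000):,.0f} EUR")  # 700,000,000 EUR
```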
Petar Tsankov, CEO and co-founder of LatticeFlow, views the test results as generally positive, providing companies with a roadmap for fine-tuning their models to align with the AI Act.
“The EU is still working out all the compliance benchmarks, but we can already see some gaps in the models,” he told Reuters.
“With a greater focus on [optimizing] for compliance, we believe model providers can be well-prepared to meet regulatory requirements.”
Regulatory landscape and future compliance
While the European Commission cannot officially verify external tools, it has welcomed the study as a “first step” in translating the EU AI Act into technical requirements.
As the regulatory landscape continues to evolve, tech companies will need to prioritize compliance to avoid potential penalties and ensure their AI models meet the stringent standards set by the European Union.