
OpenAI Launches Crypto Contract AI Security Benchmark, Claude Tops Test


OpenAI introduced a new benchmark to assess how well AI models detect and exploit vulnerabilities in crypto smart contracts. Developed with Paradigm and OtterSec, EVMbench evaluates AI agents on 120 vulnerabilities. Anthropic’s Claude Opus model performed best, followed by models from OpenAI and Google. The benchmark aims to measure AI performance in economically significant environments as agents become more involved in securing and transacting digital assets.


OpenAI has launched a new benchmark evaluating AI models on detecting, patching, and exploiting vulnerabilities in crypto smart contracts. The project, detailed in a released paper called “EVMbench,” was developed in collaboration with crypto investment firm Paradigm and security firm OtterSec.


The benchmark covers 120 smart contract vulnerabilities sourced from audit competitions. OpenAI stated it is increasingly important to evaluate AI performance in “economically meaningful environments.” The company added: “Smart contracts secure billions of dollars in assets, and AI agents are likely to be transformative for both attackers and defenders.”

Anthropic’s Claude Opus 4.6 model achieved the top average “detect award” of nearly $38,000. It was followed by OpenAI’s OC-GPT-5.2 and Google’s Gemini 3 Pro, with awards of approximately $31,600 and $25,100 respectively.

The need for such testing is underscored by the $3.4 billion in crypto funds stolen by attackers in 2025. Meanwhile, industry executives such as Circle CEO Jeremy Allaire have predicted that AI agents will transact with stablecoins on a massive scale.

Dragonfly managing partner Haseeb Qureshi said crypto’s original promise for human use never fully materialized because the technology wasn’t designed for human intuition. He argued the future lies with AI-intermediated wallets that manage complex operations securely. “A technology often snaps into place once its complement finally arrives… For crypto, we might just have found it in AI agents.”
