Scoring AI hackers when there is no answer key

AI models are solving more and more of the offensive-cyber tests built to measure them. Once a model solves most of a benchmark, that benchmark runs out of room and says little about the best systems anymore. Many of those tests also lean on bugs that already have public writeups, so a strong score can come partly from a model repeating something it has read. FrontierCyber, a benchmark from the AI security lab Irregular, goes …

The post Scoring AI hackers when there is no answer key appeared first on Help Net Security.