
404? Is the repo still private?

Edit: Ah, the URL was wrong. It's cve-bench! [1]

I couldn't find anything related to MCP servers or tools that were offered to the agents. Wouldn't the agents be much more likely to succeed if there were, e.g., a gdb server or an sqli/http server running for debugging purposes? That way the thinking process could succeed more easily, no?

[1] https://github.com/uiuc-kang-lab/cve-bench





