
I'm super curious to see how this actually does at finding significant bugs. We've been working in this space at https://www.bismuth.sh for a while, and one of the things we're focused on is deep validation of the code being output.

There are so many of these "vibe coding" tools, and there has to be real engineering rigor at some point. I saw them demo "find the bug", but the bugs they found were pretty superficial, and that's something we've seen in our internal benchmark from both Devin and Cursor: a lot of noise, false positives, and superficial fixes.


