
o1 is impressive. I tried feeding it some of the trickier problems I've solved over the past few months (ones that involved nontrivial algorithmic challenges), and it managed to solve all of them, usually coming up with slightly different solutions than I did, which was great.

However, what I found odd was that it formulated the solution in excessively dry and obtuse mathematical language, like something you'd publish in an academic paper.

Once I managed to follow its reasoning, I realized that what it came up with could essentially be explained in two sentences of plain English.

On the other hand, o1 is amazing at coding, being able to turn an A4 sheet full of dozens of separate requirements into an actual working application.



> actual working application

Working != maintainable

The things that ChatGPT or Claude spit out are impressive one-shots but hard to iterate on or integrate with other code.

And you can’t just throw Aider/Cursor/Copilot/etc at the original output without quickly making a mess. At least not unless you are nudging it in the right directions at every step, occasionally jumping in and writing code yourself, fixing/refactoring the LLM code to fit style/need, etc.


This is how I use Cursor Composer Agents: a detailed outline up front, then see what it comes up with. I then use it to iterate on that idea. Sometimes it breaks things, so I'll have to reject/revert the change and ask it again, telling it not to change XYZ. If it starts going down the wrong path, I'll step in and code it myself. But I've run into cases where the next question I ask it seems to be based on the state of the code from its last change, not the current state as I have changed it. That can be frustrating.

I've really only done greenfield hobby projects with it so far. Hesitant to throw larger things at it that have been growing for 8/9 years. But, there's always undo or `git reset`. :P


One place where all LLMs fail hard is graphics programming. I've tried on and off since the release of ChatGPT 3, and no model manages to coherently juggle GLSL shader inputs, their processing, and the output. They all fail hard at even the basics.

I guess it's because the topic is such a cross between fields like math, CS, and art, and is so visual. Maybe that's a similar reason LLMs do so poorly with SVG output, like the unicorn benchmark: https://gpt-unicorn.adamkdean.co.uk/
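To make that concrete, here's a rough sketch (Python stdlib only, with the GLSL embedded as strings; the shader and variable names like `v_uv` and `u_texture` are just made up for illustration) of the kind of bookkeeping I mean: every `in` in the fragment stage has to line up with an `out` from the vertex stage, and that linkage is exactly where generated shader pairs tend to fall apart.

    # Minimal sketch of the GLSL input/processing/output bookkeeping described above.
    # Pure stdlib; the shader sources and names (v_uv, u_texture, ...) are invented
    # for illustration, not taken from any real project.
    import re

    VERTEX_SHADER = """#version 330 core
    layout(location = 0) in vec3 in_position;  // per-vertex attribute
    layout(location = 1) in vec2 in_uv;
    out vec2 v_uv;                             // handed on to the fragment stage
    void main() {
        v_uv = in_uv;
        gl_Position = vec4(in_position, 1.0);
    }
    """

    FRAGMENT_SHADER = """#version 330 core
    in vec2 v_uv;                              // must match a vertex-stage `out`
    uniform sampler2D u_texture;
    out vec4 frag_color;                       // the final pixel colour
    void main() {
        frag_color = texture(u_texture, v_uv);
    }
    """

    def interface(src, qualifier):
        """Collect (type, name) pairs declared as `in ...;` or `out ...;` at line start."""
        return set(re.findall(rf"^\s*{qualifier}\s+(\w+)\s+(\w+)\s*;", src, re.MULTILINE))

    # The consistency check generated shader pairs often fail: every fragment `in`
    # needs a vertex `out` with the same type and name.
    missing = interface(FRAGMENT_SHADER, "in") - interface(VERTEX_SHADER, "out")
    print("stage interface ok" if not missing else f"unmatched fragment inputs: {missing}")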


Just tried to generate a unicorn with o1, and it seems to be doing a decent job at it.

To be fair, I'm quite sure an LLM could generate a verbal description of the unicorn's body topology (four skinny legs below the body, the neck coming out of the body, the head coming off the neck and sitting above and to the right, etc.).

It could then translate this info into geometric coordinates.
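As a toy illustration of that two-step idea (the topology and all the numbers below are my own made-up example, not something a model produced), here's a Python sketch that turns such a description into SVG coordinates:

    # Toy sketch of "verbal topology -> geometric coordinates".
    # Every part is placed relative to the body; all numbers are invented.

    cx, cy = 100, 80                      # centre of the body ellipse
    parts = [f'<ellipse cx="{cx}" cy="{cy}" rx="45" ry="25" fill="white" stroke="black"/>']

    # four skinny legs below the body
    for dx in (-30, -12, 12, 30):
        parts.append(f'<rect x="{cx + dx - 3}" y="{cy + 20}" width="6" height="45" '
                     'fill="white" stroke="black"/>')

    # neck coming out of the body, head coming off the neck, above and to the right
    hx, hy = cx + 55, cy - 45             # head centre
    parts.append(f'<line x1="{cx + 35}" y1="{cy - 10}" x2="{hx}" y2="{hy}" '
                 'stroke="black" stroke-width="10"/>')
    parts.append(f'<circle cx="{hx}" cy="{hy}" r="14" fill="white" stroke="black"/>')

    # and the horn, pointing up from the head
    parts.append(f'<line x1="{hx}" y1="{hy - 14}" x2="{hx + 6}" y2="{hy - 38}" '
                 'stroke="black" stroke-width="3"/>')

    svg = ('<svg xmlns="http://www.w3.org/2000/svg" width="220" height="170">\n  '
           + "\n  ".join(parts) + "\n</svg>")

    with open("unicorn.svg", "w") as f:
        f.write(svg)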


Do you mean o1-preview or the current o1? I rarely get anything really useful out of the current one ($20 subscription, not the $200 one). They seem to have seriously nerfed it.


o1. I'm not a big user, and I haven't used a big model before, only Sonnet and GPT-4, so this all seems new and wonderful to me.


o1 has a parameter that affects how long it's willing to think for, whereas o1-preview did not

It's likely o1-preview was permanently pinned at max thinking, and o1 is not
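If you're hitting o1 through the API rather than the ChatGPT subscription, that knob is exposed as `reasoning_effort` (at least as I understand the current docs; treat this as an assumption). A minimal sketch with the official Python client:

    # Hedged sketch: I'm assuming the "thinking time" knob is the API's
    # reasoning_effort parameter ("low" / "medium" / "high"); check the current
    # OpenAI docs before relying on this.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="o1",
        reasoning_effort="high",  # o1-preview exposed no such parameter
        messages=[{"role": "user", "content": "How many distinct binary trees have 7 nodes?"}],
    )
    print(response.choices[0].message.content)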



