I use it on a daily basis.


I notice you said you used it, but did you actually intentionally test its reasoning abilities by giving it "known problems" and equivalently difficult (whatever that means) "novel" problems, and observe the difference in the quality of results it gave?

(btw, apparently GPT-4 does much better on mathematical reasoning, I've yet to try myself though)


Yes.

It sometimes fails on known problems, and not even particularly hard ones, e.g. simple questions about the capabilities of programming libraries for which ample documentation is available.

The thing is, by talking about its shortcomings I may be sounding dismissive. I'm not. I think ChatGPT is an amazing tool. I just recognize its shortcomings so that I can use it to its best capacity.

For things that don't have a known answer (i.e., I couldn't easily find an answer online) it does generate plausible bullshit. In those cases it shines for things that don't really have any strictness requirements (for example, ask it to generate some fiction, or a polite response to an email, etc.). I used it to suggest character names for a game I was going to play, based on some parameters, and I loved the responses it gave me.



