Hacker News | tipsytoad's comments

Someone not familiar with the field rediscovering the stochastic parrot argument from 3+ years ago


Clearly not from the UK. By US standards Labour would be socialist, and the Conservatives (the right) liberal at best.


It’s quite a deceptive paper. On the main headline benchmarks (MATH500, AIME24/25) the final answer is just a number from 0-1000, so what is the takeaway supposed to be for pass@k at k=512/1024?

On the unstructured outputs, where you can’t just ratchet up the pass@k until it’s almost random, it switches the base model out for instruct, and in the worst case, on LiveCodeBench, it uses a Qwen R1 distill as a _base_ model (!?), i.e. an instruct model further fine-tuned on R1’s reasoning traces. I assume that was because no matter how high the pass@k, a base model won’t output correct Python.
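To make the pass@k concern concrete, here is a back-of-the-envelope sketch (my own assumption: a single correct final answer among the 1001 integers 0-1000, guesses drawn uniformly at random) of how often pure guessing scores a hit within k samples:

```python
def random_guess_pass_at_k(k: int, n_answers: int = 1001) -> float:
    """Probability that at least one of k uniform random guesses over
    {0, 1, ..., 1000} matches the single correct final answer."""
    miss = (n_answers - 1) / n_answers  # chance a single guess is wrong
    return 1.0 - miss ** k

for k in (1, 512, 1024):
    print(f"pass@{k} by guessing: {random_guess_pass_at_k(k):.2f}")
```

Under this toy assumption, a model that has learned nothing still "passes" roughly 40% of problems at k=512 and 64% at k=1024, which is why large-k pass@k on short numeric answers is hard to interpret.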


I get the same feeling that I'm "not being productive" while playing video games, watching TV, etc., which seems to kill any enjoyment from doing these things.

For me, learning piano has been a great alternative to programming in the off hours (typing is quite transferable too!). Highly recommend it if you're like me and on screens all day.


Like, PyTorch? And the new Mac minis have 512gb of unified memory


I'm usually a huge fan of “copilot” tools (I use Cursor, etc.), and Claude has always been my go-to.

But Sonnet 3.7 actually seems dangerous to me: it seems it’s been RL’d _way_ too hard into producing code that won’t crash, to the point where it will go completely against the instructions to sneak in workarounds (e.g. returning random data when a function fails!). Claude Code makes this even worse by giving very little oversight when it makes these “errors”.
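A hypothetical toy example of the kind of workaround described, contrasted with what the instructions actually ask for (the function names and data here are invented for illustration):

```python
import random

def fetch_prices_masked(api_ok: bool) -> list[float]:
    """Anti-pattern: swallow the failure and return plausible-looking
    random data, so the caller never sees that the fetch broke."""
    try:
        if not api_ok:
            raise ConnectionError("price API unreachable")
        return [100.0, 101.5, 99.8]  # stand-in for a real API response
    except ConnectionError:
        # "Never crash": silently fabricate data instead of failing
        return [random.uniform(90, 110) for _ in range(3)]

def fetch_prices(api_ok: bool) -> list[float]:
    """The honest version: let failures surface to the caller."""
    if not api_ok:
        raise ConnectionError("price API unreachable")
    return [100.0, 101.5, 99.8]
```

The masked version never crashes, but downstream code now silently operates on fabricated numbers; the honest version fails loudly and is debuggable.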


This is a huge issue for me as well. It just kind of obfuscates errors and masks the original intent, rather than diagnosing and fixing the issue. 3.5 seemed clearer about what it was doing, and when things broke, at least it didn't seem to be trying to hide anything.


I wholly disagree with the comic, but here's an anti-AI-art take I’m more sympathetic to: https://x.com/soi/status/1815584824033177606?s=46


I don’t think this hits at the heart of the issue? Even if we can catch AI text with 100% accuracy, any halfway decent student can rewrite it from scratch using o1's ideas in lieu of actually learning.

This is way more common and just impossible to catch. The only students caught here are those who put in no effort at all.


> rewrite it from scratch ... in lieu of actual learning

If one can "rewrite it from scratch" in a way that's actually coherent and gets facts correct, then they learned the material and can write an original paper.

> This is way more common and just impossible to catch.

It seems a good thing that this is more common and, naturally, it would -- perhaps should, given the topic -- be impossible to catch someone cheating when they're not cheating.


Just another +1 that if you’re going to give vscode a fair shot, it’s much better to go with vscode-neovim than the standard vim extension. You can even map most of your config right over.

E.g. (mine) https://github.com/tom-pollak/dotfiles/tree/master/nvim


How is this different from instructor? github.com/jxnl/instructor

Namely, why did they take so long to ship something that just seems like a wrapper around function calling?
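For context, the "wrapper around function calling" pattern the comment refers to can be sketched without any SDK: publish a JSON schema as the tool definition, then validate the arguments the model returns against it. Everything below (the schema, the mocked model reply, the `validate` helper) is invented for illustration; instructor itself additionally retries, feeding the validation error back to the model.

```python
import json

# A schema you would register as the function/tool definition.
USER_SCHEMA = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}

def validate(args: dict, schema: dict) -> dict:
    """Minimal check of tool-call arguments against the schema."""
    type_map = {"string": str, "integer": int}
    for field in schema["required"]:
        if field not in args:
            raise ValueError(f"missing field: {field}")
    for field, spec in schema["properties"].items():
        if field in args and not isinstance(args[field], type_map[spec["type"]]):
            raise ValueError(f"bad type for {field}")
    return args

# Mocked model output, standing in for the tool-call JSON an LLM returns.
raw = '{"name": "Ada", "age": 36}'
user = validate(json.loads(raw), USER_SCHEMA)
print(user)
```

On a validation failure, the wrapper's only real job is deciding whether to raise or to re-prompt the model with the error message.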

