Yes, yes, we're all aware that these are word predictors that don't actually know anything or reason. But these random dice are somehow able to give seemingly well-educated answers a majority of the time, and the fact that these programs don't technically know anything isn't going to slow the train down any.
I just don't get why people say they don't reason. It's crazy talk. The KV cache is effectively a unidirectional Turing machine, so it should be possible to encode "reasoning" in there, and evidence shows that LLMs occasionally do some light reasoning. Just because they're not great at it (hard to train for, I suppose) doesn't mean they don't do it at all.
Would I be crazy to say that the difference between reasoning and computation is sentience? This is an impulse with no justification, but it rings true to me.
Taking a pragmatic approach, I would say that if the AI accomplishes something that, for humans, requires reasoning, then we should say that the AI is reasoning. That way we can have rational discussions about what the AI can actually do, without diverting into endless discussions about philosophy.
Suppose A solves a problem and writes the solution down. B reads the answer and repeats it. Is B reasoning when asked the same question? What about a question that merely sounds similar?
The crux of the problem is "what is reasoning?" Of course it's easy enough to call the outputs "equivalent enough" and then use that to say the processes are therefore also "equivalent enough."
I'm not saying it's enough for the outputs to be "equivalent enough."
I am saying that if the outputs and inputs are equivalent, then that's enough to call it the same thing. It might be different internally, but that doesn't really matter for practical purposes.
I think one of the great lessons of our age will be that apparent equivalence, or in more applied terms "good enough," is not the same thing as actual equality.
In my experience, PhDs are not 10x productive. Quite the opposite, actually: too much theory and not much practicality. The only two developers my company has fired for (basically) incompetence were PhDs in Computer Science. They couldn't deliver practical, real code.
"Ketamine has been found to increase dopaminergic neurotransmission in the brain"
This property is likely an important driver of ketamine abuse and of it being rather strongly "moreish," as well as of the subjective experience of strong expectation during a "trip." I.e., the tendency to develop redose loops approaching unconsciousness while chasing the "message from the goddess" or whatever, which always seems just out of reach (because it's actually a feeling of expectation, not a partially installed divine T3 rig).
The “multiple PhDs” thing is interesting. The point of a PhD is to master both a very specific subject and the research skills needed to advance the frontier of knowledge in that area. There are also plenty of secondary issues, like figuring out the politics of academia and publishing enough to establish a reputation.
I don’t think models are doing that. They certainly can retrieve a huge amount of information that would otherwise only be available to specialists such as people with PhDs… but I’m not convinced the models have the same level of understanding as a human PhD.
It’s easy to test though- the models simply have to write and defend a dissertation!
Totally disagree. The current state of coding AIs is “a level 2 product manager who is a world class biker balancing on a unicycle trying to explain a concept in French to a Spanish genius who is only 4 years old.” I’m not going to explain what I mean, but if you’ve used Qwen Code you understand.
Qwen Code is really not representative of the state of the art, though. With the right prompt I have no problem getting Claude to output a complete codebase (e.g. a non-trivial library interfacing with multiple hardware devices) to the specs I want, in modern C++ that builds, runs, and has documentation and unit tests sourced from data sheets and manufacturer specs from the get-go.
Assuming there aren't tricky concurrency issues and the documentation makes sense (i.e., you know which registers to set to configure and otherwise work the device), device drivers are the easiest thing in the world to code.
There's the old trope that systems programmers are smarter than applications programmers, but SWE-Bench puts the lie to that. Sure, SWE-Bench problems are all stated in the language of software; applications programmers take badly specified tickets in the language of product managers, testers, and end users and have to turn them into the language of SWE-Bench to get things done. I'm not that impressed with 65% performance on SWE-Bench, because those are not the kind of tickets I have to resolve at work. Rather, at work, if I want to use AI to help maintain a large codebase, I need to break the work down into that kind of ticket myself.
> device drivers are the easiest thing in the world to code.
Except the documentation lies, and in reality your vendor shipped you a part with timing that is slightly out of sync with what the doc says. After three months of debugging, including with an oscilloscope, you figure out WTF is going on. You report back to your supplier, and after two weeks of them not saying anything they finally reply that the timings you reverse engineered are indeed the correct timings, sorry for any misunderstanding with the documentation.
As an applications engineer, my computer doesn't lie to me, and memory generally stays at the value I set it to unless I did something really wrong.
Backend services are the easiest thing in the world to write. I am 90% sure that all the bullshit around infra is just artificial job security, and I say this as someone who primarily does backend work nowadays.
I'm not sure if this counts as systems or application engineering, but if you think your computer doesn't lie to you, try writing an nginx config. Those things aren't evaluated /at all/ the way they look like they are.
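The classic example (a minimal, hypothetical config; the paths and the `\.png$` pattern are just for illustration): the longer, more specific-looking prefix does not necessarily win, because nginx checks regex locations after finding the best prefix match, and a matching regex overrides a plain prefix.

```nginx
location /images/ {
    # Looks like it should handle /images/logo.png...
    root /var/www/static;
}

location ~ \.png$ {
    # ...but this regex location wins for any .png request,
    # because a matching regex overrides a plain prefix match.
    root /var/www/cache;
}

# To make the prefix actually win, it has to opt out of regex checks:
# location ^~ /images/ { root /var/www/static; }
```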
At no point have any of my nginx files ever flipped their own bits.
Are they a constant source of low level annoyance? Sure. But I've never had to look at a bus timing diagram to understand how to use one, nor worried about an nginx file being rotated 90 degrees and wired up wrong!
To some extent, for sure. The fact that electronics engineers who have picked up a bit of software write a large fraction of the world's device drivers does suggest it's not the most challenging of software tasks. On the other hand, the real "systems engineering" is writing the code that lets those engineers do so successfully, which I think is quite an impressive feat.
I was joking! Claude Code is still the best afaik, though I'd compare it more to "sending a 1440p HDR fax of your user story to a 4-armed mime whose mind is then read by an Aztec psychic who has taken just the right amount of NyQuil."
Probably the saddest comment I've read all day. Crafting software line-by-line is the best part of programming (maybe when dealing with hardware devices you can instead rely on auto-generated code from the register/memory region descriptions).
How long will that be economically viable when a sufficient number of people can generate high-quality code in 1/10th the time? (Obviously, it will always be possible as a hobby.)
This seems to be the current consensus.
A very similar quote from another recent AI article:
One host compares AI chatbots to “a very smart assistant who has a dozen Ph.D.s but is also high on ketamine like 30 percent of the time.”
https://lithub.com/what-happened-when-i-tried-to-replace-mys...