> So the AI shows genuine creativity in problem solving.
This is garbage. The AI does not demonstrate "genuine creativity". LLMs generate statistically likely sequences of tokens based on their training data, which in this case is more or less the entire internet. The author asserts with absolutely no citation:
> I figured no one had ever made a request like that before.
What an idiotic claim. You're talking about the internet, for crying out loud. It's almost guaranteed somebody has written something like what you're writing before when it comes to requests of this nature.
Is the technology cool? Yes. Is it cooler than similar technologies from a year ago, or even a few months ago? Sure. But it's not intelligent. It doesn't think; it doesn't know. It generates statistically likely sequences of tokens for any input, and nothing more. People need to stop pretending otherwise.
I ultimately don't really disagree with you, but your vitriol really undermines your point. Figured you should be aware of how you come across, at least from my perspective.
Plus there's an argument to be made that you and I are also generating likely sequences based on the sum of our experiences, just in a more sophisticated (and evolutionarily refined) manner. There's at least no hard evidence to the contrary that I'm aware of.
Fake edit: I'm aware that my final statement is somewhat bait for the neuroscientists in the crowd. In this context I'm of the opinion that deciding whether AGI 'thinks' requires a far more rigorous definition of what it is that we (and other problem-solving / pass-the-mirror-test animals) are doing. For example, "It's the soul and GPT-4 doesn't have one" doesn't even come close to being rigorous enough, imo.
I get what you're saying, but people don't really care. The typical/average person does not know anything about derivatives, backpropagation, or probabilities, so to them it all seems like magic and they anthropomorphize what they're seeing as something intelligent.
Some folks that know how this stuff works and do a good job of explaining the limitations are Melanie Mitchell and François Chollet. Both have extensive experience in the field and have also written books on AI.
You can spend your time trying to explain to every random person that computers can't think but they're not gonna understand what you're saying because to them it seems like a large enough Markov chain is actually thinking.
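That "large enough Markov chain" intuition is easy to demo in a few lines. This is a toy bigram model, not how an LLM actually works; the corpus and seed word below are made up purely for illustration:

```python
# Toy bigram Markov chain: sample "statistically likely" next tokens.
# The corpus and seed are invented for this illustration.
import random
from collections import defaultdict

corpus = ("the model predicts the next token "
          "and the next token follows the model").split()

# Count which token follows which in the training text.
transitions = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    transitions[a].append(b)

def generate(seed: str, length: int = 8, rng=random.Random(0)) -> list:
    """Walk the chain, picking a random observed successor each step."""
    out = [seed]
    for _ in range(length):
        choices = transitions.get(out[-1])
        if not choices:
            break
        out.append(rng.choice(choices))
    return out

print(" ".join(generate("the")))
```

Every output token was seen in training, and every transition is just a lookup plus a coin flip, yet the output reads like a sentence. Scale that idea up far enough and you get the "it must be thinking" reaction.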
What about those of us who do understand them and just don't agree?
After all, you could simplify it for a layperson as: 'the LLM is just doing fancy autocomplete based on how stuff appeared in the training data, so that means they're not creative.'
At some point, the pushback against these models being creative starts to feel just as emotion-driven as the people who are over-anthropomorphizing the models: "If I accept that something I know is just a ball of linear algebra is creative, then it cheapens the definition of creativity."
People bring up the stochastic parrot argument forgetting that the original paper was predicated on the dangers of not considering the power that lies in something that's "just" a stochastic parrot.
This is what I mean when I say the "inverse-anthropomorphization" crowd is increasingly emotion over facts.
My reply to you was predicated on a compilation of centuries of scientific study on the subject of creativity. Your knee-jerk reply is to proclaim it's bad at sudoku while going out of your way to place artificial constraints on it.
Touting its inability to solve sudoku in-context feels like a slightly ham-fisted way of saying it's a probability-based model operating on tokens, but like I said before, there are plenty of us who already understand that.
We also realize that you can find arbitrary gaps in any sufficiently complex system. You didn't even need to rely on such a specific example, you could have touted any number of variations on common logic puzzles that they just fall completely on their faces for.
Gaps aren't damning until you tie them to what you want out of the system. The LLM can be bad at Sudoku and capable of creativity in some domain. It's more useful to explore unexpected properties of a complex system than it is to parade things that the system is already expected to be bad at.
The fact is that no neural network can solve sudoku puzzles. I think it's hilarious that AI proponents/detractors keep worrying about existential risk when not a single one of these systems can solve logic puzzles.
I didn't say anything about existential risk, and I'm going to assume you meant LLM, since training a NN to solve sudoku puzzles has been something you could do as an intro-to-ML project going back years: https://arxiv.org/abs/1711.08028
To me the existential risks are pretty boring and current LLMs are already capable of them: train on some biased data, people embed LLMs in a ton of places, the result is spreading bias in a black box where introspection is significantly harder.
In some ways it mirrors the original stochastic parrot warning, except "parrot" is a significantly less loaded term in this context.
Ah, the God of the Gaps argument. What's your next move when somebody implements a plugin that has the effect of being able to solve Sudoku puzzles?
A good friend swore that ML was 50 years away from being able to beat a Go grandmaster. To his credit, he stopped making such sweeping predictions after that happened. He didn't fall back to, "Well, I don't know, let's see how it does at Risk."
>What an idiotic claim. You're talking about the internet, for crying out loud. It's almost guaranteed somebody has written something like what you're writing before when it comes to requests of this nature.
I wonder if this is as true as we collectively think? ~15% of Google searches are unique†. Informally, folks I know who work on other search engines see even higher percentages, even after a decade-plus of history.
For me, the biggest productivity boost is from work that is (1) boring or time consuming to do, but (2) easy to verify / QC. (3) Extra useful if it is something the code interpreter can unit test in place, iterating on the code until the test passes.
For example, I'll show it some big nested JSON response, then ask: Please write a python helper function get_foo_and_bar(endpoint: str = "whatever") -> Dict[str, Any] that hits the endpoint with <parameters>, then pulls out the foo and bar, and converts bar to List[int]. Please write and run a test that verifies the retrieved values from the above response are foo="baz" and bar=[6,9,2].
It might take it 2 or 3 tries to get right, but it will iterate on its own until it works. Same process as if I were writing it by hand, but the 2-3 iterations that might take me a few minutes only take the model a few seconds.
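A sketch of the kind of helper that prompt produces. The endpoint URL and the exact nesting of the response (`"result"` wrapping `foo`/`bar`) are assumptions for illustration; parsing is split from fetching so the in-place unit test runs without network access:

```python
# Hypothetical sketch of the helper described above; the response shape
# ({"result": {"foo": ..., "bar": [...]}}) is an assumption for illustration.
import json
import urllib.request
from typing import Any, Dict


def parse_foo_and_bar(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Pull foo and bar out of the (assumed) nested response; bar -> List[int]."""
    data = payload["result"]  # assumed nesting; adjust to the real response
    return {"foo": data["foo"], "bar": [int(x) for x in data["bar"]]}


def get_foo_and_bar(endpoint: str = "https://example.com/api") -> Dict[str, Any]:
    """Hit the endpoint (placeholder URL) and extract the fields."""
    with urllib.request.urlopen(endpoint) as resp:
        return parse_foo_and_bar(json.load(resp))


# The kind of in-place unit test the model keeps iterating on until it passes:
_sample = {"result": {"foo": "baz", "bar": ["6", "9", "2"]}}
assert parse_foo_and_bar(_sample) == {"foo": "baz", "bar": [6, 9, 2]}
```

The assert at the bottom is exactly the "easy to verify / QC" part: if the model gets the nesting or the int conversion wrong, the test fails and it tries again.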
Code Interpreter is great for "quick and dirty" data analysis & visualizations. I've seen it struggle with larger datasets (not sure what the current limit is) but I can certainly see this sort of stuff replacing basic BI at organizations.
Microsoft is working on embedding this sort of stuff into what they call Fabric, which is essentially meant to be PowerBI on steroids. I know that in the current preview, you can interact with data using a chat-like interface ("Show me our revenue for the last 12 months sorted by customer").
One large use case is certainly data analysis. But one big challenge with analytics is: you always get a result.
Wrong filter, wrong aggregation function, wrong table or wrong join can all still result in plausible but wrong results.
Hallucinations take the form of arbitrary query logic.
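A toy example of the failure mode: the schema and numbers below are invented, but the pattern is the classic join fan-out. Nothing errors, and the wrong query still returns a perfectly plausible revenue figure:

```python
# Invented schema for illustration: a join that fans out rows still returns
# a plausible-looking total -- it's just silently wrong.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id INT, amount REAL);
    CREATE TABLE payments (customer_id INT, method TEXT);
    INSERT INTO orders VALUES (1, 100.0), (2, 50.0);
    INSERT INTO payments VALUES (1, 'card'), (1, 'voucher'), (2, 'card');
""")

# Correct total revenue: 100 + 50 = 150.0
correct = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# Wrong but plausible: the join duplicates customer 1's order (two payment
# rows), double-counting its amount. No error, just a silently inflated sum.
wrong = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o JOIN payments p ON o.customer_id = p.customer_id
""").fetchone()[0]

print(correct, wrong)  # 150.0 vs 250.0
```

Both queries run, both return a single tidy number, and only one is right. That's why "you always get a result" is the hard part for NL-to-SQL analytics.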
Disclaimer: I'm a cofounder at getdot.ai. We are starting to solve that with a semantic layer, but for now it is limited to business reporting (sales per territory, etc.). PowerBI will imo make it even easier for people to create visuals (NL vs drag-n-drop), but right now it fails at the consistency part.