
People saying "oh look how amazing, it can solve programming problems!" when in fact all they've seen is the models CHEAT is an enormous problem.

For cases where all you want is to find the answer, that's perfectly fine, but it's not fine as support for claims that it can code. There's a huge difference.




It can generate never-before-seen strings of comprehensible language. It can react to the inherent logic embedded in words and text and provide a brute-forced version of what a human could. That it can “solve” a problem only through “cheating” is an anthropomorphism that betrays the magic that is evident to anyone who has used these things.


I've seen it code on completely novel tasks, so I'm not sure what you're suggesting here. The model can unquestionably code.


Almost 2024 and people still can't accept that LLMs can code...


Of course they can't. And self-driving cars also don't exist, it's like 10 years away at best.


Okay... Funny how forcing it to not CHEAT did not increase apparent ability.

"It can code" and "it has memorized some coding questions" are not mutually exclusive.


Though this is exactly what happened. The initial test was run on a model that "cheated" (i.e., had memorized the answers). The second test was run on a model that didn't "cheat" as much, yet it still scored only 2% lower. So, the question is not really resolved. How much did the first model cheat, and how much did the second? If the second model "cheats" less, then it wins.

Also, I don't understand your obsession with the word cheating. If you have solved a problem before on a different website and solve it again, did you cheat? Or did you just use your brain to store the solution for later?


> Also, I don't understand your obsession with the word cheating.

It's all about the rule set, yeah. Since the rule set is not defined, technically nothing is cheating. I just interpret the rule set as "can it code?", and under that rule set, it seems to me that it's cheating.


> How much did the first model cheat, and how much did the second? If the second model "cheats" less, then it wins.

They both cheated 100%, because neither of them ever saw the problem. AT ALL. They just saw the title and the name of the website.


> Okay... Funny how forcing it to not CHEAT did not increase apparent ability.

The article did the opposite: it forced the models to cheat to solve the problems, which they did happily. They should have said "there is no actual problem to solve here, you must supply a problem for me to solve".

> "It can code" and "it has memorized some coding questions" are not mutually exclusive

This I will give you. Many humans try to cheat at basic math because they are lazy, and so does this model. Maybe that's a sign of intelligence :P


Me: What's 6x6?

You: 36

Me: You cheated! You just recited the answer you memorized! You should have started from addition.

You: ...okay? 6+6=12, 12+6=18, 18+...

Me: You cheated again! You just have 6+6=12 memorized! You should derive the rule of addition from the Peano axioms.

You: ...you're being annoying, but okay? First axiom, we define 0 as...

Me: You cheated again! You memorized the Peano axioms! Jesus Christ, is there any intelligent creature left?


TBH, people underestimate how much of coding is just memorization. I'm guessing those of us with bad memories understand this more than the ones with good memories. :)

I can't remember how many times I've googled "how do I create a directory in Python?". Now Bard often generates an inline answer for me.
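(For reference, the thing I keep looking up is basically one standard-library call; the exist_ok flag is just the variant I usually want:)

    import os

    # Create the directory, including any missing parent directories,
    # and don't raise an error if it already exists.
    os.makedirs("path/to/dir", exist_ok=True)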


But in this case it's not like that at all. They only saw the NAME of the problem. It's like if I said "Page 23 of Mathbook Y, problem number 3", which happens to be 6x6.


I know this is deep down a bad comment thread, but I thought I'd chime in here.

I have been writing function names and test names, and then telling GPT to fill in the test, which it usually does how I want (maybe with errors, but it tests the correct thing), and then I tell it to fill out the answers.

This is in a thing I'm building that's never been built before, with names that I made up (but that describe the functionality well).

It cannot have this memorized; I just invented it myself.
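To give a rough, hypothetical example of the shape of the workflow (made-up names, not my actual project): I write just the signature and the test name, and GPT fills in the bodies, roughly like this.

    # What I hand over: the function name/signature and the test name only.
    # The bodies below are the kind of thing the model fills in.

    def merge_overlapping_intervals(intervals):
        """Merge intervals that overlap or touch, returning a sorted list."""
        merged = []
        for start, end in sorted(intervals):
            if merged and start <= merged[-1][1]:
                # Extend the previous interval instead of starting a new one.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        return merged

    def test_merge_overlapping_intervals_merges_adjacent():
        assert merge_overlapping_intervals([(1, 3), (2, 5), (7, 8)]) == [(1, 5), (7, 8)]
        assert merge_overlapping_intervals([]) == []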


If I gave you a programming problem and all I told you was that the problem name was Traveling Salesman, you might be able to solve it based on that.

If not that, then if I just said "fizzbuzz" to you, I'm sure you would be able to give the solution without me needing to give any other description of the problem.
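(The name alone is the whole spec; roughly, in Python:)

    # Classic fizzbuzz, reconstructed from nothing but its name.
    for i in range(1, 101):
        if i % 15 == 0:
            print("FizzBuzz")
        elif i % 3 == 0:
            print("Fizz")
        elif i % 5 == 0:
            print("Buzz")
        else:
            print(i)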


Again, that's because of memorization, not an ability to code.


But in that case, not memorization of the specific problem set, but "programming background knowledge." Hardly something to blame the machine for when we rely on it every day.


Me: I was in such-and-such a situation... does Article 3 of the Digital Government Act apply here?

My lawyer: Hmm, Article 3 says--

Me: I knew it! Lawyers are not intelligent...


The article said they gave the exercise name, which doesn't sound like just an exercise number but something at least mildly descriptive -- and they also gave it function stubs.



