For that to be somewhat true, I can see at least two prerequisites: 1) the function must be pure (no side effects); 2) predicting well is not enough; GPT must predict perfectly, a hundred percent of the time.
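To make the first prerequisite concrete, here's a minimal Python sketch (the function names are mine): even a perfect output predictor can't stand in for executing an impure function, because executing it also produces the side effect.

    log = []

    def pure_add(a, b):
        return a + b              # output fully determined by inputs

    def impure_add(a, b):
        log.append((a, b))        # side effect: mutates external state
        return a + b

    # Perfectly predicting impure_add's return value still leaves `log`
    # untouched, so the prediction is not equivalent to running the code.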
It is fundamentally different in its mathematical foundations; some functions are formally verified and will therefore execute 100% correctly (I guess you are talking about actual bugs like hardware issues?). What GPT-3 does is not even close to that: if you give GPT-3 the same input multiple times, it comes up with different answers. That is nowhere close to a computer executing an algorithm.
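For anyone wondering why that happens: the model produces a probability distribution over tokens, and the API samples from it at nonzero temperature. Roughly like this toy sketch (the logits are made up):

    import numpy as np

    logits = np.array([2.0, 1.0, 0.5])               # hypothetical token scores
    probs = np.exp(logits) / np.exp(logits).sum()    # softmax over tokens
    print(np.random.choice(["A", "B", "C"], p=probs))  # differs run to run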
I'm not talking about GPT-3, I'm discussing the theoretical question raised by the grandparent of my comment: How is predicting the output of a function fundamentally different from executing the code?
We call computers deterministic despite the fact that they don't perform the calculations we set them with perfect reliability. The probability that they'll be correct is very high, but it's not 1. So the requirement for something to be considered deterministic is certainly not "perfectly a hundred percent of the time", as the parent to my comment suggested.
> if you give GPT-3 the same input multiple times, it comes up with different answers. That is nowhere close to a computer executing an algorithm.
It's a non-deterministic algorithm, of which many kinds exist. Producing different answers that are close-ish to correct is exactly what a Monte Carlo algorithm does. Not that you'd use GPT-3 as a Monte Carlo algorithm, but it's not that different.
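A concrete example, since "Monte Carlo" is doing the work here: the classic pi estimator. Same input, different runs, answers that are all close-ish to correct:

    import random

    def estimate_pi(n):
        # Fraction of random points in the unit square that fall inside
        # the quarter circle approximates pi/4.
        inside = sum(random.random() ** 2 + random.random() ** 2 <= 1.0
                     for _ in range(n))
        return 4.0 * inside / n

    print(estimate_pi(100_000))  # e.g. 3.1424
    print(estimate_pi(100_000))  # e.g. 3.1391; different, both near pi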
I don't think that's a reasonable assumption. If we allow ourselves to assume no errors, we could just assume GPT-3 makes no errors and declare it equivalent to a code interpreter.
Interpreter? Sure. That interpretation is not "equivalent to executing the code", though.
Imagine a C compiler that does aggressive optimizations, sacrificing huge amounts of memory for speed. On the one hand it even reduces computational complexity; on the other, it produces incorrect results in many cases.
GPT-3 as presented here would be comparable to that. Neither is equivalent to executing the original code.
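If it helps, here's a hypothetical sketch of that kind of trade-off (not anything a real compiler does): cache an expensive function's results under a coarsened key, so repeat calls get fast at a memory cost, while distinct inputs can silently share one wrong answer.

    import math

    _cache = {}

    def fast_sin(x):
        key = round(x, 2)              # lossy key: nearby inputs collide
        if key not in _cache:
            _cache[key] = math.sin(x)  # first caller's answer wins
        return _cache[key]

    fast_sin(0.1001)  # ~0.0999, cached under key 0.1
    fast_sin(0.1049)  # returns the 0.1001 answer: fast but wrong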
Meanwhile, the output of something like gcc is equivalent to executing the original code, even if it runs on a computer with faulty RAM.
Speed and memory are orthogonal to my point, which is about the output of two methods of arriving at an answer. I'm obviously not saying GPT-3 is anything like as efficient as running a small function.
What distinction are you drawing between the output of an interpreted program and a compiled program?
Still, technically you're not executing the code.