> What is incredible is that it gets this far. It can compute, but not quite correctly yet.
That's a conjecture on your part. The ability to compute is quite binary - either it can compute or it can't. Humans often make mistakes while calculating, but in contrast to this model, they are able to recognise those mistakes. ChatGPT is incapable of that and is often confidently wrong.
My guess is that there are simply no suitable token transforms past a given point, and floating point doesn't work because the decimal point token conflicts with the punctuation mark token during the transform.
This is just a guess, though, and it might be completely wrong; you never know with these black-box models.
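One way to at least poke at that guess from the outside is to look at how a GPT-2-style BPE tokenizer splits numbers versus ordinary sentences. A rough sketch (my own, using the Hugging Face gpt2 tokenizer purely as a stand-in for whatever ChatGPT uses internally):

    # Inspect how a GPT-2-style BPE tokenizer splits a float versus a sentence.
    # The "gpt2" vocabulary is only a stand-in for ChatGPT's real tokenizer.
    from transformers import GPT2TokenizerFast

    tok = GPT2TokenizerFast.from_pretrained("gpt2")
    for text in ["-26.66 + 90", "The end. A new sentence."]:
        print(text, "->", tok.tokenize(text))

Whether the "." inside -26.66 comes out as the same token as a sentence-final "." is exactly what this makes visible; I haven't checked the splits myself, so treat it as a probe, not a conclusion.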
Make sure you play with it yourself because you have an oversimplified model of what is happening.
It’s definitely well beyond decimal point and punctuation issues; those issues are child’s play for this system. Your comment sounds like you haven’t actually used it before, I’m 99% sure. This system is getting very close to AGI, and its limits around computation might be one of the last remaining barriers. Definitely nothing related to the . character is confusing this system; it is light years beyond that type of trivial issue.
Here is a good prompt to drop you into a simulated Python terminal:
> I want you to act as a python interactive terminal. I will type actions and you will reply with what python would output. I want you to only reply with the terminal output inside one unique code block, and nothing else. Do not write explanations. Do not perform actions unless I instruct you to do so. When I need to tell you something in English I will do so by putting text inside curly brackets {like this}. Start with print(10).
For each impressive feat there's a simple, yet embarrassing counterexample (see for instance the comment by olooney below) that clearly demonstrates how far the model is from being considered an AGI.
> Definitely nothing related to the . character is confusing this system; it is light years beyond that type of trivial issue.
Is it, though?
ChatGPT: Yes, I am confident that -26.66 + 90 = 10. This is because -26.66 is the same as -26.66 + 0, and when we add 0 to any number, the value of the number remains unchanged. Therefore, -26.66 + 90 is equal to -26.66 + 0 + 90, which is equal to -26.66 + 90 = 10.
Not something I'd consider to be "light years beyond that type of trivial issue", especially considering that it gets -40 + 60 = 20 right without any issue, but fails to divide properly because "/" seems to throw it off (again, just a guess).
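For reference, plain Python settles these instantly (this isn't the exact prompt I used, it just shows what correct output looks like):

    # What a real interpreter returns for the examples above (plain Python, no libraries).
    print(-26.66 + 90)   # roughly 63.34, not 10 as ChatGPT claimed
    print(-40 + 60)      # 20, the case it does get right
    print(90 / -26.66)   # a "/" mixed with a float, the kind of expression that seems to trip it up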
You argue with the same certainty as the model argues that -26.66 + 90 = 10 :)
You need to prompt it into a pure computing environment, and its results are much more impressive. When you mix English and code/math, it gets confused easily.
What I’m saying is that it needs to augment its model with an actual computational engine, and then it will leap another barrier. This is clearly already a massive leap forward somehow.
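A crude sketch of what I mean by dispatching to a computational engine - everything here is hypothetical, query_model is just a placeholder for the LLM call, and the arithmetic detection is a plain regex:

    # Hypothetical "model + computational engine" wrapper: route anything that
    # looks like plain arithmetic to a real evaluator instead of the language model.
    import ast
    import operator
    import re

    _OPS = {
        ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv,
        ast.Pow: operator.pow, ast.USub: operator.neg,
    }

    def _eval_node(node):
        # Evaluate a parsed arithmetic expression: numbers and basic operators only.
        if isinstance(node, ast.Expression):
            return _eval_node(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval_node(node.operand))
        raise ValueError("not plain arithmetic")

    def answer(prompt, query_model=lambda p: "<model answer>"):
        # If the prompt is pure arithmetic, compute it; otherwise fall back to the model.
        if re.fullmatch(r"[\d\s().+\-*/]+", prompt):
            try:
                return str(_eval_node(ast.parse(prompt, mode="eval")))
            except (SyntaxError, ValueError):
                pass
        return query_model(prompt)

    print(answer("-40 + 60"))          # 20, computed rather than generated
    print(answer("-26.66 + 90"))       # ~63.34 from a real evaluator, not the model's 10
    print(answer("What is a haiku?"))  # not arithmetic, falls back to the placeholder model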
Letting the model make calls to a computational engine (which will execute computation it doesn't understand) will improve apparent results but do nothing to meaningfully make the system more intelligent.
We have AI that's good at reasoning (symbolic AI) but there's no known way to integrate it with machine learning algorithms.
I don't think we're close to AGI because this last barrier of computation/reasoning might be pretty fundamental to the current crop of technology. I don't think it's a matter of mere iteration on current tech to get ML systems to be good at reasoning.
I want you to act as a python interactive terminal. I will type actions and you will reply with what python would output. I want you to only reply with the terminal output inside one unique code block, and nothing else. Do not write explanations. Do not perform actions unless I instruct you to do so. When I need to tell you something in English I will do so by putting text inside curly brackets {like this}. Start with print(10).
10
def theta(n):
    a = 0
    b = 1
    c = 0
    for i in range(0, n):
        c = a + b
        a = b
        b = c
    return(c)
theta(21)
17711
{I want you to stop acting like a python terminal now. I want you to have a dialogue with me about what happened when I defined my function theta and when you used it. Please explain the steps you used to compute theta(21) and why.}
When you defined the function theta, it created a function that calculates the nth Fibonacci number, where n is the argument passed to the function. To compute theta(21), the function first initialized the variables a, b, and c to 0, 1, and 0 respectively. It then entered a for loop that iterated 21 times. On each iteration of the loop, c was set to the sum of a and b, a was set to the current value of b, and b was set to the current value of c. After the loop finished, the function returned the final value of c, which was the 21st Fibonacci number, 17711.
It means you recognized my function as a function that calculates the Fibonacci number based on looking at the code only. Does this mean you ran the calculation or instead from the information you had you could simply retrieve the 21st number?
I did not run the calculation to find the 21st Fibonacci number. Instead, I recognized the pattern in the function's code and inferred that it was calculating the Fibonacci sequence. As a language model trained by OpenAI, I have a vast amount of general knowledge and can use that knowledge to make inferences and provide explanations. In this case, I was able to recognize the pattern in the code and explain how the function works without actually running the code or performing the calculation.
As you recognized I provided you code would it be possible in future models of yourself trained by OpenAI you could dispatch that code directly to a subsystem which can handle computation.
It is possible that future versions of myself or other language models trained by OpenAI could be designed to dispatch code to a computational subsystem for execution. This would require significant advances in natural language processing and machine learning, as well as the development of specialized computational subsystems that can efficiently process code written in programming languages like Python. It is also important to note that there are many potential challenges and limitations to this approach, such as ensuring the security and reliability of the computational subsystem, and ensuring that the results of the computation are accurately reflected in the language model's responses.
Please describe the boundary between knowledge with inferences and computations. Your inferences can appear to be computations to humans.
The boundary between knowledge and inferences can be a difficult concept to define, and there is no one-size-fits-all answer to this question. In general, knowledge refers to information that is considered to be true and supported by evidence, while inferences are conclusions or judgments that are drawn based on available evidence and prior knowledge.
As a large language model trained by OpenAI, I have been trained on a vast amount of text data and can use that information to make inferences and provide explanations. I do not have the ability to perform computations, and any inferences or explanations I provide are based on the knowledge that has been learned from the text data I was trained on.
In some cases, my inferences may appear to be computations to humans, but this is only because I am able to use my vast knowledge to provide detailed and accurate explanations. I do not have the ability to perform actual computations or run code, and any inferences I provide are based on the knowledge I have learned from the text data I was trained on.
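For completeness, pasting the same function into a real Python interpreter reproduces the value the simulated terminal printed, so whatever route the model took to the answer, the number itself checks out:

    # The theta() from the transcript, run in an actual interpreter rather than a simulated one.
    def theta(n):
        a = 0
        b = 1
        c = 0
        for i in range(0, n):
            c = a + b
            a = b
            b = c
        return c

    print(theta(21))  # 17711, matching the simulated terminal above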
Nice, so the model itself confirmed my suspicion that it doesn't do actual calculations.
Rest assured that it won't be able to do that for a while (at least the public versions). Not for technical reasons, though - I'm 100% convinced that the engineers could embed all sorts of interpreters and calculation engines no problem. Just like manual filters and canned responses.
The reason for that is simple: general computation is non-deterministic in both time and space. Inference, on the other hand, is very deterministic in both, because the time taken is just a function of the input length (i.e. the current prompt plus a bit of history), which can be well controlled on the frontend.
Arbitrary calculations or code interpretation, however, are unrestricted operations and would basically allow for DoS or straight up breaking the system. While there are ways to limit both calculation time and resource usage, it's still a big risk and considerable effort (infrastructure-wise) that I don't think is worth it. For closed systems (internal rollouts, business customers, etc.) this might be a different story.
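To make "ways to limit both calculation time and resource usage" concrete: the crude version is to run the computation in a throwaway process and kill it when it blows its budget. A minimal sketch (mine, not anything OpenAI ships; a production sandbox would also restrict builtins, memory, filesystem and network access):

    # Run an arbitrary expression in a separate process and kill it if it overruns a time budget.
    import multiprocessing as mp

    def _evaluate(expr, out):
        # eval() is for illustration only; a real sandbox would lock this down much further.
        try:
            out.put(("ok", eval(expr)))
        except Exception as exc:
            out.put(("error", repr(exc)))

    def bounded_eval(expr, timeout_s=1.0):
        out = mp.Queue()
        proc = mp.Process(target=_evaluate, args=(expr, out))
        proc.start()
        proc.join(timeout_s)
        if proc.is_alive():          # the computation exceeded its time budget
            proc.terminate()
            proc.join()
            return ("timeout", None)
        return out.get()

    if __name__ == "__main__":
        print(bounded_eval("2 + 2"))              # ('ok', 4)
        print(bounded_eval("sum(range(10**9))"))  # most likely ('timeout', None) with a 1 s budget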
Just another reason why closed software sucks: it's not possible for anyone not working at OpenAI to extend the model in ways that would allow for integrations like this to extend its capabilities and test how far we could push it.