
I tried it in OpenAI's O1. If I give it minimaxir's original prompt it writes the obvious loop, even if I include the postamble "Look for tricks that will make this function run as fast as possible in the common case".

However, if I then simply ask "What is the most probable result for this function to return?" it figures out the answer and a very good approximation of the probability (4.5e-5). From there it's easily able to rewrite the program to use the trick. So the creative step of spotting that this line of reasoning might be profitable seems missing for now, but 2025's models might solve this :-)
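For concreteness, here is one plausible shape of that trick as a rough sketch, assuming the article's setup (1,000,000 uniform draws from 1 to 100,000, qualifying numbers have digit sum 30), in which case the extreme qualifying values are 3999 and 99930. This is an illustration of the idea, not necessarily the code O1 produced:

    import random

    def digit_sum(n: int) -> int:
        return sum(int(d) for d in str(n))

    def min_max_diff(nums: list[int]) -> int:
        # Any specific value is absent from 1,000,000 draws with probability
        # (1 - 1/100_000) ** 1_000_000 ~= e**-10 ~= 4.5e-5, so both endpoints
        # are almost always present and the answer is almost always
        # 99930 - 3999.  Two plain membership scans (which stop at the first
        # hit) are far cheaper than a digit sum for every element.
        if 3999 in nums and 99930 in nums:
            return 95931
        # Rare fallback: the obvious loop.
        qualifying = [n for n in nums if digit_sum(n) == 30]
        return max(qualifying) - min(qualifying)

    # nums = [random.randint(1, 100_000) for _ in range(1_000_000)]
    # print(min_max_diff(nums))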




The information about the creative step which you provided to o1 was itself the key step, and contained almost all the difficulty. The hope is that 2025 models could eventually come up with solutions like this given enough time, but this is also a toy problem. The question is how much clever answers will cost for real-world, complex problems. At present it looks like: very much.


For me, O1 found this when I told it "There is a further significant optimization possible."


What if you keep telling it that "there is a further significant optimization possible"?


I claim we can achieve O(1) complexity (minus precompute) in all cases; see another comment of mine. Curious if O1 will figure it out.


In that comment, you are generating your own random numbers and then optimizing away the actual generation. It can't take an input array.

While clever, I think that strays too far from the initial prompt.


All I need to run the algorithm is the proportion of qualifying numbers in the input array and the number of samples. Then we can sample the min and max index of the qualifying array and return their difference, without having to sample many times, if we can derive the joint min/max distribution conditional on the Bernoulli count.

In other words, the procedure can take any input array and qualifying criteria.

The joint distribution is relatively simple to derive. (This is related to the fact that the min and max of continuous uniforms on [0, 1] are Beta distributed.)
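A rough sketch of what this might look like, with made-up names; it uses the continuous (Beta/order-statistic) approximation for the min and max ranks and samples them independently, which ignores their weak dependence for large counts:

    import numpy as np

    rng = np.random.default_rng()

    def digit_sum(n: int) -> int:
        return sum(int(d) for d in str(n))

    # Precompute the qualifying values once (the "minus precompute" part).
    QUALIFYING = np.array([v for v in range(1, 100_001) if digit_sum(v) == 30])
    M = len(QUALIFYING)               # number of distinct qualifying values
    P = M / 100_000                   # chance that a single uniform draw qualifies

    def sample_answer(n_draws: int = 1_000_000) -> int:
        """Sample (max - min) of the qualifying draws without building the array."""
        k = rng.binomial(n_draws, P)  # Bernoulli count of qualifying draws
        if k == 0:
            return 0                  # convention for "no qualifying number drawn"
        # Given that a draw qualifies, its rank is uniform on {0, ..., M-1}, so
        # the k ranks are i.i.d. uniform.  Order statistics of uniforms are Beta
        # distributed, hence approximately:
        #   max rank ~ M * U**(1/k),   min rank ~ M * (1 - V**(1/k)).
        hi = int(M * rng.random() ** (1.0 / k))
        lo = min(int(M * (1.0 - rng.random() ** (1.0 / k))), hi)
        return int(QUALIFYING[hi] - QUALIFYING[lo])

Each call is O(1) after the precompute; note that it samples an answer from the right distribution rather than computing the answer for a particular given array.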


Sampling doesn't give you the actual answer for an actual array. If the program uses the array for multiple things, such as organizing the numbers after allocating the correct number of buckets, your method will cause logic errors and crashes.

The O(1) method based on statistics only works when the function making this calculation can hide the array (or lack of array) behind a curtain the entire time. If it has to take an array as input, or share its array as output, the facade crumbles.

The prompt is not "generate this many random numbers and then say max qualifying minus min qualifying". If it was, your method would give valid solutions. But the prompt starts with "Given a list".

In the article, we let ChatGPT generate the random numbers as a matter of convenience. But the timing results are only valid as long as it keeps that part intact and isolated. We have to be able to swap it out for any other source of random numbers. If it invents a method that can't do that, it has failed.
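A hypothetical illustration of that constraint (names made up): the generation step stays outside the solving function, so any other source of numbers can be plugged in:

    import random
    from typing import Callable, Sequence

    def solve(nums: Sequence[int]) -> int:
        """The part being benchmarked: it must accept the list, not produce it."""
        qualifying = [n for n in nums if sum(int(d) for d in str(n)) == 30]
        return max(qualifying) - min(qualifying)

    def run(source: Callable[[], Sequence[int]]) -> int:
        # The harness owns the numbers; `source` could be an RNG, a file, stdin, ...
        return solve(source())

    print(run(lambda: [random.randint(1, 100_000) for _ in range(1_000_000)]))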


It still depends on how you read the problem. In a lot of the LLM solutions the array is not passed to the solving function but constructed inside it (instead of defining the function with an input and then having a main function, called with no arguments, construct the array and pass it to the solving function as an argument, as is typical in Python). So either the LLM did not read it that way, or it also got this aspect of the code wrong (which was never really mentioned). It is not clear whether we are given a specific array of integers, or whether the input is an array of random variables that we need to instantiate ourselves; the two readings are sketched below.
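A sketch of the two readings, with hypothetical function names:

    import random

    def digit_sum(n: int) -> int:
        return sum(int(d) for d in str(n))

    # Reading 1: a specific array of integers is given, so the solver takes it
    # as an argument.
    def solve(nums: list[int]) -> int:
        qualifying = [n for n in nums if digit_sum(n) == 30]
        return max(qualifying) - min(qualifying)

    # Reading 2: the "input" is a distribution, so the solver instantiates the
    # array itself, which is what many of the generated solutions actually did.
    def solve_from_distribution() -> int:
        nums = [random.randint(1, 100_000) for _ in range(1_000_000)]
        return solve(nums)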


Given that the problem size is bounded, every solution to this could be considered O(1).


This gets to the old saw: "knowing what question to ask is the most important thing". To the extent that LLMs are better at answering questions than at formulating which ones to ask, they may be inherently limited. We will see.


But it does seem they are good (to the extent that they are good at anything) at identifying the questions first, if you ask them. It does mean you need a good-enough meta-question to start the chain of reasoning, but that is the key insight of the recent wave of "reasoning models": first ask the LLM to reformulate the problem and structure an approach, or several approaches, to addressing it, then have a second pass do just that.
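A minimal sketch of that two-pass pattern, assuming the OpenAI Python client; the model name and prompts are placeholders:

    from openai import OpenAI

    client = OpenAI()
    MODEL = "gpt-4o"  # placeholder

    def two_pass(problem: str) -> str:
        # Pass 1: only reformulate the problem and outline approaches.
        plan = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user",
                       "content": "Restate this problem precisely and outline "
                                  "two or three approaches, no code yet:\n" + problem}],
        ).choices[0].message.content
        # Pass 2: a fresh call executes the chosen plan.
        return client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user",
                       "content": "Following this plan:\n" + plan +
                                  "\n\nwrite the code for:\n" + problem}],
        ).choices[0].message.content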


Google search with fewer steps? Still a huge advancement, of course.

Wonder how much benefit we would get from a meta-language for describing these problems precisely enough for LLMs to process into code; an even-higher-level language, which perhaps we could call English?



