
I tried it in OpenAI's O1. If I give it minimaxir's original prompt it writes the obvious loop, even if I include the postamble "Look for tricks that will make this function run as fast as possible in the common case".

However, if I then simply ask "What is the most probable result for this function to return?" it figures out the answer and a very good approximation of the probability (4.5e-5). From there it's easily able to rewrite the program to use the trick. So the creative step of spotting that this line of reasoning might be profitable seems missing for now, but 2025's models might solve this :-)
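For concreteness, here is one plausible shape of that trick as a rough sketch, assuming the article's setup (1,000,000 uniform draws from 1 to 100,000, qualifying numbers have digit sum 30), in which case the extreme qualifying values are 3999 and 99930. This is an illustration of the idea, not necessarily the code O1 produced:

    import random

    def digit_sum(n: int) -> int:
        return sum(int(d) for d in str(n))

    def min_max_diff(nums: list[int]) -> int:
        # Any specific value is absent from 1,000,000 draws with probability
        # (1 - 1/100_000) ** 1_000_000 ~= e**-10 ~= 4.5e-5, so both endpoints
        # are almost always present and the answer is almost always
        # 99930 - 3999.  Two plain membership scans (which stop at the first
        # hit) are far cheaper than a digit sum for every element.
        if 3999 in nums and 99930 in nums:
            return 95931
        # Rare fallback: the obvious loop.
        qualifying = [n for n in nums if digit_sum(n) == 30]
        return max(qualifying) - min(qualifying)

    # nums = [random.randint(1, 100_000) for _ in range(1_000_000)]
    # print(min_max_diff(nums))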




The information about the creative step which you provided to o1 was itself the key step, and contained almost all the difficulty. The hope is that 2025 models could eventually come up with solutions like this given enough time, but this is also a toy problem. The question is how much clever answers will cost for real-world, complex problems. At present it looks like: very much.


For me, O1 found this when I told it "There is a further significant optimization possible."


What if you keep telling it that "there is a further significant optimization possible"?


I claim we can achieve O(1) complexity (minus precompute) in all cases; see another comment of mine. Curious if O1 will figure it out.


In that comment, you are generating your own random numbers and then optimizing away the actual generation. It can't take an input array.

While clever, I think that strays too far from the initial prompt.


All I need to run the algorithm is the proportion of qualifying numbers in the input array and the number of samples. Then we can sample the min and max index of the qualifying array and return their difference, without having to sample many times, if we can derive the joint min/max distribution conditional on the Bernoulli count.

In other words, the procedure can take any input array and qualifying criteria.

The joint distribution is relatively simple to derive. (This is related to the fact that the min and max of continuous uniforms on [0, 1] are Beta distributed.)
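A rough sketch of what this might look like, with made-up names; it uses the continuous (Beta/order-statistic) approximation for the min and max ranks and samples them independently, which ignores their weak dependence for large counts:

    import numpy as np

    rng = np.random.default_rng()

    def digit_sum(n: int) -> int:
        return sum(int(d) for d in str(n))

    # Precompute the qualifying values once (the "minus precompute" part).
    QUALIFYING = np.array([v for v in range(1, 100_001) if digit_sum(v) == 30])
    M = len(QUALIFYING)               # number of distinct qualifying values
    P = M / 100_000                   # chance that a single uniform draw qualifies

    def sample_answer(n_draws: int = 1_000_000) -> int:
        """Sample (max - min) of the qualifying draws without building the array."""
        k = rng.binomial(n_draws, P)  # Bernoulli count of qualifying draws
        if k == 0:
            return 0                  # convention for "no qualifying number drawn"
        # Given that a draw qualifies, its rank is uniform on {0, ..., M-1}, so
        # the k ranks are i.i.d. uniform.  Order statistics of uniforms are Beta
        # distributed, hence approximately:
        #   max rank ~ M * U**(1/k),   min rank ~ M * (1 - V**(1/k)).
        hi = int(M * rng.random() ** (1.0 / k))
        lo = min(int(M * (1.0 - rng.random() ** (1.0 / k))), hi)
        return int(QUALIFYING[hi] - QUALIFYING[lo])

Each call is O(1) after the precompute; note that it samples an answer from the right distribution rather than computing the answer for a particular given array.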


Sampling doesn't give you the actual answer for an actual array. If the program uses the array for multiple things, such as organizing the numbers after allocating the correct number of buckets, your method will cause logic errors and crashes.

The O(1) method based on statistics only works when the function making this calculation can hide the array (or lack of array) behind a curtain the entire time. If it has to take an array as input, or share its array as output, the facade crumbles.

The prompt is not "generate this many random numbers and then say max qualifying minus min qualifying". If it was, your method would give valid solutions. But the prompt starts with "Given a list".

In the article, we let ChatGPT generate the random numbers as a matter of convenience. But the timing results are only valid as long as it keeps that part intact and isolated. We have to be able to swap it out for any other source of random numbers. If it invents a method that can't do that, it has failed.
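A hypothetical illustration of that constraint (names made up): the generation step stays outside the solving function, so any other source of numbers can be plugged in:

    import random
    from typing import Callable, Sequence

    def solve(nums: Sequence[int]) -> int:
        """The part being benchmarked: it must accept the list, not produce it."""
        qualifying = [n for n in nums if sum(int(d) for d in str(n)) == 30]
        return max(qualifying) - min(qualifying)

    def run(source: Callable[[], Sequence[int]]) -> int:
        # The harness owns the numbers; `source` could be an RNG, a file, stdin, ...
        return solve(source())

    print(run(lambda: [random.randint(1, 100_000) for _ in range(1_000_000)]))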


It still depends on how you read the problem. In a lot of the LLM solutions the array is not passed to the solving function but constructed inside it (instead of defining the function with an input and then having a main function, called with no arguments, construct the array and pass it to the solving function as an argument, as is typical in Python). So either the LLM did not read it that way, or it also got this aspect of the code wrong (which was never really mentioned). It is not clear whether we are given a specific array of integers, or whether the input is an array of random variables that we need to instantiate ourselves; the two readings are sketched below.
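A sketch of the two readings, with hypothetical function names:

    import random

    def digit_sum(n: int) -> int:
        return sum(int(d) for d in str(n))

    # Reading 1: a specific array of integers is given, so the solver takes it
    # as an argument.
    def solve(nums: list[int]) -> int:
        qualifying = [n for n in nums if digit_sum(n) == 30]
        return max(qualifying) - min(qualifying)

    # Reading 2: the "input" is a distribution, so the solver instantiates the
    # array itself, which is what many of the generated solutions actually did.
    def solve_from_distribution() -> int:
        nums = [random.randint(1, 100_000) for _ in range(1_000_000)]
        return solve(nums)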


Given that the problem size is bounded, every solution to this could be considered O(1).


This gets to the old saw: "knowing what question to ask is the most important thing". To the extent that LLMs are better at answering questions than at formulating which ones to ask, they may be inherently limited. We will see.


But it does seem they are good (to the extent that they are good at anything) at identifying the questions first, if you ask them. It does mean you need a good-enough meta-question to start the chain of reasoning, but that is the key insight of the recent wave of "reasoning models": first ask the LLM to reformulate the problem and structure an approach, or several approaches, to addressing it, then have a second pass do just that.
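A minimal sketch of that two-pass pattern, assuming the OpenAI Python client; the model name and prompts are placeholders:

    from openai import OpenAI

    client = OpenAI()
    MODEL = "gpt-4o"  # placeholder

    def two_pass(problem: str) -> str:
        # Pass 1: only reformulate the problem and outline approaches.
        plan = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user",
                       "content": "Restate this problem precisely and outline "
                                  "two or three approaches, no code yet:\n" + problem}],
        ).choices[0].message.content
        # Pass 2: a fresh call executes the chosen plan.
        return client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user",
                       "content": "Following this plan:\n" + plan +
                                  "\n\nwrite the code for:\n" + problem}],
        ).choices[0].message.content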


Google search with fewer steps? Still a huge advancement, of course.

Wonder how much benefit we would get from a meta-language for describing these problems precisely enough for LLMs to process into code; an even-higher-level language, which perhaps we could call English?



