Yup, it flunked that one. I also have a question that LLMs always got wrong unti...

novaRom · 2025-04-29T06:21:32 1745907692

4o with thinking:

By systematic (BFS) search of the entire 32-state space under these rules, one finds no path from to that stays always safe. Thus the puzzle has no solution—there is no way for the man to ferry all four items across without at least one of them being eaten.

mavamaarten · 2025-04-29T07:33:56 1745912036

You go with the cabbage, goat, wolf and lion all together!

SamBam · 2025-04-29T15:46:24 1745941584

O3 gave me basically that solution. "Below is the shortest safe schedule that really works ‒ but it assumes the boat can hold the man plus two passengers (three beings total). If your version of the puzzle only lets him move one passenger at a time, the puzzle has no solution: at the very first trip he would always leave at least one forbidden pair alone."

cyprx · 2025-04-29T03:39:35 1745897975

i tried grok 3 with Think and it was right also with pretty good thinking

SamBam · 2025-04-29T17:03:19 1745946199

I don't have access to Think, but I tried Grok 3 regular, and it was hilarious, one of the longest answers I've ever seen.

Just giving the headings, without any of the long text between each one where it realizes it doesn't work, I get:

    Solution
        [... paragraphs of text ommitted each time]
    Issue and Revision
    Revised Solution
    Final Solution
    Correct Sequence
    Final Working Solution
    Corrected Final Solution
    Final Correct Solution
    Successful Solution
    Final answer
    Correct Final Sequence
    Final Correct Solution
    Correct Solution
    Final Working Solution
    Correct Solution
    Final Answer
    Final Answer

Each time it's so confident that it's worked out the issue, and now, finally, it has the correct, final, working solution. Then it blows it again.

I'm surprised I didn't start seeing heading titles such as "Working solution-FINAL (3) revised updated ACTUAL-FINAL (2)"