This article is a good example (to me) of presenting a "maths result" as an incorrect fact.
The "5,472,730,538" is the number of different (under symmetry) completed grids -- so are we saying we could make a (hypothetical) book with 5,472,730,538 completed grids and say "Here is all your Sudoku"? Or, that we are only allowed to create exactly one Sudoku from each finished grid? Neither of those seem like reasonable claims.
A more reasonable question would be, how many partially completed sudoku grids are there which have a unique solution. We don't know the answer to that.
Even if we did do that, I've done some research on measuring difficulty of Sudoku -- many Sudoku grids are unsolvable by humans -- they are far too hard.
A different, and interesting, question would be how many Sudoku are there which can be considered human-solvable. That's an even more open problem (we'd have to exactly define human-solvable to start).
> Even if we did do that, I've done some research on measuring difficulty of Sudoku -- many Sudoku grids are unsolvable by humans -- they are far too hard.
> A different, and interesting, question would be how many Sudoku are there which can be considered human-solvable. That's an even more open problem (we'd have to exactly define human-solvable to start).
What does it mean for a sudoku to be non solvable by human? I wrote a sudoku solver once and thought humans solving harder sudokus is basically exactly what a computer would do: guessing number and trying to solve it further, backtracking on failure. Or do you mean non-human solvable as "it would take a lot of time for a human to solve it"?
Not the GP, but I think in general the “guess and check” strategy frowned-upon by the Sudoku community. Ideally you should be able to make progress by making some logical observation (a trivial example: the numbers 1-8 are present in this column, therefore the remaining cell in this column is a 9). Humans can carry this out to complex patterns (the “swordfish” involves looking at 6 cells in 3 different rows/columns and inferring that cells visible to those cannot take on particular values), but it is seems reasonable to say that patterns will reach complexity that humans can’t hold in their heads. Obviously, any such classification would be subjective, but there’s probably a “reasonable” limit to solvability. (Without guess and check)
Obviously, any such classification would be subjective, but there’s probably a “reasonable” limit to solvability. (Without guess and check)
It’s totally subjective. Since sudoku is a closed (fixed size grid, fixed number of symbols), guess-and-check is just a catch-all label for logical inference strategies which haven’t been named yet. New strategies are discovered (and named) all the time and one of the ways to do this is to guess and then, if the guess was correct, go back and try to understand why.
This bias against guess and check seems to be some deep-seated issue from our culture. I know a lot of mathematics teachers frown on guess-and-check as well and that can rub off on their students, possibly implanting the bias for life. Unfortunately, having this bias can really damage a person’s ability to learn and succeed at mathematics in university. It turns out that guess-and-check makes a triumphant return as a strategy for quickly completing proofs when one is unfamiliar with established theory. Sound familiar? That’s just like sudoku!
Working mathematicians, in contrast to poor math teachers at lower levels, have a healthy relationship to the guess-and-check strategy. And that’s good news for them, since they are often working in areas where there is insufficient established theory to make any progress.
It's all down to whether or not there is any logic to the guess and check strategy. Most of the computer algorithms work by picking a number in the possible values, following the eliminations, and repeating. If they hit an error, they backtrack to the last decision, choose a new number from the set and repeat.
If you look at some of the Cracking the Cryptic puzzles on YouTube with "computer" in the title, that covers cases where the computer resorts to a guess and check strategy but where better underlying logic is available in the solve path.
The finite nature of sudoku puzzles guarantees that there is an underlying logic to a given move, whether the solver is aware of it or not. Ultimately, all sudoku strategies are backed by an overriding master logic:
"Does this branch of the game tree contain the completed solution or not?"
Whether or not humans have come up with a name for a given strategy depends, almost entirely, on the depth of that branch.
No that’s not the case. Having a single solution does not mean that there is a logical path to it.
If I have a puzzle where I’m down to a bunch of different options and the only way to make progress is to just try fixing different numbers until the puzzle resolved that not a logical progression , it’s an exhaustive search.
If the solution is unique then there is a logical path to it, though maybe not one feasible to discover for an agent with bounded time and (especially for humans) bounded working memory.
Consider the search tree of a backtracking algorithm operating on a SAT-based representation of a puzzle. A leaf of the tree corresponds to some clause that becomes empty after eliminating literals assigned on the path to that node. If you flip the tree upside down and start from the clauses at the leaves, then two leaves with a common parent represent two clauses with complementary literals that can be resolved upon. So any branch of the algorithm's search tree can be viewed as a tree-structured resolution-refutation proof eliminating some value given a set of prior assignments on the path to the branch, and the tree as a whole can be decomposed into an ordered set of proofs for each elimination in the path to an overall solution.
Or, if you prefer, start again from the SAT-based representation and run a prime implicates algorithm like Tison's. Such an algorithm never advances a "guess" or generates a clause that is not entailed. It simply finds one logical consequence after another until it has found all of them, throwing away subsumed clauses along the way until all that's left is the unique solution.
Either way you will always produce a path to a solution and proof that the solution is unique using elementary rules of propositional logic that anyone can recognize. That, I think, is what your parent means.
Of course, the existence of a proof doesn't mean that a human can find that proof, and that's what we usually mean when we say there's no logical solution. That is, nothing new can be proven by scanning the puzzle looking for pattern matches against a fixed library of known proof templates taken to be the universe of logical techniques, and nothing can be proven from less structured heuristic search within the limits of human working memory and patience.
guess and check for a human works when the problem space is small enough. You can't guess and check yourself through a 300-move forced mate in chess or some bizarre sudoku before dying of old age.
Yeah in theory a human can brute force or guess like a computer, but that's obviously not the point of a meaningful game or problem to solve because it's mindless.
Nothing about the guess and check strategy implies that you have zero prior information and must guess blindly in an attempt to win the lottery for picking the correct move. Typically, people using guess and check have other principles, rules, or heuristics they use to narrow the search space down to a handful of options.
My original point still stands. A cultural bias against guess-and-check is just another way of saying that strategies which determine an exact sequence of moves using only strict deduction from known theory are the only “acceptable” way to play sudoku. That, to me, is highly limiting and absurd. It’s also inefficient because often you’re faced with a choice between two mutually exclusive options and it’s just plain faster and less error-prone to guess and check than it is to carry out all of the deduction in your head before making the correct move.
it's not an absurd limitation. You're not playing sudoku to beat an opponent (idk maybe there's competitive sudoku where people guess if it's most efficient), you're playing to get better at reasoning. Deduction literally is the game, not filling squares with numbers.
If you're doing programming exercises you can brute force or guess a solution but there's a cultural bias against doing that, because what you're trying to learn is how to be smart about the problem, otherwise the exercise is pointless.
Guessing never adds anything to the experience in a game that's deliberately set up to be a test of skill. It's like doing an actual puzzle. Yes you can trial and error the edges of the pieces randomly, maybe that's faster, but it defeats the purpose of puzzling. A puzzle that heavily benefits from guesswork is just a bad puzzle.
If you're doing programming exercises you can brute force or guess a solution but there's a cultural bias against doing that, because what you're trying to learn is how to be smart about the problem, otherwise the exercise is pointless.
What exactly counts as “being smart about the problem” is culturally determined. In my view, taking ten times as long to solve the problem using pure deduction (because you’re struggling to hold all of the possibilities in your head simultaneously) is being less smart about the problem than making an informed guess and then quickly and mechanically checking whether you were correct.
A puzzle that heavily benefits from guesswork is just a bad puzzle.
That puts an upper limit on the difficulty rating of puzzles which is in some sense limited by the imagination of solvers.
Unless the whole point of the puzzle is to determine the logic.
Bruteforcing (aside from taking longer is just not interesting. When people are setting puzzles a lot of time and effort is put into how the logic is expected to work on the path to solving. When people post puzzles for testing if people find a spot where bifurcation is necessary they’ll generally point it out and the setter will check to see if they made an error or whether the person testing has missed some of the logic.
Because general rule for a sudoku to not be considered is that at no point should you be in a position where you are needing to simply try different solution paths to find one that works.
I’d recommend watch a video by cracking the cryptic on YouTube to see how the solve works at the higher levels, as it’s possible I’m misunderstanding what you’re meaning by guess and check, and you’re meaning similar logic to what is used. In that case the problem is calling it “guess and check” leading us sudoku folk to think you’re meaning bifurcation.
Guess-and-check is not the same as brute force (nor exhaustive search). No one is sitting there with an empty sudoku grid, filling the whole thing in with guessed numbers, checking to see if the solution is valid, then erasing everything and starting over. Literally no one does that, not even software sudoku solvers (it takes way too long).
Guess-and-check is literally just another name for backtracking algorithms. These do not work (for humans) without an adequate suite of rules that can be use to prune the search space.
I highly doubt it. In Sudoku, it is either logically deducible that a particular value is to be placed into a given square. Or else it isn't; one can deduce that one of several values can be placed there, after all other exploration elsewhere has been exhausted.
Deducing what can be placed into what square is a simple elimination. If the square is in a row that already has 7, 7 cannot be placed there. If it's in a 3x3 square that has 5, 5 cannot be placed there. If it's in a column that has 4, 4 cannot be placed there. For every square of the Sudoku board, we can do this evaluation. We hit upon a square where all values are eliminated but one, and so that one is confirmed. Based on that confirmation, we can update the other eliminations and keep going. When this process cannot continue, there is no choice but to guess. OK, here we can have a 7 or a 4, and so it goes.
I don't see how you can pretend to have some strategies here, let alone ones with pet names and all.
People have different systems for tracking the information, to be sure.
Sure, you can always use backtracking to solve any sudoku, but people come up with new, more effective strategies all the time, eg. https://www.youtube.com/watch?v=ZLcey7qiXv8
I think there's some leeway for bifurcation, so "guess and check" when there's only two possibilities. It's not as "clean", but from what I gather it's commonplace in competitive sudoku because in some situations it's the fastest way to find the next digit.
But obviously to get to bifurcation you still need to infer enough to narrow a cell down to two choices, so that would still put a limit on solvability.
I would not call what the GP describes as ”guess and check”, but as “backtracking”.
That is just making logic observations, using the grid as a memory aid (“let’s see, if foo is a 1 and bar is a 2, then that’s a 7, and then that must be a 4, and that a 3, and then we hit a dead end, so the claim “this is a 1 and this is a 2” in’s true. Maybe foo is a 1 and bar is a 3? No, that’s clearly incorrect. Maybe foo is a 1 and bar is a 4? Etc)
I think human Sudoku solving is different, though. Humans don’t do a depth-first search, but a breadth-first one for paths that soon lead to being able to fill in some squares. Experienced solvers probably have heuristics for finding them.
I think I would rate the difficulty of a puzzle by the ¿average? depth needed in such a depth-first search to permanently place a new digit in a new cell, with correction factors for the number of paths at that depth that do that and the number of cells to fill in.
Generally if a puzzle requires bifurcation it’s considered to be at least partially broken.
The setter is expected to have made it so that determining each cell can be done without simply seeing if a choice eventually succeeds. Of course exactly how many steps to count as bifurcation is fairly nebulous.
What community, and does it include mathematicians and logicians?
Sudoku puzzles that are solvable without guessing must be generated in a way that this is assured.
There is also an understanding that Sudoku puzzles should have exactly one solution.
It's possible for Sudoku puzzles to have a unique solution, yet be impossible to solve without guessing and backtracking. The process of deduction reaches a point at which it cannot be deduced what value to fill next into what square. It is narrowed down to several possibilities, which have to be pursued in parallel; e.g with backtracking. You follow all the paths, and discard them, returning to the divergent point, whenever a contradiction is reached: violation of the Sudoku constraints.
You can't just frown upon some solution method which is inescapable.
A human with sufficient pens and paper available is a (time-constrained) universal Turing machine, so the only Sudokus that aren't solvable by humans are those that take longer than a few decades to solve. In practice, humans don't like spending that much time on solving a single puzzle. And to most humans emulating a Turing machine is also not particularly enjoyable.
Generally people solve Sudokus in almost the exact opposite way.
They do very clever reasoning (much more advanced than basically any computer sudoku solver), but very little, or no, backtracking. This is mainly because backtracking more than 1 or 2 times becomes unreasonable (do you start making hundreds of photocopies of your Sudoku?)
The reason computer solvers don't do the advanced reasoning which humans do is it turns out it isn't worth it -- by the time you find and apply the reasoning, you can have done a few million backtracks, which will usually have solved the problem anyway.
I once wrote a Sudoku solver using the following algorithm: For each empty field, find which values could possibly be written there (which values are not already written in the same row/column/quadrant). Then it applied the following two rule: (1) If there is only one possible value for a field, write it there. (2) For each row/column/quadrant, for each value: if the value can possibly be written only in one field, write it there. Repeat this in a while loop, finish if you checked the rules and couldn't write anything.
No recursion, constant memory usage.
I assumed that this would not be enough. I was like "first I will code the obvious thing, then try the algorithm on examples from a book I bought, find the examples this cannot solve, and then think about further rules which could solve those examples". But as I tried it on the puzzles in the book, it solved all of them.
Which went against my intuition, as I was pretty sure I sometimes use more complicated rules when solving Sudoku by hand. But it turned out that it is somehow really simply for me to miss an opportunity.
This is of course not a proof that each uniquely solvable Sudoku can be solved by this algorithm, but I think that in practice at least 99% of them can be.
Then I tried an opposite thing, to use this solver to create puzzles. Like, create a full board, then try removing random numbers and check if the solver can solve this; stop when no number can be removed. This actually created pretty hard puzzles; harder than any example in my book.
Search for 'minimal Sudoku' I think it's 17 given numbers and try those on your solver. I wrote a backtracking solver which tried on the fields with least possibilities first. It needed considerable time for these puzzles.
The process you're describing is commonly (and usually derisively) referred to as "bifurcation," because experience solvers will generally only resort to "guessing" when there are only two possible values for a cell.
Strategies for solving sudoku puzzles are many and varied, but if a puzzle can only be solved by "guessing a number" rather than logically working out where a number goes or which number goes in a cell, it's probably a pretty bad puzzle.
Puzzles which require guessing at any point are generally considered "not solvable". The goal of solving a sudoku puzzle is not actually to get a grid with 81 numbers in it, and trying to progress by guessing defeats the whole point.
> A more reasonable question would be, how many partially completed sudoku grids are there which have a unique solution. We don't know the answer to that.
If you interpret "unique" to mean that two puzzles that lead to the same solution count as one, the answer would be equal to the number of completed grids. Just remove one number and you get a partially completed grid, and there cannot be more.
The interpretation you mean is probably how many partially completed sudoku grids have only a single solution instead of multiple, which leads to a much more interesting question.
No, in Sudoku, the constraint of a "unique solution" is that a valid puzzle (i.e. an incomplete grid) must only have one correct way of being filled in without violating the rules. If there are two possible ways of filling the grid without violating the rules, the puzzle does not have a unique solution.
It has nothing to do with whether then solution to one puzzle (starting grid) happens to be the same as the solution to a different puzzle (starting grid).
Then the maximum number of solveable puzzles must be the maximum number of unique grids...since you remove grid entries to make the puzzle and if you remove entries with the constraint that the puzzle remains solveable with a single solution etc etc
Imagine you start with one unique solved grid. There's 81 numbers, so by just removing one entry you arguably have 81 different "puzzles" that all lead to the same unique solution. From all of these you can also remove any second number, and get another 6480 puzzles that will still all have the same unique solution.
Now imagine taking away 61 numbers from the solved puzzle to get a sudoku with 20 starting numbers. Some of those puzzles have to be discarded because they might lead to multiple possible solutions, but you will still be left with millions of possible puzzles that look distinct to a human, and require different strategies to solve, all leading to the same solved grid.
In theory yes, but layouts that require just filling in missing numbers are not puzzles. The layouts with missing numbers become puzzles when some thought must go into entry selection. The question is how many puzzles requiring logical deductions are there for each unique 81*81 grid.
At first approximation, the number of logical deductions just goes up (exponentially?) with the number of missing digits. That's why you see many sudoku books grouping puzzles by "difficulty" by just stating how many digits are given.
Of course humans can make much more interesting puzzles, where you are expected to make a certain string of logical deductions to reach the solution. But simple "machine-made" sudokus seems to sell well enough, so I see no reason to exclude them from the definition.
> No, in Sudoku, the constraint of a "unique solution" is that a valid puzzle (i.e. an incomplete grid) must only have one correct way of being filled in without violating the rules.
My answer is to the answer to the above comment:
> Then the maximum number of solveable puzzles must be the maximum number of unique grids
> A more reasonable question would be, how many partially completed sudoku grids are there which have a unique solution. We don't know the answer to that.
The number of minimal Sudoku puzzles is estimated (with 0.065% relative error) to be 3.1055e+37 (or 2.5477e+25 for non-isomorphic minimal puzzles). See Bethier https://arxiv.org/pdf/1111.4083.pdf
Early in the pandemic I cracked my knuckles on writing generator for monster Sudoku [1]; an example can be seen at [2]. I haven't solved a single one of them by hand.
I'm surprised not to see any mention of satisfiability here. There are a few papers I've seen about it, but this blog post seems to give a reasonable overview:
> Using combinatorics, we can take any one Sudoku grid and, with various simple tricks, create enough unique grids for you to do one each day for the next century. Simply by transposing and rotating the grid or interchanging columns and rows we get exponentially more unique puzzles.
You can sort the three columns of boxes in six ways, same for the rows of boxes. Within each column of boxes there are three columns that can be sorted in six ways - same again for the rows. That gives us (6 × 6) × (6 × 6 × 6) × (6 × 6 × 6) = 1 679 616 variations. You can also turn the grid 90 degrees, 1 679 616 × 2 = 3 359 232.
Then you can shuffle the order of the nine digits, since they don’t have any mathematical meaning, but are just arbitrary symbols. That’s another 9! = 362 880.
3 359 232 × 362 880 = 1 218 998 108 160
That’s going to keep you busy for more than the next century. In fact, even if you solve one per minute, you’ll be busy for 2.3 million years solving permutations of a single sudoku.
Funny that you bring this up. These puzzles would not be considered distinct within group theory. A simple application of Burnside’s lemma would allow a person to recover the true number of distinct puzzles from this figure.
A person playing the same puzzle every day that’s just been rotated or had the symbols permuted is going to quickly get bored. While the puzzle looks superficially different the graph structure and the logical steps in the solution will be identical. So doing this kind of combinatorial shenanigans is “cheating” and should be considered dubious, if not fraudulent, within the sudoku community.
I always thought it would be a fun prank to create a book of sudokus that were all permutations of the same puzzle. See how long it would take for someone to notice that they are solving the same puzzle over and over again.
No, that could definitely happen. Not sure how to estimate how many duplicates there would be. It would have to depend on the specific puzzle you started with.
A good example of calculable performance vs perceived performance. There's more sudoku puzzles in existence than anyone can memorize, therefore there's infinite sudoku puzzles from a human perspective.
From a computer's perspective there is only one sudoku puzzle. Writing a sudoku solver is a college level weekend exercise
while I was learning lisp I did one as a way to learn the language a bit, and it was absolutely delightful. I feel like this sort of problem lent itself very well to the core features of the language I was trying to learn, so I highly recommend it as an exercise. Peter Norvig also has "his" solution to it in Python on the plane, and it's just remarkable how just looking at it I could tell that there was just a clarity of thinking to that man's code that was something I could only ever hope to emulate some day.
A good starting point is the YouTube channel Cracking the Cryptic, linked below, which you may already be familiar with. He does superb sudokus which employ advanced techniques. This includes a LOT of variants.
Variants may be a turnoff if you’re not starting light. My favourite very light variant is this one:
Normal sudoku rules apply but in the final grid, every clue is wrogn (aka invalid)...
Clues are valid if: [Everything is "totally standard" Undar promises...]
-Digits in a killer cage sum up to the small clue in the top left of the cage;
-Clues outside the grid are correct X-sums. ie Clues show the sum of the first X digits in that direction where X is the 1st digit.
- Clues outside the grid are correct skyscraper clues. ie The digits in the grid represent the "heights" of skyscrapers. Clues represent the number of skyscrapers seen from that direction. Taller skyscrapers block shorter ones.
-Digits on a thermometer increase from the bulb end
-Digits separated by a black dot has a ratio of 1:2
-Digits separated by a white dot are consecutive
-Digits separated by X have a sum of 10
-Digits separated by V have a sum of 5
-All digits in a circle appear in the 4 cells touching it
-Maximum cells are larger than all 4 adjacent cells
That one is just crazy... but solvable. Apparently, there's a new one from just a week ago that is similar in construction - https://youtu.be/JTkvR5M0yFo (you'll note the 2h long video)
> Group Sums: A number in a small circle indicates the sum of the digits in the cells covered by the corresponding circle. These digits may or may not repeat. Arrow Sudoku: Digits along an arrow must sum to the number in the corresponding circle. In this Sudoku all group sum clues contain a comparison operator. The clue ≥26 for example indicates a sum of at least 26. An arrow between two cells points to the smaller number.
You'll note that I picked a particular channel with the longest videos.
The community around that channel has some particularly difficult puzzles especially if you're willing to go beyond "normal Sudoku rules apply."
Sudoku variants, and even more importantly human-designed puzzles, are where all the fun is.
I grew bored with classic machine-generated sudoku ages ago because it began to feel very mechanical. Adding variant rules like thermometers lets puzzle setters use more bits of logic that are satisfying to reason about.
Puzzle creators are still finding new theorems about how sudoku works, even within the standard rules, and creating puzzles that teach them to solvers. The Phistomephel Ring, an extension of set theory, applies to every single sudoku grid in existence.
As a middling solver, many recently-featured puzzles are too difficult for me, but it’s still fascinating to watch CTC solve them.
The "5,472,730,538" is the number of different (under symmetry) completed grids -- so are we saying we could make a (hypothetical) book with 5,472,730,538 completed grids and say "Here is all your Sudoku"? Or, that we are only allowed to create exactly one Sudoku from each finished grid? Neither of those seem like reasonable claims.
A more reasonable question would be, how many partially completed sudoku grids are there which have a unique solution. We don't know the answer to that.
Even if we did do that, I've done some research on measuring difficulty of Sudoku -- many Sudoku grids are unsolvable by humans -- they are far too hard.
A different, and interesting, question would be how many Sudoku are there which can be considered human-solvable. That's an even more open problem (we'd have to exactly define human-solvable to start).