Programming Sudoku


  1. Solution Methods
    1. Exhaustive Search
    2. Constraint-Based Search
    3. Strategy-Based Approaches
  2. Generating Sudoku
  3. Links on Sudoku Programming

Solution Methods

Even large Sudoku can typically be solved in short order by a program using exhaustive search. A depth-first searcher is a fairly straightforward programming task, hence the large number of 10-line Sudoku solvers available on the Web. So if solutions are what you are after, that's by far the simplest way to go.

Constraint-Based Search

Exhaustive search has the disadvantage that incorrect choices made early in the search require a lot of search effort to eliminate. Search time can be greatly reduced by adding constraint-propagation to the search process, such as Knuth's Dancing Links algorithm. The solver still does an exhaustive search, but by dynamically updating the set of options at each cell the solver can recognize forced or incompatible choices earlier, greatly reducing the search space. Constraint-based search trades some relatively constant overhead for a much more predictable completion time. When generating Sudoku boards by exhaustive search, sudoku variants just slightly more constrained than standard Sudoku, such as those with diagonal constraints or jigsaw regions, tend to generate much longer searches. Constraint propagation fixes this.

Strategy-Based Approaches

Search is a straightforward way to find a solution and determine whether it is unique. But it provides no information on the logic or difficulty of the puzzle, nor does it help solve other puzzles more effectively. A satisfactory solution to a complex problem, in Sudoku as in chess, provides a directed solution process - for Sudoku, a sequence of steps, each showing how some cell value can be deduced to be either necessary or excluded. Those deduction steps provide strategies one can apply to solve other puzzles.

Many Sudoku strategies have been developed, varying widely in generality and complexity. Many solvers on the Web implement multiple strategies ranging from basic single value checks to fiendish procedures such as Medusa Chains and Almost-Locked Sets. The strategies have a fairly natural ordering in complexity, so that people tend to learn and to apply them roughly in that order. Automated solvers are most interesting as support tools for learning and executing increasingly sophisticated strategies. To our knowledge there is no one strategy or set of strategies that is guaranteed to solve all Sudoku, much less no simple ones, so all non-exhaustive solvers are heuristic to some degree.

Generating Sudoku

Generating Sudoku puzzles is harder, because the search space is much larger, and the search needs to satisfy multiple requirements -- puzzles need to have a solution, the solution should be unique, the puzzle should be in a certain range of difficulty, there may be symmetry or other requirements on the arrangement of clues on the board, the puzzle should not include unnecessary clues, and so forth.

Our generator uses multiple phases such that only a subset of requirements can be addressed at each phase. It currently uses 2 types of solver, constraint-propagation and heuristic. First, a solved board is generated by randomized constraint-based search. Then a depth-first search is done of possible boards based on that solution: at each step a set of 1 or more clues to remove is selected that maintain the specified symmetry requirement - no symmetry, 180-degree (the standard), 90-degree, etc. - and the resulting puzzle is checked for uniqueness using a constraint-propagation solver. If there is more than 1 solution, the last set of clue removals is backed out and a different set is tried. Third, at each terminal node of the search, a heuristic solver is called to rate the puzzle's difficulty. This process continues until a board with the desired difficulty range and clue density is found, or the limit on evaluated positions is reached.

Working backward by removing clues from a solved board means that solvability is "built in", and the search can be limited to board configurations with the desired symmetry. The stronger the symmetry requirement, the more clues have to removed at each step, and the smaller the search space. Uniqueness has to be tested at each step, but since only a yes/no answer is needed, a fast non-heuristic solver can be used. The heuristic solver is used for rating puzzles because it provides a solution using the simplest strategy available at each step; a combination of the maximum and average step difficulty provides a reasonable gauge of puzzle difficulty.

Links on Sudoku Programming

Simonis 2005 suggests various generation methods and recommends working backward from solved boards.


© 2010-2013 Roland Zito-Wolf

To send feedback and suggestions click here