The Hidden Logic of Sudoku


Denis Berthier



book cover


Online Supplements


The confluence property




Denis Berthier. All the material in this page and the pages it gives access to are the property of the author and may not be re-published or re-posted without his prior written permission.


Confluence is a very important (both theoretical and practical) property of a resolution theory T (i.e. a fixed set of resolution rules).
In practice, and this may be the best way for players to consider it informally, it guarantees that one can apply the resolution rules in T in any order without changing the final state that can be reached within this theory. In particular, if a puzzle can be solved within T, the confluence property of T guarantees that one can find this solution in an opportunistic way, e.g. by applying the rules in T as soon as one finds the corresponding patterns.
More generally, it allows to consider various formalised resolution strategies using the rules in T.



1) THE CONFLUENCE PROPERTY


In the first edition of my book (May 2007), I have introduced the confluence property of resolution theories.

Choose a fixed set of resolution rules, i.e. a resolution theory: T.
Starting from a given puzzle, you can apply different resolution rules from T. For each choice of a rule, you get a new state (values + candidates).
Considering all the states you can get when iterating this process, and putting an arrow between two states if there is a rule leading from the first to the second (a resolution step), you get a graph of "knowledge states". Nothing mystical here: by definition, a knowledge state is just what you can see on a grid (values + candidates), what you explicitly know about it, in short what is usually called the PM (PencilMarks), at some point in the resolution process along some resolution path.

A final state is a state in which no rule can be applied. Normally, most puzzles with a unique solution should have a single final state, if T is sufficiently powerful: the solution. But this is not always the case, depending on T.

Definition: a resolution theory T has the confluence property if, for any puzzle P, any two partial resolution paths for P can be completed within T to meet at a same knowledge state.
Equivalently:
- for any puzzle P and any two knowledge states KS1 and KS2 for P, there is a knowledge state KS3 such that KS3 can be reached independently from both KS1 and KS2 using only resolution rules from T;
- for any puzzle P, the set of knowledge states for P, ordered by the reachability relation defined by T, is a DAG (Directed Acyclic Graph).

Consequence: if a resolution theory T has the confluence property, then for any puzzle P there is a single final state and all the resolution paths lead to this state. In particular, if T solves P, one can't miss the solution by choosing to apply the "wrong" rule at any time.


This definition is meaningful in the context of my conceptual framework, based on the well defined concept of a resolution rule (whose conditions are based on and only on the current state of knowledge) and on the view that a puzzle is solved sequentially (i.e. only one rule is applied at a time).

One important consequence of the confluence property of T is that you can super-impose on T any ordering of the rules you want (giving priorities to the rules is not part of first order logic). It won't change the state you'll finally get to. In pratice, this means that, if you have missed some elimination while you were using another rule (to assert values or eliminate candidates), the same rule or other rules in T will always allow you to do this elimination later. Still more concretely, it means that you, as a player, don't have to care about the order in which you apply the rules in T. This is a huge simplification in your player's life. And it is also a huge simplification if one wants to define a rating based on the rules in T.

This property allows to define a strategic level above the logic level (the level of the resolution rules) - which is itself above the implementation level in case the rules are implemented in a computer program.
Concretely, I'll soon (I mean as soon as I've time) give examples of how one can take advantage of this property to define a lot of different resolution strategies for a given resolution theory having it.


Let me now introduce a new property, a priori stronger than confluence.
Definition: a resolution theory T is stable for confluence if for any puzzle P, for any knowledge state KS for P and for any resolution rule R from T applicable in state KS and leading to an elimination X, if any elimination Y is done before applying R and if this elimination destroys the pattern of R (which can therefore no longer be applied), then after Y is done, there is another rule in T that can be applied to eliminate X.
(In this definition, the elimination Y is not necessarily done by a rule from T).

It is obvious that: T is stable for confluence => T has the confluence property.

What's interesting here is the following (obvious):
Theorem: Let T1 and T2 be two resolution theories. If T1 and T2 are stable for confluence, the union of T1 and T2 is stable for confluence (and has therefore the confluence property).



2) SUBSET THEORIES


Let me recall the definitions of these theories.

Definition: let Subsetn be the following sets of resolution rules:
Subset0: ECP (elementary constraints propagation) (no unsolved puzzle can be solved here)
Subset1_0: Naked and Hidden Singles
Subset1: rules for elementary row-block and column-block interactions - see the subsumption  page)
Subset2: Naked, Hidden and Super-Hidden Pairs
Subset3: Naked, Hidden and Super-Hidden Triplets
Subset4: Naked, Hidden and Super-Hidden Quads

Remarks:
- Super-Hidden = Fish
- Pairs, Triplets and Quads are always considered as non-degenerate.

Definition: for each n = 0, 1_0, 1, 2, 3, 4, the resolution theory Tn(Subset) is the set of resolution rules at Subsetn levels from 0 to n, i.e. Tn(Subset) = Subset0 + Subset1_0 + ... + Subsetn.

The situation for subset theories is very simple: all the subset resolution theories Tn(Subset) have the confluence property and are stable for confluence. The proof is obvious.



3) NRCZT-BRAID THEORIES


The situation for braids is very simple: [b]all the braid resolution theories I have defined have the confluence property and are stable for confluence[/b].

3.1) pB-NRCZT theories
Definition: let pB-NRCZTn be the following sets of resolution rules:
pB-NRCZT0: ECP (elementary constraints propagation) (no unsolved puzzle can be solved here)
pB-NRCZT1_0: Naked and Hidden Singles
pB-NRCZT1: rules for nrczt-braids of length 1 (equivalent to elementary row-block and column-block interactions - see the subsumption  page)
....
pB-NRCZTn : rules for nrczt-braids of length n.

Remember that the length of a braid is the number of rc, rn, cn or bn cells on which it resides.

Definition: for each each n ≥ 0, the resolution theory Tn(pB-NRCZT) is the set of resolution rules at pB-NRCZT levels from 0 to n, i.e. Tn(pB-NRCZT) = pB-NRCZT0 + pB-NRCZT1_0 + pB-NRCZT1 + ... pB-NRCZTn.

I have shown (http://wwww.carva.org/denis.berthier/CISSE08-CSP-P2.pdf) that all these Tn(pB-NRCZT) theories have the confluence property. Indeed, the proof shows that these theories are stable for confluence.

The pB-NRCZT rating of a puzzle P is defined as the smallest n such that P can be solved within Tn(pB-NRCZT). A puzzle P has pB-NRCZT rating n if it can be solved using only braids of length no more than n but it cannot be solved using only braids of length strictly smaller than n.
The pB-NRCZT rating has therefore all the good properties one can expect of a rating:
- it is defined in a purely logical way, independent of any solver;
- it is invariant under symmetry and supersymmetry;
- it is based on an increasing sequence of theories Tn(pB-NRCZT) with the confluence property.


3.2) B-NRCZT theories

Definition: let B-NRCZTn be the following sets of resolution rules:
B-NRCZT0: ECP (elementary constraints propagation) (no unsolved puzzle can be solved here)
B-NRCZT1_0: Naked and Hidden Singles
pB-NRCZT1: rules for nrczt-braids of length 1 (equivalent to elementary row-block and column-block interactions - see the subsumption  page)
B-NRCZT2: Naked, Hidden and Super-Hidden Pairs + rules for nrczt-braids of length 2
B-NRCZT3: Naked, Hidden and Super-Hidden Triplets + rules for nrczt-braids of length 3
B-NRCZT4: Naked, Hidden and Super-Hidden Quads + rules for nrczt-braids of length 4
For n>4: B-NRCZTn: rules for nrczt-braids of length n

(Super-Hidden = Fish)

Definition: for each each n ≥ 0, the resolution theory Tn(B-NRCZT) is the set of resolution rules at B-NRCZT levels from 0 to n, i.e. T(B-NRCZTn) = B-NRCZT0 + B-NRCZT1_0 + B-NRCZT1 + ... B-NRCZTn.
All these Tn(B-NRCZT) theories have the confluence property and are stable for confluence. This is an easy consequence of the property of stability for confluence for Tn(Subset) and Tn(pB-NRCZT).

The B-NRCZT rating of a puzzle P is defined as the smallest n such that P can be solved within Tn(B-NRCZT).
The B-NRCZT rating has all the good properties one can expect of a rating:`
- it is defined in a purely logical way, independent of any solver;
- it is invariant under symmetry and supersymmetry;
- it is based on an increasing sequence of theories Tn(B-NRCZT) with the confluence property.



4) NRCZT-WHIP THEORIES


The situation for whips is more complex. To make it short: strictly speaking, whip resolution theories don't have the confluence property, but they almost have it statistically.

4.1) pNRCZT theories
Definition: let pNRCZTn be the following sets of resolution rules:
p-NRCZT0: ECP (elementary constraints propagation) (no unsolved puzzle can be solved here)
pNRCZT1_0: Naked and Hidden Singles
pNRCZT1: rules for nrczt-whips of length 1 (equivalent to elementary row-block and column-block interactions - see the subsumption  page)
....
pNRCZTn : rules for nrczt-whips of length n.

Definition: for each each n ≥ 0, the resolution theory Tn(pNRCZT) is the set of resolution rules at pNRCZT levels from 0 to n, i.e. T(pNRCZTn) = pNRCZT0 + pNRCZT1_0 + pNRCZT1 + ... pNRCZTn.

The [b]pNRCZT rating[/b] of a puzzle P is defined as the smallest n such that P can be solved within Tn(pNRCZT). A puzzle P has pNRCZTn rating n if it can be solved using only whips of length no more than n but it cannot be solved using only whips of length strictly smaller than n.
The pNRCZT rating has the following good properties one can expect of a rating:
- it is defined in a purely logical way, independent of any solver;
- it is invariant under symmetry and supersymmetry.


4.2) NRCZT theories

Definition: let NRCZTn be the following sets of resolution rules:
NRCZT0: ECP (elementary constraints propagation) (no unsolved puzzle can be solved here)
NRCZT1_0: Naked and Hidden Singles
NRCZT1: rules for nrczt-whips of length 1 (equivalent to elementary row-block and column-block interactions - see the subsumption  page)
NRCZT2: Naked, Hidden and Super-Hidden Pairs + rules for nrczt-whips of length 2
NRCZT3: Naked, Hidden and Super-Hidden Triplets + rules for nrczt-whips of length 3
NRCZT4: Naked, Hidden and Super-Hidden Quads + rules for nrczt-whips of length 4
For n>4: NRCZTn: rules for nrczt-whips of length n

(Super-Hidden = Fish)

Definition: for each each n ≥ 0, the resolution theory Tn(NRCZT) is the set of resolution rules at NRCZT levels from 0 to n, i.e. T(NRCZTn) = NRCZT0 + NRCZT1_0 + NRCZT1 + ... NRCZTn.

The [b]NRCZT rating[/b] of a puzzle P is defined as the smallest n such that P can be solved within Tn(NRCZT).
The NRCZT rating has the good following properties one can expect of a rating:
- it is defined in a purely logical way, independent of any solver;
- it is invariant under symmetry and supersymmetry.


4.3) An example of non-confluence for whips
Unfortunately, neither the Tn(pNRCZT) nor the Tn(NRCZT) theories strictly have the confluence property.
See the example in section 6.


4.4) Statistical almost confluence of the whip theories

As is very often the case in Sudoku, a property that is not strictly valid becomes almost valid when considered from a statistical point of view.

pB-NRCZT is always smaller than pNRCZT. The difference between the 2 ratings is an upper bound for the measure of non confluence of whip theories.
In practice, pB-NRCZT is smaller than pNRCZT in only fewer than 2 cases in 1,000. And in all these cases, the difference is only 1.
Moreover, the a difference between the 2 ratings that can be assigned to a problem of non confluence in fewer than 1 in 10,000 puzzles.
( This last estimation is an imprecise upper bound, because I have to check manually all the cases where a discrepancy appears between pB-NRCZT and pNRCZT, which puts limits on the number of cases I can study in detail.)
This shows how close to confluence the whip theories are.



See the ratings page for references to the ratings of the sudogen0 collection for all the ratings above.



5) ZT-BRAID(FP) THEORIES


It can easily be shown that if the FP family is stable for confluence, then the naturally defined Tn(braid(FP)) theories have the confluence property and are stable for confluence. (Here, as usual, for each n, the n in Tn defines the maximal length of the braids(FP) in Tn.)
In particular, all the Tn(braid(Subset)) have the confluence property.



6) AN EXAMPLE OF NON-CONFLUENCE FOR NRCZT-WHIPS


Take puzzle #279845 in the random Sudogen0_1M collection of 1,000,000 minimal puzzles:

981
752
364
703
001
090
250
900
008
017
043
090
300
009
007
092
060
000
400
029
000
100
008
900
029
000
500



(solve "9817.325.752..19..364.9...8.173...92.43..9.6..9...7...4..1...29.29..8......9..5..")

*****  SudoRules version 13.7wter2  *****
9817.325.752..19..364.9...8.173...92.43..9.6..9...7...4..1...29.29..8......9..5..
interaction column c7 with block b9 ==> r9c9 <> 6, r8c9 <> 6
nrc-chain[2]  c8n8{r6 r9} - r7n8{c7 c3} ==> r6c3 <> 8
interaction column c3 with block b7 ==> r9c1 <> 8
nrc-chain[3]  c8n8{r9 r6} - r4c7{n8 n4} - c6n4{r4 r9} ==> r9c8 <> 4
nrczt-whip[2]  r1n4{c5 c9} - b9n4{r8c9 .} ==> r8c5 <> 4
nrc-chain[3]  r9n2{c5 c6} - r3c6{n2 n5} - r7c6{n5 n6} ==> r9c5 <> 6
nrc-chain[3]  r9c3{n6 n8} - r7n8{c3 c7} - b9n6{r7c7 r8c7} ==> r8c1 <> 6
nrczt-whip[3]  r1n4{c9 c5} - r9n4{c5 c6} - c4n4{r8 .} ==> r6c9 <> 4
xyzt-chain[4]  r7c6{n6 n5} - r3c6{n5 n2} - r9c6{n2 n4} - r8c4{n4 n6} ==> r8c5 <> 6, r7c5 <> 6
nrc-chain[4]  b6n7{r5c7 r5c9} - b6n5{r5c9 r6c9} - c3n5{r6 r7} - r7n8{c3 c7} ==> r7c7 <> 7, r5c7 <> 8
naked-pairs-in-a-column c7{r3 r5}{n1 n7} ==> r8c7 <> 7, r8c7 <> 1, r6c7 <> 1
;;; KS1
nrc-chain[4]  r9c3{n6 n8} - b9n8{r9c8 r7c7} - r4c7{n8 n4} - c6n4{r4 r9} ==> r9c6 <> 6
interaction row r9 with block b7 ==> r7c3 <> 6
;;; KS2
nrct-chain[5]  c6n4{r4 r9} - c6n2{r9 r3} - r3n5{c6 c4} - r8c4{n5 n6} - r7c6{n6 n5} ==> r4c6 <> 5
;;;
nrc-chain[2]  r4n5{c5 c1} - b7n5{r8c1 r7c3} ==> r7c5 <> 5
naked-pairs-in-a-row r7{c2 c5}{n3 n7} ==> r7c7 <> 3
xy-chain[3]  r7c7{n6 n8} - r4c7{n8 n4} - r4c6{n4 n6} ==> r7c6 <> 6
singles to the end


Consider knowledge state KS1

+-------------------------+-----------------------------+----------------------------+
| 9          8        1         | 7           46         3        | 2          5          46        |
| 7          5        2         | 468       468       1        | 9          34        346      |
| 3          6        4         | 25         9           25      | 17        17        8          |
+-------------------------+-----------------------------+----------------------------+
| 568      1        7         | 3          4568      456    | 48        9          2          |
| 258      4        3         | 258      1258      9        | 17        6          157      |
| 2568    9        58       | 24568  124568  7        | 348      1348    135      |
+-------------------------+-----------------------------+----------------------------+
| 4          37      568     | 1          357        56      | 368     2          9           |
| 15        2        9         | 456      3567      8        | 346     1347    1347     |
| 16        37      68       | 9          2347      246    | 5        1378    1347      |
+-------------------------+-----------------------------+----------------------------+



At knowledge state KS1, we had the potential whip elimination:
nrczt-whip[4]  c6n4{r4 r9} - c6n6{r9 r7} - r8c4{n6 n5} - r3n5{c4 .} ==> r4c6 <> 5
In details
nrczt-whip[4]  c6n4{r4 r9} - c6n6{r9 r7 r4*} - r8c4{n6 n5 n4#1} - r3n5{c4 . c6*} ==> r4c6 <> 5

Unfortunately, the nrc-chain[4] rule is applied before this whip[4] rule and it deletes the left-linking candidate n6r9c6 in the second cell of this whip[4]. Only a slightly longer nrct-chain[5] can now make the r4c6 <> 5 elimination.
Notice that, if the whip[4] had been applied before the nrc-chain[4], this last chain would still have been applicable.

We have a clear case of non confluence. Notice that if we allow braids, after the nrc-chain[4] is applied and until KS2, we have the replacement braid (as provided by the general proof of confluence for braid resolution theories):
nrczt-braid[4]  c6n4{r4 r9} - c6n6{r4 r7} - r8c4{n6 n5 n4#1} - r3n5{c4 . c6*} ==> r4c6 <> 5
The previous z-candidate in cell 2 is now used as a left-linking candidate. (It is nrc-linked to the target).


Does SudoRule find this replacement braid?
;;; idem until KS2
nrczt-braid-rc[4]  r7c6{n5 n6} - r3n5{c6 c4} - c6n4{r4 r9} - r8c4{n6 .} ==> r4c6 <> 5
;;;
end unchanged

SudoRules has found another braid, but this is not important, as it has the same length.


After this example, one could wonder whether the difference between the braid and whip theories is only one of adding confluence to the whip theories. But the answer is negative. Most of the braids appear for other reasons than dealing with a deleted candidate.



7) CONFLUENCE AND RULES FOR UNIQUENESS


By definition, resolution rules are logical consequences of the Sudoku axioms (which do not include uniqueness). If one wants to consider uniqueness, we can define similarly U-resolution rules (which are consequences of the Sudoku axioms plus the axiom of uniqueness) and U-resolution theories.

Examples of a BUG that disappears after other rules have been applied are well known.

As another example of the problems raised by such rules, let me prove the following:

Theorem: let T be any resolution theory. If UR1 (Unique Rectangles type 1) is added to it, then the resulting U-resolution theory cannot have the confluence property.
Remember that, by definition, any resolution theory contains ECP (elementary constraints propagation rules) and NS (Naked-Single rule); this is the minimal reasonable assumption on a resolution theory.
Proof:
UR1 is defined by the condition pattern:

{1 2} --------- {1 2}
  :               :
{1 2} --------- {1 2 3}

(all 4 cells in exactly two rows, two columns and two blocks)

and by the conclusion that 3 can be asserted in the rightmost bottom cell.

Suppose we add UR1 to T.
Suppose this patterns appears at some point in the resolution process of a puzzle P but, for any reason, rule UR1 is not applied immediately (e.g. because you haven't seen it). If any candidate in any of the other 3 cells is eliminated by any other rule in T, then, using ECP and NS, these 3 cells are reduced to one candidate and the 4th cell is reduced to {1 3} or {2 3}.
What remains is the mere pattern of bivalue cells, for which there can exist no resolution rule able to eliminate 1 or 2.
q.e.d.

Now that I have given the general principle of such proofs of non-confluence, anyone can extend the above theorem to other cases of URs or to other Uniqueness rules.
As I am not interested by uniqueness, that's all I'll say about it.



Home(The Hidden Logic of Sudoku)         Home(Denis Berthier)