The Hidden Logic of Sudoku

Denis Berthier

book cover

Online Supplements


Denis Berthier. All the material in this page and the pages it gives access to are the property of the author and may not be re-published or re-posted without his prior written permission.


The rules presented here were inspired by Steve's beautifully symmetric pattern in his attack on EasterMonster, here: t=5600&postdays=0&postorder=asc&start=75
(OK, the beauty is in the pattern on the grid, not in Steve's syntax!!).
Some parts of this introduction may become clearer after you've read further on, but I didn't want to overload the definitions with too many comments.

As Steve has not yet provided a formal general definition of his pattern, I'm not sure my definitions correspond to what he meant.
Indeed, they clearly don't, if we consider his first definition with all the intermediate cells in the central "tower" and central "floor" (such as (2&7-1&6)r56c2).
Others before me have noticed that these intermediate cells played no role in justifying Steve's eliminations.
And, when I speak of beauty, I mean it becomes really beautiful when these parasitic cells have been expurgated from the pattern.

EasterMonster (EM) is one of the most famous puzzles.
Discovered by JPF and published the 7th of April 2007 on the Sudoku Payer's Forum, it has long been considered as the hardest puzzle and it is still considered as one of the few hardest known puzzles:


In "The Hidden Logic of Sudoku" (HLS), I indicated that EasterMonster cannot be solved by any known set of resolution rules.
Several other examples have been produced that satisfy the same "pattern" (without the parasitic cells), e.g.

- the 178 variations on the EM core configuration, to be found in Coloin's first post on this page:

- the 14 from Ocean - not from the 411 list of patterns starting with a fish (normal for ocean) but from another list of 23 (thanks ronk for the reference), here:
 It may be interesting to know that their Sudoku Explainer ratings vary from 9.5 to 9.9:

- and the 3 from gsf's list (including EM) identified by Champagne (thanks Mike for the reference):
500000009020100070008000300040600000000050000000207010003000800060004020900000005 11.4 # # m_b_metcalf
500000009020100070008000300040002000000050000000706010003000800060004020900000005 # StrmCkr 11.4 #
100000002090400050006000700050903000000070000000850040700000600030009080002000001 # JPF 04/07/01 (Easter Monster) 11.4 #

I've checked that the two definitions I'll give soon produce the right eliminations in these examples (I'ven't checked the 178 variations though).

Now, several interpretations and justifications of Steve's eliminations have already been proposed, in terms of Hidden Pairs Loops, of "double" AICs, of chains of AAAALSs, of matrices,
There's nothing wrong in these interpretations, but considering the beauty of Steve's pattern, none of these explanations is really satisfying - at least for me.
What I like in a resolution rule is its being fitted to its purpose: I don't like having to use a hammer to kill a fly. I wouldn't call Jellyfish a pattern that has degenerated into a Swordfish or two X-Wings. This is also why I like families of resolution rules of increasing complexities. Even if, someday, someone finds a RuleOfEverything subsuming all the known rules, players in the real world will always need the simple specialisations.
Of course, even before this, what I like in a rule is its being clearly defined. I'm never satisfied with just a few examples.

A few words on vocabulary:
- I'm not using the word "loop" for the rules I'm going to define, because it is already too much overloaded (and as it has been used, it conveys the idea that the targets belongs to the pattern, which is never the case in my apprach and can't be the case here).
- I'm not using the word "ring", although I did initially, because the patterns I'm going to define are not so rigid as may be suggested by "ring": they may have shapes that are not at all ring-like.
- I'm using the word "belt", not thinking of the belt around our waists, but thinking of conveyor belts. Indeed, this is what the following x2y2-belts do: they convey constraints, both ways.
- I'm using the prefix "x2y2" because x2y2-belts work much like xy-chains, but on pairs of cells, each pair with 2 left-linking candidates and two right linking candidates - instead of just one of each in the case of x-chains.


From a given example, or even from a set of examples, defining a corresponding resolution rule is not obvious.
I shall use Steve's pattern to illustrate this. I shall:
- give a (non exhaustive) list of questions that must be answered in the process,
- give not one but two interpretations of Steve's pattern,
- issue three challenges to the puzzle creators.
This is at least one positive aspect of using computers in Sudoku solving: when programming a rule, one has to answer each of the following questions. (The logical conditions I'll define for my interpretations can be understood as full formal specifications for an implementation.)

Q1: transforming constants into variables
Given one or several examples, it may seem easy to replace specific numbers, rows, columns and blocks by generic variables, but the relations they should have may then be ambiguous (e.g. is the fact that two candidates belong to both the same row and the same block meaningful?)

Q2: optional and additional candidates
A second group of two questions that are not obvious is: which candidates present in the example can be made optional? Which additional candidates can be allowed as optional candidates in each of the cells?
I already gave an example in my book (and recently in the "fish" thread) of how to deal with this when I defined the non-degeneracy condition for Subset rules (Naked, Hidden and Super-Hiddden or Fish). Even with this elementary example, things were not so obvious as it may seem.
The following two interpretations of Steve's pattern I'll give below define both:
- which candidates present in the example can be made optional,
- which additional candidates can be optionally allowed.
In the EM example, all the cells concerned by Steve's pattern have exactly three candidates. With my definitions, they may have 2, 3 or 4.
Therefore, each interpretation is already a non obvious generalisation of the example.

Q3: non-degeneracy conditions
A third question is: what conditions should we put to on the global pattern to ensure that it doensn't degenerate into something else?
I'll illustrate this with the condition preventing x2y2-belts to degenerate into Naked-Quads.

Q4: generalising the overall pattern, changing its length
A fourth question is: can the overall structure be generalised? Can it be included in a whole family of patterns (e.g. as xy-chains of different lengths are)?
The following two interpretations allow longer belts, each with more different physical shapes than the previous one.
None of the examples cited in this forum has longer belts.
So this will be my first challenge to puzzle creators: can one build a puzzle with a belt of 6 crosses (see first definition of belts below)?

Q5: is the pattern made of repeated building blocks, and which
A fifth question, related to the previous one and still less obvious, is: what are the building blocks of the overall structure?
The following two interpretations rely on different building blocks (crosses vs x2y2-segments). The second is a priori more general than the first.
Nevertheless, none of the examples cited in this forum satisfies the second but not the first.
I haven't had time to look for examples of the second that wouldn't be examples of the first.
This will be my second challenge to puzzle creators: can one build a puzzle with an x2y2-belt (see second definition below) that is not a belt of crosses (see examples of possible such patterns below)?

Q6: classifying the pattern in some complexity hierarchy
A sixth question is: where should the pattern be classified in a complexity hierarchy? Notice that this is not an abstract question independent of the rule formulation. When you want to state the rule precisely, you implicitly have to make such classification decisions. For instance, when I spoke of non-degenerating into Naked-Quads (NQ), it means I observed that a first tentative definition R subsumed NQ and produces NQ eliminations even when NQ was de-activated. Therefore, after checking that R was more general than NQ, I had to give R a higher complexity than NQ. But I also had to introduce a non-degeneracy condtion in R.
R doesn't subsume the Hidden or Super-Hidden counterparts of NQ: HQ or SHQ (Jellysfish), but as HQ and SHQ are related to NQ via symmetry and super-symmetr, R must be given a higher complexity than these rules.
Of course, this is not enough to fully classify R, say in my set of rules, but this entails that it has to be above my level L4_0, which reduces the problem to classifying R in relation to the xy-to-nrczt family.

Finally, this is not a question of generalising the pattern, but it arises naturally: does the presence of this pattern imply a high ER rating?
As all the known examples do, this is my third challenge: can we find a puzzle with this pattern (in either the simpler or more complex forms) with low ER? Or can we prove that it does imply a high ER?



The building blocks are considered to be "crosses".

Definition: a cross is defined by the following data and conditions:
- a block b,
- a row r that intersects b,
- a column c that intersects b,
- the intersection of r and c is called the center of the cross (notice that it doesn't have to be the physial center of block b, it is a conceptual center (in EM it does always appear as the physical center),
- two different rows ri and rj that intersect b,
- two different columns ck and cl that intersect b,
- one of ri or rj may be equal to r,
- one of ck or cl may be equal to c,
- the four "ends of the cross" (, ri.c, rj.c) must be different,
- the contents of (i.e. the candidates in) these four ends must be as follows:  = {a (b) c (d)} = {(a) b (c) d}
ri.c = {c' (d') e (f)}
rj.c = {(c') d' (e) f}
- { } means that the content of the cell is completely listed (nothing else allowed),
- parentheses mean optional,
- a, b, b and b are all different,
- c', d', e and f are all different,
- {c', d'} = {c, d} as sets, i.e. either (c'=c and d'=d) or (d'=c and c'=d),
- a and b are called the horizontal outer candidates (they correspond to the values that will be eliminated from row r),
- e and f are called the vertical outer candidates (they correspond to the values that will be eliminated from column c),
- c, d, c', d' are called the inner candidates (they correspond to the values that will be eliminated from block b).

Notice how the compulsory and the optional candidates are defined.

A graphical representation of the block may be clearer. Let us draw rows ri, r, rj in this order and column ck, c, cl in this order.

e (f) c' (d')

a (b) c (d)

(a) b (c) d

(e) f (c') d'

Notice that, for all the crosses in EasterMonster, all the inner optional candidates are absent and all the outer optional candidates are present. EM is therefore a very special case.

Seen from the outside, inner candidates can be forgotten and the cross can be seen as a transfer mechanism for constraints, working in both directions:
- if none of the vertical candidates in the vertical branch is true, then both horizontal candidates in the horizontal branch are true,
- if none of the horizontal candidates in the horizontal branch is true, then both vertical candidates in the vertical branch are true.

The above cross is associated with the formal auxiliary predicate (defined by all the above conditions): cross(b, r, ri, rj, c, ck, cl, a, b, c, d, c', d', e, f).

Definition: row-aligned crosses: two crosses cross(b1, r1, r1a, r1b, c1, c1a, c1b, a1, b1, c1, d1, c'1, d'1, e1, f1) and cross(b2, r2, r2a, r2b, c2, c2a, c2b, a2, b2, c2, d2, c'2, d'2, e2, f2) are said row-aligned if:
- they are in different blocks (b2 <> b1),
- they are centered in the same row (r1 = r2),
- they have the same set of horizontal outer candidates: {na2, nb2} = {na1, nb1} as sets.

By permuting rows and columns, one can define column-aligned crosses. More precisely:
Definition: column-aligned crosses two crosses cross(b1, r1, r1a, r1b, c1, c1a, c1b, a1, b1, c1, d1, c'1, d'1, e1, f1) and cross(b2, r2, r2a, r2b, c2, c2a, c2b, a2, b2, c2, d2, c'2, d'2, e2, f2) are said column-aligned if:
- they are in different blocks (b2 <> b1),
- they are centered in the same column (c2 = c1),
- they have the same set of vertical outer candidates: {ne2, nf2} = {ne1, nf1} as sets.

Definition: a spine for a belt of even length 2l is defined by the following data and conditions:
- a sequence of 2l cells such that, when repeating the first at the end of the list, two consecutive cells are alternatively in the same row and in the same column (these cells define the global structure of the belts to be defined below).
In the EM example, the spine is: r2c2, r2c8, r8c8, r8c2.

Definition: a belt of crosses of even length 2l is defined by the following data and conditions:
- a spine for a belt of length 2l,
- for each of the cells in the spine, a cross centered on this cell,
- when repeating the first cross at the end of the list, consecutive crosses are alternatively row-aligned and column-aligned.

Theorem: Given a belt of crosses, one can eliminate:
- in each of its blocks: any inner candidate that is not in one of the cells at the ends of the cross,
- in each central row of consecutive row-aligned crosses: any horizontal outer candidate (they are the same for the two crosses) that is not in the horizontal ends of any of the two row-aligned crosses,
- in each central column of consecutive column-aligned crosses: any vertical outer candidate (they are the same for the two crosses) that is not in the vertical ends of any of the two column-aligned crosses.

Proof: I'll write the proof only for belts of length 4
Suppose we start with two row-aligned crosses and let the four crosses be:
(cross b1, r1, r1a, r1b, c1, c1a, c1b, a1, b1, c1, d1, c'1, d'1, e1, f1)
(cross b2, r2, r2a, r2b, c2, c2a, c2b, a2, b2, c2, d2, c'2, d'2, e2, f2)
(cross b3, r3, r3a, r3b, c3, c3a, c3b, a3, b3, c3, d3, c'3, d'3, e3, f3)
(cross b4, r4, r4a, r4b, c4, c4a, c4b, a4, b4, c4, d4, c'4, d'4, e4, f4)

(cells are noted as e.g. r1a.c1b, with a dot between the row and the column)

1) Elimination of inner candidates nc1 and nd1 in block b1:
If neither of nc1, nd1 is in {r1c1a, r1c1b}, we have successively:
{r1.c1a, r1.c1b} = {a1, b1} as sets
neither of na1 or nb1 is in {r2c2a, r2c2b} (remember that n2=n1)
{r2.c2a, r2.c2b} = {ne2, nf2} as sets
neither of ne2, nf2 is in {r3.c3a, r3.c3b} (remember that r3=r2)
{r3.c3a, r3.c3b} = {nc3, nd3} as sets
{r1.ac1, r1.bc1} = {nc1, nd1} as sets

Similarly, if neither of nc1, nd1 is in {r1.ac1, r1.bc1}, then {r1c1a, r1c1b} = {nc1, nd1} as sets

So, if none of nc1, nd1 is in any of the two ends of a branch of the cross in b1, then the two are in the ends of the other branch.
Moreover, if nc1 but not nd1 is in a branch, nc1 can't be in the other. Therefore, nd1 must be in the other.
In any case, each of nc1 and nd1 must be in one of the four ends of the cross in b1.
Whence the eliminations of nc1 and nd1 in b1.

2) Elimination of the outer horizontal candidates na1 and nb1 (= na2 and nb2) in row r1
The proof is similar:
If neither of na1, nb1 is in {r1c1a, r1c1b}, then the two must be in {r2c2a, r1c2b} (= {r1c2a, r1c2b})
Similarly, if neither of na1, nb1 is in {r2c2a, r1c2b}, then the two must be in {r1c1a, r1c1b}.

In any case, each of na1 and nb1 must be in one of the four horizontal ends of the two crosses aligned on row r1.
Whence the eliminations of na1 and nb1 in the other cells in r1.

3) Elimination of the outer vertical candidates ne1 and nf1 in column c1
Transpose the previous proof.

Notice that this proof doesn't take full advantage of the xy-transfer principle that will be seen in the next approach.
But, from the point of view of patterns, this is certainly the most beautiful and the easiest to find of the two.
It is also the most efficient in computation time.

Remarks on the shape of the spine:

With belts of length 4, the spine can only be a rectangle. Moreover, as floors and towers can be permuted, there is essentially one spine.

With belts of length 6, the spine can (must) have different shapes (but always with spine ends in different blocks), e.g.:

     I                              I
     I                              I
     I-------------I            I
                      I             I
                      I             I

I therefore repeat here my first challenge to puzzle creators: can one build a puzzle with a belt of 6 crosses with the above spine?
As of today, no one has been found.


The building blocks are now considered to be "x2y2-segments".
For the purposes of the following definitions, a cell will not be designated as usual (as r1c1) but as r1c1b1 (block coordinate added).

- a rowblock is the intersection of a row and a block,
- a colblock is the intersection of a column and a block,
- a segment is either a rowblock or a colblock.

Definition: x2y2-segment:
An x2y2-segment is a segment with two distinguished cells C1=r1c1b1 and C'1=r'1c'1b'1 called its ends (although they don't have to be the physical ends, they are the conceptual ends). These data must satisfy the following conditions on their candidates:
- the contents of C1 and C'1 are respectively {a1 (b1) c1 (d1)} and {(a1) b1 (c1) d1}, where:
- a1, b1, c1, d1 are all different,
- parentheses mean optional.
For reasons that will appear soon:
- a1 and b1 are called the left-linking candidates,
- c1 and d1 are called the right-linking candidates.

The basic property of an x2y2-segment is that if none of its left-linking candidates is true in any of its two cells, then each of its right-linking candidates must be true in one of its two cells.

An x2y2-segment is associated with the formal predicate x2y2-segment(type1, r1, c1, b1, r'1, c'1, b'1, a1, b1, c1, d1), where type1 is row or col, depending on the type of the underlying segment.

Definition: chainable x2y2-segment: Two x2y2-segments x2y2-segment(type1, r1, c1, b1, r'1, c'1, b'1, a1, b1, c1, d1) and x2y2-segment(type2, r2, c2, b2, r'2, c'2, b'2, a2, b2, c2, d2) are said chainable if:
{c1, d1} = {a2, b2} as sets and eiter of the following sets of conditions is true:
- type1=type2=row, r1=r2, b1<>b2,
- type1=type2=col, c1=c2, b<>b2,
- type1= row, type2=col, b1=b2,
- type2= row, type1=col, b1=b2,
- type1=type2=row, r1<>r2, b1=b2,
- type1=type2=col, c1<>c2, b=b2,

- if two x2y2-segments are chainable, then the reversed segments taken in reversed order are chainable,
- notice that the last two cases introduce completely new possibilities that were not available in the "belt of crosses" view.

The basic property of chainable x2y2-segments is what I call the x2y2-transfer principle (a natural generalisation of the classical xy-transfer principle): if none of the left-linking candidates of the first x2y2-segment is true in any of its two cells, then each of the right-linking candidates of the second x2y2-segment must be true in one of its two cells. This can be extended to seqences of chainable x2y2-segments.

xy-transfer principle:
      if not x1 then y1
         then not x2
            then y2
               then not x3
                  then y3 ....

x2y2-transfer principle:
      if neither of x1 x'1 then both of y1 and y'1
         then neither of x2 x'2
            then both of y2 y'2
               then neither of x3 x'3
                  then both of y3 y'3 ....

Reversed x2y2-segment: it is the x2y2-segment(type1, r'1, c'1, b'1, r1, c1, b1, b1, a1, d1, c1)

Definition: x2y2-belt
An x2y2-belt is a sequence of n different x2y2-segments (n is called the length of the belt), S1, S2, ..., Sn, all different, such that (setting Sn+1=S1):
for each k in {1, 2, ..., n}, Sk and Sk+1 are chainable.
- notice the circularity condition,
- two consecutive rowblocks can be in the same block,
- the length n doesn't have to be even,
- if one defines the reversed belt as being the sequence of the reversed segments in the reversed order, then it is an x2y2-belt.

Definition: targets of an x2y2-belt The targets of an x2y2 belt are the targets of each of its couples of consecutive (therefore chainable) x2y2-segments (still setting Sn+1=S1).
A target of two chainable x2y2-segments is any of the values c1 and d1 (= a2 and b2) in any cell C that is not an end of either segment such that:
- if type1=type2=row, r1=r2, b1<>b2: C belongs to the same row as both segments,
- if type1=type2=col, c1=c2, b<>b2: C belongs to the same column as both segments,
- if (type1=row & type2=col) or (type1=col & type2=row), b1=b2: C belongs to the same block as both segments,
- if (type1=type2=row, r1<>r2, b1=b2) or (type1=type2=col, c1<>c2, b=b2): C belongs to the same block as both segments.

x2y2-belts theorem: Given an x2y2-belt, any of its targets can be eliminated.

The proof is essentially the same as for the previous definition of belts. It is based on iterating the x2y2-tranfer principle in both directions and concluding as in the previous definition of belts and crosses.

Remarks on the shape of the spine:

Given an x2y2-belt, we can define its spine more or less as previously (details are left to the reader or to a future post).

With x2y2-belts of length 4, the spine can only be a rectangle, but it can now be flat, as in the following example (the two rows cross the same blocks):


With belts of length 6, the spine can have new different shapes, e.g. (with the leftmost three segments in the same block, 2 horizontal and one vertical in between):

     I----I--------I            I
                      I            I
                      I            I

with cells in the leftmost block distributed like this:

   -r1.c1a---------r2.c2a---------r1.c1b-                              -r1.c1a---------r2.c2a---------r1.c1b-
   -----------------r2.c2b-----------------             or               --------------------------------------
   -r3.c3a-------------------------r3.c3b-                             -r3.c3a---------r2.c2b---------r3.c3b-

Moreover we can now have belts of odd length, such as (with the leftmost two segments parallel and in the same block:

     I-------------I             I
                       I            I
                       I            I

with cells in the leftmost block distributed like this:

   -r1.c1a---------------------------r2.c2a-                    -r1.c1a-------------------------r2.c2a-
   -------------------------------------------           or        ---------------------------------------
   -r2.c2a---------r2.c2b------------------                      -r2.c2a------------------------r2.c2b-

Question: can one build a puzzle with an x2y2-belt with any of these spines?

As of today, no example has been found.

See the discussion here:

Home(The Hidden Logic of Sudoku)         Home(Denis Berthier)