The 1992 two-phase algorithm by Herbert Kociemba: drop any state into the subgroup G1 = ⟨U, D, L², R², F², B²⟩ first, then solve inside G1. Each phase runs its own IDA*, sharing three coordinate prune tables (CO / EO / UD-slice). Average 22 moves, almost never above 25, tens of milliseconds per solve — to this day every scramble generator and most timers ship a descendant of this algorithm under the hood.
In 1981 Morwen Thistlethwaite published his famous four-phase algorithm: solve the cube through a chain of subgroups G0 ⊃ G1 ⊃ G2 ⊃ G3 ⊃ G4 = {e}, each phase restricting the move set further than the last. It was the first algorithm to seriously tie cube-solving to group theory, but it averaged about 52 moves — mathematically elegant, practically long.
In 1992, German maths schoolteacher Herbert Kociemba collapsed the four phases into two and published the result in his own GUI tool, Cube Explorer. His key insight: Thistlethwaite's G1 is already special enough — once you reach G1, the remaining puzzle lives in a small enough state space to be searched directly, with no need to subdivide further. Removing the middle stages drops the average from 52 to about 22.
Herbert Kociemba (b. 1953, Germany) spent his career teaching maths at a Gymnasium (German academic high school), writing cube solvers on the side. Cube Explorer has been maintained since the late 1990s; kociemba.org remains the canonical first-party reference on the two-phase algorithm. He never commercialised any of it — the code and documentation have been open for decades.
The 3×3 has about 4.3 × 10¹⁹ states; running plain IDA* on a graph that size is hopeless (prune tables don't fit, heuristics aren't tight enough). Kociemba's strategy is classical divide and conquer: use one set of constraints to drop the state into a smaller intermediate subgroup G1, then use the remaining constraints to finish inside G1. Both sub-problems have small enough state spaces to drive with precomputed prune tables.
| Algorithm | Year | Phases | Avg moves (HTM) | Worst |
|---|---|---|---|---|
| Thistlethwaite | 1981 | 4 | ~52 | ≤ 52 |
| Kociemba | 1992 | 2 | ~22 | ≤ 25 (typ.) |
| Rokicki (optimal) | 2010 | — | ~18 | 20 (God's #) |
G1 is the set of all states reachable using only the 10 moves U, D, L², R², F², B² (i.e. the subgroup generated by those moves). A state S lies in G1 iff three conditions hold simultaneously:
G1 itself has about 8! · 8! · 4! / 2 ≈ 1.95 × 10¹⁰ states (corner permutations × U/D-layer edge perms × slice perm, divided by a parity constraint) — already an order of magnitude smaller than the full 4.3 × 10¹⁹. That balance is why Kociemba picked G1 as the rendezvous point: large enough that most cubes get there cheaply, small enough that IDA* inside it still fits in cache.
You can't index every state with a single integer (too many), but you can quantify each of the three G1 conditions as a small integer — these are Kociemba's coordinates. Each coord captures one "slice" of the state and lives in a small range:
| Coord | Meaning | Range | Size |
|---|---|---|---|
twist (CO) | Corner orientation | 2187 | 3⁷ |
flip (EO) | Edge orientation | 2048 | 2¹¹ |
slice (UD) | UD-slice edge positions | 495 | C(12, 4) |
cperm | Corner permutation (phase 2) | 40320 | 8! |
eperm | U/D edge permutation (phase 2) | 40320 | 8! |
sperm | Slice edge permutation (phase 2) | 24 | 4! |
Key fact: a state S lies in G1 iff twist(S) = 0, flip(S) = 0, and slice(S) = 0. So Phase 1's goal is literally "zero these three coords simultaneously." Phase 2 assumes we're already in G1 with CO/EO/slice = 0, and only the permutations remain.
Each coord has explicit encoder/decoder functions. Take corner-orientation twist: 7 corner twists (each ∈ {0, 1, 2}) packed as a base-3 number — the 8th corner's twist is determined by the others (total twist must be ≡ 0 mod 3), so 7 digits suffice. That's how it's written in this repo's pages/timer/scramble/kociemba/coords.ts.
// twist ∈ [0, 3^7) = 2187 corner orientation
// flip ∈ [0, 2^11) = 2048 edge orientation
// slice ∈ [0, C(12,4)) = 495 UD-slice edges presence (unsorted)
// cperm ∈ [0, 8!) = 40320 corner permutation
// eperm ∈ [0, 8!) = 40320 permutation of UD-face edges
// sperm ∈ [0, 4!) = 24 permutation of slice edgesPhase 1 uses all 18 face turns (U U' U2, R R' R2, ..., B B' B2); the goal is to drive twist, flip, slice all to zero simultaneously. The state space is the Cartesian product: 2187 × 2048 × 495 ≈ 2.2 × 10⁹. We can't store that whole table, but for searching we only need lookups, so we factor it into pairwise sub-tables (see §8).
Phase-1 optimum distance distributes very tightly: about 10 moves on average, with a proven maximum of 12. In other words, any cube can reach G1 in at most 12 moves. This bound is the fundamental reason Kociemba averages around 22.
Once inside G1, only 10 moves are legal: U U' U2 D D' D2 L2 R2 F2 B2. State is described by three permutations: cperm (corners, 8!), eperm (U/D edges, 8!), sperm (slice edges, 4!). No orientation coords here — inside G1, half-turns and U/D quarter-turns preserve CO/EO automatically.
Phase 2's state space is 40320 × 40320 × 24 / 2 ≈ 1.95 × 10¹⁰ (the /2 comes from a global parity constraint), with a max distance of 18 moves. So from any state, (phase1 ≤ 12) + (phase2 ≤ 18) ≤ 30 is always achievable; in practice the optimal phase1 + phase2 sum sits around 22.
The "phase 1 ≤ 12, phase 2 ≤ 18" bounds were both verified by Kociemba via exhaustive BFS in the 1990s — both phase state spaces are small enough that the full distance table can be built in a matter of hours to days. They're constants, not estimates.
During search we shouldn't reapply moves to the whole cube state — we only need to update coords. The move tables are precomputed 2-D arrays of "input coord + move → new coord":
twistMove [2187][18] // CO after any of 18 face turns
flipMove [2048][18] // EO after any of 18 face turns
sliceMove [495][18] // UD after any of 18 face turns
cpermMove [40320][10] // corner perm after 10 G1 moves
epermMove [40320][10] // U/D edge perm after 10 G1 moves
spermMove [24][10] // slice perm after 10 G1 movesConstruction is simple: for each coord value, decode to its partial state (e.g. twist=42 → a corner-twist array), apply the move, re-encode. All move tables together are under 10 MB and fit comfortably in memory.
IDA* needs an admissible heuristic h(s) ≤ d*(s) — a lower bound on the true minimum distance from s to the goal. Kociemba uses BFS distance tables over coordinate pairs: a 2-D table whose entry at (a, b) is "the min moves required to zero both a and b simultaneously."
| Phase | Prune table | Entries | Storage (4-bit / entry) |
|---|---|---|---|
| 1 | twist × slice | 2187 × 495 ≈ 1.08 M | ~540 KB |
| 1 | flip × slice | 2048 × 495 ≈ 1.01 M | ~510 KB |
| 2 | cperm × sperm | 40320 × 24 = 967 680 | ~485 KB |
| 2 | eperm × sperm | 40320 × 24 = 967 680 | ~485 KB |
During search: h_phase1(s) = max(prune[twist, slice], prune[flip, slice]); both tables are true lower bounds (since each only uses partial information), and taking the max is still admissible. Entries fit in 4 bits each (max distance < 13), so storage is halved by packing.
IDA* (iterative-deepening A*) does a sequence of bounded DFS searches: in each pass it expands as long as g + h ≤ bound, otherwise records the minimum over-bound value as the next iteration's threshold. Space is proportional to current depth only — no priority queue, unlike plain A*. Phase-1 pseudocode:
function phase1Solve(s0):
bound = h(s0) // initial threshold
loop:
(found, next) = dfs(s0, g=0, bound, path=[])
if found: return path
if next = ∞: return UNSOLVABLE // (impossible for a valid cube)
bound = next
function dfs(s, g, bound, path):
f = g + h(s) // h = max(prune_ts, prune_fs)
if f > bound: return (false, f)
if h(s) = 0: return (true, g) // reached G1
minOver = ∞
for m in 18 moves:
if disallowedAfter(path.last, m): continue // canonical-form filter
s' = applyMove(s, m)
(found, t) = dfs(s', g+1, bound, path + [m])
if found: return (true, t)
if t < minOver: minOver = t
return (false, minOver)Key optimisation: disallowedAfter rejects same-face follow-ups (R R') and same-axis canonical-form duplicates (allow R L but ban a trailing same-axis repeat). This cuts the branching factor from 18 down to about 13, almost free.
The naive version is: find the optimal phase 1 → enter G1 → find the optimal phase 2 → concatenate. But shortest phase 1 ≠ shortest phase1 + phase2. An 11-move phase 1 followed by a 14-move phase 2 (= 25) can easily beat a 10-move optimal phase 1 followed by a 17-move phase 2 (= 27).
So Kociemba's real procedure is to enumerate phase-1 solutions in increasing length, run phase 2 on each, and keep the running best total. As soon as the phase-1 lower bound itself ≥ best total, we can safely cut the rest of the search. The interleaving is nearly free — phase 2 has a tiny state space and tight prune, costing tens of microseconds.
This "iterate phase-1 solutions" hook was later pushed to its limit by Shuang Chen's min2phase (a separate page on this site) — peeking phase-2 prune tables before phase 1 has even finished, and using the phase-2 lower bound to prune phase 1 in reverse.
| Metric | Classic Kociemba | min2phase | Optimal (Rokicki) |
|---|---|---|---|
| Avg moves (HTM) | ~22 | ~20 | ~18 |
| Max moves | 25 | ≤ 21 | 20 |
| Avg time (single-core PC) | ~20 ms | ~1 ms | seconds–minutes |
| Total table size | ~10 MB | ~80 MB (huge) | huge clusters |
This site's /scramble/solver, the random-state scramble in /timer, and the canonicalisation in /recon all run a TS implementation of the two-phase algorithm — source at packages/client/src/pages/timer/scramble/kociemba/ (cube.ts for state, coords.ts for coords, movetables.ts + prune.ts for precomputation, search.ts as the IDA* driver, kociemba.worker.ts as the Web-Worker entry). Cold-start table build ≈ 100 ms, then each solve averages 1–5 ms.
In 2010 Tomas Rokicki, Herbert Kociemba, Morley Davidson and John Dethridge used about 35 CPU-years of Google compute to prove that any 3×3 state is solvable in ≤ 20 moves (HTM) — the famous "God's number." Kociemba's own algorithm doesn't return optimal solutions; it averages 22, close but not best. Rokicki's optimal solver is essentially Kociemba's two-phase + cosets, parallelised so heavily that enumerating the symmetry classes needs a cluster.
Practical implication: for scramble generation (WCA requires ≥ 17 moves and enough randomness), timer analysis, recon canonicalisation — Kociemba's 20–22 moves is fine. Nobody trades 10× latency to save 1–2 moves. Optimal solving only matters for research contexts (God's number, specific-state shortest proofs).