Chapter 13 – Improving the Completeness of TMS (CLTMS)
The LTMS reasons quickly but does not always reason enough; this chapter adds the missing pieces so it can find everything its clauses logically entail. companion index
The big idea
By Chapter 9 we had a Logic-based TMS whose engine room is Boolean Constraint Propagation (BCP): unit resolution running over a clause store, kept fast by the counter trick described in STUDY-NOTES.md. BCP is wonderful in one respect and frustrating in another:
- It is sound. Anything it concludes is genuinely entailed by the clauses.
- It is incomplete. There are things the clauses entail that BCP simply never derives.
Incompleteness is not a bug in our implementation. It is a property of unit resolution itself. Unit resolution only ever fires when a clause has been reduced to a single undecided literal (a unit clause). If no clause is ever a unit, BCP sits still even when the answer is logically forced.
Two small examples make the gap concrete:
-
The clause set
{x ∨ ¬y, x ∨ y}entailsx. Resolve the two clauses onyand you get the unit{x}. But neither original clause is ever a unit on its own (each has two undecided literals), so BCP never resolves them and leavesxlabeledUNKNOWN. -
The four-clause set over two variables,
{p∨q, p∨¬q, ¬p∨q, ¬p∨¬q}, is unsatisfiable (try all four assignments; each falsifies one clause). Yet no clause is ever a unit, so BCP never fires and never notices the contradiction.
Chapter 13’s answer is elegant: make BCP complete by giving it more clauses to chew on – specifically the prime implicates of the clause set. A prime implicate is a clause that is logically redundant (it adds no new models) but is operationally useful, because once it is present BCP can reach conclusions in one step that previously required resolution it would never perform. Add all of them and BCP becomes refutation-complete: every entailed literal is forced and every contradiction is caught.
The price is potential blowup – a clause set can have astronomically more prime implicates than original clauses – so the recommended discipline is “delay, then complete”: reason cheaply with plain BCP, and only pay for completeness when you explicitly ask for it.
This chapter’s code lives in ../src/ltms/cltms.py, a
small optional layer that sits on top of the core LTMS.
Concepts, step by step
Consensus (resolution between two clauses)
The single inference rule behind everything here is consensus, which is just
propositional resolution. Two clauses resolve when they contain exactly one
complementary literal pair – one clause has p, the other has ¬p. The
resolvent (their consensus) is the union of the two clauses with that pair
removed.
For example, {x, ¬y} and {x, y} share the complementary pair y / ¬y. Drop
it and union the rest: the consensus is {x}.
Two rules guard the operation:
- If the clauses share no complementary pair, they do not resolve at all (there is nothing to cancel).
- If they share more than one complementary pair, the resolvent would contain some literal and its negation – a tautology, which is useless – so consensus is undefined (returns nothing).
In cltms.py a clause is represented as a
frozenset of (node, sign) pairs, where the node is a TmsNode and the sign
is Label.TRUE or Label.FALSE. The consensus function:
def consensus(c1, c2):
complementary = {n for (n, s) in c1 if (n, _opposite(s)) in c2}
if len(complementary) != 1:
return None # no pair, or more than one -> tautology
pivot = next(iter(complementary))
merged = (set(c1) | set(c2)) - {(pivot, Label.TRUE), (pivot, Label.FALSE)}
... # second tautology check on the merged literals
return frozenset(merged)
The complementary set finds every node that appears with one sign in c1 and
the opposite sign in c2. Insisting on exactly one of them rolls both guard rules
into a single length check. There is a second tautology check after merging,
because two non-pivot literals could still clash (the same node positive in one
clause, negative in the other) – that would also make the resolvent a tautology.
One subtle and important fact, worked out as Exercise 2: consensus is not
associative. Whether (c1 ∘ c2) ∘ c3 is even defined can differ from
c1 ∘ (c2 ∘ c3). That is exactly why the saturation algorithm below cannot just
fold a list left-to-right; it has to keep trying all pairs until nothing new
appears.
Subsumption (throwing away redundant clauses)
A clause a subsumes clause b when a ⊆ b – every literal of a also
appears in b. If a subsumes b then a is logically stronger (a shorter
clause is a tighter constraint), so b is redundant and can be discarded.
Example: {x} subsumes {x, y}, so once we know {x} we never need {x, y}.
In the code this is a one-liner over frozensets:
def subsumes(a, b):
return a <= b
Exercise 5 asks whether a fancier subsumes that
also checks clause identity would be faster. The answer is no: because literals
reference the same TmsNode objects, identity is already captured by the subset
test, and the real cost lives in the number of consensus operations, not the
per-pair check.
Prime implicates (the saturation)
An implicate of a clause set is any clause it entails. A prime implicate is an implicate that no shorter implicate subsumes – it is a minimal entailed clause. The set of all prime implicates is the canonical “everything BCP would ever need” set.
prime_implicates computes them by saturation under consensus, kept
subsumption-minimal (this is the brute-force version of Algorithm 13.1 in the
book):
def prime_implicates(clauses):
result = []
queue = list(dict.fromkeys(clauses)) # dedup, keep order
while queue:
c = queue.pop()
if any(subsumes(p, c) for p in result):
continue # already covered -> skip
result = [p for p in result if not subsumes(c, p)] # c covers some -> drop them
for p in result:
r = consensus(c, p)
if r is not None and not any(subsumes(q, r) for q in result):
queue.append(r) # new resolvent -> revisit later
result.append(c)
return result
Read it as a fixpoint loop:
- Pull a clause
coff the work queue. - If something we already kept subsumes
c,cis redundant – drop it. - Otherwise
cearns its place: first remove anything inresultthatcitself subsumes (we just found something stronger). - Resolve
cagainst every surviving kept clause; any new, non-redundant resolvent goes back on the queue. - Keep
c.
When the queue empties, result is the subsumption-minimal closure under
consensus: the prime implicates. This is correct but can be exponential in
the worst case, which is the whole reason for the “delay” discipline. The book’s
incremental Tison method (IPIA) is the efficiency improvement; the module note
flags it as future work and our prime_implicates does the straightforward full
saturation instead.
complete (installing the prime implicates into a live LTMS)
prime_implicates is pure set crunching. complete is the bridge that wires the
results back into a running LTMS:
def complete(ltms, informant="prime-implicate"):
existing = [frozenset(cl.literals) for cl in ltms.clauses]
have = set(existing)
added = 0
for pi in prime_implicates(existing):
if pi in have:
continue # already present, skip
true_nodes = [n for (n, s) in pi if s is Label.TRUE]
false_nodes = [n for (n, s) in pi if s is Label.FALSE]
ltms.add_clause(true_nodes, false_nodes, informant)
have.add(pi)
added += 1
return added
It reads the LTMS’s current clauses, computes the prime implicates, and installs
only the ones not already present, each tagged with an informant so its origin is
recorded. Because add_clause runs ordinary BCP as each clause goes in, two good
things happen automatically:
- Any newly unit prime implicate (such as
{x}in our first example) immediately forces its literal, so the entailed value appears. - Any prime implicate that turns out to be the empty clause (the four-clause
example resolves down to one) signals a contradiction through the normal LTMS
machinery – so
completecan raiseLTMSContradiction.
complete returns the count of clauses it added, which is a handy measure of how
much “completion work” a clause set actually needed.
Indirect proof (completeness without saturation)
There is a second, cheaper route to many of the same conclusions:
proof by contradiction, implemented in
../src/ltms/indirect.py as try_indirect_proof. To
prove fact, assume ¬fact, run BCP, and watch for a contradiction that
implicates that assumption. If one appears, retract the assumption and install a
nogood that forces fact to hold.
This recovers a specific entailment on demand without generating the entire
prime-implicate set. It is the targeted tool; complete is the exhaustive one.
Chapter 10 introduced the contradiction-handler stack that makes this
composable; here it pays off as a completeness device.
Relating the two clauses of incompleteness
It is worth being precise about what completeness we are buying. After
complete, BCP is complete in the refutation sense relative to the current
clauses: every entailed unit literal is forced, and unsatisfiability is detected
in one BCP pass. That is exactly the property the two motivating examples needed.
Exercise 10 explores the natural refinement – a
notion of BCP-prime implicates that adds only the clauses BCP actually needs,
rather than all logically prime ones – which is the more economical target the
incremental method aims at.
The examples, explained
Example A: the entailment BCP misses ({x ∨ ¬y, x ∨ y} ⊨ x)
This is the canonical “BCP is incomplete” demonstration, and it appears both as a
runnable .kb file (../examples/kb/completeness.kb)
and inside the exercise solutions.
- Assert the two clauses. Query
x: it isUNKNOWN, because neither clause is ever a unit, so BCP never resolves them. - Call
complete. The single prime implicate{x}is computed (consensus of the two clauses ony) and installed. As a unit clause it immediately forcesxtoTRUE. - Query
xagain: nowTRUE.
The .kb script captures exactly this arc:
x | ~ y
x | y
expect x unknown
complete
expect x true
The same shape recurs as proof by cases in
../exercises/ch13/kb/proof_by_cases.kb:
from a→c, b→c, a∨b it is classically obvious that c holds, but BCP leaves
c unknown until complete derives the prime implicate c (resolve a∨b with
¬a∨c to get b∨c, then with ¬b∨c to get c).
Example B: the contradiction BCP misses (the four-clause set)
The set {p∨q, p∨¬q, ¬p∨q, ¬p∨¬q} is unsatisfiable but, again, no clause is ever
a unit, so plain BCP reports no contradiction. Saturating under consensus drives
the set down to the empty clause, so complete detects the unsatisfiability and
raises LTMSContradiction. This is the contradiction-side counterpart to Example
A’s missing entailment, and it is checked in the completeness_demo solution.
Example C: indirect proof as the lightweight alternative
The same {x ∨ ¬y, x ∨ y} ⊨ x problem is solved a second way in the solutions:
build an LTRE, assert the two clauses, confirm x is not yet true under plain
BCP, then call try_indirect_proof(e, ("x",)). It assumes ¬x, derives a
contradiction, and installs the nogood that establishes x – giving the same
answer as complete but without building the whole prime-implicate set. Seeing
both side by side is the point: completion is global and exhaustive, indirect
proof is local and targeted.
Example D: prime-implicate counting (where the blowup comes from)
Two solutions exist mainly to quantify the cost that motivates “delay, then complete”:
- A transitive implication chain
x0→x1→…→x_{n-1}is stored as justn-1binary clauses, but its prime implicates are all the derived implicationsx_i→x_jfori<j, which isC(n,2) = n(n-1)/2. The input grows linearly; the prime implicates grow quadratically. - A taxonomy (“exactly one of
n”) has exactly1 + C(n,2)prime implicates: the one “at least one” clause plus then(n-1)/2pairwise “not both” clauses. Notably its CNF is already prime – consensus produces nothing new – which is a nice sanity check that the saturation does not invent spurious clauses.
These are not just trivia; they are the empirical case for why you do not run
complete on every assertion.
Exercises walk-through
The full, paraphrased problem statements and answers are in
../exercises/ch13/README.md; the runnable parts
are in ../exercises/ch13/solutions.py. The
notable ones:
-
Ex 1 (demonstrated) – A formula with far more prime implicates than CNF conjuncts. The implication chain above:
n-1clauses butC(n,2)prime implicates. The solution verifies 4 clauses → 10 prime implicates (n=5) and 7 → 28 (n=8). For an exponential gap, see Ex 9. -
Ex 2 (demonstrated) – Consensus is not associative. With
c1={a},c2={¬a,b},c3={a,¬b,c}: grouping on the left gives{a,c}, while grouping on the right is undefined (becausec2 ∘ c3shares two complementary pairs and would be a tautology). This is the formal justification for the all-pairs saturation loop. -
Ex 4 (demonstrated) – Prime implicates of a taxonomy. Exactly
1 + C(n,2), confirmed for n = 3, 5, 6 (giving 4, 11, 16). A good demonstration that some CNFs are already their own prime-implicate set. -
Ex 9 (demonstrated, with an honesty caveat) – The
keancounting problem. Under a faithful reading of the (clearly OCR-garbled) printed listing, everys_jimplies everya_i, so each¬s_jbecomes a unit prime implicate and the total ism+1; the code computeskean(3,6)=7andkean(5,10)=11, matchingm+1. The book’s statedkean(5,10) = 60,466,236must come from a different, un-garbled listing (one in which thes_jclauses do not force¬s_junits), which yields a count exponential ink. The solution reports both, and the README is candid that the exact garbled source could not be reconstructed. -
Ex 3, 6, 7, 8 (design answers) – extending the interface to justify/enable/ retract whole formulas via a reified controlling node (Ex 3); turning qualitative sign equations into clauses and reducing them with
prime_implicates(Ex 6); rebuilding an ATMS on top of the CLTMS, where labels are the prime implicates restricted to assumption literals and nogoods are the empty-conclusion ones (Ex 7); and a trie-based nogood store for faster subsumption queries (Ex 8). -
Ex 5 (answer) – an id-checking
subsumesis correct but makes essentially no difference, because clause identity is already captured by the same-object literal references and the cost lives elsewhere. -
Ex 10 (discussion) – BCP-prime implicates: the smallest set of added clauses that makes unit propagation refutation-complete relative to what BCP can already derive. Many ordinary prime implicates are not BCP-prime. This is the operational target the incremental Tison/IPIA method pursues; our
completedoes the full post-hoc saturation instead. -
Ex 11 (proof + procedure) – prime implicants as negated prime implicates. The prime implicants of a formula set Σ are exactly the negations of the prime implicates of ¬Σ. The proof is a clean duality argument (
T ⊨ Σiff¬Σ ⊨ ¬T, with the subset order reversed by negation), and the procedure reuses the existing pipeline:normalize(("not", ("and", *formulas)))thenprime_implicates, then negate each resulting clause literal by literal.
Try it in this repo
Set up once, then run the demonstrations. All commands are run from the repo root.
python -m venv .venv && . .venv/bin/activate
pip install -e ".[dev]"
Run the executable exercise solutions (Ex 1, 2, 4, 9, plus the completeness demo):
python exercises/ch13/solutions.py
You will see, among other things, that the chain has 10 prime implicates for
n=5 and 28 for n=8, that consensus is not associative (left grouping yields
{a,c}, right grouping is None), that the taxonomy counts hit 1 + C(n,2),
and that the completeness demo reports x going from UNKNOWN to TRUE after
complete, plus complete_4clause_contradiction: True.
Run the .kb world-model files, which show the “expect … / complete /
expect …” arc directly:
python examples/run_kb.py exercises/ch13/kb/proof_by_cases.kb
python examples/run_kb.py examples/kb/completeness.kb
Both print the queried literal as unknown before the complete directive and
true after it – the BCP-incompleteness gap opening and then closing.
Try it live in a Python session:
from ltms import LTMS, Label, add_formula, complete, consensus, prime_implicates
m = LTMS()
x = m.create_node("x", assumption=True)
y = m.create_node("y", assumption=True)
add_formula(m, ("or", x, ("not", y)), "c1") # x v ~y
add_formula(m, ("or", x, y), "c2") # x v y
print(x.label.name) # UNKNOWN -- BCP cannot get there
print(complete(m)) # 1 -- one prime implicate added
print(x.label.name) # TRUE -- now forced
And the indirect-proof alternative, which proves x without full completion:
from ltms import LTRE, try_indirect_proof
e = LTRE()
e.assert_(("or", ("x",), ("not", ("y",))))
e.assert_(("or", ("x",), ("y",)))
print(e.is_true(("x",))) # False -- plain BCP
print(try_indirect_proof(e, ("x",))) # True -- proved by refutation
print(e.is_true(("x",))) # True
The whole test suite (which includes the chapter-13 behavior) stays green with:
pytest
Takeaways
- BCP is sound but incomplete. Unit resolution only fires on unit clauses, so
some entailed literals stay
UNKNOWNand some contradictions go unnoticed. This is inherent, not an implementation defect. - Prime implicates close the gap. They are the minimal entailed clauses; installing them lets BCP reach every entailment in one step. The engine is consensus (resolution) saturated to a fixpoint and kept subsumption- minimal.
- Consensus is not associative, which is precisely why saturation has to try all pairs repeatedly instead of folding a list.
- Completeness can be expensive. Prime implicates can be quadratically or even
exponentially more numerous than the input clauses, so the discipline is
“delay, then
complete” rather than completing eagerly. - Two tools, two scopes.
completeis the global, exhaustive route; indirect proof (try_indirect_proof) is the cheap, targeted route for one entailment at a time. - In this repo the whole layer is a few dozen readable lines in
../src/ltms/cltms.py:consensus,subsumes,prime_implicates, andcomplete, sitting cleanly on top of the unchanged LTMS core.