Chapter 13 – Improving the Completeness of TMS (CLTMS)

The LTMS reasons quickly but does not always reason enough; this chapter adds the missing pieces so it can find everything its clauses logically entail. companion index

The big idea

By Chapter 9 we had a Logic-based TMS whose engine room is Boolean Constraint Propagation (BCP): unit resolution running over a clause store, kept fast by the counter trick described in STUDY-NOTES.md. BCP is wonderful in one respect and frustrating in another:

It is sound. Anything it concludes is genuinely entailed by the clauses.
It is incomplete. There are things the clauses entail that BCP simply never derives.

Incompleteness is not a bug in our implementation. It is a property of unit resolution itself. Unit resolution only ever fires when a clause has been reduced to a single undecided literal (a unit clause). If no clause is ever a unit, BCP sits still even when the answer is logically forced.

Two small examples make the gap concrete:

The clause set {x ∨ ¬y, x ∨ y} entails x. Resolve the two clauses on y and you get the unit {x}. But neither original clause is ever a unit on its own (each has two undecided literals), so BCP never resolves them and leaves x labeled UNKNOWN.
The four-clause set over two variables, {p∨q, p∨¬q, ¬p∨q, ¬p∨¬q}, is unsatisfiable (try all four assignments; each falsifies one clause). Yet no clause is ever a unit, so BCP never fires and never notices the contradiction.

Chapter 13’s answer is elegant: make BCP complete by giving it more clauses to chew on – specifically the prime implicates of the clause set. A prime implicate is a clause that is logically redundant (it adds no new models) but is operationally useful, because once it is present BCP can reach conclusions in one step that previously required resolution it would never perform. Add all of them and BCP becomes refutation-complete: every entailed literal is forced and every contradiction is caught.

The price is potential blowup – a clause set can have astronomically more prime implicates than original clauses – so the recommended discipline is “delay, then complete”: reason cheaply with plain BCP, and only pay for completeness when you explicitly ask for it.

This chapter’s code lives in ../src/ltms/cltms.py, a small optional layer that sits on top of the core LTMS.

Concepts, step by step

Consensus (resolution between two clauses)

The single inference rule behind everything here is consensus, which is just propositional resolution. Two clauses resolve when they contain exactly one complementary literal pair – one clause has p, the other has ¬p. The resolvent (their consensus) is the union of the two clauses with that pair removed.

For example, {x, ¬y} and {x, y} share the complementary pair y / ¬y. Drop it and union the rest: the consensus is {x}.

Two rules guard the operation:

If the clauses share no complementary pair, they do not resolve at all (there is nothing to cancel).
If they share more than one complementary pair, the resolvent would contain some literal and its negation – a tautology, which is useless – so consensus is undefined (returns nothing).

In cltms.py a clause is represented as a frozenset of (node, sign) pairs, where the node is a TmsNode and the sign is Label.TRUE or Label.FALSE. The consensus function:

def consensus(c1, c2):
    complementary = {n for (n, s) in c1 if (n, _opposite(s)) in c2}
    if len(complementary) != 1:
        return None            # no pair, or more than one -> tautology
    pivot = next(iter(complementary))
    merged = (set(c1) | set(c2)) - {(pivot, Label.TRUE), (pivot, Label.FALSE)}
    ...                        # second tautology check on the merged literals
    return frozenset(merged)

The complementary set finds every node that appears with one sign in c1 and the opposite sign in c2. Insisting on exactly one of them rolls both guard rules into a single length check. There is a second tautology check after merging, because two non-pivot literals could still clash (the same node positive in one clause, negative in the other) – that would also make the resolvent a tautology.

One subtle and important fact, worked out as Exercise 2: consensus is not associative. Whether (c1 ∘ c2) ∘ c3 is even defined can differ from c1 ∘ (c2 ∘ c3). That is exactly why the saturation algorithm below cannot just fold a list left-to-right; it has to keep trying all pairs until nothing new appears.

Subsumption (throwing away redundant clauses)

A clause a subsumes clause b when a ⊆ b – every literal of a also appears in b. If a subsumes b then a is logically stronger (a shorter clause is a tighter constraint), so b is redundant and can be discarded. Example: {x} subsumes {x, y}, so once we know {x} we never need {x, y}.

In the code this is a one-liner over frozensets:

def subsumes(a, b):
    return a <= b

Exercise 5 asks whether a fancier subsumes that also checks clause identity would be faster. The answer is no: because literals reference the same TmsNode objects, identity is already captured by the subset test, and the real cost lives in the number of consensus operations, not the per-pair check.

Prime implicates (the saturation)

An implicate of a clause set is any clause it entails. A prime implicate is an implicate that no shorter implicate subsumes – it is a minimal entailed clause. The set of all prime implicates is the canonical “everything BCP would ever need” set.

prime_implicates computes them by saturation under consensus, kept subsumption-minimal (this is the brute-force version of Algorithm 13.1 in the book):

def prime_implicates(clauses):
    result = []
    queue = list(dict.fromkeys(clauses))      # dedup, keep order
    while queue:
        c = queue.pop()
        if any(subsumes(p, c) for p in result):
            continue                          # already covered -> skip
        result = [p for p in result if not subsumes(c, p)]   # c covers some -> drop them
        for p in result:
            r = consensus(c, p)
            if r is not None and not any(subsumes(q, r) for q in result):
                queue.append(r)               # new resolvent -> revisit later
        result.append(c)
    return result

Read it as a fixpoint loop:

Pull a clause c off the work queue.
If something we already kept subsumes c, c is redundant – drop it.
Otherwise c earns its place: first remove anything in result that c itself subsumes (we just found something stronger).
Resolve c against every surviving kept clause; any new, non-redundant resolvent goes back on the queue.
Keep c.

When the queue empties, result is the subsumption-minimal closure under consensus: the prime implicates. This is correct but can be exponential in the worst case, which is the whole reason for the “delay” discipline. The book’s incremental Tison method (IPIA) is the efficiency improvement; the module note flags it as future work and our prime_implicates does the straightforward full saturation instead.

`complete` (installing the prime implicates into a live LTMS)

prime_implicates is pure set crunching. complete is the bridge that wires the results back into a running LTMS:

def complete(ltms, informant="prime-implicate"):
    existing = [frozenset(cl.literals) for cl in ltms.clauses]
    have = set(existing)
    added = 0
    for pi in prime_implicates(existing):
        if pi in have:
            continue                          # already present, skip
        true_nodes  = [n for (n, s) in pi if s is Label.TRUE]
        false_nodes = [n for (n, s) in pi if s is Label.FALSE]
        ltms.add_clause(true_nodes, false_nodes, informant)
        have.add(pi)
        added += 1
    return added

It reads the LTMS’s current clauses, computes the prime implicates, and installs only the ones not already present, each tagged with an informant so its origin is recorded. Because add_clause runs ordinary BCP as each clause goes in, two good things happen automatically:

Any newly unit prime implicate (such as {x} in our first example) immediately forces its literal, so the entailed value appears.
Any prime implicate that turns out to be the empty clause (the four-clause example resolves down to one) signals a contradiction through the normal LTMS machinery – so complete can raise LTMSContradiction.

complete returns the count of clauses it added, which is a handy measure of how much “completion work” a clause set actually needed.

Indirect proof (completeness without saturation)

There is a second, cheaper route to many of the same conclusions: proof by contradiction, implemented in ../src/ltms/indirect.py as try_indirect_proof. To prove fact, assume ¬fact, run BCP, and watch for a contradiction that implicates that assumption. If one appears, retract the assumption and install a nogood that forces fact to hold.

This recovers a specific entailment on demand without generating the entire prime-implicate set. It is the targeted tool; complete is the exhaustive one. Chapter 10 introduced the contradiction-handler stack that makes this composable; here it pays off as a completeness device.

Relating the two clauses of incompleteness

It is worth being precise about what completeness we are buying. After complete, BCP is complete in the refutation sense relative to the current clauses: every entailed unit literal is forced, and unsatisfiability is detected in one BCP pass. That is exactly the property the two motivating examples needed. Exercise 10 explores the natural refinement – a notion of BCP-prime implicates that adds only the clauses BCP actually needs, rather than all logically prime ones – which is the more economical target the incremental method aims at.

The examples, explained

Example A: the entailment BCP misses (`{x ∨ ¬y, x ∨ y} ⊨ x`)

This is the canonical “BCP is incomplete” demonstration, and it appears both as a runnable .kb file (../examples/kb/completeness.kb) and inside the exercise solutions.

Assert the two clauses. Query x: it is UNKNOWN, because neither clause is ever a unit, so BCP never resolves them.
Call complete. The single prime implicate {x} is computed (consensus of the two clauses on y) and installed. As a unit clause it immediately forces x to TRUE.
Query x again: now TRUE.

The .kb script captures exactly this arc:

x | ~ y
x | y
expect x unknown
complete
expect x true

The same shape recurs as proof by cases in ../exercises/ch13/kb/proof_by_cases.kb: from a→c, b→c, a∨b it is classically obvious that c holds, but BCP leaves c unknown until complete derives the prime implicate c (resolve a∨b with ¬a∨c to get b∨c, then with ¬b∨c to get c).

Example B: the contradiction BCP misses (the four-clause set)

The set {p∨q, p∨¬q, ¬p∨q, ¬p∨¬q} is unsatisfiable but, again, no clause is ever a unit, so plain BCP reports no contradiction. Saturating under consensus drives the set down to the empty clause, so complete detects the unsatisfiability and raises LTMSContradiction. This is the contradiction-side counterpart to Example A’s missing entailment, and it is checked in the completeness_demo solution.

Example C: indirect proof as the lightweight alternative

The same {x ∨ ¬y, x ∨ y} ⊨ x problem is solved a second way in the solutions: build an LTRE, assert the two clauses, confirm x is not yet true under plain BCP, then call try_indirect_proof(e, ("x",)). It assumes ¬x, derives a contradiction, and installs the nogood that establishes x – giving the same answer as complete but without building the whole prime-implicate set. Seeing both side by side is the point: completion is global and exhaustive, indirect proof is local and targeted.

Example D: prime-implicate counting (where the blowup comes from)

Two solutions exist mainly to quantify the cost that motivates “delay, then complete”:

A transitive implication chain x0→x1→…→x_{n-1} is stored as just n-1 binary clauses, but its prime implicates are all the derived implications x_i→x_j for i<j, which is C(n,2) = n(n-1)/2. The input grows linearly; the prime implicates grow quadratically.
A taxonomy (“exactly one of n”) has exactly 1 + C(n,2) prime implicates: the one “at least one” clause plus the n(n-1)/2 pairwise “not both” clauses. Notably its CNF is already prime – consensus produces nothing new – which is a nice sanity check that the saturation does not invent spurious clauses.

These are not just trivia; they are the empirical case for why you do not run complete on every assertion.

Exercises walk-through

The full, paraphrased problem statements and answers are in ../exercises/ch13/README.md; the runnable parts are in ../exercises/ch13/solutions.py. The notable ones:

Ex 1 (demonstrated) – A formula with far more prime implicates than CNF conjuncts. The implication chain above: n-1 clauses but C(n,2) prime implicates. The solution verifies 4 clauses → 10 prime implicates (n=5) and 7 → 28 (n=8). For an exponential gap, see Ex 9.
Ex 2 (demonstrated) – Consensus is not associative. With c1={a}, c2={¬a,b}, c3={a,¬b,c}: grouping on the left gives {a,c}, while grouping on the right is undefined (because c2 ∘ c3 shares two complementary pairs and would be a tautology). This is the formal justification for the all-pairs saturation loop.
Ex 4 (demonstrated) – Prime implicates of a taxonomy. Exactly 1 + C(n,2), confirmed for n = 3, 5, 6 (giving 4, 11, 16). A good demonstration that some CNFs are already their own prime-implicate set.
Ex 9 (demonstrated, with an honesty caveat) – The kean counting problem. Under a faithful reading of the (clearly OCR-garbled) printed listing, every s_j implies every a_i, so each ¬s_j becomes a unit prime implicate and the total is m+1; the code computes kean(3,6)=7 and kean(5,10)=11, matching m+1. The book’s stated kean(5,10) = 60,466,236 must come from a different, un-garbled listing (one in which the s_j clauses do not force ¬s_j units), which yields a count exponential in k. The solution reports both, and the README is candid that the exact garbled source could not be reconstructed.
Ex 3, 6, 7, 8 (design answers) – extending the interface to justify/enable/ retract whole formulas via a reified controlling node (Ex 3); turning qualitative sign equations into clauses and reducing them with prime_implicates (Ex 6); rebuilding an ATMS on top of the CLTMS, where labels are the prime implicates restricted to assumption literals and nogoods are the empty-conclusion ones (Ex 7); and a trie-based nogood store for faster subsumption queries (Ex 8).
Ex 5 (answer) – an id-checking subsumes is correct but makes essentially no difference, because clause identity is already captured by the same-object literal references and the cost lives elsewhere.
Ex 10 (discussion) – BCP-prime implicates: the smallest set of added clauses that makes unit propagation refutation-complete relative to what BCP can already derive. Many ordinary prime implicates are not BCP-prime. This is the operational target the incremental Tison/IPIA method pursues; our complete does the full post-hoc saturation instead.
Ex 11 (proof + procedure) – prime implicants as negated prime implicates. The prime implicants of a formula set Σ are exactly the negations of the prime implicates of ¬Σ. The proof is a clean duality argument (T ⊨ Σ iff ¬Σ ⊨ ¬T, with the subset order reversed by negation), and the procedure reuses the existing pipeline: normalize(("not", ("and", *formulas))) then prime_implicates, then negate each resulting clause literal by literal.

Try it in this repo

Set up once, then run the demonstrations. All commands are run from the repo root.

python -m venv .venv && . .venv/bin/activate
pip install -e ".[dev]"

Run the executable exercise solutions (Ex 1, 2, 4, 9, plus the completeness demo):

python exercises/ch13/solutions.py

You will see, among other things, that the chain has 10 prime implicates for n=5 and 28 for n=8, that consensus is not associative (left grouping yields {a,c}, right grouping is None), that the taxonomy counts hit 1 + C(n,2), and that the completeness demo reports x going from UNKNOWN to TRUE after complete, plus complete_4clause_contradiction: True.

Run the .kb world-model files, which show the “expect … / complete / expect …” arc directly:

python examples/run_kb.py exercises/ch13/kb/proof_by_cases.kb
python examples/run_kb.py examples/kb/completeness.kb

Both print the queried literal as unknown before the complete directive and true after it – the BCP-incompleteness gap opening and then closing.

Try it live in a Python session:

from ltms import LTMS, Label, add_formula, complete, consensus, prime_implicates

m = LTMS()
x = m.create_node("x", assumption=True)
y = m.create_node("y", assumption=True)
add_formula(m, ("or", x, ("not", y)), "c1")   # x v ~y
add_formula(m, ("or", x, y), "c2")            # x v  y
print(x.label.name)        # UNKNOWN  -- BCP cannot get there
print(complete(m))         # 1        -- one prime implicate added
print(x.label.name)        # TRUE     -- now forced

And the indirect-proof alternative, which proves x without full completion:

from ltms import LTRE, try_indirect_proof
e = LTRE()
e.assert_(("or", ("x",), ("not", ("y",))))
e.assert_(("or", ("x",), ("y",)))
print(e.is_true(("x",)))                 # False -- plain BCP
print(try_indirect_proof(e, ("x",)))     # True  -- proved by refutation
print(e.is_true(("x",)))                 # True

The whole test suite (which includes the chapter-13 behavior) stays green with:

pytest

Takeaways

BCP is sound but incomplete. Unit resolution only fires on unit clauses, so some entailed literals stay UNKNOWN and some contradictions go unnoticed. This is inherent, not an implementation defect.
Prime implicates close the gap. They are the minimal entailed clauses; installing them lets BCP reach every entailment in one step. The engine is consensus (resolution) saturated to a fixpoint and kept subsumption- minimal.
Consensus is not associative, which is precisely why saturation has to try all pairs repeatedly instead of folding a list.
Completeness can be expensive. Prime implicates can be quadratically or even exponentially more numerous than the input clauses, so the discipline is “delay, then complete” rather than completing eagerly.
Two tools, two scopes. complete is the global, exhaustive route; indirect proof (try_indirect_proof) is the cheap, targeted route for one entailment at a time.
In this repo the whole layer is a few dozen readable lines in ../src/ltms/cltms.py: consensus, subsumes, prime_implicates, and complete, sitting cleanly on top of the unchanged LTMS core.