Chapter 5 – Extending Pattern-Directed Inference Systems (FTRE)

How a tiny forward chainer grows into FTRE: a focused, resource-bounded, context-aware reasoner built by compiling rules into Lisp procedures and searching over a stack of assumptions. Back to the companion index.

Chapter 5 is the “now make it real” chapter. Chapters 3 and 4 gave us the bare-bones TRE (Tiny Rule Engine): a database, pattern-directed rules, and forward chaining where “believed” just means “in the database.” Chapter 5 takes that skeleton and bolts on the three things you actually need to write a working problem solver: speed (compile the matcher instead of interpreting it), assumptions (reason hypothetically and back out of dead ends), and resource bounds (do not search forever). The result is named FTRE.

A heads-up before we start: FTRE itself is not implemented in this package. Porting it was deliberately ruled out of scope (see ../BRIEFING.md). So this chapter is taught conceptually, with the exercise answers in ../exercises/ch05/ as the worked material, and the “Try it” section points you at the closest machinery we did build (the LTMS / LTRE), which shares the assumption-and-contradiction ideas even though its plumbing is different. Where the book’s FTRE and our LTRE genuinely differ, this chapter says so.

The big idea

A pattern-directed inference system (PDIS) is a loop: match rules against facts, fire the ones that match, let them assert new facts, repeat until nothing new happens. The TRE of the previous chapters does exactly that and nothing more. It is correct but naive in three ways, and FTRE fixes each:

Matching is interpreted and slow. Every time a fact arrives, the TRE walks every candidate rule’s pattern with a generic unify. FTRE instead compiles each rule once, at definition time, into ordinary Lisp functions: a match procedure that tests shape and binds variables with open-coded car/cdr/eq tests, and a body procedure that runs the rule’s action. You pay a little compile time once and win at every match afterward.
It can only chain forward over things it already believes. Real problem solving needs hypothetical reasoning: “suppose P were true; what follows? Oops, contradiction, so P is false.” FTRE adds assumptions and logical contexts. A context is the set of assumptions currently in force; contexts are organized as a stack, which is exactly what makes the search depth-first. A special kind of rule, the a-rule (assumption rule), introduces an assumption, explores the resulting subcontext, and then pops back out.
It will happily search forever. A hypothetical search tree can be infinite. FTRE adds resource bounds – limits on how deep the assumption stack may go, and (a natural extension, see Exercise 12) on how many assumptions may be made in total – so the engine degrades gracefully instead of looping.

The “F” in FTRE is for “focused,” and that focus comes precisely from the context stack plus the bounds: at any moment the engine is concentrating on one branch of the hypothetical search tree, with a hard ceiling on how far it will chase it.

If you remember one sentence from this chapter, make it this: FTRE = compiled TRE + a stack of assumption contexts + resource bounds. Everything below is detail on those three moves.

Concepts, step by step

1. Car-indexing (carried over from the TRE)

Both the TRE and FTRE key every fact and every rule trigger by its leftmost constant symbol (the car of the pattern, hence “car-indexing”). The database is a hash table from that symbol to a dbclass bucket holding all the facts and rules whose pattern starts with that symbol. When a new fact arrives, the engine only ever tries to match it against rules in the same bucket. This is the only indexing the system has, and it is cheap and effective because most patterns are distinguished by their predicate. Our package keeps the exact same scheme: see get_dbclass in ../src/ltms/tre/engine.py and ../src/ltms/ltre.py.

Hold on to this fact, because Exercise 4 asks what happens if you give each dbclass its own rule and fact stacks instead of scanning global ones, and Exercise 1 asks whether the shape-checking tests inside the matcher are redundant given this indexing.

2. Compiling a rule into a match procedure and a body procedure

This is the technical heart of the chapter. Instead of storing a rule as data and interpreting it, FTRE runs a small code generator at rule-definition time:

build-rule is the top-level driver. For each rule it produces a named match procedure and a named body procedure and installs them.
generate-match-body walks the trigger pattern and emits the code that tests the incoming datum’s shape and extracts variable bindings. For a pattern like (parent ?x ?y) it emits roughly: “is this a cons? is its car the symbol parent? bind ?x to the cadr, bind ?y to the caddr.” These are the (consp p) and car-equality tests Exercise 1 scrutinizes.
generate-unify-tests emits the per-constituent comparisons: eq for symbols, eql for integers, recursion for nested conses. Exercise 11 asks you to extend this to floats and user-defined types, which is genuinely tricky because equality on floats and on arbitrary objects is not the clean, hashable, immutable thing the indexer assumes.

A subtle but important variable in the generator is *bound-vars*: the set of pattern variables that are already bound at the point the generator is emitting code. With multiple triggers, a variable bound by trigger 1 can be reused by trigger 2 – trigger 2 does not re-extract it, it just tests that its position equals the already-bound value. That is what turns a multi-trigger rule into a relational join. Exercise 10 notices that *bound-vars* is threaded into every match procedure even when that procedure does not mention all of them, and asks whether trimming the parameter lists is safe and worthwhile.

3. Multi-trigger rules are joins; a-rules must have exactly one trigger

A rule with several triggers fires once per consistent combination of matches – a join over the database. (The base TRE of Chapters 3-4 had no multi-trigger syntax at all: you built a join by nesting rules, one body installing the next; FTRE adds direct multi-trigger support, and our LTRE follows FTRE here.) For ordinary deductive rules that is fine and useful. But a-rules are restricted to a single trigger, and do-rule raises an error otherwise. Exercise 2 is about why. The short version: an a-rule’s job is “one match -> one assumption -> one subcontext to explore and then pop.” A join would make it ambiguous when and under which bundled binding the assumption should be made, would spawn a combinatorial number of subcontexts, and would wreck the clean push/pop discipline of the context stack. If you really need a conjunction to drive an assumption, derive an intermediate fact with a normal rule first, then trigger the a-rule on that single fact.

4. Logical contexts and the assumption stack

A context is the set of assumptions currently believed. FTRE keeps *context* as the current context and organizes contexts as a stack. Making an assumption pushes a new context (old assumptions plus the new one); finishing with it pops back. Because it is a stack, exploration is inherently depth-first: you dive down one chain of assumptions, hit a wall (a contradiction or a resource limit), pop, and try the next alternative.

Runtime primitives you will see referenced across the exercises:

rassert! – assert a fact (the runtime version, with bound values substituted in).
fetch – look up assertions matching a pattern.
assume / a-rule firing – push an assumption and its context.
contradiction – assert that the current context is impossible, which pops the bad assumption and forces the choice point to try its next alternative.
seek-in-context – post a goal to be watched within a specific context, and the subject of Exercise 16’s (= *context* ,*context*) guard.
show – post a proposition as something to prove.
rlet – bind FTRE variables for use in rule bodies (Exercise 8 flags that it does not verify the names are real FTRE variables).

5. a-rules and indirect proof

The canonical use of an a-rule is indirect proof (proof by contradiction): to prove P, assume (not P), run the rules, and if that assumption leads to contradiction, you have proven P. This is the same idea our package implements (in a different substrate) as try_indirect_proof in ../src/ltms/indirect.py – assume the negation, run rules, and if the negation participates in a contradiction, retract it and record a nogood that forces the original fact. Reading that file is the most concrete way to see the FTRE indirect-proof pattern in working Python, even though the bookkeeping (a TMS with nogoods, rather than a raw context stack) is not identical.

6. Resource bounds

FTRE caps the assumption stack depth so the depth-first search cannot run away. Exercise 12 extends this with a total cap on the number of assumptions ever made, and Exercise 13 turns the depth cap into iterative deepening: repeatedly run with a depth limit of 0, 1, 2, …, resetting between rounds, which keeps memory linear in depth while still finding the shallowest proof first. The key insight for both is that a bound should fail a branch gracefully (treat it as a dead end) rather than erroring, so partial results stay valid.

7. KM*: the worked application

The chapter’s flagship example is KM* (a natural-deduction theorem prover for propositional logic) built entirely out of FTRE rules. Each logical connective gets an introduction rule and an elimination rule (to prove A and B, prove each conjunct; from A and B, conclude A; and so on), plus indirect proof via a-rules. KM* is where the whole machine comes together: deductive rules do the forward chaining, a-rules supply the hypotheses (assume the antecedent of an implication, assume the negation for indirect proof), and the context stack supplies the backtracking. Exercise 20 asks you to push KM* all the way to first-order logic by adding quantifier rules, where the hard parts are fresh-constant (eigenvariable) generation and capture-avoiding substitution.

The examples, explained

The chapter’s running examples each isolate one of the three FTRE moves.

The compiled matcher (Figure 5.2 and friends). The point of showing the generated match code is to make concrete that a rule is just two functions after compilation. Once you see the open-coded (consp p) / car-equality / cdr-walk pattern, several exercises become obvious questions about that code: are the shape tests redundant (Exercise 1), does the matcher recompute car/cdr it could have cached (Exercise 14), does the cost heuristic for choosing where to compute a variable ever pick wrong (Exercise 7), and is the parameter list bigger than it needs to be (Exercise 10)?
N-queens, two ways (Section 5.3.1). The book solves N-queens with a hand-written solver that uses FTRE for search control but keeps the board logic in Lisp. Exercise 5 challenges the “purist” position by asking you to solve it with rules only. The lesson the comparison teaches: search control (choice points + backtracking) maps beautifully onto a-rules + the context stack, and constraints map onto contradiction-detecting deductive rules, so the purist is largely right – but the all-rules version is slower and more memory-hungry because every partial placement and every constraint check becomes an asserted fact and a fired rule. That declarative-versus-procedural trade-off is the whole point. (Our package makes the same trade visible from the other direction: see the DD-search N-queens solver built on the LTMS in ../examples/ and ../src/ltms/dds.py.)
*The fnd natural-deduction rule set (Section 5.3.2 / KM).** This example shows a real rule set that almost works: it passes the basic tests but fails some harder examples. Exercise 18 walks through diagnosing and fixing it, and the moral (the book’s “an ounce of analysis”) is that the fix is a small, general addition – goal-directed backward chaining via show – not a pile of example-specific patches. Forward chaining alone cannot attack every conclusion; you need to work backward from the goal to the subgoals that would prove it.
seek-in-context and the context guard (Exercise 16). The example here is a goal rule that carries (= *context* ,*context*). It teaches the single most important safety property of stack-based contexts: a goal posted under one set of assumptions must only be considered proved by something believed under that same set of assumptions. Without the guard, a fact derived in a deeper subcontext (because of an extra assumption the original context never made) could be mistaken for a proof of the original goal – an unsound result. Our LTMS makes the analogous guarantee structurally, by recording well-founded support back to the actual enabled assumptions; see assumptions_of_node and the explanation machinery in ../src/ltms/explain.py.

Exercises walk-through

All twenty problems are answered, conceptually, in ../exercises/ch05/README.md. FTRE is out of scope for the package, so these are analysis only – algorithm sketches, complexity arguments, and design discussion, with nothing executed (codeRuns = false). Highlights, grouped by theme:

The compiled matcher (1, 7, 10, 11, 14)

Exercise 1 – are the consp and car-equality tests redundant? Not in general. Dbclass indexing fixes only the top-level predicate and only gives you a coarse bucket of candidates. The match code is recursive, and at inner positions (a variable bound to an atom, an integer, nil) nothing upstream has proven the value is a cons or that an embedded constant matches. You could special-case the outermost call, but removing the tests wholesale is unsafe. Keep them; they cost two cheap tests per node.
Exercise 7 – when the “cheapest expression” heuristic misfires. The generator greedily computes each newly bound variable via the locally cheapest access path. It can lose when the locally cheap path duplicates work later: computing ?x from trigger 2 (one car) looks cheaper than from trigger 1 (a deep cadr/caddr), but the trigger-1 traversal would have computed an intermediate subterm needed anyway for another variable. The true optimum is global common-subexpression elimination over the car/cdr DAG – doable, but usually not worth it, because triggers are tiny and match code is compiled once and reused.
Exercise 10 – *bound-vars* carries variables a match procedure never uses. Passing the extra bindings cannot cause wrong answers (unused parameters are inert); it only costs memory and time. You can trim each match procedure to vars-mentioned-in(trigger) ∩ *bound-vars* via a straightforward used-variable analysis, but the win is marginal. The same trick does not safely apply to body procedures, because a body is arbitrary user code and can reference any in-scope variable (through macros, rlet, helpers) – you cannot statically prove a body variable is dead the way you can for a structural trigger.
Exercise 11 – open-coding more constituent types. Floats need a type dispatch and a choice: tolerance comparison (intuitive, but breaks the transitivity/hashing the index relies on) versus exact = (well-defined, surprising). User-defined types need an extensibility registry mapping a recognizer to an identity test and an optional decomposer. The danger (11c) is that arbitrary structures have ambiguous equality, may be mutable (silently changing “what was asserted”), may hash poorly, and may be cyclic. Safe rule: only admit immutable constituents with cheap, well-defined equality and hash.
Exercise 14 – eliminate redundant car/cdr. Walk the pattern with a single mutable cur pointer and let-bind each subterm the first time it is reached, then refer to the temporary thereafter. This drops match cost from “sum of access-path lengths” (up to O(n^2) for deep nesting) to O(n) – each cons cell visited once. It is the same optimization Exercise 7’s global plan wants; combine them.

Assumptions, contexts, and a-rules (2, 3, 16, 17, 19)

Exercise 2 – why a-rules forbid multiple triggers. Covered above: one match -> one assumption -> one subcontext keeps assumptions atomic and the push/pop discipline clean.
Exercise 3 – unless-body versus :test-clause for guarding an indirect-proof a-rule. They compute the same net assertions but differ in when the guard runs. A :test is part of matching, so a failing binding never becomes a rule instance: no firing, no assumption, no work. An unless in the body fires first and then does nothing, paying instantiation cost and polluting firing counts. Since the three guards here are pure functions of the match binding, they belong in :test. Reserve the body-unless form for guards that depend on values not known until the body runs.
Exercise 16 – the (= *context* ,*context*) guard in seek-in-context. The headline soundness exercise, explained in “The examples” above. The quasiquote captures the context that was current when the goal was posted; the runtime = checks it against the context current when a candidate fires. Drop it and a fact derived in a deeper assumption context can be mistaken for a proof of a goal that never made that assumption – an unsound indirect proof.
Exercise 17 – control predicates premise and goal. Wrap raw assuming/showing in declarative markers: a rule on (premise ?x) actually assumes ?x; a rule on (goal ?x) posts (show ?x). The payoff is solved?, which can self-check success by looking up every asserted (goal ?x) and testing whether each ?x is now believed – the engine decides on its own whether the stated problem is solved, with no caller hard-coding which proposition to check.
Exercise 19 – from a context stack to a context tree (non-DFS search). The stack is what forces depth-first. Replace it with a tree of contexts (each node = parent + one added assumption), turn a-rules from “push/explore/pop” into “create a child and add it to a frontier, then return,” and add an explicit scheduler that pops the best fringe context. The comparator then selects the strategy: LIFO recovers DFS, ordering by an evaluation function gives best-first, keeping top-k per level gives beam search. This is essentially turning FTRE’s implicit DFS recursion into explicit agenda-driven graph search, closer in spirit to an ATMS multi-context view. Stack wins on memory and best-case speed; tree wins on solution quality and worst-case search shape.

Performance and engineering (4, 9, 15)

Exercise 4 – per-dbclass stacks. Give each dbclass its own facts-stack and rules-stack instead of scanning two global ones. Matching a new fact then scans only the rules in its class. With K well-spread classes the total matching work drops from roughly Θ(F·R) to about FR/K – a factor-of-K speedup that grows with the number of distinct predicates, and degrades gracefully to the centralized cost (plus a little hashing) when one predicate dominates.
Exercise 9 – automatic trigger reordering. Multi-trigger matching is a join, so trigger order matters exactly like database join ordering: test the most selective and most constraining trigger first. The automatic design keeps selectivity statistics per predicate, computes which variables each trigger binds versus requires (a partial-order constraint), and greedily picks the next trigger that adds the fewest expected matches. Recommendation: automatic greedy ordering with an author override hook, because authors sometimes know things the statistics never capture.
Exercise 15 – reclaim memory from generated functions. Implementing rules as named Lisp procedures gives them indefinite extent: their function cells survive even after you throw away the FTRE, leaking code. The clean fix is to generate the match/body code as anonymous closures stored in slots of the rule structure itself, so they become collectable when the rule does. This is exactly the design our package follows in spirit: rule bodies are first-class Python callables stored on the engine, not global definitions (see add_rule / rule in ../src/ltms/ltre.py).

Negation, defaults, and robustness (6, 8)

Exercise 6 – a rule that fires only when nothing matches its trigger. This is negation-as-failure / closed-world default reasoning (“no fact says the train is late, so assume it is on time”). Implementable but semantically tricky because absence is non-monotone: you must fire it only at quiescence, and you must record a dependency so the conclusion is retracted if a matching fact later arrives – which is to say, it really wants a TMS underneath. That is precisely why our package handles closure through dedicated closed-world machinery (../src/ltms/cwa.py) rather than ad-hoc negative rules.
Exercise 8 – rlet does not check its variables are real FTRE variables. A latent robustness hazard, not an everyday bug: a typo’d or non-FTRE name binds silently and gives a wrong-answer bug instead of a clean error. Cheap fix: validate each name with the same predicate the matcher uses. Tolerable for a teaching system; add the check for a library.

Resource bounds and search (12, 13)

Exercise 12 – a total-assumptions bound. Add a cumulative counter and limit, check it before each new assumption, and cut the branch gracefully at the ceiling rather than erroring. It is orthogonal to the existing depth bound: depth limits the height of any one assumption chain, the total bound limits overall effort across all branches. Whichever binds first cuts the search; tune them together, and remember either can make a solvable problem look unsolvable (incompleteness from resource exhaustion).
Exercise 13 – iterative deepening on KM*. Run depth-limited DFS with the limit climbing 0, 1, 2, …, resetting the database (but keeping the rules) each round. You re-derive shallow consequences every round, but because the deepest level dominates the work, total cost stays within a constant factor of one full DFS while guaranteeing the shallowest proof is found first and memory stays linear in depth.

First-order logic and KM* (5, 18, 20)

Exercise 5 – N-queens in pure rules and Exercise 18 – tuning the fnd rule set are discussed under “The examples” above.
Exercise 20 – extend KM* to first-order predicate calculus. Add the four quantifier rules following KM’s introduction/elimination pattern: universal introduction (prove P(a) for a *fresh, arbitrary constant a, then generalize), universal elimination (instantiate with any term), existential introduction (a witness exists), and existential elimination (introduce a fresh witness constant, and require that the conclusion not mention it). The genuinely first-order difficulties are fresh-constant (eigenvariable) generation with correct occurrence side-conditions and capture-avoiding substitution – exactly the issues absent from the propositional system. Good tests are the ones that both prove valid arguments (syllogisms, quantifier duality) and block the classic unsound generalizations.

Try it in this repo

Honest disclaimer first: FTRE is not in this package. There is no ftre module, no context stack, no rule code-generator – porting them was out of scope (see ../BRIEFING.md). The full conceptual treatment lives in ../exercises/ch05/. What you can run here is the closest available machinery, which exercises the same ideas (assumptions, contradictions, indirect proof, dependency-directed search) on a different substrate: a TMS with nogoods instead of a raw context stack.

The car-indexed forward-chaining loop (the TRE skeleton FTRE compiles) is real and runnable. One caveat that itself illustrates the chapter: the base TRE has no multi-trigger syntax, so a conjunctive (“A and B”) match – the join FTRE supports directly – is built by nesting rules, where one rule’s body installs the next. Rule bodies take (bindings, engine):

from ltms.tre import Tre
from ltms.terms import Var

e = Tre()
x, y, z = Var("x"), Var("y"), Var("z")

# transitivity, written as a nested join: outer matches (parent ?x ?y),
# and its body installs an inner rule that matches (parent ?y ?z).
@e.rule(("parent", x, y))
def outer(b, eng):
    by = b[y]
    @eng.rule(("parent", by, z))
    def inner(b2, eng2, bx=b[x]):
        eng2.assert_(("grandparent", bx, b2[z]))

e.assert_(("parent", "ann", "bob"))
e.assert_(("parent", "bob", "cy"))
e.run_rules()
print(e.fetch(("grandparent", Var("g"), Var("d"))))   # -> [('grandparent', 'ann', 'cy')]

Hypothetical reasoning and indirect proof – FTRE’s a-rule trick – show up here as the LTRE’s assumptions plus try_indirect_proof:

from ltms import LTRE
from ltms.indirect import try_indirect_proof

e = LTRE()
# Encode enough that assuming (not goal) leads to a contradiction, then:
proved = try_indirect_proof(e, ("goal",))   # assume the negation, run rules, refute
print(proved)

The depth-first, backtracking search that FTRE gets from its context stack is here as dependency-directed search, including a working N-queens solver:

python examples/coloring_dds.py        # map-coloring via dependency-directed search
python examples/run_kb.py examples/kb/belief_revision.kb   # assume / retract / belief revision
pytest -k dds                          # the DD-search tests, incl. N-queens

Read ../src/ltms/indirect.py and ../src/ltms/dds.py side by side with Exercises 5 and 16: you will see the same “assume, run, catch the contradiction, learn from it, back out” shape that FTRE implements with a context stack, implemented here with a contradiction-handler stack and nogoods. The bookkeeping differs; the idea is the same.

Takeaways

FTRE = compiled TRE + assumption-context stack + resource bounds. Three independent upgrades to the same forward-chaining loop.
Rules become two functions (a match procedure and a body procedure) generated once at definition time. Most of the matcher-flavored exercises (1, 7, 10, 11, 14) are really questions about that generated code – its redundant tests, its variable-cost heuristic, its parameter lists, its repeated car/cdr.
Assumptions live in a stack of contexts, which is what makes the search depth-first. a-rules push and pop; the single-trigger restriction (Exercise 2) and the (= *context* ,*context*) guard (Exercise 16) both exist to keep that stack discipline sound.
Negation-as-failure and resource bounds are non-monotone and want a TMS or careful quiescence handling underneath (Exercises 6, 12, 13).
KM* is the payoff: a natural-deduction prover where deductive rules chain forward, a-rules supply hypotheses, and the stack supplies backtracking – extensible all the way to first-order logic (Exercise 20) with eigenvariables and capture-avoiding substitution.
In this repo, FTRE is conceptual. The runnable analogues of its assumption ideas are the LTMS/LTRE’s assume/retract, try_indirect_proof, and dependency-directed search; the worked analysis of every Chapter 5 exercise is in ../exercises/ch05/.