Chapter 4 – Pattern-Directed Inference Systems (TRE)
A friendly tour of TRE, the smallest forward-chaining rule engine in this project, and the substrate every truth-maintenance layer in later chapters is built on. Back to the companion index.
The big idea
A pattern-directed inference system is a rule engine that does not call rules by name. Instead, each rule advertises a pattern it cares about, and the engine automatically wakes a rule whenever a fact matching that pattern shows up. You never write “now run rule 7”; you assert facts, and the relevant rules fire themselves. This is the data-driven, forward-chaining style of inference: new data flows in, and conclusions ripple outward until nothing new can be derived.
TRE (the “Tiny Rule Engine”) is the bare-bones version of this idea. It is deliberately minimal so you can see the whole mechanism at once. In TRE, belief is nothing more than presence in the database. A fact is “believed” if and only if it has been asserted. There is no notion of why a fact is believed, no support, no retraction, and no contradictions. That austerity is the point: TRE is the floor that every later chapter stacks truth maintenance on top of. Chapters 7 onward keep TRE’s pattern-matching skeleton almost unchanged and bolt a TMS (a JTMS, then the LTMS) underneath, so that belief becomes justified belief that can change as assumptions change. Get TRE clear in your head and the rest of the book is “the same engine, but now facts can become unbelieved.”
The implementation lives in
../src/ltms/tre/engine.py. It is about 175 lines,
and you can profitably read all of it. This chapter walks through what those
lines do and why, then works through the exercises in
../exercises/ch04/.
Concepts, step by step
Terms, patterns, and variables
Everything the engine reasons about is a term: a tuple like
("parent", "ann", "bob") standing for the proposition ann is a parent of
bob. A term is either an atom (a plain symbol or string) or a compound
(a tuple whose first element is the relation name). A variable is written
Var("x") and prints as ?x. A pattern is just a term that may contain
variables, for example ("parent", Var("x"), Var("y")). Facts in the database
are always ground (variable-free); patterns in rules may have variables that
get filled in by matching. These primitives are shared verbatim with every other
layer of the project (see ../src/ltms/terms.py and
../src/ltms/unify.py).
Unification: the one matching primitive
Matching a pattern against a fact is done by unification, a single shared
function unify(a, b, bindings). It tries to find an assignment of variables
that makes the two terms identical, returning the (extended) bindings on success
or the sentinel FAIL on failure. One subtlety worth burning into memory:
An empty binding dict is a legitimate success (it means “these unify with no new bindings needed”). Success is therefore tested as
result is not FAIL, never by truthiness. Testingif result:would wrongly treat a perfectly good empty-binding match as a failure.
You can see this discipline everywhere in the engine, for example in
assert_, fetch, and the private _try_rule_on.
The full unifier is two-sided and includes an occurs-check (it refuses to
bind ?x = (g ?x), which would create an infinite term). TRE’s rule matcher
actually only needs one-sided matching, because database facts are always
ground, but the project reuses the one general unifier for simplicity. The
difference between one-sided matching and full two-sided unification is exactly
what Exercise 3 explores.
Car-indexing: the only index TRE has
If every assertion had to be tried against every rule, the engine would be
hopelessly slow. TRE uses one cheap trick: it indexes facts and rules by their
leftmost symbol (in Lisp this is the car of the list, hence “car
indexing”). The method is get_dbclass:
def get_dbclass(self, term, env=None):
if is_compound(term):
return self.get_dbclass(term[0], env) # recurse to the leftmost symbol
...
bucket = self.dbclass_table.get(term) # one bucket per relation name
...
Each relation name gets a Dbclass “bucket” holding the facts that use that
name and the rules triggered by that name. So ("parent", "ann", "bob")
and the pattern ("parent", ?x, ?y) both live in the parent bucket, and they
are the only things ever compared with one another. A variable in head position
(for example trying to index the pattern (?r ann bob)) is an error unless that
variable is already bound, because then there is no symbol to file it under.
The three moving parts of an engine
A Tre instance is one self-contained database (not a global). It holds:
dbclass_table– the dict mapping each leftmost symbol to itsDbclassbucket of facts and rules.queue– a list of pending rule activations, each a pair(body, bindings). This is drained byrun_rules.- counters (
rule_counter,rules_run) for naming rules and bookkeeping.
Asserting facts
assert_(fact) adds a fact, deduplicating by equality. The important detail:
def assert_(self, fact):
bucket = self.get_dbclass(fact)
if fact in bucket.fact_set:
return False # already known: nothing to do
bucket.fact_set.add(fact); bucket.facts.append(fact)
for rule in list(bucket.rules): # only on a genuinely new fact
self._try_rule_on(rule, fact) # ...do we test the bucket's rules
return True
Two things to notice. First, only a genuinely new fact triggers work (the
dedup check returns early). This is what keeps the engine from looping forever
re-deriving the same things, and it is why forward chaining terminates on
finite, monotone rule sets. Second, assert_ queues matching rules; it does
not run them. Running is a separate step.
Adding rules, and why order does not matter
add_rule(trigger, body, environment) installs a rule. A rule body is just a
Python callable body(bindings, tre). The crucial line:
def add_rule(self, trigger, body, environment=None):
...
bucket.rules.append(rule)
for fact in list(bucket.facts): # back-test against facts already present
self._try_rule_on(rule, fact)
return rule
When you add a rule, the engine immediately tests it against facts that are already in the database. Combine this with the assert path (a new fact is tested against rules already present) and you get the heart of TRE:
Both “new fact meets old rules” and “new rule meets old facts” funnel through the single match point
_try_rule_on. Therefore the order in which facts and rules arrive never changes the final set of conclusions.
This property is called order-independence or confluence, and it is the subject of Exercise 1. It is what lets you load rules and facts from many sources in any sequence and still trust the result.
_try_rule_on itself is tiny and is the only place matching happens:
def _try_rule_on(self, rule, fact):
result = unify(fact, rule.trigger, dict(rule.environment))
if result is not FAIL:
self.queue.append((rule.body, result)) # enqueue, don't run
There is also a decorator form, @tre.rule(pattern), that is sugar over
add_rule.
The control loop
run_rules drains the queue to quiescence (until nothing is left):
def run_rules(self):
fired = 0
while self.queue:
body, bindings = self.queue.pop() # LIFO
body(bindings, self) # may assert facts / add rules -> more work
fired += 1
return fired
A rule body can assert new facts and even install new rules, each of which can
push more activations onto the queue, so the loop keeps going until the database
is closed under the rules. The queue is LIFO here, but the book stresses
that the order is logically irrelevant: because of order-independence, FIFO
would reach the same final database (it might visit conclusions in a different
sequence on the way). run_forms(facts) is a convenience batch driver that
asserts each fact and runs to quiescence.
A deliberate non-feature: TRE has no loop detection. A rule like
(integer ?x) => (integer (1+ ?x)) is supposed to run forever. Detecting and
preventing that is not the engine’s job; writing terminating rule sets is the
author’s job.
Conjunctive rules by nesting
TRE has no multi-trigger syntax. There is no built-in way to say “fire when both
(parent ?x ?y) AND (ancestor ?y ?z) are present.” You get conjunction by
nesting: the outer rule triggers on the first pattern, and inside its body
it calls add_rule again to install a second rule watching the second pattern.
The inner rule’s closure captures the bindings from the first match, so the two
matches share variables. This is exactly the trick the transitive-ancestor
example uses (see below), and it is the mechanism Exercise 6 exploits for “lazy”
rules. The environment field on a Rule exists precisely to carry those
captured bindings forward.
What TRE deliberately leaves out
No support, no justifications, no retraction, no contradiction handling, no
negation. Belief is monotone: once asserted, always asserted. Every one of these
omissions is filled in by a later layer. The LTMS-backed sibling LTRE
(../src/ltms/ltre.py) keeps TRE’s pattern matching but
replaces “present in the database” with a three-valued, retractable, justified
label managed by Boolean constraint propagation. Several Chapter 4 exercises are
demonstrated on LTRE because they need exactly that extra power (case analysis,
negation), which TRE alone cannot provide.
The examples, explained
Worked example: deriving ancestors (transitive closure)
../examples/family_tre.py is the canonical TRE
demo. It computes the ancestor relation as the transitive closure of parent
using two rules:
# Base case: a parent is an ancestor.
tre.add_rule(("parent", X, Y),
lambda b, t: t.assert_(("ancestor", b[X], b[Y])))
# Recursive case: parent(x,y) AND ancestor(y,z) => ancestor(x,z), via NESTING.
def chain(b, t):
x = b[X]
t.add_rule(("ancestor", b[Y], Z),
lambda b2, t2: t2.assert_(("ancestor", x, b2[Z])))
tre.add_rule(("parent", X, Y), chain)
tre.run_forms([("parent","ann","bob"), ("parent","bob","cy"), ("parent","cy","dee")])
This single example concentrates most of the chapter’s ideas:
- Forward chaining: facts drive rules; you never call a rule by name.
- Pattern matching with variables:
?x,?y,?zbind to people. - Conjunctive rules by nesting: the recursive rule is two chained triggers,
the inner one watching
(ancestor ?y ?z)for the?ybound by the outer. - Order-independence: the rules are installed before any facts, but
run_formsinterleaves asserts and runs; you would get the same six ancestor pairs no matter how you ordered things.
Running it derives every ancestor pair: ann of bob, cy, dee; bob of
cy, dee; and cy of dee.
The same world as a knowledge-base file
../exercises/ch04/kb/ancestor.kb expresses
the identical world in the project’s tiny DSL, so you can see the declarative
form next to the Python form:
rule (parent ?x ?y) => (ancestor ?x ?y)
rule (parent ?x ?y) & (ancestor ?y ?z) => (ancestor ?x ?z)
parent ann bob
parent bob cy
parent cy dee
expect ancestor ann dee true
expect ancestor bob dee true
Here & is the surface syntax for the conjunctive (nested) rule, and expect
lines are checked assertions. The loader (in
../src/ltms/dsl.py) compiles this to the same engine
calls the Python example makes.
Exercises walk-through
The full write-ups, with paraphrased problem statements and runnable code, are
in ../exercises/ch04/README.md, and every
“demonstrated” exercise is a demo_* function in
../exercises/ch04/solutions.py. Highlights:
Exercise 1 – Why order-independence matters
A paper question, demonstrated in code. The payoff of confluence: rule authors never have to reason about interleavings, debugging is local (a missing conclusion means a missing rule or fact, never a timing accident), and independent knowledge modules compose without caring about load order. The demo builds the same fact and rule in two orders – rule-then-fact and fact-then-rule – and confirms the conclusion set is identical. The trade-off: you may not rely on firing order for control (no cut-like pruning); control must be declarative.
Exercise 2 – Solving a logic-textbook problem
The chosen problem is the constructive dilemma: from A ∨ B, A → C,
B → C, derive C. The lesson is a clean split of labor:
- Modus ponens is free. Asserting
man(socrates)and the implicationman(x) → mortal(x)into LTRE immediately makesmortal(socrates)true, because Boolean constraint propagation does unit propagation for you. - The dilemma is not free. It needs reasoning by cases, which plain
propagation cannot do: the demo shows
c_before_indirect_proof = False. DerivingCusestry_indirect_proof, which assumes¬C, propagates, finds the clause set unsatisfiable, and concludesC(c_proved_by_cases = True, thenc_now_true = True).
This is the BPS moral in miniature: a forward chainer with a TMS hands you the easy deductions directly, while case splits and assumption-based proofs need the explicit proof procedures of later chapters.
Exercise 3 – Full (two-sided) unification
Part (a) asks why a naive one-way matcher mishandles patterns with variables
on both sides, like (Foo ?x ?x) versus (Foo ?y ?z). A one-way matcher treats
the second pattern’s variables as constants, so it never enforces that ?y and
?z must collapse to the same variable, and it has no occurs-check.
Part (b) is satisfied by the project’s existing
unify, which is Robinson’s algorithm with the
occurs-check. The demo exhibits the four telling cases:
unify(("Foo",?x,?x), ("Foo",?y,?z)) returns {?x: ?y, ?y: ?z} (so the two
right-hand variables are forced equal); (Foo ?x ?x) vs (Foo a b) correctly
fails; vs (Foo a a) succeeds; and unify(?x, ("g", ?x)) fails the
occurs-check.
Exercise 4 – multi-fetch (the relational join)
multi_fetch(engine, patterns) returns every binding environment under which
all the patterns match simultaneously, that is the relational join with
shared variables. The implementation threads a list of environments through the
patterns one at a time, substituting bindings so far into the next pattern and
unifying it against the relevant Dbclass bucket. With parent(ann,bob),
parent(cy,dee), employed(ann), employed(ed), the patterns
[(parent ?x ?y), (employed ?x)] yield the single joint answer ?x=ann,
?y=bob (the only employed parent). Note how it reuses get_dbclass so it only
scans relevant facts.
Exercise 5 – show-rule
show_rule(engine, counter) looks a rule up by the integer in its counter
field and prints a readable description, deliberately showing the trigger
pattern and the seed environment separately (as the exercise requires), plus
the firing condition and the bucket it is indexed under. This works because the
Rule dataclass exposes exactly those fields, so the object returned by
add_rule carries everything needed for introspection.
Exercise 6 – More efficient AND / BICONDITIONAL introduction
Part (a) rewrites AND-introduction to look for the second conjunct only after
the first is proved, avoiding wasted work. The fix is the nested-rule pattern: an
outer rule triggers on proved(a), and only inside its body installs the inner
rule that watches proved(b). The demo runs it both ways: when the first
conjunct is unprovable the second is examined 0 times and no conjunction is
made; when the first holds, the second is examined 1 time and the conjunction
is asserted.
Part (b) names the assumption this optimization relies on: there must be no
cross-dependence or synergy between the two sub-proofs (proving B must not be
what makes A provable), the cost ordering must be stable (the first conjunct
really is the cheaper or more-likely-to-fail one), and forward-chaining
monotonicity must hold. It is violated if some other rule derives A from
B: then a fact set supporting B should yield A ∧ B, but the lazy version,
seeing A not yet proved and never examining B, never fires the chain that
would have produced A, and misses the conjunction.
Exercise 7 – Blackboard systems on TRE knowledge sources
Part (a) asks which TRE design decisions must change before TRE engines could
serve as blackboard knowledge sources. Two clear ones: (1) run_rules drains the
whole queue to quiescence with no external scheduler or budget, whereas
blackboards need opportunistic, interruptible control that can preempt one
knowledge source to run a more promising one; and (2) TRE has a single private,
monotonic, non-retractable database with no provenance, whereas blackboards need
a shared, structured, multi-level workspace with the ability to revise and
remove hypotheses (which really wants the JTMS/LTRE layer from later chapters).
Part (b) sketches a blackboard shell: each knowledge source is its own Tre
engine, a shared list is the blackboard, and an explicit controller cycles the
knowledge sources and moves new blackboard entries between them. The demo posts
observed(fire), lets KS1 derive hypothesis(fire), copies it to KS2 alongside
evidence(fire), and KS2 produces confirmed(fire). A production shell would
add priority queues, per-cycle budgets, change-events, and hypothesis retraction
(best done by swapping Tre for the LTMS-backed LTRE).
Try it in this repo
All commands are run from the repository root with the virtual environment active. (These were all executed while writing this chapter and produce the output shown.)
# activate the project's virtual environment
. .venv/bin/activate
# 1) Run every demonstrable Chapter 4 exercise and pretty-print the results.
python exercises/ch04/solutions.py
# 2) The canonical TRE worked example: derive ancestors by forward chaining.
python examples/family_tre.py
# 3) The same ancestor world as a declarative .kb file, with expected results.
python examples/run_kb.py exercises/ch04/kb/ancestor.kb
What you should see:
solutions.pyprints a dict keyed by exercise. Spot-check thatex1_order_independencereportsidentical: True,ex2_natural_deductionshowsc_before_indirect_proof: Falsebutc_now_true: True, andex4_multi_fetchgivesjoint_parent_and_employed: [('ann', 'bob')].family_tre.pyprints the six derived ancestor pairs.run_kb.pyprintsancestor ann dee: trueandancestor bob dee: true, matching theexpectlines in the.kbfile.
To experiment, try editing
../exercises/ch04/kb/ancestor.kb: add a
new parent line and a matching expect, then re-run command 3 and watch the
transitive closure grow. Or open a REPL and build an engine by hand:
from ltms.tre import Tre
from ltms.terms import Var
X = Var("x")
t = Tre()
t.add_rule(("human", X), lambda b, e: e.assert_(("mortal", b[X])))
t.assert_(("human", "socrates"))
t.run_rules()
print(t.fetch(("mortal", X))) # [('mortal', 'socrates')]
Then swap the order (assert the fact before adding the rule) and confirm you get the same answer – that is Exercise 1, live.
Takeaways
- TRE is the smallest useful forward-chaining engine: a database of facts and rules indexed by leftmost symbol, a queue of activations, and a loop that drains it to quiescence.
- Belief = presence in the database. There is no support, no negation, no retraction. That austerity is what makes TRE the clean base every truth maintenance layer extends.
- Unification is the single matching primitive, and the empty-binding-as-
success rule (
is not FAIL) is non-negotiable. - Both new-fact and new-rule paths run through one match point, which buys order-independence (confluence): same facts plus same rules give the same conclusions regardless of arrival order.
- Conjunction is nesting: chained triggers whose inner rule captures the outer match’s bindings. This one pattern powers the recursive ancestor rule and the “lazy” optimizations of Exercise 6.
- Several exercises (case analysis in Ex 2, negation, blackboard retraction in Ex 7) reach past what TRE can do, pointing forward to LTRE and the truth maintenance chapters where belief becomes justified and revisable.
Next, see the truth-maintenance chapters that build directly on this engine:
../src/ltms/jtre.py (JTMS-backed) and
../src/ltms/ltre.py (LTMS-backed).