9.3. Top-down induction¶

We introduce the second induction method by means of an example. Suppose we want to construct a definition of the predicate element/2 by means of induction. After receiving the first example +element(a,[a,b]), we formulate the simplest hypothesis possible:

element(X,Y).

This hypothesis states that everything is an element of everything. Suppose our next example is a negative one: -element(x,[a,b]). Since this negative example is covered by our current hypothesis, we conclude that it is too general and has to be specialised. Under $$\theta$$-subsumption, there are two ways to specialise a clause:

1. apply a substitution to variables in the clause;

2. add a literal to the body of the clause.

We can thus specialise our hypothesis in several ways: we can apply substitutions like { X[] }, { YX } or { Y[V|W]] }, or we can add a literal like element(Y,X) to the body of the clause. So, the set of specialisations of the above clause includes, among others, the following clauses:

element([],Y).
element(X,X).
element(X,[V|W]).
element(X,Y):-element(Y,X).

Note that each of these clauses is a minimal specialisation, in the following sense: each of them is $$\theta$$-subsumed by the original clause, and there exist no more-general clauses which are also $$\theta$$-subsumed by the original clause.

Suppose for the moment that we choose the third clause as our next hypothesis:

element(X,[V|W]).

This hypothesis expresses that anything is an element of a non-empty list. Obviously, this clause is again too general, since it still covers the negative example. Possible minimal specialisations include

element(X,[V]).
element(X,[X|W]).
element(X,[V|X]).
element(X,[V|W]):-element(X,W).

The second of these clauses is true in the intended interpretation, and will therefore never cover any negative example. Since it also covers the only positive example seen up till now, we decide to adopt it as our next hypothesis. Notice that the recursive clause is also among the above specialisations; it will be found if we supply a positive example like +element(b,[a,b]).

Thus, we see that the operation of specialisation generates a search space in which the correct clauses defining element/2 are to be found. Part of this search space, which we will call the specialisation graph, is depicted in Figure 9.1. Notice that, in order to generate the specialisation graph, we need to specify the hypothesis language: the set of predicates, functors and constants that can occur in the hypothesis. We can further restrict the search space by assigning types to the arguments of predicates and functors. For instance, by assigning X and Y in element(X,Y) and [X|Y] the types ‘item’ and ‘list of items’, respectively, it becomes clear that X and Y should not be unified in a specialisation step, and neither should X be substituted by [] or [V|W]. Such typing would rule out three clauses in Figure 9.1. Figure 9.1 Part of the specialisation graph for element/2.

Even with such typing restrictions, the branching factor in the specialisation graph is typically quite large, increasing with the number of variables in a clause. Therefore, an agenda-based search procedure will require large amounts of memory. Instead, we will employ an iterative deepening search strategy with backtracking. Each time a clause in the hypothesis is found to be too general, we search the specialisation graph for an alternative, starting from the root and increasing the depth bound until a suitable clause is found. Identifying and removing the too-general clause is a specialisation operation; searching for an alternative and adding it to the hypothesis is a generalisation step.

The program below implements this top-down induction procedure. Its main loop is given by the predicate process_examples/4. This predicate processes the examples one by one. Whenever the hypothesis is changed by generalisation or specialisation, the new hypothesis should be checked against all previous examples, which are therefore passed in the list Done.

induce_spec(Examples,Clauses):-
process_examples([],[],Examples,Clauses).

% process the examples
process_examples(Clauses,_Done,[],Clauses).
process_examples(Cls1,Done,[Ex|Exs],Clauses):-
process_example(Cls1,Done,Ex,Cls2),
process_examples(Cls2,[Ex|Done],Exs,Clauses).

% process one example
process_example(Clauses,_Done,+Example,Clauses):-
covers_d(Clauses,Example).
process_example(Cls,Done,+Example,Clauses):-
not(covers_d(Cls,Example)),
generalise(Cls,Done,Example,Clauses).
process_example(Cls,Done,-Example,Clauses):-
covers_d(Cls,Example),
specialise(Cls,Done,Example,Clauses).
process_example(Clauses,_Done,-Example,Clauses):-
not(covers_d(Clauses,Example)).

Tip

Notice that top-down induction produces the standard definition of element/2, in which the recursive clause applies even if the list has a single element (and subsequently fails). This demonstrates a key difference between top-down and bottom-up induction.

Intensional coverage of an example by a set of clauses is checked by a simple meta-interpreter. Since the current hypothesis might include circular clauses like element(X,Y):-element(Y,X), the meta-interpreter employs a depth bound to cut off the search for a proof after a fixed number of steps. Additionally, a background theory might be defined by means of the meta-predicate bg/1; we will assume that this background theory is non-circular, and does not contain the predicate to be induced.

% covers_d(Clauses,Ex) <- Ex can be proved from Clauses and
%                         background theory (max. 10 steps)
covers_d(Clauses,Example):-
prove_d(10,Clauses,Example).

prove_d(D,Cls,true):-!.
prove_d(D,Cls,(A,B)):-!,
prove_d(D,Cls,A),
prove_d(D,Cls,B).
prove_d(D,Cls,A):-
D>0,D1 is D-1,
copy_element((A:-B),Cls),  % make copy of clause
prove_d(D1,Cls,B).
prove_d(D,Cls,A):-
prove_bg(A).

prove_bg(true):-!.
prove_bg((A,B)):-!,
prove_bg(A),
prove_bg(B).
prove_bg(A):-
bg((A:-B)),
prove_bg(B).

%%% copy_element/2: see Section 10.2 (appendix)

If the current hypothesis covers a negative example, it follows that it contains at least one clause which is false in the intended interpretation. The predicate specialise/4 identifies such a false clause by examining the proof of the negative example. Once such a clause is found, it is simply thrown out of the hypothesis. Since this is quite a coarse specialisation step, some of the previous positive examples will now become uncovered, and the predicate process_examples/4 is called again.

specialise(Cls,Done,Example,Clauses):-
false_clause(Cls,Done,Example,C),
remove_one(C,Cls,Cls1),
write('.....refuted: '),write(C),nl,
process_examples(Cls1,[],[-Example|Done],Clauses).

% false_clause(Cs,Exs,E,C) <- C is a false clause
%                             in the proof of E
false_clause(Cls,Exs,true,ok):-!.             % empty proof
false_clause(Cls,Exs,(A,B),X):-!,
false_clause(Cls,Exs,A,Xa),               % try first conjunct
( Xa = ok   -> false_clause(Cls,Exs,B,X)  % 2nd one
; otherwise -> X = Xa
).
false_clause(Cls,Exs,E,ok):-                  % no false clause for
element(+E,Exs),!.                        % positive examples
false_clause(Cls,Exs,A,ok):-                  % no false clause for
bg((A:-B)),!.                             % background literals
false_clause(Cls,Exs,A,X):-
copy_element((A:-B),Cls),
false_clause(Cls,Exs,B,Xb),               % false clause in proof B?
( Xb \= ok  -> X = Xb                     % yes
; otherwise -> X = (A:-B)                 % no; return this clause
).

As explained above, the predicate generalise/4 searches the specialisation graph for a clause covering an uncovered positive example. Since there might be several uncovered positive examples, the generalised hypothesis is again tested against all previous examples.

generalise(Cls,Done,Example,Clauses):-
search_clause(Done,Example,Cl),
write('Found clause: '),write(Cl),nl,
process_examples([Cl|Cls],[],[+Example|Done],Clauses).

The current node in the search process is represented by a term a(Clause,Vars), where Vars is the list of variables occurring in Clause, together with their types (see below).

% search_clause(Exs,E,C) <- C is a clause covering E and
%                           not covering negative examples
%                           (iterative deepening search)
search_clause(Exs,Example,Clause):-
literal(Head,Vars),  % root of specialisation graph
Exs,Example,Clause).

search_clause(D,Current,Exs,Example,Clause):-
write(D),write('..'),
search_clause_d(D,Current,Exs,Example,Clause),!.
search_clause(D,Current,Exs,Example,Clause):-
D1 is D+1,
!,search_clause(D1,Current,Exs,Example,Clause).

The search ends when a clause is found that covers the uncovered example, while not covering any of the negative examples.

search_clause_d(D,a(Clause,Vars),Exs,Example,Clause):-
covers_ex(Clause,Example,Exs),    % goal
not((element(-N,Exs),covers_ex(Clause,N,Exs))),!.
search_clause_d(D,Current,Exs,Example,Clause):-
D>0,D1 is D-1,
specialise_clause(Current,Spec),  % specialise
search_clause_d(D1,Spec,Exs,Example,Clause).

Here, extensional coverage is tested against the examples and the background theory:

covers_ex(true,Exs):-!.
covers_ex((A,B),Exs):-!,
covers_ex(A,Exs),
covers_ex(B,Exs).
covers_ex(A,Exs):-
element(+A,Exs).
covers_ex(A,Exs):-
prove_bg(A).

The following predicates generate the specialisation graph. The literals that can be added to the body of a clause are given by the predicate literal/2. The first argument of literal/2 is a literal; the second argument specifies the types of variables in the literal. Thus, for the predicate element/2 the following fact should be added:

literal(element(X,Y),[item(X),list(Y)]).

Likewise, the possible terms to be used in a substitution are specified with their types by the predicate term/2:

term(list([]),[]).
term(list([X|Y]),[item(X),list(Y)]).

For instance, the clause element(X,[V|W]):-true is represented during the search process as

a((element(X,[V|W]):-true),[item(X),item(V),list(W)])

Consequently, X and V can be unified with each other but not with W, and W can be substituted by [] or [Y|Z], but X and V cannot. To restrict the search further, we will again make the assumption that hypothesis clauses are strictly constrained; i.e. the set of variables in a newly added literal is a proper subset of the set of variables in the head of the clause.

% specialise_clause(C,S) <- S is minimal specialisation
%                           of C under theta-subsumption
specialise_clause(Current,Spec):-
specialise_clause(Current,Spec):-
apply_subs(Current,Spec).

literal(L,LVars),
proper_subset(LVars,Vars).  % no new variables in L
literal(L,LVars),
proper_subset(LVars,Vars).  % no new variables in L

apply_subs(a(Clause,Vars),a(Spec,SVars)):-
copy_term(a(Clause,Vars),a(Spec,Vs)),  % don't change
apply_subs1(Vs,SVars).                 % Clause

apply_subs1(Vars,SVars):-
unify_two(Vars,SVars).  % unify two variables
apply_subs1(Vars,SVars):-
subs_term(Vars,SVars).  % subs. term for variable

unify_two([X|Vars],Vars):-  % not both X and Y in Vars
element(Y,Vars),
X=Y.
unify_two([X|Vars],[X|SVars]):-
unify_two(Vars,SVars).

subs_term(Vars,SVars):-
remove_one(X,Vars,Vs),
term(Term,TVars),
X=Term,
append(Vs,TVars,SVars).  % TVars instead of X in Vars

We illustrate the program by applying it to the induction problems of the previous section. The first problem is to induce a definition of the predicate append/3. The hypothesis language is specified by the literals and terms to be used, together with the types of their arguments:

literal(append(X,Y,Z),[list(X),list(Y),list(Z)]).
term(list([]),[]).
term(list([X|Y]),[item(X),list(Y)]).

The following query demonstrates that append/3 can be induced from two positive and four negative examples:

?-induce_spec([ +append([],[b,c],[b,c]),
-append([],[a,b],[c,d]),
-append([a,b],[c,d],[c,d]),
-append([a],[b,c],[d,b,c]),
-append([a],[b,c],[a,d,e]),
+append([a],[b,c],[a,b,c])   ],Clauses).

3..Found clause: append(X,Y,Z):-true
...refuted: append([],[a,b],[c,d]):-true
3..Found clause: append(X,Y,Y):-true
...refuted: append([a,b],[c,d],[c,d]):-true
3..Found clause: append([],Y,Y):-true
3..4..Found clause: append([X|Xs],Ys,[X|Zs]):-append(Xs,Ys,Zs)

Clauses = [ (append([X|Xs],Ys,[X|Zs]):-append(Xs,Ys,Zs)),
(append([],Y,Y):-true) ]

The numbers indicate the level of iterative deepening at which the clauses are found. The first two negative examples are needed for the construction of the non-recursive clause, and the remaining two are needed for the construction of the recursive clause.

The second induction problem concerns the predicate listnum/2. The hypothesis language is declared as follows:

literal(listnum(X,Y),[list(X),list(Y)]).
literal(num(X,Y),[item(X),item(Y)]).
term(list([]),[]).
term(list([X|Y]),[item(X),list(Y)]).

We supply the following background theory:

bg((num(1,one):-true)).
bg((num(2,two):-true)).
bg((num(3,three):-true)).
bg((num(4,four):-true)).
bg((num(5,five):-true)).

The predicate listnum/2 can be learned from six well-chosen examples:

?-induce_spec([ +listnum([],[]),
-listnum([one],[one]),
-listnum([1,two],[one,two]),
+listnum(,[one]),
-listnum([five,two],[5,two]),
+listnum([five],)],Clauses).

3..Found clause: listnum(X,Y):-true
...refuted: listnum([one],[one]):-true
3..Found clause: listnum([],[]):-true
3..4..Found clause: listnum([V|Vs],[W|Ws]):-num(V,W),listnum(Vs,Ws)
3..4..Found clause: listnum([X|Xs],[Y|Ys]):-num(Y,X),listnum(Xs,Ys)

Clauses =
[ (listnum([X|Xs],[Y|Ys]):-num(Y,X),listnum(Xs,Ys)),
(listnum([V|Vs],[W|Ws]):-num(V,W),listnum(Vs,Ws)),
(listnum([],[]):-true) ]

It should again be noted that the examples need to be well-chosen and well-ordered. This is particularly true for the recursive clause. Because of the use of extensional coverage, all positive examples occurring in a proof should be given; moreover, it is good practice to supply negative examples for a particular recursive clause before the positive ones. For this induction program, which induces by specialising overly general clauses, negative examples are particularly crucial.

Exercise 9.5

Replace the iterative deepening search strategy with beam search (see the article by Quinlan, referred to below, for a possible heuristic).