DFA Minimization using Rational Trees in Prolog

I was inspired by Phil Zucker's blog post Naive Automata Minimization. On Twitter he wrote:

Maybe a way to actualize this is using knot tied rational trees instead of integer state ids. But this has its own confusions.

I really wanted to try to implement this in SWI-Prolog.

Rational Trees

Rational trees are Prolog terms that are circular in some way. They can be created by unification:

?- X = f(X).
X = f(X).

Now the term X represents the term f(f(f(f(f(f(...)))))):

?- X = f(X), X = f(f(f(f(f(f(f(A))))))).
X = A, A = f(A).

This is a bit like appending a list to itself in Python:

>>> l = []
>>> l.append(l)
>>> l
[[...]]

SWI-Prolog is good at dealing with these recursive terms, it can unify them etc. Python is less good. While repr does the right thing, as we saw above, == produces a recursion error:

>>> import copy
>>> copy.deepcopy(l)
[[...]]
>>> copy.deepcopy(l) == l
Traceback (most recent call last):
  File "<python-input-9>", line 1, in <module>
    copy.deepcopy(l) == l
RecursionError: maximum recursion depth exceeded

Defining DFAs with Rational Trees

We can use these rational trees in Prolog to represent deterministic finite automata (DFAs) over the alphabet with two characters, a and b. A DFA is a Prolog term representing the starting state. A state is represented by a term state(Accept, NextStateA, NextStateB) where Accept is true or false. The two next states are again state terms. A next state can also be none. To achieve loops in the DFA, we use rational trees to refer to outer states.

To represent a DFA that matches a+ we can use the term state(false, state(true, ..., none), none), where the ... is the second state term. We can build it like this:

dfa_a_plus(S1) :-
    S1 = state(false, S2, none),
    S2 = state(true, S2, none).

We can also build it as a non-minimal DFA with an extra state:

dfa_a_plus_inefficient(S1) :- % less efficient example automaton for a+ with extra states
    S1 = state(false, S2, none),
    S2 = state(true, S3, none),
    S3 = state(true, S4, none),
    S4 = state(true, S2, none).

Those two terms unify successfully and are thus equivalent:

?- dfa_a_plus(S1), dfa_a_plus_inefficient(S2), S1=S2.
 S1 = S2, S2 = state(false, _S1, none), % where
     _S1 = state(true, state(true, state(true, _S1, none), none), none).

They unify because the Prolog rational trees are "observationally equivalent" in the sense of Phil's blog post, which the Prolog unification algorithm can check.

The string representation of SWI-Prolog is not minimal though. This shows that for the printer uses some internal knowledge about how the trees were constructed, because from pure Prolog it should not be observable where the bound logic variable "was" in the structure (the term is ground).

Matching

So far we only defined automata, but we can also write a predicate that matches one against a list of a or b.

match([], state(true, _, _)).
match([a | Rest], state(_, StateA, _)) :- match(Rest, StateA).
match([b | Rest], state(_, _, StateB)) :- match(Rest, StateB).

Some example calls:

?- dfa_a_plus(S), match([a, a, a, a], S). % succeeds

?- dfa_a_plus(S), match([], S).
false.

?- dfa_a_plus(S), match([a, a, a, b], S).
false.

?- dfa_a_plus(S), match([b], S).
false.

Let's look at a more complicated DFA example, the one from the blog post:

dfa_blog_post(S1) :-
    S1 = state(false, S2, S3),
    S2 = state(false, S4, S3), % states S2 and S3 are equivalent
    S3 = state(false, S5, S3),
    S4 = state(true, S5, S4), % states S4 and S5 are equivalent
    S5 = state(true, S4, S4).

(there's a nice diagram in Phil's post.)

If we want to minimize it, we can assign a number to each unique state reachable from the starting state. We can use an association list, i.e. a list where the entries are terms State/Number. States that are observationally equivalent unify, so we can just use Prolog unification (via the "member" predicate which checks whether a term is an element of a list).

number_states(D, L) :-
    % number the states, the first number is 0
    number_states_helper(D, 0, _, [], LRev),
    % reverse the result, because it is built in reverse order
    reverse(LRev, L).

number_states_helper(none, Num, Num, L, L). % none doesn't need a number
number_states_helper(State, Num, Num, L, L) :-
    % if we already gave a number to the state, we are done.
    member(State/_, L).
number_states_helper(State, NumIn, NumOut, LIn, LOut) :-
    % if we've never seen the state, we assign it NumIn
    not(member(State/_, LIn)),
    L1 = [State/NumIn | LIn],
    State = state(_, NextStateA, NextStateB),
    succ(NumIn, Num1),
    number_states_helper(NextStateA, Num1, Num2, L1, L2),
    number_states_helper(NextStateB, Num2, NumOut, L2, LOut).

We can apply this predicate from the non-minimal DFA:

:- dfa_blog_post(D), number_states(D, L), length(L, Length), write(Length), nl, write(L), nl.
3
@([state(false,state(false,S_1,S_2),S_2)/0,state(false,S_1,S_2)/1,S_1/2],[S_1=state(true,state(true,S_1,S_1),S_1),S_2=state(false,state(true,S_1,S_1),S_2)])

The list L of numbered states is kind of hard to grasp due to the messy way that rational tree terms gets printed, but the important part is that we found out that three states is the minimal size.

After we numbered the states we can use that numbering to produce a graphviz graph for the DFA:

to_dot(D) :-
    number_states(D, L),
    write("digraph G {"), nl,
    write('start [label="", shape=none]'), nl,
    write("start -> 0"), nl,
    to_dot_list(L, L),
    write("}"), nl.

to_dot_list([], _).
to_dot_list([State/NumState | Rest], FullList) :-
    State = state(Accept, StateA, StateB),
    to_dot_node(Accept, NumState),
    to_dot_edge(StateA, NumState, FullList, a),
    to_dot_edge(StateB, NumState, FullList, b),
    to_dot_list(Rest, FullList).

to_dot_node(true, Num) :-
    write(Num), write(" [shape=doublecircle]"), nl.
to_dot_node(false, Num) :-
    write(Num), write(" [shape=circle]"), nl.

to_dot_edge(none, _, _, _).
to_dot_edge(State, NumPrev, FullList, Char) :-
    member(State/NumState, FullList),
    write(NumPrev), write(" -> "), write(NumState), write(" [label="), write(Char), write("]"), nl.

If we apply this to the example we get:

digraph G {
start [label="", shape=none]
start -> 0
0 [shape=circle]
0 -> 1 [label=a]
0 -> 1 [label=b]
1 [shape=circle]
1 -> 2 [label=a]
1 -> 1 [label=b]
2 [shape=doublecircle]
2 -> 2 [label=a]
2 -> 2 [label=b]
}

Which looks like this:

DFA vizualization