# A tiny theorem prover At work we have many great (and some quite funny and not so great) old books, and when I came across this LISP book, I had to borrow it. (At some point there must have been a real library, the books have numbering, and an old fashioned book card at the back, with date and name of previous borrowers. So maybe I didn’t borrow the book properly, I just took it off the shelf). The book has many great chapters, for instance “Symbolic pattern matching and simple theorem proving”, “Program writing programs and natural language interfaces” and “LISP in LISP”. Here is a little theorem prover from chapter “Symbolic pattern matching and simple theorem proving” in the book, based on the resolution rule, and rewritten in Racket.

The resolution rule produces a new clause if two clauses contains complementary literals, i.e, one clause contains and the other contains . The simplest case is when we have the clauses and . Since and cannot both be true at the same time we know that either or has to be true, hence .

(If you’re not familiar with logic operands and terms, here is the shortest intro, from a non-expert: the sign is negation, and a literal is either a letter or the negation of a letter, and a letter is a variable that can be either true or false. The operators is “or” (disjunction) and is “and” (conjuction), and a clause is a finite disjuction of literal, i.e., letters or negation of letters connected with or. In the fraction above, the clauses above the line are arguments, and the clause below the line is the conclusion, and it follows/can be deduced from the arguments. Hence, the conclusion is true if all the arguments are true)

The logic operators and the literals are symbols/quotes in the code (since we’re not in the “Lisp in Lisp”-chapter of the book), and the following resolve-function is my Racket version of the equivalent function in Common Lisp from the book. The code in the book uses SETQ to update variables, and GO for jumping in code, a lot, but I’m using let-expressions and recursing instead.
The resolve function uses two helper functions; invert puts the symbol ‘not in front of a literal, or removes the ‘not if the literal is a negation, and the combine function is a cleanup function that adds a literal to a list of literals if itself or the negation is not in the list already, and removes duplicates.

(define (resolve x y)
(letrec ([res (λ (rest-x rest-y)
(cond
[(null? rest-x) 'no-resolvent]
[(member? (invert (car rest-x)) rest-y) (cons 'or (combine (car rest-x) (append (cdr x) (cdr y))))]
[else (res (cdr rest-x) rest-y)]))])
(res (cdr x) (cdr y))))


And some tests to show how the function works:

(check-equal? (resolve '(or P) '(or Q)) 'no-resolvent)
(check-equal? (resolve '(or P) '(or (not P))) '(or))
(check-equal? (resolve '(or P (not Q)) '(or Q (not R))) '(or P (not R)))


So how can we use the resolution rule in a proof?

A conjecture is proved if the premise implies the conclusion, hence a conjecture is false if there is a combination of literals that makes the premise true and the conclusion false at the same time. Which again is equivalent to say that the conjecture is false if there is a combination of literals that makes the premise and the negation of the conclusion true at the same time.

We require both the premise and the negation of the conclusion to be on conjunctive normal form, which means that they both are of the form of clauses connected with “and”s. So for the premise and the negation of the conclusion to be true at the same time all the clauses have to be true.

If then the combined clauses contain both a literal and its negation then it is impossible for all the clauses to be true at the same time, hence the conjecture is not false, it is true. And this is the key idea of the proof algorithm; to loop through all clauses, and use the resolution rule one each pair of clauses. If the resolvent is empty, we know that the teorem is true, if we get a non-empty resolvent back, it is added to the list of clauses, and the resolvent is again used in the resolution rule with each of the other clauses in the list. If we have looped all the clauses and not have met an empty resolvent then the conjecture is false.

(define (prove premise negation)
(letrec ([try-next-remainder (λ (remainder clauses)
(cond [(null? remainder)
(displayln '(theorem not proved))
#f]
[else (try-next-resolvent (car remainder) (cdr remainder) remainder clauses)]))]
[try-next-resolvent (λ (first rest remainder clauses)
(if (null? rest) (try-next-remainder (cdr remainder) clauses)
(let ([resolvent (resolve first (car rest))])
(cond
[(or (equal? resolvent 'no-resolvent)
(member? resolvent clauses)) (try-next-resolvent first (cdr rest) remainder clauses)]
[(null? (cdr resolvent)) (display-clauses first (car rest))
(displayln '(produce the empty resolvent))
(displayln '(theorem proved))
#t]
[else (display-clauses first (car rest))
(displayln (append '(produce a resolvent:) (list resolvent)))
(try-next-remainder (cons resolvent clauses) (cons resolvent clauses))]))))]
[clauses (append (cdr premise) (cdr negation))])
(try-next-remainder clauses clauses)))

(test-case
"prove true"
(let ([premise '(and (or Q (not P))
(or R (not Q))
(or S (not R))
(or (not U) (not S)))]
[negation '(and (or P) (or U))])
(check-true (prove premise negation))))


The complete code, with tests, can be found here.