Dec
2008
Doesn’t that title just get the blood pumping? Proof verification has a reputation for being an inordinately academic subject. In fact, even within scholarly (otherwise known as “unrealistically intelligent“) circles, the automated verification of proofs is known mainly as a complex, ugly and difficult task often not worth the effort. This is a shame really, because rigorous proofs are at the very core of both mathematics and computer science. We are nothing without logic (paraphrased contrapositive from Descartes). Believe it or not, understanding basic proof techniques will be of tremendous aid to your cognitive process, even when working on slightly less ethereal problems such as how to get the freakin’ login page to work properly.
Well, if you made it all the way to the second paragraph, then you either believe me when I say that this is legitimately useful (and cool!) stuff, or you’re just plain bored. Either way, read on as we commence our exciting journey into the land of rigorous proofs!
SASyLF Crash Course
If you’re at all familiar with the somewhatspecialized field of proof verification, you probably know that SASyLF (pronounced “sassy elf”) is not the most widely used tool for the job. In fact, it may very well be the least wellknown. More commonly, proofs that require automatic verification are written in Twelf or Coq. Both of these are fine tools and capable of a lot more than SASyLF, but they can also be extremely difficult to use. One of the primary motivations behind SASyLF was to produce a tool which was easier to learn, had a higher level syntax (easier to read) and which gave more helpful error messages than Twelf. The main idea behind these convolutions was to produce a tool which was more suitable for use in the classroom.
The main design decision which sets SASyLF apart from Twelf is the way in which proofs are expressed. As I understand it, Twelf exploits CurryHoward correspondence to represent proofs implicitly in the types of a functional program (update: this is incorrect; see below). While this can be very powerful, it’s not the most intuitive way to think about a proof. Eschewing this approach, SASyLF expresses proofs using unification (very similar to Prolog) and defines inference rules explicitly in a naturallanguage style.
There are three main components to a SASyLF proof:
 Syntax
 Judgments
 Theorems/Lemmas
Intuitively enough, the syntax section is where we express the grammar for the language used throughout our proof. This grammar is expressed very naturally using BNF, just as if we were defining the language mathematically for a handwritten proof. Leftrecursion is allowed, as is rightrecursion, arbitrary symbols, ambiguity and so on. SASyLF’s parser is mind bogglingly powerful, capable of chewing threw just about any syntax you throw at it. The main restriction is that you cannot use parentheses, square brackets ([]
), pipes (
) or periods (.
) in your syntax. The pureuntyped lambda calculus defined in SASyLF would look something like this:
t ::= fn x => t[x]
 t t
 x

I said we couldn’t use brackets, but that’s only because SASyLF assigns some special magic to these operators. In a nutshell, they allow the above definition of lambda calculus to ignore all of the issues associated with variable name freshness and context. For simplicity’s sake, that’s about as far as I’m going to go into these mysterious little thingies.
The judgments section is where we define our inference rules. Just as if we were defining these rules by hand, the syntax has the conditionals above a line of hyphens with the conclusion below. The label for the rule goes to the right of the “line”. What could be more natural?
judgment eval: t > t
t1 > t1'
 EBeta1
t1 t2 > t1' t2

The judgment
syntax is what defines the syntax for the >
“operator”. Once SASyLF sees this, it knows that we may define rules of the form t > t
, where t
is defined by the syntax section. Further on down, SASyLF sees our EBeta1
rule. Each of the tokens within this rule (aside from >
) begins with “t
“. From this, SASyLF is able to infer that we mean “a term as defined previously”. Thus, this rule is syntactically valid according to our evaluation judgment and the syntax given above.
Of course, theorems are where you will find the real meat of any proof (I’m using the word “proof” very loosely to mean the collection of proven theorems and lemmas which indicates some fact(s) about a language). SASyLF wouldn’t be a very complete proof verification system without support for some form of proving. Once again, the syntax is extremely natural language, almost to the point of being overlyverbose. A simple theorem given the rules above plus a little would be to show that values cannot evaluate:
theorem evalvalueimpliescontradiction:
forall e: t > t'
forall v: t value
exists contradiction .
_: contradiction by unproved
end theorem

Note that contradiction
is not more SASyLF magic. We can actually define what it means to have a contradiction by adding the following lines to our judgment section:
judgment absurd: contradiction

In other words, we can have a contradiction, but there are no rules which allow us to get it. In fact, the only way to have a contradiction is to somehow get SASyLF to the point where it sees that there are no cases which satisfy some set of proven facts (given the forall
assumptions). If SASyLF cannot find any cases to satisfy some rules, it allows us to derive anything at all, including judgments which have no corresponding rules.
Readers who have yet to fall asleep will notice that I cleverly elided a portion of the “theorem
” code snippet. That’s because there wasn’t really a way to prove that contradiction given the drastically abbreviated rules given in earlier samples. Instead of proving anything, I used a special SASyLF justification, unproved
, which allows the derivation of any fact given no input (very useful for testing incomplete proofs). Lambda calculus isn’t much more complicated than what I showed, but it does require more than just an application context rule in its evaluation semantics. In order to get a taste for SASyLF’s proof syntax, we’re going to need to look at a much simpler language.
Case Study: Integer Comparison
For this case study, we’re going to be working with simple counting numbers which start with 0
and then proceed upwards, each value expressed as the successor of its previous value. Thus, the logical number 3 would be s s s 0
. Not a very useful language in the real world, but much easier to deal with in the field of proof verification. The syntax for our natural numbers looks like this:
n ::= 0
 s n

With this humble definition for n
, we can go on to define the mathematical greaterthan comparison using two rules under a single judgment:
judgment gt: n > n
 gtone
s n > n
n1 > n2
 gtmore
s n1 > n2

Believe it or not, this is all we need to do in terms of definition. The first rule says that the successor of any number is greater than that same number (3 > 2). The second rule states that if we already have two numbers, one greater than the other (12 > 4), then the successor of the greater number will still be greater than the lesser (13 > 4). All very intuitive, but the real question is whether or not we can prove anything with these definitions.
An Easy Lemma
For openers, we can try something reasonably simple: prove that all nonzero numbers are greater than zero. This is such a simple proof that we won’t even bother calling it a theorem, we will give it the lesser rank of “lemma”:
lemma allgtzero:
forall n
exists s n > 0 .
_: s n > 0 by induction on n:
case 0 is
_: s 0 > 0 by rule gtone
end case
case s n1 is
g: s n1 > 0 by induction hypothesis on n1
_: s s n1 > 0 by rule gtmore on g
end case
end induction
end lemma

In order to prove anything about n
, we first need to “pull it apart” and find out what it’s made of. To do that, we’re going to use induction
. We could also use case analysis
, but that would only work if our proof didn’t require “recursion” (we’ll get to this in a minute). There are two cases as given by the syntax for n
: when n
is “0
“, and when n
is “s n1
“, where n1
is some other number. We must prove that s n > 0
for both of these cases individually, otherwise our proof is not valid.
The first case is easy. When n
is 0, the proof is trivial using the rule gtone
. Notice that within this case we are no longer proving s n > 0
, but rather s 0 > 0
. This is the huge win brought by SASyLF’s unification: n
is “0
” within this case. Anything we already know about n
, we also know about 0
. When we apply the rule gtone
, SASyLF sees that we are attempting to prove s n > n
where n
is “0
“. This is valid by the rule, so the verification passes.
The second case is where things get interesting. We have that n
is actually s n1
, but that doesn’t really get us too much closer to proving s s n1 > 0
(remember, unification). Fortunately, we can prove that s n1 > 0
because we’re writing a lemma at this very moment which prove that. This is like writing a function to sum all the values in a list: when the list is empty, the result is trivial; but when the list has contents, we must take the head and then add it to the sum of the tail as computed by…ourself. Induction is literally just recursion in logic. Interestingly enough, SASyLF is smart enough to look at all of the inductive cases in your proof and verify that they are valid. This is sortof the equivalent of a compiler looking at your code and telling you whether or not it will lead to an infinite loop.
To get that s n1 > 0
, we use the induction hypothesis, passing n1
as the “parameter”. However, we’re not quite done yet. We need to prove that s s n1 > 0
in order to unify with our original target (s n > 0
). Fortunately, we already have a rule that allows us to prove the successor of a number retains its greaterthan status: gtmore
.
However, gtmore
has a condition in our definition. It requires that we already have some fact n1 > n2
in order to obtain s n1 > n2
. In our case, we already have this fact (s n1 > 0
), but we need to “pass” it to the rule. SASyLF allows us to do this by giving our facts labels. In this case, we have labeled the s n1 > 0
fact as “g
“. We take this fact, pack it up and send it to gtmore
and it gives us back our final goal.
A Slightly Harder Theorem
A slightly more difficult task would be to prove that the successors of two numbers preserves their greaterthan relationship. Thus, if we know that 4 > 3, we can prove that 5 > 4. More formally:
theorem gtimpliesgtsucc:
forall g: n1 > n2
exists s n1 > s n2 .
_: s n1 > s n2 by unproved
end theorem

At first glance, this looks impossible since we don’t really have a rule dealing with s n
on the righthand side of the >
sign. We can try to prove this one step at a time to see whether or not this intuition is correct.
Almost any lemma of interest is going to require induction, so immediately we jump to inducting on the only fact we have available: g
. Note that this is different from what we had in the earlier example. Instead of getting the different syntactic cases, we’re looking at the the rules which would have allowed the input to be constructed. After all, whoever “called” our theorem will have needed to somehow prove that n1 > n2
, it would be helpful to know what facts they used to do that. SASyLF allows this using the case rule
syntax. We start with the easy base case:
_: s n1 > s n2 by induction on g:
case rule
 gtone
_: s n2 > n2
is
_: s s n2 > s n2 by rule gtone
end case
end induction

In this case, the term _: s n2 > n2
is unified with n1 > n2
. Thus, n1
is actually “s n2
“. This means that by unification, we are actually trying to prove s s n2 > s n2
. Fortunately, we have a rule for that. If we let “n
” be “s n2
“, we can easily apply the rule gtone
to produce the desired result.
The second case is a bit trickier. We start out by defining the case rule according to the inference rules given in the judgment section. The only case left is gtmore
, so we mindlessly copy/paste and correct the variables to suit our needs:
case rule
g1: n11 > n2
 gtmore
_: s n11 > n2
is
_: s s n11 > s n2 by unproved
end case

In this case, n1
actually unifies with “s n11
“. This is probably the most annoying aspect of SASyLF: all of the syntax is determined by token prefix, so every number has to start with n
, occasionally making proofs a little difficult to follow.
At this point, we need to derive s s n11 > s n2
. Since the left and right side of the >
“operator” do not share a common subterm, the only rule which could possibly help us is gtmore
. In order to apply this rule, we will somehow need to derive s n11 > s n2
(remember, gtmore
takes a known greaterthan relationship and then tells us something about how the leftsuccessor relates to the right). We can reflect this “bottomup” step towards a proof in the following way:
case rule
g1: n11 > n2
 gtmore
_: s n11 > n2
is
g: s n11 > s n2 by unproved
_: s s n11 > s n2 by rule gtmore on g
end case

At this point, SASyLF will warn us about the unproved
, but it will happily pass the rest of our theorem. This technique for proof development is extremely handy in more complicated theorems. The ability to find out whether or not your logic is sound even before it is complete can be very reassuring (in this way you can avoid chasing entirely down the wrong logical path).
In order to make this whole thing work, we need to somehow prove s n11 > s n2
. Fortunately, we just so happen to be working on a theorem which could prove this if we could supply n11 > n2
. This fact is conveniently available with the label of “g1
“. We feed this into the induction hypothesis
to achieve our goal. The final theorem looks like this:
theorem gtimpliesgtsucc:
forall g: n1 > n2
exists s n1 > s n2 .
_: s n1 > s n2 by induction on g:
case rule
 gtone
_: s n2 > n2
is
_: s s n2 > s n2 by rule gtone
end case
case rule
g1: n11 > n2
 gtmore
_: s n11 > n2
is
g2: s n11 > s n2 by induction hypothesis on g1
_: s s n11 > s n2 by rule gtmore on g2
end case
end induction
end theorem

Conclusion
I realize this was a bit of a deviation from my normal semipractical posts, but I think it was still a journey well worth taking. If you’re working as a serious developer in this industry, I strongly suggest that you find yourself a good formal language and/or type theory textbook (might I recommend?) and follow it through the best that you can. The understanding of how languages are formally constructed and the mental circuits to create those proofs yourself will have a surprisingly powerful impact on your daytoday programming. Knowing how the properties of a language are proven provides tremendous illumination into why that language is the way it is and somtimes how it can be made better.
Credit: Examples in this post drawn rather unimaginatively from Dr. John Boyland’s excellent course in type theory.
 SASyLF Main Page at Carnegie Mellon University
 Download math.slf (the full greaterthan example)
 Further reading…
Comments
Alloy is another interesting software analysis/modeling tool for reasoning about systems: http://alloy.mit.edu/alloy4/tutorial4/
Very interesting! It looks like Alloy is more geared toward program proving than SASyLF. On the inverse, it doesn’t look like I could prove the soundness of a type system in Alloy. Different tools for different jobs I suppose.
@Daniel
Alloy works off structure — i.e. you define sigs (signatures) which loosely correspond to classes. The model checker (finder?) will then look for solutions to your logical specification. If it doesn’t find any, you have only really verified your logic within the specified state space. One of the key things it doesn’t have is recursive functions — you have to use recursive structures for that.
As for proving the soundness of a type system — i’m not so sure. It can be used for large and complex examples if needed. I used Alloy to formally specify a fairly large and intense component system with full type inference: http://www.doc.ic.ac.uk/~amcveigh/papers/transfer/alloy.pdf
Cheers,
Andrew
@Andrew
Cool! I don’t really understand everything that’s going on in the linked Alloy proof, but to prove a system with full type inference is an impressive feat (I assume you mean “inference” and not “reconstruction”?). SASyLF is nice, but things get horribly ugly if you try to prove the soundness of things involving unification at the type level. For example, you could probably prove the soundness of a Prolog type system (if it had one) in SASyLF, but it would be nasty in the extreme. Alloy looks like it may not suffer from that shortcoming.
@Daniel
“I assume you mean “inference” and not “reconstruction”?”
It’s simpler really than what you mean by both. My system is a hierarchical component model (i.e. a component contains instances of other components wired in using connectors). My “inference” allows the required and provided interfaces of ports of composite components (made up of other components) to be determined automatically. You can also represent relationships between ports on implementation components which bear on the composite case.
Whilst not trivial, it’s a long way from specifying something like the HindleyMilner type equations.
Andrew
Cool! (Both SASyLF and Alloy.) What I don’t understand is why do you need “exists” in all the theorems and lemmas you present and why couldn’t you just say something like
theorem gtimpliesgtsucc:
forall g: n1 > n2 . s n1 > s n2
?
> What I don’t understand is why do you need “exists” in all
> the theorems and lemmas
:S Like many things in SASyLF, it’s there to bring the syntax more in line with naturallanguage. I didn’t show it, but if you wanted to prove something using a theorem or lemma, the syntax looks like this:
In more complicated proofs, things get very, *very* verbose.
Actually, I’d make a distinction between natural language and logical language. SASyLF looks like it’s trying to approach the logical language and succeeds.
There are just two things that look wrong from your examples:
1) Having to explicitly say “_:” for “I won’t name this”;
2) This extra “exists”.
Anyway, I clearly need to read the papers
Nice post. I’m glad to see SASyLF is being appreciated. Regarding Twelf: you’re right to have a disclaimer. Twelf works with (higherorder!) unification and types for logic programs (not functional programs). Twelf is closer to Prolog than even SASyLF. On the other hand, you’re absolutely right that SASyLF is massively more readable.
Regarding alloy: I think (and now I’m out of my depth) that Alloy can only declare the absence of counterexamples up to a particular size. So it would be less useful for verifying type systems.
Regarding forall vs exists: “forall” are the “inputs” to the theorem, and “exists” are the outputs. Hence the extra keyword/marker.
@Dr. Boyland
Thanks for the clarification on Twelf! I had planned on rereading your introduction to the system but never got around to it. I’ll insert a note in the article to avoid leading too many people astray.
Regarding forall/exists: it occurs to me that I never showed an example with multiple forall(s). This makes the motivation for exists a little bit more apparent. A simple example would be the header for the preservation theorem for the simplytyped lambda calculus:
Post a Comment