Abstract. The purpose of this note is to show an example development of a program, introducing definitions that support the specification of the program, stating and proving lemmas about those definitions, and using the lemmas in proving the correctness of the program. Euclid's subtractive algorithm for computing the greatest common divisor is used as the example.
Let's specify and verify an algorithm to compute the greatest common divisor (GCD) of two numbers. For the specification, we will introduce a function whose definition is intended to be “obviously correct”. We won't use that function to compute the GCD, because the “obviously correct” definition would give really inefficient code if compiled directly. Instead, we'll use Euclid's algorithm for computing the value that the “obviously correct” function defines. We'll prove that the algorithm does compute that value.
In essence, we'll have
function Gcd(x: pos, y: pos): pos
method EuclidGcd(x: pos, y: pos) returns (gcd: pos)
ensures gcd == Gcd(x, y)
where pos
denotes the type of positive integers.
Everything we do will concern (strictly) positive integers. Dafny builds in a type for natural numbers (that is, non-negative integers), but not positive integers. We can define these using a subset type in Dafny:
type pos = x | 1 <= x // error: cannot find witness to show type is inhabited
Dafny wants to know if this type has any inhabitants, and it doesn't figure that out by
itself. This doesn't matter for our example, but we do need to address the error we're
getting. To do that, we supply a witness
clause:
type pos = x | 1 <= x witness 1
If we really didn't care to exhibit a witness that shows the type to be nonempty, we could
have written witness *
, which causes Dafny to treat the type pos
as possibly empty.
For our example, you can do either, but since it's easy to supply an actual witness, we
do that.
In the sequel, I will just say number when I mean positive integer.
The divisors of a number are its factors. We define a predicate that says what it means
for a number p
to be a factor of a number x
:
predicate IsFactor(p: pos, x: pos) {
exists q :: p * q == x
}
In words, p
is a factor of x
iff there is a multiplicand q
such that x
is the
product p * q
.
To talk about all the factors of a number, we introduce a function Factors
that we define using a set comprehension. A straightforward definition would be:
function Factors(x: pos): set<pos> {
set p: pos | IsFactor(p, x) // error: set constructed must be finite
}
A set
in Dafny denotes a finite set (for possibly infinite sets, use iset
).
In this case, Dafny doesn't immediately see that the comprehension would generate
a finite set. Luckily, it is simple for us to add another conjunct to the comprehension
that lets Dafny see that the set is finite:
function Factors(x: pos): set<pos> {
set p: pos | p <= x && IsFactor(p, x)
}
In adding this conjunct, there's a risk we're making a mistake, because perhaps
the new set doesn't include all the factors we'd like. Our conjunct p <= x
certainly looks innocent enough, but why not prove that adding it does not accidentally
leave out any factors. We can do that by proving that this set has the same elements
as the possibly infinite set:
lemma FactorsHasAllFactors(x: pos)
ensures forall n :: n in Factors(x) <==> n in iset p: pos | IsFactor(p, x)
{
}
The proof of a lemma is given in the lemma's body (that is, between the pair of curly braces that follow the lemma's specification). In this case, the proof is empty, because Dafny proves the lemma automatically without any further help from us.
Before leaving the definition of factors, let's state and prove two simple lemmas. These lemmas act as sanity checks on our definitions, and they will also be helpful later in our development.
lemma FactorsContains1(x: pos)
ensures 1 in Factors(x)
{
assert 1 * x == x;
}
lemma FactorsContainsSelf(x: pos)
ensures x in Factors(x)
{
assert x * 1 == x;
}
To prove that a number n
(here, 1
or x
) is in the set Factors(x)
, we need
to establish that n
satisfies the condition in the set comprehension
(in the body of Factors(x)
). The conjunct n <= x
is proved automatically,
but the conjunct IsFactor(n, x)
is not. By the definition of IsFactor
, we need
to prove the existence of a multiplicand q
for which n * q == x
. Such a proof typically
involves demonstrating a witness, which is what the assert
statements in
the two lemmas above do. From those assertions, the verifier completes the proofs
of the lemmas.
To talk about the greatest common divisor, we need a function that picks out
the largest number in a set. A somewhat declarative way to do that is to use
the such-that construct. In particular, for a set s
, the let-such-that expression
var x :| x in s && forall y :: y in s ==> y <= x;
x
says to bind x
to a value satisfying the condition x in s && forall y :: y in s ==> y <= x
,
and then return the value of the expression x
. The condition says that x
is in the set s
, and that, among all the numbers in s
, x
is the largest.
Use of a such-that construct comes with a proof obligation that a value satisfying
the given condition exists. If we require s
to be nonempty, then the x in s
condition is easily satisfied, but it takes more work to convince the verifier that
a value for x
satisfies the quantifier. We'll define a lemma for that purpose.
We'll name the lemma MaxExists
and then we can write our function Max
as
follows:
function Max(s: set<pos>): pos
requires s != {}
{
MaxExists(s);
var x :| x in s && forall y :: y in s ==> y <= x;
x
}
lemma MaxExists(s: set<pos>)
requires s != {}
ensures exists x :: x in s && forall y :: y in s ==> y <= x
Dafny uses the lemma invocation MaxExists(s)
in establishing the well-formedness
of the subsequent expression. Note, by the way, that Max
(and lemma MaxExists
, too)
has a precondition s != {}
(keyword requires
). This means that the function
(and the lemma, too) can only be called for a nonempty set.
Alright, so then how do we prove MaxExists
? The most straightforward way to prove
the existence of such an x
is to compute an x
satisfying the desired properties.
We'll introduce another function for computing the max, call it FindMax
, and
use it in the proof of the MaxExists
lemma. Function FindMax
will be implemented
recursively.
lemma MaxExists(s: set<pos>)
requires s != {}
ensures exists x :: x in s && forall y :: y in s ==> y <= x
{
var x := FindMax(s);
}
function FindMax(s: set<pos>): pos
requires s != {}
ensures max in s && forall y :: y in s ==> y <= FindMax(s)
Aren't we going in circles now? Yes, in some ways we're making life more difficult
than necessary. If we have FindMax
, we don't need Max
, and then we also don't
need lemma MaxExists
. Indeed, we could have written and used just FindMax
and
never introduced Max
or MaxExists
. But for this example, I wanted the primary
definitions to be as clear as possible without concern for how things are computed.
In that sense, the body of Max
is more declarative than the body we are about to
write for FindMax
.
Here is the full definition of FindMax
:
function FindMax(s: set<pos>): (max: pos)
requires s != {}
ensures max in s && forall y :: y in s ==> y <= max
{
var x :| x in s;
if s == {x} then
x
else
var s' := s - {x};
assert s == s' + {x};
var y := FindMax(s');
if x < y then y else x
}
When a function postcondition wants to mention the result value of the function,
you can just use the function itself, with the arguments given: FindMax(s)
.
I did this when I first introduced FindMax
above. In the full definition, I
show an alternative way of doing this, which is to introduce a name for the result
value: max
. That name is usable only in the postcondition of the function.
Many times, introducing such a name for the result leads to a shorter
specification.
With the functions we defined, we're now ready to define GCD. Take the factors
of x
and the factors of y
, intersect them to get their common factors, and then
take the maximum thereof:
function Gcd(x: pos, y: pos): pos {
var common := Factors(x) * Factors(y);
Max(common) // error: common must be nonempty
}
For this simple definition, the verifier reports a precondition violation, because
it's unable to prove that common
satisfies the precondition of Max
. We know
that common
is nonempty, because we know that 1
is a common factor of any two
numbers x
and y
. To bring that information to the verifier's attention, we
write an assertion:
function Gcd(x: pos, y: pos): pos {
var common := Factors(x) * Factors(y);
assert 1 in common; // error: assertion violation
Max(common)
}
Alas, the verifier is not able to prove this assertion. But we can see that the
presence of the assertion is enough to eliminate the precondition violation. So,
we now focus on proving the assertion. This is where we use the FactorsContains1
lemma we introduced earlier. Two calls to that lemma will prove the assertion,
which is best captured in the program text by changing the assert
to an
assert by
and giving the proof of the assertion in the by
block:
function Gcd(x: pos, y: pos): pos {
var common := Factors(x) * Factors(y);
assert 1 in common by {
FactorsContains1(x);
FactorsContains1(y);
}
Max(common)
}
That does it! We have now given a well-formed definition of Gcd
.
We'll prove three properties of our Gcd
function—call them sanity checks, if
you will. (We'll need a fourth property as well, but I'll introduce it later.)
As a first sanity check, we expect Gcd(x, y)
to return a number that is a
factor of both x
and y
. Furthermore, among all the numbers that are factors
of both x
and y
, what Gcd(x, y)
returns should be the largest.
lemma AboutGcd(x: pos, y: pos)
ensures IsFactor(Gcd(x, y), x)
ensures IsFactor(Gcd(x, y), y)
ensures forall p: pos :: IsFactor(p, x) && IsFactor(p, y) ==> p <= Gcd(x, y)
The first two postconditions of this lemma are proved automatically, but not
the third. How do we go about proving that a universal quantifier (that is,
a forall
expression) holds? We use Dafny's forall
statement. When
used in a proof, the forall
statement corresponds to the “universal introduction”
rule in logic. This is the rule that says "if you want to prove forall x :: P(x)
,
then all you need to do is pick an arbitrary x
and prove P(x)
for that x
.
We introduce the forall
statement like this:
forall p: pos | IsFactor(p, x) && IsFactor(p, y)
ensures p <= Gcd(x, y)
To prove it, we only need to bring up the fact that p
, which is a factor of
both x
and y
, is in the intersection of factors of x
and y
. The verifier
is then able to complete the proof.
lemma AboutGcd(x: pos, y: pos)
ensures IsFactor(Gcd(x, y), x)
ensures IsFactor(Gcd(x, y), y)
ensures forall p: pos :: IsFactor(p, x) && IsFactor(p, y) ==> p <= Gcd(x, y)
{
forall p: pos | IsFactor(p, x) && IsFactor(p, y)
ensures p <= Gcd(x, y)
{
assert p in Factors(x) * Factors(y);
}
}
The Dafny verifier often needs help with properties like this. To prove them, just write them as an assertion. In other words, the verifier knows this property about set intersection, but it isn't creative enough the bring that property into the proof. By asserting the property, we're asking the verifier to confirm the property (which it's able to do) and then to use that property in the rest of the proof (which in this case completes the proof).
As a second sanity check, we prove that Gcd
is symmetric.
lemma GcdSymmetric(x: pos, y: pos)
ensures Gcd(x, y) == Gcd(y, x)
{
assert Factors(x) * Factors(y) == Factors(y) * Factors(x);
}
The proof comes down to the fact that set intersection is symmetric, which we bring to the verifier's attention by writing it as a lemma.
As a third sanity check, we prove that Gcd
is idempotent. That is, if you
give it the same argument twice, it will return that argument.
lemma GcdIdempotent(x: pos)
ensures Gcd(x, x) == x
{
FactorsContainsSelf(x);
assert x in Factors(x) * Factors(x);
}
The proof of this property comes down to the fact that set intersection is idempotent, as well as the property that a number is one of its own factors.
Euclid's subtractive algorithm for finding the GCD of two numbers is to
repeatedly subtract the smaller of the numbers from the larger until they are
both equal. Each such subtraction preserves the GCD—an invariant that we
will need to prove—and the GCD of two equal numbers is that number—which
we established by lemma GcdIdempotent
above.
The algorithm, with the loop invariant and idempotence lemma, thus looks like this:
method EuclidGcd(X: pos, Y: pos) returns (gcd: pos)
ensures gcd == Gcd(X, Y)
{
var x, y := X, Y;
while
invariant Gcd(x, y) == Gcd(X, Y) // error: invariant not maintained
decreases x + y
{
case x < y =>
y := y - x;
case y < x =>
x := x - y;
}
GcdIdempotent(x);
return x;
}
This method uses a while
-case
loop. (If you're familiar with Dijkstra's
guarded commands [1], this is the do-od loop.) Each
iteration of this loop chooses one of the case
s to execute. The case
chosen
must be one whose guard condition evaluates to true
(and if the guards of
several case
s evaluate to true
, the loop chooses arbitrarily between those
case
s) . If no such guard condition evaluates to true
, then the loop stops
iterating. The loop in EuclidGcd
could of course be an ordinary
while x != y
loop, but the symmetry of the two cases afforded by the
while
-case
loop makes it aesthetically pleasing.
In addition to a loop invariant, the loop also declares a termination metric
(keyword decreases
). Proving that the loop terminates comes down to proving
that each iteration makes the value of the termination metric decrease (in Dafny's
built-in well-founded order on integers).
The EuclidGcd
method above does not verify, because the verifier is unable
to prove that each iteration maintains the loop invariant. For this, we need
the fourth property of GCD that I alluded to above:
lemma GcdSubtract(x: pos, y: pos)
requires x < y
ensures Gcd(x, y) == Gcd(x, y - x)
Using this lemma and the symmetry of GCD, we can complete the proof of Gcd
:
method EuclidGcd(X: pos, Y: pos) returns (gcd: pos)
ensures gcd == Gcd(X, Y)
{
var x, y := X, Y;
while
invariant Gcd(x, y) == Gcd(X, Y)
decreases x + y
{
case x < y =>
GcdSubtract(x, y);
y := y - x;
case y < x =>
calc {
Gcd(x, y);
== { GcdSymmetric(x, y); }
Gcd(y, x);
== { GcdSubtract(y, x); }
Gcd(y, x - y);
== { GcdSymmetric(y, x - y); }
Gcd(x - y, y);
}
x := x - y;
}
GcdIdempotent(x);
return x;
}
This version adds a call to GcdSubtract
in the first branch of the loop. In the
second branch of the loop, the proof calculation uses equality-preserving steps
to transform the expression Gcd(x, y)
into Gcd(x - y, y)
. The hints given
in the steps appeal to the GcdSubtract
and GcdSymmetric
lemmas.
The proof of GcdSubtract
is more involved than any of the other definitions
and lemmas in this case study.
The proof starts by introducing a name for Gcd(x, y)
:
var p := Gcd(x, y);
We know from the definition of Gcd
that p
is a factor of both x
and y
,
and we can prove that p
is also a factor of y - x
:
assert IsFactor(p, y - x) by {
var a :| p * a == x;
var b :| p * b == y;
calc {
y - x;
==
p * b - p * a;
==
p * (b - a);
}
}
To prove IsFactor(p, y - x)
, we introduce names a
and b
for the multiplicands
that the definition of IsFactor
tells us exist (since p
is a factor of
both x
and y
). A simple calculation using basic arithmetic steps then gives
us that p
can be multiplied another number (namely, b - a
) to get y - x
.
Since p
is a factor of both x
and y - x
, we have that it's in the
common factors of x
and y - x
. We write two lines to make sure the verifier
is on board with this property, phrased in terms of set intersection:
var common := Factors(x) * Factors(y - x);
assert p in common;
Lastly, we need to show that p
is the largest such common factor. We state
this property using a forall
statement:
forall q | q in common
ensures q <= p
To prove this property, we fill in the body of the forall
statement. For
q
, which denotes an arbitrary number in the set common
, we give names to
the multiplicands that yield the products x
and y - x
, respectively:
{
var a :| q * a == x;
var b :| q * b == y - x;
Using simple arithmetic steps, we can use a proof calculation to establish
that q
is also a factor of y
:
assert IsFactor(q, y) by {
calc {
y;
==
x + (y - x);
==
q * a + q * b;
==
q * (a + b);
}
}
So, since q
is a factor of both x
and y
, the definition of Gcd(x, y)
tells us q <= Gcd(x, y)
. By giving yet another hint about set intersection:
assert q in Factors(x) * Factors(y);
}
the verifier completes the proof.
While we now have a full proof of the GCD algorithm, your aesthetic sense may be
bothered by the asymmetry in how we supplied the proofs in the two cases of the loop.
Since the while
-case
loop affords us a symmetric rendition of the two cases,
it would be nice if we could make the proofs of the two cases more similar as well.
There are several ways we can improve on this situation. One is to refactor the
proof calculation of the second case
into its own lemma. Then, then each case
would have one line of proof.
Just for fun, let me describe another “trick” to make the two cases (not entirely
symmetric, but at least) more similar.
The trick is to make the (already asymmetric) GcdSubtract
lemma also swap the
arguments to Gcd
. We rewrite it into:
lemma GcdSubtract(x: pos, y: pos)
requires x < y
ensures Gcd(y, x) == Gcd(x, y - x)
{
GcdSymmetric(x, y);
// ... the proof continues as before
}
Note that the left-hand side of the postcondition is now Gcd(y, x)
, not Gcd(x, y)
as it had been in our first version of this lemma. The only change required for
the proof is to appeal to the symmetry of Gcd
, which we can do by one lemma
call immediately inside the body of the lemma. This gives us a proof of our
reformulated GcdSubtract
lemma.
With this reformulation, we can simplify the second case
of EuclidGcd
, at the
expense of making the first case
more complicated. Essentially, we're moving
one lemma call from the second case
to the first, so instead of having 1 and 3
lemma calls in the two respective case
s, we'll have 2 and 2.
case x < y =>
GcdSubtract(x, y);
GcdSymmetric(y, x);
y := y - x;
case y < x =>
GcdSymmetric(x - y, y);
GcdSubtract(y, x);
x := x - y;
It's not entirely symmetric, but perhaps you still like it. Or perhaps you'll remember this trick for another situation where the shoe fits perfectly. If nothing else, you can stick with the first complete proof we developed above.
If the proof itself doesn't satisfy you and still want to see the algorithm
in action, you can write a Main
method and compile and run the program.
(A simple way of doing that from the command line is to use the /compile:3
option with the dafny
tool. It will verify and then run the program.)
Here is a sample Main
:
method Main() {
Test(15, 9);
Test(14, 22);
Test(371, 1);
Test(1, 2);
Test(1, 1);
Test(13, 13);
Test(60, 60);
}
method Test(x: pos, y: pos) {
var gcd := EuclidGcd(x, y);
print x, " gcd ", y, " = ", gcd, "\n";
}
This case study shows how to define a domain of interest (here, factors of numbers, leading up to the definition of GCD), state and prove some lemmas about those definitions, and then use these in the proof of a small program.
The program, including all lemmas and other proof obligations associated with the definitions, takes the Dafny verifier less than 3 seconds to verify. You can find the entire program in the Dafny test suite [3].
Euclid's GCD algorithm is a familiar textbook example. It's proved in different forms in various verifiers. For example, the TLA+ tutorial uses this program as an example [0]. It assumes the mathematical properties of GCD that we proved here. The gallery of Why3 programs contains a version of Euclid's GCD algorithm that uses modulo instead of subtraction with each step, which results in fewer iterations [2].
I thank Reto Kramer for suggesting this problem as a useful case study.