Dafny Power User:

Case study of definitions, proofs, algorithm correctness: GCD

Case study of definitions, proofs, algorithm correctness: GCD

**Abstract.** The purpose of this note is to show an example development of a program, introducing
definitions that support the specification of the program, stating and proving
lemmas about those definitions, and using the lemmas in proving the correctness of
the program. Euclid's subtractive algorithm for computing the greatest common divisor
is used as the example.

Let's specify and verify an algorithm to compute the greatest common divisor (GCD) of two numbers. For the specification, we will introduce a function whose definition is intended to be “obviously correct”. We won't use that function to compute the GCD, because the “obviously correct” definition would give really inefficient code if compiled directly. Instead, we'll use Euclid's algorithm for computing the value that the “obviously correct” function defines. We'll prove that the algorithm does compute that value.

In essence, we'll have

```
function Gcd(x: pos, y: pos): pos
method EuclidGcd(x: pos, y: pos) returns (gcd: pos)
ensures gcd == Gcd(x, y)
```

where `pos`

denotes the type of positive integers.

Everything we do will concern (strictly) positive integers. Dafny builds in a type for natural
numbers (that is, non-negative integers), but not positive integers. We can define
these using a *subset type* in Dafny:

`type pos = x | 1 <= x // error: cannot find witness to show type is inhabited`

Dafny wants to know if this type has any inhabitants, and it doesn't figure that out by
itself. This doesn't matter for our example, but we do need to address the error we're
getting. To do that, we supply a `witness`

clause:

`type pos = x | 1 <= x witness 1`

If we really didn't care to exhibit a witness that shows the type to be nonempty, we could
have written `witness *`

, which causes Dafny to treat the type `pos`

as possibly empty.
For our example, you can do either, but since it's easy to supply an actual witness, we
do that.

In the sequel, I will just say *number* when I mean positive integer.

The *divisors* of a number are its factors. We define a predicate that says what it means
for a number `p`

to be a factor of a number `x`

:

```
predicate IsFactor(p: pos, x: pos) {
exists q :: p * q == x
}
```

In words, `p`

is a factor of `x`

iff there is a multiplicand `q`

such that `x`

is the
product `p * q`

.

To talk about *all* the factors of a number, we introduce a function `Factors`

that we define using a set comprehension. A straightforward definition would be:

```
function Factors(x: pos): set<pos> {
set p: pos | IsFactor(p, x) // error: set constructed must be finite
}
```

A `set`

in Dafny denotes a *finite* set (for possibly infinite sets, use `iset`

).
In this case, Dafny doesn't immediately see that the comprehension would generate
a finite set. Luckily, it is simple for us to add another conjunct to the comprehension
that lets Dafny see that the set is finite:

```
function Factors(x: pos): set<pos> {
set p: pos | p <= x && IsFactor(p, x)
}
```

In adding this conjunct, there's a risk we're making a mistake, because perhaps
the new set doesn't include all the factors we'd like. Our conjunct `p <= x`

certainly looks innocent enough, but why not prove that adding it does not accidentally
leave out any factors. We can do that by proving that this set has the same elements
as the possibly infinite set:

```
lemma FactorsHasAllFactors(x: pos)
ensures forall n :: n in Factors(x) <==> n in iset p: pos | IsFactor(p, x)
{
}
```

The proof of a lemma is given in the lemma's body (that is, between the pair of curly braces that follow the lemma's specification). In this case, the proof is empty, because Dafny proves the lemma automatically without any further help from us.

Before leaving the definition of factors, let's state and prove two simple lemmas. These lemmas act as sanity checks on our definitions, and they will also be helpful later in our development.

```
lemma FactorsContains1(x: pos)
ensures 1 in Factors(x)
{
assert 1 * x == x;
}
lemma FactorsContainsSelf(x: pos)
ensures x in Factors(x)
{
assert x * 1 == x;
}
```

To prove that a number `n`

(here, `1`

or `x`

) is in the set `Factors(x)`

, we need
to establish that `n`

satisfies the condition in the set comprehension
(in the body of `Factors(x)`

). The conjunct `n <= x`

is proved automatically,
but the conjunct `IsFactor(n, x)`

is not. By the definition of `IsFactor`

, we need
to prove the existence of a multiplicand `q`

for which `n * q == x`

. Such a proof typically
involves demonstrating a witness, which is what the `assert`

statements in
the two lemmas above do. From those assertions, the verifier completes the proofs
of the lemmas.

To talk about the *greatest* common divisor, we need a function that picks out
the largest number in a set. A somewhat declarative way to do that is to use
the such-that construct. In particular, for a set `s`

, the let-such-that expression

```
var x :| x in s && forall y :: y in s ==> y <= x;
x
```

says to bind `x`

to a value satisfying the condition `x in s && forall y :: y in s ==> y <= x`

,
and then return the value of the expression `x`

. The condition says that `x`

is in the set `s`

, and that, among all the numbers in `s`

, `x`

is the largest.

Use of a such-that construct comes with a proof obligation that a value satisfying
the given condition exists. If we require `s`

to be nonempty, then the `x in s`

condition is easily satisfied, but it takes more work to convince the verifier that
a value for `x`

satisfies the quantifier. We'll define a lemma for that purpose.
We'll name the lemma `MaxExists`

and then we can write our function `Max`

as
follows:

```
function Max(s: set<pos>): pos
requires s != {}
{
MaxExists(s);
var x :| x in s && forall y :: y in s ==> y <= x;
x
}
lemma MaxExists(s: set<pos>)
requires s != {}
ensures exists x :: x in s && forall y :: y in s ==> y <= x
```

Dafny uses the lemma invocation `MaxExists(s)`

in establishing the well-formedness
of the subsequent expression. Note, by the way, that `Max`

(and lemma `MaxExists`

, too)
has a precondition `s != {}`

(keyword `requires`

). This means that the function
(and the lemma, too) can only be called for a nonempty set.

Alright, so then how do we prove `MaxExists`

? The most straightforward way to prove
the existence of such an `x`

is to compute an `x`

satisfying the desired properties.
We'll introduce another function for computing the max, call it `FindMax`

, and
use it in the proof of the `MaxExists`

lemma. Function `FindMax`

will be implemented
recursively.

```
lemma MaxExists(s: set<pos>)
requires s != {}
ensures exists x :: x in s && forall y :: y in s ==> y <= x
{
var x := FindMax(s);
}
function FindMax(s: set<pos>): pos
requires s != {}
ensures max in s && forall y :: y in s ==> y <= FindMax(s)
```

Aren't we going in circles now? Yes, in some ways we're making life more difficult
than necessary. If we have `FindMax`

, we don't need `Max`

, and then we also don't
need lemma `MaxExists`

. Indeed, we could have written and used just `FindMax`

and
never introduced `Max`

or `MaxExists`

. But for this example, I wanted the primary
definitions to be as clear as possible without concern for *how* things are computed.
In that sense, the body of `Max`

is more declarative than the body we are about to
write for `FindMax`

.

Here is the full definition of `FindMax`

:

```
function FindMax(s: set<pos>): (max: pos)
requires s != {}
ensures max in s && forall y :: y in s ==> y <= max
{
var x :| x in s;
if s == {x} then
x
else
var s' := s - {x};
assert s == s' + {x};
var y := FindMax(s');
if x < y then y else x
}
```

When a function postcondition wants to mention the result value of the function,
you can just use the function itself, with the arguments given: `FindMax(s)`

.
I did this when I first introduced `FindMax`

above. In the full definition, I
show an alternative way of doing this, which is to introduce a name for the result
value: `max`

. That name is usable only in the postcondition of the function.
Many times, introducing such a name for the result leads to a shorter
specification.

With the functions we defined, we're now ready to define GCD. Take the factors
of `x`

and the factors of `y`

, intersect them to get their common factors, and then
take the maximum thereof:

```
function Gcd(x: pos, y: pos): pos {
var common := Factors(x) * Factors(y);
Max(common) // error: common must be nonempty
}
```

For this simple definition, the verifier reports a precondition violation, because
it's unable to prove that `common`

satisfies the precondition of `Max`

. We know
that `common`

is nonempty, because we know that `1`

is a common factor of any two
numbers `x`

and `y`

. To bring that information to the verifier's attention, we
write an assertion:

```
function Gcd(x: pos, y: pos): pos {
var common := Factors(x) * Factors(y);
assert 1 in common; // error: assertion violation
Max(common)
}
```

Alas, the verifier is not able to prove this assertion. But we can see that the
presence of the assertion is enough to eliminate the precondition violation. So,
we now focus on proving the assertion. This is where we use the `FactorsContains1`

lemma we introduced earlier. Two calls to that lemma will prove the assertion,
which is best captured in the program text by changing the `assert`

to an
`assert by`

and giving the proof of the assertion in the `by`

block:

```
function Gcd(x: pos, y: pos): pos {
var common := Factors(x) * Factors(y);
assert 1 in common by {
FactorsContains1(x);
FactorsContains1(y);
}
Max(common)
}
```

That does it! We have now given a well-formed definition of `Gcd`

.

We'll prove three properties of our `Gcd`

function—call them sanity checks, if
you will. (We'll need a fourth property as well, but I'll introduce it later.)

As a first sanity check, we expect `Gcd(x, y)`

to return a number that is a
factor of both `x`

and `y`

. Furthermore, among all the numbers that are factors
of both `x`

and `y`

, what `Gcd(x, y)`

returns should be the largest.

```
lemma AboutGcd(x: pos, y: pos)
ensures IsFactor(Gcd(x, y), x)
ensures IsFactor(Gcd(x, y), y)
ensures forall p: pos :: IsFactor(p, x) && IsFactor(p, y) ==> p <= Gcd(x, y)
```

The first two postconditions of this lemma are proved automatically, but not
the third. How do we go about proving that a universal quantifier (that is,
a `forall`

expression) holds? We use Dafny's `forall`

*statement*. When
used in a proof, the `forall`

statement corresponds to the “universal introduction”
rule in logic. This is the rule that says "if you want to prove `forall x :: P(x)`

,
then all you need to do is pick an arbitrary `x`

and prove `P(x)`

for that `x`

.

We introduce the `forall`

statement like this:

```
forall p: pos | IsFactor(p, x) && IsFactor(p, y)
ensures p <= Gcd(x, y)
```

To prove it, we only need to bring up the fact that `p`

, which is a factor of
both `x`

and `y`

, is in the intersection of factors of `x`

and `y`

. The verifier
is then able to complete the proof.

```
lemma AboutGcd(x: pos, y: pos)
ensures IsFactor(Gcd(x, y), x)
ensures IsFactor(Gcd(x, y), y)
ensures forall p: pos :: IsFactor(p, x) && IsFactor(p, y) ==> p <= Gcd(x, y)
{
forall p: pos | IsFactor(p, x) && IsFactor(p, y)
ensures p <= Gcd(x, y)
{
assert p in Factors(x) * Factors(y);
}
}
```

The Dafny verifier often needs help with properties like this. To prove them,
just write them as an assertion. In other words, the verifier *knows* this property
about set intersection, but it isn't creative enough the bring that property into
the proof. By asserting the property, we're asking the verifier to confirm the
property (which it's able to do) and then to use that property in the rest of the
proof (which in this case completes the proof).

As a second sanity check, we prove that `Gcd`

is symmetric.

```
lemma GcdSymmetric(x: pos, y: pos)
ensures Gcd(x, y) == Gcd(y, x)
{
assert Factors(x) * Factors(y) == Factors(y) * Factors(x);
}
```

The proof comes down to the fact that set intersection is symmetric, which we bring to the verifier's attention by writing it as a lemma.

As a third sanity check, we prove that `Gcd`

is idempotent. That is, if you
give it the same argument twice, it will return that argument.

```
lemma GcdIdempotent(x: pos)
ensures Gcd(x, x) == x
{
FactorsContainsSelf(x);
assert x in Factors(x) * Factors(x);
}
```

The proof of this property comes down to the fact that set intersection is idempotent, as well as the property that a number is one of its own factors.

Euclid's subtractive algorithm for finding the GCD of two numbers is to
repeatedly subtract the smaller of the numbers from the larger until they are
both equal. Each such subtraction preserves the GCD—an invariant that we
will need to prove—and the GCD of two equal numbers is that number—which
we established by lemma `GcdIdempotent`

above.

The algorithm, with the loop invariant and idempotence lemma, thus looks like this:

```
method EuclidGcd(X: pos, Y: pos) returns (gcd: pos)
ensures gcd == Gcd(X, Y)
{
var x, y := X, Y;
while
invariant Gcd(x, y) == Gcd(X, Y) // error: invariant not maintained
decreases x + y
{
case x < y =>
y := y - x;
case y < x =>
x := x - y;
}
GcdIdempotent(x);
return x;
}
```

This method uses a `while`

-`case`

loop. (If you're familiar with Dijkstra's
guarded commands [1], this is the **do-od** loop.) Each
iteration of this loop chooses one of the `case`

s to execute. The `case`

chosen
must be one whose guard condition evaluates to `true`

(and if the guards of
several `case`

s evaluate to `true`

, the loop chooses arbitrarily between those
`case`

s) . If no such guard condition evaluates to `true`

, then the loop stops
iterating. The loop in `EuclidGcd`

could of course be an ordinary
`while x != y`

loop, but the symmetry of the two cases afforded by the
`while`

-`case`

loop makes it aesthetically pleasing.

In addition to a loop invariant, the loop also declares a termination metric
(keyword `decreases`

). Proving that the loop terminates comes down to proving
that each iteration makes the value of the termination metric decrease (in Dafny's
built-in well-founded order on integers).

The `EuclidGcd`

method above does not verify, because the verifier is unable
to prove that each iteration maintains the loop invariant. For this, we need
the fourth property of GCD that I alluded to above:

```
lemma GcdSubtract(x: pos, y: pos)
requires x < y
ensures Gcd(x, y) == Gcd(x, y - x)
```

Using this lemma and the symmetry of GCD, we can complete the proof of `Gcd`

:

```
method EuclidGcd(X: pos, Y: pos) returns (gcd: pos)
ensures gcd == Gcd(X, Y)
{
var x, y := X, Y;
while
invariant Gcd(x, y) == Gcd(X, Y)
decreases x + y
{
case x < y =>
GcdSubtract(x, y);
y := y - x;
case y < x =>
calc {
Gcd(x, y);
== { GcdSymmetric(x, y); }
Gcd(y, x);
== { GcdSubtract(y, x); }
Gcd(y, x - y);
== { GcdSymmetric(y, x - y); }
Gcd(x - y, y);
}
x := x - y;
}
GcdIdempotent(x);
return x;
}
```

This version adds a call to `GcdSubtract`

in the first branch of the loop. In the
second branch of the loop, the proof calculation uses equality-preserving steps
to transform the expression `Gcd(x, y)`

into `Gcd(x - y, y)`

. The hints given
in the steps appeal to the `GcdSubtract`

and `GcdSymmetric`

lemmas.

The proof of `GcdSubtract`

is more involved than any of the other definitions
and lemmas in this case study.

The proof starts by introducing a name for `Gcd(x, y)`

:

` var p := Gcd(x, y);`

We know from the definition of `Gcd`

that `p`

is a factor of both `x`

and `y`

,
and we can prove that `p`

is also a factor of `y - x`

:

```
assert IsFactor(p, y - x) by {
var a :| p * a == x;
var b :| p * b == y;
calc {
y - x;
==
p * b - p * a;
==
p * (b - a);
}
}
```

To prove `IsFactor(p, y - x)`

, we introduce names `a`

and `b`

for the multiplicands
that the definition of `IsFactor`

tells us exist (since `p`

is a factor of
both `x`

and `y`

). A simple calculation using basic arithmetic steps then gives
us that `p`

can be multiplied another number (namely, `b - a`

) to get `y - x`

.

Since `p`

is a factor of both `x`

and `y - x`

, we have that it's in the
common factors of `x`

and `y - x`

. We write two lines to make sure the verifier
is on board with this property, phrased in terms of set intersection:

```
var common := Factors(x) * Factors(y - x);
assert p in common;
```

Lastly, we need to show that `p`

is the largest such common factor. We state
this property using a `forall`

statement:

```
forall q | q in common
ensures q <= p
```

To prove this property, we fill in the body of the `forall`

statement. For
`q`

, which denotes an arbitrary number in the set `common`

, we give names to
the multiplicands that yield the products `x`

and `y - x`

, respectively:

```
{
var a :| q * a == x;
var b :| q * b == y - x;
```

Using simple arithmetic steps, we can use a proof calculation to establish
that `q`

is also a factor of `y`

:

```
assert IsFactor(q, y) by {
calc {
y;
==
x + (y - x);
==
q * a + q * b;
==
q * (a + b);
}
}
```

So, since `q`

is a factor of both `x`

and `y`

, the definition of `Gcd(x, y)`

tells us `q <= Gcd(x, y)`

. By giving yet another hint about set intersection:

```
assert q in Factors(x) * Factors(y);
}
```

the verifier completes the proof.

While we now have a full proof of the GCD algorithm, your aesthetic sense may be
bothered by the asymmetry in how we supplied the proofs in the two cases of the loop.
Since the `while`

-`case`

loop affords us a symmetric rendition of the two cases,
it would be nice if we could make the proofs of the two cases more similar as well.

There are several ways we can improve on this situation. One is to refactor the
proof calculation of the second `case`

into its own lemma. Then, then each `case`

would have one line of proof.

Just for fun, let me describe another “trick” to make the two cases (not *entirely*
symmetric, but at least) *more* similar.
The trick is to make the (already asymmetric) `GcdSubtract`

lemma also swap the
arguments to `Gcd`

. We rewrite it into:

```
lemma GcdSubtract(x: pos, y: pos)
requires x < y
ensures Gcd(y, x) == Gcd(x, y - x)
{
GcdSymmetric(x, y);
// ... the proof continues as before
}
```

Note that the left-hand side of the postcondition is now `Gcd(y, x)`

, not `Gcd(x, y)`

as it had been in our first version of this lemma. The only change required for
the proof is to appeal to the symmetry of `Gcd`

, which we can do by one lemma
call immediately inside the body of the lemma. This gives us a proof of our
reformulated `GcdSubtract`

lemma.

With this reformulation, we can simplify the second `case`

of `EuclidGcd`

, at the
expense of making the first `case`

more complicated. Essentially, we're moving
one lemma call from the second `case`

to the first, so instead of having 1 and 3
lemma calls in the two respective `case`

s, we'll have 2 and 2.

```
case x < y =>
GcdSubtract(x, y);
GcdSymmetric(y, x);
y := y - x;
case y < x =>
GcdSymmetric(x - y, y);
GcdSubtract(y, x);
x := x - y;
```

It's not entirely symmetric, but perhaps you still like it. Or perhaps you'll remember this trick for another situation where the shoe fits perfectly. If nothing else, you can stick with the first complete proof we developed above.

If the proof itself doesn't satisfy you and still want to see the algorithm
in action, you can write a `Main`

method and compile and run the program.
(A simple way of doing that from the command line is to use the `/compile:3`

option with the `dafny`

tool. It will verify and then run the program.)

Here is a sample `Main`

:

```
method Main() {
Test(15, 9);
Test(14, 22);
Test(371, 1);
Test(1, 2);
Test(1, 1);
Test(13, 13);
Test(60, 60);
}
method Test(x: pos, y: pos) {
var gcd := EuclidGcd(x, y);
print x, " gcd ", y, " = ", gcd, "\n";
}
```

This case study shows how to define a domain of interest (here, factors of numbers, leading up to the definition of GCD), state and prove some lemmas about those definitions, and then use these in the proof of a small program.

The program, including all lemmas and other proof obligations associated with the definitions, takes the Dafny verifier less than 3 seconds to verify. You can find the entire program in the Dafny test suite [3].

Euclid's GCD algorithm is a familiar textbook example. It's proved in different
forms in various verifiers. For example, the TLA^{+} tutorial uses this program
as an example [0]. It assumes the mathematical properties of GCD
that we proved here. The gallery of Why3 programs contains a version of Euclid's
GCD algorithm that uses modulo instead of subtraction with each step, which
results in fewer iterations [2].

I thank Reto Kramer for suggesting this problem as a useful case study.

[0]TLA proof system.
https://tla.msr-inria.inria.fr/tlaps/content/Documentation/Tutorial/The_example.html. 🔎

[2]Jean-Christophe Filliâtre and Claude Marché.
Greatest common divisor, using the euclidean algorithm.
http://toccata.lri.fr/gallery/gcd.en.html. 🔎

[3]K. Rustan M. Leino.
gcd.dfy.
https://github.com/dafny-lang/dafny/blob/master/Test/dafny4/gcd.dfy,
June 2021. 🔎