Abstract. Processing the contents of a set or map in a deterministic way takes some work. Using a representative example, this note shows a way to do it.
Mathematical collection types include sets, multisets, sequences, and maps. These are essential to specifications as well as programs. Defining functions over the elements of such collections is more difficult (or at least more labor intensive) than one would expect. In this note, I develop the program and proof for an example that converts a map to a sequence of pairs. The central part of the work concerns getting a compilable function that returns a set's elements into a predictable order.
Throughout this note, the goal is to produce compilable functions.
A map is set of key-value pairs, where no key is repeated. We can convert a map to a sequence by putting its keys into a sequence and then making pairs by retrieving the value for each key:
function method MapToSequence<A,B>(m: map<A,B>): seq<(A,B)> {
var keys := SetToSequence(m.Keys);
seq(|keys|, i requires 0 <= i < |keys| => (keys[i], m[keys[i]]))
}
function method SetToSequence<A>(s: set<A>): seq<A>
ensures var q := SetToSequence(s);
forall i :: 0 <= i < |q| ==> q[i] in s
For a map m
, the expression m.Keys
gives the set of keys. Function
MapToSequence
passes this set of keys to SetToSequence
to obtain a
sequence of these keys. Then, the sequence constructor
seq(n, f)
produces a sequence of length n
, where the element with index i
is
specified to be f(i)
.
In MapToSequence
, the second argument to the sequence comprehension is
a partial function given by the lambda expression
i requires 0 <= i < |keys| => (keys[i], m[keys[i]])
For an index in the given range, this function returns a pair consisting
of the key keys[i]
and its corresponding value in m
. It is necessary
to give the range of i
in the function's precondition, or else the
expression keys[i]
would give rise to an index-of-out-bounds error.
To avoid a key-is-not-in-map error in the expression m[keys[i]]
,
we need to know that keys[i]
is a key of m
. That requires us to know
a connection between m.Keys
and keys
, which is stated in the postcondition
of the (not yet implemented) function SetToSequence
.
We now need to implement SetToSequence
.
Let's start easy. We'll define SetToSequence
as a ghost function.
The definition is recursive. When the given set is nonempty, the function
uses the choose operator, aka the let-such-that operator and written :|
,
to pick an element from the set.
function SetToSequence<A>(s: set<A>): seq<A>
ensures var q := SetToSequence(s);
forall i :: 0 <= i < |q| ==> q[i] in s
{
if s == {} then [] else
var x :| x in s;
[x] + SetToSequence(s - {x})
}
That wasn't too bad.
To make the SetToSequence
we wrote above compilable, we change the keyword
function
to the keyword phrase function method
. This generates an error
message:
to be compilable, the value of a let-such-that expression must be uniquely determined
Expressions—and functions, whose bodies are expressions—must be deterministic in Dafny. This means that
var x :| x in s;
always needs to pick the same value for x
when given the same s
.
This would be computationally expensive to do at run time, so Dafny
instead puts this burden on the programmer
(see [0]).
One idea is for us to strengthen the condition x in s
to always pick
“the minimum” value from s
. This would uniquely determine the value of the
let-such-that expression. But what does “minimum” mean? Let's start
with integers, where we know what minimum is, and then work up to a general
situation.
For now, we'll restrict our attention to integer sets. This lets us strengthen the condition in the let-such-that expression to uniquely talk about the minimum element of the set:
function method SetToSequence(s: set<int>): seq<int>
ensures var q := SetToSequence(s);
forall i :: 0 <= i < |q| ==> q[i] in s
{
if s == {} then [] else
var x :| x in s && forall y :: y in s ==> x <= y;
[x] + SetToSequence(s - {x})
}
The “uniquely determined” requirement is now satisfied. However, since our new such-that predicate is more complicated, the verifier issues a different complaint:
cannot establish the existence of a LHS value that satisfies the such-that predicate
It may seem clear to us that a nonempty set has a minimum element, but we need to convince the verifier of it. Let's do this in a separate lemma, which we declare as follows:
lemma ThereIsAMinimum(s: set<int>)
requires s != {}
ensures exists x :: x in s && forall y :: y in s ==> x <= y
We'll insert a call to this lemma just before the let-such-expression in
SetToSequence
. The body of that function thus becomes
if s == {} then [] else
ThereIsAMinimum(s);
var x :| x in s && forall y :: y in s ==> x <= y;
[x] + SetToSequence(s - {x})
Our next task is to give a proof for the ThereIsAMinimum
lemma.
Here is an annotated proof:
lemma ThereIsAMinimum(s: set<int>)
requires s != {}
ensures exists x :: x in s && forall y :: y in s ==> x <= y
{
var x :| x in s;
if s == {x} {
// obviously, x is the minimum
} else {
// The minimum in s might be x, or it might be the minimum
// in s - {x}. If we knew the minimum of the latter, then
// we could compare the two.
// Let's start by giving a name to the smaller set:
var s' := s - {x};
// So, s is the union of s' and {x}:
assert s == s' + {x};
// The following lemma call establishes that there is a
// minimum in s'.
ThereIsAMinimum(s');
}
}
The proof starts by picking an arbitrary element, x
, from s
. It is the
minimum if it's the only element in s
. Otherwise, the proof brings out
the fact that s
is s' + {x}
. The recursive call to the lemma gives us,
through its postcondition, that there exists a minimum element in s'
.
The verifier automatically supplies the remaining proof glue, so we're done.
The assertion in the proof may seem silly. We just defined s'
to be s - {x}
.
Since x
is in s
, the property s == s' + {x}
follows immediately.
Yes, it does. The verifier can easily confirm this, but verifier is not creative
enough to realize that this is a good property to know. This is typical
when working with collections. For example, if you're working with a
sequence q
, you may consider q[0]
separately and do a recursive call
on q[1..]
. If so, you're likely to need to manually supply
assert [q[0]] + q[1..] == q;
As with the set property, the verifier easily proves this property, but wouldn't have thought of it by itself.
The moral of the proof is that it's a good idea, when you split a collection into smaller pieces that you want to reason about separately, to assert that the whole collection equals the combination of the smaller pieces.
The restriction to integer sets above was nice to work with, because the minimum of a set of integers is easy to define and gives a unique element. To apply the same trick other types, we need to be able to define a “minimum” for those types, too. To define a minimum, you need a total order. Let's review what that means.
A relation over a type A
is a function (A, A) -> bool
. Such a relation,
call it R
, is known as a partial order when the following three conditions
hold:
R
is reflexive, that is,forall a :: R(a, a)
R
is antisymmetric, that is,forall a, b :: R(a, b) && R(b, a) ==> a == b
R
is transitive, that is,forall a, b, c :: R(a, b) && R(b, c) ==> R(a, c)
A total order is a partial order that additionally satisfies the following condition:
R
is connex (closely connected), that is,forall a, b :: R(a, b) || R(b, a)
Note that connexity implies reflexivity.
Here is a predicate that says whether or not a given relation is a total order:
predicate IsTotalOrder<A(!new)>(R: (A, A) -> bool) {
// connexity
&& (forall a, b :: R(a, b) || R(b, a))
// antisymmetry
&& (forall a, b :: R(a, b) && R(b, a) ==> a == b)
// transitivity
&& (forall a, b, c :: R(a, b) && R(b, c) ==> R(a, c))
}
Dafny won't allow us to write these quantifiers in the predicate, unless it
knows a little more about type A
. If A
were a class type, then forall
in Dafny means quantifying over the allocated instances of class A
.
That would mean that allocating another A
instance could cause
IsTotalOrder(R)
to change values. To prevent this, Dafny insists that
unbounded quantifications like those in IsTotalOrder
be over types that
do not depend on the allocated state. We express that with the
type characteristic (!new)
, which is written as a suffix of the type
name in the type-parameter declaration.
We revise SetToSequence
to also take a relation, R
, which we require to
be a total order. In doing so, we change the total order <=
on integers
to R
, and we also add R
as a parameter to the ThereIsAMinimum
lemma:
function method SetToSequence<A(!new)>(s: set<A>, R: (A, A) -> bool): seq<A>
requires IsTotalOrder(R)
ensures var q := SetToSequence(s, R);
forall i :: 0 <= i < |q| ==> q[i] in s
{
if s == {} then [] else
ThereIsAMinimum(s, R);
var x :| x in s && forall y :: y in s ==> R(x, y);
[x] + SetToSequence(s - {x}, R)
}
lemma ThereIsAMinimum<A(!new)>(s: set<A>, R: (A, A) -> bool)
requires s != {} && IsTotalOrder(R)
ensures exists x :: x in s && forall y :: y in s ==> R(x, y)
The revised SetToSequence
lemma verifies. Our remaining task is to prove
the lemma.
We start the proof of our generic ThereIsAMinimum
like the analogous
proof for integer sets above:
{
var x :| x in s;
if s == {x} { // error: postcondition might not hold on this return path
} else {
}
}
The case that was “obvious” before is causing the verifier trouble this time.
Let's help it out. Our proof obligation is to show there's some element in
s
that is as small as any element in s
. If s
is the singleton set {x}
,
then the element we're looking for can only be x
. Now, if y
is an
element in s
, we need to prove R(x, y)
. Since s
is a singleton,
we have y == x
, and then R(x, y)
follows from connexity. The verifier
is missing one of both of these facts. Let's add an assertion about the
first fact:
if s == {x} {
assert forall y :: y in s ==> x == y;
Ah, yes, the verifier can prove this assertion and can then do the rest of the proof.
On to the non-singleton case. We'll start like in the integer-set proof above:
} else {
var s' := s - {x};
assert s == s' + {x};
ThereIsAMinimum(s', R);
The verifier complains the lemma's postcondition might not hold, so we'll give the verifier more help.
The recursive call to the lemma (which, by the way, is known as
“the induction hypothesis”) says there is a minimum in s'
. Let's give it
a name, which we do by introducing a local variable whose value we constrain
to satisfy the condition in the lemma's postcondition's
existential quantifier:
var z :| z in s' && forall y :: y in s' ==> R(z, y);
Since s
is s' + {x}
, the minimum we're looking for is either z
or x
.
Let's treat these cases separately. We do that by using an if
statement.
Since the two cases are so symmetric, I think it looks nice to use Dafny's
if-case statement:
// by connexity, one of the two cases below applies
if
case R(z, x) =>
// prove z is the minimum not just of s', but of s
// ...
case R(x, z) =>
// prove x is the minimum of s
// ...
Let's do the R(z, x)
case first. The hard part is likely to be the
quantifier in the postcondition, so let's start with it. To give a proof
of a universal quantifier, you use a forall
statement.
forall y | y in s
ensures R(z, y)
{
From y in s
, we know either y
is x
or y
is in s'
. For the former,
the proof follows from the guard R(z, x)
. For the latter, the
proof follows from the quantification in the such-that condition we used to
introduce z
. Somehow, the verifier is not picking up on this, so let's
help it along. We start here:
assert x == y || y in s';
}
Lo and behold! That's all the verifier needed.
In the R(x, z)
case, we'll lay down the analogous forall
statement to
prove the quantifier in the lemma's postcondition:
forall y | y in s
ensures R(x, y)
{
If y
is x
, then R(x, y)
follows from connexity. On the other hand,
if y
is in s'
, then the such-that condition that introduced z
tells
us R(z, y)
, so we get R(x, y)
by transitivity. You may add these hints
in various ways to make the proof go through. Here is one way:
assert y in s' ==> R(z, y);
}
A final remark. When we proved ThereIsAMinimum
for integer sets, I argued
that it's often useful to include an assertion like:
assert s == s' + {x};
I included it in the generic ThereIsAMinimum
as well. But as it turns out,
it's not actually needed there (because we made up for it in the additional
hints we gave in the two cases), so you can delete it, if you want.
We did it!
Well, before we declare success, let's just make sure we can use
our functions and lemmas to write the MapToSequence
function that we set out to
write. Here's the generic version, which needs a given total order:
function method MapToSequence<A(!new),B>(m: map<A,B>, R: (A, A) -> bool): seq<(A,B)>
requires IsTotalOrder(R)
{
var keys := SetToSequence(m.Keys, (a,a') => R(a, a'));
seq(|keys|, i requires 0 <= i < |keys| => (keys[i], m[keys[i]]))
}
We can also specialize it for integer sets:
function method IntMapToSequence<B>(m: map<int,B>): seq<(int,B)> {
MapToSequence(m, (a, a') => a <= a')
}
We did it!
I hope this long journey showed you a thing or two about working with collections in Dafny.