<small>Dafny Power User:</small><br>Iterating over a Collection

Dafny Power User:
Iterating over a Collection

K. Rustan M. Leino

Manuscript KRML 275, 17 February 2020

Abstract. Processing the contents of a set or map in a deterministic way takes some work. Using a representative example, this note shows a way to do it.

Mathematical collection types include sets, multisets, sequences, and maps. These are essential to specifications as well as programs. Defining functions over the elements of such collections is more difficult (or at least more labor intensive) than one would expect. In this note, I develop the program and proof for an example that converts a map to a sequence of pairs. The central part of the work concerns getting a compilable function that returns a set's elements into a predictable order.

Throughout this note, the goal is to produce compilable functions.

0. Map to sequence

A map is set of key-value pairs, where no key is repeated. We can convert a map to a sequence by putting its keys into a sequence and then making pairs by retrieving the value for each key:

function method MapToSequence<A,B>(m: map<A,B>): seq<(A,B)> {
  var keys := SetToSequence(m.Keys);
  seq(|keys|, i requires 0 <= i < |keys| => (keys[i], m[keys[i]]))
}

function method SetToSequence<A>(s: set<A>): seq<A>
  ensures var q := SetToSequence(s);
    forall i :: 0 <= i < |q| ==> q[i] in s

For a map m, the expression m.Keys gives the set of keys. Function MapToSequence passes this set of keys to SetToSequence to obtain a sequence of these keys. Then, the sequence constructor

seq(n, f)

produces a sequence of length n, where the element with index i is specified to be f(i). In MapToSequence, the second argument to the sequence comprehension is a partial function given by the lambda expression

i requires 0 <= i < |keys| => (keys[i], m[keys[i]])

For an index in the given range, this function returns a pair consisting of the key keys[i] and its corresponding value in m. It is necessary to give the range of i in the function's precondition, or else the expression keys[i] would give rise to an index-of-out-bounds error.

To avoid a key-is-not-in-map error in the expression m[keys[i]], we need to know that keys[i] is a key of m. That requires us to know a connection between m.Keys and keys, which is stated in the postcondition of the (not yet implemented) function SetToSequence.

We now need to implement SetToSequence.

1. A ghost function

Let's start easy. We'll define SetToSequence as a ghost function. The definition is recursive. When the given set is nonempty, the function uses the choose operator, aka the let-such-that operator and written :|, to pick an element from the set.

function SetToSequence<A>(s: set<A>): seq<A>
  ensures var q := SetToSequence(s);
    forall i :: 0 <= i < |q| ==> q[i] in s
{
  if s == {} then [] else
    var x :| x in s;
    [x] + SetToSequence(s - {x})
}

That wasn't too bad.

2. Making the function compilable

To make the SetToSequence we wrote above compilable, we change the keyword function to the keyword phrase function method. This generates an error message:

to be compilable, the value of a let-such-that expression must be uniquely determined

Expressions—and functions, whose bodies are expressions—must be deterministic in Dafny. This means that

var x :| x in s;

always needs to pick the same value for x when given the same s. This would be computationally expensive to do at run time, so Dafny instead puts this burden on the programmer (see [0]).

One idea is for us to strengthen the condition x in s to always pick “the minimum” value from s. This would uniquely determine the value of the let-such-that expression. But what does “minimum” mean? Let's start with integers, where we know what minimum is, and then work up to a general situation.

3. Choosing the smallest integer

For now, we'll restrict our attention to integer sets. This lets us strengthen the condition in the let-such-that expression to uniquely talk about the minimum element of the set:

function method SetToSequence(s: set<int>): seq<int>
  ensures var q := SetToSequence(s);
    forall i :: 0 <= i < |q| ==> q[i] in s
{
  if s == {} then [] else
    var x :| x in s && forall y :: y in s ==> x <= y;
    [x] + SetToSequence(s - {x})
}

The “uniquely determined” requirement is now satisfied. However, since our new such-that predicate is more complicated, the verifier issues a different complaint:

cannot establish the existence of a LHS value that satisfies the such-that predicate

It may seem clear to us that a nonempty set has a minimum element, but we need to convince the verifier of it. Let's do this in a separate lemma, which we declare as follows:

lemma ThereIsAMinimum(s: set<int>)
  requires s != {}
  ensures exists x :: x in s && forall y :: y in s ==> x <= y

We'll insert a call to this lemma just before the let-such-expression in SetToSequence. The body of that function thus becomes

if s == {} then [] else
  ThereIsAMinimum(s);
  var x :| x in s && forall y :: y in s ==> x <= y;
  [x] + SetToSequence(s - {x})

4. Proving there's a minimum

Our next task is to give a proof for the ThereIsAMinimum lemma. Here is an annotated proof:

lemma ThereIsAMinimum(s: set<int>)
  requires s != {}
  ensures exists x :: x in s && forall y :: y in s ==> x <= y
{
  var x :| x in s;
  if s == {x} {
    // obviously, x is the minimum
  } else {
    // The minimum in s might be x, or it might be the minimum
    // in s - {x}. If we knew the minimum of the latter, then
    // we could compare the two.
    // Let's start by giving a name to the smaller set:
    var s' := s - {x};
    // So, s is the union of s' and {x}:
    assert s == s' + {x};
    // The following lemma call establishes that there is a
    // minimum in s'.
    ThereIsAMinimum(s');
  }
}

The proof starts by picking an arbitrary element, x, from s. It is the minimum if it's the only element in s. Otherwise, the proof brings out the fact that s is s' + {x}. The recursive call to the lemma gives us, through its postcondition, that there exists a minimum element in s'. The verifier automatically supplies the remaining proof glue, so we're done.

The assertion in the proof may seem silly. We just defined s' to be s - {x}. Since x is in s, the property s == s' + {x} follows immediately. Yes, it does. The verifier can easily confirm this, but verifier is not creative enough to realize that this is a good property to know. This is typical when working with collections. For example, if you're working with a sequence q, you may consider q[0] separately and do a recursive call on q[1..]. If so, you're likely to need to manually supply

assert [q[0]] + q[1..] == q;

As with the set property, the verifier easily proves this property, but wouldn't have thought of it by itself.

The moral of the proof is that it's a good idea, when you split a collection into smaller pieces that you want to reason about separately, to assert that the whole collection equals the combination of the smaller pieces.

5. Total orders

The restriction to integer sets above was nice to work with, because the minimum of a set of integers is easy to define and gives a unique element. To apply the same trick other types, we need to be able to define a “minimum” for those types, too. To define a minimum, you need a total order. Let's review what that means.

A relation over a type A is a function (A, A) -> bool. Such a relation, call it R, is known as a partial order when the following three conditions hold:

R is reflexive, that is,
forall a :: R(a, a)
R is antisymmetric, that is,
forall a, b :: R(a, b) && R(b, a) ==> a == b
R is transitive, that is,
forall a, b, c :: R(a, b) && R(b, c) ==> R(a, c)

A total order is a partial order that additionally satisfies the following condition:

R is connex (closely connected), that is,
forall a, b :: R(a, b) || R(b, a)

Note that connexity implies reflexivity.

Here is a predicate that says whether or not a given relation is a total order:

predicate IsTotalOrder<A(!new)>(R: (A, A) -> bool) {
  // connexity
  && (forall a, b :: R(a, b) || R(b, a))
  // antisymmetry
  && (forall a, b :: R(a, b) && R(b, a) ==> a == b)
  // transitivity
  && (forall a, b, c :: R(a, b) && R(b, c) ==> R(a, c))
}

Dafny won't allow us to write these quantifiers in the predicate, unless it knows a little more about type A. If A were a class type, then forall in Dafny means quantifying over the allocated instances of class A. That would mean that allocating another A instance could cause IsTotalOrder(R) to change values. To prevent this, Dafny insists that unbounded quantifications like those in IsTotalOrder be over types that do not depend on the allocated state. We express that with the type characteristic (!new), which is written as a suffix of the type name in the type-parameter declaration.

6. Generic set to sequence

We revise SetToSequence to also take a relation, R, which we require to be a total order. In doing so, we change the total order <= on integers to R, and we also add R as a parameter to the ThereIsAMinimum lemma:

function method SetToSequence<A(!new)>(s: set<A>, R: (A, A) -> bool): seq<A>
  requires IsTotalOrder(R)
  ensures var q := SetToSequence(s, R);
    forall i :: 0 <= i < |q| ==> q[i] in s
{
  if s == {} then [] else
    ThereIsAMinimum(s, R);
    var x :| x in s && forall y :: y in s ==> R(x, y);
    [x] + SetToSequence(s - {x}, R)
}

lemma ThereIsAMinimum<A(!new)>(s: set<A>, R: (A, A) -> bool)
  requires s != {} && IsTotalOrder(R)
  ensures exists x :: x in s && forall y :: y in s ==> R(x, y)

The revised SetToSequence lemma verifies. Our remaining task is to prove the lemma.

7. Total order has minimum

We start the proof of our generic ThereIsAMinimum like the analogous proof for integer sets above:

{
  var x :| x in s;
  if s == {x} {  // error: postcondition might not hold on this return path
  } else {
  }
}

The case that was “obvious” before is causing the verifier trouble this time. Let's help it out. Our proof obligation is to show there's some element in s that is as small as any element in s. If s is the singleton set {x}, then the element we're looking for can only be x. Now, if y is an element in s, we need to prove R(x, y). Since s is a singleton, we have y == x, and then R(x, y) follows from connexity. The verifier is missing one of both of these facts. Let's add an assertion about the first fact:

  if s == {x} {
    assert forall y :: y in s ==> x == y;

Ah, yes, the verifier can prove this assertion and can then do the rest of the proof.

On to the non-singleton case. We'll start like in the integer-set proof above:

  } else {
    var s' := s - {x};
    assert s == s' + {x};
    ThereIsAMinimum(s', R);

The verifier complains the lemma's postcondition might not hold, so we'll give the verifier more help.

The recursive call to the lemma (which, by the way, is known as “the induction hypothesis”) says there is a minimum in s'. Let's give it a name, which we do by introducing a local variable whose value we constrain to satisfy the condition in the lemma's postcondition's existential quantifier:

    var z :| z in s' && forall y :: y in s' ==> R(z, y);

Since s is s' + {x}, the minimum we're looking for is either z or x. Let's treat these cases separately. We do that by using an if statement. Since the two cases are so symmetric, I think it looks nice to use Dafny's if-case statement:

    // by connexity, one of the two cases below applies
    if
    case R(z, x) =>
      // prove z is the minimum not just of s', but of s
      // ...
    case R(x, z) =>
      // prove x is the minimum of s
      // ...

Let's do the R(z, x) case first. The hard part is likely to be the quantifier in the postcondition, so let's start with it. To give a proof of a universal quantifier, you use a forall statement.

      forall y | y in s
        ensures R(z, y)
      {

From y in s, we know either y is x or y is in s'. For the former, the proof follows from the guard R(z, x). For the latter, the proof follows from the quantification in the such-that condition we used to introduce z. Somehow, the verifier is not picking up on this, so let's help it along. We start here:

        assert x == y || y in s';
      }

Lo and behold! That's all the verifier needed.

In the R(x, z) case, we'll lay down the analogous forall statement to prove the quantifier in the lemma's postcondition:

      forall y | y in s
        ensures R(x, y)
      {

If y is x, then R(x, y) follows from connexity. On the other hand, if y is in s', then the such-that condition that introduced z tells us R(z, y), so we get R(x, y) by transitivity. You may add these hints in various ways to make the proof go through. Here is one way:

        assert y in s' ==> R(z, y);
      }

A final remark. When we proved ThereIsAMinimum for integer sets, I argued that it's often useful to include an assertion like:

    assert s == s' + {x};

I included it in the generic ThereIsAMinimum as well. But as it turns out, it's not actually needed there (because we made up for it in the additional hints we gave in the two cases), so you can delete it, if you want.

8. Coming back to map to sequence

We did it!

Well, before we declare success, let's just make sure we can use our functions and lemmas to write the MapToSequence function that we set out to write. Here's the generic version, which needs a given total order:

function method MapToSequence<A(!new),B>(m: map<A,B>, R: (A, A) -> bool): seq<(A,B)>
  requires IsTotalOrder(R)
{
  var keys := SetToSequence(m.Keys, (a,a') => R(a, a'));
  seq(|keys|, i requires 0 <= i < |keys| => (keys[i], m[keys[i]]))
}

We can also specialize it for integer sets:

function method IntMapToSequence<B>(m: map<int,B>): seq<(int,B)> {
  MapToSequence(m, (a, a') => a <= a')
}

We did it!

9. Conclusion

I hope this long journey showed you a thing or two about working with collections in Dafny.

References

[0]K. Rustan M. Leino. Compiling Hilbert's epsilon operator. In Ansgar Fehnker, Annabelle McIver, Geoff Sutcliffe, and Andrei Voronkov, editors, LPAR-20. 20th International Conferences on Logic for Programming, Artificial Intelligence and Reasoning — Short Presentations, volume 35 of EPiC Series in Computing, pages 106–118. EasyChair, 2015. 🔎