Dafny Power User:
Functions over Set Elements
 K. Rustan M. Leino
Manuscript KRML 274, 16 February 2020

Abstract. In natural language, it is easy to say and understand a phrase like “the sum of the numbers in a set”. Defining and working with such functions in a formal settings is more work. The problem has to do with how a recursively defined function picks the next element from a set. This note describes a representative example and describes how to make the formal mumbo-jumbo work out. The solution can be applied to any commutative and associative operation on a set.

## 0. Summing the elements of a set

Suppose we have a function that returns the sum of the integers in a set:

function Sum(s: set<int>): int

If we add an element y to a set, we expect its sum to go up by y. That is, we expect that the following method is correctly implemented:

method AddElement(s: set<int>, a: int, y: int) returns (t: set<int>, b: int)
requires a == Sum(s) && y !in s
ensures t == s + {x} && b == Sum(t)
{
t := s + {y};
b := a + y;
}

It turns out, the proof is not automatic. Let's look at the details and fill in the proof.

## 1. Recursive definition of Sum

Function Sum is defined recursively. The sum of the empty set is 0. If the set is nonempty, pick one of its elements, say x. Then, add x to the recursively computed sum of the remaining elements.

function Sum(s: set<int>): int {
if s == {} then 0 else
var x := Pick(s);
x + Sum(s - {x})
}

This definition uses a function Pick, which returns an arbitrary element from a given set. Here is its definition:

function Pick(s: set<int>): int
requires s != {}
{
var x :| x in s; x
}

I'll come back to Pick later. All you need to understand at this time is that the caller of Pick has no control over which element of s is returned.

## 2. The proof that fails

To prove AddElement, we need to show b == Sum(t) holds in its final state. Working backwards over the assignments, this means we need to show

a + y == Sum(s + {y})

in the initial state. Since a is Sum(s), our proof obligation comes down to

Sum(s) + y == Sum(s + {y})


where we are given that y is not in s.

Suppose Pick(s + {y}) returns y. Then, we have

  Sum(s + {y});
==  // def. Sum
var x := Pick(s + {y}); x + Sum(s + {y} - {x});
==  // using the assumption Pick(s + {y}) == y
y + Sum(s + {y} - {y});
==  // sets, since y !in s
y + Sum(s);

That was easy and straightforward. But for this proof, we assumed that the relevant call to Pick returned y. What if Pick returns a different element from s?

## 3. Picking something else

Before you realize Pick can choose a different element than the one you have in mind, the clouds start to clear. What we need is a lemma that says the choice is immaterial. That is, the lemma will let us treat Sum as if it picks, when doing its recursive call, an element that we specify.

Here is that lemma. The proof is also a little tricky at first. It comes down to letting Pick choose whatever element it chooses, and then applying the induction hypothesis on the smaller set that Sum recurses on.

lemma SumMyWay(s: set<int>, y: int)
requires y in s
ensures Sum(s) == y + Sum(s - {y})
{
var x := Pick(s);
if y == x {
} else {
calc {
Sum(s);
==  // def. Sum
x + Sum(s - {x});
==  { SumMyWay(s - {x}, y); }
x + y + Sum(s - {x} - {y});
==  { assert s - {x} - {y} == s - {y} - {x}; }
y + x + Sum(s - {y} - {x});
==  { SumMyWay(s - {y}, x); }
y + Sum(s - {y});
}
}
}

I stated the lemma to look like the expressions in the body of Sum, so the two arguments to Sum are s and s - {y}. Alternatively, we can state the property in terms of calls to Sum with the arguments s + {y} and s. This alternative is a simple corollary of the lemma above:

lemma AddToSum(s: set<int>, y: int)
requires y !in s
ensures Sum(s + {y}) == Sum(s) + y
{
SumMyWay(s + {y}, y);
}

Using the lemma

Equipped with the useful lemma, it's easy to get the proof of AddElement go through: change its body to

t := s + {y};
b := a + y;
AddToSum(s, y);

## 4. Inlining Pick

In the development above, I define Pick as a separate function. Reading the word “pick” in the program text may help understand what Sum and SumMyWay do. But it's such a small function, so why not just inline it in the two places where it's used. Let's try it:

function Sum(s: set<int>): int {
if s == {} then 0 else
var x :| x in s;  // this line takes the place of a call to Pick
x + Sum(s - {x})
}

lemma SumMyWay(s: set<int>, y: int)
requires y in s
ensures Sum(s) == y + Sum(s - {y})
{
var x :| x in s;  // this line takes the place of a call to Pick
if y == x {  // error: postcondition might not hold on this path
} else {
calc {
Sum(s);
==  // def. Sum        // error: this step might not hold
x + Sum(s - {x});
==  { SumMyWay(s - {x}, y); }
x + y + Sum(s - {x} - {y});
==  { assert s - {x} - {y} == s - {y} - {x}; }
y + x + Sum(s - {y} - {x});
==  { SumMyWay(s - {y}, x); }
y + Sum(s - {y});
}
}
}

We now get two errors!

To explain what's going on, let me say a little more about :| and what makes it unusual.

## 5. Let such that

The let-such-that construct in Dafny has the form

var x :| P; E

It evaluates to E, where x is bound to some value satisfying P. For example,

var x :| 7 <= x < 10; 2 * x

evaluates to 14, 16, or 18. As the programmer, you have no control over which value of x is chosen. But you do get to know two important things. One is that x will be chosen to be a value that satisfies P. (The Dafny verifier gives an error if it cannot prove such a value to exist.) The other is that you will get the same value every time you evaluate the expression with the same inputs. In other words, the operator is deterministic.

Here is another example to illustrate the point about determinism:

var x :| x in {2, 3, 5}; x

This expression chooses x to be one of the three smallest primes (2, 3, or 5) and then returns it. You don't know which of the three values you get, but you are guaranteed that every time this expression is evaluated within one run of a program, you will get the same value.

Let's be more precise about what I mean by “this expression”. In Dafny, every textual occurrence of a let-such-that expression gets to make its own choices. One way to think about this is to go through the text of your program and to color each :| operator with a unique color. Then, you can rely on choices being the same only if they are performed by the same-color :|.

Here is an illustrative example.

lemma Choices(s: set<int>)
requires s != {}
{
var a := Pick(s);
var b := Pick(s);
assert a == b;  // this is provable
a := var x :| x in s; x;
b := var x :| x in s; x;
assert a == b;  // error: not provable
}

The first values assigned to a and b originate from the same :| operator. They are the results of choices of the same color. Therefore, they are known to be the same. In contrast, the next values assigned to a and b originate from different :| operatorsones of different colors. Therefore, you cannot be sure a and b are equal.

Actually, if you think about it a little more (or, maybe, a little less), then you realize that we know the first values assigned to a and b to be equal even without knowing anything about the body of Pick. After all, Pick is a function, and if you call a function twice on the same arguments, it will give you back the same value. Mathematics guarantees this, and so does Dafny. So, then what about the second assignments to a and b; aren't the :| operators in those expressions also functions? Yes, they are, but they are different functions. They are functions of different colors, to follow that analogy. As long as you think of every occurrence of :| in your program as being a different function, then all mathematics work out as you'd expect.

This is why it was easier for me to describe the Sum situation if I could use just one :|. To reuse that same :|, I placed it in a function, which I named Pick. I recommend you do the same if you're working with ghost functions that involve choices that you want to prove properties about.

## 6. Different choices

If you tried to define Sum and use it in AddElement before understanding these issues, you would be perplexed. Now, you know that it is easier to put :| into a function by itself, and you know that you'll need to write a lemma like SumMyWay. You may be curious if it's possible to do without the Pick function. That is, you may wonder if there's any way to use one :| operator in Sum and another :| operator in SumMyWay. Yes, it is possible. Let me show you how.

Suppose we inline Pick in function Sum. That is, suppose we define Sum as in Section 4 above. In that section, I mentioned that you'll get a couple of errors if you also inline Pick in SumMyWay. Both of those errors stem from the fact that Sum and SumMyWay make different choices. But we can be more specific in the lemma, to force it to choose the same element as the one chosen in Sum.

You can do that by saying you want x not just to be in s, but to be a value that makes

Sum(s) == x + Sum(s - {x})

hold true. Only one such x exists, and it's the one that Sum chooses. So, if you write lemma as follows:

lemma SumMyWay(s: set<int>, y: int)
requires y in s
ensures Sum(s) == y + Sum(s - {y})
{
var x :| x in s && Sum(s) == x + Sum(s - {x});
if y == x {
} else {
// same calc statement as before...
}
}

then it verifies! This is good to know, but it seems cleaner to introduce the function Pick around your :|.

## 7. Summary

Beware that every textual occurrence of :| in your program is a different function. You'll have a simpler time working with :| if you roll it into a function that you name, because then you reduce the chance of becoming confused because of different kinds (different “colors”) of choices.

Also, beware that the choice made by :| may not be the choice you need. You'll probably want to prove a lemma that says any choice gives the same result in the end. Use lemma SumMyWay above as a template for your proof.