Dafny Power User:

Functions over Set Elements

Functions over Set Elements

**Abstract.** In natural language, it is easy to say and understand a phrase like
“the sum of the numbers in a set”. Defining and working with such functions
in a formal settings is more work. The problem has to do with how a
recursively defined function picks the next element from a set. This note
describes a representative example and describes how to make the formal
mumbo-jumbo work out. The solution can be applied to any commutative and
associative operation on a set.

Suppose we have a function that returns the sum of the integers in a set:

`function Sum(s: set<int>): int`

If we add an element `y`

to a set, we expect its sum to go up by `y`

.
That is, we expect that the following method is correctly implemented:

```
method AddElement(s: set<int>, a: int, y: int) returns (t: set<int>, b: int)
requires a == Sum(s) && y !in s
ensures t == s + {x} && b == Sum(t)
{
t := s + {y};
b := a + y;
}
```

It turns out, the proof is not automatic. Let's look at the details and fill in the proof.

Function `Sum`

is defined recursively. The sum of the empty set is `0`

.
If the set is nonempty, pick one of its elements, say `x`

. Then, add
`x`

to the recursively computed sum of the remaining elements.

```
function Sum(s: set<int>): int {
if s == {} then 0 else
var x := Pick(s);
x + Sum(s - {x})
}
```

This definition uses a function `Pick`

, which returns an arbitrary element
from a given set. Here is its definition:

```
function Pick(s: set<int>): int
requires s != {}
{
var x :| x in s; x
}
```

I'll come back to `Pick`

later. All you need to understand at this time is
that the caller of `Pick`

has no control over which element of `s`

is
returned.

To prove `AddElement`

, we need to show `b == Sum(t)`

holds in its final state.
Working backwards over the assignments, this means we need to show

`a + y == Sum(s + {y})`

in the initial state. Since `a`

is `Sum(s)`

, our proof obligation comes down
to

```
Sum(s) + y == Sum(s + {y})
```

where we are given that `y`

is not in `s`

.

Suppose `Pick(s + {y})`

returns `y`

. Then, we have

```
Sum(s + {y});
== // def. Sum
var x := Pick(s + {y}); x + Sum(s + {y} - {x});
== // using the assumption Pick(s + {y}) == y
y + Sum(s + {y} - {y});
== // sets, since y !in s
y + Sum(s);
```

That was easy and straightforward. But for this proof, we assumed that the
relevant call to `Pick`

returned `y`

. What if `Pick`

returns a different element
from `s`

?

Before you realize `Pick`

can choose a different element than the one you
have in mind, the clouds start to clear. What we need is a lemma that says
the choice is immaterial. That is, the lemma will let us treat `Sum`

as if
it picks, when doing its recursive call, an element that we specify.

Here is that lemma. The proof is also a little tricky at first. It comes
down to letting `Pick`

choose whatever element it chooses, and then
applying the induction hypothesis on the smaller set that `Sum`

recurses
on.

```
lemma SumMyWay(s: set<int>, y: int)
requires y in s
ensures Sum(s) == y + Sum(s - {y})
{
var x := Pick(s);
if y == x {
} else {
calc {
Sum(s);
== // def. Sum
x + Sum(s - {x});
== { SumMyWay(s - {x}, y); }
x + y + Sum(s - {x} - {y});
== { assert s - {x} - {y} == s - {y} - {x}; }
y + x + Sum(s - {y} - {x});
== { SumMyWay(s - {y}, x); }
y + Sum(s - {y});
}
}
}
```

I stated the lemma to look like the expressions in the body of `Sum`

, so
the two arguments to `Sum`

are `s`

and `s - {y}`

. Alternatively, we can state
the property in terms of calls to `Sum`

with the arguments `s + {y}`

and `s`

.
This alternative is a simple corollary of the lemma above:

```
lemma AddToSum(s: set<int>, y: int)
requires y !in s
ensures Sum(s + {y}) == Sum(s) + y
{
SumMyWay(s + {y}, y);
}
```

Using the lemma

Equipped with the useful lemma, it's easy to get the proof of
`AddElement`

go through: change its body to

```
t := s + {y};
b := a + y;
AddToSum(s, y);
```

In the development above, I define `Pick`

as a separate function. Reading
the word “pick” in the program text may help understand what `Sum`

and
`SumMyWay`

do. But it's such a small function, so why not just inline it
in the two places where it's used. Let's try it:

```
function Sum(s: set<int>): int {
if s == {} then 0 else
var x :| x in s; // this line takes the place of a call to Pick
x + Sum(s - {x})
}
lemma SumMyWay(s: set<int>, y: int)
requires y in s
ensures Sum(s) == y + Sum(s - {y})
{
var x :| x in s; // this line takes the place of a call to Pick
if y == x { // error: postcondition might not hold on this path
} else {
calc {
Sum(s);
== // def. Sum // error: this step might not hold
x + Sum(s - {x});
== { SumMyWay(s - {x}, y); }
x + y + Sum(s - {x} - {y});
== { assert s - {x} - {y} == s - {y} - {x}; }
y + x + Sum(s - {y} - {x});
== { SumMyWay(s - {y}, x); }
y + Sum(s - {y});
}
}
}
```

We now get two errors!

To explain what's going on, let me say a little more about `:|`

and what
makes it unusual.

The let-such-that construct in Dafny has the form

`var x :| P; E`

It evaluates to `E`

, where `x`

is bound to some value satisfying `P`

.
For example,

`var x :| 7 <= x < 10; 2 * x`

evaluates to `14`

, `16`

, or `18`

. As the programmer, you have no control
over which value of `x`

is chosen. But you do get to know two important
things. One is that `x`

will be chosen to be a value that satisfies `P`

.
(The Dafny verifier gives an error if it cannot prove such a value to exist.)
The other is that you will get the same value every time you evaluate
the expression with the same inputs. In other words, the operator is
deterministic.

Here is another example to illustrate the point about determinism:

`var x :| x in {2, 3, 5}; x`

This expression chooses `x`

to be one of the three smallest primes
(`2`

, `3`

, or `5`

) and then returns it. You don't know which of the
three values you get, but you are guaranteed that every time this expression
is evaluated within one run of a program, you will get the same value.

Let's be more precise about what I mean by “this expression”. In Dafny,
every *textual occurrence* of a let-such-that expression gets to make its
own choices. One way to think about this is to go through the text of your
program and to color each `:|`

operator with a unique color. Then, you
can rely on choices being the same only if they are performed by the
same-color `:|`

.

Here is an illustrative example.

```
lemma Choices(s: set<int>)
requires s != {}
{
var a := Pick(s);
var b := Pick(s);
assert a == b; // this is provable
a := var x :| x in s; x;
b := var x :| x in s; x;
assert a == b; // error: not provable
}
```

The first values assigned to `a`

and `b`

originate from the same `:|`

operator. They are the results of choices of the same color. Therefore,
they are known to be the same. In contrast, the next values assigned
to `a`

and `b`

originate from different `:|`

operators—ones of different
colors. Therefore, you cannot be sure `a`

and `b`

are equal.

Actually, if you think about it a little more (or, maybe, a little less),
then you realize that we know the first values assigned to `a`

and `b`

to be
equal even without knowing anything about the body of `Pick`

. After all,
`Pick`

is a function, and if you call a function twice on the same arguments,
it will give you back the same value. Mathematics guarantees this, and so
does Dafny. So, then what about the second assignments to `a`

and `b`

; aren't
the `:|`

operators in those expressions also functions? Yes, they are,
but they are different functions. They are functions of different colors,
to follow that analogy. As long as you think of every occurrence of `:|`

in your program as being a *different* function, then all mathematics work
out as you'd expect.

This is why it was easier for me to describe the `Sum`

situation if I could
use just one `:|`

. To reuse that same `:|`

, I placed it in a function, which
I named `Pick`

. I recommend you do the same if you're working with
ghost functions that involve choices that you want to prove properties about.

If you tried to define `Sum`

and use it in `AddElement`

before understanding
these issues, you would be perplexed. Now, you know that it is easier to
put `:|`

into a function by itself, and you know that you'll need to write
a lemma like `SumMyWay`

. You may be curious if it's possible to do without
the `Pick`

function. That is, you may wonder if there's any way to use one
`:|`

operator in `Sum`

and another `:|`

operator in `SumMyWay`

. Yes, it is
possible. Let me show you how.

Suppose we inline `Pick`

in function `Sum`

. That is, suppose we define
`Sum`

as in Section 4 above. In that section, I mentioned
that you'll get a couple of errors if you also inline `Pick`

in `SumMyWay`

.
Both of those errors stem from the fact that `Sum`

and `SumMyWay`

make
different choices. But we can be more specific in the lemma, to force it
to choose the same element as the one chosen in `Sum`

.

You can do that by saying you want `x`

not just to be in `s`

, but to be
a value that makes

`Sum(s) == x + Sum(s - {x})`

hold true. Only one such `x`

exists, and it's the one that `Sum`

chooses.
So, if you write lemma as follows:

```
lemma SumMyWay(s: set<int>, y: int)
requires y in s
ensures Sum(s) == y + Sum(s - {y})
{
var x :| x in s && Sum(s) == x + Sum(s - {x});
if y == x {
} else {
// same calc statement as before...
}
}
```

then it verifies! This is good to know, but it seems cleaner to introduce
the function `Pick`

around your `:|`

.

Beware that every textual occurrence of `:|`

in your program is a different
function. You'll have a simpler time working with `:|`

if you roll it into
a function that you name, because then you reduce the chance of becoming
confused because of different kinds (different “colors”) of choices.

Also, beware that the choice made by `:|`

may not be the choice you need.
You'll probably want to prove a lemma that says any choice gives the same
result in the end. Use lemma `SumMyWay`

above as a template for your proof.