Dafny Power User:

The Parent Trick for proving termination, and a function-by-method use case

The Parent Trick for proving termination, and a function-by-method use case

**Abstract.** This note proves termination for a common situation of mutually recursive functions.
After that, it takes an arduous journey to turn an aggregating set operation
into compiled code, which gives an opportunity to showcase Dafny's
`function-by-method`

construct.

A nonempty prefix-closed set `S`

of strings can be stored as a
DAG where each edge is labeled with one character.
The presence in `S`

of a string `L`

is represented by the presence of a path from the root of the DAG through
edges whose labels, in order, spell out `L`

.
For example, a DAG that represents the set `{"", "a", "ab", "b"}`

is

```
Node (root)
/ \
a b
/ \
Node |
\ |
b |
\ /
Node
```

We declare the data structure for storing such sets of strings as follows:

```
datatype StringSet = Node(children: map<char, StringSet>)
predicate In(s: string, tree: StringSet) {
s == "" ||
var ch := s[0];
ch in tree.children.Keys &&
In(s[1..], tree.children[ch])
}
lemma Examples() {
var tree :=
var empty := Node(map[]);
var n := Node(map['b' := empty]);
Node(map['a' := n, 'b' := empty]);
assert In("", tree);
assert In("ab", tree);
assert !In("xyz", tree);
}
```

Let's write a function that computes the length of the longest string in the set.
We'll do it using two mutually recursive functions (and a helper
function `Maximum`

).

```
function MaxLen(tree: StringSet): nat {
if tree.children == map[] then
0
else
1 + MaxLenForest(tree.children.Values)
}
function MaxLenForest(trees: set<StringSet>): nat {
if trees == {} then
0
else
var t :| t in trees;
var m := MaxLen(t);
var n := MaxLenForest(trees - {t});
Maximum(m, n)
}
function Maximum(x: int, y: int): int {
if x < y then y else x
}
```

But herein lies a problem: proving termination. As it stands, the verifier
complains that it cannot prove termination for the call from `MaxLen`

to
`MaxLenForest`

or for the call from `MaxLenForest`

to `MaxLen`

. (But it is able to
prove termination for the recursive call to `MaxLenForest`

.)

Let's remind ourselves of what the `decreases`

clauses for the two functions are,
since we didn't write them explicitly and thus get Dafny's defaults. If you
hover over the functions in the IDE, you'll discover that the `decreases`

clauses that Dafny used are the following:

```
function MaxLen(tree: StringSet): nat
decreases tree
function MaxLenForest(trees: set<StringSet>): nat
decreases trees
```

Termination for the recursive call of `MaxLenForest`

verifies, because
`trees - {t}`

is smaller than `tree`

. But for the other two calls, it's
no wonder that termination cannot be proved, since Dafny's built-in
well-founded order does not relate datatypes (here, `StringSet`

) and
sets.

If we could arrange for the `decreases`

clauses to have the same
type, or at least that the types of one is a prefix of the types of the
other, we may be able to make progress. This can be done by thinking of
`MaxLenForest`

as operating in “the context of” an enclosing tree, that is,
the “parent” tree of the forest. Adding such a `parent`

parameter to
`MaxLenForest`

and, for the purpose of this presentation, showing the
default `decreases`

clauses explicitly, we get:

```
function MaxLen(tree: StringSet): nat
decreases tree
{
if tree.children == map[] then
0
else
1 + MaxLenForest(tree, tree.children.Values)
}
function MaxLenForest(parent: StringSet, trees: set<StringSet>): nat
decreases parent, trees
{
if trees == {} then
0
else
var t :| t in trees;
var m := MaxLen(t);
var n := MaxLenForest(parent, trees - {t});
Maximum(m, n)
}
```

This is a little better, because now the verifier can prove the
termination of the call from `MaxLen`

to `MaxLenForest`

. The reason it
can do that is that Dafny's lexicographic ordering considers
the 2-tuple `tree, tree.children.Values`

to be smaller than the
1-tuple `tree`

.

But we still need to prove termination of the call from `MaxLenForest`

back to `MaxLen`

. We'd like to argue that this call terminates because `t`

is
“smaller than”—that is, structurally included in—the enclosing parent.
But this information is not spelled out in the program. To make the
information available,
we need to write a precondition for `MaxLenForest`

that says that
the forest in enclosed in `parent`

. Dafny allows operator `<`

to be
used with datatypes, with the meaning of “structurally included in”.

```
function MaxLen(tree: StringSet): nat
decreases tree
{
if tree.children == map[] then
0
else
1 + MaxLenForest(tree, tree.children.Values)
}
function MaxLenForest(parent: StringSet, trees: set<StringSet>): nat
requires forall t :: t in trees ==> t < parent
decreases parent, trees
{
if trees == {} then
0
else
var t :| t in trees;
var m := MaxLen(t);
var n := MaxLenForest(parent, trees - {t});
Maximum(m, n)
}
```

With that precondition, the verifier completes the proof of the program.

So, although Dafny's well-founded order does not relate datatypes and
sets, even for a set that is structurally included in a datatype value,
the verifier does know that *the elements* of that set (assuming the
type of the elements is a datatype) are structurally included in the
parent. In symbols, the well-founded order does not let you prove

`tree.children < tree`

but it does let you prove

`tree.children[i] < tree`

for any `i`

in range. This is why “the parent trick” is an idiom that
lets you prove termination for these sorts of mutually recursive
functions.

In summary, the need for the parent trick came about because we had
two mutually recursive functions. This meant that we needed to “remember”
the context in which the `MaxLenForest`

function is called, which is
done by passing the extra parameter. If `MaxLenForest`

somehow were
written as just one recursive function, then we would still use Dafny's
knowledge that a datatype value `d`

inside the set inside a datatype
value `D`

is structurally included in `D`

(and thus `d < D`

), but we
wouldn't need to apply the parent trick to pass the context as a parameter.
I will show such an example below (see function `MaxLen`

in
Section 6).

First, in presenting the program above, I explicitly showed the `decreases`

clauses. But if you list the parameters in the order that I did, then the
explicit `decreases`

clause will coincide with Dafny's defaults. So, you
can omit the `decreases`

clauses and the verifier will still be able to
prove the program.

Second, as I wrote them, the functions are ghost, so they are erased by
the compiler and not available at run time. If you want the functions to
be available at run time, you might be bothered by having to pass the
“parent” parameter along everywhere. This is easily solved by then marking
the “parent” parameter as `ghost`

. This is possible, because `parent`

is
not used by the function, other than for its proof, so it's fine if the
compiler erases the parameter.

If you try doing what I just said in the two remarks, the functions will be declared as follows:

```
function method MaxLen(tree: StringSet): nat {
if tree.children == map[] then
0
else
1 + MaxLenForest(tree, tree.children.Values)
}
function method MaxLenForest(ghost parent: StringSet, trees: set<StringSet>): nat
requires forall t :: t in trees ==> t < parent
{
if trees == {} then
0
else
var t :| t in trees;
var m := MaxLen(t);
var n := MaxLenForest(parent, trees - {t});
Maximum(m, n)
}
```

but you'll notice an error:

```
var t :| t in trees;
^ here
Error: to be compilable, the value of a let-such-that expression must be
uniquely determined
```

This is a completely different problem than the termination problem we
worked to solve above. The problem is that the choice of a `t`

that satisfies
`t in trees`

is not unique, as is required for compiling the let-such-that expression.
The reasons for this restriction are explained in a different
paper [0].
If you don't care about compiling `MaxLen`

, you can stop reading now. Otherwise,
stretch your legs, get more coffee, and keep reading.

To make the choice of `t`

unique, we need to strengthen the constraint in the
let-such-that expression. For example, we might want to pick the “smallest” of the
values in `trees`

. But what does “smallest” mean among a set of `StringSet`

trees?
An easier way out is to change the program to pass the whole map `tree.children`

to `MaxLenForest`

, rather than just the children trees themselves (`tree.childre.Values`

).
Then, we can pick the smallest among the labels, which—since our labels have
type `char`

—is easy to do. Here's what we get:

```
function method MaxLen(tree: StringSet): nat {
if tree.children == map[] then
0
else
1 + MaxLenForest(tree, tree.children)
}
function method MaxLenForest(ghost parent: StringSet,
children: map<char, StringSet>): nat
requires forall lbl :: lbl in children.Keys ==> children[lbl] < parent
{
if children == map[] then
0
else
var lbl :|
lbl in children.Keys &&
forall lbl' :: lbl' in children.Keys ==> lbl <= lbl';
var m := MaxLen(children[lbl]);
var n := MaxLenForest(parent, children - {lbl});
Maximum(m, n)
}
```

Note that the expression `children - {lbl}`

is a map-domain subtraction. That is,
it denotes the map that is like `children`

, except it doesn't have the key `lbl`

.

The additional constraint makes the choice unique. However, now that we've
strengthened it, the verifier is no longer convinced there is any choice for
`lbl`

. To fix this problem, we need to demonstrate that every nonempty set of
characters has a smallest element. The easiest way to conduct such a demonstration
is to compute it:

```
function PickSmallest(s: set<char>): (ch: char)
requires s != {}
ensures ch in s && forall ch' :: ch' in s ==> ch <= ch'
{
var ch :| ch in s;
if ch' :| ch' in s && ch' < ch then
var s' := s - {ch};
assert s == s' + {ch};
PickSmallest(s')
else
ch
}
```

“But wait!”, you say. "How come this function gets away with the weak let-such-that
constraint that was a problem in `MaxLenForest`

?" It's because `PickSmallest`

is
a ghost function, so its let-such-that expression does not have the uniqueness
requirement (see [0]).
Alright, then, so our final step is to use function `PickSmallest`

just before the let-such-that expression in `MaxLenForest`

:

```
function method MaxLenForest(ghost parent: StringSet,
children: map<char, StringSet>): nat
requires forall lbl :: lbl in children.Keys ==> children[lbl] < parent
{
if children == map[] then
0
else
ghost var smallest := PickSmallest(children.Keys);
var lbl :|
lbl in children.Keys &&
forall lbl' :: lbl' in children.Keys ==> lbl <= lbl';
var m := MaxLen(children[lbl]);
var n := MaxLenForest(parent, children - {lbl});
Maximum(m, n)
}
```

After putting the result of `PickSmallest`

into a ghost variable, the verifier has
all the evidence it needs for the existence of a value of `lbl`

. So, this concludes
our program.

The running example has already showed several techniques and features in Dafny.
But in our final program above, we still have a rather big and unsatisfying
let-such-that constraint. The mandate that we have to pick a unique element from
the set seems unnecessarily harsh, since all we're trying to do is compute the
maximum of a set, and the maximum will be the same regardless of which order we
consider the elements in. So, what we'd like to do is use nondeterminism inside
`MaxLenForest`

, as long as we can prove that the nondeterminism does not affect
the result. And Dafny has just the feature for this: `function`

-`by`

-`method`

declarations.

To make a reusable function, let's define a general function `Max`

as follows:

```
function method Max(s: set<int>): int
requires s != {}
ensures Max(s) in s
ensures forall z :: z in s ==> z <= Max(s)
```

By constructing the set of elements we want to take the maximum of, we can
use `Max`

in `MaxLen`

:

```
function method MaxLen(tree: StringSet): nat {
if tree.children == map[] then
0
else
var s := set t | t in tree.children.Values :: MaxLen(t);
assert s != {} by {
assert forall lbl ::
lbl in tree.children.Keys ==> MaxLen(tree.children[lbl]) in s;
}
1 + Max(s)
}
```

As you see here, the verifier needed help in proving that `s`

is nonempty.
This proof obligation is addressed by asserting that, for every label in (the
known to be nonempty) `tree.children`

, `s`

contains the `MaxLen`

of the corresponding
value. (Since `MaxLen`

is now recursive and doesn't go through a mutually recursive
function like `MaxLenForest`

, we don't need the parent trick. That is, there's
no longer a need to pass any additional context parameter.)

From now on, we'll focus just on `Max`

. It would be nice to write its body as

```
var x :| x in s;
if s == {x} then
x
else
var s' := s - {x};
assert s == s' + {x};
var y := Max(s');
Maximum(x, y)
```

Alas, this is not allowed, because the compiled let-such-that does not uniquely
determine the value for `x`

. This is the exactly the problem that we started with
and want to steer around!

Function-by-method to the rescue.

Say what?! A `function-by-method`

is a combination of a function and method.
The body of the function part of a `function-by-method`

is a ghost
expression, but this expression acts just as the specification
for what value is to be returned. The body of the method part of the
`function-by-method`

then gives a compilable method body for computing the value.

Using a bit of program text, the basic idea is to declare a `function-by-method`

like

```
function F(x: X): (y: Y)
requires Pre(x)
ensures Post(x, y)
{
Expr;
} by method {
MBody;
}
```

where `Expr`

is a ghost expression (the “function part”) and
`MBody`

is a compilable statement list (the “method part”).

To explain the meaning of this `function-by-method`

, it's helpful to look at
it as two declarations:

```
function F(x: X): (y: Y)
requires Pre(x)
ensures Post(x, y)
{
Expr;
}
method _F(x: X) returns (y: Y)
requires Pre(x)
ensures y == F(x)
{
MBody;
}
```

Now, Dafny arranges that every use of the original `function-by-method`

in a ghost context calls ghost function `F`

, and that every use of the
`function-by-method`

is a non-ghost context calls the compiled method `_F`

.

Note that any `ensures`

clause on the original `function-by-method`

is a
proof obligation of the function part. The postcondition of the method part
is `y == F(x)`

. (I'll come back to this point later.)

Using a `function-by-method`

declaration, we write `Max`

as follows:

```
function Max(s: set<int>): (m: int)
requires s != {}
ensures m in s && forall z :: z in s ==> z <= m
{
var x :| x in s;
if s == {x} then
x
else
var s' := s - {x};
assert s == s' + {x};
var y := Max(s');
Maximum(x, y)
} by method {
m :| m in s;
var r := s - {m};
while r != {}
invariant r < s
invariant m in s && forall z :: z in s - r ==> z <= m
{
var x :| x in r;
assert forall z :: z in s - (r - {x}) ==> z in s - r || z == x;
r := r - {x};
if m < x {
m := x;
}
}
assert s - {} == s;
}
```

This function-by-method verifies as given. However, the reason it verifies is subtle
and may not be what you expect (especially if this is the first time you see
a `function-by-method`

). Let me explain the reason and then show a more typical
proof of a `function-by-method`

.

The point of the method part of a `function-by-method`

is to provide a method
implementation that returns exactly the same value as the given function body.
For this reason, the postcondition that the method body has to satisfy is
`m == Max(s)`

. This is a very precise postcondition.
The `ensures`

clause declared in the `function-by-method`

applies to the
function part. It is usually much less precise; in fact, for most functions,
this `ensures`

clause is omitted, since the body of the function transparently
says what the result value is.

In general, just because the method part of a `function-by-method`

happens to satisfy the `ensures`

clause of the `function-by-method`

does not
mean the method part is correct. But in our `Max`

example, the two coincide, because
the `ensures`

clause I gave for the `function-by-method`

uniquely determines `m`

.
Thus, any method body that satisfies this `ensures`

clause also satisfies `m == Max(s)`

.
(You can confirm this by deleting `m in s`

from the `ensures`

clause. This will
cause the method part to fail to verify, while the function part still verifies.)

A more generally applicable pattern for proving the method part of a recursive
makes use of the function in the loop invariant. To illustrate this, I will remove
the `ensures`

clause of `Max`

. Consequently, the iterative method body
needs to establish that it computes exactly what the recursion function body
does.

The idea is to use a body like this:

```
m :| m in s;
var r := {m};
while r != s
invariant {} != r <= s
invariant m == Max(r)
{
var x :| x in s - r;
r := r + {x};
if m < x {
m := x;
}
}
```

This loops maintains the invariant `m == Max(r)`

while enlarging `r`

until it
becomes `s`

. Doing this proof is tricky, because of two issues. Both of these
issues have standard solutions, which are really good to know. The two issues
and solutions are explained for a simpler example in another Dafny Power User
note [1], but it seems worthwhile to show them here as well.

The first issue is that the let-such-that expression is like a function, but by
Dafny semantics it may be a different function for each textual occurrence of `:|`

(see [0]).
It is easy to become confused and frustrated if, without realizing it, you're
using more than one `:|`

function. Therefore, let's make sure there is only one
textual occurrence of this operator. We do that by putting it inside
a named function:

```
function Pick(s: set<int>): int
requires s != {}
{
var x :| x in s; x
}
```

With this function, we write the function part of the `function-by-method`

`Max`

as

```
var x := Pick(s);
if s == {x} then
x
else
Maximum(Max(s - {x}), x)
```

The second issue is that we will need to help the verifier along in the proof
of invariance for the loop. The reason is that the next element that the loop
adds to the growing set `r`

may not be the same element that the `Pick`

function
would choose when going in the opposite direction. Specifically, if the loop
adds `x`

to `r`

as it's computing `Max(r + {x})`

, there's no reason to believe
that the definition of `Max(r + {x})`

picks `x`

and recurses on `Max(r)`

.
(Again, for a simpler example and better motivation, see [1].)

This lemma is tricky to prove, even if you've written similar proofs several
times. An additional complication for our function `Max`

is that `Max`

requires
its argument to be nonempty. Here is the lemma and a proof:

```
lemma MaxOfOneMore(t: set<int>, x: int)
requires t != {} && x !in t
ensures Max(t + {x}) == Maximum(Max(t), x)
decreases |t|
{
var u := t + {x};
var z := Pick(u);
var u'x, u'z := u - {x}, u - {z};
assert Max(u) == Maximum(Max(u'z), z);
if z == x {
// Straight from the definition of Max
} else if t == {z} {
assert u'z == {x} && u'x == {z};
// This is known about Max on singletons:
assert Max(u'z) == x;
assert Max(u'x) == z;
// So, regardless of which of x and z gets picked first, the result is the same
assert Max(u) == Maximum(x, z) == Maximum(z, x);
} else {
var u'z'x := u'z - {x};
calc {
Maximum(Max(u'z), z);
== { assert u'z == (u'z'x) + {x}; }
Maximum(Max((u'z'x) + {x}), z);
== { MaxOfOneMore(u'z'x, x); }
Maximum(Maximum(Max(u'z'x), x), z);
== // Maximum is commutative and associative
Maximum(Maximum(Max(u'z'x), z), x);
== { MaxOfOneMore(u'z'x, z); }
Maximum(Max(u'z'x + {z}), x);
== { assert u'z'x + {z} == u'x == t; }
Maximum(Max(t), x);
}
}
}
```

That's a mouthful. But once you have the lemma, using it once in the loop
body finishes the proof of our `function-by-method`

:

```
function Max(s: set<int>): (m: int)
requires s != {}
{
var x := Pick(s);
if s == {x} then
x
else
Maximum(Max(s - {x}), x)
} by method {
m :| m in s;
var r := {m};
while r != s
invariant {} != r <= s
invariant m == Max(r)
{
var x :| x in s - r;
MaxOfOneMore(r, x);
r := r + {x};
if m < x {
m := x;
}
}
}
```

A final remark. It looks as if `Max`

uses the lemma `MaxOfOneMore`

, and the lemma
uses `Max`

. Doesn't this mutual recursion mean we have to prove termination?
No, because there's isn't actually any mutual recursion in this example. To understand
why, remember that the two parts of a `function-by-method`

are treated as
if they were one function and one method (see Section 8).
This function and method are different points in the call graph, where calls to
`Max`

from ghost contexts go to the function part and calls to `Max`

from compiled
contexts go to the method part.
Therefore, we have that the following call-graph edges:

```
Max (function) --> Max (function)
MaxOfOneMore --> MaxOfOneMore, Max (function)
Max (method) --> MaxOfOneMore, Max (function)
```

As you can see from these, the function part of `Max`

is recursive and the lemma is
recursive, but there is no mutual recursion.

I started this note with the innocent problem of how to prove the termination of two
mutually recursive functions. As the solution, I showed the Parent Trick.
I then changed the code to make it compilable, which
introduced proof obligations to show certain choices to be deterministic. This can
be better solved using a `function-by-method`

, so I gave a brief introduction
to Dafny's `function-by-method`

declaration and used it to present two solutions.
The more straightforward solution worked for the example at hand, whereas
the second solution shows a more typical pattern for such method implementations.

I'm grateful to Mikaël Mayer for asking a question about these mutually recursive
functions, which propelled me to write this note, and for his helpful comments
on the write-up. The `function-by-method`

construct
was designed as a direct outgrowth of some collaborative work with Daniel Matichuk,
Olivier Savary Belanger, and Mike Dodds, where as a step toward verified compilation
we were considering how to implement immutable things with mutable things.

[0]K. Rustan M. Leino.
Compiling Hilbert's epsilon operator.
In Ansgar Fehnker, Annabelle McIver, Geoff Sutcliffe, and Andrei
Voronkov, editors, *LPAR-20. 20th International Conferences on Logic for
Programming, Artificial Intelligence and Reasoning — Short Presentations*,
volume 35 of *EPiC Series in Computing*, pages 106–118. EasyChair, 2015.
https://easychair.org/publications/paper/dM. 🔎

[1]K. Rustan M. Leino.
Dafny power user: Functions over set elements.
Manuscript KRML 274, February 2020.
http://leino.science/papers/krml274.html. 🔎