<small>Dafny Power User:</small><br>Comprehensions

Dafny Power User:
Comprehensions

K. Rustan M. Leino

Manuscript KRML 267, 27 May 2019

Abstract. Dafny has a number of comprehension-like constructs. This note describes and compares these constructs, showing how they compare both syntactically and semantically.

Dafny supports universal and existential quantifications, along with constructs used to prove a universally quantified expression or make use of an existentially quantified expression. Section 0 describes these logical quantifiers in Dafny. Section 1 shows program statements that can be used when reasoning about the quantifiers and points out differences in the various syntactic forms.

Set comprehensions and map comprehensions are like quantifiers in that they introduce bound variables that range over certain values. Section 2 shows the general and common forms of these comprehensions.

0. Quantifiers

Basic quantifier syntax

In mathematical textbooks and papers, the familiar universal quantifier takes on some notation like . It says that the predicate holds for all values of . In programming-language lingo, we say that is a bound variable whose scope is the body of the quantifier, . That is, any free occurrences of in are bound to the introduced by the quantifier.

In Dafny, the same universal quantifier is written forall x :: P. From a parsing perspective, the body of the quantifier extends “as far as possible”. Thus, the program snippet

forall x :: R ==> Q

is parsed as

(forall x :: (R ==> Q))

not as

(forall x :: R) ==> Q

Note that “as far as possible” does not stop at line endings. For example, a common pitfall is to write (here shown for a precondition)

requires
  forall x :: R ==> Q &&
  S

with the intention that forall x :: R ==> Q and S are two separate preconditions. Contrary to this expectation, the meaning of declaration, as written here, is

requires (forall x :: (R ==> (Q && S)))

If you intended to write the conjunction of the quantifier and S, then the proper syntax is

requires
  (forall x :: R ==> Q) &&
  S

A familiar mathematical notation for an existential quantifier is . It says that the predicate holds for some value of . In Dafny, the syntax is exists x :: P.⁰

Types of bound variables

Each variable in Dafny has a type. Typically, the type of a bound variable is inferred, but Dafny also allows the type to be declared explicitly. For example,

forall x: X :: P

declares the type of x to be X. For brevity, and to show the typical ways of writing quantifiers and comprehensions, I will leave off types throughout this note, but remember that you can always include them if you want to.

A common mathematical notation for quantifiers when the bound variables are drawn from some set is . A Dafny-like rendering of this expression is

forall x in S :: P  // error: syntax error

However, this is incorrect Dafny syntax, because it uses a set membership predicate where only the bound variable (optionally, with a type) is expected. The proper way to write such a quantifier in Dafny is

forall x :: x in S ==> P

Multiple bound variables

A quantifier can have more than one bound variable. For example,

forall x, y :: P

says that P holds for all values of x and y. It is logically equivalent to the nested quantifiers

forall x :: forall y :: P

For that matter, it is logically equivalent also to

forall y :: forall x :: P

Common practice in Dafny is to prefer the quantifier with multiple variables over the nested forms, if for no other reason than that it is more concise.¹

In the event that you write a list of bound variables and give types explicitly, note that each given type applies only to the variable that immediately precedes it. For example,

forall x: X, y: Y :: P

says that x has type X and y has type Y. If you only include the type of y, as in

forall x, y: Y :: P

then you are saying that y has type Y and that the type of x is to be inferred. In other words, you can think of this ":“ as having strong binding power than the ”,".

The typical forms of quantifier bodies

The body of a universal quantifier is typically an implication, as in

forall x :: R ==> P

You can read this in one of the following ways:

"for all x, the implication R ==> P holds"

"for all x, R implies P"

"for all x, if R holds, then so does P"

However, the antecedent of this implication (R) often serves the role of restricting (beyond just the type of x) the values of x under consideration. In other words, R tells you which values x ranges over. In that light, you would read the quantifier above in one of the following ways:

"for all x satisfying R, P holds

"for all x such that R holds, P"

"for all x (where x satisfies R), P holds"

"for all x [insert your own descriptive phase for R], P"

As a concrete instance of the last phrase, you may read forall x :: x in S ==> x % 2 == 0 as

"for all x in S, x is even"

and you may read forall i :: 0 <= i < a.Length ==> a[i] == 5 as

"for every index i of array a, a-sub-i is 5"

In analogy to what I just said about universal quantifiers, the typical form of an existential quantifier is a conjunction, as in

exists x :: R && P

For example:

exists x :: x in S && x % 2 == 0

exists i :: 0 <= i < a.Length && a[i] == 5

Again thinking of R as telling you which values x ranges over, you may read these existential quantifiers as

"there is an x in S for which x % 2 == 0 holds"

"there is an index i into a such that a-sub-i is 5"

Following Why3 [1], Dafny issues a warning if you write R ==> P as the body of an existential quantifier, because this is almost always a user error (a typo or a think-o). If this is really what you want to write, you can suppress the warning by instead writing any of the following expressions:

exists x :: (R ==> P)
exists x :: !R || P
exists x :: P <== R

Range predicates

Why did I just spend a page telling you ways to pronounce your quantifiers? Because that discussion spotlights the fact that the condition R, in either of

forall x :: R ==> P
exists x :: R && P

plays a special role, even though R is really just a part of the body of these quantifiers. In fact, others have adopted a notation for quantifier that feature a special place for this range predicate R. Here are some examples:


Universal quantifier	Existential quantifier	Source

		Dijkstra [3]
		Chandy and Misra [2]
		Gries and Schneider [4]
`\forall X x; R; P`	`\exists X x; R; P`	JML [5]

(In the case of JML above, X denotes the type of x.) In textbooks using these notations, it is often remarked that "for brevity, if R is true or is understood from context, then it (and for some of the authors above, some adjacent punctuation) is omitted". These shortened forms are:


Range listed separately	Range `true` or omitted	Source

		Dijkstra [3]
		Chandy and Misra [2]
		Gries and Schneider [4]
`\forall X x; R; P`	`\forall X x;; P`	JML [5]

There's more. By using some notation that separates the R from P, the De Morgan's Law for quantifiers looks especially nice:

Back to Dafny. If you like the notation where you get to separate the range of the bound variables from the rest of the quantifier body, then you'll be glad to learn that you can do this in Dafny, too. The syntax is:

forall x | R :: P
exists x | R :: P

1. Program Statements for Quantifier Reasoning

Dafny includes some proof features that are useful when reasoning about programs or theorems that involve quantifiers. These have a syntax similar to those of quantifiers, but there are differences.

Aggregate statements

The forall statement in Dafny is an aggregate statement: it has the effect of performing a number of simultaneous operations. When used in proofs, the statement has the form:

forall x | R
  ensures P
{
  S;
}

It is used to establish the property forall x | R :: P, that is, forall x :: R ==> P. It does so by checking that the statement S establishes P for any x that satisfies R. In logic, the effect of this statement is called universal introduction.

As a simple example, suppose you have a lemma that proves n <= Fib(n) for any n at least 5, where Fib is the usual Fibonacci function:

function Fib(n: nat): nat {
  if n < 2 then n else Fib(n-2) + Fib(n-1)
}

lemma FibProperty(n: nat)
  requires 5 <= n
  ensures n <= Fib(n)
{
  // some proof goes here
}

This lemma gives you the property n <= Fib(n) for a given n. But suppose you want to have this property in the universally quantified form. That is, you'd like to prove the following lemma:

lemma FibPropertyAll()
  ensures forall n :: 5 <= n ==> n <= Fib(n)
{
  // some proof to go here
}

How would we write this proof?²

The answer is to call FibProperty once for each n. All at once. For an infinite number of different values for n. That's what you do with the aggregate statement forall:

forall n | 5 <= n
  ensures n <= Fib(n)
{
  FibProperty(n);
}

In general, the body of a forall statement is more complicated than just one single lemma call. But for when the body is just one lemma call or just one calc statement, then Dafny infers the ensures clause automatically, so you can omit it:

forall n | 5 <= n {
  FibProperty(n);
}

Existential introduction and elimination

Working with existential quantifications also uses a repertoire of proof features. I will demonstrate these by writing a proof that shows Fibonacci numbers can be arbitrarily large:

lemma EverBigger(k: nat)
  ensures exists n :: k <= Fib(n)
{
  // proof to go here
}

Let's start the proof with some cases we can easily do, namely when k is small, let's say 0 or 1:

if k < 2 {
  // simple case: proof for k being 0 or 1 goes here
} else {
  // difficult case: proof for larger k goes here
}

Dafny does not prove either of these cases automatically, so we need to give more of the proofs ourselves.

To prove the lemma in the simple case, it suffices to demonstrate to the verifier a particular n for which the existential quantifier holds. That is, we want to give a witness to the existential quantifier. One such witness is 1, since k <= 1 == Fib(1). Another such witness is 12, since k <= 144 == Fib(12). Yet another such witness is k, since k <= k == Fib(k) in our simple case. Let's go with this one, so we add an assertion to the “then” branch of the if statement in the lemma body:

assert k <= Fib(k);

Dafny will prove this assertion³ and will then notice that k is an existential witness that proves the postcondition. In logic, this is called existential introduction. All that means is that if you have a value that satisfies a particular property, then such a value exists. Stated different, if you have a value “in your hands”, then a value exists—this seems so obvious that we almost feel awkward speaking about it (your next-door neighbor would certainly think you crazy to hear that this is what you did for a living).

So what about the difficult case? We can prove it by induction, by first obtaining an n whose Fibonacci value is at least k-1 and then building an even larger Fibonacci value from there. To start this off, we call the lemma recursively on k-1:

EverBigger(k-1);

This lets us obtain the postcondition of EverBigger(k-1). To write that down explicitly in our proof—to check that the verifier draws the conclusion we'd expect from the lemma call and to remind ourselves of what property is—we can write an assertion:

assert exists n' :: k-1 <= Fib(n');

Good so far. Next, we want to construct a Fibonacci number that is at least 1 larger than Fib(n'), because that would complete the proof. But what is this n' that i just mentioned? All the assertion above tells us that some such n' exists. We'd like to have such an n' “in our hands” so that we can work with it.

Going from something we know exists to something “in our hands” is called Skolemization or existential elimination. You achieve it in Dafny by the assign-such-that statement:

var m: nat :| k-1 <= Fib(m);

This statement introduce a local variable m and gives it some arbitrary value that satisfies k-1 <= Fib(m). Of course, this would not be possible if no such value exists, so the assign-such-that statement incurs a proof obligation that such an m exists. This proof obligation follows from the property we asserted just above.

Almost there. All that remains of our plan to establish the lemma's postcondition is to construct a Fibonacci number strictly larger than Fib(m). We observe that Fib(m) + Fib(m+1) is strictly larger than Fib(m), and thus we have that Fib(m+2) is strictly larger than Fib(m). Boom!

Okay, let's be frank. Maybe we didn't so much “observe” this as we did “wish” or “conjecture” or “loosely think” that it may hold. Well, it does hold. (Phew!) We can check that by asking the verifier if it can prove it for us:

assert k <= Fib(m) + Fib(m + 1) == Fib(m + 2);

The verifier immediately prove this assertion.⁴ Moreover, by writing down this assertion, we are also showing the verifier the witness m+2, which proves the existential quantifier in the lemma's postcondition.

The point I set out to illustrate with this example is that you can Skolemize a quantifier

exists x :: P

by the assign-such-that statement

var x :| P;

Notice the difference in punctuation.

Lemmas with out-parameters

I just showed you an example that involves existential quantifiers. The example showed that the proof of the EverBigger lemma used existential introduction twice (Fib(k) in the simple case and Fib(m+2) in the difficult case), thus converting the k and the m+2 “in ours hands” into existential quantifications. The example also showed that the invocation of the (recursive call to the) lemma used existential elimination to convert the existential quantification in the lemma's postcondition into an m “in our hands”. As impressive it is that Dafny has such features, it has an even more useful feature that lets you avoid these existential-quantifier conversions in the first place: lemma out-parameters.

In mathematics, lemmas are parameterized by the variables they mention. These are like in-parameters. Rarely or never would a mathematical lemma be thought of as having out-parameters. In Dafny, a lemma is really just a ghost method, and a method can have both in- and out-parameters. This can be quite useful. Instead of a lemma proving the existence of some value, it may as well just return some such value.

Here is the EverBigger lemma from above, but with n declared as an out-parameter:

lemma EverBigger(k: nat) returns (n: nat)
  ensures k <= Fib(n)
{
  if k < 2 {
    n := k;
  } else {
    var m := EverBigger(k-1);
    n := m + 2;
  }
}

Binding guards

Dafny includes one other feature that makes working with quantifiers more streamlined: if statements with binding guards. Such a statement answers the order “if there is one, gimmie one in my hand”.

Suppose we write a proof that splits into two cases according whether or not the value y is a Fibonacci number. We could then write:

if exists n :: y == Fib(n) {
  var n :| y == Fib(n);
  // y is the nth Fibonacci number
} else {
  // y is not a Fibonacci number
}

This expresses what we want, but feels a little clumsy, since we are repeating the condition y == Fib(n). Instead using a binding guard, we can write this if statement as

if n :| y == Fib(n) {
  // y is the nth Fibonacci number
} else {
  // y is not a Fibonacci number
}

The punctuation :| is the same as in the assign-such-that statement, not the :: in the similar position of the existential quantifier.

2. Sets and Maps

Logical quantifiers and the other constructs we've seen introduce some bound variables and in some way restrict the values that those bound variables may range over. This is also the case with set comprehensions and map comprehensions.

Set comprehensions

It is easy to write down a set in Dafny. For example,

{ 2, 3, 5 }

is the set of the three smallest prime numbers. Such an expression, where the elements of the set are listed explicitly, is called a set display. But what if the set you want to define cannot be written as a set display?

A set comprehension defines a set of elements in a schematic way. An example of a set comprehension in common mathematical notation is

which defines the set of the smallest 100 natural numbers. Another example is

which defines the 100 smallest perfect squares. The bound variable in both of these comprehensions is , and the values over which is specified to range is defined by the predicate . Letting range over those values, the first set then contains the elements of the form whereas the second set contains the elements of the form . That is, in the first set, the elements are the legal values of themselves, whereas in the second set, the elements are the squares of each legal value of .

More generally, the mathematical notation takes some shape like . The reader is supposed to understand that is the bound variable. With the understanding that is the bound variable, we can define the set comprehension by describing exactly when the set contains an element :

Or, to use the notation where the existential's range is given separately:

In Dafny, the same set comprehension has the following form:

set x | R :: f(x)

x is the bound variable (or, more generally, a list of bound variables), R is the range predicate for the bound variables, and f(x) is the term expression of the set comprehension. The bound variable is listed explicitly, unlike the common mathematical notation where the reader has to infer what the bound variable is. The two example sets given above are written as follows in Dafny:

set x | 0 <= x < 100 :: x
set x | 0 <= x < 100 :: x*x

Although at first sight it just looks less concise, the notation that lists the bound variable explicitly has some niceties.

One nicety is that the definition in terms of the quantifier, like above, shows the similarity of notation:

y in (set x | R :: f(x))   <==>   exists x | R :: y == f(x)

Another nicety is that one can easily list additional bound variables. Supposing that R is a predicate over both x and z, then here are two examples:

set x,n | Fib(n) <= x < Fib(n) + n :: f(x)
set x,n | Fib(n) <= x < Fib(n) + n :: g(x,n)

The first of these sets contains f(x) for every x that is within n of Fib(n) for some n. An equivalent way to write it in common mathematical notation is:

The second set contains g(x,n) for every x and n such that x is within n of Fib(n). Here, the equivalent mathematical notation is more clumsy and requires using yet another bound variable:

Dafny's general notation of making the bound variable explicit is also used by many authors (e.g., [3, 4]). It is also similar to the list-comprehension notation used in some other languages. For example, the Dafny set

set x,y | 0 <= x <= y <= 100 && x + y == 100 :: (x,y)

which contains the pairs of natural numbers that sum to 100, contains the same elements as the Python list:

[(x,y) for x in range(0, 101) for y in range(x, 101) if x + y == 100]

and the Haskell list:

[(x,y) | x <- [0..100], y <- [x..100], x + y = 100]

Simplified set comprehensions

I just spent many words describing the general set-comprehension notation in Dafny. However, in many set comprehensions in practice, there is just one bound variable and the term expression is just that bound variable. For example, as we have seen, the set of the smallest 100 natural numbers is:

set x | 0 <= x < 100 :: x

For this common case, Dafny lets you omit the term expression and simply write:

set x | 0 <= x < 100

This expression looks like the common mathematical notation . Indeed, for these simplified set comprehensions, it is easy to “understand” what the mathematical notation intends to be the bound variables.

As a note about the verifier in Dafny, automation tends to work better for the simplified set comprehensions where the term expression can be omitted.

Map comprehensions

A map is essentially a set of pairs, where the left-elements of the pairs are unique (that is, each left-element functionally determines the corresponding right-element). Like the display expressions for sets, a map can be defined by a map display. For example,

map[2 := 'c', 137 := 'a']

maps the integer 2 to the character 'c' and maps the integer 137 to the character 'a'. Each pair like 2 := 'c' can be called a maplet. Also, the left-element of the maplet is called a key and the right-element gets the nondescript name value.

Like the comprehensions for sets, a map can be defined by a map comprehension. It has the form:

map x | R :: f(x) := g(x)

For example,

map x | 0 <= x < 100 :: x*x := x

is the map from each of the first 100 perfect squares to their respective square roots.

If you read a map comprehension as a set of maplets with unique keys, then you essentially already understand the notation. Nevertheless, I will offer some notes and point out some features specific to maps.

One thing to note is that the maplets must have unique keys. For example, the verifier will complain if you try to write a map comprehension like

map x | -10 <= x <= 10 :: x*x := x

because it says to map 4 to both 2 and -2, which is not functional.

The general map-comprehension expression is quite flexible. For example, suppose m is a map from numbers to characters, and suppose we want to create a new map n from a subset of the keys in m to some other characters. More precisely, whenever a key in m is in the image of a function f, say a key f(x) for some x, then we want n to map that key to h(x). We then define n as

map x | f(x) in m.Keys :: f(x) := h(x)

Most of the time, however, the map comprehensions we tend to write have the form

map x | R :: x := g(x)

For these common maps, Dafny allows us to omit the "x :=" and write just

map x | R :: g(x)

Almost all map comprehensions in practice can be written in this simplified form. But for when the simplified form is not sufficient (like in the example above with the maplets f(x) := h(x)), the general form is available.

Lambda expressions

Finally, a note about the difference between maps and functions. You may think of a map as a precomputed table, whereas a function is something that computes a value from a given key. For comparison, let's consider writing the map

map x | R :: g(x)

as a function.

Typically, a function is declared with a name. The map above is then written along the lines of

function F(x: X): Y
  requires R
{
  g(x)
}

A function can also be anonymous, in which case it is usually called a lambda expression. The example map is then written

x requires R => g(x)

3. Summary

Here is a listing of the syntactic forms discussed in this note:

forall x :: P
forall x | R :: P
forall x | R ensures P { S; }
exists x :: P
exists x | R :: P
var x :| P;
if x :| P { S; }
set x | R :: f(x)
set x | R
map x | R :: f(x) := h(x)
map x | R :: g(x)
function F(x: X): Y { g(x) }
x requires R => g(x)

Acknowledgments

I'm grateful to Jay Lorch for many helpful comments on this note.

References

[0]Nada Amin, K. Rustan M. Leino, and Tiark Rompf. Computing with an SMT solver. In Martina Seidl and Nikolai Tillmann, editors, Tests and Proofs — 8th International Conference, TAP 2014, volume 8570 of Lecture Notes in Computer Science, pages 20–35. Springer, July 2014. 🔎

[1]François Bobot, Jean-Christophe Filliâtre, Claude Marché, and Andrei Paskevich. Why3: Shepherd your herd of provers. In Boogie 2011: First International Workshop on Intermediate Verification Languages, pages 53–64, Wrocław, Poland, August 2011. https://hal.inria.fr/hal-00790310. 🔎

[2]K. Mani Chandy and Jayadev Misra. Parallel Program Design: A Foundation. Addison-Wesley, 1988. 🔎

[3]Edsger W. Dijkstra. A Discipline of Programming. Prentice Hall, Englewood Cliffs, NJ, 1976. 🔎

[4]David Gries and Fred B. Schneider. A Logical Approach to Discrete Math. Texts and Monographs in Computer Science. Springer-Verlag, 1994. 🔎

[5]Gary T. Leavens, Albert L. Baker, and Clyde Ruby. JML: A notation for detailed design. In Haim Kilov, Bernhard Rumpe, and Ian Simmonds, editors, Behavioral Specifications of Businesses and Systems, pages 175–188. Kluwer Academic Publishers, 1999. 🔎

[6]K. Rustan M. Leino and Clément Pit-Claudel. Trigger selection strategies to stabilize program verifiers. In Swarat Chaudhuri and Azadeh Farzan, editors, Computer Aided Verification - 28th International Conference, CAV 2016, Proceedings, Part I, volume 9779 of Lecture Notes in Computer Science, pages 361–381. Springer, 2016. 🔎

^0.The Emacs IDE for Dafny typesets certain Dafny constructs in the notation you're more likely to see in a paper. By default, it shows forall x :: P as and shows exists x :: P as . ↩

^1.Internally, the Dafny verifier works more effectively with certain quantifiers. The verifier tries to detect when an alternative form of a given quantifier may perform better, and will in those cases rewrite the quantifier automatically [6]. For example, it may choose to un-nest some quantifiers. The goal of such rewrites is to support natural-looking programs while getting good prover performance. ↩

^2.As it turns out, Dafny's automatic induction will prove both FibProperty and FibPropertyAll automatically. If these were the only lemmas we cared about, there would be nothing else to say or do. Nevertheless, I'm using this example to show the forall statement. If you want to make sure what I'm about to say gives a proof, you can turn off automatic induction for FibPropertyAll by marking it with the attribute {:induction false}. ↩

^3.Dafny can also prove an assertion like assert Fib(12) == 144;. Internally, Dafny uses a “dual-rail encoding” of functions that lets it obtain the value of Fib(12) (since 12 is a literal constant) and Fib(k) (where k is a variable). If you're interested in how this is done, I refer you to [0]. ↩

^4.Here is a proof of the assertion k <= Fib(m) + Fib(m+1). The “difficult case” in our proof applies when k is at least 2, so k-1 is at least 1, so we know about Fib(m) that it is at least 1. From this, it follows that m cannot be 0, for Fib(0) == 0. This is important, because it means that m+1 is at least 2 and therefore the inductive case of the definition of Fib applies. In other words, we have Fib(m+1) == Fib(m) + Fib(m-1). We already concluded that Fib(m) is at least 1. On behalf of Fib returning a nat, we have that Fib(m-1) is at least 0. So, Fib(m+1) is at least 1. In other words, Fib(m) + Fib(m+1) is at least 1 more than Fib(m), which in turn is at least k-1. Thus, Fib(m) + Fib(m+1) is at least k.

Had we split the “simple case” and “difficult case” up so that the “simple case” only covered k == 0, then we could not have concluded m != 0 in the argument above. Some users of other interactive proof assistants may be bothered by this, because they would say k has type nat and therefore the induction on k should use k == 0 as the base case. Mathematics imposes no such restriction on induction, and indeed, as this proof shows, we profit from splitting the cases of EverBigger into k < 2 and 2 <= k. ↩