<small>Dafny Power User:</small><br>Statement versus Expression Syntax

Dafny Power User:
Statement versus Expression Syntax

K. Rustan M. Leino

Manuscript KRML 266, 17 May 2019

Abstract. This note describes some constructs that exist in both a statement form and an expression form in Dafny. It points out the syntactic similarities and differences between the two forms, and discusses related constructs.

Dafny distinguishes between statements and expressions. For example, the body of a method is a statement whereas the body of a function is an expression. A statement can assign to local variables, allocate and mutate objects in the heap, use loops in control flow, and be nondeterministic. An expression cannot do any of those things. Instead, it can bind values to bound variables, read values in the heap, use recursion, and is always deterministic. Syntactically, most statements are terminated with a semi-colon and sometimes a sequence of statements is surrounded by curly braces. A few expressions contain semi-colons, but never at the end, and an expression can be surrounded by round parentheses.

Let's take a look at some constructs that exist in both a statement form and an expression form.

0. If Statements and If-Then-Else Expressions

Of constructs that exist in both statement and expression form, the conditional (if) is probably the most common, and it is also the one with the most conspicuous difference between the two forms.

The conditional statement is the if statement, which has a “then” branch and an optional “else” branch. The “then” branch is always a block statement, which is necessarily surrounded by curly braces. The “else” branch, if present, is either a block statement or another if statement.

For example, the following statement declares local variables min and max that are assigned according to the relative sizes of x and y:

var min, max;
if x < y {
  min, max := x, y;
} else {
  min, max := y, x;
}

Notice that min and max are declared before the if statement (Dafny will infer the types of these two variables), which means they are still in scope after the whole if statement. If these variables were instead declared inside each branch, they would go out of scope at the curly brace that ends each respective block statement.

The conditional expression is the if-then-else expression. Unlike the if statement, the expression form uses the keyword then to separate the guard from the “then” branch, and always requires and explicit “else” branch. For example, an expression that returns the smaller of x and y is:

if x < y then x else y

Such expressions can be used as right-hand sides of assignments to local variables:

var min, max := if x < y then x else y, if x < y then y else x;

For this example, we can also use a single if-then-else expression where each branch results in a pair. We can assign that pair to a variable, and then assign the pair's components to min and max:

var minmax := if x < y then (x,y) else (y,x);
var min, max := minmax.0, minmax.1;

The following shows an example of nested and cascaded if statements:

var middle;
if x < y {
  if y < z {
    middle := y;
  } else if z < x {
    middle := x;
  } else {
    middle := z;
  }
} else {
  if x < z {
    middle := x;
  } else if z < y {
    middle := y;
  } else {
    middle := z;
  }
}

And here is the analogous program using conditional expressions:

var middle :=
  if x < y then
    if y < z then y else if z < x then x else z
  else
    if x < z then x else if z < y then y else z;

For statements, there are some additional forms of conditionals. Because these forms allow nondeterminism, there statements do exist in an analogous expression form. The next few subsections describe these additional conditional statements.

0.0. If-case Statements

One additional form of conditional statements is the if-case statement. Instead of two mutually exclusive branches, the if-case statement provides any number of branches (cases, alternatives), and their conditions need not be mutually exclusive. Syntactically, the if-case statement follows the keyword if with ones or more branches of the form

case Guard => Body

where Guard is an expression and Body is a list of statements.

For example, the computation of min/max can be written as follows:

var min, max;
if
case x <= y =>
  min, max := x, y;
case y <= x =>
  min, max := y, x;

Note that <= is the “at most” (or, some say “less than or equal to”) operator, whereas => is the piece of syntax that separates the guard and body of each case.

The if-case statement picks any one of the cases whose guard evaluates to true. The ordering of the cases is not relevant. In the example above, if x and y are equal, then the choice of which case to execute is nondeterminstic. For this example, it so happens that both cases have the same effect if x and y are equal, but that need not be so. Not having to specify which branch to take when the choice is irrelevant frees the programmer's mind and, arguably, results in a cleaner, more symmetric, and more abstract program.

Each case gives rise to a new block scope, despite the fact that the body is a list of statements, not a block statement. In other words, any local variables declared in one case go out of scope where the case ends.

The body of a case continues until the next case begins or until the enclosing block ends. If you want to end the last case sooner, you could surround the entire if-case statement with a pair of curly braces, but that doesn't look so nice. Therefore, Dafny optionally allows the set of cases to be enclosed in a pair of curly braces, like this:

var min, max;
if {
  case x <= y =>
    min, max := x, y;
  case y <= x =>
    min, max := y, x;
}

An if-case statement can have any number of branches. For example:

var middle;
if {
  case x <= y <= z =>  middle := y;
  case z <= y <= x =>  middle := y;
  case y <= x <= z =>  middle := x;
  case z <= x <= y =>  middle := x;
  case y <= z <= x =>  middle := z;
  case x <= z <= y =>  middle := z;
}

As I mentioned, if more than one guard evaluates to true, Dafny may choose to execute any one of the the corresponding bodies (but only one!). What if none of the guards evaluates to true? That is not allowed, and the verifier will report an error if it cannot prove that at least one case applies.

0.1. Indiscriminate Choice

When Dafny is used as a modeling language, one may want to write down several statements that can be chosen between indiscriminately. This can be done by writing * in place of the guard expression in the common if statement.

For example, the effect of the statement

if * {
  S;
} else if * {
  T;
} else {
  U;
}

is to execute one of the statements S, T, or U, but you can never predict which one of those statements will be chosen.

Such an indiscriminate choice can also be written using an if-case statement:

if
case true =>  S;
case true =>  T;
case true =>  U;

0.2. Binding Guards

Finally, both the common if statement and the if-case statement allow guards to introduce a variable with some constraints. These are called binding guards.

For example, suppose you want to perform an operation to an arbitrary even number in a given set, unless the set contains no even number, in which case you want to do something else. One way to accomplish this is to write an if statement whose then branch uses an assign-such-that statement to pick such an element:

if exists x :: x in S && x % 2 == 0 {
  var x :| x in S && x % 2 == 0;
  ProcessEven(x);
} else {
  ProcessEmpty();
}

But there is a lot of duplication between the existential quantifier in the guard and the assign-such-that statement in the then branch. With a binding guard, the same program can be written as follows:

if x :| x in S && x % 2 == 0 {
  ProcessEven(x);
} else {
  ProcessEmpty();
}

This statement introduces x and picks as its value any value satisfying the condition following the :|. If no such x exists, the else branch is executed. The bound variable x is in scope only in the then branch.

Similarly, suppose you want to pick an even number from a set and apply some operation to it, or pick an odd number from the set and apply some other operation to it, or do something else if the set contains neither an even nor an odd number. Then you can use an if-case statement with binding guards:

if
case x :| x in S && x % 2 == 0 =>
  ProcessEven(x);
case x :| x in S && x % 2 == 1 =>
  ProcessOdd(x);
case S == {} =>
  ProcessEmpty();

Note that if S contains both even and odd numbers, then Dafny may pick either of the first two cases.

1. Match Statements and Match Expressions

Another branching construct is match, which also comes in a statement form and in an expression form. The syntax of these two are identical, except that each body of the former is a list of statements where each body of the latter is an expression. These have the forms

match s
case A(x) =>
  StmtList0;
case B(y) =>
  StmtList1;
case C(z) =>
  StmtList2;

and

match s
case A(x) => Expr0
case B(y) => Expr1
case C(z) => Expr2

respectively.

What follows the keyword case is a pattern, which looks like an expression built of datatype constructors and bound variables. For example, consider the standard datatype definition for a list, parameterized by an element type T:

datatype List<T> = Nil | Cons(T, List<T>)

The following expression returns the minimum of 2 and the length of a list xs:

match xs
case Nil => 0
case Cons(x, Nil) => 1
case Cons(x, Cons(y, ys)) => 2

Unlike in several other programming languages, the cases of a match in Dafny are unordered. A bound variable that is not used can be replaced by an underscore, _. So, the match expression above can be written equivalently as

match xs
case Cons(_, Nil) => 1
case Cons(_, Cons(_, _)) => 2
case Nil => 0

Analogously to the if-case statement, each case of a match statement is implicitly a lexical scope. In other words, any local variables declared in one case goes out of scope at the end of that branch.

Also as for the if-case statement, each case of a match goes as far as possible, but not past another case. To end a case early, the set of cases can be enclosed in a pair of curly braces. For example:

match xs
case Cons(_, ys) =>
  match ys {
    case Cons(_, _) => 2
    case Nil => 1
  }
case Nil => 0

Note that curly braces, not parentheses, are used in this way for both match statements and match expressions. One can, of course, also surround an entire match expression with parentheses, so the expression above can be written equivalently as:

match xs
case Cons(_, ys) =>
  (match ys
   case Cons(_, _) => 2
   case Nil => 1)
case Nil => 0

In some situations, it may also be possible to simply reorder the cases for the same effect:

match xs
case Nil => 0
case Cons(_, ys) =>
  match ys
  case Nil => 1
  case Cons(_, _) => 2

2. Local Variables and Let Bindings

The statement that introduces a local variable has the form:

var x: int := 17;

The type decoration ": int“ is needed only if Dafny cannot infer the type, so it is typically omitted. The assignment of an initial value, here ”:= 17", can also be omitted, if the initial value of the variable is irrelevant. (If the variable is used before the program explicitly assigns to it, Dafny will initialize the variable to an arbitrary value of its type.) The variable introduced stays in scope until the current lexical scope ends (typically at the next close-curly-brace).

A var statement can introduce several variables. If so, any type decoration acts only on the variable it follows. That is, you can think of : as having higher binding power than the , that separates the variable names. If an initial assignment is used, it must give a value to all variables being declared. For example,

var a, b: bool, c := 13, true, 8.29;

declares an integer variable a, a boolean variable b, and a real-valued variable c.

The following example introduces three local variables, z, x, and y, and arranges in a complicated way to set z to 0 (where k is some integer variable in the enclosing scope):

var z;
var x, y := k, k;
z := x - y;

There is a form of the var statement in expressions: the let expression. The variables introduced by let expressions are usually called bound variables or let-bound variables. The syntax of the let expression is like the var statement, but followed by an expression. For example, the expression

var w := 5; w + w

returns 10. If you're used to functional programming, you may choose to read this statement as “let” (var) w “be” (:=) 5 “in” (;) w + w.

The syntax intentionally mimics that of the var statement, to deemphasize the distinction between statements and expressions. For example, the statement that assigns to z above can be written with a let expression:

var z :=
  var x, y := k, k;
  x - y;

except that, here, the variables x and y are only in scope in the body of the let expression (which ends at the semi-colon); in the statement above, x and y are introduced as local variables, which remain in scope until the end of the lexical block. Note that the first semi-colon is a separator in the let expression, whereas the second semi-colon terminates the var statement.

Local variables can be modified by assignment statements. In contrast, let-bound variables are immutable. In this light, we think of a let expression as binding a value to each of its let-bound variables. Whereas the right-hand side of the := in a var statement gives initial values to the local variable, the right-hand side of the := in a let expression provides defining values for the let-bound variables. A syntactic difference between the var statement and let expression is that a let expression must always include the := and the accompanying defining expressions.

The relation between var statements and let expressions is analogous to the relation between a case in a match statement and a case in a match expression. Recall, for the match constructs, each case takes the shape

case Pattern => StmtList;

case Pattern => Expr

respectively. Analogously, var statements and let expressions take the respective shapes

var x := Expr; StmtList;

var x := Expr; Expr

2.0. Ghost Variables

Many constructs in Dafny come in two manifestations: compiled or ghost. A ghost construct is used in specifications to reason about the correctness of the program, but is erased by the compiler and has no run-time cost. Local variables and let expressions come in both manifestations. The ones we saw above were all compiled manifestations. To declare local variables or let-bound variables as ghost, simply precede the var keyword with ghost.

For example,

var m := 20;
ghost var n := m + 3;

introduces a compiled variable m and a ghost variable n.

Dafny makes sure that ghost variables (and other ghost constructs) really can be erased by the compiler without changing the meaning of the program. One of the checks it performs is that the values of compiled variables never depend on ghost variables. For example, the following declaration of n as a compiled variable is not legal, because the use the value of a ghost variable in an assignment to a compiled variable:

ghost var m := 20;
var n := m + 3;  // error: illegal to assign a ghost to a compiled variable

When a var statement or let expression omits the ghost keyword, all variables the statement introduces are declared as compiled. And when the ghost keyword is present, all variables the statement introduces are declared as ghost. There is one exception to this rule. The exception applies when a var statement that omits the ghost keyword gives a method call as the right-hand side. Then, any local variable that corresponds to a ghost out-parameter of the method called is implicitly declared as ghost, and if the method itself is ghost, then all of the local variables are implicitly declared as ghost.

For example, consider a method with two out-parameters, one compiled and one ghost:

method M() returns (x: int, ghost y: int)

The statement

var a, b := M();

declares a to be a compiled local variable and b to be a ghost local variable. If the statement explicitly uses the ghost keyword, as in

ghost var a, b := M();

then both a and b are declared to be ghost local variables.

2.1. Patterns

In a var statement and let expression, the left-hand side of := need not be a list of variables, but can be a list of patterns, just like the patterns in the cases of a match construct. This has the effect of deconstructing the right-hand side of := (that is, the initializing or defining expressions) and then assigning the new variables.

For example, using the datatype definition List from above, the var statement

var Cons(a, Cons(b, Nil)) := E;

introduces two variables, a and b, and sets these to the first two elements of the list E. The value denoted by the right-hand side must match the pattern in the left-hand side, and this is enforced by the verifier. In the example, E must denote a list whose length is exactly 2.

Just like in match patterns, any variable that is not needed can be replaced by an underscore. For example,

var Cons(_, Cons(b, _)) := E;

introduces b and initializes it to the second element of the list E, where E is allowed to have more than two elements.

To discard a result from the right-hand side, use an underscore as the corresponding actual out-parameter. For example, if P is a method with one out-parameter,

var _ := P();

calls P() and then discards its result. This is also allowed if P is a function, but then there's not much point in making the function call at all.

2.2. Assign-Such-That Statements and Let-Such-That Expressions

The var statements and let expressions we've seen so far are precise about what value to assign or bind to the new variables. There's is a cousin to these constructs that instead prescribes a set of values, any one of which may be picked at run-time as the value assigned. It is the assign-such-that statement (in statement form) and the let-such-that expression (in expression form).

For example, the assign-such-that statement

var x :| x in S;

introduces local variable x and says to set it to any value that satisfies the predicate x in S. In other words, this statement sets x to some value in S. (The verifier will complain for this example statement if it cannot prove S to be nonempty.)

Syntactically, the difference between an ordinary assignment and an assign-such-that construct is that the former uses := whereas the latter uses :|. Also, in the ordinary assignment, the variables introduced are not in scope in the right-hand side, where the variables introduced are in scope in the right-hand side of the assign-such-that constructs. Like the var statement and let expression, the such-that forms can introduce multiple variables, but the left-hand side is always a list of variables, never patterns.

Here is an example var statement with an assign-such-that statement that introduces three new variables, b, s, and t. It requires that either S is nonempty or N is strictly positive.

var b, s, t :| (b ==> s in S) && (!b ==> 0 <= t < N);

The assign-such-that constructs give rise to a proof obligation that there exists some value for each of the new variables that satisfies the right-hand side. For the example above, the proof obligation thus amounts to that S is nonempty. A Dafny compiler may impose additional restrictions on the right-hand side, because it may not always be clever enough to generate executable code that is guaranteed to, in finite time, find values for the variables.

The assign-such-that statement and let-such-that expression are the same, but there is an additional restriction on compiled let-such-that expressions. The restriction stems from the fact that expressions in Dafny are always deterministic, that is, if you evaluate an expression twice in the same state, you will get the same value. This is important in order to let the verifier reason about “obvious” properties like

E == E

Statements, on the other hand, can be nondeterministic, so assign-such-that statements may return different values each time they are executed:

var x :| x in S;
var y :| y in S;
assert x == y;  // this condition is NOT guaranteed, unless |S| == 1

Dafny's guarantee that let-such-that expressions be deterministic (within each run of the program) is unproblematic for ghost expressions, but the same guarantee for compiled let-such-that expressions would incur too high of a run-time cost (see [0]). Instead, Dafny restricts compiled such-that-expressions to those that uniquely specify the values of its variables. For example, if S is a nonempty set of integers, the compiled expression

var x :| x in S && forall y :: y in S ==> x <= y;
x

is legal and evaluates to the smallest element in S. Unfortunately, this restriction on compiled let-such-that expressions is sometimes more harsh than one would like.

2.3. Upates

Let-bound variables get their value at the time of declaration, but local variables can be subsequently updated with assignment statements or assign-such-that statements. These are like the assignments that provide the initial value of the local variables, except that patterns in the left-hand side are allowed only when the local variables are introduced, not in subsequent assignments.⁰

3. Statement Expressions

Generally, statements contain expressions, but not the other way around. With one exception: the statement expression. It has the form S; E, where S is one of several ghost statements and E is an expression. The expression S; E evaluates to E. That is, statement S does not affect the value of the expression. The role of S is to provide information that will help prove the well-formedness of E.

For example, suppose there is a function F and a lemma AboutF that tells us something about the value returned by F:

function method F(x: int): int
lemma AboutF(x: int)
  requires 0 <= x
  ensures 0 < F(x)

In the example, the lemma AboutF, which can be used if x is non-negative, says that F(x) returns a positive value. Now, consider the following statement:

if 0 <= y && 10 / F(y) == 2 {  // error: possible division by zero
  // ...
}

Without further information, Dafny is unable to prove the absence of division-by-zero in the guard of this if statement. To help the verifier along, we can use the lemma. To do that, we might try calling the lemma just before the if statement:

AboutF(y);  // error: precondition violation
if 0 <= y && 10 / F(y) == 2 {
  // ...
}

But that doesn't work, because in that context, it could be that y is negative. By using a statement expression, we can insert the lemma call just where we need it:

if 0 <= y && 10 / (AboutF(y); F(y)) == 2 {
  // ...
}

A statement expression accepts five kinds of statements: lemma calls and assert, assume, reveal, and calc statements.

3.0. Lemmas with out-parameters

There's a restriction on the lemma calls in statement expressions: the lemma must not have any out-parameters.¹ This may be frustrating when it occasionally pops up.

As an example, consider the following declarations:

// Some property that integers may have
predicate P(k: int)

// A lemma that, for every x, shows---by returning a witness---the
// existence of a k satisfying P(k) 
lemma MyLemma(x: int) returns (k: int)
  ensures P(k)

// Two functions that need the P property
function method G(x: int, ghost k: int): int
  requires P(k)
function method H(x: int): int
  requires exists k :: P(k)

It would be nice to be able to write an expression

var k := MyLemma(x); G(x, k)

or to write an expression

var _ := MyLemma(x); H(x)

when the out-parameter does not need to be named. But the restriction forbids these.

There are workarounds. They are rather clumsy, but they show how various expressions discussed in this note can be combined to produce the desired result.

For the latter case, a workaround is the expression

assert exists k :: P(k) by {
  var _ := MyLemma(x);
}
H(x)

The former is similar, but also uses a let-such-that expression to obtain a name for the value whose existence is proved by the first assert:

assert exists k :: P(k) by {
  var _ := MyLemma(x);
}
ghost var k :| P(k);
G(x, k)

Note that this let-such-that expression is ghost, so the restriction that k must be determined uniquely does not apply.

4. Putting it all together

Dafny makes a distinction between statements and expressions, but the syntactic similarities I've discussed in this note make it less of a burden to remember what to write where. As a final example, consider the following (rather contrived) method and function for computing the same thing:

method DoItByMethod(xs: List<int>) returns (r: nat)
{
  match xs
  case Nil =>
    return 0;
  case Cons(x, _) =>
    var square := x * x;
    AboutF(square);
    return F(square);
}

function method DoItByFunction(xs: List<int>): nat
{
  match xs
  case Nil =>
    0
  case Cons(x, _) =>
    var square := x * x;
    AboutF(square);
    F(square)
}

To be technical about it, the method uses a match statement, a var statement with a local variable, a lemma call, and the final function call that computes the result, whereas the function uses a match expression, a let expression with a let-bound variable, a statement expression featuring a lemma call, and the final function call.

But when you're in the midst of it, there's no need to dwell on those fine distinctions!

Acknowledgments

I thank Matthias Schlaipfer for useful feedback on this note.

References

[0]K. Rustan M. Leino. Compiling Hilbert's operator. In Ansgar Fehnker, Annabelle McIver, Geoff Sutcliffe, and Andrei Voronkov, editors, LPAR-20. 20th International Conferences on Logic for Programming, Artificial Intelligence and Reasoning — Short Presentations, volume 35 of EPiC Series in Computer Science, pages 106–118. EasyChair, December 2015. 🔎

^0.This restriction may be removed in some future version of the language. ↩

^1.This restriction may be removed in some future version of the language. ↩