Tracked regions

Tracked regions

K. Rustan M. Leino

10 October 2023, IFIP WG 2.3, Trento, Italy

A simple class

class Cell {
  var data: int

  method Increment()
    modifies this
  {
    data := data + 1;
  }
}

A composite class

class Machine {
  var motor: Motor
  var screen: Monitor

  method Operate()
    modifies this, motor, screen
  {
    motor.Turn();
    screen.Update();
  }
}

Abstraction

class Machine {
  ghost var Repr: set<object>
  ...
  var motor: Motor
  var screen: Monitor

  method Operate()
    ...
    modifies Repr
  {
    motor.Turn();
    screen.Update();
  }
}

A datatype

datatype Machine =
  | Machine(motor: Motor, screen: Monitor)
{
  method OperateImmutable() returns (this': Machine)
  {
    var motor' := motor.TurnImmutable();
    var screen' := screen.UpdateImmutable();
    this' := Machine(motor', screen');
  }
}

End goal: confined mutations

datatype Machine =
  | Machine(r: region, motor: Motor, screen: Monitor)
{
  method OperateImmutable() returns (this': Machine)
  {
    var Machine(r, motor, screen) := this;
    motor.Turn();
    screen.Update();
    this' := Machine(r, motor, screen);
  }
}

The immutable datatype uses a region that it modifies, and the client never needs to know!

Regions

  • A programming device for pooling allocations, allowing efficient simultaneous deallocation
  • A specification device for separating and abstracting over groups of objects

Global heap

class Cell {
  var data: int
}

method Inc(c: Cell)
  modifies c
{
  c.data := c.data + 1;
}

Heap as parameter

method Inc($\mathcal{H}$: Heap, c: Cell) returns ($\mathcal{H}$: Heap)
  modifies c
{
  c.data := c.data + 1;
}

Heap as in-out parameter

method Inc`$\mathcal{H}$(c: Cell)
  modifies c
{
  c.data := c.data + 1;
}

Multiple regions

method Inc`$\mathcal{H}$`$\mathcal{K}$`$\mathcal{L}$(c: Cell`$\mathcal{H}$, d: Cell`$\mathcal{K}$, e: Cell`$\mathcal{L}$)
  modifies c`$\mathcal{H}$
{
  c`$\mathcal{H}$.data := c`$\mathcal{H}$.data + d`$\mathcal{K}$.data + e`$\mathcal{L}$.data;
}

Inferring region “mode” of types

method Inc`$\mathcal{H}$`$\mathcal{K}$`$\mathcal{L}$(c: Cell`$\mathcal{H}$, d: Cell`$\mathcal{K}$, e: Cell`$\mathcal{L}$)
  modifies c
{
  c.data := c.data + d.data + e.data;
}

If regions are distinct and are parts of types, then

  • regions can be inferred in operations
  • regions can be modeled as separate variables

Local regions

method Inc2`$\mathcal{H}$`$\mathcal{L}$(c: Cell`$\mathcal{H}$, e: Cell`$\mathcal{L}$)
  modifies c
{
  var $\mathcal{K}$: region;
  var d := new Cell`$\mathcal{K}$;
  d.data := 1;
  Inc`$\mathcal{H}$`$\mathcal{K}$`$\mathcal{L}$(c, d, e);
}

The region names in this method body can probably be inferred.

Lexically scoped regions

{
  var outer: Cell;
  {
    var $\mathcal{K}$: region;
    var inner := new Cell`$\mathcal{K}$;
    // the following assignment is ill-typed,
    // because the type of `outer` cannot mention $\mathcal{K}$
    outer := inner;
  }
}

The lexical regions are similar to what the programming language Euclid provided

Dynamically scoped regions

  • To place regions into data structures, we need to go beyond lexical regions
  • The useful property that lexical scoping gives us is that
    • regions are distinct and never duplicated
  • To get that useful property beyond lexical scoping, we need to track provenance

Provenance

provenance

noun

the place of origin or earliest known history of something

  • the beginning of something's existence; something's origin
  • a record of ownership of a work of art or an antique, used as a guide to authenticity of quality

Linear, affine, unique, monadic, tracked

There are many names:

  • Uniqueness types track what has been
  • Affine types restrict what is to come
  • Linear types restrict what is to come and ensures something happens
  • Monadic types give a way to thread unique in-out parameters

I will use the term tracked types

A little language

Method ::= method M(x*) returns (x*) BlockStmt
BlockStmt ::= { Stmt* }
Stmt ::=
  | var x
  | x := Expr             // simple assignment
  | x* := M(Expr*)        // method call
  | var Ctor(x*) := Expr  // datatype destructor
  | if Expr BlockStmt else BlockStmt
  | BlockStmt
Expr ::=
  | x                     // variable
  | F(Expr*)              // constants, operators, functions
  | Ctor(Expr*)           // datatype constructor

A language like this is typical when defining linear types.

Tracking rules

The judgment

U;V $\vdash$ S ${}\rightsquigarrow{}$ U';V'

says that if

  • unrestrcted variables U and
  • tracked variables V

are available when statement S is started, then

  • unrestrcted variables U' and
  • tracked variables V'

are available after statement S.

U;V $\vdash$ E : m

says that, if U;V are available, then expression E evaluates to a value of mode m (un/tr).

Variable introduction

If x requires manual initialization:

U;V $\vdash$ var x ${}\rightsquigarrow{}$ U;V

If x is auto-initialized:

x : un
----------------------
U;V $\vdash$ var x ${}\rightsquigarrow{}$ U,x;V
x : tr
----------------------
U;V $\vdash$ var x ${}\rightsquigarrow{}$ U;V,x

If type of x requires manual initialization:

U;V $\vdash$ var x ${}\rightsquigarrow{}$ U;V

If x is auto-initialized:

U;V $\vdash$ var x ${}\rightsquigarrow{}$ U;V +x

Simple assignment

x : m'
U;V${}_0$ $\vdash$ E : m
m' $\leq$ m
-------------------------------
U;V${}_0$,V${}_1$ $\vdash$ x := E ${}\rightsquigarrow{}$ U;V${}_1$ +x

If

U;V${}_0$ $\vdash$ E : _
U;V${}_1$ $\vdash$ S ${}\rightsquigarrow{}$ U${}_S$;V${}_S$
U;V${}_1$ $\vdash$ T ${}\rightsquigarrow{}$ U${}_T$;V${}_T$
U' = U${}_S$ $\cap$ U${}_T$
V' = V${}_S$ $\cap$ V${}_T$
---------------------------------
U;V${}_0$,V${}_1$ $\vdash$ if E S else T ${}\rightsquigarrow{}$ U';V'

Block statement

$\forall_{i}$  U${}_i$;V${}_i$ $\vdash$ { S${}_i$ } ${}\rightsquigarrow{}$ U${}_{i+1}$;V${}_{i+1}$
------------------------------------
U${}_0$;V${}_0$ $\vdash$ { S${}_0$ S${}_1$ ... S${}_{n-1}$ } ${}\rightsquigarrow{}$ U${}_n$;V${}_n$

Method call

M : (m${}_i$) returns (m${}_j$)
U;V${}_i$ $\vdash$ E${}_i$ : m${}_i$
x${}_j$ : m'${}_j$
m'${}_j$ $\leq$ m${}_j$
------------------------------------
U;V,V${}_i$ $\vdash$ x${}_j$ := M(E${}_i$) ${}\rightsquigarrow{}$ U;V +x${}_i$

Datatype destruction

datatype m, Ctor : m${}_i$
U;V${}_0$ $\vdash$ E : m
x${}_i$ : m'${}_i$
m'${}_i$ $\leq$ m${}_i$
-------------------------------------------
U;V${}_0$,V${}_1$ $\vdash$ var Ctor(x${}_i$) := E ${}\rightsquigarrow{}$ U;V${}_1$ +x${}_i$

Note that dereliction is built into the rules, but that's okay, because of the ... $\vdash$ E : m requirement above and the x : m' in the rule for simple assignment

Rules for expressions

x : m
x $\in$ U $\cup$ V
------------
U;V $\vdash$ x : m
F : m${}_i$ $\to$ m
U;V${}_i$ $\vdash$ E${}_i$ : m${}_i$
----------------
U;V${}_i$ $\vdash$ F(E${}_i$) : m
datatype m, Ctor : m${}_i$
U;V${}_i$ $\vdash$ E${}_i$ : m${}_i$
---------------------
U;V${}_i$ $\vdash$ Ctor(E${}_i$) : m

Adding classes to the little language

Stmt ::= ...
  | x := new Type`$\mathcal{H}$      // object allocation
  | Expr`$\mathcal{H}$.f := Expr     // object field update
Expr ::= ...
  | Expr`$\mathcal{H}$.f             // object field selection

Allocation

$\mathcal{H}$ $\in$ V
---------------------------------
U;V $\vdash$ x := new Type`$\mathcal{H}$ ${}\rightsquigarrow{}$ U;V +x

Object field read/write

class m, f : m'
U;$\mathcal{H}$,V${}_0$ $\vdash$ E${}_0$ : m
U;$\mathcal{H}$,V${}_1$ $\vdash$ E${}_1$ : m'
--------------------------------------
U;$\mathcal{H}$,V${}_0$,V${}_1$,V $\vdash$ E${}_0$`$\mathcal{H}$.f := E${}_1$ ${}\rightsquigarrow{}$ U;$\mathcal{H}$,V
class m, f : m'
U;V $\vdash$ E : m
$\mathcal{H}$ $\in$ V
-----------------
U;V $\vdash$ E`$\mathcal{H}$.f : m'

Are we done?

(At least) three things are missing/wrong:

  • The strictness of tracked types is unnatural for objects
  • Reading from an object consumes the object
  • We need to be able to use tracked values in a function without consuming them

Tracked objects

tracked class QueryEngine {
  var $\mathcal{H}$: region
  var cache: Cache`$\mathcal{H}$
  ...

  method Operate()
  {
    var $\mathcal{H}$, cache := this; // "unpack this"
    cache`$\mathcal{H}$.Lookup();
    ...
    this := new QueryEngine{ $\mathcal{H}$, cache }; // "pack this"
  }
}

This seems unnatural for objects.

But maybe any “tracked object” should really be a tracked datatype value?

Reading from a region

tracked datatype Counter =
  | Counter($\mathcal{H}$: region, cell: Cell`$\mathcal{H}$)
{
  method Inc() returns (this': Counter)
  {
    var Counter($\mathcal{H}$, cell) := this;
    var x := cell`$\mathcal{H}$.data + 1; // RHS consumes cell
    cell`$\mathcal{H}$.data := x;         // LHS consumes cell
    this' := Counter($\mathcal{H}$, cell);
  }
}

Using tracked values in a function

tracked datatype Machine =
  | Machine($\mathcal{H}$: region, motor: Motor`$\mathcal{H}$, screen: Monitor`$\mathcal{H}$)
{
  predicate Valid() {
    var Machine($\mathcal{H}$, motor, screen) := this;
    screen`$\mathcal{H}$.vsize == screen`$\mathcal{H}$.hsize
  }
}

This is a complication in Chalice / Viper

Shared/borrowed values

  • Change U;V environment to U;B;V where B has the shared values
  • Can parameters be declared as “shared”?
  • Can locals be declared as “shared”?
  • What is the $\leq$ ordering on un,sh,tr?
  • Does sh need to be contained, like regions do? (Lifetimes) Can this mechanism build on the Type`$\mathcal{H}$ mechanism?

Other questions

  • Is a method always implicitly allowed to extend a given region?

      method UndeclaredAllocations`$\mathcal{H}$() {
        var cell := new Cell`$\mathcal{H}$;
        cell.data := 12;
      }
  • Can you do local storage with this region mechanism? (Then, is there a distinction between regions that can or cannot be extended?)

      method LocalStorage() {
        var $\mathcal{L}\normalfont{\textsc{ocal}}$: region;
        var cell := new Cell`$\mathcal{L}\normalfont{\textsc{ocal}}$;
        cell`$\mathcal{L}\normalfont{\textsc{ocal}}$.DoSomething();
        AnotherMethod`$\mathcal{L}\normalfont{\textsc{ocal}}$();
      }
  • Can these regions be used as a programming device? (Then, do all regions need to be present at run time?)

      method EfficientDeallocation() {
        var $\mathcal{H}$: region;
        ...
        SomeMethod`$\mathcal{H}$(...);
        ...
        // $\mathcal{H}$ is automatically freed here
      }

Comparisons

  • In separation logic, regions are not explicit in the program text
  • In Linear Dafny, regions are handled by the verifier, not the type system