Loops and Iterators


Until now we have only made use of the simple loop constructs, while! and until!. These are similar to the loop constructs found in other languages, aside from the terminal '!'. However, that terminal '!' hides something very special about these loop constructs in Sather; a programmer can define such looping constructs as easily as he or she could define a standard routine. Once defined, they may be used as conveniently as the while! and until! iterators.

To a first approximation, iterators are like streams that can "yield" different values on successive loop iterations. When an iterator has no more values to yield, it "quit"s. This, in turn, terminates the enclosing loop.

Iterators are defined as class features, just like routines, but iterator names must terminate with an '!'. When an iterator is called, it executes the statements in its body in order. If it executes a yield statement, control is returned to the caller. In this, the iterator is similar to a coroutine whose state remains persistent over multiple calls. Subsequent calls on the iterator resume execution with the statement following the yield statement. If an iterator executes quit or reaches the end of its body, control passes immediately to the end of the innermost enclosing loop statement in the caller and no value is returned.

3.1 Using iterators

3.1.1 loop statements

Iteration is done with loop statements, used in conjunction with iterator calls. In the absence of iterator calls, a loop statement simply executes an infinite loop. The difference between an iterator call and a routine call is that the iterator call "remembers" its state after it yields a value and, on subsequent calls, it simply resumes execution. The "lifetime" of an iterator usually includes several calls within a particular loop. Hence, an execution state is maintained for each iterator call textually enclosed within a loop - this execution state will be used to "remember" the state of the iterator between invocations. When a loop is entered, the execution state of all enclosed iterator calls is initialized. When an iterator is encountered, control is transferred to the iterator until the iterator "yields" control. Just as a routine may provide a value when it returns, so too an iterator may provide a value when it yields.
   
sum: INT := 0;
loop sum := sum + 1.upto!(10); end;
#OUT + sum + '\n';              -- Prints sum of integers from 1 to 10
Instead of yielding control back to the enclosing loop, an iterator may also terminate or quit, which terminates the enclosing loop.

Note that each loop may contain more than one iterator call, thus providing much more flexibility than conventional languages. When any of the iterators terminates, the whole loop terminates, and execution continues at the next statement after the loop.

3.1.2 Built-in iterators

The until!, while! and break! iterators are built-in. They have the standard definitions of until, while and break in other languages and may occur anywhere in the loop body. while! expressions are iterator calls which take a single boolean argument that is re-evaluated on each iteration. They yield when the argument is true and quit when it is false. until! expressions are iterator calls which take a single boolean argument that is re-evaluated on each iteration. They yield when the argument is false and quit when it is true. break! expressions are iterator calls which immediately quit when they are called.
   
sum:INT := 0; i: INT := 0;
loop while!(i < 5);   
   sum := sum + i;   
   i := i + 1;
end;
#OUT+ "Sum="+sum+'\n';         -- Prints out Sum=10
The break! iterator can be used to terminate a loop at any time. We illustrate this with the bubble sort routine show below, which terminates the first time a pass through the data occurs with no order change.
   
bubble_sort(a:ARRAY{INT}) is
   loop 
     done: BOOL := true;
     i:INT := 0; -- Loop until the "break!" is encountered
     loop until!(i = (a.size-2));
        if a[i] > a[i+1] then 
          done := false
          swap(inout a[i], inout a[i+1]); 
       end;
       i := i + 1;
     end;
     if done then break!; end;
   end;
end;
The 'swap' routine is as we have described earlier.
   
swap(inout x:INT, inout y:INT) is
   tmp:INT := x; x := y; y := tmp; 
end;    
Note that the above 'bubblesort' routine could easily be rewritten to only use until!.

In addition to the built-in iterators, there are many commonly used iterators in the INT class that provide for interation over a range of values. For instance, the iterator upto! yileds successive integer values.
   
sum:INT:= 0; loop sum := sum+ 10.upto!(20); end;

The upto! iterator returns successive integers from 10 upto 20, inclusive. Below, are examples of a few other common iterators
   
i:INT :== 10; sum:INT := 0; 
loop 11.times!; sum := sum + i; i := i + 1; end;
The 'times' iterator yields a certain number of times. For iterating over a range with a certain stride, use the 'step' iterator
   
sum: INT := 0;
loop sum := sum + 18.step!(11,2); end;
-- The first argument is the number of iterations, 11 in this case
-- the second argument is the stride
. The following example counts 11 even numbers starting at 18

The 'step_upto!' iterator is similar, but instead of specifying a number of iterations, it specifies the maxium value to be reached
   
sum: INT := 0;
loop sum := sum + 18.step_upto!(40,2); end;
. The following loop is equivalent to the preceeding one.

3.2 Defining Iterators

3.2.1 yield statements

Iterator definitions are similar to routine definitions, except that we need to indicate when control should be transferred back to the calling point. In a routine, this transfer of control is indicated by a return statement, which terminates the routine. An iterator, however, can return control in two different ways. It can either

The yield must return a value (of the appropriate type), if the iterator has a return value.
   
range!(min, max:INT):INT is  
    i:INT := min;
    loop until!(i > max);
        yield i;           -- 
        i := i + 1;
   end;
end;
 
This iterator can be used to add up all the numbers in a particular integer range
   
sum: INT := 0;  loop sum := sum+range!(1,10); end;

3.2.2 Explicitly leaving an iterator using quit

When an iterator has yielded as many times as needed, it can either reach the end of it's statement list or explicitly call a quit statement. quit statements are used to terminate loops and may only appear in iterator definitions. No value is returned from an iterator when it quits. No statements may follow a quit statement in a statement list. The following definition of 'range!' is equivalent to the preceeding definition:
   
range!(min, max:INT):INT is  
    x:INT := min;
    loop if x > max then quit end;
        yield x;        
        x := x + 1;
   end;
end;

3.2.3 Control flow within an iterator

The following figures illustrate the control flow between an interator and its calling loop.

When the iterator is first called, control goes into the iterator and then returns to the outer loop, when the iterator yields in step [7]

After the first yield, control continues in the outer loop until the iterator is encountered again in step [11] and control is again transferred to the iterator, right after the point of the yield, in step [12]

The above sequence will continue until the if statement is true and the quit statement is encountered in the iterator. Control is then transferred to the end of the enclosing loop. The iterator calling context keeps track of the internal state of the iterator from the last yield.

3.2.4 The once argument mode

One problem with the above definition of 'range!' is that the arguments to the function will be evaluated each time through the loop. Consider the following loop
   
sum:INT := 0;
max:INT := 5;
loop  sum := sum + range!(3,max);
 max := max+2;
end;
This, somewhat silly, example will go into an infinite loop, since the argument 'max' increases each time through the loop.

Iterator argument are hot by default. This means that the arguments will be re-evaluated and passed to the iterator each time through the loop. When the arguments to the iterator are constant, it is not important whether they are re evaluated or not. However, in many cases it is important to ensure that the argument is only evaluated the first time through the loop.

This happens to once-arguments. Arguments which are marked with the mode 'once are only evaluated the first time they are encountered during a loop execution. Thus, the correct definition of the 'range' iterator is:
   
range!(once min, once max:INT):INT is 
 i:INT := min;
 loop until!(i > max);
  yield i
  i := i + 1;
 end;
end;
Note that 'once' arguments are only marked at the point of definition, not at the point of call. Thus, invoking the loop will look the same as before
   
sum:INT := 0; loop sum := sum + range!(3,5); end; 

The 'self' parameter (i.e. the object on which the iterator is being called) is always a once parameter.
   
i: INT := 5;
loop #OUT+i.upto!(11)!+' '; i:=1; end;
-- The above loop prints out 5 6 7 8 9 10 11   
In the above example, though the value of 'i' changes the second time through the loop, the change is ignored - the first value of 'i' is used.

The following more complex example will sum up some of the elements of the first row although the variable row will contain different rows in consecutive loop iterations.
   
loop     -- Sum up some of the elements of the first row!
 row := matrix.row!;
 sum := sum + row.elt!;  
      -- row is only evaluated at the first iteration!
end;

3.2.5 out and inout argument modes

Yield causes assignment to out and inout rguments in the caller i.e. these arguments are assigned each time when the iterator yields..
   
range!(once min, once max:INT, out val:INT) is
 i: INT := min;
 loop until!(i > max);
  val := i;
  yield;
  i := i + 1;
 end;
end;
Which may be used by:
   
sum:INT := 0;  
loop res:INT;  
 range2!(3,5, out res);  
 sum :=  sum + res;
end; 
#OUT+sum+'\n'; -- Prints out 12
Note that no assignment to out and inout arguments takes place when an iterator quits.

3.2.6 pre and post conditions in iterators

The behavior of pre- and post- conditions in iterator definitions is a natural extension of their behavior in routine definitions. The pre clause must be true each time the iterator is called and the post clause must be true each time it yields. The post clause is not evaluated when an iterator quits.

3.2.7 Argument evaluation in iterators

At a more technical level, when an iterator is first called in a loop, the expressions for self and for each once argument are evaluated left to right. Then the expressions for arguments which are not once (in or inout before the call, out or inout after the call) are evaluated left to right. On subsequent calls, only the expressions for arguments which are not once are re-evaluated. self and any once arguments retain their earlier values.

3.2.8 Points to note

Iterator usage

When the iterator elt! terminates the surrounding loop, an opening bracket has already been printed. The expression producing the matching closing bracket will not be evaluated, hence the algorithm will always print a bogus closing bracket in the end. The standard solution looks as follows:
   
loop  #OUT + ( "(" + c.elt! + ")\n" ); end;

The extra paratheses force the whole line to be evaluated first. As this evaluation will be aborted by the quit of the iterator the printing evaluation will not happen for the last iterator call.

Iterator definitions

3.3 Iterator Examples

Some of the following examples make use of arrays, which will be introduced in the chapter on Parametrized Classes

Because they are so useful, the 'while!', 'until!' and 'break!' iterators are built into the language. Here's how 'while!' could be written if it were not a primitive
   
while!(pred:BOOL) is
 -- Yields as long as 'pred' is true
 loop
  if pred then yield
  else quit
  end
 end
end.
The built-in class 'INT' defines some useful iterators. Here's the definition of 'upto!'. Unlike the argument 'pred' used above, 'i' here is declared to be 'once'; when 'upto!' is called, the argument is only evaluated once, the first time the iterator is called in the loop.
   
upto!(once i:SAME):SAME is
 -- Yield successive integers from self to `i' inclusive.
 r::=self; 
 loop 
  until!(r>i);
  yield r; 
  r:=r+1
 end
end;
To add up the integers 1 through 10, one might say
   
sum: INT := 0; loop sum := sum + 1.upto!(10) end

Or, using the library iterator 'sum!' like this. 'x' needs to be declared (but not initialized) outside the loop, so its value is available after the loop terminates.
   
x: INT; loop x:=INT::sum!(1.upto!(10)) end

Some of the most common uses of iters are with container objects. Arrays, lists, sets, trees, strings, and vectors all have iterators to yield all their elements. Here we print all the elements of some container 'a'
   
a: ARRAY{INT} := |1,2,7|;
loop #OUT + a.elt!.str + '\n'  end
This doubles the elements of array 'a'
   
loop a.set!(a.elt! * 2) end

This computes the dot product of two vectors 'a' and 'b'. There is also a built-in method 'dot' to do this. 'x' needs to be declared (but not initialized) before the loop.
   
loop x:=sum!(a.elt! * b.elt!) end

Separating elements of a list

When printing out the elements of a container, or other kinds of lists, it is usually appropriate to insert a separator between all the elements (but, of course, not after the last element). There is a convenient iterator in the string class that does exactly this:.
   
a: ARRAY{INT} := |1,2,3|;
loop  #OUT +  ",".separate!(a.elt!.str); end;
-- Prints out 1,2,3
The separate! iterator is called on the string which you wish to use to separate components of the list. In this case, the list elements will be separated by a comma. The definition of this iterator is as shown below
   
class STR is
  ...
  separate!(s: STR): STR is
     -- On the first iteration just outputs `s', on later iterations
     -- it outputs self followed by `s'. 
     yield s.str; loop yield self + s.str end 
  end;
Note that the argument to the iterator is not a once argument, and will be evaluated each time the iterator is called.