Introduction

Iterators are a new type of code abstraction resembling procedures (which are named routines in Sather). The first limited implementation occurred in CLU. Iterators are a generalisation of constructs like while, for or repeat .. until. Furthermore they are a powerful alternative for iterator classes in other object oriented languages.

This introduction has been motivated by the huge amount of questions occurring in the Sather newsgroup. Some of the answers and suggestions of Alex Cozzi, Matthias Ernst, Claudio Fleiner, Ben Gomes, Matt Kennel and David Stoutamir have been compiled into this article.

The syntax of the iterators has slightly changed in Sather 1.1. If your compiler does not accept the programs in this tutorial, you might have to upgrade to the newest compiler version.


Iterators for beginners

Iterators are different from procedures which are difficult itself for beginners to understand. Where procedures are abstractions for dividing code into separatable pieces, iterators provide ways of controling code which should be executed over and over.

The most important thing to know about iterators are that they belong to loops. Only within loops iterators may be called.

A loop contains a list of statements which will be executed in order over and over without any end. No normal statement can bring the loop to an end, only iterators can. Whenever an iterator is being called it either returns - one says it yields - like a call to a normal routine or it quits the innermost surrounding loop. When returning the iterator may return a value - like routines do. A loop may contain more than one iterator; each of them can quit the loop.

  i := 1.upto!(10);
  loop
    #OUT + i + "\n";
  end;
This is illegal, because the iterator must be in the loop.
  loop
    i := 1.upto!(10);
    until!( i > 5 );
  end;
This is a legal loop. The upto! iterator will not finish counting up to 10, because the until! iterator will finish the loop when i has reached 6.

The most common loop constructs people know are while-loops repeat-loops and for-loops, here is how to translate them into iterators:

while and repeat-until -loops

While and Repeat loops are trivial:

Pascal or MODULA 2            SATHER

while (condition) do          loop
  ...                           while!( condition );
                                ...
end;                          end;

repeat                        loop
  ...                           ...
until (condition);              until!( condition );
                              end;
Iterators are - like routines - being called within objects. But in which object is while! being called? Good question. The iterators while!(cond:BOOL), until!(BOOL) and break! are predefined iterators which can be used in every class.

for-loops

Some languages have varying options for using for-loops. There might be loops counting upward or downward or even counting in steps different from 1. In Sather there are different iterators for these different styles of for-loops.

for i := a to b do            loop
  ...                           i := a.upto!(b);
end;                            ...
                              end;

for i := a to b by c do       loop
   ...                          i := a.step!(b,c);
end;                            ...
                              end;

for i := a to b by -1 do      loop
   ...                          i := a.downto!(b);
end;                            ...
                              end;
The iterators upto!(INT), downto!(INT) and step!(INT,INT) are iterators which belong to integers.

Note that the iterator is not called in i - the loop variable - but in a - the start value -, and the result of the iterator is assigned to i. By the way, when i gets a new value somewhere in the loop, this will not change the number of times the loop will be executed, as the value of i is not used in the iterator itself - it is just a result.

Exercise 1

Write a program printing all products of the numbers from 1 to 10. You may use all discussed iterators.

Some other nifty predefined iterators

INT::times!
This iterator can be used just to force a loop to be executed a certain amount of times. Usually this iterator should be the first statement in a loop.
        loop
          10.times!;
          ... -- all code here will be executed
          ... -- 10 times.
        end;
        
STR::separate!($STR):STR
How often do people write programs to print out some lists? Usually one wants to write commas between the list items but none in front of the list or at the very end of it. This iterator provides exactly this behavior.

separate! is being called in a loop to print elements in form of a list. It is called in the string object representing the separator between the list elements. When this iterator is called once, it just returns the string representation of the passed object. Upon the following calls it returns the string representation of the passed argument preceded by the separating string.

          loop
            i := 1.upto!(10);
            #OUT + ",".separate!( i*i );
          end;

        -- prints "1,4,9,16,25,36,49,64,81,100"
        
Note: the argument of separate! will be evaluated within every call (see 'once' arguments).

ARRAY{T}::elt!:T
This iterator returns consecutively all elements of the array, without the hassle dealing with indices and checking for the end of the array.
          loop #OUT + array.elt! end;
        

Writing iterators

All the iterators we have discussed so far are provided by the Sather libraries or even by the language itself.

Writing new iterators is almost the same as writing procedures with only one difference. As procedures return the to the calling procedure with the statement return, the iterators have two kinds of return:

yield
returns to the calling loop. Like return it can return an optional result. But there is one big difference to the return:

When the same iterator call is being executed again, the iterator will not be executed from the beginning but rather from the next statement behind the yield statement!

quit
aborts calling loop. It can return no result like yield (Where should this result be assigned to, anyway?). Reaching the end of an iterator is equivalent to quit.

Other ways of leaving iterators:

return
is not allowed in iterators!
raise
statements also terminate the iterator, because whenever an iterator raises an exception, the protect statement has to be outside of the calling loop and hence the control flow leaves the whole loop.
How an iterator works

The following example tries to illustrate the special behavior of yield and quit. Given the following program:

class MAIN is
  main is
    i,sum: INT;

    sum := 0;
    loop
      i := 1.upto!(10);
      sum := sum + i;
    end;
    #OUT + sum + '\n';
  end;
end; -- class MAIN
class INT is
  ...
  upto!(once m:INT):INT is
    x: INT := self;
    loop
      if x>m then quit end;
      yield x;
      x := x + 1;
    end;
  end;
end; -- class INT

To each textual iterator call there is an iterator context, which stores the information whether the once arguments already have been evaluated and from which yield the iterator returned after the last invocation.

In this example the iterator context knows the two entry points of the iterator, at the beginning and behind the yield. When the iterator is being invocated for the first time, the control is being transferred to the beginning of the iterator. The iterator computes the first result and stops at the yield. The context switches back to the surrounding loop and continues.

When reaching the end of the loop, the loop will just be started from the beginning. But this time the iterator context is different. At the next time the iterator is being called the control is passed to the context, the controlflow will continue right behind the yield statement. The end of the loop in the iterator will be reached and the loop will be restarted (this loop has nothing to do with the loop in the calling routine).

When the iterator hits the quit statement, the control is not transferred to the iterator call but to the end of the surrounding loop. The control will continue with the first statement after the surrounding loop.

A first iterator with a single quit

  my_break!
  is
    quit
  end;
Whenever you call this iterator, the loop in which the call occurs will be terminated immediately. The quit could be left out, because reaching the end of the iterator has the same effect.
loop
  #OUT + "Hello\n";
  my_break!;
  #OUT + "there\n";
end;
This loop therefore would print "Hello" once and never "there". This iterator does the same as the predefined break!.

A single yield

The following iterator does something different. Try to figure out what, before you continue reading.

class MAIN
is
  once!
  is
    yield
  end;

  main
  is
    loop
      #OUT + "Hello\n";
      once!;
      #OUT + "World\n";
    end
  end;
end; -- class MAIN
When the iterator is being executed first the first statement will be executed. This is a yield statement, which returns to the calling loop. So after the "Hello" "World" will be printed. Then the end of the loop is reached and started over with printing "Hello" again. When the iterator is called next time, the execution in the iterator is resumed after the yield from the first round. As the yield was the last statement of the iterator, the calling loop will be quit-ted. So the output is "Hello World Hello".

What a strange iterator, allowing the first half of the loop to be executed twice, you think? True, but imagine the iterator standing in the loop as first statement. Then the whole loop would be executed just once.

Exercise 2

Write an iterator called three_times! - without using any other iterators - so that the following loop will be executed exactly three times:

  loop
    three_times!;
    #OUT + "Hello World!\n";
  end;
Exercise 3

What does an implementation of a while!(BOOL) iterator look like? The frame of the iterator should look like this, and again: do not use any other iterators within your implementation.

  my_while!( cond: BOOL )
  is
     ... -- fill in your solution here
  end;
Control flow in an iterator

Unlike a procedure an iterator does not always start its execution at the first statement of its body. This only happens when the iterator call has benn invoked the first time. Later invocations continue after the yield which returnd the control to the calling loop. The figure in the section How an iterator works gives a visual idea how the control actually flows though an iterator.

More than one iterator per loop

Other languages allow only one kind of iteration in one loop. You cannot combine while-loops and for-loops in an easy way. The only restriction in Sather for iterators is, that they have to appear in loops. But one can use as many iterators in one loop as one wishes. Every iterator can terminate the loop or return to the loop. So the first iterator which quits the loop terminates all other iterators with it.

We have already done it in an earlier example. There the upto! iterator is used to generate consecutive numbers, and the separate iterator processes them into list form.

There the upto! terminates the loop - the separate iterator never terminates a loop because it is designed to receive an arbitrary number of objects to process.

Iterators more than once in a loop

When there is the possibility to call more than one iterator in a loop, one can also call the same iterator twice. However, the information about the state, the iterator is in, are connected to the call and passed to the iterator when it is called. So the calls are independant from one another as if the other call would not exist (Unless the other quits and terminates the loop, of course).

  loop
    #OUT + a.elt! + "," + a.elt! + "\n";
  end;
One might expect that this loop writes the elements of the array a two in a row. As both calls to elt! are independant from each other this loop prints all elements of the array twice in a row. If you want to print all elements only once you have to write something like this:
  even: BOOL := false;
  loop
    #OUT + a.elt!;
    if even then
      #OUT + "\n";
    else
      #OUT + ",";
    end;
    even := ~even
  end;
Exercise 4

The last program is somewhat ugly. If the array contains an odd number of elements there is a trailing comma instead of a newline. Write a routine print_array which does not have this flaw.

Hint: Writing a helping iterator might be useful.


Iterators for experts

What are once arguments?

Usually, an interator will be called more than once. But as the iterator remembers its state it can also remember the value of the arguments from the last call. Usually the iterator does not do this, but when an argument is marked as a once argument in the declaration of the iterator, the iterator remembers its value and does not re-evaluate it in subsequent calls.

Routines have no once arguments. If a routine is being called multiple times, all arguments will be re-evaluated.

  loop
    i := 1.upto!( 2 * j );
    ...
  end
INT::upto!(INT):INT receives an once argument. So, it is evaluated exactly once. If j changes its value, this will not affect the parameter in the iterator. (In for loops written in C the upper bound of a for-loop is re-evaluated in every run of the loop. So you save some computations in Sather.)

The arguments of while!(BOOL) and until(BOOL!) can't be once arguments. The condition in these iterators should be checked again when the loop repeats. It would not make much sense to evaluate the boolean expression only once and decide whether one wants to execute the loop forever or not at all.

Exercise 5

Write an own version of the separate iterator which receives two arguments. The first is the object to be printed into the list. It should be a subtype of $STR, so that you can pass everything printable when using the iterator. The second argument should be the string being used to separate these objects - and should only be evaluated once. Do not use any other iterators!

  loop
    i := 1.upto!( 10 );
    #OUT + my_separate!( i*i, "," );
  end;
  #OUT + "\n";

  -- prints "1,4,9,16,25,36,49,64,81,100"
'self' is always 'once'

It is important to know that the object in which the iterator is called is always an once argument, it will only be evaluated once - when the iterator is being executed the first time. Thats why an iterator may not be called in a result of another iterator.

  list: ARRAY{ARRAY{INT}};
  loop
    i := list.elt!.elt!;
  end;
list.elt! would return a different array in every loop, but the second elt! would always being called within the first result of the list.elt! call. The compiler can detect this case and will produce an error message.

One last remark: Only 'in' arguments can be 'once' arguments.

Variable declaration in iterators

As one can declare local variables everywhere in a Sather program, this is also possible withing loops. But these variables are only initialized once and not in every run of the loop. This can be considered as a flaw, because the only reason for this are speed considerations.

  loop
    i := a.ind1!;
    sum:FLT;
    loop
      sum := sum + a[ i, aind2! ];
    end;
    res[i] := sum;
  end;
Here sum is initialized to zreo only in the first time, and not in every cycle, as one might think.

Recursive iterators

Using recursive iterators is more tricky than expected. An iterator might do the next step of its recursion and call itself recursively. But one has to consider that this recursive call might yield more than one result and all these results have to be yielded forth.

Imagine a binary tree which has left and right children and contains some data in each node. How to write an iterator yielding all data from the tree?

class TREE{T}
is
  attr left,right: SAME;
  attr data: T;

  create(l,r:SAME, d: T): SAME
  is
    res: SAME := new;
    res.left := l; res.right := r; res.data := d;
    return res
  end;

  elt!: T
  is
    if ~ void(self) then
      loop
        yield left.elt!
      end;
      yield data;
      loop
        yield right.elt!
      end
    end --if
  end; -- elt!

end; -- class TREE{T}
This is a very small example for a tree class. The left child of the node can be a rather big tree containing many objects to be yielded. One has to write a loop to yield all of them until the iterator quits. After this you can yield the data from the node and then you have another loop for all of the objects in the right subtree.

In an n-ary tree the same situation looks like this.

class N_ARY_TREE{T}
is
  attr children: ARRAY{SAME};
  attr data: T;
  ...
  elt!: T
  is
    child: SAME;
    yield data;
    loop
      child := children.elt!;
      loop
        yield child.elt!
      end;
    end;
  end;
end; -- class N_ARY_TREE{T}
There one might be tempted to write loop yield children.elt!.elt! instead of the two nested loops. But this will not work, because the compiler detects an error, see
'self' is always 'once'.

Exercise 6

But what might happen, if you forget the innermost loop statements in the iterator?

Iterators in complex expressions

What does the following program print? You should try out first.

  a := | 2,2,2 |;
  loop
    #OUT + "value=" + a.elt! + "\n"
  end;
One might expect that the program prints 3 lines, each one containing value=2. But for a big surprise there appears a fourth incomplete line just with a value= starting there.

This is a result of the order of evaluation in the #OUT expression. The output expression is being evaluated whenever the loop starts. When evaluating this expression the iterator will be called after the string value= has been printed. So when this iterator is being called the fourth time, the string will have been printed four times. Termination of the loop only suppresses the printout of the result of the iterator and the printout of the newline.

In this concrete (and common) case the problem is easy to fix by using parathese to force the complete outputstring to be evaluated first and then to pass it to OUT.

    #OUT + ("value=" + a.elt! + "\n")
But seems somehow like a hack, as now the plus routine in STR will be called and no longer the plus in OUT. Hence the first argument in the list of summands has to be a string.

Exercise 7

Write an iterator receiving a string and yielding all permutations of the string. You may use all known iterators and you can treat all letters of the string as being unique within the string (perm!("aa") may yield "aa" twice). But do not write more than one iterator in your solution.

Hint:

  s_without_ith_letter := s.head(i-1) + s.tail( s.length-i-1 );

Solutions for some of the exercises

It is intended to have no links from the exercises to the solutions! The temptation to peek into the solution would be too big :-)

Solution for Exercise 1

class MAIN
is
  main
  is
    i,j: INT;

    #OUT + "         1   2   3   4   5   6   7   8   9  10\n";
    loop
      i := 1.upto!(10);
      #OUT + #FMT( "<####> :",i);
      -- Use the format class to justify `i' to the right in an area
      -- of size 4.
      loop
        j := 1.upto!(10);
        #OUT + #FMT( "<####>", i*j );
      end;
      #OUT + "\n";
    end
  end;
end; -- class MAIN

See the Documentation of the format class for more information how to print formatted objects.

Solution for Exercise 2

  three_times!
  is
    yield;
    yield;
    yield;
  end;
This program does the job. When called the first time, the first yield will return the control to the loop. The next call resumes after the first yield and executes the second one. The third time the third yield will be executed. When trying to go through the loop the forth time, the iterator reaches its end and therefore terminates the loop.

Exercise for intermediates

Try to implement an iterator which leaves the loop after n rounds. It should be used like this:

  loop
    times(n);
    ...
  end;
As in the first solution: Do not use any other iterator!

Hint: You will have to use a loop within the iterator, but you can leave the loop within an iterator with yield or quit.

Solution for Exercise 3

This solution is intentionally left out.

Solution for Exercise 4

The trick is to ensure that the last thing printed out is always a number and to print the separating characters before the next number. Watch out for the first number, which does not need a linefeed in front.
  print_column!( num: INT )
  is
    #OUT + num;          -- first
    yield;
    loop
      #OUT + "," + num;  -- even
      yield;
      #OUT + "\n" + num; -- odd
      yield;
    end;
  end; -- print_column!

  print_array( a: ARRAY{INT} )
  is
    loop
      print_column!( a.elt! );
    end;
    #OUT + "\n"          -- terminating linefeed
  end; -- print_array
Sure, this is only one of many possible solutions. If your solution differs, check out what happens when passing an odd sized array, an even sized array, and empty array and an array with only one element.

Remember: As the element to be printed changes in every loop the argument to the iterator must not be once argument.

Here a version almost without iterators, just for comparison.
  print_array( a: ARRAY{INT} )
  is
    i,el: INT;
  
    i := 1;
    loop
      el := a.elt!;
      if i = 1 then
        #OUT + el
      elsif i.even then
        #OUT + "," + el
      else -- i.odd and i>1
        #OUT + "\n" + el
      end;
      i := i + 1
    end;
    #OUT + "\n"
  end; -- print_array
Every iterator call maintains a state in which the current status of the iterator is being encoded. This state is made visible here by using i, but the program flow is much less obvious.

Exercise for intermediates

Write an iterator capable printing the contents of an array row-by-row in an arbitrary number of columns. The separator between the columns should be arbitrary as well. You may use iterators discussed in this tutorial. Which arguments to the iterator should be once and which not?

Does it make sense to write an iterator printing down column-by-column?

Solution for Exercise 5

  my_separate!( ob: $STR, once sep: STR ): STR
  is
    yield ob.str;
    loop
      yield sep + ob.str
    end
  end;
The first time, the iterator is called, the first yield returns just the string representation of the object ob. From then on, all calls will be handled by the second yield and all objects will be preceded by the separator.

Note that the first argument has an exclamation mark in the declaration and the second has none. So the second argument is evaluated only once and all separator will look the same - no matter what the user might do.

  a := | 1,2,3,4,5 |;
  star := " * ";
  loop
    #OUT + my_separate( a.elt!, star );
    star := ",";
  end;

  -- Will produce "1 * 2 * 3 * 4 * 5"
Solution for Exercise 6

If a tree looks like this:

	     3
	    /|\
	   / | \
	 14  7  10
	 /\  |  /|\
	2  5 8 6 1 9

  loop #OUT + tree.elt!.str + "\n" end;
The output of this loop should be 3 14 2 5 7 8 10 6 1 9. Forgetting the loop around child.elt! the result is 3 14 2. What is going on?
  1. First just the data of the root is being yielded.
    -->3
  2. The loop in the iterator computes the leftmost node from the root, which is the node containing 14. In this node the iterator is being called. This iterator returns the value of the node.
    -->14
  3. Now it gets interesting. The outermost loop reaches its end and starts over. It computes the next node (containing 7). But the object in which iterator is being called afterwards is the 14 node, because the argument in which the iterator is being called is a 'once' argument (see 'self' is always 'once')! So this iterator enters its loop and computes the node containing 2.
    -->2
  4. In the next loop the outermost iterator computes the node containing 10 and calls again the 14 iterator which computes the 5 node but calls the 2 iterator. This one quits, because the node 2 has no children. This causes the loop in the 14 iterator to terminate. So the loop in the 3 iterator quits, too, and this terminates the main loop.
If the 3 node happened to have less than 3 children the output might even been shorter.

Solution for Exercise 7

This method yields all permutation of a string. It works as follows: All letters of the string are being taken as the first letter of the permutated strings. This letter is removed from the string. Now all permutations of the shorter string are computed and the selected letter is being added as a head.

  perm!(once s:STR): STR
  is
    head, tail: STR;
    i: INT;

    if s.length <= 1 then
      yield s
    else
      loop
        i := 0.upto!(s.length-1);
        head := s[i].str;
        tail := s.head(i) + s.tail(s.length-i-1);
        loop
          yield head + perm!(tail)
        end
      end
    end
  end; -- perm!
Hopefully you did not forget the loop ... end around the yield head + perm!(tail).

To be continued ...


Last modified: 07/09/96
Dipl. math. Holger Klawitter (holger@icsi.berkeley.edu).