----- File: 1987/tr-87-001 A Minimax Arc Theorem for Reducible Flow Graphs Vijaya Ramachandran tr-87-001 November 1987 We establish a conjecture of Frank and Gyarfas by proving that the cardinality of a minimum feedback arc set in a reducible flow graph is equal to the cardinality of a maximum collection of arc disjoint cycles. ----- File: 1988/tr-88-001 Future Directions in DBMS Research Erich Neuhold and Michael Stonebraker tr-88-001 February 1988 On February 4-5, 1988, the International Computer Science Institute sponsored a two-day workshop at which 16 senior members of the database research community discussed future research topics in the DBMS area. This paper summarizes the discussion which took place. ----- File: 1988/tr-88-002 The Cell Tree: An Index for Geometric Databases Oliver Günther tr-88-002 June 1988 This paper describes the design of the cell tree, an index structure for geometric databases. The data objects in the database are represented as unions of convex point sets (cells). The cell tree is a balanced tree structure whose leaves contain the cells and whose interior structure allows quick access to the cells (and thereby to the data objects), depending on their location in space. Furthermore, the cell tree is designed for paged memory: each node corresponds to a disk page. This minimizes the number of page faults occurring during a tree search. Point location and range searches can therefore be carried out very efficiently using the cell tree. ----- File: 1988/tr-88-003 Measuring with Slow Clocks Heinz Beilner tr-88-003 July 1988 This report describes a measurement technique and corresponding statistical evaluation options that can be used for assessing the mean duration of performing a particular operation, even when this duration is small compared with the resolution of an available, readable clock. The technique has been developed with regard to measuring operation durations of distributed system kernels, and to measuring durations of sub-activities embedded in these operations. The technique employs repetitive executions of the measured operation, but does not depend on the usually employed "tight loop" around the operation. It also allows for simultaneous assessments of several different time intervals within the repetitive pattern. Based on an initial guess about the mean length of the smallest time interval to be measured, the necessary number of loop cycles can be determined before an experiment, for a selectable width of the confidence interval of the mean to be estimated, and at a selectable confidence level. ----- File: 1988/tr-88-004 MOSIX: An Integrated UNIX for Multiprocessor Workstations Amnon Barak and Richard Wheeler tr-88-004 October 1988 MOSIX is a general-purpose Multicomputer Operating System that integrates a cluster of loosely connected, independent computers (nodes) into a single-machine UNIX environment. Developed originally at Hebrew University for a cluster of uniprocessor nodes, it has recently been enhanced to support nodes with multiple processors. In this paper we present the hardware architecture of this multiprocessor workstation and the software architecture of the MOSIX operating system kernel. We then describe the main enhancements made in the multiple processor version and give some performance measurements of the internal mechanisms of the system.
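To make the repetition technique of tr-88-003 above concrete, here is a minimal Python sketch (a hypothetical illustration, not code from the report): blocks of N repetitions are timed with the coarse clock, the block means form a sample, and a confidence interval for the per-operation mean is computed from that sample. A pilot run with a guessed mean duration can then be used to size N before the real experiment.

    import time
    import statistics

    def mean_duration(op, n_reps=10_000, n_blocks=30, z=1.96):
        """Estimate the mean duration of op() despite a coarse clock.
        Each block of n_reps repetitions is long enough for the clock
        to resolve; the block means are treated as a sample of the
        per-operation mean."""
        block_means = []
        for _ in range(n_blocks):
            start = time.monotonic()        # the available, readable clock
            for _ in range(n_reps):
                op()
            block_means.append((time.monotonic() - start) / n_reps)
        m = statistics.mean(block_means)
        s = statistics.stdev(block_means)
        half = z * s / len(block_means) ** 0.5   # CI half-width (normal approx.)
        return m, half
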
----- File: 1988/tr-88-005 Static Allocation of Periodic Tasks with Precedence Restraints in Distributed Systems Kang Shin and Dar-Tzen Peng tr-88-005 October 1988 Using two branch-and-bound (B&B) algorithms, we propose an optimal solution to the problem of allocating (or assigning with subsequent scheduling considered) periodic tasks to a set of heterogeneous processing nodes (PNs) of a distributed real-time system. The solution is optimal in the sense of minimizing the maximum normalized task response time, called the system hazard, subject to precedence constraints among the tasks to be allocated. First, the task system is described as a task graph (TG), which represents computation and communication modules as well as the precedence constraints among them. Second, the exact system hazard of a complete assignment is determined so that an optimal (rather than suboptimal) assignment can be derived. This exact cost is obtained by optimally scheduling the modules assigned to each PN with a B&B algorithm guided by the dominance relationship between simultaneously schedulable modules. Third, to reduce the amount of computation needed for an optimal assignment, we derive a lower-bound system hazard that is obtainable with a polynomial time algorithm. This lower-bound cost, together with the exact cost of a complete assignment, is used to efficiently guide the search for an optimal assignment. Finally, examples are provided to demonstrate the concept, utility and power of our approach. ----- File: 1988/tr-88-006 Load Sharing in Distributed Real-Time Systems with Broadcast State Changes Kang Shin and Yi-Chieh Chang tr-88-006 October 1988 If task arrivals are not uniformly distributed over the nodes in a distributed real-time system, some nodes may become overloaded while others are lightly-loaded or even idle. Consequently, some tasks cannot be completed before their deadlines, even if the overall system has the capacity to meet all deadlines. Load sharing (LS) is one way to alleviate this difficulty. In this paper, we propose a decentralized, dynamic LS method for a distributed real-time system. Under this LS method, whenever the state of a node changes from lightly-loaded to overloaded and vice versa, the node broadcasts this change to a set of nodes, called a buddy set, in the system. An overloaded node can select, without probing other nodes, the first available node from its preferred list, an ordered set of nodes in its buddy set. Preferred lists are so constructed that the probability of more than one overloaded node "dumping" their loads on a single lightly-loaded node may be made very small. Performance of the proposed LS policy is evaluated with both analytic modeling and simulation. Analytic models are used to derive the distribution of queue length at each node and the probability of meeting task deadlines, and to analyze the effects of buddy set size, the frequency of state change, and the average system sojourn time of each task. Simulation is used to verify the analytic results. The proposed LS method is shown to meet task deadlines with a very high probability. ----- File: 1988/tr-88-007 Monitoring and Management-Support of Distributed Systems Dieter Haban, Dieter Wybranietz, and Amnon Barak tr-88-007 November 1988 This paper describes a tool for on-line monitoring of distributed systems.
The tool consists of a hardware component and a software component, i.e., it is a hybrid monitor, capable of presenting the interactive user and the local operating system with high-level information and a performance evaluation of the activities in the host system, with minimal interference. Special hardware support, consisting of a test and measurement processor (TMP), was designed and has been implemented in the nodes of an experimental multicomputer system. The main function of the TMP is to execute low-level operating system functions, to manage local resources and to trigger time-driven events in order to reduce the overhead of the host operating system. The operations of the TMP are completely transparent to the users, with a minimal, less than 0.1%, overhead to the hardware system. In the experimental system, all the TMPs were connected with a central monitoring station, using an independent communication network in order to provide a global view of the monitored system. The central monitoring station displays the resulting information in easy-to-read charts and graphs. Our experience with the TMP shows that it promotes an improved understanding of run-time behavior and supports performance measurements from which qualitative and quantitative assessments of distributed systems can be derived. ----- File: 1988/tr-88-008 Links Between Markov Models and Multilayer Perceptrons Herve Bourlard and C. J. Wellekens tr-88-008 November 1988 Hidden Markov models are widely used for automatic speech recognition. They inherently incorporate the sequential character of the speech signal and are statistically trained. However, the a priori choice of a model topology limits the flexibility of HMMs. Another drawback of these models is their weak discriminating power.

Multilayer perceptrons are now promising tools in the connectionist approach for classification problems and have already been successfully tested on speech recognition problems. However, the sequential nature of the speech signal remains difficult to handle in that kind of machine.

In this paper, a discriminant hidden Markov model is defined and it is shown how a particular multilayer perceptron with contextual and extra feedback input units can be considered as a general form of such Markov models. Relations with other recurrent networks commonly used in speech recognition are also pointed out. ----- File: 1988/tr-88-009 Designing Computers to Check Their Work Manuel Blum tr-88-009 November 1988 Students, engineers, programmers...are taught to check their work. Computer programs are not. There are several reasons for this:

1. Computer hardware almost never makes errors -- but that fails to recognize that programmers unfortunately do!

2. Programs are hard enough to write without having to also write program checkers for them -- but that is the price of increased confidence!

3. There is no clear notion what constitutes a good checker. Indeed, the same students and engineers who are cautioned to check their work are rarely informed what it is that makes for a good procedure to do so -- but that is just the sort of problem that computer scientists should be able to solve!

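A classic concrete instance of such a checker (a standard textbook example, sketched here in Python; it does not appear in this report) is Freivalds' probabilistic check for matrix multiplication: instead of recomputing A x B, multiply both sides by random 0/1 vectors, which costs only O(n^2) per trial.

    import random

    def freivalds_check(A, B, C, trials=20):
        """Check that C == A x B (square matrices as lists of lists).
        A wrong C is accepted with probability at most 2**-trials."""
        n = len(A)
        for _ in range(trials):
            r = [random.randint(0, 1) for _ in range(n)]
            Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
            ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
            Cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
            if ABr != Cr:
                return False    # caught an error in C
        return True             # C is almost certainly correct
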
In the view of the author, the lack of correctness checks in programs is an oversight. Programs have bugs that could perfectly well be caught by such checks. This paper urges that programs be written to check their work, and outlines a promising and rigorous approach to the study of this fascinating new area. ----- File: 1988/tr-88-010 Knowledge-Intensive Recruitment Learning Joachim Diederich tr-88-010 November 1988 The model described in this paper is a knowledge-intensive connectionist learning system which uses a built-in knowledge representation module for inferencing, and this reasoning capability in turn is used for knowledge-intensive learning. On the connectionist network level, the central process is the recruitment of new units and the assembly of units to represent new conceptual information. Free, uncommitted subnetworks are connected to the built-in knowledge network during learning. The goal of knowledge-intensive connectionist learning is to improve the operationality of the knowledge representation: mediated inferences, i.e., complex inferences which require several inference steps, are transformed into immediate inferences; in other words, recognition is based on the immediate excitation from features directly associated with a concept. ----- File: 1988/tr-88-011 Time, Space and Form in Vision Jerome A. Feldman tr-88-011 December 1988 The prodigious spatial capabilities of the primate visual system are even more remarkable when temporal considerations are taken into account. Recent advances in neurophysiology, psychophysics and computer vision provide significant constraints on how the system could work. This paper presents a fairly detailed connectionist computational model of how the perception and recognition of objects is carried out by primate brains. The model is claimed to be functionally adequate and to satisfy all the constraints established by the various disciplines. One key notion introduced is a multi-input, multi-output network for inverting spatio-temporal cues. The central construct in intermediate level motion vision is taken to be the trajectory and these are used in recognition of dynamic situations called scenarios. The entire development is an extension of the author's 1985 Four Frames model, which required relatively little modification to accommodate temporal change (eventually). ----- File: 1988/tr-88-012 On a Theory of Computation and Complexity Over the Real Numbers; NP Completeness, Recursive Functions and Universal Machines Lenore Blum, Mike Shub, and Steve Smale tr-88-012 December 1988 We present a model for computation over the reals or an arbitrary (ordered) ring R. In this general setting, we obtain universal machines, partial recursive functions, as well as NP complete problems. While our theory reflects the classical theory over Z (e.g., the computable functions are the recursive functions) it also reflects the special mathematical character of the underlying ring R (e.g., complements of Julia sets provide natural examples of R.E. undecidable sets over the reals) and provides a natural setting for studying foundational issues concerning algorithms in numerical analysis. ----- File: 1988/tr-88-013 Program Correctness Checking and the Design of Programs That Check Their Work Manuel Blum and Sampath Kannan tr-88-013 December 1988 A program correctness checker is an algorithm for checking the output of a computation. This paper defines the concept of a program checker. 
It designs program checkers for a few specific and carefully chosen problems in the class P of problems solvable in polynomial time. It also applies methods of modern cryptography, especially the idea of a probabilistic interactive proof, to the design of program checkers for group theoretic computations. Finally, it characterizes the problems that can be checked. ----- File: 1989/tr-89-001 Guaranteeing Performance for Real-Time Communication in Wide-Area Networks Domenico Ferrari tr-89-001 January 1989 The increasing importance of distributed multimedia applications and the emergence of user interfaces based on digital audio and digital video will soon require that computer communication networks offer real-time services. This paper argues that the feasibility of providing performance guarantees in a wide-area network should be investigated, and describes a possible approach. We present a model of the network to be studied, and discuss its generality, as well as the presumable limits to its validity in the future. We also give a careful formulation of the problem, including a precise definition of the guarantees to be provided and a provably correct scheme for the establishment of real-time connections with deterministic, statistical, and best-effort delay bounds. ----- File: 1989/tr-89-002 Pseudo-Random Number Generator From ANY One-Way Function Russell Impagliazzo and Mike Luby tr-89-002 February 1989 We construct a pseudo-random number generator from ANY one-way function. Previous results show how to construct pseudo-random number generators from one-way functions that have special properties (Blum and Micali [BM], Yao [Y], Levin [L1], Goldreich, Krawczyk and Luby [GKL]). We use techniques borrowed from the theory of slightly-random sources (Santha and Vazirani [SV], Vazirani and Vazirani [VV], Vazirani [V], Chor and Goldreich [CG]) and from the theory of universal hash functions (Carter and Wegman [CW]).

We also introduce a weaker kind of one-way function, that we call an informationally one-way function. For an informationally one-way function f, given y = f(x) for a randomly chosen x, it is hard to generate a uniformly random preimage of y. We show that the existence of an informationally one-way function yields a one-way function in the usual sense, and hence a pseudo-random number generator. These results can be combined to show that the following are equivalent: (1) private key encryption; (2) bit commitment; (3) pseudo-random number generators; (4) one-way functions; (5) informationally one-way functions. ----- File: 1989/tr-89-003 Parallel Search for Maximal Independence Given Minimal Dependence Paul Beame and Michael Luby tr-89-003 February 1989 We consider the problem of finding a maximal independent set fast in parallel when the independence system is presented as an explicit list of minimal dependent sets. Karp and Wigderson [KW] were the first to find an NC algorithm for the special case when the size of each minimal dependent set is at most two, and subsequent work by Luby [Lu1], by Alon, Babai and Itai [ABI], and by Goldberg and Spencer [GS] has introduced substantially better algorithms for this case. On the other hand, no previous work on this problem extends even to the case when the size of each minimal dependent set is at most a constant. We present an algorithm that handles this case, and we conjecture that it is a randomized NC algorithm for the general case. ----- File: 1989/tr-89-004 Towards a Theory of Average Case Complexity Shai Ben-David, Benny Chor, Oded Goldreich, and Michael Luby tr-89-004 February 1989 This paper takes the next step in developing the theory of average case complexity, a study initiated by Levin. Previous works have focused on the existence of complete problems [Le,Gu,VL]. We widen the scope to other basic questions in computational complexity. For the first time in the context of average case complexity, we show the equivalence of search and decision problems, analyze the structure of NP under P reductions, and relate the NP versus average-P question to the non-deterministic versus deterministic (worst case) exponential time question. We also present definitions and basic theorems regarding other complexity classes, such as average log-space. ----- File: 1989/tr-89-005 A Study of Password Security Michael Luby and Charles Rackoff tr-89-005 February 1989 We prove relationships between the security of a function generator when used in an encryption scheme and the security of a function generator when used in a UNIX-like password scheme. ----- File: 1989/tr-89-006 Fault-Tolerant Routing in Hypercube Multicomputers Using Depth-First Search Ming-Syan Chen and Kang G. Shin tr-89-006 February 1989 A fault-tolerant routing scheme for hypercube multicomputers is developed using depth-first search. The routing scheme requires a node to know only the condition (faulty or not) of its own links, and adds information on the components traversed to each message as it is routed toward the destination node.

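The depth-first idea can be pictured with a small Python sketch (a hypothetical rendering of the general approach, not the report's exact scheme; in particular, the traversal information appended to each message is elided here). Dimensions in which the current node and the destination differ are tried first, since they keep the path optimal.

    def dfs_route(source, dest, n, faulty):
        """Depth-first routing on an n-dimensional hypercube. Each node
        uses only the status of its own links; faulty is a set of
        frozenset({u, v}) link pairs. Nodes are integers whose bits are
        hypercube coordinates."""
        path, visited = [source], {source}
        node = source
        while node != dest:
            diff = node ^ dest
            dims = [d for d in range(n) if diff >> d & 1]        # preferred
            dims += [d for d in range(n) if not diff >> d & 1]   # spare
            for d in dims:
                nxt = node ^ (1 << d)
                if nxt not in visited and frozenset((node, nxt)) not in faulty:
                    path.append(nxt)
                    visited.add(nxt)
                    node = nxt
                    break
            else:                        # dead end: back up one hop
                path.pop()
                if not path:
                    return None          # destination unreachable
                node = path[-1]
        return path
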
Performance of the proposed routing scheme is rigorously analyzed. We derive an exact expression for the probability of routing messages via optimal paths (of length identical to the Hamming distance between the corresponding pair of nodes) from the source node to an obstructed node, the first node on a path determined by the above routing scheme from which no optimal path to the destination exists. Moreover, bounds for this probability are derived in closed form. The probability of routing messages via optimal paths between the source and destination can be obtained from this expression by replacing the obstructed node with the destination node. The lengths of paths obtained from this scheme are analyzed, and the scheme, despite its simplicity, is shown to be able to route messages via optimal paths with a very high probability.

Due to the absence of information at each node on components other than its own links, the actual paths chosen by the above scheme could sometimes be longer than desired. To alleviate this deficiency, we also present a simple modification to the above routing scheme in which every node is made aware of not only the condition of its own links but also that of links one hop away from the node. The improvement of routing efficiency with this additional information at each node is analyzed. ----- File: 1989/tr-89-007 A Linear Algorithm for Enumerating Perfect Matchings in Skew Bipartite Graphs Paul Dagum tr-89-007 February 1989 Let G = (U,V,E) be a bipartite graph with |E| = m, U union V = {v_1, ..., v_{2n}} and with the bipartition U consisting of all odd indexed vertices and V consisting of all even indexed vertices. An edge in G is always assumed to be oriented towards the endpoint with the larger index. We refer to the up (resp. down) edges of G as the edges which are oriented from an even (resp. odd) indexed vertex. If all the up edges are nested among themselves and among the down edges we say G is a skew graph. The main result of this paper is to give an O(m) algorithm to enumerate perfect matchings in skew graphs. Applications to outerplanar graphs and some problems in chemistry are given. ----- File: 1989/tr-89-008 Spreading Activation and Connectionist Models for Natural Language Processing Joachim Diederich tr-89-008 February 1989 High level cognitive tasks performed by an artificial neural network require both knowledge over a domain and inferencing abilities. To operate in a complex, natural environment neural networks must have robust, reliable and massively parallel inference mechanisms. This paper describes various spreading activation and connectionist mechanisms for inferencing as part of natural language processing systems, including possible techniques to enrich these systems by machine learning. In particular, models which attack one or more important problems such as variable binding, knowledge-intensive learning, avoidance of cross-talk and false classifications are selected for this overview. ----- File: 1989/tr-89-009 Constructive Omega(t^{1.26}) Lower Bound for the Ramsey Number R(3,t) Richard Cleve and Paul Dagum tr-89-009 February 1989 We present a feasibly constructive proof that R(3,t) > 5((t-1)/2)^{log 4/log 3}, which is in Omega(t^{1.26}). This is, as far as we know, the first constructive superlinear lower bound for R(3,t). Also, our result yields the first feasible method for constructing triangle-free k-chromatic graphs that are polynomial-size in k. ----- File: 1989/tr-89-010 Conceptual Hierarchies in Classical and Connectionist Architecture Alfred Kobsa tr-89-010 February 1989 Representation systems for conceptual hierarchies have been used in the field of Artificial Intelligence for nearly two decades. They are based on symbolic representation structures and sequential processes operating upon these structures. Recently, a number of network structures have been developed in the field of Connectionism which are also claimed to be able to represent conceptual hierarchies. Processes in these networks operate in a parallel way and largely without a global control mechanism.
This paper investigates the expressive power, interpretation, and inferential capabilities of these networks as compared to traditional representations of concept hierarchies, in particular to KL-ONE, a standard representation language for conceptual hierarchies in the field of natural-language processing. Although the capabilities of current connectionist hierarchies fall short of traditional representations, three inference processes will be described which can be very easily and elegantly realized in a connectionist architecture whilst they are hard and cumbersome to implement in traditional knowledge representation systems. ----- File: 1989/tr-89-011 Preemptive Ensemble Motion Planning on a Tree Greg N. Frederickson and D. J. Guan tr-89-011 March 1989 Consider the problem of transporting a set of objects between the vertices of a tree by a vehicle that travels along the edges of the tree. The vehicle can carry only one object at a time, and it starts and finishes at the same vertex of the tree. It is shown that if objects can be dropped at intermediate vertices along its route and picked up later, then the problem can be solved in polynomial time. Two efficient algorithms are presented for this problem. The first algorithm runs in O(k + qn) time, where n is the number of vertices in the tree, k is the number of objects to be moved, and q <= min{k,n} is the number of nontrivial connected components in a related directed graph. The second algorithm runs in O(k + n log n) time.

* Has since been revised by author. Contact him via "gnf at cs.purdue.edu" for a current copy. ----- File: 1989/tr-89-012 Nonpreemptive Ensemble Motion Planning on a Tree Greg N. Frederickson and D. J. Guan tr-89-012 March 1989 Consider the problem of transporting a set of objects between the vertices of a tree by a vehicle that travels along the edges of the tree. The vehicle can carry only one object at a time, and it starts and finishes at the same vertex of the tree. It is shown that if each object must be carried directly from its initial vertex to its destination, then finding a minimum cost transportation is NP-hard. Several fast approximation algorithms are presented for this problem. The fastest runs in O(k + n) time and generates a transportation of cost at most 3/2 times the cost of an optimal transportation, where n is the number of vertices in the tree and k is the number of objects to be moved. Another runs in O(k + n log beta(n,q)) time, and generates a transportation of cost at most 5/4 times the cost of an optimal transportation, where q <= min{k,n} is the number of nontrivial connected components in a related directed graph.

* Has since been revised by author. Contact him via "gnf at cs.purdue.edu" for a current copy. ----- File: 1989/tr-89-013 The Establishment of the International Computer Science Institute in Berkeley, California: Venturing with Norbert Ron Kay tr-89-013 March 1989 This is an account of the events and considerations which led to the establishment of the International Computer Science Institute in Berkeley, California. The initiative for this undertaking came from Norbert Szyperski, as Managing Director of the German National Center for Computer Science (GMD). He also took the lead in assuring support on the part of German industry and government. Copies of the most important source documents are included as an appendix to this account. ----- File: 1989/tr-89-014 Subtree Isomorphism is in Random NC Philip Gibbons, Richard M. Karp, Gary L. Miller, and Danny Soroker tr-89-014 March 1989 Given two trees, a guest tree G and a host tree H, the subtree isomorphism problem is to determine whether there is a subgraph of H that is isomorphic to G. We present a randomized parallel algorithm for finding such an isomorphism, if it exists. The algorithm runs in time O(log^3 n) on a CREW PRAM, where n is the number of nodes in H. The number of processors required by the algorithm is polynomial in n. Randomization is used (solely) to solve each of a series of bipartite matching problems during the course of the algorithm. We demonstrate the close connection between the two problems by presenting a log-space reduction from bipartite perfect matching to subtree isomorphism. Finally, we present some techniques to reduce the number of processors used by the algorithm. ----- File: 1989/tr-89-015 Planar Graph Decomposition and All Pairs Shortest Paths Greg N. Frederickson tr-89-015 March 1989 An algorithm is presented for generating a succinct encoding of all pairs shortest path information in a directed planar graph G with real-valued edge costs but no negative cycles. The algorithm runs in O(pn) time, where n is the number of vertices in G, and p is the minimum cardinality of a subset of the faces that cover all vertices, taken over all planar embeddings of G. The algorithm is based on a decomposition of the graph into O(pn) outerplanar subgraphs satisfying certain separator properties. Linear-time algorithms are presented for various subproblems including that of finding an appropriate embedding of G and a corresponding face-on-vertex covering of cardinality O(p), and of generating all pairs shortest path information in a directed outerplanar graph.

* Has since been revised by author. Contact him via "gnf at cs.purdue.edu" for a current copy. ----- File: 1989/tr-89-016 Explanation and Connectionist Systems Joachim Diederich tr-89-016 April 1989 Explanation is an important function in symbolic artificial intelligence (AI). For example, explanation is used in machine learning and for the interpretation of prediction failures in case-based reasoning. Furthermore, the explanation of results of a reasoning process to a user who is not a domain expert must be a component of any inference system. Experience with expert systems has shown that the ability to generate explanations is absolutely crucial for the user-acceptance of AI systems (Davis, Buchanan & Shortliffe 1977). In contrast to symbolic systems, neural networks have no explicit, declarative knowledge representation and therefore have considerable difficulties in generating explanation structures. In neural networks, knowledge is encoded in numeric parameters (weights) and distributed all over the system.

It is the intention of this paper to discuss the ability of connectionist systems to generate explanations. It will be shown that connectionist systems benefit from the explicit encoding of relations and the use of highly structured networks in order to realize explanation and explanation components. Furthermore, structured connectionist systems using spreading activation have the advantage that any intermediate state in processing is semantically meaningful and can be used for explanation. The paper describes several successful applications of explanation components in connectionist systems which use highly structured networks, and discusses possible future realizations of explanation in neural networks. ----- File: 1989/tr-89-017 Generalization and Parameter Estimation in Feedforward Nets: Some Experiments N. Morgan and H. Bourlard tr-89-017 April 1989 We have begun an empirical study of the relation of the number of parameters (weights) in a feedforward net to generalization performance. Two experiments are reported. In one, we use simulated data sets with well-controlled parameters, such as the signal-to-noise ratio of continuous-valued data. In the second, we train the network on vector-quantized mel cepstra from real speech samples. In each case, we use back-propagation to train the feedforward net to discriminate in a multiple class pattern classification problem. We report the results of these studies, and show the application of cross-validation techniques to prevent overfitting. ----- File: 1989/tr-89-018 A Parallel Algorithm for Maximum Matching in Planar Graphs Marek Karpinski, Elias Dahlhaus, and Andrzej Lingas tr-89-018 April 1989 We present a new parallel algorithm for finding a maximum (cardinality) matching in a planar bipartite graph G. Our algorithm is processor-time product efficient if the size l of a maximum matching of G is large. It runs in time O((n/2 - l + sqrt(n)) log^7 n) on a CRCW PRAM with O(n^{1.5} log^3 n) processors. ----- File: 1989/tr-89-019 A More Practical PRAM Model Phillip B. Gibbons tr-89-019 April 1989 This paper introduces the Asynchronous PRAM model of computation, a variant of the PRAM in which the processors run asynchronously and there is an explicit charge for synchronization. A family of Asynchronous PRAMs is defined, varying in the types of synchronization steps permitted and the costs for accessing the shared memory. Algorithms, lower bounds, and simulation results are presented for an interesting member of the family. ----- File: 1989/tr-89-020 Multiple Network Embeddings into Hypercubes Ajay Gupta and Susanne E. Hambrusch tr-89-020 April 1989 In this paper we study the problem of how to efficiently embed r interconnection networks G_0, ..., G_{r-1}, r <= k, into a k-dimensional hypercube H so that every node of the hypercube is assigned at most r nodes all of which belong to different G_i's. When each G_i is a complete binary tree or a leap tree of 2^k - 1 nodes, we describe an embedding achieving a dilation of 2 and a load of 5 and 6, respectively. For the cases when each G_i is a linear array or a 2-dimensional mesh of 2^k nodes, we describe embeddings that achieve a dilation of 1 and an optimal load of 2 and 4, respectively.
Using these embeddings, we also show that r_1 complete binary trees, r_2 leap trees, r_3 linear arrays, and r_4 meshes can simultaneously be embedded into H with constant dilation and load, provided r_1 + r_2 + r_3 + r_4 <= k. ----- File: 1989/tr-89-021 Learning Read-Once Formulas Using Membership Queries Lisa Hellerstein and Marek Karpinski tr-89-021 April 1989 In this paper we examine the problem of exact learning (and inferring) of read-once formulas (also called mu-formulas or boolean trees) using membership queries. The power of membership queries in learning various classes of formulas was studied by Angluin [A]. Valiant proved that, using three powerful oracles, read-once formulas can be learned in polynomial time [V]. Pitt and Valiant proved that if RP is not equal to NP, read-once formulas cannot be learned by example in polynomial time [PV,KLPV]. We show that given explicitly a boolean formula f defining a read-once function, if RP is not equal to NP, then there does not exist a polynomial time algorithm for inferring an equivalent read-once formula. An easy argument on the cardinality of the set of all (read-once) 1-term DNF formulas implies an exponential lower bound on the number of membership queries necessary to learn read-once formulas. Angluin showed that it takes time 2^{Omega(n)} to learn monotone n-term DNF formulas using membership queries [A]. We prove that, surprisingly, it is possible to learn monotone read-once formulas in polynomial time using membership queries. We present an algorithm that runs in time O(n^3) and makes O(n^3) queries to the oracle. It is based on a combinatorial characterization of read-once formulas developed by Karchmer et al. [KLNSW]. We also use the combinatorial characterization to prove two other results. We show that read-once formulas can be learned in polynomial time using only one of the three oracles used in Valiant's polynomial time algorithm. In addition, we show that given an arbitrary boolean formula f, the problem of deciding whether f defines a read-once function is complete in the class D^P under randomized NC^1-reductions. The main results of this paper can also be interpreted in terms of efficient input oracle algorithms for boolean function interpolation (cf. [KUW], [GKS]). ----- File: 1989/tr-89-022 Real-Time Communication in Packet-Switching Wide-Area Networks Domenico Ferrari tr-89-022 May 1989 The increasing importance of distributed multimedia applications and the emergence of user interfaces based on digital audio and digital video will soon require that computer communication networks offer real-time services. This paper argues that the feasibility of providing performance guarantees in a packet-switching wide-area network should be investigated, and describes a possible approach. We present a model of the network to be studied, and discuss its generality, as well as the presumable limits to its validity in the future. We also formulate the problem, give a definition of the guarantees to be provided, and describe a correct scheme for the establishment of real-time connections with deterministic, statistical, and best-effort delay bounds. ----- File: 1989/tr-89-023 Approximating the Permanent of Graphs with Large Factors Paul Dagum and Michael Luby tr-89-023 April 1989 Let G = (U,V,E) be a bipartite graph with |U|=|V|=n.
The factor size of G, f, is the maximum number of edge disjoint perfect matchings in G. We characterize the complexity of counting the number of perfect matchings in classes of graphs parameterized by factor size. We describe the simple algorithm, which is an approximation algorithm for the permanent that is a natural simplification of the algorithm suggested in [Broder 86] and analyzed in [Jerrum, Sinclair 88 a,b]. A combinatorial lemma is used to prove that the simple algorithm runs in time n^{O(n/f)}. Thus (1) for all constants alpha > 0, the simple algorithm runs in polynomial time for graphs with factor size at least alpha n; (2) for some constant c, the simple algorithm is the fastest known approximation for graphs with factor size at least c log n. (Compare with the approximation algorithms described in [Karmarkar et al. 88].)

We prove the following complementary hardness results. For functions f such that 3 <= f(n) <= n - 3, the exact counting problem for f(n)-regular bipartite graphs is #P-complete. For any epsilon > 0, for any function f such that 3 <= f(n) <= n^{1-epsilon}, approximate counting for f(n)-regular bipartite graphs is as hard as approximate counting for all bipartite graphs. ----- File: 1989/tr-89-024 An Efficient Parallel Algorithm for the Minimal Elimination Ordering (MEO) of an Arbitrary Graph Elias Dahlhaus and Marek Karpinski tr-89-024 May 1989 We design the first efficient parallel algorithm for computing Minimal Elimination Ordering (MEO) of an arbitrary graph.

The algorithm works in O(log^3 n) parallel time and O(nm) processors on a CRCW PRAM, for an n-vertex, m-edge graph, and is optimal up to a polylogarithmic factor with respect to the best sequential algorithm of Rose, Tarjan and Lueker.

The MEO Problem for arbitrary graphs arises in a number of combinatorial optimization problems, as well as in database applications, scheduling problems, and the sparse Gaussian elimination of symmetric matrices. It was previously believed to be inherently sequential, strongly resisting sublinear parallel time (sublinear sequential storage) algorithms.

As an application, this paper gives the first efficient parallel solutions to the problem of Minimal Fill-In for arbitrary graphs (and connected combinatorial problems, cf. [RTL 76], [Ta 85]), and to the problem of the Gaussian elimination of sparse symmetric matrices [Ro 70], [Ro 73]. (The problem of computing Minimum Fill-In is known to be NP-complete [Ya 81].) It also gives an alternative to the efficient parallel algorithm of [GM 87] for computing Breadth-First Search (BFS) trees in arbitrary graphs using O(nm) processors on a CRCW PRAM.

The method of solution involves the development of new techniques for solving the connected minimal set system problem, combined with some new divide-and-conquer methods. ----- File: 1989/tr-89-025 On Parallel Evaluation of Game Trees Richard M. Karp and Yanjun Zhang tr-89-025 May 1989 We present parallel algorithms for evaluating game trees. These algorithms parallelize the "left-to-right" sequential algorithm for evaluating AND/OR trees and the alpha-beta pruning procedure for evaluating MIN/MAX trees. We show that, on every instance of a uniform tree, these parallel algorithms achieve a linear speed-up over their corresponding sequential algorithms, if the number of processors used is close to the height of the input tree. These are the first non-trivial deterministic speed-up bounds known for the "left-to-right" algorithm and the alpha-beta pruning procedure. ----- File: 1989/tr-89-026 Separating Abstraction from Implementation in Communication Network Design Ramon Caceres tr-89-026 May 1989 Datagrams and virtual circuits are not disjoint conceptual models for data communication, but rather inhabitants of a wide design space containing many other viable networking solutions. Many design choices often closely associated with these two communication styles can be decoupled from the datagram and virtual circuit abstractions, and combined to form new and effective network implementations. This paper examines several key elements of network architecture. For each element, it shows how certain characteristics often thought to differentiate datagrams and virtual circuits are independent of these two concepts and form a multi-valued spectrum of design choices. This discussion is motivated by the current drive to design a new generation of high-speed wide-area networks, and the observation that this effort would benefit from a more systematic evaluation of existing and future network design alternatives. ----- File: 1989/tr-89-027 Boolean Circuit Complexity of Algebraic Interpolation Problems Marek Karpinski tr-89-027 May 1989 We present here some recent results on fast parallel interpolation of multivariate polynomials over finite fields. Some applications towards the general conversion algorithms for boolean functions are also formulated. ----- File: 1989/tr-89-028 Application of Real-Time Monitoring to Scheduling Tasks with Random Execution Times Dieter Haban and Kang Shin tr-89-028 May 1989 A real-time monitor is employed to aid in scheduling tasks with random execution times in a real-time computing system. Scheduling algorithms are usually based on the worst-case execution time (WET) of each task. Due to data-dependent loops and conditional branches in each program and resource sharing delay during execution, this WET is usually difficult to obtain and could be several orders of magnitude larger than the true execution time. Thus, scheduling tasks based on WET could result in a severe underutilization of CPU cycles and underestimation of system schedulability.

To alleviate the above problem, we propose to use a real-time monitor as a scheduling aid. The real-time monitor is composed of dedicated hardware, called the Test and Measurement Processor (TMP), and is used to measure accurately, with minimal interference, the true execution time, which consists of the pure execution time and the resource sharing delay. The monitor is a permanent and transparent part of a real-time system, degrades system performance by less than 0.1 percent, and does not interfere with the host system's execution.

Using the measured pure execution time and resource sharing delay for each task, we have developed a mechanism which reduces the discrepancy between the WET and the estimated execution time. This result is then used to decide at the earliest possible time whether or not a task can meet its deadline. ----- File: 1989/tr-89-029 Behavior and Performance Analysis of Distributed Systems Using a Hybrid Monitor Dieter Haban and Dieter Wybranietz tr-89-029 May 1989 This paper describes a hybrid monitor for measuring the performance and observing the behavior of distributed systems during execution. We emphasize data collection, analysis and presentation of execution data. A special hardware support, which consists of a test and measurement processor (TMP), was designed and has been implemented in the nodes of an experimental multicomputer system consisting of eleven nodes. The operations of the TMP are completely transparent with a minimal, less than 0.1%, overhead to the measured system. In the experimental system, all the TMPs were connected with a central monitoring station, using an independent communication network, in order to provide a global view of the monitored system. The central monitoring station displays the resulting information in easy-to-read charts and graphs. Our experience with the TMP shows that it promotes an improved understanding of run-time behavior and supports performance measurements from which qualitative and quantitative assessments of distributed systems can be derived. ----- File: 1989/tr-89-030 Monitoring and Measuring Parallel Systems Using a Non-Intrusive Rule-Based System Dieter Haban and Dieter Wybranietz tr-89-030 March 1989 This paper describes a tool for on-line monitoring of distributed systems and the evaluation of the collected data. The hybrid monitor is capable of presenting the interactive user and the local operating system with high-level information about the behavior and the activities in the host system with minimal interference. A special hardware support, which consists of a test and measurement processor (TMP), was designed and has been implemented in the nodes of an experimental multicomputer system. The operations of the TMP are completely transparent to users with a minimal, less than 0.1 percent, overhead to the hardware system. To provide a global view of the monitored system, a central monitoring station evaluates the locally collected data and displays the resulting information in charts and graphs. A rule-based evaluation system assists in improving the understanding of run-time behavior and in easily assessing performance measurements. Flexibility is achieved by rules given in tables which control the evaluation and the display of monitored and processed data. These rules represent expert-level knowledge about the evaluation of distributed systems. ----- File: 1989/tr-89-031 One-Way Functions are Essential for Complexity Based Cryptography (Extended Abstract) Russell Impagliazzo and Michael Luby tr-89-031 May 1989 In much of modern cryptography, the security of a protocol is based on the intractability of a problem such as factorization of randomly chosen large numbers. The problems assumed intractable all have the same form; they are based on a one-way function, i.e. one that is easy to compute but hard to invert. This is not a coincidence. We show that for many cryptographic tasks any secure protocol for the task can be converted into a one-way function, and thus any proposed protocol for these tasks is implicitly based on a one-way function.
Tasks examined here are chosen to cover a spectrum of cryptographic applications: private-key encryption, identification/authentication, bit commitment and coin-flipping by telephone. Thus, unless one-way functions exist, secure protocols for these tasks are impossible. ----- File: 1989/tr-89-032 A Connectionist Model of Unification Andreas Stolcke tr-89-032 May 1989 A general approach to encode and unify recursively nested feature structures in connectionist networks is described. The unification algorithm implemented by the net is based on iterative coarsening of equivalence classes of graph nodes. This method allows the reformulation of unification as a constraint satisfaction problem and enables the connectionist implementation to take full advantage of the potential parallelism inherent in unification, resulting in sublinear time complexity. Moreover, the method is able to process any number of feature structures in parallel, searching for possible unifications and making decisions among mutually exclusive unifications where necessary.
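The coarsening idea in tr-89-032 has a well-known sequential counterpart, unification with union-find, sketched below in Python (a hypothetical serial rendering for illustration; the report's contribution is computing the coarsening in a parallel network). Terms are tuples like ('f', 'X', ('g', 'a')); bare strings are variables.

    def find(parent, x):
        """Return the representative of x's equivalence class."""
        root = x
        while parent.setdefault(root, root) != root:
            root = parent[root]
        parent[x] = root                  # light path compression
        return root

    def unify(a, b, parent=None):
        """Merge equivalence classes of term-graph nodes; returns the
        class map, or None on failure. No occurs check, as is usual
        for graph unification."""
        parent = {} if parent is None else parent
        a, b = find(parent, a), find(parent, b)
        if a == b:
            return parent
        if isinstance(a, str):            # variable: bind it
            parent[a] = b
            return parent
        if isinstance(b, str):
            parent[b] = a
            return parent
        if a[0] != b[0] or len(a) != len(b):
            return None                   # functor or arity clash
        parent[a] = b                     # coarsen the two classes first
        for x, y in zip(a[1:], b[1:]):    # then propagate to arguments
            if unify(x, y, parent) is None:
                return None
        return parent

For example, unify(('f', 'X'), ('f', ('g', 'a'))) returns a class map binding X to ('g', 'a').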

Keywords: Unification, constraint satisfaction, connectionism, feature structures. ----- File: 1989/tr-89-033 Merging Multilayer Perceptrons and Hidden Markov Models: Some Experiments in Continuous Speech Recognition Herve Bourlard and Nelson Morgan tr-89-033 May 1989 The statistical and sequential nature of the human speech production system makes automatic speech recognition difficult. Hidden Markov Models (HMM) have provided a good representation of these characteristics of speech, and were a breakthrough in speech recognition research. However, the a priori choice of a model topology and weak discriminative power limit HMM capabilities. Recently, connectionist models have been recognized as an alternative tool. Their main useful properties are their discriminative power and their ability to capture input-output relationships. They have also proved useful in dealing with statistical data. However, the sequential character of speech is difficult to handle with connectionist models. We have used a classic form of a connectionist system, the Multilayer Perceptron (MLP), for the recognition of continuous speech as part of an HMM system. We show theoretically and experimentally that the outputs of the MLP approximate the probability distribution over output classes conditioned on the input (i.e., the Maximum a Posteriori (MAP) probabilities). We also report the results of a series of speech recognition experiments. By using contextual information at the input of the MLP, frame classification performance can be achieved which is significantly improved over the corresponding performance for simple Maximum Likelihood probabilities, or even MAP probabilities without the benefit of context.

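One standard way to use such posterior estimates inside an HMM (a hypothetical Python sketch of the general hybrid idea, not necessarily the authors' exact procedure) is to convert them to scaled likelihoods: by Bayes' rule p(x|q) is proportional to p(q|x)/p(q), so the MLP's frame posteriors divided by the class priors can serve as emission scores in Viterbi decoding.

    import numpy as np

    def viterbi(posteriors, priors, log_trans, log_init):
        """Decode a state sequence from per-frame MLP outputs.
        posteriors: (T, Q) network outputs p(q|x); priors: (Q,) class
        priors p(q); log_trans: (Q, Q) log transition probabilities;
        log_init: (Q,) log initial-state probabilities."""
        log_emit = np.log(posteriors) - np.log(priors)  # scaled likelihoods
        T, Q = log_emit.shape
        delta = log_init + log_emit[0]
        back = np.zeros((T, Q), dtype=int)
        for t in range(1, T):
            scores = delta[:, None] + log_trans        # [from, to]
            back[t] = scores.argmax(axis=0)
            delta = scores.max(axis=0) + log_emit[t]
        states = [int(delta.argmax())]
        for t in range(T - 1, 0, -1):
            states.append(int(back[t, states[-1]]))
        return states[::-1]
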
However, it was not so easy to improve the recognition of words in continuous speech by the use of an MLP, although it was clear that the classification at the frame and phoneme levels was better than we achieved with our HMM system. We present several modifications of the original methods that were required to achieve acceptable performance at the word level. Preliminary results are reported for a 1000-word vocabulary, phoneme-based, speaker-dependent continuous speech recognition system embedding an MLP into an HMM. These results show equivalent recognition performance using either the Maximum Likelihood or the outputs of an MLP to estimate emission probabilities of an HMM. ----- File: 1989/tr-89-034 A Survey of Optical Fibers in Communication Ramesh Govindan, Srinivasan Keshav, and Dinesh C. Verma tr-89-034 May 1989 In recent years there has been a major effort to integrate fiber optic media into existing communication systems. In this survey, we outline the physics behind fiber optic media and optical interfaces. Different types of optical interfaces and optical media are considered and the advantages and disadvantages of each are listed. We then discuss topologies and protocols suitable for optical fibers in communication. We also take a detailed look into the new Fiber Distributed Data Interface (FDDI) Standard for fiber-optic token rings. Finally, we list off-the-shelf fiber networks available as of September 1988. ----- File: 1989/tr-89-035 Conjectures on Representations in Backpropagation Networks Paul W. Munro tr-89-035 May 1989 The pros and cons of the backpropagation learning procedure have been the subject of numerous debates recently. Some point out its promise as a powerful instrument for finding the weights in a connectionist network appropriate to a given problem, and the generalizability of the solution to novel patterns. Others claim that it is an algorithm for fitting data to a function by error correction through gradient descent. The arguments in this paper focus on the latter (curve-fitting) point of view, but hold that the power of backpropagation comes from carefully choosing the form of the function to be fit. This amounts to choosing the architecture and the activation functions of the units (nodes) in the net. A discussion of the role of these two network features motivates two conjectures identifying the form of the squashing function as an important factor in the process. Some preliminary simulations in support of these conjectures are presented. ----- File: 1989/tr-89-036 A Scheme for Real-Time Channel Establishment in Wide-Area Networks Domenico Ferrari and Dinesh C. Verma tr-89-036 May 1989 Multimedia communication involving digital audio and/or digital video has rather strict delay requirements. A real-time channel is defined in this paper as a simplex connection between a source and a destination characterized by parameters representing the performance requirements of the client. A real-time service is capable of creating real-time channels on demand and guaranteeing their performance. These guarantees often take the form of delay bounds that the service enforces in exchange for offered load bounds specified and enforced by the client.

In this paper, we study the feasibility of providing real-time services on a packet-switched store-and-forward wide-area network with general topology. We describe a scheme for the establishment of channels with deterministic or statistical delay bounds, and present the results of the simulation experiments we ran to evaluate it. The results are encouraging: our approach is correct (i.e., satisfies the guarantees even in worst-case situations), uses the network's resources to a fair extent, and efficiently handles channels with a variety of offered load and burstiness characteristics. The packet transmission overhead is quite low, whereas the channel establishment overhead may occasionally become too large; an approximation method is therefore needed to reduce the latter overhead to an acceptable level even in those cases. ----- File: 1989/tr-89-037 A Tagging Method for Distributed Constraint Satisfaction Hans Werner Guesgen tr-89-037 June 1989 Local propagation algorithms such as Waltz' filtering and Mackworth's AC-x algorithms have been successfully applied in AI for solving constraint satisfaction problems (CSPs). In general, these algorithms can only be used as preprocessing methods as they do not compute a globally consistent solution for a CSP; they result in local consistency, also known as arc consistency.

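The arc-consistency baseline that the paper starts from can be stated compactly (standard AC-3 in Python, shown here for reference; the variable names are illustrative and not taken from the paper):

    from collections import deque

    def ac3(domains, constraints, neighbors):
        """domains: var -> set of values; neighbors: var -> iterable of
        vars; constraints: (x, y) -> predicate on (vx, vy), defined for
        every directed arc. Returns False on a domain wipe-out."""
        queue = deque((x, y) for x in neighbors for y in neighbors[x])
        while queue:
            x, y = queue.popleft()
            ok = constraints[(x, y)]
            removed = {vx for vx in domains[x]
                       if not any(ok(vx, vy) for vy in domains[y])}
            if removed:
                domains[x] -= removed
                if not domains[x]:
                    return False          # inconsistent
                queue.extend((z, x) for z in neighbors[x] if z != y)
        return True     # arc consistent, but not necessarily globally consistent
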
In this paper, we introduce an extension of local constraint propagation to overcome this drawback, i.e., to compute globally consistent solutions for a CSP. The advantage over backtracking approaches is that the method introduced here is easy to implement on parallel machines with an arbitrary number of processors. The underlying idea is to associate recursive tags with the values during the propagation process so that global relationships among the values are maintained. ----- File: 1989/tr-89-038 Metric Constraint Satisfaction with Intervals Peter B. Ladkin tr-89-038 June 1989 We show how algorithms in Dechter, Meiri and Pearl's recent paper on constraint satisfaction techniques for metric information on time points [DeMePe89] may be adapted to work directly with metric constraints on intervals. Inter alia we show termination of path-consistency algorithms if range intervals in the problem contain only rational number endpoints. ----- File: 1989/tr-89-039 Fast Parallel Algorithms for the Clique Separator Decomposition Elias Dahlhaus, Marek Karpinski and Mark B. Novick tr-89-039 July 1989 We give an efficient NC algorithm for finding a clique separator decomposition of an arbitrary graph, that is, a series of cliques whose removal disconnects the graph. This algorithm allows one to extend a large body of results which were originally formulated for chordal graphs to other classes of graphs. Our algorithm is optimal to within a polylogarithmic factor of Tarjan's O(nm) time sequential algorithm. The decomposition can also be used to find NC algorithms for some optimization problems on special families of graphs, assuming these problems can be solved in NC for the prime graphs of the decomposition. These optimization problems include: finding a maximum weight clique, a minimum coloring, a maximum-weight independent set, and a minimum fill-in elimination order. We also give the first parallel algorithms for solving these problems by using the clique separator decomposition. Our maximum independent set algorithm applied to chordal graphs yields the most efficient known parallel algorithm for finding a maximum-weight independent set of a chordal graph. ----- File: 1989/tr-89-040 The Possibility of an Executable Specification Language Peter B. Ladkin tr-89-040 July 1989 We consider what it takes to build an executable specification language for concurrent systems. The key ingredients are executability and very-high-level specification. Many researchers have concluded that one can't have both in any reasonable way. We consider a number of criteria for an executable specification language. We conclude that it is possible to build such a language, and thus that executability should be a criterion for evaluating any specification language for concurrent systems. ----- File: 1989/tr-89-041 Geometric Learning Algorithms Stephen M. Omohundro tr-89-041 June 1989 Emergent computation in the form of geometric learning is central to the development of motor and perceptual systems in biological organisms and promises to have a similar impact on emerging technologies including robotics, vision, speech, and graphics. This paper examines some of the trade-offs involved in different implementation strategies, focussing on the tasks of learning discrete classifications and smooth nonlinear mappings. The trade-offs between local and global representations are discussed, a spectrum of distributed network implementations is examined, and an important source of computational inefficiency is identified.
Efficient algorithms based on k-d trees and the Delaunay triangulation are presented and the relevance to biological networks is discussed. Finally, extensions of both the tasks and the implementations are given.
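One of the k-d tree algorithms alluded to, nearest-neighbor search, fits in a few lines of Python (the standard algorithm, given here for illustration; it is not the paper's code):

    import math

    def build(points, depth=0):
        """Build a k-d tree; each node is (point, axis, left, right)."""
        if not points:
            return None
        axis = depth % len(points[0])
        points = sorted(points, key=lambda p: p[axis])
        mid = len(points) // 2
        return (points[mid], axis,
                build(points[:mid], depth + 1),
                build(points[mid + 1:], depth + 1))

    def nearest(node, q, best=None):
        """Return the tree point closest to query q."""
        if node is None:
            return best
        point, axis, left, right = node
        if best is None or math.dist(q, point) < math.dist(q, best):
            best = point
        near, far = (left, right) if q[axis] <= point[axis] else (right, left)
        best = nearest(near, q, best)
        # the far side can only help if the splitting plane is closer
        # than the best match found so far
        if abs(q[axis] - point[axis]) < math.dist(q, best):
            best = nearest(far, q, best)
        return best
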

Keywords: learning algorithms, neural networks, computational geometry, emergent computation, robotics. ----- File: 1989/tr-89-042 Optimal Parallel Algorithm for the Hamiltonian Cycle Problem on Dense Graphs Elias Dahlhaus, Peter Hajnal and Marek Karpinski tr-89-042 June 1989 Dirac's classical theorem asserts that, if every vertex of a graph G on n vertices has degree at least n/2, then G has a Hamiltonian cycle. We give a fast parallel algorithm on a CREW-PRAM to find a Hamiltonian cycle in such graphs. Our algorithm uses a linear number of processors and is optimal up to a polylogarithmic factor. The algorithm works in O(log^4 n) parallel time on a CREW-PRAM. Our method bears some resemblance to Anderson's RNC algorithm [An] for maximal paths: we, too, start from a system of disjoint paths and try to glue them together. We are, however, able to perform the base step (perfect matching) deterministically. We also prove that a perfect matching in dense graphs can be found in NC^2. The cost of improved time is a quadratic number of processors.

On the negative side, we prove that finding an NC algorithm for perfect matching in slightly less dense graphs (minimum degree (1/2 - epsilon)|V|) is as hard as the same problem for all graphs, and interestingly the problem of finding a Hamiltonian cycle becomes NP-complete. ----- File: 1989/tr-89-043 Parallel Asynchronous Connected Components in a Mesh Susan Hambrusch and Michael Luby tr-89-043 July 1989 Levialdi [6] introduced a parallel synchronous algorithm for counting the number of connected components in a binary image embedded in an n x n mesh of processors that runs in time O(n). We describe a parallel asynchronous algorithm for the same problem achieving the same time bound. ----- File: 1989/tr-89-044 Removing Randomness in Parallel Computation Without a Processor Penalty Michael Luby tr-89-044 July 1989 We develop some general techniques for converting randomized parallel algorithms into deterministic parallel algorithms without a blowup in the number of processors. One of the requirements for the application of these techniques is that the analysis of the randomized algorithm uses only pairwise independence. Our main new result is a parallel algorithm for coloring the vertices of an undirected graph using at most delta + 1 distinct colors in such a way that no two adjacent vertices receive the same color, where delta is the maximum degree of any vertex in the graph. The running time of the algorithm is O(log^3 n log log n) using a linear number of processors on a concurrent read, exclusive write (CREW) parallel random access machine (PRAM). Our techniques also apply to several other problems, including the maximal independent set problem and the maximal matching problem. The application of the general technique to these last two problems is mostly of academic interest because parallel algorithms that use a linear number of processors and have better running times have previously been found [Israeli, Shiloach 86], [Goldberg, Spencer 87]. ----- File: 1989/tr-89-045 Parallel Path-Consistency Algorithms for Constraint Satisfaction Peter B. Ladkin and Roger D. Maddux tr-89-045 August 1989 This paper concerns heuristic algorithms used for solution of Boolean Constraint Satisfaction Problems, or CSPs [Mon74, Mac77, Fre78, Mac87]. CSPs occur particularly in areas of artificial intelligence such as vision, temporal reasoning, and truth-maintenance systems. The most common form involves binary constraints and we consider properties of binary CSPs only (we shall omit the adjective from now on). CSPs may be represented by labeled digraphs called binary constraint networks, or BCNs. Many constraint satisfaction techniques operate upon BCNs. An important property of BCNs is that of path-consistency, which is used extensively as a heuristic for solving CSPs (many classes of CSPs are NP-hard, e.g. [VilKau86]). Every BCN has a path-consistent reduction, and it is known that algorithms for computing it are serial O(n^3) in the number of variables [Mac77, Fre78, All83, MacFre85, MohHen86].

We have formulated CSPs and path-consistency computations in the framework of Tarski's relation algebra, and give a brief overview below [Tar41, LadMad88.2]. We give a parallel O(n^2 log n) algorithm for achieving path-consistency. We also give a class of hard examples on which all algorithms proposed so far, and possible parallelisations of them, take time Omega(n^2). This effectively constrains parallel path-consistency algorithms of the most common form (which we glorify with the name of reduction-type) within a fairly narrow asymptotic range.
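To fix intuitions about the operation being parallelized, the following minimal Python sketch runs the standard serial reduction step, R_ij := R_ij intersect (R_ik o R_kj), to a fixed point; the set-of-pairs encoding of constraints is our illustrative choice, not the relation-algebraic formulation of the paper.

    def compose(R, S):
        """Relational composition: pairs (a, c) that have a witness b."""
        return {(a, c) for (a, b1) in R for (b2, c) in S if b1 == b2}

    def path_consistency(R, n):
        """Tighten R[i][j] by R[i][k] o R[k][j] until nothing changes.
        Assumes R[i][i] is the identity relation and R[j][i] is the
        converse of R[i][j]."""
        changed = True
        while changed:
            changed = False
            for i in range(n):
                for j in range(n):
                    for k in range(n):
                        tightened = R[i][j] & compose(R[i][k], R[k][j])
                        if tightened != R[i][j]:
                            R[i][j] = tightened
                            changed = True
        return R

    # Three variables over {0, 1}: x0 != x1, x1 != x2, x0-x2 unconstrained.
    neq = {(0, 1), (1, 0)}
    eq = {(0, 0), (1, 1)}
    any_ = neq | eq
    R = [[eq, neq, any_], [neq, eq, neq], [any_, neq, eq]]
    path_consistency(R, 3)
    print(R[0][2])   # tightened to {(0, 0), (1, 1)}: x0 must equal x2

The paper's parallel algorithm performs such compositions for many triples (i, k, j) simultaneously; the serial version above is only meant to show the reduction step that all the cited algorithms share.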

In the next section, we introduce the relation-algebraic formulation of CSPs. We formulate some algorithms in the following section, ending with the O(n^2 log n) parallel path-consistency algorithm. In the final section, we describe the class of problems on which the reduction-type algorithms take Omega(n^2) time. ----- File: 1989/tr-89-046 On Zero-Testing and Interpolation of k-Sparse Multivariate Polynomials over Finite Fields Michael Clausen, Andreas Dress, Johannes Grabmeier, and Marek Karpinski tr-89-046 July 1989 Given a black box which will produce the value of a k-sparse multivariate polynomial for any given specific argument, one may ask for optimal strategies (1) to distinguish such a polynomial from the zero-polynomial, (2) to distinguish any two such polynomials from one another and (3) to (uniformly) reconstruct the polynomial from such an information source. While such strategies are known already for polynomials over fields of characteristic zero, the equally important, but considerably more complicated case of a finite field K of small characteristic is studied in the present paper. The result is that the time complexity of such strategies depends critically on the degree m of the extension field of K from which the arguments are to be chosen; e.g., if m equals the number n of variables, then (1) can be solved by k+1 and (2) as well as (3) by 2k+1 queries, while in the case m=1 essentially 2^(log n log k) queries are needed. ----- File: 1989/tr-89-047 The Transitive Closure of a Random Digraph; Richard M. Karp tr-89-047 August 1989 In a random $n$-vertex digraph, each arc is present with probability $p$, independently of the presence or absence of other arcs. We investigate the structure of the strong components of a random digraph and present an algorithm for the construction of the transitive closure of a random digraph. We show that, when $n$ is large and $np$ is equal to a constant $c$ greater than 1, it is very likely that all but one of the strong components are very small, and that the unique large strong component contains about $\Theta^2 n$ vertices, where $\Theta$ is the unique root in $[0,1]$ of the equation $1 - x - e^{-cx} = 0$. Nearly all the vertices outside the large strong component lie in strong components of size 1. Provided that the expected degree of a vertex is bounded away from 1, our transitive closure algorithm runs in expected time $O(n)$. For all choices of $n$ and $p$, the expected execution time of the algorithm is $O(w(n)(n \log n)^{4/3})$, where $w(n)$ is an arbitrary nondecreasing unbounded function. To circumvent the fact that the size of the transitive closure may be $\Omega(n^2)$, the algorithm presents the transitive closure in the compact form $(A \times B) \cup C$, where $A$ and $B$ are sets of vertices, and $C$ is a set of arcs. ----- File: 1989/tr-89-048 Parallel Heuristics for the Steiner Tree Problem in Images without Sorting or Routing Susanne Hambrusch and Lynn TeWinkel tr-89-048 August 1989 In this paper we consider the problem of determining a minimum-cost rectilinear Steiner tree when the input is an n x n binary array I which is stored in an n x n mesh of processors. We present several heuristic mesh algorithms for this NP-hard problem. A major design criterion of our algorithms is to avoid sorting and routing, which are expensive operations in practice.
All of our algorithms have an O(n log k) running time, where k is the number of connected components formed by the entries of value '1'. The main contributions of the paper are two conceptually different methods for connecting components in an image. ----- File: 1989/tr-89-049 Spatial Reasoning Based on Allen's Temporal Logic Hans Werner Guesgen tr-89-049 July 1989

"If one were to categorize the behavior of the intelligent machine of the future, one might do so on the basis of the machine's capabilities to carry out temporal reasoning over interrelated entities that change with time; to carry out spatial reasoning for solving problems dealing with entities occupying space; and, on a more complex level, to reason over interrelated entities occupying space and changing in time with respect to their attributes and spatial interrelationships." --Avi Kak [12]

There are many approaches to spatial reasoning, of varying efficiency. Nevertheless, they are not always adequate from a cognitive point of view. What we suggest in this paper is reasoning based on qualitative descriptions of spatial relationships. We introduce a set of basic relations similar to the one Allen suggested for temporal reasoning, and we show how inferences can be performed on this set.
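As a toy example of this style of inference (a deliberately simplified stand-in of ours, using point relations on a line rather than the paper's relation set), composition tables of the following kind drive the inference step:

    # Qualitative inference over point relations {<, =, >} on a line.
    # Composition: what can hold between A and C given A r1 B and B r2 C?
    COMP = {
        ('<', '<'): {'<'}, ('<', '='): {'<'}, ('<', '>'): {'<', '=', '>'},
        ('=', '<'): {'<'}, ('=', '='): {'='}, ('=', '>'): {'>'},
        ('>', '<'): {'<', '=', '>'}, ('>', '='): {'>'}, ('>', '>'): {'>'},
    }

    def infer(r1, r2):
        return COMP[(r1, r2)]

    print(infer('<', '<'))   # A left of B, B left of C  =>  {'<'}
    print(infer('<', '>'))   # no information: any relation is possible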

We start with one-dimensional descriptions, which we extend to higher-dimensional ones in various ways. A theoretical base is provided and the soundness of our approach is proven. Although we do not claim our approach to be suitable in general, in many situations it is an efficient and straightforward way to handle spatial knowledge. ----- File: 1989/tr-89-050 Learning Read-Once Formulas with Queries Dana Angluin, Lisa Hellerstein and Marek Karpinski tr-89-050 July 1989 A read-once formula is a boolean formula in which each variable occurs at most once. Such formulas are also called mu-formulas or boolean trees. This paper treats the problem of exactly identifying an unknown read-once formula using specific kinds of queries. The main results are a polynomial time algorithm for exact identification of monotone read-once formulas using only membership queries, and a polynomial time algorithm for exact identification of general read-once formulas using equivalence and membership queries (a protocol based on the notion of a minimally adequate teacher [1]). Our results improve on Valiant's previous results for read-once formulas [18]. We also show that no polynomial time algorithm using only membership queries or only equivalence queries can exactly identify all read-once formulas.

1) The existence of pseudorandom generators.

2) The existence of a pair of efficiently constructible distributions which are computationally indistinguishable but statistically very different.
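One direction of this equivalence can be made concrete with a standard support-counting calculation, sketched here for a hypothetical length-doubling generator G from n bits to 2n bits (our illustration, not the paper's proof):

    Let T = supp(G(U_n)) be the support of the generator's output, so |T| <= 2^n.
    Taking T itself as the statistical test,

        SD(G(U_n), U_{2n}) >= Pr[G(U_n) \in T] - Pr[U_{2n} \in T]
                           >= 1 - 2^n / 2^{2n}  =  1 - 2^{-n},

    so the two distributions are statistically very different, while by the
    definition of a pseudorandom generator no probabilistic polynomial-time
    algorithm distinguishes them with non-negligible advantage.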

----- File: 1989/tr-89-052 An Efficient Parallel Algorithm for the 3MIS Problem Elias Dahlhaus and Marek Karpinski tr-89-052 September 1989 The paper considers the problem of computing a maximal independent set in hypergraphs (see [Karp, Ramachandran 88] and [Beame, Luby 89]). We present an efficient deterministic parallel algorithm for the case when the maximal cardinality of any hyperedge is 3. The algorithm works in O(log^4 n) parallel time with O(n + m) processors on a CREW PRAM and is optimal up to a polylogarithmic factor. ----- File: 1989/tr-89-053 Supporting Formal Program Developments: the DEVA Environment Stefan Jahnichen, Robert Gabriel, Matthias Weber and Matthias Anlauff tr-89-053 September 1989 The project ToolUse aims at providing means for active assistance in the design, implementation and evolution of software. This is achieved and supported by a formal development language called Deva. As Deva uses two-dimensional notations to get better structured and surveyable representations of developments, and as different Deva implementations have been used within the project, both internal and external integration play crucial roles in the project ToolUse. The paper briefly introduces the language DEVA, sketches one of its implementations, and discusses both kinds of integration. ----- File: 1989/tr-89-054 Fast Evaluation of Boolean Formulas by CREW-PRAMs Rudiger Reischuk tr-89-054 September 1989 We extend the result of Cook, Dwork and Reischuk [CDR86] that a CREW-PRAM with a linear number of processors can compute the OR of n bits in less than log_2 n time to arbitrary Boolean formulas of logarithmic depth. Furthermore, a matching lower bound for the OR shown by Kutylowski [K89] is generalized to probabilistic and nondeterministic computations. ----- File: 1989/tr-89-055 On the Theory of Average Case Complexity (Revised Edition) Shai Ben-David, Benny Chor, Oded Goldreich, and Michael Luby tr-89-055 September 1989 This paper takes the next step in developing the theory of average case complexity initiated by Leonid A. Levin. Previous works [Levin 84, Gurevich 87, Venkatesan and Levin 88] have focused on the existence of complete problems. We widen the scope to other basic questions in computational complexity. Our results include: ----- File: 1989/tr-89-056 Fast Establishment of Real-Time Channels Spiridon Damaskos and Dinesh C. Verma tr-89-056 October 1989 A real-time channel [Fer89a] is a simplex connection between two nodes characterized by parameters representing the performance requirements of the client. In this paper, we consider fast establishment of real-time channels, i.e., data can be sent on a real-time channel without waiting for a connection establishment to be confirmed by the destination. ----- File: 1989/tr-89-057 Multiplexing Real-Time Channels Spiridon Damaskos and Dinesh C. Verma tr-89-057 October 1989 A real-time channel is a simplex connection between two nodes characterized by parameters representing the performance requirements of the client. Such a connection may be established through the scheme described in [Fer89a]. In this paper, we study the feasibility of multiplexing real-time channels on a lower-layer real-time channel. Sufficient conditions for multiplexing channels are obtained as an extension of the establishment algorithm.

The extension is based on two observations: (1) a real-time channel can be looked upon as a network with bounded delays connecting the multiplexing point (a virtual source) to the demultiplexing point (a virtual destination); and (2) the parameters of the physical channel can be used to define the service times at the virtual source and sink. Multiplexing is nothing but channel establishment over this network. By a judicious definition of the parameters specifying service times, it is possible to make multiplexing decisions at the multiplexing point (source) without consulting the destination, which is merely informed about the new multiplexed channel. ----- File: 1989/tr-89-058 Controlled Gradual Disclosure Schemes for Random Bits and Their Applications Richard Cleve tr-89-058 October 1989 We construct a protocol that enables a secret bit to be revealed gradually in a very controlled manner. In particular, if Alice possesses a bit S that was generated randomly according to the uniform distribution and 1/2 < p_1 < ... < p_m = 1 then, using our protocol with Bob, Alice can achieve the following. The protocol consists of m stages and after the i-th stage, Bob's best prediction of S, based on all his interactions with Alice, is correct with probability exactly p_i (and a reasonable condition is satisfied in the case where S is not initially uniform). Furthermore, under an intractability assumption, our protocol can be made "oblivious" to Alice and "secure" against an Alice or Bob that might try to cheat in various ways. Previously proposed gradual disclosure schemes for single bits release information in a less controlled manner: the probabilities that represent Bob's confidence of his knowledge of S follow a random walk that eventually drifts towards 1, rather than a predetermined sequence of values.

Using controlled gradual disclosure schemes, we show how to construct an improved version of the protocol proposed by Luby, Micali and Rackoff for two-party secret bit exchanging ("How to Simultaneously Exchange a Secret Bit by Flipping a Symmetrically-Biased Coin," Proc. 24th Ann. IEEE Symp. on Foundations of Computer Science, 1983, pp. 11-21) that is secure against additional kinds of attacks that the previous protocol is not secure against. Also, our protocol is more efficient in the number of rounds that it requires to attain a given level of security, and is proven to be asymptotically optimal in this respect.

We also show how to use controlled gradual disclosure schemes to improve existing protocols for other cryptographic problems, such as multi-party function evaluation. ----- File: 1989/tr-89-059 Accessing and Customizing Services in Distributed Systems Ralf Guido Herrtwich and Uwe Wolfgang Brandenburg tr-89-059 October 1989 In a distributed system, entities access services provided to them by other entities at remote sites. While it may be unimportant to the service users which entities act as service providers, they often have other requirements on the services they use. On the other hand, service providers only have certain possibilities. Both the requirements and possibilities can be described by means of quality-of-service parameters (QOSPs), which have to be determined for each service session. In this paper we design a session establishment service (SES) which takes QOSP values into account. The SES can be used for any kind of QOSPs since it uses badness specifications as a uniform means to identify the usefulness of a certain QOSP value to a service user, to determine the relative importance of single QOSPs, and to calculate the overall quality of a service. Three kinds of QOSPs are distinguished: Static parameters do not change as long as the service is available, dynamic parameters depend on the current state of a service provider, and retrospective parameters result from evaluations of the service which are obtained from previous service users. While some QOSP values are readily available, others can only be achieved if the service provider schedules its resources appropriately. The reservation of resources can be integrated within the SES. This is especially important for real-time services. ----- File: 1989/tr-89-060 VC Dimension and Learnability of Sparse Polynomials and Rational Functions Marek Karpinski and Thorsten Werther tr-89-060 November 1989 We prove upper and lower bounds on the VC dimension of sparse univariate polynomials over the reals, and apply these results to prove uniform learnability of sparse polynomials and rational functions. As another application we solve an open problem of Vapnik [Vapnik 82] on uniform approximation of the general regression functions, a central problem of computational statistics (cf. [Vapnik 82], p. 256). ----- File: 1989/tr-89-061 On Space-bounded Learning and the Vapnik-Chervonenkis Dimension (Thesis) Sally Floyd tr-89-061 December 1989 This thesis explores algorithms that learn a concept from a concept class of Vapnik-Chervonenkis (VC) dimension d by saving at most d examples at a time. The framework is the model of probably approximately correct (pac) learning introduced by Valiant [V84]. A maximum concept class of VC dimension d is defined. For a maximum class C of VC dimension d, we give an algorithm for representing a finite set of positive and negative examples of a concept by a subset of d labeled examples of that set. This data compression scheme of size d is used to construct a space-bounded algorithm called the iterative compression algorithm that learns a concept from the class C by saving at most d examples at a time. These d examples represent the current hypothesis of the learning algorithm. A space-bounded algorithm is called acyclic if a hypothesis that has been rejected as incorrect is never reinstated. We give a sufficient condition for the iterative compression algorithm to be acyclic on a maximum class C. Classes for which the iterative compression algorithm is acyclic include positive half-spaces in Euclidean space E^n, balls in E^n, and arbitrary rectangles and triangles in the plane. The iterative compression algorithm can be thought of as learning a boundary between the positive and the negative examples.
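As a concrete instance of such a compression scheme (our toy example; the thesis treats arbitrary maximum classes), consider axis-parallel rectangles in the plane, where d = 4 and the stored examples are the positives attaining the extreme coordinates:

    def compress(stored, new_point):
        """Keep only the extreme positive examples (at most 4 of them)."""
        pts = stored + [new_point]
        keep = {min(pts, key=lambda p: p[0]), max(pts, key=lambda p: p[0]),
                min(pts, key=lambda p: p[1]), max(pts, key=lambda p: p[1])}
        return list(keep)

    def predict(stored, q):
        """Current hypothesis: the bounding box of the stored positives."""
        if not stored:
            return False
        xs = [p[0] for p in stored]
        ys = [p[1] for p in stored]
        return min(xs) <= q[0] <= max(xs) and min(ys) <= q[1] <= max(ys)

    stored = []
    for point, label in [((1, 1), True), ((4, 2), True),
                         ((2, 5), True), ((9, 9), False)]:
        if label and not predict(stored, point):
            stored = compress(stored, point)  # mistake on a positive: recompress
    print(predict(stored, (3, 3)), len(stored))  # True 3  (never more than 4)

The stored examples alone determine the hypothesis, matching the idea that the d saved examples represent the learner's current state.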
----- File: 1989/tr-89-062 The Asynchronous PRAM: A Semi-Synchronous Model for Shared Memory MIMD Machines (Thesis) Phillip Baldwin Gibbons tr-89-062 December 1989 This thesis introduces the Asynchronous PRAM model of computation, for the design and analysis of algorithms that are suitable for large parallel machines in which processors communicate via a distributed, shared memory. The Asynchronous PRAM is a variant of the well-studied PRAM model which differs from the PRAM in two important respects: (i) the processors run asynchronously and there is an explicit charge for synchronization, and (ii) there is a non-unit time cost to access the shared memory.
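The two charges can be illustrated with a toy cost account (all parameter values below are hypothetical, chosen by us for illustration) of a pairwise tree summation that pays for shared-memory accesses and for one synchronization per phase:

    from math import ceil

    def async_pram_time(n, p, mem_cost=3, sync_cost=20):
        """Toy account of a pairwise tree sum of n items on p processors.
        Each addition reads two shared cells and writes one (3 accesses),
        and every halving phase ends with one global synchronization."""
        total, items = 0, n
        while items > 1:
            adds_per_proc = ceil((items // 2) / p)
            total += adds_per_proc * (1 + 3 * mem_cost)  # add + shared accesses
            total += sync_cost                           # end-of-phase barrier
            items = ceil(items / 2)
        return total

    print(async_pram_time(1024, 32))    # 560: memory charges dominate
    print(async_pram_time(1024, 1024))  # 300: synchronization dominates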

Many new algorithms are presented for the Asynchronous PRAM model. We modify a number of PRAM algorithms for improved asymptotic time and processor complexity in the Asynchronous PRAM. We show general classes of problems for which the time complexity can be improved by restructuring the computation. We prove lower bounds that reflect limitations on information flow and load balancing in this model. Simulation results between the Asynchronous PRAM and various known synchronous models are presented as well.

We introduce a post office gossip game for studying the inherent synchronization complexity of coordinating processors using pairwise synchronization primitives. Results are presented that compare the relative power of various such primitives. These results and techniques are used to reduce the amount of synchronization in Asynchronous PRAM algorithms.

Furthermore, we discuss a programming model based on the Asynchronous PRAM. We introduce the notion of a semi-synchronous programming model, a model for repeatable asynchronous programs. Repeatable programs, in which the output and all intermediate results are the same every time the program is run on a particular input, greatly simplify the tasks of writing, debugging, analyzing, and testing programs.

Finally, we discuss hardware support for the Asynchronous PRAM model. In particular, we present a cache protocol suitable for the Asynchronous PRAM and a new technique for barrier synchronization. ----- File: 1989/tr-89-063 Five Balltree Construction Algorithms; Stephen M. Omohundro tr-89-063 December 1989 Balltrees are simple geometric data structures with a wide range of practical applications to geometric learning tasks. In this report we compare 5 different algorithms for constructing balltrees from data. We study the tradeoff between construction time and the quality of the constructed tree. Two of the algorithms are on-line, two construct the structures from the data set in a top down fashion, and one uses a bottom up approach. ----- File: 1989/tr-89-064 Program Checkers for Algebraic Problems (Thesis) Sampath Kannan tr-89-064 February 1989 In this thesis we explore a model of ensuring the correctness of results produced by programs. This model, called program checking, is distinct from the two methods in the literature -- testing and verification. Testing does not provide mathematical guarantees on the correctness of computation. Verification requires going into the inner workings of a program to determine its correctness, and is infeasible to implement for all but very simple programs.

Program checking treats the program as a black box. In the checking scenario the program is run on the desired input and the output is checked by a program checker. The checker is allowed to make other calls to the program to ensure the correctness of the original computation with very high probability. The theory of program checking draws heavily from the theory of interactive proof systems and probabilistic algorithms, but the model is intended to be very practical as well.
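A classic example in this black-box spirit, though not one of the thesis's own checkers, is Freivalds' randomized test for matrix multiplication, which verifies a claimed product much faster than recomputing it:

    import random

    def check_matrix_product(A, B, C, trials=20):
        """Checks A*B == C in O(n^2) time per trial: for a random 0/1
        vector r, A(Br) must equal Cr; a wrong C is caught with
        probability >= 1/2 on each trial."""
        n = len(C)
        for _ in range(trials):
            r = [random.randint(0, 1) for _ in range(n)]
            br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
            abr = [sum(A[i][j] * br[j] for j in range(n)) for i in range(n)]
            cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
            if abr != cr:
                return False   # certainly wrong
        return True            # wrong with probability <= 2**(-trials)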

Our focus in this thesis is on program checkers for algebraic problems. The unifying theme amongst such problems is the concept of random self-reducibility. A function f is randomly self-reducible if the computation of f(x) for any x can be reduced to the computation of f on several "randomly chosen" inputs. For most of the algebraic problems considered in this thesis the checkers use the fact that the problem is at least partially self-reducible. This allows us to construct sets of instances whose answers are related. Verifying the consistency of the program's answers on these instances allows us to design checkers for problems in linear algebra such as rank and determinant and for problems such as graph isomorphism and group intersection.
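A small worked example of random self-reducibility (our choice of function, not one of the thesis's problems): for f(x) = g^x mod p we have f(x) = f(x+r) * f(r)^(-1), so a possibly faulty program P need only be queried on uniformly random instances:

    import random
    from collections import Counter

    def self_correct_exp(P, x, p, trials=7):
        """Compute g^x mod p from a program P that is correct on most
        inputs, via f(x) = f(x + r) * f(r)^(-1) mod p with random r.
        Both queries to P are uniformly distributed instances.
        (pow(y, -1, p) needs Python 3.8+ and gcd(y, p) = 1.)"""
        votes = Counter()
        for _ in range(trials):
            r = random.randrange(p - 1)              # exponents live mod p-1
            y = P((x + r) % (p - 1)) * pow(P(r), -1, p) % p
            votes[y] += 1
        return votes.most_common(1)[0][0]            # majority vote

    # Example with p = 101, g = 2, and a program faulty on one input:
    P = lambda e: pow(2, e, 101) if e != 5 else 99
    print(self_correct_exp(P, 5, 101))               # 2^5 mod 101 = 32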

We also study the connection between interactive proofs and program checking. Using the two step approach of designing an interactive proof and converting it into a checker, we design a checker for group intersection. We construct bounded round interactive proofs for a few other problems including the problem of permutation group non-isomorphism. This interactive proof uses interesting consequences of the classification of finite simple groups.

Finally we consider the notion of random self-reducibility in its own right and obtain negative results about the random self-reducibility of certain functions. ----- File: 1989/tr-89-065 Lectures on a Theory of Computation and Complexity over the Reals (or an Arbitrary Ring) Lenore Blum tr-89-065 December 1989 These lectures discuss a new theory of computation and complexity which attempts to integrate key ideas from the classical theory in a setting more amenable to problems defined over continuous domains. The goal is to develop theoretical foundations for a theory of computational complexity for numerical analysis and scientific computation that might embody some of the naturalness and strengths of the classical theory.

We highlight key aspects of the new theory, as well as giving an exposition, in this setting, of classical ideas and results. Indeed, one of our themes will be the comparison of results over the integers with results over the reals and complex numbers. Contrasting one theory with the other will help illuminate each, and give deeper understanding to such basic concepts as decidability, definability, computability and complexity. ----- File: 1990/tr-90-001 The Delaunay Triangulation and Function Learning; Stephen M. Omohundro tr-90-001 January 1990 In this report we consider the use of the Delaunay triangulation for learning smooth nonlinear functions with bounded second derivatives from sets of random input/output pairs. We show that if interpolation is implemented by piecewise-linear approximation over a triangulation of the input samples, then the Delaunay triangulation has a smaller worst case error at each point than any other triangulation. The argument is based on a nice connection between the Delaunay criterion and quadratic error functions. The argument also allows us to give bounds on the average number of samples needed for a given level of approximation. ----- File: 1990/tr-90-002 Speech Segmentation and Labeling on the NeXT Machine Chuck Wooters and Nelson Morgan tr-90-002 January 1990 We are attempting to incorporate connectionist models into speech recognition algorithms. Since these models require a large amount of training data, it was necessary to build an automated speech labeling/segmentation application. There were two significant system requirements for this program: built-in audio input/output capabilities, and an environment supporting rapid application development.

The NeXT machine fulfills both of these requirements. It has built-in AD/DA capabilities. Its object-oriented programming environment and application-building modules permit quick program development.

We report here on a program we have developed to integrate automatic labeling and segmentation of continuous speech with a manual system for observing and correcting these signal annotations. The overall system has functioned well enough to permit easy user marking of 600 sentences in a reasonable amount of time. ----- File: 1990/tr-90-003 Considerations for the Electronic Implementation of Artificial Neural Networks Nelson Morgan tr-90-003 January 1990 Computer scientists and designers have long been interested in comparisons between artificial automata and the human brain [Von Neumann, 1957]. Mental activity is often characterized as the result of the parallel operation of large numbers of neurons (~10^11 for the human brain). Neurons interact electrochemically on a time scale of milliseconds, and are jointly capable of significant feats of pattern recognition (such as recognizing a friend wearing an unusual costume). These commonplace human achievements are currently unattainable by large electronic computers built from components with characteristic delays in the nanosecond range. Artificial Neural Network (ANN) researchers hope that simplified functional models of nervous tissue can help us to design algorithms and machines that are better than conventional computers for difficult problems in machine perception and intelligence.

However, engineering constraints for silicon implementations of these systems may suggest design choices which differ from mimicry of biology in significant ways. In particular, large silicon ANN systems may require multiplexing of communication and computation as a consequence of limited connectivity. This report discusses considerations such as these, and concludes with a short description of an ongoing effort to design silicon ANN building blocks using powerful CAD tools. ----- File: 1990/tr-90-004 On the Complexity of Genuinely Polynomial Computation Marek Karpinski and Friedhelm Meyer auf der Heide tr-90-004 January 1990 We present separation results on genuinely (also called strong) sequential, parallel, and non-deterministic complexity classes for the set of arithmetic RAM operations {+, -, *} and {+, -, DIV_c}. In particular, we separate non-uniform polynomial time from non-uniform parallel polynomial time for the set of operations {+, -, *}, answering a question posed in [Meyer auf der Heide 88]. ----- File: 1990/tr-90-005 Interpolation of Sparse Rational Functions Without Knowing Bounds on Exponents Dima Y. Grigoriev, Marek Karpinski, and Michael F. Singer tr-90-005 January 1990 We present the first algorithm for the (black box) interpolation of t-sparse rational functions without knowing bounds on exponents of their sparse representations. ----- File: 1990/tr-90-006 A Resource Reservation Protocol for Guaranteed-Performance Communication in the Internet David P. Anderson, Ralf Guido Herrtwich, and Carl Schaefer tr-90-006 February 1990 This report describes the Session Reservation protocol (SRP). SRP is defined in the DARPA Internet family of protocols. It allows communicating peer entities to reserve the resources (CPU and network bandwidth) necessary to achieve given performance objectives (delay and throughput). The immediate goal of SRP is to support continuous media (digital audio and video) in IP-based distributed systems. However, it is applicable to any application that requires guaranteed-performance network communication.

The design goals of SRP include: independence from transport protocols (SRP can be used with standard protocols such as TCP or with new real-time protocols); compatibility with IP (packets are not modified); and that a host implementing SRP can benefit from its use even when communicating with hosts not supporting SRP.
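To make the reservation idea concrete, here is a hedged toy sketch of per-link admission control; the class name, parameters, and admission rule are our own invention for illustration and do not describe SRP's actual algorithm:

    # Toy admission control for performance-guaranteed sessions.
    # All names and the admission rule are illustrative, not SRP's.

    class Link:
        def __init__(self, capacity_mbps, max_delay_ms):
            self.capacity = capacity_mbps
            self.max_delay = max_delay_ms
            self.sessions = []

        def admit(self, throughput_mbps, delay_bound_ms):
            used = sum(t for t, _ in self.sessions)
            # Refuse if either the bandwidth or the client's delay
            # bound would be violated; queueing delay is modeled as
            # growing with utilization.
            if used + throughput_mbps >= self.capacity:
                return False
            load = (used + throughput_mbps) / self.capacity
            worst_delay = self.max_delay / (1 - load)
            if worst_delay > delay_bound_ms:
                return False
            self.sessions.append((throughput_mbps, delay_bound_ms))
            return True

    link = Link(capacity_mbps=100, max_delay_ms=2)
    print(link.admit(40, 10))   # True: fits comfortably
    print(link.admit(55, 5))    # False: would drive delay past the bound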

SRP is based on a workload and scheduling model called the DASH resource model. This model defines a parameterization of client workload, an abstract interface for hardware resources, and an end-to-end algorithm for negotiated resource reservation based on cost minimization. SRP implements this end-to-end algorithm, handling those resources related to network communication. ----- File: 1990/tr-90-007 Client Requirements for Real-Time Communication Services Domenico Ferrari tr-90-007 March 1990 A real-time communication service provides its clients with the ability to specify their performance requirements and to obtain guarantees about the satisfaction of those requirements. In this paper, we propose a set of performance specifications that seem appropriate for such services; they include various types of delay bounds, throughput bounds, and reliability bounds. We also describe other requirements and desirable properties from a client's viewpoint, and the ways in which each requirement is to be translated to make it suitable for lower levels in the protocol hierarchy. Finally, we present examples of requirements specification, and discuss some of the possible objections to our approach. ----- File: 1990/tr-90-008 An Algebraic Approach to General Boolean Constraint Problems Hans W. Guesgen and Peter B. Ladkin tr-90-008 March 1990 We consider an algebraic approach to the statement and solution of general Boolean constraint satisfaction problems (CSPs). Our approach is to consider partial valuations of a constraint network (including the relational constraints themselves) as sets of partial functions, with the operators of join and projection. We formulate all the usual concepts of CSPs in this framework, including k-consistency, derived constraints, and backtrack-freeness, and formulate an algorithm scheme for k-consistency which has the path-consistency scheme in [LadMad88.2] as a special case. This algebra may be embedded in the cylindric algebra of Tarski [HeMoTa71, 85], via the embedding of [ImiLip84], and a connection with relational database operations. CSPs are shown to correspond to conjunctive queries in relational database theory, and we formulate a notion of equivalence of CSPs with hidden variables, following [ChaMer76, Ull80], and show that testing equivalence is NP-hard. ----- File: 1990/tr-90-009 Miniature Language Acquisition: A touchstone for cognitive science; Jerome A. Feldman, George Lakoff, Andreas Stolcke, and Susan Hollbach Weber tr-90-009 March 1990 (revised April 1990) Cognitive Science, whose genesis was interdisciplinary, shows signs of reverting to a disjoint collection of fields. This paper presents a compact, theory-free task that inherently requires an integrated solution. The basic problem is learning a subset of an arbitrary natural language from picture-sentence pairs. We describe a very specific instance of this task and show how it presents fundamental (but not impossible) challenges to several areas of cognitive science including vision, language, inference and learning. ----- File: 1990/tr-90-010 L0: A Testbed for Miniature Language Acquisition; Susan Hollbach Weber and Andreas Stolcke tr-90-010 May 1990 L0 constitutes a recent effort in Cognitive Science to build a natural language acquisition system for a limited visual domain. 
As a preparatory step towards addressing the issue of learning in this domain, we have built a set of tools for rapid prototyping and experimentation in the areas of language processing, image processing, and knowledge representation. The special focus of our work was the integration of these different components into a flexible system which would allow us to better understand the domain given by L0 and experiment with alternative approaches to the problems it poses. ----- File: 1990/tr-90-011 A Network for Extracting the Locations of Point Clusters Using Selective Attention; Subutai Ahmad and Stephen Omohundro tr-90-011 May 1990 This report explores the problem of dynamically computing visual relations in connectionist systems. It concentrates on the task of learning whether three clumps of points in a 256x256 image form an equilateral triangle. We argue that feed-forward networks for solving this task would not scale well to images of this size. One reason for this is that local information does not contribute to the solution: it is necessary to compute relational information such as the distances between points. Our solution implements a mechanism for dynamically extracting the locations of the point clusters. It consists of an efficient focus of attention mechanism and a cluster detection scheme. The focus of attention mechanism allows the system to select any circular portion of the image in constant time. The cluster detector directs the focus of attention to clusters in the image. These two mechanisms are used to sequentially extract the relevant coordinates. With this new representation (locations of the points) very few training examples are required to learn the correct function. The resulting network is also very compact: the number of required weights is proportional to the number of input pixels. ----- File: 1990/tr-90-012 A Connectionist Unification Algorithm Steffen Hoelldobler tr-90-012 March 1990 Unification plays an important role in many areas of computer science, mathematical logic, and artificial intelligence. It is also at the heart of connectionist models concerned with knowledge representation and inference. However, most of these models are severely restricted by their propositional fixation as they are defined over a finite set of constants and predicates. This restriction is caused by the inability to unify terms built from function symbols, constants and variables. In this paper a connectionist unification algorithm is presented. It utilizes the fact that the most general unifier of two terms corresponds to a finest valid equivalence relation defined on an occurrence-label representation of the unification problem. The algorithm exploits the maximal parallelism inherent in the computation of such a finest valid equivalence relation while using only computational features of connectionism. It can easily be restricted to solve special forms of the unification problem such as the word problem, the matching problem, or the unification problem over infinite trees. ----- File: 1990/tr-90-013 Towards Optimal Simulations of Formulas by Bounded-Width Programs Richard Cleve tr-90-013 March 1990 We show that, over an arbitrary ring, for any fixed epsilon > 0, all balanced algebraic formulas of size s are computed by algebraic straight-line programs that employ a constant number of registers and have length O(s^(1+epsilon)).
In particular, in the special case where the ring is GF(2), we obtain a technique for simulating balanced Boolean formulas of size s by bounded-width branching programs of length O(s^(1+epsilon)), for any fixed epsilon > 0. This is an asymptotic improvement in efficiency over previous simulations in both the Boolean and algebraic setting. ----- File: 1990/tr-90-014 Dynamic Constraints Hans Werner Guesgen and Joachim Hertzberg tr-90-014 April 1990 Usually, a constraint describes a relation on variables, and networks of constraints are obtained by sharing variables among constraints. Manipulating a constraint or a constraint network means manipulating the variables until a consistent assignment is found. There are, however, deviations from this classical view, e.g., manipulating the constraints themselves to make the computation of consistent assignments more efficient, or relaxing constraints to make an overspecified constraint problem solvable.
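A minimal sketch of the relaxation idea just mentioned (ours, much cruder than the formalism the paper develops): attach priorities to constraints and drop the least important ones until the problem becomes satisfiable:

    from itertools import product

    def solve(domains, constraints):
        """Brute-force search; constraints are (priority, predicate) pairs."""
        for values in product(*domains.values()):
            assignment = dict(zip(domains, values))
            if all(pred(assignment) for _, pred in constraints):
                return assignment
        return None

    def relax(domains, constraints):
        """Drop lowest-priority constraints until a solution exists."""
        constraints = sorted(constraints, reverse=True)  # most important first
        while constraints and solve(domains, constraints) is None:
            constraints.pop()                            # relax the weakest
        return solve(domains, constraints), constraints

    domains = {'x': [1, 2, 3], 'y': [1, 2, 3]}
    constraints = [(2, lambda a: a['x'] < a['y']),   # important
                   (1, lambda a: a['x'] == a['y'])]  # expendable, contradicts
    solution, kept = relax(domains, constraints)
    print(solution, len(kept))   # {'x': 1, 'y': 2} with 1 constraint kept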

In this paper, we present a formalism that subsumes classical constraint satisfaction, constraint manipulation, and constraint relaxation. The idea is that the constraints in a network are not static but that their relations can and must be manipulated and that manipulating relations subsumes manipulating variable values. We clarify the relation between classical constraint networks and the newly developed dynamic ones; we prove termination properties of dynamic constraint networks in the special case of filtering; and we show by examples how to express constraint manipulation and constraint relaxation in the new formalism. ----- File: 1990/tr-90-015 Learning Feature-based Semantics with Simple Recurrent Networks; Andreas Stolcke tr-90-015 April 1990 The paper investigates the possibilities for using simple recurrent networks as transducers which map sequential natural language input into non-sequential feature-based semantics. The networks perform well on sentences containing a single main predicate (encoded by transitive verbs or prepositions) applied to multiple-feature objects (encoded as noun-phrases with adjectival modifiers), and show robustness against ungrammatical inputs. A second set of experiments deals with sentences containing embedded structures. Here the network is able to process multiple levels of sentence-final embeddings but only one level of center-embedding. This turns out to be a consequence of the network's inability to retain information that is not reflected in the outputs over intermediate phases of processing. Two extensions to Elman's [Elman 88] original recurrent network architecture are introduced. ----- File: 1990/tr-90-016 Temporal Reasoning Based on Semi-Intervals (Revised Version) Christian Freksa tr-90-016 April 1990 A generalization of Allen's interval-based approach to temporal reasoning is presented. The scope of reasoning capabilities can be considerably extended by using relations between semi-intervals rather than intervals as the basic units of knowledge. Semi-intervals correspond to beginnings or endings of temporal events. We develop a representational framework in which relations between semi-intervals appear as coarse knowledge in comparison with relations between intervals. We demonstrate the advantages of reasoning on the basis of semi-intervals: 1) coarse knowledge can be processed directly; computational effort is saved; 2) incomplete knowledge about temporal intervals can be fully exploited; 3) incomplete inferences made on the basis of complete knowledge can be used directly for further inference steps; 4) there is no trade-off in computational strength for the added flexibility and efficiency; 5) semi-intervals correspond to natural entities both from a cognitive and from a computational point of view. The presented scheme supports reasoning on the basis of fine-grained or complete knowledge, on the basis of coarse or incomplete knowledge, and on combinations of both kinds of knowledge. The notion of 'conceptual neighborhood' is central to the presented approach. Besides enhancing the reasoning capabilities in several directions, this notion allows for a drastic compaction of the knowledge base underlying Allen's inference scheme. A connection to fuzzy reasoning on the basis of 'conceptual neighborhood' is drawn. It is suggested that reasoning based on the simplified knowledge base may be particularly suited for the implementation of parallel inference engines.
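A drastically simplified reading of the semi-interval idea (our sketch, not Freksa's framework): state knowledge about beginnings and endings separately, and draw conclusions by transitivity even when the other endpoints are unknown:

    # Coarse temporal knowledge as an order on interval endpoints.
    # ('A', 'end') before ('B', 'beg'), etc.; inference is transitivity.

    def transitive_closure(facts):
        facts = set(facts)
        while True:
            new = {(a, c) for (a, b1) in facts for (b2, c) in facts
                   if b1 == b2} - facts
            if not new:
                return facts
            facts |= new

    facts = {(('A', 'end'), ('B', 'beg')),   # A ends before B begins
             (('B', 'beg'), ('C', 'beg'))}   # B begins before C begins
    closure = transitive_closure(facts)
    # Coarse conclusion: A ends before C begins, hence A is before C,
    # without knowing anything about A's beginning or C's end.
    print((('A', 'end'), ('C', 'beg')) in closure)   # True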

[Revised version was published as:
Freksa C, Temporal reasoning based on semi-intervals, Artificial Intelligence 54 (1992) 199-227.] ----- File: 1990/tr-90-017 Time Dated Streams in Continuous-Media Systems Ralf Guido Herrtwich tr-90-017 May 1990 Data in continuous-media systems, such as digital audio and video, has time parameters associated with it that determine its processing and display. We present the "time capsule" abstraction to describe how timed data shall be stored, exchanged, and accessed in a real-time system. When data is written into a time capsule, a time stamp and a duration are associated with the data item. When it is read, a time stamp is used to select the data item. The time capsule abstraction includes the notion of "clocks" that ensure periodic data access that is typical for continuous-media applications. By modifying the parameters of a clock, effects such as time lapses or slow motion can be achieved. ----- File: 1990/tr-90-018 A Connectionist Approach to Symbolic Constraint Satisfaction Hans Werner Guesgen tr-90-018 April 1990 Algorithms for solving constraint satisfaction problems, i.e., for finding one, several, or all solutions for a set of constraints on a set of variables, have been introduced in a variety of papers in the area of Artificial Intelligence. Here, we illustrate how a connectionist network for constraint satisfaction can be implemented.

The idea is to use a connectionist node for each value of each variable and for each tuple of each constraint of the constraint satisfaction problem, and to connect them according to the way in which the constraints are related to the variables. Goedel numbers are used as potentials of the nodes that correspond to variables, representing possible paths of solutions. ----- File: 1990/tr-90-019 Applications of Topology to Lower Bound Estimates in Computer Science Michael D. Hirsch tr-90-019 May 1990 This research explores the relationship between topology and computer science by analyzing simple problems in which the role played by topology is crucial, yet which can be approached using techniques that are not too esoteric. The goal is to develop a set of topological tools which can then be applied to other, more central, problems in complexity theory.

We define the concepts of "a problem" and "problem reduction" in computer science in such a way as to make the techniques of point set and algebraic topology applicable. Following Smale, we define "topological complexity" as the minimal number of branch nodes in an algebraic computation tree and relate it to the Schwartz genus of a map.

We introduce a new problem, the new point problem (NPP), and calculate its topological complexity for a variety of spaces. NPP has many variations. The most realistic and applicable version is the following. Given a list of n distinct points in a metric space X with a known lower bound delta for the distance between any two points, what is the topological complexity of finding a new point y such that delta is still a lower bound for the distance between any two of the resulting points?

We prove:

Theorem
The topological complexity of the above problem on the interval [0,1], with delta sufficiently small, is n.
In the final chapter, we show how to use the definition of "a problem" to obtain lower bounds, slightly better than previous ones, on the non-linear complexity of many problems in Computer Science. ----- File: 1990/tr-90-020 Prototyping and Analysis of Non-Sequential Systems Using Predicate-Event Nets Heinz W. Schmidt tr-90-020 May 1990 The specification language SEGRAS is centered on Predicate-Event nets (PrE-nets), a class of Petri nets whose data and behavioral invariants are defined using algebraic specification. This paper focuses on the analysis methods we have developed for these nets in the ESPRIT-project GRASPIN.

PrE-nets inherit from the algebraic theory of abstract datatypes and from net theory. From the side of algebraic specification, notions like modular decomposition, initial models, or consistency and completeness carry over to PrE-nets and preserve their standard semantics. These notions are related to the static semantics and the invariants of the dynamic behavior of a non-sequential system. From the net-theoretic side, theorems and methods for the analysis of behavioral properties are applicable to PrE-nets in a straightforward way. Here we consider in particular net transformations and decomposition methods. ----- File: 1990/tr-90-021 Structure and Scheduling in Real-Time Protocol Implementations David P. Anderson, Luca Delgrossi and Ralf G. Herrtwich tr-90-021 June 1990 Real-time network communication involves 1) the underlying network and its contention mechanism, 2) the design of transport protocols, 3) the scheduling of CPU and network interface devices, and 4) the process/interrupt structure of protocol implementations. This paper is concerned with 3) and 4), in the context of network communication of digital audio and video data.

We describe the issues and design alternatives for CPU and network interface scheduling in the sending host, and CPU scheduling for protocol processing in the receiving host. We discuss how the proposed policies can be incorporated in existing operating systems such as UNIX. Our discussion is based on the "DASH resource model", a workload and scheduling model designed for real-time communication. ----- File: 1990/tr-90-022 Buffer Space Allocation for Real-Time Channels in a Packet-Switching Network Domenico Ferrari and Dinesh C. Verma tr-90-022 June 1990 Broadband integrated networks will have to offer real-time communication services; that is, they will have to transport information with performance guarantees. A paper previously published by the authors presented a scheme for establishing real-time channels in a pure packet-switching network; that scheme did not include any method for allocating buffer space in the network's nodes to the channels being established. This paper completes the description and evaluation of that scheme, since it presents one such method, and some of the results of the extensive simulations performed to test it. The method is found to be correct and to have a low overhead. While the utilization of the buffer space allocated to the statistical channels is often quite low, thereby indicating that our worst-case approach tends to overallocate space to those channels, the space our method gives to deterministic channels seems to be reasonably well utilized. ----- File: 1990/tr-90-023 On the Power of Randomization in Online Algorithms; S. Ben-David, A. Borodin, R. Karp, G. Tardos, and A. Wigderson tr-90-023 June 1990 Against an adaptive adversary, we show that the power of randomization in online algorithms is severely limited! We prove the existence of an efficient ``simulation'' of randomized online algorithms by deterministic ones, which is best possible in general.

The proof of the upper bound is existential. We deal with the issue of computing the efficient deterministic algorithm, and show that this is possible in very general cases. ----- File: 1990/tr-90-024 An Introduction to Randomized Algorithms; Richard M. Karp tr-90-024 June 1990 Research conducted over the past fifteen years has amply demonstrated the advantages of algorithms that make random choices in the course of their execution. This paper presents a wide variety of examples intended to illustrate the range of applications of randomized algorithms, and the general principles and approaches that are of greatest use in their construction. The examples are drawn from many areas, including number theory, algebra, graph theory, pattern matching, selection, sorting, searching, computational geometry, combinatorial enumeration, and parallel and distributed computation. ----- File: 1990/tr-90-025 Approximating the Number of Solutions of a GF[2] Polynomial Marek Karpinski and Michael Luby tr-90-025 July 1990 We develop a polynomial time Monte-Carlo algorithm for estimating the number of solutions to a multivariate polynomial over GF[2]. This gives the first efficient method for estimating the number of points on algebraic varieties over GF[2], which has been recently proven to be #P-complete even for cubic polynomials. There are a variety of applications of our result, which will be discussed in the full version of the paper. ----- File: 1990/tr-90-026 Audio and Video in Distributed Computer Systems: Why and How? Ralf Guido Herrtwich tr-90-026 July 1990 Technological advances allow computer systems to handle "continuous media" such as audio and video in addition to "discrete media" such as text and graphics. As with the introduction of computer graphics ten years ago, the integration of continuous media will extend the range of computer applications and change existing paradigms for computer usage and programming. Distributed computer systems that are capable of handling continuous media can (1) unify the methods of information distribution, (2) personalize information services through interactive access and individual information selection, and (3) make information presentation more effective. The major obstacles to using continuous media in today's computer systems are performance limitations. In addition to high-capacity and high-speed hardware, system software is needed that meets the real-time demands of audio and video, and that provides application interfaces which take the special requirements of these new data types into account. ----- File: 1990/tr-90-027 Complexity Theoretic Issues Concerning Block Ciphers Related to D.E.S. Richard Cleve tr-90-027 July 1990 The D.E.S. cipher is naturally viewed as a composition of sixteen invertible transformations on 64-bit strings (where the transformations depend on the value of a 56-bit key). Each of the transformations has a special form and satisfies the particular property that each of its output bits is determined by a "small" number of its input bits. We investigate the computational power of block ciphers on n-bit strings that can be expressed as polynomial-length (with respect to n) compositions of invertible transformations that have a form similar to those of D.E.S. In particular, we require that the basic transformations have the property that each of their output bits depends on the value of a small number of their input bits (where "small" is somewhere in the range between O(1) and O(log n)).
We present some sufficient conditions for ciphers of this type to be "pseudorandom function generators" and, thus, to yield private key cryptosystems that are secure against adaptive chosen plaintext attacks. ----- File: 1990/tr-90-028 Temporal Reasoning with Intervals in Branching Time Peter B. Ladkin, Frank D. Anger, and Rita V. Rodriguez tr-90-028 July 1990 Allen [ALLE83] adapted path-consistency techniques [MACK77] to heuristic reasoning concerning intervals over linear time, by calculating the composition table of binary relations on intervals, and using it in the path-consistency algorithm. We consider here a model of branching time which is dense, unbounded, future branching, without rejoining branches. The algorithm in [ALLE83] works directly with branching-time intervals, provided only that the composition table of the binary branching-time interval relations is used instead of Allen's table [LADK88]. Here we calculate the composition table which has to be used, which is considerably more complex than the table for linear-time intervals. This provides a heuristic, cubic-time algorithm for reasoning with branching-time intervals. ----- File: 1990/tr-90-029 On Location: Points About Regions Peter B. Ladkin and Judith S. Crow tr-90-029 July 1990 In this paper we formalize Whitehead's construction for inducing point structures from region structures using a primitive relation of connection on regions [Whi79]. Our concern is to formulate a spatiotemporal analogue to the construction of temporal periods/points from events, and is reminiscent of the temporal constructions of Kamp [Kam79] and van Benthem [vBen83]. We compare our interpretation of Whitehead with the Kamp/van Benthem/Russell constructions and find some unresolved issues of interdefinability. Our goal is an apposite formulation of spatiotemporal locations as suggested for Situation Theory by Barwise and Perry [BP83]. ----- File: 1990/tr-90-030 On the Magnification of Exchange Graphs with Applications to Enumeration Problems (Thesis) Paul Dagum tr-90-030 July 1990 This thesis concerns the design of fully polynomial approximation algorithms for some #P-complete enumeration problems. The types of enumeration problems we consider can be regarded as instances of computing |F| for set systems (V,F) having a description in terms of a "complete set of implicants" I with |I| = O(|V|^2). By studying the geometric quantities of adjacency and magnification of the "exchange graph" of set systems, we establish criteria for the design of fully polynomial algorithms. ----- File: 1990/tr-90-031 Fault Tolerance in Feed-forward Artificial Neural Networks Carlo H. Sequin and Reed D. Clay tr-90-031 July 1990 The errors resulting from defective units and faulty weights in layered feed-forward ANN's are analyzed, and techniques to make these networks more robust against such failures are discussed. First, using some simple examples of pattern classification tasks and of analog function approximation, it is demonstrated that standard architectures subjected to normal backpropagation training techniques do not lead to any noteworthy fault tolerance. Additional, redundant hardware coupled with suitable new training techniques is necessary to achieve that goal.
A simple and general procedure is then introduced that develops fault tolerance in neural networks: The types of failures that one might expect to occur during operation are introduced at random during the training of the network, and the resulting output errors are used in a standard way for backpropagation and weight adjustment. The result of this training method is a modified internal representation that is not only more robust to the types of failures encountered in training, but which is also more tolerant of faults for which the network has not been explicitly trained. ----- File: 1990/tr-90-032 A Note on Self-Testing/Correcting Methods for Trigonometric Functions Richard Cleve and Michael Luby tr-90-032 July 1990 Blum, Luby and Rubinfeld (1990) introduced the notion of self-testing/correcting for various problems. We show how to apply some of their techniques to construct a self-testing/correcting pair for the problem of computing the sin and cos functions. ----- File: 1990/tr-90-033 The Computational Complexity of (XOR, AND)-Counting Problems Andrzej Ehrenfeucht and Marek Karpinski tr-90-033 July 1990 We characterize the computational complexity of counting the exact number of satisfying assignments of (XOR, AND)-formulas in their RSE-representation (i.e., equivalently, polynomials in GF[2][x_1, ..., x_n]). This problem has for some time resisted efforts to find a polynomial time solution as well as efforts to prove it #P-complete. Both main results can be generalized to arbitrary finite fields GF[q]. Because counting the number of solutions of polynomials over finite fields is generic for many other algebraic counting problems, the results of this paper settle a borderline between the algebraic problems with polynomial time counting algorithms and the problems which are #P-complete. In [Karpinski, Luby 89] the counting problem for arbitrary multivariate polynomials over GF[2] has been proved to have randomized polynomial time approximation algorithms. ----- File: 1990/tr-90-034 Finite Representations of Deformable Functions Pietro Perona tr-90-034 July 1990 Starting from a 'template' function F(x) and composing it with a family of transformations T_theta (e.g., rotations, scalings) of its domain, one obtains a family of 'deformations' of F, F o T_theta(x), spanning an n-dimensional space; n is in general infinite. A technique is presented that allows one (1) to compute the best approximation of a given family using linear combinations of a finite number of 'basis' functions, and (2) to characterize those functions F generating finite-dimensional families. The technique applies to all cases where T_theta belongs to a compact group of transformations. The results presented here have applications in early vision and signal processing for the computation of filters in a continuum of orientations and scales. ----- File: 1990/tr-90-035 An Introduction to Real-Time Scheduling Ralf Guido Herrtwich tr-90-035 July 1990 Until now, real-time processing techniques have been used only in exotic computer applications such as process automation. With the advent of computer systems capable of handling time-critical data such as digital audio and video, they become important for general-purpose computing as well. Real-time scheduling, i.e., assigning resources to processes in a way that takes the timing requirements of these processes into account, is the single most important technique in the construction of real-time systems.
This tutorial introduces the most widely used system models for real-time scheduling, describing resource characteristics, process parameters, and scheduling objectives. It summarizes, illustrates, and verifies essential findings about basic real-time scheduling algorithms such as earliest-deadline-first, least-laxity-first, and rate-monotonic scheduling for both sporadic and periodic processes. ----- File: 1990/tr-90-036 The Goedel Incompleteness Theorem and Decidability over a Ring Lenore Blum tr-90-036 August 1990 Goedel showed in 1931 that given any reasonable (consistent and effective) theory of arithmetic, there are true assertions about the natural numbers that are not theorems in that theory. This "incompleteness theorem" ended Hilbert's program of formalizing mathematics and is rightfully regarded as the most important result in the foundations of mathematics in this century. Now the concept of undecidability of a set plays an important role in understanding Goedel's work. On the other hand, the question of the undecidability of the Mandelbrot set has been raised by Roger Penrose. Penrose acknowledges the difficulty of formulating his question because "decidability" has customarily only dealt with countable sets, not sets of real or complex numbers.

Here we give an exposition of Goedel's result in an algebraic setting and also a formulation of (and essentially an answer to) Penrose's problem. The notions of computability and decidability over a ring R underlie our point of view. Goedel's Theorem follows from the Main Theorem: There is a definable undecidable set over Z. By way of contrast, Tarski's Theorem asserts that every definable set over the reals or any real closed field R is decidable over R. We show a converse to this result, namely: any sufficiently infinite ordered field with this property is necessarily real closed. ----- File: 1990/tr-90-037 Two Results on the List Update Problem; Sandy Irani tr-90-037 August 1990 In this paper we give a randomized on-line algorithm for the list update problem. Sleator and Tarjan show a deterministic algorithm, Move-to-Front, that achieves a competitive ratio of (2L-1)/L for lists of length L. Karp and Raghavan show that no deterministic algorithm can beat 2L/(L+1). We show that Move-to-Front in fact achieves an optimal competitive ratio of 2L/(L+1). We show a randomized algorithm that achieves a competitive ratio of (31L+1)/(16(L+1)) against an oblivious adversary. This is the first randomized strategy whose competitive factor is a constant less than 2.
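Since the Move-to-Front rule is stated in one line, the cost model is easy to make concrete; the following minimal Python sketch (with an invented request sequence) charges an access at (1-based) position i a cost of i and then moves the item to the front, which is exactly the quantity the competitive ratios above count.

    def mtf_cost(initial_list, requests):
        # Serve requests on a linear list under Move-to-Front.
        lst = list(initial_list)
        total = 0
        for x in requests:
            i = lst.index(x)              # 0-based position of the item
            total += i + 1                # access cost is its 1-based position
            lst.insert(0, lst.pop(i))     # move the accessed item to the front
        return total

    print(mtf_cost("abcde", "eedcbaee"))  # total cost of one made-up sequence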

Keywords: Analysis of Algorithms, On-line Algorithms, Competitive Analysis, Amortized Analysis, Linear Lists. ----- File: 1990/tr-90-038 Information-Based Complexity: New Questions for Mathematicians J. F. Traub and H. Wozniakowski tr-90-038 August 1990 [No Abstract] ----- File: 1990/tr-90-039 The Monte Carlo Algorithm with a Pseudo-Random Generator J. F. Traub and H. Wozniakowski tr-90-039 August 1990 We analyze the Monte Carlo algorithm for the approximation of multivariate integrals when a pseudo-random generator is used. We establish lower and upper bounds on the error of such algorithms. We prove that as long as a pseudo-random generator is capable of producing only finitely many points, the Monte Carlo algorithm with such a pseudo-random generator fails for L_2 or continuous functions. It also fails for Lipschitz functions if the number of points does not depend on the number of variables. This is the case if a linear congruential generator is used with one initial seed. On the other hand, if a linear congruential generator of period m is used for each component with independent uniformly distributed initial seeds, then the Monte Carlo algorithm with such a pseudo-random generator using n function values behaves as for the uniform distribution and its expected error is roughly n^(-1/2) as long as the number n of function values is less than m^2. ----- File: 1990/tr-90-040 Designing Checkers for Programs that Run in Parallel Ronitt Rubinfeld tr-90-040 August 1990 We extend the theory of program result checking to parallel programs, and find general techniques for designing such result checkers. We find result checkers for many basic problems in parallel computation. We show that there are P-complete problems (evaluating straight-line programs, linear programming) that have very fast (even constant depth) parallel result checkers. Sorting, multiplication, parity, majority and the all pairs shortest path problem all have constant depth result checkers. In addition, the sequential versions of the parallel result checkers given for integer sorting and the all pairs shortest path problems are the first deterministic sequential result checkers for those problems. ----- File: 1990/tr-90-041 Self-Testing/Correcting with Applications to Numerical Problems Manuel Blum, Michael Luby and Ronitt Rubinfeld tr-90-041 August 1990 Suppose someone gives us an extremely fast program P that we can call as a black box to compute a function f. Should we trust that P works correctly? A self-testing/correcting pair for f allows us to: (1) estimate the probability that P(x) is not equal to f(x) when x is randomly chosen; (2) on any input x, compute f(x) correctly as long as P is not too faulty on average. Furthermore, both (1) and (2) take time only slightly more than the original running time of P.
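To make the flavor of such a pair concrete, here is a toy self-corrector in the spirit of these techniques for modular multiplication f(a,b) = a*b mod n, one of the functions treated below; the faulty black box is invented for the demonstration. Because f is bilinear, f(a,b) can be recovered from calls to P at points that are each uniformly distributed, so occasional faults are voted out.

    import random
    from collections import Counter

    def faulty_mulmod(a, b, n):            # an untrusted "black box" P
        if random.random() < 0.05:         # wrong about 5% of the time
            return random.randrange(n)
        return (a * b) % n

    def self_correct_mulmod(P, a, b, n, trials=25):
        votes = Counter()
        for _ in range(trials):
            r1, r2 = random.randrange(n), random.randrange(n)
            # bilinearity: ab = (a+r1)(b+r2) - (a+r1)r2 - r1(b+r2) + r1*r2
            v = (P((a + r1) % n, (b + r2) % n, n)
                 - P((a + r1) % n, r2, n)
                 - P(r1, (b + r2) % n, n)
                 + P(r1, r2, n)) % n
            votes[v] += 1
        return votes.most_common(1)[0][0]  # majority vote over the trials

    n = 2**31 - 1
    a, b = random.randrange(n), random.randrange(n)
    print(self_correct_mulmod(faulty_mulmod, a, b, n) == (a * b) % n)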

We present general techniques for constructing simple-to-program self-testing/correcting pairs for a variety of numerical functions, including integer multiplication, modular multiplication, matrix multiplication, inverting matrices, computing the determinant of a matrix, computing the rank of a matrix, integer division, modular exponentiation and polynomial multiplication. ----- File: 1990/tr-90-042 CHCL - A Connectionist Inference System for Horn Logic based on the Connection Method and using Limited Resources Steffen Hoelldobler tr-90-042 August 1990 A connectionist inference system for a class of Horn clauses is presented. The system is based on a connectionist unification algorithm for first-order terms and utilizes Bibel's connection method. The resources of the system are limited in that at most one instance of each clause may be used in a proof. ----- File: 1990/tr-90-043 ODA-Based Data Modeling in Multimedia Systems Ralf Guido Herrtwich and Luca Delgrossi tr-90-043 August 1990 A multimedia system can handle both discrete media (text, graphics) and continuous media (audio, video). The design of a multimedia system comprises processing and data modeling aspects. In this paper, we are concerned with data modeling only. We present a proposal to extend the ISO Office Document Architecture (ODA) to accommodate continuous media. To provide media flexibility, the needs for new ODA content architectures are identified. To take into account the timing requirements of continuous-media data, attributes for temporal synchronization are introduced for the logical and layout structure of an ODA document. To reflect that multimedia information appeals not only to the sense of vision, the layout structure is extended from two-dimensional visual space to arbitrary "presentation space". In addition, the inclusion of live information and hypertext features into ODA documents is proposed. ----- File: 1990/tr-90-044 Continuous Speech Recognition on the Resource Management Database Using Connectionist Probability Estimation N. Morgan, C. Wooters, H. Bourlard and M. Cohen tr-90-044 September 1990 Previous work has shown the ability of Multilayer Perceptrons (MLPs) to estimate emission probabilities for a Hidden Markov Model (HMM). The advantage of this approach is the ability to incorporate multiple sources of evidence (features, temporal context) without restrictive assumptions of distribution or statistical independence.
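The conversion that makes MLP outputs usable as HMM emission scores is a one-liner, sketched below with invented numbers: an MLP trained on 1-of-N state targets estimates the posteriors p(q|x), and dividing by the state priors p(q) (taken from the training alignment) gives quantities proportional to the likelihoods p(x|q) that the HMM needs.

    import numpy as np

    posteriors = np.array([0.7, 0.2, 0.1])  # MLP outputs for one frame
    priors     = np.array([0.5, 0.3, 0.2])  # state frequencies in training data

    scaled = posteriors / priors             # proportional to p(x|q)
    print(np.log(scaled))                    # added to path scores in Viterbi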

In our earlier publications on this topic, a hybrid MLP/HMM continuous speech recognition algorithm was tested on the SPICOS German-language data base. In our recent work, we have shifted to the speaker-dependent portion of DARPA's English language Resource Management (RM) data base. Both consist of continuous utterances (sentences) and incorporate a lexicon of roughly 1000 words. Preliminary results appear to support the previously reported utility of MLP probability estimation for continuous speech recognition (at least for the case of this simple form of HMM). ----- File: 1990/tr-90-045 SPOONS '90: The SPeech recOgnition frOnt eNd workShop N. Morgan, H. Hermansky, and C. Wooters tr-90-045 September 1990 An appropriate input representation is crucial for pattern classification. In spite of this, we find that feature extraction, transformation, and selection tend to be under-represented aspects of the speech recognition literature. Therefore, the authors decided to gather together a group of interested parties for a dialog on the subject. We ultimately invited a group of about 30 researchers, and on July 6, 1990, held a 1-day workshop which we called SPOONS. This document is a brief summary of that day, including the abstract for each talk. ----- File: 1990/tr-90-046 Space-Frequency Distributions in the Analysis and Modeling of Early Vision Gabriel Cristobal tr-90-046 September 1990 The use of joint space-spatial frequency representations has recently received considerable attention, especially in those areas of science and engineering where nonstationary signals appear. In that case, local energy distribution representations based on local spectrum computation would be more appropriate. The Wigner Distribution (WD), which gives a joint representation in the space and spatial-frequency domains, provides a rigorous mathematical framework for the study of these local representations. In this paper, texture recognition is performed through the extraction of features from the WD and a comparative study with other methods is presented. A review of the state of the art of joint representations in different areas of research, namely signal, speech, and vision processing, is presented. Afterwards, the importance of these distributions in the modeling of early vision processes is considered, and a brief review of the physiological findings is presented in order to have a quantitative measure of the degree of biological plausibility. ----- File: 1990/tr-90-047 The Ring Array Processor (RAP): Algorithms and Architecture Nelson Morgan tr-90-047 September 1990 We have designed and implemented a Ring Array Processor (RAP) for fast implementation of our continuous speech recognition training algorithms which are currently dominated by layered neural network calculations. The RAP is a multi-DSP system with a low-latency ring interconnection scheme using programmable gate array technology and a significant amount of local memory per node (4-16 MBytes of dynamic memory and 256 KByte of fast static RAM). Theoretical peak performance is 128 MFlops/board, and test runs with the first working board show a sustained throughput of roughly 30-90 percent of this for algorithms of current interest.
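The ring organization pays off because a fully connected layer can circulate activation blocks instead of broadcasting them. The sketch below (plain Python/NumPy, not RAP code) shows the pattern under that assumption: each of P nodes holds a block of the weight-matrix rows plus one block of the input vector, and after the blocks have been passed around the ring every node has accumulated its slice of the product.

    import numpy as np

    def ring_layer(W_blocks, x_blocks):
        P = len(x_blocks)
        y_blocks = [np.zeros(W.shape[0]) for W in W_blocks]
        blocks = list(x_blocks)          # the x-block each node currently holds
        owner = list(range(P))           # index of that block
        for _ in range(P):               # P rounds: multiply, then shift
            for p in range(P):
                j, w = owner[p], blocks[p].size
                y_blocks[p] += W_blocks[p][:, j * w:(j + 1) * w] @ blocks[p]
            blocks = blocks[-1:] + blocks[:-1]   # pass blocks around the ring
            owner = owner[-1:] + owner[:-1]
        return np.concatenate(y_blocks)

    P, n = 4, 8
    W, x = np.random.randn(n, n), np.random.randn(n)
    y = ring_layer(np.split(W, P), np.split(x, P))
    print(np.allclose(y, W @ x))         # True: matches the direct product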

This report describes the motivation for the RAP design, and shows how the architecture matches the target algorithm. Technical reports from other members of the RAP team focus on the hardware and software specifics for the system. ----- File: 1990/tr-90-048 The Ring Array Processor (RAP): Hardware; J. Beck tr-90-048 September 1990 The ICSI Ring Array Processor, or RAP, is a system of hardware and software specifically designed for our research in speech processing using neural networks. This technical report describes the RAP hardware, paying particular attention to the features that may be unusual in a system of this type. Other features and design decisions that facilitated realization of the RAP are also described. Technical reports from other members of the RAP team focus on the architecture and algorithms of the RAP, and the software specifics for the system. ----- File: 1990/tr-90-049 Ring Array Processor (RAP): Software User's Manual Version 1.0; P. Kohn and J. Bilmes tr-90-049 September 1990 The RAP machine is a high performance parallel processor developed at ICSI as described in previous technical reports. This report documents the RAP software environment. It is intended for the moderately experienced C programmer who wishes to program the RAP. The RAP software environment is very similar to the UNIX C programming environment. However, there are some differences arising from the hardware that the programmer must keep in mind. Also described is the RAP library which contains hand-optimized matrix, vector and inter-processor communications routines. Single Program Multiple Datastream (SPMD) programs can be developed under UNIX with a simulated RAP library and then recompiled to run on the RAP. ----- File: 1990/tr-90-050 Ring Array Processor (RAP): Software Architecture; Jeff Bilmes and Phil Kohn tr-90-050 September 1990 The design and implementation of software for the Ring Array Processor (RAP), a high performance parallel computer, involved development for three hardware platforms: Sun SPARC workstations, Heurikon MC68020 boards running the VxWorks real-time operating system, and Texas Instruments TMS320C30 DSPs. The RAP now runs in Sun workstations under UNIX and in a VME based system using VxWorks. A flexible set of tools has been provided both to the RAP user and programmer. Primary emphasis has been placed on improving the efficiency of layered artificial neural network algorithms. This was done by providing a library of assembly language routines, some of which use node-custom compilation. An object-oriented RAP interface in C++ is provided that allows programmers to incorporate the RAP as a computational server into their own UNIX applications. For those not wishing to program in C++, a command interpreter has been built that provides interactive and shell-script style RAP manipulation. ----- File: 1990/tr-90-051 Characterizing the Variability of Arrival Processes with Indices of Dispersion Riccardo Gusella tr-90-051 September 1990 We propose to characterize the burstiness of packet arrival processes with indices of dispersion for intervals and for counts. These indices, which are functions of the variance of intervals and counts, are relatively straightforward to estimate and convey much more information than simpler indices, such as the coefficient of variation, that are often used to describe burstiness quantitatively.

We define and evaluate the indices of dispersion for some of the simple analytical models that are frequently used to represent highly variable processes. We then estimate the indices for a number of measured point processes which were generated by workstations communicating with file servers over a local-area network.
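For counts, the estimation amounts to windowing the arrival stream; a minimal Python sketch follows, using synthetic Poisson arrivals (for which the index is 1 at every timescale) rather than measured traces.

    import numpy as np

    def idc(arrivals, window):
        # Index of dispersion for counts at timescale `window`: chop the
        # trace into windows, count arrivals per window, take Var/Mean.
        edges = np.arange(0.0, arrivals.max() + window, window)
        counts, _ = np.histogram(arrivals, bins=edges)
        return counts.var() / counts.mean()

    rng = np.random.default_rng(0)
    poisson = np.cumsum(rng.exponential(1.0, 5000))  # rate-1 Poisson process
    print(idc(poisson, 10.0))   # close to 1; bursty traffic gives values >> 1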

We show that nonstationary components in the measured packet arrival data distort the shape of the indices and propose ways to handle nonstationary data. Finally, to show how to incorporate measures of variability into analytical models and to offer an example of how to model our measured packet arrival processes, we describe a fitting procedure based on the index of dispersion for counts for the Markov-modulated Poisson process. ----- File: 1990/tr-90-052 On Semi-Algebraic Decision Complexity Thomas Lickteig tr-90-052 September 1990 The topic of this paper is the lower bound question for composition trees that solve certain semialgebraic decision problems. ----- File: 1990/tr-90-053 A Pipelining Model Which Pipelines Blocks of Code Joachim Beer tr-90-053 October 1990 This paper presents a new technique of software pipelining and an architecture to support this technique. Rather than attempting to pipeline a sequence of individual instructions, the presented technique tries to pipeline entire blocks of code, i.e., the units to be pipelined are chunks of code; instructions within each code block might or might not be pipelined themselves. In this model blocks of code are identified which can be executed in a pipelined fashion. Neighboring blocks of code do not need to be data independent; pipeline stages can feed results and/or synchronization markers on to the next pipeline stage. The architecture can be seen as an attempt to use classical pipelining techniques in a multiprocessor system. The architecture consists of a circular pipeline of ordinary microprocessors. Advantages of the architecture are: unlike supercomputers and VLIW architectures, the system can be based on commercial microprocessors, it avoids the high overhead of process startup, and it is not restricted to vectorizing only inner loops. Simulation studies show the viability of the architecture and the associated execution model. ----- File: 1990/tr-90-054 A Mathematical Theory of Self-Checking, Self-Testing and Self-Correcting Programs (Thesis) Ronitt Rubinfeld tr-90-054 October 1990 Suppose someone gives us an extremely fast program P that we can call as a black box to compute a function f. Rather than trust that P works correctly, a self-testing/correcting pair for f allows us to: (1) estimate the probability that P(x) is not equal to f(x) when x is randomly chosen; (2) on any input x, compute f(x) correctly as long as P is not too faulty on average. Furthermore, both (1) and (2) require only a small multiplicative overhead (usually constant) over the running time of P. A program result checker for f (as introduced by Manuel Blum) allows us to check that on a particular input x, P(x) = f(x).

We present general techniques for constructing simple-to-program self-testing/correcting pairs for a variety of numerical functions. The self-testing/correcting pairs introduced for many of the problems are based on the property that the solution to a particular instance of the problem can be expressed as the solution to a few random instances of the same size. An important idea is to design self-testing/correcting pairs for an entire library of functions rather than for each function individually.

We extend these notions and some of the general techniques to check programs for some specific functions which are only intended to give good approximations to f(x). We extend the above models and techniques of program result checking and self-testing/correcting to the case where the behavior of the program is modelled as being adaptive, i.e., the program may not always give the same answer on a particular input. These stronger checkers provide multi-prover interactive proofs for these problems.

The theory of checking is also extended to parallel programs [Rubinfeld]. We construct parallel checkers for many basic problems in parallel computation.

We show that for some problems, result checkers that are much more efficient can be constructed if the answers are checked in batches, i.e., many answers are checked at the same time. For these problems, the multiplicative overhead of checking the result can be made arbitrarily small. ----- File: 1990/tr-90-055 ICSIM: Initial Design of An Object-Oriented Net Simulator Heinz W. Schmidt tr-90-055 October 1990 ICSIM is a connectionist net simulator being developed at ICSI. It is object-oriented to meet the requirements for flexibility and reuse of models and to allow the user to encapsulate efficient customized implementations perhaps running on dedicated hardware. Nets are composed by combining off-the-shelf library classes and, if necessary, by specializing some of their routines.

The report gives an overview of the simulator. The class structure and some important design decisions are sketched and a number of example nets are used to illustrate how net structure, connectivity and behavior are defined. ----- File: 1990/tr-90-056 How Fast Can A Threshold Gate Learn? Wolfgang Maass and Gyoergy Turan tr-90-056 October 1990 It is shown that a threshold gate with d Boolean input variables can learn any halfspace in a number of steps polynomial in d in the common on-line learning model (worst case analysis). This is achieved by a computationally feasible learning algorithm that exploits geometrical properties of the version space. This positive result can be extended to the case of input variables that range over {0,...,n-1}, and to threshold gates with more than two different output values (these gates can learn arbitrary discrete approximations to sigmoid threshold functions).
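For contrast with the report's algorithm, the classical on-line update for a threshold gate (the delta-rule discussed in the next paragraph) looks as follows in a minimal sketch with a synthetic target halfspace; the learner is charged one mistake per wrong prediction, which is the quantity the worst-case bounds count.

    import numpy as np

    def online_perceptron(examples, d):
        w, theta, mistakes = np.zeros(d), 0.0, 0
        for x, label in examples:            # labels in {0, 1}
            pred = 1 if w @ x >= theta else 0
            if pred != label:                # update only on prediction errors
                mistakes += 1
                sign = 1 if label == 1 else -1
                w += sign * x                # move the hyperplane toward x
                theta -= sign
        return w, theta, mistakes

    rng = np.random.default_rng(1)
    target = rng.integers(-3, 4, size=10)    # hidden halfspace over {0,1}^10
    X = rng.integers(0, 2, size=(200, 10))
    examples = [(x, int(x @ target >= 2)) for x in X]
    print(online_perceptron(examples, 10)[2], "mistakes in 200 trials")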

On the other hand, we show that all known distributed learning algorithms for threshold gates (delta-rule, WINNOW 1, WINNOW 2) are inherently slow. ----- File: 1990/tr-90-057 Learning Spatial Terms without Explicit Negative Instances Terry Regier tr-90-057 October 1990 A method is presented for learning to associate scenes with spatial terms, in the absence of explicit negative instances, using error back-propagation. A straightforward approach, in the learning of a given term, is to take all positive instances for any other term to be implicit negative instances for the term in question. While this approach is inadequate, a variation on it is shown to work well: error signals from implicit negative instances are attenuated, so that an implicit negative instance will have less effect on the network's weights than will a positive instance of the same error magnitude. It is also shown that "a priori" knowledge of which pairs of spatial terms are antonyms facilitates the learning process. ----- File: 1990/tr-90-058 A Theory of Computation and Complexity over the Real Numbers Lenore Blum tr-90-058 October 1990 The classical theory of computation and complexity presupposes all underlying spaces are countable and hence ipso facto cannot handle arbitrary sets of real or complex numbers. Thus e.g., Penrose (1990) acknowledges the difficulty of formulating his question classically: Is the Mandelbrot set recursive? On the other hand, this as well as a number of other inherent questions of decidability and computability over the reals or complex numbers can be naturally posed and settled within the framework presented in this paper. ----- File: 1990/tr-90-059 Constraint Reasoning With Intervals: A Tutorial, Survey and Bibliography Peter B. Ladkin tr-90-059 November 1990 A version of this work was presented at the 1990 Berkeley Workshop on Temporal and Real-Time Specification, held at ICSI, Berkeley. In Part I, we present a short tutorial on constraint reasoning with time intervals, of the sort initially introduced by James Allen, and continued by many others. The tutorial concentrates on the general mathematical expression of common algorithms, in particular path-consistency algorithms, for constraint satisfaction using the thirteen interval relations. We use the relation algebra of Tarski to express the important concepts. In Part II, we survey important research in this field to date, focusing on mathematical results and algorithms for reasoning directly with intervals, although we do attempt to include as much literature as we are aware of. Part III is a select bibliography. Three appendices include the mathematical background, and the operation tables for the Point Algebra and Interval Algebra, which form the focus of Part I. ----- File: 1990/tr-90-060 Proceedings of the Berkeley Workshop on Temporal and Real-Time Specification, August 9-10, 1990 P. B. Ladkin and F. H. Vogt tr-90-060 November 1990 This report contains papers presented by participants at the workshop, with an introduction, a participant list, a synopsis of the workshop, and a short summary of the problem session discussion. The workshop brought together practitioners with different interests in temporal and real-time specification, from simulation, testing and verification to theoretical issues such as relative strengths of theories. The papers concern interval logic, theories of intervals, real-time temporal logic and automata, a real-time systems simulation language, and a causality problem in robot motion planning.
----- File: 1990/tr-90-061 Stochastic Model-Based Image Segmentation Using Markov Random Fields and Multi-layer Perceptrons Jun Zhang and Nelson Morgan tr-90-061 November 1990 Recently, there has been much interest in Markov random field (MRF) model-based techniques for image (texture) segmentation. MRF models are used to enforce reasonable physical constraints on segmented regions, such as the continuity of the regions, and have been shown to improve segmentation results. However, in these techniques, parametric probability models which do not have sufficient physical justifications are often used to model observed image data because they are computationally tractable. In this paper, we outline an MRF approach to image segmentation in which the probability distribution of observed image data is modeled by using a multi-layer perceptron (MLP) which can "learn" the distribution from training data. Furthermore, we propose a technique to achieve unsupervised image segmentation using this approach. We hope that this will improve the current MRF image segmentation techniques by providing a better model for observed image data. ----- File: 1990/tr-90-062 Proceedings of the First International Workshop on Network Operating System Support for Digital Audio and Video [Proceedings Editor] tr-90-062 November 1990 Held at the International Computer Science Institute November 8-9, 1990. ----- File: 1990/tr-90-063 A Monte-Carlo Algorithm for Estimating the Permanent; N. Karmarkar, R. Karp, R. Lipton, L. Lovasz, and M. Luby tr-90-063 November 1990 Let $A$ be an $n \times n$ matrix with 0-1 valued entries, and let $\PER(A)$ be the permanent of $A$. We describe a Monte-Carlo algorithm which produces a ``good in the relative sense'' estimate of $\PER(A)$ and has running time $\POLY(n) 2^{n/2}$, where $\POLY(n)$ denotes a function that grows polynomially with $n$.
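The starting point for such estimators can be stated in a few lines: replacing each entry of A by a random sign turns the determinant into an unbiased estimator of the permanent (a Godsil-Gutman style estimator, sketched below in Python). The variance reduction that yields the stated running time is the report's contribution and is not reproduced here.

    import numpy as np

    def estimate_permanent(A, trials=2000, seed=0):
        # E[det(A * S)^2] = per(A) for a 0-1 matrix A and independent
        # uniform +/-1 signs S; averaging trials tames the huge variance.
        rng = np.random.default_rng(seed)
        n = A.shape[0]
        total = 0.0
        for _ in range(trials):
            signs = rng.choice([-1.0, 1.0], size=(n, n))
            total += np.linalg.det(A * signs) ** 2
        return total / trials

    A = np.ones((5, 5))              # per(A) = 5! = 120
    print(estimate_permanent(A))     # fluctuates around 120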

Key Words: permanent, matching, Monte-Carlo algorithm, bipartite graph, determinant. ----- File: 1990/tr-90-064 Quality of Service in ATM Networks Domenico Ferrari and Dinesh Verma tr-90-064 December 1990 B-ISDN networks of the future will have to handle traffic with a wide range of characteristics and performance requirements. In view of the high bandwidth of these networks and the relatively large propagation delays involved in wide-area B-ISDN networks, the performance requirements can only be provided by reserving resources for communicating clients at connection establishment time. However, reservation mechanisms for heterogeneous bursty traffic usually result in a rather poor utilization of network resources.

In this paper, we propose a simple admission control criterion that can be used to reserve resources for bursty as well as smooth traffic with delay and loss sensitivities. Our scheme leads to a reasonable value of the maximum utilization of network bandwidth (about 40 percent) for delay sensitive traffic with moderate burstiness (peak-to-average bandwidth ratios of about 4), even under the worst possible conditions. Actual utilizations can be higher if there is smooth traffic or traffic which is not delay-sensitive. Our admission control algorithm uses a well-defined traffic specification scheme which is easy to enforce and verify, and able to accommodate arbitrary degrees of burstiness. Extensive simulation experiments failed to show that our admission control criterion is incorrect: the quality of service requirements of the traffic were always met, even in the worst case. Moreover, the scheme is simple and feasible at the high speeds required of B-ISDN networks. ----- File: 1990/tr-90-065 Developments in Digital VLSI Design for Artificial Neural Networks Nelson Morgan, Krste Asanovic, Brian Kingsbury, and John Wawrzynek tr-90-065 December 1990 Artificial Neural Networks (ANNs) have been heralded as a form of massive parallelism that may significantly advance the state of the art in machine intelligence and perception. While these expectations may or may not be realistic, this class of algorithms has already been useful for difficult problems in signal processing and pattern recognition over the last 25 years. However, for extension to a wider class of problems, a key requirement is the parallel hardware implementation of such systems, since ANN implementation on conventional von Neumann machines is often prohibitively slow. While the ANN mainstream has focused on analog VLSI ANNs, some projects have shown the potential of a fully digital approach. We report here on progress in developing a methodology for digital ANN design, including a new object-oriented CAD interface, and a set of ANN-specific library cells. A new measure for efficiency of silicon ANNs is also described. ----- File: 1990/tr-90-066 Automatic Worst Case Complexity Analysis of Parallel Programs Wolf Zimmermann tr-90-066 December 1990 This paper introduces a first approach to automatic worst case complexity analysis. It is an extension of previous work on the automatic complexity analysis of functional programs. The language is a first order parallel functional language which allows the definition of indexed data types and parallel execution of indexed terms. The machine model is a parallel reduction system based on eager evaluation. It is shown how parallel programs based on the basic design principles of the balanced binary tree technique, the divide-and-conquer technique, and the pointer jumping technique can be analyzed automatically. The analysis techniques are demonstrated by various examples. Finally, it is shown that an average case analysis of parallel programs is difficult. ----- File: 1991/tr-91-001 The Mean Field Theory in EM Procedures for Markov Random Fields Jun Zhang tr-91-001 January 1991 The EM (expectation maximization) algorithm is a maximum-likelihood parameter estimation procedure for incomplete data problems in which part of the data is hidden, or unobservable.
In many signal processing and pattern recognition applications, the hidden data are modeled as Markov processes and the main difficulty of using the EM algorithm for these applications is the calculation of the conditional expectations of the hidden Markov processes. In this paper, we show how the mean field theory from statistical mechanics can be used to efficiently calculate the conditional expectations for these problems. The efficacy of the mean field theory approach is demonstrated on the parameter estimation for one-dimensional mixture data and two-dimensional unsupervised stochastic model-based image segmentation. Experimental results indicate that in the 1-D case, the mean field theory approach provides results comparable to those obtained by Baum's algorithm, which is known to be optimal. In the 2-D case, where Baum's algorithm can no longer be used, the mean field theory provides good parameter estimates and image segmentation for both synthetic and real-world images. ----- File: 1991/tr-91-002 Protocols for Providing Performance Guarantees in a Packet Switching Internet Carlyn M. Lowery tr-91-002 January 1991 As advances in technology enable us to implement very high speed computer networks, we expect to use our networks for more diverse applications. While the Internet was designed with textual data processing in mind, future networks will carry information such as voice, music, images, and video, along with textual data. Many new applications will have real-time performance requirements, where the timing of data arrival is crucial to its usefulness.

This paper describes a methodology developed at the University of California at Berkeley to support such applications, reviews related research work, and proposes a real-time delivery system, composed of a new protocol for administration of real-time connections, combined with modifications to the Internet Protocol (IP) to support such connections. Transport protocol requirements are also discussed. This work is intended to facilitate experiments with real-time communication over the Experimental University Network (XUNET). ----- File: 1991/tr-91-003 On-Line Learning with an Oblivious Environment and the Power of Randomization Wolfgang Maass tr-91-003 January 1991 A new model for on-line learning is introduced. In this model the environment is assumed to be "oblivious" to the learner: it supplies an arbitrary (not necessarily random) sequence of examples for the target concept which does not depend on the sequence of hypotheses of the learner. This model provides a framework for the design and analysis of on-line learning algorithms which acquire information not just from counterexamples, but also from examples which "support" their current hypothesis. It is shown that for various concept classes C an arbitrary target concept from C can be learned in this model by a randomized learning algorithm (which uses only hypotheses from C) with substantially fewer prediction errors than in Angluin's classical model for on-line learning with an adaptive worst-case environment. In particular, any target setting of weights and thresholds in a feed-forward neural net can be learned by a randomized learning algorithm in this model with an expected number of prediction errors that is polynomial in the number of units of the neural net.

For comparison, we also examine the power of randomization for Angluin's model for learning with an adaptive environment. ----- File: 1991/tr-91-004 Real-Time Transmission and Software Decompression of Digital Video in a Workstation K. Umemura and A. Okazake tr-91-004 January 1991 This paper describes an experiment in which compressed video data is transmitted via Ethernet to a workstation, and decompressed and displayed on the workstation. The workstation has no special hardware. The video data is 192x114 pixel gray scale, 30 frames per second. The data consists of a human speaker with a static background. It is displayed on a monochrome display, with dithering, in a 768x576 rectangle. This decompression and display uses about 10 MIPS. The quality of output is suitable for applications such as conferencing, telephony, and presentations. ----- File: 1991/tr-91-005 Some Computational Problems in Linear Algebra as Hard as Matrix Multiplication Peter Buergisser, Marek Karpinski, and Thomas Lickteig tr-91-005 January 1991 We define the complexity of a computational problem given by a relation using the model of a computation tree with Ostrowski complexity measure. To a sequence of problems we assign an exponent similar to that for matrix multiplication. For the complexity of the following computational problems in linear algebra:

We prove relative lower bounds of the form a*M_n - b and absolute lower bounds, where M_n denotes the complexity of matrix multiplication and a, b, d are suitably chosen constants. We show that the exponents of the problem sequences KER, OGB, SPR are the same as the exponent omega of matrix multiplication. ----- File: 1991/tr-91-006 Parallel Combinatorial Computing Richard M. Karp tr-91-006 January 1991 In this article we suggest that the application of highly parallel computers to applications with a combinatorial or logical flavor will grow in importance. We briefly survey the work of theoretical computer scientists on the construction of efficient parallel algorithms for basic combinatorial problems. We then discuss a two-stage algorithm design methodology, in which an algorithm is first designed to run on a PRAM and then implemented for a distributed-memory machine. Finally, we propose the class of node expansion algorithms as a fruitful domain for the application of highly parallel computers. ----- File: 1991/tr-91-007 Delay Jitter Control for Real-Time Communication in a Packet Switching Network Dinesh C. Verma, Hui Zhang and Domenico Ferrari tr-91-007 January 1991 A real-time channel is a simplex connection between two nodes characterized by parameters representing the performance requirements of the client. These parameters may include a bound on the minimum connection bandwidth, a bound on the maximum packet delay, and a bound on the maximum packet loss rate. Such a connection may be established in a packet-switching environment by means of the schemes described by some of the authors in previous papers.
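A toy Python illustration of the idea behind such jitter control (a sketch, not the paper's scheme verbatim): if a packet arrives at a node ahead of its nominal schedule, hold it until its eligibility time, so every packet leaves with the same accumulated delay and upstream jitter is cancelled rather than passed on.

    def regulate(arrivals, start_times, nominal_delay):
        # Hold each packet until start_time + nominal_delay before forwarding.
        out = []
        for a, s in zip(arrivals, start_times):
            out.append(max(a, s + nominal_delay))
        return out

    start = [0, 10, 20, 30]              # packets sent every 10 ms
    arrived = [14, 21, 36, 42]           # jittery arrivals at this node (ms)
    print(regulate(arrived, start, nominal_delay=16))
    # -> [16, 26, 36, 46]: the original 10 ms spacing is restored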

In this paper, we study the feasibility of bounding the delay jitter for real-time channels in a packet-switched store-and-forward wide-area network with general topology, extending the scheme proposed in the previous papers. We prove the correctness of our solution, and study its effectiveness by means of simulations. The results show that the scheme is capable of providing a significant reduction in delay jitter, that there is no accumulation of jitter along the path of a channel, and that jitter control reduces the buffer space required in the network significantly. ----- File: 1991/tr-91-008 A Study of I/O Architecture for High Performance Next Generation Computers Anurag Sah, Vojin G. Oklobdzija and Dinesh C. Verma tr-91-008 January 1991 We describe an I/O architecture for a high performance next generation computer. The architecture proposed in this paper makes special provisions for communication networks. In order to allow for the expected multi-media and time-critical components of future computer usage, we propose the concept of "logical buses" which gives the illusion that there are a number of dedicated buses between the components of a system. A logical bus has a number of performance parameters associated with it, and the system architecture ensures that the performance parameters for each logical bus are satisfied during the operation of the system. ----- File: 1991/tr-91-009 Bumptrees for Efficient Function, Constraint, and Classification Learning Stephen M. Omohundro tr-91-009 January 1991 A new class of data structures called "bumptrees" is described. These structures are useful for efficiently implementing a number of neural network related operations. An empirical comparison with radial basis functions is presented on a robot arm mapping learning task. Applications to density estimation, classification, and constraint representation and learning are also outlined. ----- File: 1991/tr-91-010 How Receptive Field Parameters Affect Neural Learning Stephen M. Omohundro and Bartlett W. Mel tr-91-010 January 1991 We identify the three principal factors affecting the performance of learning by networks with localized units: unit noise, sample density, and the structure of the target function. We then analyze the effect of unit receptive field parameters on these factors and use this analysis to propose a new learning algorithm which dynamically alters receptive field properties during learning. ----- File: 1991/tr-91-011 Algorithms for Sparse Rational Interpolation Dima Grigoriev and Marek Karpinski tr-91-011 January 1991 We present two algorithms for sparse rational interpolation. The first is an interpolation algorithm in the sense of the sparse partial fraction representation of rational functions. The second is an algorithm for computing the entier and the remainder of a rational function. The first algorithm works without an a priori known bound on the degree of a rational function; the second one is in the class NC, provided the degree is known. The presented algorithms complement the sparse interpolation results of [Grigoriev, Karpinski, and Singer (1990)].

Keywords: Algorithms, NC-Class, Sparse Rational Interpolation, Fraction Representation. ----- File: 1991/tr-91-012 On Distributed Representation in Word Semantics Burghard B. Rieger tr-91-012 January 1991 The dualism of the rationalistic tradition of thought is sketched in view of the "semiotic problem" of meaning constitution. Being a process of cognition which is based upon communicative interaction by signs, their usages (in linear order and selective combination) constitute language structures. Unlike the "symbolic" representational formats employed so far in natural language processing by machine, it is argued here that "distributional" representations correspond directly to the way word meanings are constituted and understood (as fuzzy structures of world knowledge) by (natural and artificial) information processing systems. Based upon such systems' theoretical performance in general and the pragmatics of communicative interaction by real language users in particular, the notions of "situation" and "language game" as introduced by Barwise/Perry and Wittgenstein respectively are combined to allow for a numerical reconstruction of processes that simulate the constitution of meaning and the interpretation of signs. This is achieved by modelling the linear or "syntagmatic" and selective or "paradigmatic" constraints which natural language structure imposes on the formation of (strings of) linguistic entities. A formalism, a related algorithm, and test results of its implementation are given in order to substantiate the claim for an artificial "cognitive information processing system" (CIPS) that operates in a linguistic environment as some meaning acquisition and understanding device. ----- File: 1991/tr-91-013 Short Proofs for Nondivisibility of Sparse Polynomials under the Extended Riemann Hypothesis Dima Grigoriev, Marek Karpinski, and Andrew M. Odlyzko tr-91-013 February 1991 Symbolic manipulation of sparse polynomials, given as lists of exponents and nonzero coefficients, appears to be much more complicated than dealing with polynomials in dense encoding (see e.g. [GKS 90, KT 88, P 77a, P 77b]). The first results in this direction are due to Plaisted [P 77a, P 77b], who proved, in particular, the NP-completeness of divisibility of a polynomial x**n-1 by a product of sparse polynomials. On the other hand, essentially nothing nontrivial is known about the complexity of the divisibility problem of two sparse integer polynomials. (One can easily prove that it is in PSPACE with the help of [M 86].) Here we prove that nondivisibility of two sparse multivariable polynomials is in NP, provided that the Extended Riemann Hypothesis (ERH) holds (see e.g. [LO 77]).

The divisibility problem is closely related to the rational interpolation problem (whose decidability and complexity bound are determined in [GKS 90]). In this setting we assume that a rational function is given by a black box for evaluating it. We prove also that the problem of deciding whether a rational function given by a black box equals a polynomial belongs to the parallel class NC, provided the ERH holds and, moreover, that we know the degree of some sparse rational representation of it.

Keywords: Algorithms, NC-Class, Symbolic Manipulation, Nondivisibility, Short Proofs, Extended Riemann Hypothesis. ----- File: 1991/tr-91-014 Computational Complexity of Learning Read-Once Formulas over Different Bases Lisa Hellerstein and Marek Karpinski tr-91-014 February 1991 We study the computational complexity of learning read-once formulas over different boolean bases. In particular we design a polynomial time algorithm for learning read-once formulas over a threshold basis. The algorithm works in time O(n**3) using O(n**3) membership queries. By the result of [Angluin, Hellerstein, Karpinski, 1989] on the corresponding unate class of boolean functions, this gives a polynomial time learning algorithm for arbitrary read-once formulas over a threshold basis with negation using membership and equivalence queries. Furthermore, we study the structural notion of nondegeneracy in the threshold formulas generalizing the result of [Heiman, Newman, Wigderson, 1990] on the uniqueness of read-once formulas over different boolean bases and derive a negative result on learnability of nondegenerate read-once formulas over the basis (AND, XOR).

Keywords: Computational Complexity, Learning Algorithms, Read-Once Formulas, Queries. ----- File: 1991/tr-91-015 A Control-Theoretic Approach to Flow Control Srinivasan Keshav tr-91-015 March 1991 This paper presents a control-theoretic approach to reactive flow control in networks that do not reserve bandwidth. We assume a round-robin-like queue service discipline in the output queues of the network's switches, and propose deterministic and stochastic models for a single conversation in a network of such switches. We then construct a standard time-invariant linear model for the simplified dynamics of the system. This is used to design an optimal (Kalman) state estimator, a heuristic second-order state estimator, as well as a provably stable rate-based flow control scheme. Finally, schemes for correcting parameter drift and for coordination with window flow control are described. ----- File: 1991/tr-91-016 Parallel Priority Queues Maria Cristina Pinotti and Geppino Pucci tr-91-016 March 1991 This paper introduces the Parallel Priority Queue (PPQ) abstract data type. A PPQ stores a set of integer-valued items and provides operations such as insertion of n new items or deletion of the n smallest ones. Algorithms for realizing PPQ operations on an n-processor CREW-PRAM are based on two new data structures, the n-Bandwidth-Heap (n-H) and the n-Bandwidth-Leftist-Heap (n-L), that are obtained as extensions of the well known sequential binary-heap and leftist-heap, respectively. Using these structures, it is shown that insertion of n new items in a PPQ of m elements can be performed in parallel time O(h+logn), where h=log(m/n), while deletion of the n smallest items can be performed in time O(h+loglogn). ----- File: 1991/tr-91-017 Optimal Adaptive K-means Algorithm with Dynamic Adjustment of Learning Rate Chedsada Chinrungrueng and Carlo Sequin tr-91-017 March 1991 Adaptive k-means clustering algorithms have been used in several artificial neural network architectures, such as radial basis function networks or feature-map classifiers, for a competitive partitioning of the input domain. This paper presents a modification of the traditional k-means algorithm. It approximates an optimal clustering solution with an efficient adaptive learning rate, which renders it usable even in situations where the statistics of the problem task vary slowly with time. This modification is based on the optimality criterion for the k-means partition stating that all of the regions in the optimal k-means partition have the same "within-cluster variation" when the number of regions in the partition is large and the underlying distribution for generating input patterns is smooth. The within-cluster variation of any cluster is defined as the expectation of the squared Euclidean distance between pattern vectors in that cluster and the center of that cluster. Simulations comparing this improved adaptive k-means algorithm with other k-means variants are presented. ----- File: 1991/tr-91-018 Computational Complexity of Sparse Rational Interpolation Dima Grigoriev, Marek Karpinski, and Michael F. Singer tr-91-018 March 1991 We analyze the computational complexity of sparse rational interpolation, and give the first genuine time algorithm (one whose arithmetic complexity does not depend on the size of the coefficients) for this problem.
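Returning to tr-91-017: the baseline it modifies is plain on-line k-means, sketched below in Python with synthetic data. The decaying 1/n_i learning rate shown here is the textbook choice rather than the report's dynamically adjusted one, and the within-cluster variation being tracked is the quantity that, by the optimality criterion above, should come out roughly equal across clusters.

    import numpy as np

    def adaptive_kmeans(data, k, seed=2):
        rng = np.random.default_rng(seed)
        centers = data[rng.choice(len(data), k, replace=False)].copy()
        counts = np.zeros(k)
        wcv = np.zeros(k)                 # running within-cluster variation
        for x in data:
            i = int(np.argmin(((centers - x) ** 2).sum(axis=1)))
            counts[i] += 1
            lr = 1.0 / counts[i]          # decaying per-cluster learning rate
            wcv[i] += lr * (((x - centers[i]) ** 2).sum() - wcv[i])
            centers[i] += lr * (x - centers[i])
        return centers, wcv               # near the optimum, wcv ~ equal

    rng = np.random.default_rng(3)
    data = np.concatenate([rng.normal(m, 0.3, (300, 2)) for m in (0, 2, 4)])
    centers, wcv = adaptive_kmeans(data, 3)
    print(centers.round(2), wcv.round(3))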

Keywords: Computational Complexity, Algorithms, Arithmetic Complexity, Sparse Rational Interpolation. ----- File: 1991/tr-91-019 Probabilistic Recurrence Relations Richard M. Karp tr-91-019 March 1991 This paper is concerned with recurrence relations that arise frequently in the analysis of divide-and-conquer algorithms. In order to solve a problem instance of size $x$, such an algorithm invests an amount of work $a(x)$ to break the problem into subproblems of sizes $h_1(x),h_2(x),\ldots,h_k(x)$, and then proceeds to solve the subproblems. Our particular interest is in the case where the sizes $h_i(x)$ are random variables; this may occur either because of randomization within the algorithm or because the instances to be solved are assumed to be drawn from a probability distribution. When the $h_i$ are random variables the running time of the algorithm on instances of size $x$ is also a random variable $T(x)$. We give several easy-to-apply methods for obtaining fairly tight bounds on the upper tails of the probability distribution of $T(x)$, and present a number of typical applications of these bounds to the analysis of algorithms. The proofs of the bounds are based on an interesting analysis of optimal strategies in certain gambling games. ----- File: 1991/tr-91-020 The Design of a File System that Supports Multimedia Vassilios G. Polimenis tr-91-020 March 1991 A multimedia file system is one that can support real-time sessions as well as normal disk traffic. When a request for a real-time session is accepted, the file system guarantees that, as long as the system does not crash and the user process reads or writes data at most as fast as the initially specified rate, starvation will never occur.

It is shown that the only hard requirements for the acceptance of a set of real-time sessions are enough disk bandwidth and buffer space. A rigorous discussion of these requirements, as well as of the various parameters that affect the system's behavior, is presented.
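The shape of the resulting admission test is simple; the Python sketch below uses illustrative stand-in formulas, not the thesis's exact conditions: a new session is accepted only if the aggregate rate still fits in the disk bandwidth and the buffers needed to ride out one service round still fit in memory.

    def admit(rates, new_rate, disk_bw, round_len, buffer_pool):
        all_rates = rates + [new_rate]
        if sum(all_rates) > disk_bw:                 # bandwidth requirement
            return False
        needed = sum(2 * r * round_len for r in all_rates)  # double buffering
        return needed <= buffer_pool                 # buffer-space requirement

    active = [1.5e6, 3.0e6]                          # bytes/s already admitted
    print(admit(active, 2.0e6, disk_bw=8e6, round_len=1.0, buffer_pool=16e6))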

Finally and most importantly, a scheduler that uses this theory to schedule the various disk transfers is designed. The scheduler guarantees non-starvation for multimedia sessions and also ensures that interactive (non-real-time) jobs experience acceptable response delays. ----- File: 1991/tr-91-021 Generalized Compact Multigrid (REVISED) Victor Pan and John Reif tr-91-021 December 1992 Extending our recent work, based on the ideas of the multigrid iteration, we decrease the storage space for a smooth solution of a nonlinear PDE and, furthermore, for any smooth function on a multidimensional grid and on discretization sets other than grids. ----- File: 1991/tr-91-022 An (epsilon, delta)--Approximation Algorithm of the Number of Zeros for a Multilinear Polynomial over GF[q] Marek Karpinski and Barbara Lhotzky tr-91-022 March 1991 We construct a polynomial time (epsilon, delta)-approximation algorithm for estimating the number of zeros of an arbitrary multilinear polynomial f(x_1, ..., x_n) over GF[q]. This extends the recent result of Karpinski/Luby [KL90] on approximating the number of zeros of polynomials over the field GF[2]. ----- File: 1991/tr-91-023 On the Average Case Complexity of Parallel Sublist Selection Geppino Pucci and Wolf Zimmermann tr-91-023 March 1991 The "Sublist Selection Problem" (SSP) is the following: Given an input list of nodes labelled True or False, extract the sublist of nodes labelled True. This paper analyzes the average case complexity of a parallel algorithm that solves SSP on the PRAM model of computation. The algorithm is based on the well-known "recursive doubling" technique. Doubly logarithmic upper and lower bounds are derived for the average number of iterations needed to produce the output list, under the assumption that all the nodes of the input list are marked False with probability p, independently of the other nodes. Finally, the exact number of iterations (up to lower order terms) is established in the case that the input list is drawn from the uniform distribution over all possible labelings. ----- File: 1991/tr-91-024 Large Comparison of Rate-Based Service Disciplines Hui Zhang and Srinivasan Keshav tr-91-024 April 1991 This paper compares six new queue service disciplines that are implemented at the output queues of switches in a connection-oriented packet switched data network. These are Virtual Clock, Fair Queueing, Delay-Earliest-Due-Date, Jitter-Earliest-Due-Date, Stop-and-Go and Hierarchical Round Robin. We describe their mechanisms, their similarities and differences, and some implementation strategies. In particular, we show why each discipline can or cannot provide bandwidth, delay and delay jitter guarantees. This leads to some interesting conclusions about the relative strengths and weaknesses of each approach. ----- File: 1991/tr-91-025 Limiting Fault-Induced Output Errors In ANNs Reed D. Clay and Carlo H. Sequin tr-91-025 April 1991 The worst case output errors produced by the failure of a hidden neuron in layered feed-forward ANNs are investigated. These errors can be much worse than simply the loss of the contribution of a neuron whose output goes to zero. A much larger erroneous signal can be produced when the failure sets the value of the hidden neuron to one of the power supply voltages.

A new method is investigated that limits the fractional error in the output signal of a feed-forward net due to such saturated hidden unit faults in analog function approximation tasks. The number of hidden units is significantly increased, and the maximal contribution of each unit is limited to a small fraction of the net output signal. To achieve a large localized output signal, several Gaussian hidden units are moved into the same location in the input domain and the gain of the linear summing output unit is suitably adjusted. Since the contribution of each unit is equal in magnitude, there is only a modest error under any possible failure mode. ----- File: 1991/tr-91-026 [REVISED:] New Resultant Inequalities and Complex Polynomial Factorization (formerly known as "Randomized Incomplete Numerical Factorization of a Polynomial Over the Complex Field") Victor Pan tr-91-026 December 1992 We deduce some new probabilistic estimates on the distances between the zeroes of a polynomial p(x) by using some properties of the discriminant of p(x) and apply these estimates to improve the fastest deterministic algorithm for approximating polynomial factorization over the complex field. ----- File: 1991/tr-91-027 An Approximation Algorithm for the Number of Zeros of Arbitrary Polynomials over GF[q] Dima Grigoriev and Marek Karpinski tr-91-027 April 1991 We design the first polynomial time (for an arbitrary and fixed field GF[q]) (epsilon,delta)-approximation algorithm for the number of zeros of an arbitrary polynomial f(x_1, ... ,x_n) over GF[q]. It gives the first efficient method for estimating the number of zeros and nonzeros of multivariate polynomials over small finite fields other than GF[2] (like GF[3]), a case important for various circuit approximation techniques. The algorithm is based on the estimation of the number of zeros of an arbitrary polynomial f(x_1, ... ,x_n) over GF[q] as a function of the number m of its terms. The bounding ratio is proved to be m**((q-1) log q), which is the main technical contribution of this paper and could be of independent algebraic interest.
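The reason such a bounding ratio matters algorithmically is that it licenses plain random sampling. Below is a sketch of the generic (epsilon,delta) estimator in Python, with a toy polynomial over GF(3) and a standard Chernoff-style sample size; the lower bound mu on the zero fraction is the quantity that a bound like m**((q-1) log q) supplies.

    import math, random

    def approx_count_zeros(f, n, q, eps, delta, mu):
        # Estimate the zero count to within a factor (1+eps) with
        # probability >= 1 - delta, assuming the zero fraction is >= mu.
        trials = math.ceil(3 * math.log(2 / delta) / (eps ** 2 * mu))
        hits = sum(f([random.randrange(q) for _ in range(n)]) == 0
                   for _ in range(trials))
        return (hits / trials) * q ** n

    f = lambda x: (x[0] * x[1]) % 3     # zero on 5 of the 9 points of GF(3)^2
    print(approx_count_zeros(f, n=2, q=3, eps=0.1, delta=0.05, mu=0.5))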

Keywords: Approximation Algorithms, Counting Problems, Multivariate Polynomials, Finite Fields. ----- File: 1991/tr-91-028 The Packet Pair Flow Control Protocol Srinivasan Keshav tr-91-028 May 1991 This paper presents a reactive flow control mechanism for networks that do not reserve bandwidth. We assume a round-robin-like Fair Queueing service discipline in the output queues of switches and routers, which enables us to model a conversation as a sequence of D/D/1 queues. This model is used to derive a rate-based flow control protocol called Packet-pair, or 2P. 2P uses short packet bursts to estimate the service rate of a conversation at its bottleneck, and to adapt its sending rate to the network state. We describe the design and implementation of 2P in detail. Simulations compare the scheme with some well-known flow control schemes in deterministic as well as stochastic scenarios. Analysis and simulations indicate that 2P is able to use available bandwidth efficiently and to achieve low queueing delays, particularly in networks where the bandwidth-delay product is large. Further, 2P responds quickly and correctly to dynamic changes in the network. ----- File: 1991/tr-91-029 On the Decidability Problem for a Topological Syllogistic Involving the Notion of Topological Product Domenico Cantone and Vincenzo Cutello tr-91-029 May 1991 A two-level, multi-sorted language of sets with cartesian product is introduced. The solvability of the satisfiability problem for the corresponding class of unquantified formulae is shown to be useful in order to automatically verify the validity of certain topological statements involving the notion of product of spaces.
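Backing up to tr-91-028: the measurement at the heart of Packet-pair fits in a few lines of Python (the numbers below are invented). Two packets sent back to back leave the bottleneck, which serves the conversation at some rate mu bytes/second, spaced roughly size/mu seconds apart, so the spacing of the returning acknowledgements reveals mu.

    def bottleneck_rate(packet_size, ack_times):
        gaps = [b - a for a, b in zip(ack_times, ack_times[1:])]
        return packet_size / (sum(gaps) / len(gaps))   # bytes per second

    acks = [0.1040, 0.1052, 0.1065]     # ack arrival times in seconds
    print(bottleneck_rate(1000, acks))  # ~8e5 bytes/s estimated service rate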

The underlying motivation for this study is to enrich the class of theoretical results that can be used for a set-theoretic proof verifier. ----- File: 1991/tr-91-030 Probability estimation by feed-forward networks in continuous speech recognition Steve Renals, Nelson Morgan and Herve Bourlard tr-91-030 August 1991 We review the use of feed-forward networks as estimators of probability densities in hidden Markov modelling. In this paper we are mostly concerned with radial basis function (RBF) networks. We note the isomorphism of RBF networks to tied mixture density estimators; additionally we note that RBF networks are trained to estimate posteriors rather than the likelihoods estimated by tied mixture density estimators. We show how the neural network training should be modified to resolve this mismatch. We also discuss problems with discriminative training, particularly the problem of dealing with unlabelled training data and the mismatch between model and data priors. ----- File: 1991/tr-91-031 pSather monitors: Design, Tutorial, Rationale and Implementation Jerome A. Feldman, Chu-Cheow Lim and Franco Mazzanti tr-91-031 September 1989 Sather is a new object-oriented programming language under development at the International Computer Science Institute. The initial beta test release of the language was in June, 1991. From the outset, one goal of the Sather project has been the incorporation of constructs to support parallel programming. pSather is a parallel extension of Sather aimed at shared memory parallel architectures. A prototype of the language is currently being implemented on a Sequent Symmetry and on SUN Sparc-Stations. pSather monitors are one of the basic new features introduced in the language to deal with parallelism. The current design is presented and discussed in detail. ----- File: 1991/tr-91-032 GAL: Networks that grow when they learn and shrink when they forget Ethem Alpaydin tr-91-032 May 1991 Learning when limited to modification of some parameters has a limited scope; the capability to modify the system structure is also needed to get a wider range of the learnable. In the case of artificial neural networks, learning by iterative adjustment of synaptic weights can only succeed if the network designer predefines an appropriate network structure, i.e., number of hidden layers, units, and the size and shape of their receptive and projective fields. This paper advocates the view that the network structure should not, as usually done, be determined by trial-and-error but should be computed by the learning algorithm. Incremental learning algorithms can modify the network structure by addition and/or removal of units and/or links. A survey of current connectionist literature is given on this line of thought. ``Grow and Learn'' (GAL) is a new algorithm that learns an association in one shot, owing to being incremental and using a local representation. During the so-called ``sleep'' phase, units that were previously stored but which are no longer necessary due to recent modifications are removed to minimize network complexity. The incrementally constructed network can later be fine-tuned off-line to improve performance. Another method proposed that greatly increases recognition accuracy is to train a number of networks and vote over their responses. The algorithm and its variants are tested on recognition of handwritten numerals and seem promising, especially in terms of learning speed. This makes the algorithm attractive for on-line learning tasks, e.g., in robotics.
The biological plausibility of incremental learning is also discussed briefly.
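As a rough illustration of the grow and sleep rules described above (a hypothetical minimal sketch, not the author's implementation; the trigger conditions are illustrative assumptions), an incremental nearest-neighbor learner can store a new unit exactly when the existing units misclassify a pattern, and a sleep phase can discard units that the remaining ones make redundant:

    # Hypothetical sketch in the spirit of GAL's grow/sleep rules.
    import math

    class GrowAndLearnSketch:
        def __init__(self):
            self.prototypes = []  # stored (vector, label) units

        def classify(self, x):
            if not self.prototypes:
                return None
            _, label = min(self.prototypes, key=lambda p: math.dist(p[0], x))
            return label

        def learn(self, x, label):
            # one-shot growth: add a unit only on error
            if self.classify(x) != label:
                self.prototypes.append((list(x), label))

        def sleep(self):
            # drop units that the remaining units already classify correctly
            for unit in list(self.prototypes):
                self.prototypes.remove(unit)
                if self.classify(unit[0]) != unit[1]:
                    self.prototypes.append(unit)  # still needed: restore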

Keywords: Incremental learning, supervised learning, classification, pruning, destructive methods, growth, constructive methods, nearest neighbor. ----- File: 1991/tr-91-033 Polymorphic Processor Arrays Massimo Maresca tr-91-033 May 1991 A Polymorphic Processor Array (PPA) is a two-dimensional mesh-connected array of processors, in which each processor is equipped with a switch able to interconnect its four NEWS ports. PPA is an abstract architecture based upon the experience acquired in the design and in the implementation of a VLSI chip, namely the Polymorphic Torus (PT) chip, and, as a consequence, it only includes capabilities that have been proved to be supported by cost-effective hardware structures. The main claims of PPA are that 1) it models a realistic class of parallel computers, 2) it supports the definition of high level programming models, 3) it supports virtual parallelism and 4) it supports low complexity algorithms in a number of application fields. In this paper we present both the PPA computation model and the PPA programming model; we show that the PPA computation model is realistic by relating it to the design of the PT chip and show that the PPA programming model is scalable by demonstrating that any algorithm having O(p) complexity on a virtual PPA of size $\sqrt{m} \times \sqrt{m}$ has O(kp) complexity on a PPA of size $\sqrt{n} \times \sqrt{n}$, with m=kn and k integer. We finally show some application algorithms in the area of numerical analysis and graph processing. ----- File: 1991/tr-91-034 Sather Language Design and Performance Evaluation Chu-Cheow Lim and Andreas Stolcke tr-91-034 May 1991 Sather is an object-oriented language recently designed and implemented at the International Computer Science Institute in Berkeley. It compiles into C and is intended to allow development of object-oriented, reusable software while retaining C's efficiency and portability. We investigate to what extent these goals were met through a comparative performance study and analysis of Sather and C programs on a RISC machine. Several language design decisions in Sather are motivated by the goal of efficient compilation to standard architectures. We evaluate the reasoning behind these decisions, using instruction set usage statistics, cache simulations, and other data collected by instrumented Sather-generated code.

We conclude that while Sather users still pay a moderate overhead for programming convenience (in both run time and memory usage) the overall CPU and memory usage profiles of Sather programs are virtually identical to those of comparable C programs. Our analysis also shows that each of the choices made in Sather design and implementation is well justified by a distinctive performance advantage. It seems, then, that Sather proves the feasibility of its own design goal of making object-oriented programming efficient on standard architectures using a combination of judicious language design and efficient implementation. ----- File: 1991/tr-91-035 HiPNeT-1: A Highly Pipelined Architecture for Neural Network Training Krste Asanovic, Brian E. D. Kingsbury, Nelson Morgan, and John Wawrzynek tr-91-035 June 1991 Current artificial neural network (ANN) algorithms require extensive computational resources. However, they exhibit massive fine-grained parallelism and require only moderate arithmetic precision. These properties make possible custom VLSI implementations for high performance, low cost systems. This paper describes one such system, a special purpose digital VLSI architecture to implement neural network training in a speech recognition application.

The network algorithm has a number of atypical features. These include: shared weights, sparse activation, binary inputs, and a serial training input stream. The architecture illustrates a number of design techniques to exploit these algorithm-specific features. The result is a highly pipelined system which sustains a learning rate of one pattern per clock cycle. At a clock rate of 20MHz each "neuron" site performs 200 million connection updates per second. Multiple such neurons can be integrated onto a modestly sized VLSI die. ----- File: 1991/tr-91-036 Experimental Determination of Precision Requirements for Back-Propagation Training of Artificial Neural Networks Krste Asanovic and Nelson Morgan tr-91-036 June 1991 The impact of reduced weight and output precision on the back-propagation training algorithm is experimentally determined for a feed-forward multi-layer perceptron. In contrast with previous such studies, the network is large with over 20,000 weights, and is trained with a large, real-world data set of over 130,000 patterns to perform a difficult task, that of phoneme classification for a continuous speech recognition system.
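The central operation of such a precision study can be made concrete with a small sketch (an illustration under assumed fixed-point parameters, not the experimental code): reducing values to a given number of fractional bits either by rounding or by truncation, the two reduction modes the study compares:

    # Illustrative fixed-point reduction; frac_bits is an assumed parameter.
    def quantize(x, frac_bits, mode="round"):
        scale = 1 << frac_bits
        v = x * scale
        v = round(v) if mode == "round" else int(v)  # int() truncates toward zero
        return v / scale

    print(quantize(0.30078, 8, "round"), quantize(0.30078, 8, "trunc"))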

The results indicate that 16b weight values are sufficient to achieve training and classification results comparable to 32b floating point, provided that weight and bias values are scaled separately, and that rounding rather than truncation is employed to reduce the precision of intermediary values. Output precision can be reduced to 8 bits without significant effects on performance. ----- File: 1991/tr-91-037 A Brief History of the Association for Women in Mathematics: The Presidents' Perspectives Lenore Blum tr-91-037 June 1991 A talk with transparencies presented at the 20th anniversary celebration of the Association for Women in Mathematics, January, 1991. ----- File: 1991/tr-91-038 Test Complexity of Generic Polynomials Peter Buergisser, Thomas Lickteig and Michael Shub tr-91-038 July 1991 We investigate the complexity of algebraic decision trees deciding membership in a hypersurface $X \subset C^m$. We prove an optimal lower bound on the number of additions, subtractions and comparisons and an asymptotically optimal lower bound on the number of multiplications, divisions and comparisons that are needed to decide membership in a generic hypersurface $X \subset C^m$.

In the situation over the reals, where in addition to equality branching also $\leq$-branching is allowed, we prove an analogous statement for irreducible "generic" hypersurfaces $X \subset R^m$. In the case m=1 we also give a lower bound for finite subsets $X \subset R$. ----- File: 1991/tr-91-039 Verification Complexity of Linear Prime Ideals Peter Buergisser and Thomas Lickteig tr-91-039 July 1991 The topic of this paper is the complexity of algebraic decision trees deciding membership in an algebraic subset $X \subset R^m$ (where R is a real closed or an algebraically closed field). We define a notion of verification complexity of a (real) prime ideal (in a prime cone) which gives a lower bound on the decision complexity. We exactly determine the verification complexity of some prime ideals of linear type, generalizing a result by Winograd [Win-70]. As an application we show uniform optimality with respect to the number of multiplications and divisions needed for two algorithms:

----- File: 1991/tr-91-040 Efficient Visual Search: A Connectionist Solution Subutai Ahmad and Stephen Omohundro tr-91-040 July 1991 Searching for objects in scenes is a natural task for people and has been extensively studied by psychologists. In this paper we examine this task from a connectionist perspective. Computational complexity arguments suggest that parallel feed-forward networks cannot perform this task efficiently. One difficulty is that, in order to distinguish the target from distractors, a combination of features must be associated with a single object. Often called the binding problem, this requirement presents a serious hurdle for connectionist models of visual processing when multiple objects are present. Psychophysical experiments suggest that people use covert visual attention to get around this problem. In this paper we describe a psychologically plausible system which uses a focus of attention mechanism to locate target objects. A strategy that combines top-down and bottom-up information is used to minimize search time. The behavior of the resulting system matches the reaction time behavior of people in several interesting tasks. ----- File: 1991/tr-91-041 Virtual Parallelism Support in Reconfigurable Processor Arrays Massimo Maresca and Hungwen Li tr-91-041 July 1991 Reconfigurable Processor Arrays (RPAs) are a special class of mesh connected computers in which each node is equipped with a switching system able to internally interconnect its NEWS ports and to establish paths between non-neighboring nodes. The best known proposals in the area of RPAs are the Mesh with Reconfigurable Bus [Miller, et al., 1988], the Processor Arrays with Reconfigurable Bus Systems [Wang and Chen, 1990], the Gated Connection Network [Shu and Nash] and the Polymorphic Processor Array [Li and Maresca, 1989]. In this paper we show that only one of these architectures, namely the Polymorphic Processor Array, supports virtual parallelism. The support of virtual parallelism is important because it allows the complexity measurements of the parallel algorithms to be scaled to real implementations, where the size of the processor array can be smaller than the problem size. We demonstrate that: 1) the RPAs that allow the establishment of an arbitrary shape two-dimensional bus do not support virtual parallelism and 2) the Polymorphic Processor Array, with its connection power limited to one-dimensional buses, supports virtual parallelism. ----- File: 1991/tr-91-042 Hierarchical Node Clustering in Polymorphic Processor Arrays Massimo Maresca and Hungwen Li tr-91-042 July 1991 Massively parallel computers are implemented by means of modules at different packaging levels. This paper discusses a hierarchical node clustering scheme (HNC) for packaging a class of reconfigurable processor arrays called Polymorphic Processor Arrays (PPA) which use circuit-switching-based routers at each node to deliver a different topology at every instruction. The PPA family suffers from an unknown signal delay between two arbitrary nodes connected by the circuit-switched paths. This either forces the hardware clock to compromise to the worst signal or makes the software dependent on the system size. The use of the HNC scheme allows one to obtain communication speed-up and automatic control, at the compiler level, over signal propagation delay.
----- File: 1991/tr-91-043 Efficiency of Asynchronous Transfer Mode Networks in Transporting Wide-Area Data Traffic Ramon Caceres tr-91-043 July 1991 For performance and economic reasons, ATM networks must efficiently support the Internet family of protocols. We calculate the transmission efficiency achieved by a range of ATM-related protocols when transporting TCP and UDP wide-area traffic. We also compare the efficiency effects of several non-standard compression techniques. To assure an accurate workload characterization, we drive these calculations with millions of wide-area packet lengths measured on the current Internet.
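The kind of efficiency calculation involved can be sketched as follows (a simplified model with assumed parameters, namely 40 bytes of TCP/IP header, an 8-byte adaptation-layer trailer, and 48-byte cell payloads in 53-byte cells; not the paper's exact parameter set):

    import math

    def atm_efficiency(user_bytes, cell_payload=48, cell_size=53, overhead=40 + 8):
        data = user_bytes + overhead             # bytes handed to the ATM layer
        cells = math.ceil(data / cell_payload)   # padding fills the last cell
        return user_bytes / (cells * cell_size)

    # One extra data byte can cost a whole extra 53-byte cell:
    for n in (512, 528, 529):
        print(n, round(atm_efficiency(n), 3))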

We find that networks using standard ATM procedures are dismally inefficient in carrying traditional data traffic -- depending on the protocols used, efficiency as seen by an application program ranges between 40 and 53 percent. Moreover, due to interaction between TCP/IP datagram lengths and ATM cell padding, efficiency responds abruptly to changes in certain protocol parameters -- for example, a 4-byte increase in ATM cell payload size can yield a 10 percent increase in efficiency. Using one compression technique in isolation can improve efficiency by 12 percent, and simultaneously using three techniques can improve it by 34 percent. These issues should be considered when designing future ATM networks. ----- File: 1991/tr-91-044 VC Dimension and Sampling Complexity of Learning Sparse Polynomials and Rational Functions Marek Karpinski and Thorsten Werther tr-91-044 August 1991 This paper presents recent results on the VC dimension and the sampling complexity of learning sparse polynomials and rational functions. Some direct applications of these results are also presented. ----- File: 1991/tr-91-045 The Automatic Worst Case Analysis of Parallel Programs: Single Parallel Sorting and Algorithms on Graphs Wolf Zimmermann tr-91-045 August 1991 No Abstract. ----- File: 1991/tr-91-046 A Characterization of Space Complexity Classes and Subexponential Time Classes as Limiting Polynomially Decidable Sets Giorgio Ausiello, Marco Protasi and Michele Angelaccio tr-91-046 August 1991 The concept of limiting approximation, originally introduced by Gold for recursive functions, has been previously adapted by the authors to the polynomial level of complexity in order to study complexity classes of sets polynomially computable in the limit. In this paper new results concerning the characterization of space complexity classes (from PSPACE to Grzegorczyk classes) as classes of sets polynomially decidable in the limit are presented. Besides, tight trade-offs between the rate of convergence of the approximating sequences and the constants of their polynomial running time are shown. Finally, the limiting polynomial approximation for classes of sets between P and PSPACE is investigated under the hypothesis that P is different from PSPACE. ----- File: 1991/tr-91-047 CLOS, Eiffel, and Sather: A Comparison Heinz W. Schmidt and Stephen M. Omohundro tr-91-047 September 1991 The Common Lisp Object System defines a powerful and flexible type system which builds on more than 15 years of experience with object-oriented programming. Most current implementations include a comfortable suite of Lisp support tools including an Emacs lisp editor, an interpreter, an incremental compiler, a debugger, and an inspector which together promote rapid prototyping and design. What else might one want from a system? We argue that static typing yields earlier error detection, greater robustness, and higher efficiency and that greater simplicity and more orthogonality in the language constructs lead to a shorter learning curve and more intuitive programming. These elements can be found in Eiffel and a new object-oriented language, Sather, that we are developing at ICSI. Language simplicity and static typing are not for free, though. Programmers have to pay with loss of polymorphism and flexibility in prototyping. We give a short comparison of CLOS, Eiffel and Sather, addressing both language and environment issues.

The different approaches taken by the languages described in this paper have evolved to fulfill different needs. While we have only touched on the essential differences, we hope that this discussion will be helpful in understanding the advantages and disadvantages of each language. ----- File: 1991/tr-91-048 ICSIM: An Object-Oriented Connectionist Simulator Heinz W. Schmidt, and Benedict Gomes tr-91-048 November 1991 ICSIM is a connectionist net simulator under development at ICSI and written in Sather. It is object-oriented to meet the requirements for flexibility and reuse of homogeneous and structured connectionist nets and to allow the user to encapsulate efficient customized implementations perhaps running on dedicated hardware. Nets are composed by combining off-the-shelf library classes and, if necessary, by specializing some of their behaviour. General user interface classes allow a uniform or customized graphic presentation of the nets being modeled.

The report gives an overview of the simulator. Its main concepts, the class structure of its library and some of the design decisions are sketched and a number of example nets are used to illustrate how net structure, interconnection and behavior are defined. ----- File: 1991/tr-91-049 VISIT: An Efficient Computational Model Of Human Visual Attention Subutai Ahmad tr-91-049 September 1991 Thesis One of the challenges for models of cognitive phenomena is the development of efficient and flexible interfaces between low level sensory information and high level processes. For visual processing, researchers have long argued that an attentional mechanism is required to perform many of the tasks required by high level vision. This thesis presents VISIT, a connectionist model of covert visual attention that has been used as a vehicle for studying this interface. The model is efficient, flexible, and biologically plausible. The complexity of the network is linear in the number of pixels. Effective parallel strategies are used to minimize the number of iterations required. The resulting system is able to efficiently solve two tasks that are particularly difficult for standard bottom-up models of vision: computing spatial relations and visual search. Simulations show that the network's behavior matches much of the known psychophysical data on human visual attention. The general architecture of the model also closely matches the known physiological data on the human attention system. Various extensions to VISIT are discussed, including methods for learning the component modules. ----- File: 1991/tr-91-050 Learning Spatial Concepts Using a Partially-Structured Connectionist Architecture Terry Regier tr-91-050 October 1991 This paper reports on the learning of spatial concepts in the L0 project. The challenge of designing an architecture capable of learning spatial concepts from any of the world's languages is first highlighted by reviewing the spatial systems of a number of languages which differ strikingly from English in this regard. A partially structured connectionist architecture is presented which has successfully learned concepts from the languages outlined. In this architecture, highly structured subnetworks, specialized for the spatial concept learning task, feed into an unstructured, fully-connected upper subnetwork. The system's success at the learning task is attributed on the one hand to the constrained search space which results from structuring, and on the other hand to the flexibility afforded by the unstructured upper subnetwork. ----- File: 1991/tr-91-051 Evaluation of Overflow Probabilities in Resource Management Dinesh Chandra Verma and Domenico Ferrari tr-91-051 October 1991 In a number of network and database management applications, we need to evaluate an overflow probability, which is an upper bound on the probability that the capacity of a server will be exceeded. The problem can be essentially reduced to evaluating the probability that the sum of N independent random variables exceeds a given threshold. Evaluation of this probability by brute-force enumeration requires exponential time, so attempts have been made to approximate the overflow probability by using Chernoff bounds. This paper presents a simple scheme that can be used to evaluate the overflow probability with a higher degree of accuracy and lower computational effort than the Chernoff bound approach.
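For concreteness, the Chernoff-bound approach that the proposed scheme is compared against can be sketched as follows (an illustrative special case with independent Bernoulli sources; the parameters and the grid search are assumptions, not the paper's method):

    import math

    # Bound P(X_1 + ... + X_N >= C) for i.i.d. Bernoulli(p) sources by
    # minimizing exp(-s*C) * E[exp(s*X)]**N over s > 0 (grid search here).
    def chernoff_overflow_bound(N, p, C, steps=2000, s_max=20.0):
        best = 0.0                            # log of the trivial bound 1
        for i in range(1, steps + 1):
            s = s_max * i / steps
            mgf = 1 - p + p * math.exp(s)     # E[exp(s*X)] for Bernoulli(p)
            best = min(best, -s * C + N * math.log(mgf))  # work in log space
        return math.exp(best)

    # e.g. 100 sources, each active with probability 0.1, capacity 25:
    print(chernoff_overflow_bound(100, 0.1, 25))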
----- File: 1991/tr-91-052 CHCL--A Connectionist Inference System Steffen Hoelldobler and Franz Kurfess tr-91-052 October 1991 CHCL is a Connectionist inference system for Horn logic which is based on the Connection method and uses Limited resources. This paper gives an overview of the system and its implementation. ----- File: 1991/tr-91-053 Unification with ICSIM Franz Kurfess tr-91-053 August 1991 This document describes the implementation of a distributed unification algorithm using the connectionist simulator ICSIM. The algorithm is based on S. Hoelldobler's work, as described in [Hoelldobler, 1990b]. Unification problems are specified according to a simple language, describing the terms, functions, variables and constants occurring in such a problem; the terms to be unified are represented as <term_1 = term_2> (e.g., <f(x, x, x) = f(g(a), y, g(z))>).
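For comparison with the connectionist construction described in the remainder of the abstract, a conventional sequential unification algorithm over such terms can be sketched in a few lines (a hypothetical illustration: tuples encode compound terms, uppercase strings are variables, and there is no occur check, which the abstract likewise treats as an extension):

    # Sketch of classical first-order unification (most general unifier).
    def is_var(t):
        return isinstance(t, str) and t[:1].isupper()

    def walk(t, subst):
        while is_var(t) and t in subst:
            t = subst[t]
        return t

    def unify(a, b, subst=None):
        subst = {} if subst is None else subst
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            return subst
        if is_var(a):
            subst[a] = b
            return subst
        if is_var(b):
            subst[b] = a
            return subst
        if isinstance(a, tuple) and isinstance(b, tuple) \
                and len(a) == len(b) and a[0] == b[0]:
            for x, y in zip(a[1:], b[1:]):
                subst = unify(x, y, subst)
                if subst is None:
                    return None
            return subst
        return None  # symbol clash

    # The abstract's example <f(x, x, x) = f(g(a), y, g(z))>, with X, Y, Z
    # as variables and a as a constant:
    print(unify(('f', 'X', 'X', 'X'), ('f', ('g', 'a'), 'Y', ('g', 'Z'))))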

A parser extracts relevant information and creates intermediate data structures needed for the construction of the connectionist network. Essential data structures describe the symbols occurring in the terms, the hierarchical structure of the terms (functions and their arguments), and the occurrences of the symbols in the terms. The connectionist unification network is constructed based on these intermediate structures. It is hierarchically organized, its top level NET consisting of POSITIONS, which correspond to the nodes in the term structure. A POSITION consists of a SYMBOL, either of type VARIABLE or CONSTANT. Symbols comprise a TERM UNIT and a number of UNIFICATION UNITS, depending on the number of positions in the terms to be unified. Initially, TERM UNITS are set according to the occurrences of their symbols in the term structure; based on the links within the network and the activation of UNIFICATION UNITS, more TERM UNITS are activated as required by the unification algorithm. The final set of active TERM UNITS is used to construct the most general unifier for the terms to be unified. The network can be easily extended to detect inconsistencies in the term structure or to perform an occur check. ----- File: 1991/tr-91-054 Knowledge Selection with ANNs Dimitris Karagiannis, Franz Kurfess and Heinz-Wilhelm Schmidt tr-91-054 October 1991 (32 Pages) The access to information contained in possibly large knowledge bases is a crucial factor in the usability of such a knowledge base. In this paper, we present a method to select information relevant for a query in knowledge bases where the information is represented in a rule-based way. An approach based on artificial neural networks is used to pre-select the set of relevant rules, thus facilitating the task of the inference mechanism by restricting the search space to be traversed considerably. In addition to the information contained in the query itself, data derived from the environment in which the query is situated is used to further trim down the search space. Sources for this derivation process are data about the task under investigation as well as the history of user interactions.

We refer to the first way of diminishing the search space via the query as "identification"; the second one is referred to as "adaptation", since the selection process is adapted to the current task. The third one, taking into account the history of interactions between user and knowledge base, is called "prediction", aiming at a possible prediction of the next query, or a subset of rules relevant for the next query.

An implementation of the artificial neural networks used for these tasks is based on ICSIM, a connectionist simulator developed at ICSI. ----- File: 1991/tr-91-055 Potentiality of Parallelism in Logic Franz Kurfess tr-91-055 October 1991 The processing of knowledge is becoming a major area of applications for computer systems. In contrast to data processing, the current stronghold of computer use, where well-structured data are manipulated through well-defined algorithms, the treatment of knowledge requires more intricate representation schemes as well as refined methods to manipulate the represented information. Among the many candidates proposed for representing and processing knowledge, logic has a number of important advantages, although it also suffers from some drawbacks. One of the advantages is the availability of a strong formal background with a large assortment of techniques for dealing with the representation and processing of knowledge. A considerable disadvantage so far is the amount and complexity of computation required to perform even simple tasks in the area of logic. One promising approach to overcome this problem is the use of parallel processing techniques, enabling an ensemble of processing elements to cooperate in the solution of a problem. The goal of this paper is to investigate the combination of parallelism and logic. ----- File: 1991/tr-91-056 Distributed Delay Jitter Control in Packet-Switching Internetworks Domenico Ferrari tr-91-056 October 1991 Delay jitter is the variation of the delays with which packets travelling on a network connection reach their destination. For good quality of reception, continuous-media (video, audio, image) streams require that jitter be kept below a sufficiently small upper bound. This paper proposes a distributed mechanism for controlling delay jitter in a packet-switching network. The mechanism can be applied to an internetwork that satisfies the conditions detailed in the paper, and can coexist with other schemes (including the absence of any scheme) for jitter control within the same network, the same node, and even the same real-time channel. The mechanism can guarantee small jitter bounds even when the clocks of the host systems and the gateways along a channel's route are only loosely synchronized; furthermore, it makes the distribution of buffer space requirements more uniform over the channel's route, and reduces by a non-negligible amount the total buffer space needed by a channel. The paper argues that, if these advantages are sufficient to justify the higher costs of the distributed jitter control mechanism with respect to a non-distributed one, it would be useful to offer to the network's users a jitter control service based on the mechanism proposed here. ----- File: 1991/tr-91-057 A Method for Obtaining Randomized Algorithms with Small Tail Probabilities H. Alt, L. Guibas, K. Mehlhorn, R. Karp and A. Wigderson tr-91-057 September 1991 We study strategies for converting randomized algorithms of the Las Vegas type into randomized algorithms with small tail probabilities. ----- File: 1991/tr-91-058 Detecting Skewed Symmetries Stefan Posch tr-91-058 October 1991 Many surfaces of objects in our world are bounded by planar bilaterally symmetric figures. When these figures are imaged under orthographic projection a skewed symmetric contour results. In this paper a new fast, local method to recover skewed symmetries from curve segments is proposed. It can be applied to complete as well as to occluded contours.
Furthermore, the skewed symmetry property is employed to overcome fragmentation of a contour during segmentation. ----- File: 1991/tr-91-059 Line Labeling Using Markov Random Fields Terry Regier tr-91-059 October 1991 The task of obtaining a line labeling from a greyscale image of trihedral objects presents difficulties not found in the classical line labeling problem. As originally formulated, the line labeling problem assumed that each junction was correctly pre-classified as being of a particular junction type (e.g. T, Y, arrow); the success of the algorithms proposed has depended critically upon getting this initial junction classification correct. In real images, however, junctions of different types may actually look quite similar, and this pre-classification is often difficult to achieve. This issue is addressed by recasting the line labeling problem in terms of a coupled probabilistic system which labels both lines and junctions. This results in a robust system, in which prior knowledge of acceptable configurations can serve to overcome the problem of misleading or ambiguous evidence. ----- File: 1991/tr-91-060 Oracle Computations in Parallel Numerical Linear Algebra B. Codenotti, M. Leoncini and G. Resta tr-91-060 October 1991 We analyze the relative complexity of several numerical linear algebra problems, when errors in the computation occur. We show that the simple parallel complexity classes of the exact case do not seem to be preserved under approximation. ----- File: 1991/tr-91-061 Combinatory Differential Fields: An Algebraic Approach to Approximate Computation and Constructive Analysis Karl Aberer tr-91-061 October 1991 The algebraic structure of combinatory differential fields is constructed to provide a semantics for computations in analysis. In this setting programs, approximations, limits and operations of analysis are represented as algebraic terms. Analytic algorithms can be derived by algebraic methods. The main tools in this construction are combinatory models, which are inner algebras of Engeler graph models. As a universal domain of denotational semantics, the lattice structure of the graph models allows one to give a strikingly simple semantics for computations with approximations. As models of combinatory algebra they provide all essential computational constructs, including recursion. Combinatory models are constructed as extensions of first order theories. The classical first order theory to describe analysis is the theory of differential fields. It turns out that two types of computational constructs, namely composition and piecewise definition of functions, are preferably introduced as extensions of the differential fields theory. Combinatory differential fields are then the combinatory models of these enriched differential fields. We show for basic algorithms of computational analysis how their combinatory counterparts are derived in the algebraic setting. We illustrate how these algorithms are suitable to be implemented in a computer algebra environment like Mathematica. ----- File: 1991/tr-91-062 Self-Testing/Correcting with Applications to Numerical Problems (Revised Version) Manuel Blum, Michael Luby, Ronitt Rubinfeld tr-91-062 November 1991 Suppose someone gives us an extremely fast program $P$ that we can call as a black box to compute a function $f$. Should we trust that $P$ works correctly?
A {\em self-testing/correcting pair} for $f$ allows us to: (1) estimate the probability that $P(x) \not= f(x)$ when $x$ is randomly chosen; (2) on {\em any} input $x$, compute $f(x)$ correctly as long as $P$ is not too faulty on average. Furthermore, both (1) and (2) take time only slightly more than the original running time of $P$.
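The flavor of (2), the self-correcting step, can be sketched for a linear function such as f(x) = a*x mod n (an illustrative toy based on random self-reducibility, not the paper's construction; here the faulty program errs at random rather than on a fixed set of inputs):

    import random

    N, A = 2**31 - 1, 48271
    def f(x):                 # the true function, used only for checking
        return (A * x) % N

    def P(x):                 # a program that is wrong about 10% of the time
        return f(x) if random.random() > 0.1 else f(x) + 1

    def self_correct(x, trials=15):
        # Random self-reducibility: f(x) = f(x + r) - f(r) (mod N), so we
        # query P at random points and take a majority vote.
        votes = {}
        for _ in range(trials):
            r = random.randrange(N)
            v = (P((x + r) % N) - P(r)) % N
            votes[v] = votes.get(v, 0) + 1
        return max(votes, key=votes.get)

    print(self_correct(123456789) == f(123456789))  # True with high probability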

We present general techniques for constructing simple-to-program self-testing/correcting pairs for a variety of numerical functions, including integer multiplication, modular multiplication, matrix multiplication, inverting matrices, computing the determinant of a matrix, computing the rank of a matrix, integer division, modular exponentiation and polynomial multiplication. ----- File: 1991/tr-91-063 How to Solve Interval Constraint Networks: The Definitive Answer - Probably Peter Ladkin and Alexander Reinefeld tr-91-063 November 1991 We implemented and tested an algorithm for solving interval constraint problems which returned solutions in less than or equal to 0.5 seconds on the average, with the hardest problem taking less than or equal to 0.5 minutes on a RISC workstation. This is a surprising result considering the problem is known to be NP-complete. We conclude that our algorithm suffices for solving random interval constraint problems in practice.

Other conclusions are that path-consistency is an excellent pruning technique for solution search, which becomes almost a linear selection of atomic labels; also that path-consistency by itself is an excellent consistency heuristic for networks with fewer than six or greater than 15 nodes. We tested the algorithm on over two million randomly generated interval networks of various sizes, hence our title. ----- File: 1991/tr-91-064 Distortion Accumulation in Image Transform Coding/Decoding Cascades Michael Gilge tr-91-064 December 1991 With an increasing number of applications that employ transform coding algorithms for data reduction, the effect of distortion accumulation caused by multiple coding needs to be investigated. Multiple coding occurs when more than one coding system is connected in a cascade. From the second stage on, the coding algorithm operates on data that has been previously coded/decoded. First, a generic image communication system is modelled and situations that can lead to distortion accumulation are analyzed. These results show two main reasons for distortion accumulation, which are separately and jointly investigated using a JPEG-type compression algorithm. The first situation involves geometric operations between the decoding and next coding step. Measurements show, however, that these spatial manipulations are the main contributors to distortion accumulation. The second reason for distortion accumulation is a misalignment of the block segmentation reference point in subsequent transform operations. A block raster detection algorithm is derived that can find the position of the block raster that was introduced in a previous coding step. If this information is used in the block segmentation of the following coding step, distortion accumulation can be avoided. Simulation results are given for an extended algorithm that registers regions of homogeneous block raster in images consisting of several subimages. ----- File: 1991/tr-91-065 Motion Video Coding for Packet-Switching Networks -- An Integrated Approach Michael Gilge and Riccardo Gusella tr-91-065 December 1991 NOTE: This postscript file will preview just fine, but on most postscript printers it will refuse to print past page 4. Hence the .BAD tag. This file is offered AS-IS, and will likely not ever be fixed. The advantages of packet video, constant image quality, service integration and statistical multiplexing, are overshadowed by packet loss, delay and jitter. By integrating network-control into the image data compression algorithm, the strong interactions between the coder and the network can be exploited and the available network bandwidth can be used best. In order to enable video transmission over today's networks without reservation or priorities and in the presence of high packet loss rates, congestion avoidance techniques need to be employed. This is achieved through rate and flow control, where feedback from the network is used to adapt coding parameters and vary the output rate. From the coding point of view the network is seen as a data buffer. Analogously to constant bit rate applications, where a controller measures buffer fullness, we attempt to avoid network congestion (i.e., buffer overflow) by monitoring the network and adapting the coding parameters in real-time. ----- File: 1991/tr-91-066 A Graph-Theoretic Game and its Application to the k-Server Problem Noga Alon, Richard M.
Karp, David Peleg, and Douglas West tr-91-066 December 1991 This paper investigates a zero-sum game played on a weighted connected graph G between two players, the tree player and the edge player. At each play, the tree player chooses a spanning tree T and the edge player chooses an edge e. The payoff to the edge player is cost(T,e), defined as follows: If e lies in the tree T then cost(T,e)=0; if e does not lie in the tree then cost(T,e) = cycle(T,e)/w(e), where w(e) is the weight of edge e and cycle(T,e) is the weight of the unique cycle formed when edge e is added to the tree T. Our main result is that the value of the game on any n-vertex graph is bounded above by \exp(O(\sqrt{\log n \log\log n})).
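The payoff can be computed directly from its definition; a small sketch (with a hypothetical representation of the tree as parent pointers and edge weights) illustrates cost(T,e) for a non-tree edge via the tree path between its endpoints:

    # Sketch: cost(T, e) = (path_weight(u, v) + w(e)) / w(e) for a non-tree
    # edge e = (u, v); tree edges cost 0. parent/wt encode the tree.
    def distances_to_root(parent, wt, x):
        out, d = {}, 0.0
        while x is not None:
            out[x] = d
            if parent[x] is not None:
                d += wt[x]          # wt[x] = weight of edge (x, parent[x])
            x = parent[x]
        return out

    def cost(parent, wt, u, v, w_e):
        if parent.get(u) == v or parent.get(v) == u:
            return 0.0              # e lies in the tree
        du = distances_to_root(parent, wt, u)
        dv = distances_to_root(parent, wt, v)
        path = min(du[x] + dv[x] for x in du if x in dv)  # meet at the LCA
        return (path + w_e) / w_e   # cycle(T, e) / w(e)

    # Path tree a-b-c (unit weights) plus a non-tree edge (a, c) of weight 2:
    parent = {"a": "b", "b": "c", "c": None}
    wt = {"a": 1.0, "b": 1.0}
    print(cost(parent, wt, "a", "c", 2.0))   # (2 + 2) / 2 = 2.0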

The game arises in connection with the k-server problem on a road network; i.e., a metric space that can be represented as a multigraph G in which each edge e represents a road of length w(e). We show that, if the value of the game on G is Val(G,w), then there is a randomized strategy that achieves a competitive ratio of k(1 + Val(G,w)) against any oblivious adversary. Thus, on any n-vertex road network, there is a randomized algorithm for the k-server problem that is k\cdot\exp(O(\sqrt{\log n \log\log n}))-competitive against oblivious adversaries.

At the heart of our analysis of the game is an algorithm that, for any n-vertex weighted, connected multigraph, constructs a spanning tree T such that the average, over all edges e, of cost(T,e) is less than or equal to \exp(O(\sqrt{\log n \log\log n})). This result has potential application to the design of communication networks.

[The on-line copy of this technical report was created from a later version (1992). A revised and expanded version of the paper appeared in the SIAM J. on Computing, Volume 24, (1995), pages 78-100.] ----- File: 1991/tr-91-067 Probabilistic Recurrence Relations for Parallel Divide-and-Conquer Algorithms Marek Karpinski and Wolf Zimmermann tr-91-067 December, 1991 We study two probabilistic recurrence relations that arise frequently in the analysis of parallel and sequential divide-and-conquer algorithms (cf. [Karp 91]). Suppose a problem of size x has to be solved. In order to solve it we divide it into subproblems of size h_1(x), ..., h_k(x) and these subproblems are solved recursively. We assume that the sizes h_i(x) are random variables. This occurs if either the break-up step is randomized or the instances to be solved are drawn from a probability distribution. The running time T(x) of a parallel algorithm is therefore determined by the maximum of the running times T(h_i(x)) of the subproblems, while that of the sequential algorithm is determined by the sum of the running times of the subproblems. We give a method for estimating tight upper bounds on the probability distribution of T(x) for these two kinds of recurrence relations, answering the open questions in [Karp 91].
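A Monte Carlo sketch makes the two recurrences concrete (an illustration only, not the paper's analytical method; the random binary break-up rule is an assumption):

    import random

    def sample_time(x, combine):
        # T(x) = 1 + combine over T(h_i(x)) with a random binary split;
        # combine = max models the parallel time, combine = sum the sequential.
        if x <= 1:
            return 1.0
        k = random.randrange(x)              # quicksort-like break-up
        return 1.0 + combine(sample_time(s, combine) for s in (k, x - 1 - k))

    parallel   = [sample_time(2000, max) for _ in range(100)]
    sequential = [sample_time(2000, sum) for _ in range(100)]
    print(max(parallel), max(sequential))    # empirical tails of T(x)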

Keywords: Probabilistic Recurrence Relations, Divide-and-Conquer Algorithms, Parallel Algorithms, Upper Bounds on Probability Distribution. ----- File: 1991/tr-91-068 Construction of a pseudo-random generator from any one-way function Johan Hastad, Russell Impagliazzo, Leonid A. Levin, Michael Luby tr-91-068 December 1991 We show how to construct a pseudo-random generator from any one-way function. In contrast, previous works have constructed pseudo-random generators only from one-way functions with special structural properties. Our overall approach is different in spirit from previous work; we concentrate on extracting and smoothing entropy from a single iteration of the one-way function using universal hash functions. ----- File: 1991/tr-91-069 RASTA-PLP Speech Analysis Hynek Hermansky, Nelson Morgan, Aruna Bayya, and Phil Kohn tr-91-069 December 1991 Most speech parameter estimation techniques are easily influenced by the frequency response of the communication channel. We have developed a technique that is more robust to such steady-state spectral factors in speech. The approach is conceptually simple and computationally efficient. The new method is described, and experimental results are reported, showing a significant advantage for the proposed method. ----- File: 1991/tr-91-070 Connectionist Speech Recognition: Status and Prospects Steve Renals, Nelson Morgan, Herve Bourlard, Michael Cohen, Horacio Franco, Chuck Wooters and Phil Kohn tr-91-070 December 1991 We report on recent advances in the ICSI connectionist speech recognition project. Highlights include:

1. Experimental results showing that connectionist methods can improve the performance of a context independent maximum likelihood trained HMM system, resulting in a performance close to that achieved using state of the art context dependent HMM systems of much higher complexity;

2. Mixing (context independent) connectionist probability estimates with maximum likelihood trained context dependent models to improve the performance of a state of the art system;

3. The development of a network decomposition method that allows connectionist modelling of context dependent phones efficiently and parsimoniously, with no statistical independence assumptions.

----- File: 1991/tr-91-071 GDNN: A Gender-Dependent Neural Network for Continuous Speech Recognition Yochai Konig, Nelson Morgan, and Claudia Chandra tr-91-071 December 1991 Conventional speaker-independent speech recognition systems do not consider speaker-dependent parameters in the probability estimation of phonemes. These recognition systems are instead tuned to the ensemble statistics over many speakers. Most parametric representations of speech, however, are highly speaker dependent, and probability distributions suitable for a certain speaker may not perform as well for other speakers. It would be desirable to incorporate constraints on analysis that rely on the same speaker producing all the frames in an utterance. Our experiments take a first step towards this speaker consistency modeling by using a classification network to help generate gender-dependent phonetic probabilities for a statistical recognition system. Our results show a good classification rate for the gender classification net. Simple use of such a model to augment an existing larger network that estimates phonetic probabilities does not help speech recognition performance. However, when the new net is properly integrated in an HMM recognizer, it provides significant improvement in word accuracy. ----- File: 1991/tr-91-072 SPERT: A VLIW/SIMD Microprocessor for Artificial Neural Network Computations Krste Asanovic, James Beck, Brian E. D. Kingsbury, Phil Kohn, Nelson Morgan, John Wawrzynek tr-91-072 December 1991 SPERT (Synthetic PERceptron Testbed) is a fully programmable single chip microprocessor designed for efficient execution of artificial neural network algorithms. The first implementation will be in a 1.2 micron CMOS technology with a 50MHz clock rate, and a prototype system is being designed to occupy a double SBus slot within a Sun Sparcstation.

SPERT will sustain over 300 million connections per second during pattern classification, and around 100 million connection updates per second while running the popular error backpropagation training algorithm. This represents a speedup of around two orders of magnitude over a Sparcstation-2 for algorithms of interest. An earlier system produced by our group, the Ring Array Processor (RAP), used commercial DSP chips. Compared with a RAP multiprocessor of similar performance, SPERT represents over an order of magnitude reduction in cost for problems where fixed-point arithmetic is satisfactory.

This report describes the current architecture, and gives the results of detailed simulations. The report also makes a short comparison to other high-performance digital neurocomputing chips. ----- File: 1991/tr-91-073 Connectionist Layered Object-Oriented Network Simulator (CLONES): User's Manual Phil Kohn tr-91-073 December 1991 CLONES is an object-oriented library for constructing, training and utilizing layered connectionist networks. The CLONES library contains all the object classes needed to write a simulator with a small amount of added source code (examples are included). The size of experimental ANN programs is greatly reduced by using an object-oriented library; at the same time these programs are easier to read, write and evolve. The library includes database, network behavior and training procedures that can be customized by the user. It is designed to run efficiently on data parallel computers (such as the RAP [6] and SPERT [1]) as well as uniprocessor workstations. While efficiency and portability to parallel computers are the primary goals, there are several secondary design goals:

1. minimize the learning curve for using CLONES,

2. minimize the additional code required for new experiments,

3. allow heterogeneous algorithms and training procedures to be interconnected and trained together.

Within these constraints we attempt to maximize the variety of artificial neural network algorithms that can be supported. ----- File: 1991/tr-91-074 Recent Work in VLSI Elements for Digital Implementations of Artificial Neural Networks Brian E. D. Kingsbury, Bertrand Irissou, Krste Asanovic, John Wawrzynek, Nelson Morgan tr-91-074 December 1991 A family of high-performance, area-efficient VLSI elements is being developed to simplify the design of artificial neural network processors. The libraries are designed around the MOSIS Scalable CMOS design rules, giving users the option of fabricating designs in 2.0um or 1.2um n-well processes, and greatly simplifying migration of the libraries to new MOSIS technologies. To date, libraries and generators have been created for saturating and nonsaturating adders, a two's-complement multiplier, and a triple-ported register file. The SPERT processor currently being designed at ICSI will be based upon these libraries, and is expected to run at 50 MHz when realized in a 1.2um CMOS technology. ----- File: 1991/tr-91-075 Incomplete Factorizations for Certain Toeplitz matrices C. Bernini, B. Codenotti, M. Leoncini and G. Resta tr-91-075 December 1991. We propose some incomplete factorizations for banded Toeplitz matrices and we show their application to the direct and iterative solution of several special Toeplitz linear systems. ----- File: 1992/tr-92-001 Real-Time Communication in an Internetwork; Domenico Ferrari tr-92-001 January 1992 Can end-to-end communication performance be guaranteed by a packet-switching internetwork? This paper addresses the question by examining the feasibility of extending to an internetwork the Tenet approach to real-time communication service design. The conditions to be satisfied by an internetwork so that the approach can be extended to it are investigated. These include conditions for the scheduling discipline to be used in the nodes of the internetwork.

The original Tenet approach to real-time communication applies to a network consisting of hosts, homogeneous nodes (or switches), and physical links connecting nodes and hosts in an arbitrary topology. The nodes are store-and-forward, and are scheduled by a multi-class version of the Earliest Due Date deadline-based policy.
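A minimal sketch of such a deadline-based discipline (an illustration of Earliest Due Date scheduling in general, not the Tenet implementation; the field names are assumptions):

    import heapq

    class EDDScheduler:
        # Serve packets in order of local deadline = arrival + per-channel bound.
        def __init__(self):
            self._heap, self._seq = [], 0   # seq breaks ties FIFO-style

        def enqueue(self, packet, arrival, channel_delay_bound):
            deadline = arrival + channel_delay_bound
            heapq.heappush(self._heap, (deadline, self._seq, packet))
            self._seq += 1

        def dequeue(self):
            return heapq.heappop(self._heap)[2] if self._heap else None

    q = EDDScheduler()
    q.enqueue("audio-1", arrival=0.0, channel_delay_bound=0.01)  # tight bound
    q.enqueue("bulk-1", arrival=0.0, channel_delay_bound=0.5)    # loose bound
    print(q.dequeue())   # audio-1 is served first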

The discussion presented in this paper results in extendibility conditions that are quite broad; hence, the Tenet approach may be used to establish and run real-time channels in a vast class of internetworks. A case study is also discussed, involving a simple network, whose nodes are scheduled by FCFS-based disciplines, and the connection of such a network to an internetwork with deadline-based and hierarchical round robin scheduling. ----- File: 1992/tr-92-002 Constraint Relaxation and Nonmonotonic Reasoning Gerhard Brewka, Hans Werner Guesgen, Joachim Hertzberg tr-92-002 January 1992 The purpose of this paper is to bring together the two AI areas of constraint-based and nonmonotonic reasoning. In particular, we analyze the relation between different forms of constraint relaxation and a particular approach to nonmonotonic reasoning, namely, preferred subtheories. In effect, we provide formal semantics for the respective forms of constraint relaxation. ----- File: 1992/tr-92-003 Rate-Controlled Static Priority Queueing Hui Zhang and Domenico Ferrari tr-92-003 January, 1992 We propose a new service discipline, called the Rate-Controlled Static-Priority (RCSP) queueing discipline, that can provide throughput, delay, delay jitter, and loss free guarantees in a connection-oriented packet-switching network. The proposed RCSP queueing discipline avoids problems in previously proposed solutions. It achieves flexibility in the allocation of delay and bandwidth, as well as simplicity of implementation. The key idea is to separate rate-control and delay-control functions in the design of the server. Applying this separation of functions will result in a class of service disciplines, of which RCSP is an instance. ----- File: 1992/tr-92-004 Best-First Model Merging for Dynamic Learning and Recognition Stephen M. Omohundro tr-92-004 January 1992 "Best-first model merging" is a general technique for dynamically choosing the structure of a neural or related architecture while avoiding overfitting. It is applicable to both learning and recognition tasks and often generalizes significantly better than fixed structures. We demonstrate the approach applied to the tasks of choosing radial basis functions for function learning, choosing local affine models for curve and constraint surface modelling, and choosing the structure of a balltree or bumptree to maximize efficiency of access. ----- File: 1992/tr-92-005 New algorithmic results for lines-in-3-space problems Leonidas J. Guibas and Marco Pellegrini tr-92-005 January 1992 In the first part of the report we consider some incidence and ordering problems for lines in 3-space. We solve the problem of detecting efficiently if a query simplex is collision-free among polyhedral obstacles. In order to solve this problem we develop new on-line data structures to detect intersections of query halfplanes with sets of lines and segments.

Then, we consider the nearest-neighbor problems for lines. Given a set of $n$ lines in 3-space, the shortest vertical segment between any pair of lines is found in randomized expected time $O(n^{8/5+\epsilon})$, for every $\epsilon>0$. The longest connecting vertical segment is found in time $O(n^{4/3+\epsilon})$. The shortest connecting segment is found in time $O(n^{5/3+\epsilon})$.

Problems involving lines, points and spheres in 3-space have important applications in graphics, CAD and optimization. In the second part of the report we consider several problems of this kind. We give subquadratic algorithms to count the number of incidences between a set of lines and a set of spheres, and to find the minimum distance between a set of lines and a set of points. We show that the sphere of minimum radius intersecting every line in a set of $n$ lines can be found in optimal expected time $O(n)$. Given $m$ possibly intersecting spheres we solve ray-shooting queries in $O(\log^2 m)$ time using a data structure of size $O(m^{5+\epsilon})$.

This technical report collects part of the second author's work at I.C.S.I. from September 1991 to January 1992. ----- File: 1992/tr-92-006 The LOGIDATA+ Object Algebra Umberto Nanni, Silvio Salza, Mario Terranova tr-92-006 February 1992 In this paper we present the LOGIDATA+ Object Algebra (LOA), an algebra for complex objects which has been developed within the LOGIDATA project funded by the Italian National Research Council (CNR). LOGIDATA+ is intended to provide a rule based language on a data model with structured data types, object identity and sharing. LOA is a set-oriented manipulation language which was conceived as an internal language for a prototype system supporting such a rich environment. The algebra refers to a data model that includes structured data types and object identity, thus allowing both classes of objects and value-based relations.

LOA must deal with a rule based language with possible recursive programs with limited forms of negation. LOA programs explicitly include a "fixpoint" operator over a set of algebraic equations. Figures are omitted in the ftp-able version of the paper. A complete version is available from ICSI. ----- File: 1992/tr-92-007 The LOGIDATA+ Prototype System Umberto Nanni, Silvio Salza, Mario Terranova tr-92-007 February 1992 In this paper we present a prototype system developed within LOGIDATA+, a national project funded by the Italian National Research Council (CNR). The prototype supports a rule based language on a data model with structured data types, object identity and sharing. The system has an interactive user interface, with a unit of interaction consisting of a LOGIDATA+ program, to extract information from the knowledge base and/or modify the schema. A program consists of a set of rules, and of additional directives to handle the data output and/or the updates to the schema. The prototype handles a temporary (user) environment where updates are performed and a permanent one, updated on request. The system uses LOA (LOGIDATA+ Object Algebra) as an intermediate internal language (see ICSI #tr-92-006.ps.gz). User programs are translated into LOA programs, i.e. sequences of fixpoint systems of algebraic equations. The prototype is built on top of a relational DBMS, that handles SQL transactions and provides the basic support for the permanent storage of data as well as for concurrency control and recovery. A main memory database has been included in the architecture, to improve the performance in the evaluation of the fixpoint systems, by keeping the intermediate results in main memory. Figures are omitted in the ftp-able version of the paper. A complete version is available from ICSI. ----- File: 1992/tr-92-008 Linear Time Algorithms for Liveness and Boundedness in Conflict-free Petri Nets Paola Alimonti, Esteban Feuerstain, Umberto Nanni tr-92-008 February 1992 In this paper we consider the problems of deciding the set of potentially firable transitions, the liveness and boundedness for the class of Conflict-Free Petri Nets. For these problems we propose algorithms which are linear in the size of the description of the net, dramatically improving the best previously known results for these problems. Moreover the algorithm for the first problem is incremental: it is possible to perform an arbitrary sequence of updates, introducing new transitions and increasing the initial marking of the net, and queries, asking whether any transition is firable or any place reachable. Queries are answered in constant time, and the total cost for all the modifications is still linear in the size of the final net. Our approach is based on a representation of conflict-free Petri nets by means of directed hypergraphs. Figures are omitted in the ftp-able version of the paper. A complete version is available from ICSI. ----- File: 1992/tr-92-009 Fish in Schools or Fish in Cans Evolutionary Thinking and Formalization Dirk Siefkes tr-92-009 February 1992 Gregory Bateson maintains that individual development and natural evolution follow the same principles -- he parallels learning and evolution. I try to establish the precise mechanism of human learning by attributing the role of genes to concepts. We develop our thoughts conceptually through selection, in the same way that living beings develop genetically.
Thus, thoughts evolve in our mind like fish in a cove, thoughts yielding concepts as the genetic material from which new thoughts arise. ----- File: 1992/tr-92-010 A New Algorithm for Counting Circular Arc Intersections Marco Pellegrini tr-92-010 February 1992 We discuss the following problem: given a collection $\Gamma$ of $n$ circular arcs in the plane, count all intersections between arcs of $\Gamma$. We present an algorithm whose expected running time is $O(n^{3/2+\eps})$, for every $\eps >0$. If the arcs have all the same radius the expected time bound is $O(n^{4/3+\eps})$, for every $\eps>0$. Both results improve on the time bounds of previously known asymptotically fastest algorithms. The technique we use is quite general and it is applicable to other counting problems. ----- File: 1992/tr-92-011 The Weighted List Update Problem and the Lazy Adversary Fabrizio d'Amore, Alberto Marchetti-Spaccamela, Umberto Nanni tr-92-011 February 1992 The "List Update Problem" consists in maintaining a dictionary as an unsorted linear list. Any request specifies an item to be found by sequential scanning through the list. After an item has been found, the list may be rearranged in order to reduce the cost of processing a "sequence" of requests.

Several kinds of adversaries can be considered to analyze the behavior of heuristics for this problem. The "Move-to-Front" (MTF) heuristic is 2-competitive against a "strong" adversary, matching the deterministic lower bound for this problem [21].

But, for this problem, moving elements does not help the adversary. A "lazy" adversary has the limitation that he can use only a static arrangement of the list to process (off-line) the sequence of requests: still, no algorithm can be better than 2-competitive against the lazy adversary [3].

In this paper we consider the "Weighted List Update Problem" (WLUP), where the cost of accessing an item depends on the item itself. It is shown that MTF is not competitive by any constant factor for this problem against a lazy adversary. Two heuristics, based on the MTF strategy, are presented for WLUP: "Random Move-to-Front" is randomized and uses biased coins; "Counting Move-to-Front" is deterministic, and replaces coins by counters. Both are shown to be 2-competitive against a lazy adversary. This is optimal for the deterministic case.
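A sketch of the counting idea (the trigger rule below, moving an item to the front once its counter reaches the item's weight, is an assumed stand-in for the paper's exact rule):

    class CountingMTFSketch:
        # Weighted list update: accessing x costs the weights scanned up to x.
        def __init__(self, items, weight):
            self.items, self.weight = list(items), dict(weight)
            self.count = {x: 0 for x in self.items}

        def access(self, x):
            cost = 0
            for y in self.items:
                cost += self.weight[y]
                if y == x:
                    break
            self.count[x] += 1
            if self.count[x] >= self.weight[x]:   # assumed trigger rule
                self.count[x] = 0
                self.items.remove(x)
                self.items.insert(0, x)           # move to front
            return cost

    lst = CountingMTFSketch("abc", {"a": 3, "b": 1, "c": 2})
    print([lst.access(ch) for ch in "cbcc"])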

We apply this approach to searching items in a tree, proving that any c-competitive heuristic for the weighted list update problem provides a c-competitive heuristic for the "Tree Update Problem". ----- File: 1992/tr-92-012 Towards a Complexity Theory for Approximation Karl Aberer and Bruno Codenotti tr-92-012 February 1992 This paper presents a novel approach to the analysis of numerical problems, which is closely related to the actual nature of numerical algorithms. In fact, models of computation are introduced which take into account such issues as adaptivity and error. Moreover, complexity vs error bounds and examples regarding the role of adaptivity are provided. Finally, it is shown that the overall approach fits naturally into an algebraic framework. ----- File: 1992/tr-92-013 Competitive On-line Algorithms for Paging and Graph Coloring Sandy Irani tr-92-013 January 1992 We analyze the competitiveness of on-line algorithms for two problems: paging and on-line graph coloring. In the first problem, we develop a refinement of competitive analysis for paging algorithms which addresses some of the areas where traditional competitive analysis fails to represent what is observed in practice. For example, traditional competitive analysis is unable to discern between LRU and FIFO, although in practice LRU performs much better than FIFO. In addition, the theoretical competitiveness of LRU is much more pessimistic than what is observed in practice. We also address the following important question: given some knowledge of a program's reference pattern, can we use it to improve paging performance on that program?

We address these concerns by introducing an important practical element that underlies the philosophy behind paging: locality of reference. We devise a graph-theoretical model, the access graph, for studying locality of reference.
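
For illustration only (the access-graph machinery itself is not reproduced here), the sketch below counts page faults for LRU and FIFO on one reference string with strong locality; the cache size and reference string are hypothetical.

    # Illustrative sketch: page faults of LRU and FIFO on the same
    # reference string. Both are k-competitive in the classical model,
    # yet on sequences with locality their observed fault counts differ.
    from collections import OrderedDict, deque

    def lru_faults(refs, k):
        cache, faults = OrderedDict(), 0
        for p in refs:
            if p in cache:
                cache.move_to_end(p)           # refresh recency
            else:
                faults += 1
                if len(cache) == k:
                    cache.popitem(last=False)  # evict least recently used
                cache[p] = True
        return faults

    def fifo_faults(refs, k):
        cache, queue, faults = set(), deque(), 0
        for p in refs:
            if p not in cache:
                faults += 1
                if len(cache) == k:
                    cache.discard(queue.popleft())  # evict oldest arrival
                cache.add(p)
                queue.append(p)
        return faults

    refs = [1, 2, 3, 1, 2, 4, 1, 2, 5, 1, 2, 3] * 10  # locality on 1, 2
    print(lru_faults(refs, 3), fifo_faults(refs, 3))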

The second problem that we consider is on-line graph coloring. In the spirit of competitiveness, we evaluate on-line graph coloring algorithms by their performance ratio, which measures the number of colors the algorithm uses in comparison to the chromatic number of the graph. We consider the class of d-inductive graphs. A graph G is d-inductive if the vertices of G can be numbered so that each vertex has at most d edges to higher numbered vertices. We analyze the greedy First Fit (FF) algorithm and show that if G is d-inductive then FF uses O(d log n) colors on G. We show that this bound is tight. Since planar graphs are 5-inductive, and chordal graphs are c(G)-inductive (where c(G) is the chromatic number of the graph G), our results yield bounds on the performance ratio of greedy on these important classes of graphs. We also examine on-line graph coloring with lookahead. An algorithm is on-line with lookahead l if it must color vertex i after examining only the first l+i vertices. We show that for l < (n / log n) no on-line algorithm with lookahead l can perform better than First Fit on d-inductive graphs.
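
A minimal sketch of the First Fit rule discussed above, assuming each vertex arrives together with its edges to previously presented vertices; the example graph (a path, which is 1-inductive) is hypothetical.

    # Illustrative sketch: on-line First Fit (greedy) coloring. Each vertex
    # receives the smallest color not used by its already-colored neighbors.
    # On a d-inductive graph, FF uses O(d log n) colors in any order.
    def first_fit(vertices, earlier_neighbors):
        color = {}
        for v in vertices:
            used = {color[u] for u in earlier_neighbors[v]}
            c = 0
            while c in used:
                c += 1
            color[v] = c
        return color

    # A path a-b-c-d is 1-inductive; FF 2-colors it in this order.
    nbrs = {'a': [], 'b': ['a'], 'c': ['b'], 'd': ['c']}
    print(first_fit('abcd', nbrs))  # {'a': 0, 'b': 1, 'c': 0, 'd': 1}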

Keywords: on-line algorithms, competitive analysis, paging, locality of reference, on-line graph coloring, lookahead. ----- File: 1992/tr-92-014 Backwards Analysis of Randomized Geometric Algorithms Raimund Seidel tr-92-014 February 1992 The theme of this paper is a rather simple method that has proved very potent in the analysis of the expected performance of various randomized algorithms and data structures in computational geometry. The method can be described as ``analyze a randomized algorithm as if it were running backwards in time, from output to input.'' We apply this type of analysis to a variety of algorithms, old and new, and obtain solutions with optimal or near optimal expected performance for a plethora of problems in computational geometry, such as computing Delaunay triangulations of convex polygons, computing convex hulls of point sets in the plane or in higher dimensions, sorting, intersecting line segments, linear programming with a fixed number of variables, and others. ----- File: 1992/tr-92-015 Queueing Delays in Rate Controlled Networks Anindo Banerjea and Srinivasan Keshav tr-92-015 March 1992 This paper addresses the problem of finding the worst case end-to-end delay and buffer occupancy bounds in networks of rate-controlled, non-work conserving servers.

The calculations are based on a simple fluid model, but care is taken so that the computed delay and buffer occupancy values are upper bounds on actual values. A simple algorithm is presented to perform these calculations in linear time.
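
The report's algorithm is not reproduced here, but the flavor of such a linear-time bound can be sketched under assumed per-node parameters: in a fluid model, a rate-controlled node that must drain a worst-case backlog of b bits at a reserved rate of r bits per second delays traffic by at most b/r seconds, and summing per-node terms along the path gives an end-to-end upper bound in one pass.

    # A minimal sketch under assumed parameters (not the report's exact
    # algorithm): each node contributes at most backlog/rate seconds of
    # delay; one linear pass over the path sums the per-node bounds.
    def end_to_end_delay_bound(path):
        """path: list of (max_backlog_bits, reserved_rate_bps) per node."""
        return sum(b / r for b, r in path)

    # Hypothetical three-hop channel.
    print(end_to_end_delay_bound([(8000, 1e6), (16000, 2e6), (8000, 1e6)]))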

Simulation results compare the computed worst case delays with the actual delays obtained on some simple network topologies. The algorithm is found to predict node delays well for bursty input traffic, but poorly for smooth input traffic. Buffer requirements are predicted well in both cases. ----- File: 1992/tr-92-016 A Framework for the Study of Pricing in Integrated Networks Colin J. Parris, Srinivasan Keshav, and Domenico Ferrari tr-92-016 March 1992 Integrated networks of the near future are expected to provide a wide variety of services, which could consume widely differing resources. We present a framework for pricing services in integrated networks, and study the effect of pricing on user behavior and network performance. We first describe a network model that is simple, yet models details such as the wealth distribution in society, different classes of service, peak and off-peak traffic and call blocking due to budgetary constraints.

We then perform experiments to study the effect of setup, per packet, and peak load prices on the blocking probability of two classes of calls passing through a single node enforcing admission control. Some selected results are that (a) increasing prices first increases the net revenue to a provider, then causes a decrease; and (b) peak-load pricing spreads network utilization more evenly, raising revenue while simultaneously reducing call blocking probability.
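
For illustration only, with hypothetical prices and usage figures (the paper's experimental parameters are not reproduced): a call's charge under a tariff combining a setup price, a per-packet price, and a peak-load surcharge.

    # Illustrative sketch with hypothetical prices: the peak surcharge
    # applies only to packets sent during assumed peak hours.
    def call_charge(setup, per_packet, peak_surcharge, packets_by_hour,
                    peak_hours=range(9, 17)):
        charge = setup
        for hour, n in packets_by_hour.items():
            rate = per_packet + (peak_surcharge if hour in peak_hours else 0.0)
            charge += n * rate
        return charge

    usage = {8: 500, 10: 2000, 14: 1500, 20: 800}   # hour -> packets sent
    print(call_charge(setup=1.0, per_packet=0.001, peak_surcharge=0.002,
                      packets_by_hour=usage))        # 12.8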

Finally, we introduce a novel metric for comparing pricing schemes, and prove that for the most part, a pricing scheme involving setup prices is better than a pricing scheme with no setup cost. ----- File: 1992/tr-92-017 The Sather Language and Libraries Stephen Omohundro and Chu-Cheow Lim tr-92-017 March 1992 Sather is an object-oriented language derived from Eiffel which is particularly well suited for the needs of scientific research groups. It is designed to be very efficient and simple while supporting strong typing, garbage collection, object-oriented dispatch, multiple inheritance, parameterized types, and a clean syntax. It compiles into portable C code and easily links with existing C code. The compiler, debugger and several hundred library classes are freely available by anonymous FTP. This paper describes aspects of the language design, implementation and libraries. ----- File: 1992/tr-92-018 A Resource Based Pricing Policy for Real-Time Channels in a Packet-Switching Network Colin J. Parris and Domenico Ferrari tr-92-018 March 1992 In the packet switching networks of the future the need for guaranteed performance on a wide variety of traffic characteristics will be of paramount importance. The generation of revenue, to recover costs and provide profit, and the multiple types of services offered will require that new pricing policies be implemented.

This paper presents a resource based pricing policy for real-time channels (i.e., channels with guaranteed performance) in a packet switching network. The policy is based on a set of specific criteria, and the charges for any channel are based on the resources reserved for use by the channel. This reservation charge is based on the type of service requested, the time of day during which the channel exists, and the lifetime of the channel. We argue that the traditional resources are not sufficient to determine a fair reservation charge for a channel offering guaranteed delay bounds, and we introduce the notion of a delay resource in our charging formula. The type of service requested is thus characterized by the amount of the bandwidth, buffer space, CPU, and delay resources reserved. The analysis of this pricing policy is reduced to the analysis of a single node of the network, assuming a homogeneous network. This single-node characteristic increases the scalability and flexibility of the policy. An example of an implementation of this policy is provided. ----- File: 1992/tr-92-019 Design of a Continuous Media Data Transport Service and Protocol Mark Moran and Bernd Wolfinger tr-92-019 April 1992 Applications with real-time data transport requirements fall into two categories: those which require transmission of data units at regular intervals, which we call continuous media (CM) clients, e.g. video conferencing, voice communication, high-quality digital sound; and those which generate data for transmission at relatively arbitrary times, which we call real-time message-oriented clients. Because CM clients are better able to characterize their future behavior than message-oriented clients, a data transport service dedicated to CM clients can use this a priori knowledge to more accurately predict their future resource demands. Therefore, a separate transport service can potentially provide a more cost-effective service along with additional functionality to support CM clients. The design of such a data transport service for CM clients and its underlying protocol (within the BLANCA gigabit testbed project) will be presented in this document. This service provides unreliable, in-sequence transfer (simplex, periodic) of so-called stream data units (STDUs) between a sending and a receiving client, with performance guarantees on loss, delay, and throughput. ----- File: 1992/tr-92-020 Read-Once Threshold Formulas, Justifying Assignments, and Generic Transformations Nader H. Bshouty, Thomas R. Hancock, Lisa Hellerstein, Marek Karpinski tr-92-020 March, 1992 We present a membership query (i.e. interpolation) algorithm for exactly identifying the class of read-once formulas over the basis of boolean threshold functions. Using a generic transformation from [Angluin, Hellerstein, Karpinski 89], this gives an algorithm using membership and equivalence queries for exactly identifying the class of read-once formulas over the basis of boolean threshold functions and negation. We also present a series of generic transformations that can be used to convert an algorithm in one learning model into an algorithm in a different model.

Keywords: Learning Algorithms, Queries, Read-Once Formulas, Threshold Functions. ----- File: 1992/tr-92-021 Local Properties of Some NP-Complete Problems Bruno Codenotti and Luciano Margara tr-92-021 April 1992 It has been shown that certain NP-complete problems, i.e. TSP, min cut, and graph partitioning, with specific notions of neighborhood, satisfy a simple difference equation. In this paper, we extend these results by proving that TSP with 2-change, 2+3-new-change, and 3-new-change notions of neighborhood satisfies such a difference equation, and we derive some properties of local search when performed with the above definitions of neighborhood. ----- File: 1992/tr-92-022 Petri Net Based Software Validation: Prospects and Limitations Monika Heiner tr-92-022 March 1992 Petri net based software validation, which checks the synchronization structure against some data or control flow anomalies (like unboundedness or non-liveness), has been a well-known and widely used approach for about ten years. To reduce the complexity problem, and because the simpler the model, the more efficient the analysis, validation is usually attempted with the help of place transition Petri nets. However, modelling with this Petri net class involves two important abstractions of actual software properties -- the time consumption of any action and the data dependencies among conflict decisions. Basically, this paper discusses some problems resulting from these abstractions in the models analyzed, problems which are very often neglected and have therefore not been well understood up to now. Furthermore, the pros and cons of the Petri net approach are discussed by offering a rough overview of the given background of dependable distributed software engineering. Suggestions for a related workstation supporting different net-based methods are outlined. ----- File: 1992/tr-92-023 Quality-of-Service Negotiation in a Real-Time Communication Network Jean Ramaekers and Giorgio Ventre tr-92-023 April 1992 In recent years new protocols and algorithms have been proposed to guarantee performance and reliability in exchanging data in real-time communication networks, and new services have been presented to allow cooperative office work, distributed conferencing, etc. Less attention has been paid to how applications and, more generally, clients of real-time communication services can interact with the network in order to specify and negotiate the quality-of-service of a connection. We believe that this problem is going to become a key issue for the success of future distributed systems, since it affects both client and network performance. In this paper we present a new mechanism for the establishment of real-time connections in a quality-of-service network developed for the Tenet real-time protocol suite. By improving the information exchanged between the network and the clients, the model makes it possible to reduce the complexity and the time required to establish a real-time connection, and increases network utilization. Additionally, we introduce a new class of real-time communication service supporting adaptive quality-of-service, in order to enhance the network's ability to cope with congestion. ----- File: 1992/tr-92-024 Communicating with Low-Diffraction Lasers and Mirrors Richard Beigel tr-92-024 April 1992 Optical interconnection networks, in which each processor contains a set of lasers for communication with other processors, have long been studied.
In the ``regular optics'' model of Murdocca a bounded number of planar mirrors are used to redirect light beams, and each processor has a bounded number of lasers directed at a fixed set of angles, independent of the processor.

It is theoretically interesting to ignore diffraction, and assume that laser beams travel in a straight line. In the regular optics model, we present elegant layouts for processor networks including the shuffle, grids, and Margulis' expander graph. We also disprove the existence of a certain kind of 3-dimensional layout for shuffles.
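
As a small illustration of the interconnection pattern being laid out (not of the optical layouts themselves): on n = 2^k processors the perfect shuffle sends processor i to 2i mod (n-1), with processor n-1 fixed, i.e., a cyclic left shift of the k-bit address.

    # Illustrative sketch: the perfect-shuffle neighbor function.
    def shuffle(i, n):
        return n - 1 if i == n - 1 else (2 * i) % (n - 1)

    n = 8
    print([shuffle(i, n) for i in range(n)])  # [0, 2, 4, 6, 1, 3, 5, 7]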

Using slightly more complicated optical devices, such as beam splitters, we design a ``light guide,'' which allows simultaneous broadcasts, subject only to the limitations of light sensors. In particular, the light guide can perform single broadcasts. Given accurate enough clocks, it can perform arbitrary permutations. ----- File: 1992/tr-92-025 Tree Matching with Recursive Distributed Representations Andreas Stolcke and Dekai Wu tr-92-025 April 1992 We present an approach to the structure unification problem using distributed representations of hierarchical objects. Binary trees are encoded using the recursive auto-association method (RAAM), and a unification network is trained to perform the tree matching operation on the RAAM representations. It turns out that this restricted form of unification can be learned without hidden layers, with good generalization, if we allow the error signal from the unification task to modify both the unification network and the RAAM representations themselves. ----- File: 1992/tr-92-026 On the Power of Discontinuous Approximate Computations Karl Aberer, Bruno Codenotti tr-92-026 April 1992 The set of operations S_1={+,-,*,/,>} is used in algebraic computations to avoid degeneracies (e.g., division by zero), but is also used in numerical computations to avoid huge roundoff errors (e.g., division by a small quantity). On the other hand, the classes of algorithms using operations from the set S_2={+,-,*,/} or from the set S_3={+,-,*} are the most studied in complexity theory, and are used, e.g., to obtain fast parallel algorithms for numerical problems. In this paper, we study, by using a simulation argument, the relative power of the sets S_1, S_2, and S_3 for computing with approximations. We prove that S_2 does very efficiently simulate S_1, while S_3 does not; this fact shows and measures the crucial role of division in computations introducing roundoff errors. We also show how to construct algorithms using operations {+,-,*,/} which achieve for most inputs the same error bounds as algorithms using operations {+,-,*,/,>}. To develop our simulation strategy we combine notions imported from approximation theory and topology with complexity and error bounds. More precisely, to find conditions under which this simulation can take place, we quantitatively describe the interplay between algebraic, approximation, topological, and complexity notions, and we provide lower and upper bounds on the cost of simulation. ----- File: 1992/tr-92-027 The Quality of Separation Between NP and Exponential Time; Reducing the Cases Gerhard Lischke tr-92-027 April 1992 We consider three aspects of quality of separation between complexity classes: inclusion, immunity and sparseness in the differences. These aspects are discussed in general and investigated especially for the relationship between NP and deterministic exponential linear time, where we can reduce the number of possible cases from 24 to 8. Seven of the 8 cases are realizable in appropriate relativized worlds; one case remains open. Also, we found an error in earlier papers on this subject. ----- File: 1992/tr-92-028 Proposal of an External Processor Scheduling in Micro-Kernel Based Operating Systems Winfried Kalfa tr-92-028 May, 1992 Until now, the management of resources has been a task of the operating system kernel. The applications running on the operating system were, in general, similar to each other. Thus the limited policy of the resource manager could satisfy the demands of applications.
With the advent of computer systems capable of handling new applications such as multi-media, and of new operating systems based on micro-kernels and supporting the object paradigm in a distributed environment, an external resource manager has become important both for traditional operating systems like UNIX with new applications and for new object-oriented and micro-kernel based operating systems. In this paper an approach to external scheduling based on the operating system BirliX is given. The proposal is based on a scheduler implemented in user space. Problems of the implementation are described by means of the operating system BirliX as an example. Because BirliX is a distributed, object-oriented operating system, our proposal deals with local and distributed managers. Starting from a system model of BirliX, a resource model, and a process model, the scheduling model is developed.

Keywords: Distributed Operating Systems, External Processor Scheduler, Micro-Kernel, BirliX ----- File: 1992/tr-92-029 Efficient Computation of Spatial Joins Oliver Günther tr-92-029 May 1992 Spatial joins are join operations that involve spatial data types and operators. Due to some basic properties of spatial data, many conventional join processing strategies suffer serious performance penalties or are not applicable at all in this case. In this paper we explore which of the join strategies known from conventional databases can be applied to spatial joins as well, and how some of these techniques can be modified to be more efficient in the context of spatial data. Furthermore, we describe a class of tree structures, called generalization trees, that can be applied efficiently to compute spatial joins in a hierarchical manner. Finally, we model the performance of the most promising strategies analytically and conduct a comparative study. ----- File: 1992/tr-92-030 Checking Approximate Computations over the Reals Sigal Ar, Manuel Blum, Bruno Codenotti, and Pete Gemmell tr-92-030 May 1992 This paper provides the first systematic investigation of checking approximate numerical computations over subsets of the reals. In most cases, approximate checking is more challenging than exact checking. Problem conditioning, i.e., the measure of sensitivity of the output to slight changes in the input, and the presence of approximation parameters foil the direct transformation of many exact checkers to the approximate setting. We can extend exact checkers only if they have a very smooth dependence on the sensitivity of the problem. Furthermore, approximate checking over the reals is complicated by the lack of nice finite field properties such as the existence of a samplable distribution which is invariant under addition or multiplication by a scalar. We overcome the above problems by using such techniques as testing and checking over similar but distinct distributions, using functions' random and downward self-reducibility properties, and taking advantage of the small variance of the sum of independent identically distributed random variables. ----- File: 1992/tr-92-031 Decision Procedures for Flat Set-Theoretical Syllogistics. I. General Union, Powerset and Singleton Operators Domenico Cantone and Vincenzo Cutello tr-92-031 May 1992 (Pages 30) In this paper we show that a class of unquantified multi-sorted set-theoretic formulae involving the notions of powerset, general union, and singleton has a solvable satisfiability problem. We exhibit a normalization procedure that, given a model for a formula in our theory, produces a simpler and "a priori" bounded model whose cardinality depends solely on the size of the given formula. ----- File: 1992/tr-92-032 A Model for Amalgamation in Group Decision Making Vincenzo Cutello and Javier Montero tr-92-032 May 1992 (Pages 14) In this paper we present a generalization of the model proposed by Montero in [Mon87a, Mon87b, Mon92], by allowing non-complete fuzzy binary relations for individuals. A degree of unsatisfaction can be defined in this case, suggesting that any democratic aggregation rule should take into account not only ethical conditions or some degree of rationality in the amalgamating procedure, but also a minimum support for the set of alternatives subject to the group analysis.
----- File: 1992/tr-92-033 A Characterization of Rational Amalgamation Operations Vincenzo Cutello and Javier Montero tr-92-033 May 1992 (Pages 24) This paper deals with amalgamation of fuzzy opinions when a fixed number of individuals is faced with an unknown number of alternatives. The aggregation rule is defined by means of intensity aggregation operations that verify certain ethical conditions, and assuming fuzzy rationality as defined in [6, 7]. A necessary and sufficient condition for non-irrationality is presented, along with comments on the importance of the number of alternatives. ----- File: 1992/tr-92-034 Ambiguities in Object Specifications in View of Data Testing Dieter Richter tr-92-034 June 1992 Checking data relying only on their specification is of importance when using neutral or standardized object models. Ambiguities arise during the tests because of specifications leaving a certain degree of freedom to the implementation. Based on an experimental background, the observations and reflections about the reasons are systematically presented. It turns out that the transition (or mapping) from a specification of an object to a physical instance (or data set) has to be taken into consideration when defining neutral models. This transition, which has often been seen as a technical question of the implementation or as an internal (hidden) feature of a system, appears as a particular point of the concept besides the specification of the semantics.

One crucial point is the handling of instances with respect to assignment and comparison operations. The mapping from a specification into a database can be realized in various manners, which leads to interpretation defects when testing independently. Another point is the weak scope definition in specifications. Several ambiguities are caused by it. A very frequent source of misunderstandings is an imprecise or wrong understanding of the different relations between objects, logical and physical instances. There are approaches for clearer specifications. The last point is the representation of failures or, more generally, of the state of instances. A concept based on multiple inheritance seems to raise the abstraction level of state specifications to that of the specification language used. ----- File: 1992/tr-92-035 Experiments with Noise Reduction Neural Networks for Robust Speech Recognition Michael Trompf tr-92-035 May, 1992 Speech recognition systems with small and medium vocabularies are used as a natural human interface in a variety of real world applications. Though they work well in a laboratory environment, a significant loss in recognition performance can be observed in the presence of background noise. In order to make such a system more robust, the development of a neural network based noise reduction module is described in this paper. Based on function approximation techniques using multilayer feedforward networks (Hornik et al. 1990), this approach offers inherent nonlinear capabilities as well as easy training from pairs of corresponding noisy and noise-free signal segments. For the development of a robust nonadaptive system, information about the characteristics of the noise and speech components of the input signal and its past and future context is taken into account. Evaluation of each step is done by a word recognition task and includes experiments with changing signal parameters and sources to test the robustness of this neural network based approach. ----- File: 1992/tr-92-036 Efficient Clustering Techniques for the Geometric Traveling Salesman Problem Bruno Codenotti and Luciano Margara tr-92-036 June 1992 This paper presents some direct and iterative heuristic methods for the geometric Traveling Salesman Problem (TSP). All these methods are based on a particular notion of mass density, which can be used to construct a tour for the geometric TSP in an incremental fashion. In the iterative method, this technique is combined with the Lin-Kernighan method (LK), and this allows us to obtain better tours than those found by using LK itself. More precisely, the tour length we get is only 1.1% off the optimum. The direct method finds a solution passing through a sequence of subsolutions over progressively larger sets of points. These points are the relative maxima of the mass density obtained by using different parameter settings. The method has O(n^3) worst case running time and finds tours whose length is 9.2% off the optimal one. ----- File: 1992/tr-92-037 Measuring the Latency Time of Real-Time Unix-like Operating Systems Newton Faller tr-92-037 June 1992 With the advent of continuous-media applications, real-time operating systems, once confined to process control and other specialized applications, are coming to the desktop. The popularity of UNIX made this operating system the first choice for use with such real-time desktop applications.
However, since the UNIX kernel does not provide real-time responsiveness, some software developers have been trying to adapt it to these new requirements, while others have been proposing its total redesign. Though the evaluation of the performance of a real-time operating system depends on many factors, a predictable small latency time in responding to external events is always essential. In this paper, after a discussion of the probable sources of latency, a method is presented for collecting information about context-switching and interrupt-acknowledge times in UNIX-like operating systems without requiring external measuring tools. A form of presenting these data, aimed at facilitating comparison with previously collected data obtained from the same or from other systems, is also proposed. The paper is illustrated with actual results obtained by applying the method to TROPIX, a real-time UNIX-like operating system, running on a Motorola 68010-based computer. The impact of kernel preemption and some practical measurement interference considerations due to dynamic memory refresh, DMA operation and disk multiblock access are also discussed. ----- File: 1992/tr-92-038 Fuzzy Evolutionary Algorithms Hans-Michael Voigt tr-92-038 June 1992 Evolutionary algorithms (EA) combine different approaches for solving complex problems based on principles, models, and mechanisms of natural evolution. Typical representatives of such algorithms are Genetic Algorithms (GA) and Evolution Strategies (ES), which are closely related in principle but show different emphasis on the representational and operational level. The basic ideas and concepts for GAs and ESs date back to the early sixties. Central concepts of these approaches include the replication, recombination, mutation, selection, isolation-migration, and diffusion of individuals within or between populations or subpopulations, respectively. These algorithms do not take into account the development of an individual or organism from the gene level to the mature phenotype level. This development is a multistage decision process influenced by the environment and by interspecific as well as intraspecific competition and cooperation, such that usually no inferences can be drawn from phenotype to genotype. The goal of this paper is to introduce a fuzzy representation and fuzzy operations to model the developmental process based on fuzzy decisions. Some first conclusions with respect to optimization will be stated.

The appendices include an up-to-date software survey for Evolutionary Algorithms and the description of "The Evolution Machine". ----- File: 1992/tr-92-039 Boot Algebras D. Schuett, U. Eckhardt and P. Suda tr-92-039 June 1992 The paper surveys our recent work in the field of Boolean algebra. It begins with an introduction to the theory of Boolean algebras and discusses problems related to the separation of an algebra into a family of factors so that the Cartesian product of the family is isomorphic to the given algebra. Such a product is called a "Boo"lean "t"uple algebra or for short a Boot algebra if each factor is completely contained in the original algebra. Some examples are taken from the field of digital circuit design and image processing. They demonstrate how Boot algebras can be applied. ----- File: 1992/tr-92-040 Robot Shaping: Developing Situated Agents through Learning Marco Colombetti, Marco Dorigo tr-92-040 April 1992 August 1992 [Second edition, revised: December 1993] Learning plays a vital role in the development of situated agents. In this paper, we explore the use of reinforcement learning to "shape" a robot to perform a predefined target behavior. We connect both simulated and real robots to Alecsys, a parallel implementation of a learning classifier system with an extended genetic algorithm. After classifying different kinds of Animat-like behaviors, we explore the effects on learning of different types of agent architecture (monolithic, flat and hierarchical) and of training strategies. In particular, the hierarchical architecture requires the agent to learn how to coordinate basic learned responses. We show that the best results are achieved when both the agent's architecture and the training strategy match the structure of the behavior pattern to be learned. We report the results of a number of experiments carried out both in simulated and in real environments, and show that the results of simulations carry over smoothly to real robots. While most of our experiments deal with simple reactive behavior, in one of them we demonstrate the use of a simple and general memory mechanism. As a whole, our experimental activity demonstrates that classifier systems with genetic algorithms can be practically employed to develop autonomous agents.
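
A toy sketch, not Alecsys: the core credit-assignment loop of a learning classifier system, in which rules matching the current percept compete by strength and the acting rule's strength is pulled toward the received reward. All rule names, rewards, and parameters here are hypothetical.

    # Toy classifier-system loop: strength-proportional action selection
    # followed by a simple reinforcement update of the winner's strength.
    import random

    def run(rules, percepts, reward, lr=0.2):
        """rules: dict mapping (percept, action) -> strength."""
        for p in percepts:
            matching = [(pp, a) for (pp, a) in rules if pp == p]
            total = sum(rules[m] for m in matching)
            pick, acc = random.uniform(0, total), 0.0
            for m in matching:          # roulette-wheel selection
                acc += rules[m]
                if acc >= pick:
                    break
            r = reward(p, m[1])
            rules[m] += lr * (r - rules[m])  # move strength toward payoff
        return rules

    rules = {('obstacle', 'turn'): 1.0, ('obstacle', 'forward'): 1.0}
    reward = lambda p, a: 1.0 if a == 'turn' else 0.0
    print(run(rules, ['obstacle'] * 50, reward))  # 'turn' strengthens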

Keywords: machine learning, adaptive systems, genetic algorithms, learning classifier systems, behavior-based robotics. ----- File: 1992/tr-92-041 The NC Equivalence of Integer Linear Programming and Euclidean GCD Victor Pan tr-92-041 December 1992 We show an NC-reduction of integer linear programming with two variables to the evaluation of the remainder sequence arising in the application of the Euclidean algorithm to two positive integers. Due to a previous result of Deng, this implies the NC-equivalence of both of these problems, whose membership in NC, as well as P-completeness, remain unresolved open problems. ----- File: 1992/tr-92-042 A Framework for Cumulative Default Logics Gerhard Brewka tr-92-042 July 1992 We present a framework for default reasoning which has its roots in Reiter's Default Logic. Contrary to Reiter, however, we do not consider defaults as inference rules used to generate extensions of a classical set of facts. In our approach defaults are elements of the logical language, and we will define inference rules on defaults. This has several advantages. First of all, we can reason about defaults, not just with defaults. This makes it easy to include different intuitions about the right behaviour of a default logic in an explicit form. Secondly, we can show how some of the problems of Reiter's logic and of some recent proposals to solve them can be handled adequately by exploiting the dependency information contained in derived defaults. ----- File: 1992/tr-92-043 A Symbolic Complexity Analysis of Connectionist Algorithms for Distributed-Memory Machines Jonathan Bachrach tr-92-043 July 1992 This paper attempts to rigorously determine the computation and communication requirements of connectionist algorithms running on a distributed-memory machine. The strategy involves (1) specifying key connectionist algorithms in a high-level object-oriented language, (2) extracting their running times as polynomials, and (3) analyzing these polynomials to determine the algorithms' space and time complexity. Results are presented for various implementations of the back-propagation algorithm [Rumelhart-Hinton-Williams]. ----- File: 1992/tr-92-044 On-Line Algorithms Versus Off-Line Algorithms: How Much is it Worth to Know the Future? Richard M. Karp tr-92-044 July 1992 An "on-line algorithm" is one that receives a sequence of requests and performs an immediate action in response to each request. On-line algorithms arise in any situation where decisions must be made and resources allocated without knowledge of the future. The effectiveness of an on-line algorithm may be measured by its "competitive ratio", defined as the worst-case ratio between its cost and that of a hypothetical off-line algorithm which knows the entire sequence of requests in advance and chooses its actions optimally. In a variety of settings, we discuss techniques for proving upper and lower bounds on the competitive ratios achievable by on-line algorithms. In particular, we discuss the advantages of randomized on-line algorithms over deterministic ones. ----- File: 1992/tr-92-045 Persistence in the Object-Oriented Database Programming Language VML Wolfgang Klas, Volker Turau tr-92-045 July 1992 In this paper the principles of handling persistent objects in the object-oriented database programming language VML are presented. The main design criteria of VML with respect to persistence were: persistence independent programming, data type completeness and operations manipulating the extension of a class.
After defining the above-mentioned concepts, an example is used to compare the modelling and computational power of VML with the database programming languages Adaplex, PS-algol, and Galileo. The distinction between types and classes is the basis for defining persistence in VML. Instances of classes are always persistent and those of data types are always transient. All instances are referenced by object identifiers; values of datatypes are referenced independently of whether they are attached to persistent objects (and are therefore persistent themselves) or whether they are "stand alone". ----- File: 1992/tr-92-046 An Object-Oriented Approach to the Design of Graphical User Interface Systems Fabio Paterno tr-92-046 August 1992 In this paper the problems concerning the design of graphical user interface systems composed of a set of interaction objects allowing users to interact with structured graphics are presented. Here we want to point out the problems and the requirements that are raised in performing such a design in an object-oriented environment. For this purpose, the importance of task-oriented design of interaction objects, in order to make the translation from user tasks to system functions easier, is addressed. The design of a hierarchy of interaction objects following this approach is proposed. This contrasts with the design of current window system toolkits because it is mainly driven by the semantics of the interaction objects rather than their appearance. Finally, an example of a common graphical interface built with the proposed approach is presented. ----- File: 1992/tr-92-047 An Adaptive Classification Scheme to Approximate Decision Boundaries Using Local Bayes Criteria - The "Melting Octree" Network L. Miguel Encarnacao, Markus H. Gross tr-92-047 July 1992 The following paper describes a new method to approximate the minimum error decision boundary for any supervised classification problem by means of a linear neural network consisting of simple neurons that use a local Bayes criterion and a nearest-neighbor decision rule. The neurons can be interpreted as centroids in feature space or as a set of particles moving towards the classification boundary during training. In contrast to existing LVQ methods and RCE networks, each neuron has a receptive field of an adjustable width e, and the goal of the supervised training method is completely different. Furthermore, the network is able to grow, in the sense of generating new entities, in order to decrease the classification error after learning.

For this purpose we initialize the network via a multidimensional octree representation of the training data set. The neurons generated during initialization only depend on the maximum number of data in a single octree cell. The learning method introduced ensures that all neurons move towards the class boundaries by checking the local Bayes criterion in their receptive field. Because this process can also be interpreted as a melting away of the initial octree, we call the network "The Melting Octree" network.
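
A minimal sketch of the decision rule described above, assuming the neurons (labeled centroids) are already placed; the octree initialization and the Bayes-driven movement of the neurons are not reproduced, and the example data are hypothetical.

    # Illustrative sketch: classify a sample by its nearest labeled
    # centroid (the "neurons" of the network described above).
    import math

    def classify(x, neurons):
        """neurons: list of (centroid_vector, class_label)."""
        centroid, label = min(neurons, key=lambda n: math.dist(x, n[0]))
        return label

    neurons = [((0.0, 0.0), 'A'), ((1.0, 1.0), 'B'), ((0.9, 0.1), 'B')]
    print(classify((0.2, 0.1), neurons))  # 'A'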

This report first describes the algorithms used for initialization, training, and growing of the net. The classification performance of the algorithm is then illustrated by some examples and compared with that of a Kohonen Feature Map (LVQ) and of a backpropagated multilayer perceptron.

Note: The charts are page 39 of the techreport. I stored them under #tr-92-047.charts.ps.Z. They're not absolutely necessary for the report; just to complete it. ----- File: 1992/tr-92-048 A Study of Perceptually Grounded Polysemy in a Spatial Microdomain Jordan Zlatev tr-92-048 August 1992 This paper attempts to exemplify the advantages of perceptually grounded semantics with respect to traditional formalist approaches in elucidating the nature of the controversial notion of linguistic polysemy, or multiplicity of meaning. It is also suggested how some aspects of language typically associated with compositionality could be modeled, without there being a strictly ``compositional semantics''.

This is done through a series of experiments, using modifications of Terry Regier's connectionist system for learning spatial relations, which constitutes a part of the L_0 project concerned with associating descriptions in an arbitrary language with an analog environment: (sequences of) pictures of simple two-dimensional scenes.

The emphasis is above all on the English preposition `over', famous for its polysemy, and analyzed in detail by [Brugman, 1981] and [Lakoff, 1987], but some modeling has also been done of the meaning of `under', as well as some rudimentary semantics for simple verbs such as `be', `go' and `fly' that combine with the two prepositions.

Three kinds of connectionist architectures have been used in trying to capture what might be called a `polysemous over'. It is suggested that the first seems to treat polysemy like what has traditionally been regarded as generality, where distinctions are neutralized and senses are not distinct, while the second reduces polysemy to homonymy, where senses are distinct but not related. It is the third type of (structured) connectionist architecture that managed best in both learning different senses and reflecting the polysemous structure of the lexical item in analyses of the relevant hidden layers. In this architecture polysemy emerges as an effect of the combinatorics of words and their pairing with the environment.

The main theoretical claim is that polysemy is best regarded as a contextual rather than a purely lexical phenomenon. This in turn suggests support for the claim made in [Geeraerts, 1992] that the distinction between polysemy and generality is unstable, and for a semantics that is radically anti-reificational. The results from this study suggest that such a semantics can account for the generativity and systematicity of language, despite claims to the contrary made by formalists.

Keywords: computational linguistics, polysemy, perceptually grounded semantics, neural networks, partially structured connectionism. ----- File: 1992/tr-92-049 An Abductive Framework for Generalized Logic Programs: Preliminary Report Gerhard Brewka tr-92-049 July, 1992 We present an abductive semantics for generalized propositional logic programs which defines the meaning of a logic program in terms of its extensions. This approach extends the stable model semantics for normal logic programs in a natural way. The new semantics is equivalent to stable semantics for a logic program $P$ whenever $P$ is normal and has a stable model. The existence of extensions is guaranteed for all normal programs. The semantics can be applied without further modification to generalized logic programs where disjunctions and negation signs may appear in the head of rules. Our approach is based on an idea recently proposed by Konolige for causal reasoning. Instead of maximizing in abduction the set of used hypotheses alone, we maximize the union of the used and refuted hypotheses. ----- File: 1992/tr-92-050 The Degrees of Discontinuity of some Translators between Representations of the Real Numbers Klaus Weihrauch tr-92-050 July 1992 Representations like the decimal representation are used for defining computability on the set of real numbers. Translatability between different representations has been studied in the past by several authors. Most of the translation problems that are not computably solvable are not even continuously solvable. In this paper the degrees of discontinuity of translations between a number of common representations are compared and characterized. Mainly three degrees are considered: the first one contains the translations between the standard representation and the weak cut representations, the second one contains among others the translations between ``m''-adic and ``n''-adic representations, and the third one contains translations concerning proper cut representations and the iterated fraction representation. ----- File: 1992/tr-92-051 Improved Parallel Polynomial Division and Its Extensions Dario Bini and Victor Pan tr-92-051 August 1992 We compute the first N coefficients of the reciprocal r(x) of a given polynomial p(x), (r(x)p(x) = 1 mod x^N, p(0) not equal to 0), by using, under the PRAM arithmetic models, O(h log N) time-steps and O((N/h)(1 + 2^{-h} log^{(h)} N)) processors, for any h, h = 1, 2, ..., log^* N, provided that O(log m) steps and m processors suffice to perform DFT on m points, and that log^{(0)} N = N, log^{(h)} N = log_2 log^{(h-1)} N for h = 1, ..., log^* N, and log^* N = max{h: log^{(h)} N > 0}. The same complexity estimates apply to some other computations, such as the division with a remainder of two polynomials of degrees O(N) and the inversion of an N times N triangular Toeplitz matrix. This improves the known estimates of Reif-Tate and Georgiev. We also show how to extend our techniques to the parallel implementation of other recursive processes, such as the evaluation modulo x^N of the m-th root p(x)^{1/m} of p(x) (for any fixed natural m), for which we need O(log N log log N) time-steps and O(N/log log N) processors. The paper demonstrates some new techniques of supereffective slowdown of parallel algebraic computations, which we combine with a technique of stream contraction.
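
The parallel algorithm itself is not reproduced here; as a point of reference, the underlying problem can be solved sequentially by Newton iteration, which doubles the number of correct coefficients of r(x) per round via r <- r(2 - pr) mod x^(2k). A sketch with dense coefficient lists:

    # Sequential Newton-iteration sketch for the reciprocal problem:
    # compute r with r(x) p(x) = 1 mod x^N, given p(0) != 0.
    def poly_mul_mod(a, b, n):
        c = [0.0] * n
        for i, ai in enumerate(a[:n]):
            for j, bj in enumerate(b[:n - i]):
                c[i + j] += ai * bj
        return c

    def reciprocal(p, N):
        assert p[0] != 0
        r, k = [1.0 / p[0]], 1
        while k < N:
            k = min(2 * k, N)
            pr = poly_mul_mod(p, r, k)
            two_minus_pr = [2.0 - pr[0]] + [-c for c in pr[1:]]
            r = poly_mul_mod(r, two_minus_pr, k)   # r <- r(2 - pr)
        return r

    p = [1.0, -1.0]          # 1/(1 - x) = 1 + x + x^2 + ...
    print(reciprocal(p, 6))  # ~[1, 1, 1, 1, 1, 1]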
----- File: 1992/tr-92-052 Improved Parallel Computations with Toeplitz-like and Hankel-like Matrices Dario Bini and Victor Pan tr-92-052 August 1992 The known parallel algorithms for computations with general Toeplitz, Hankel, Toeplitz-like, and Hankel-like matrices are inherently sequential. We develop some new techniques in order to devise fast parallel algorithms for such computations, including the evaluation of Krylov sequences for such matrices, traces of their power sums, characteristic polynomials and generalized inverses. This has further extensions to computing the solution or a least-squares solution to a linear system of equations with such a matrix and to several polynomial evaluations (such as computing gcd, lcm, Pade approximation and the extended Euclidean scheme for two polynomials), as well as to computing the minimum span of a linear recurrence sequence. The algorithms can be applied over any field of constants, with the resulting advantages of using modular arithmetic. The algorithms consist of simple computational blocks (mostly reduced to fast Fourier transforms, FFTs) and have potential practical value. We also develop the techniques for extending all our results to the case of matrices representable as the sums of Toeplitz-like and Hankel-like matrices, and in addition show some more minor innovations, such as an improvement of the transition to the solution of a Toeplitz linear system Tx = b from two computed columns of T^-1. ----- File: 1992/tr-92-053 A Mechanism for Dynamic Re-routing of Real-time Channels Colin Parris, Hui Zhang and Domenico Ferrari tr-92-053 August 1992 Various solutions have been proposed to provide real-time services (i.e., services with guaranteed performance requirements) in packet-switched networks. These solutions usually require fixed routing and resource reservation for each conversation. The routing and reservation decisions, combined with load fluctuations, introduce the problems of network unavailability and loss of network management flexibility. We believe that these problems can be alleviated by properly balancing the network load. In this paper, we present a mechanism that dynamically reroutes a real-time channel without disruption of service to the clients. This mechanism is one component in a framework to investigate load balancing in a real-time internetwork. We show that the mechanism can be incorporated into the Tenet real-time protocol suite with minimal changes and overhead. ----- File: 1992/tr-92-054 Process Grammar Processor: An Architecture for a Parallel Parser Massimo Marino tr-92-054 August, 1992 A parallel architecture of a parser for Natural Language is described. A serial architecture has already been realized and is currently used in a system for the design and testing of Natural Language grammars and the generation of the corresponding parsers. This system works using a Process Grammar Processor running a model of grammar suited for the generation of Natural Language applications. The grammar model, named Process Grammar (PG), is an extension of an augmented context-free phrase-structure grammar, and the parser is designed to use such a grammar model. A PG is a set of rules that are treated by the processor as descriptors of processes that are scheduled and applied if the conditions for their execution hold: from this the name Process Grammar.
In this report the PG model is extended in order to allow a more structured and modular construction of grammars, even large ones, keeping parsing control separate from syntactic and semantic specifications, and partitioning a PG into clusters of rules, completely independent of one another, each carrying out its own dedicated recognition of specific parts of speech. The parallel architecture is composed of parallel processes cooperating and communicating by means of a message passing protocol. This allows the realization of some parsing strategies and the implementation of parsing mechanisms, extending the recognition capacity of the parser in ways that would not be possible in a standard, serial context-free parsing environment. Both serial and parallel versions of the parser are introduced and described, looking in greater detail at the mechanisms of process scheduling and how they can be used and extended for implementing various cases of parsing strategies. ----- File: 1992/tr-92-055 A New Approach to Fast Polynomial Interpolation and Multipoint Evaluation Victor Pan tr-92-055 August 1992 The fastest known algorithms for the problems of polynomial evaluation and multipoint interpolation are devastatingly unstable numerically because of their recursive use of polynomial divisions. We apply a completely distinct approach to compute approximate solutions to both problems equally fast but with improved numerical stability. Our approach relies on new techniques, so far not used in this area: we reduce the problems to Vandermonde matrix computations and then exploit some recent methods for improving computations with structured matrices. ----- File: 1992/tr-92-056 On-line Graph Algorithms for Incremental Compilation Alberto Marchetti-Spaccamela, Umberto Nanni, Hans Rohnert tr-92-056 August 1992 Compilers usually construct various data structures which often vary only slightly from compilation run to compilation run. This paper gives various solutions to the problems of quickly updating these data structures instead of building them from scratch each time. All problems we found can be reduced to graph problems. Specifically, we give algorithms for updating data structures for the problems of topological order, loop detection, and reachability from the start routine. ----- File: 1992/tr-92-057 Describing and Recognizing Shape through Size Functions Claudio Uras and Alessandro Verri tr-92-057 September 1992 According to a recent mathematical theory, the intuitive concept of shape can be formalized through functions, named "size functions", which convey information on both the topological and metric properties of the viewed shape. In this paper the main concepts and results of the theory are first reviewed in a somewhat intuitive fashion. Then, an algorithm for the computation of discrete size functions is presented. Finally, by introducing a suitable distance function, it is shown that size functions can be successfully used for both shape description and recognition from real images. ----- File: 1992/tr-92-058 Planar Passive Navigation: One Dimension is Better than Two Enrico De Micheli and Alessandro Verri tr-92-058 November 1992 This paper is based on the observation that if a viewing camera is appropriately mounted on a vehicle which moves on a planar surface, i.e. the image plane of the camera is orthogonal to the planar surface and the optical axis parallel to the instantaneous direction of translation, then the angular velocity is the only motion parameter to be computed.
Consequently, the problem of motion and structure recovery from optical flow becomes linear and, in principle, can be solved locally. Elementary error analysis shows that the angular velocity can be robustly estimated by averaging the horizontal component of the optical flow along the vertical line through the center of the image. Once the angular velocity has been recovered, depth can be computed from one component only of the optical flow. It is shown that the estimate of depth obtained from the vertical component is more accurate, improves with increasing distance from the horizontal line through the center of the image, and is almost independent of the angular velocity. From the reported experiments on synthetic data and real images it can be concluded that in applications like autonomous robot navigation the computation of the two-dimensional (2D) optical flow over the entire 2D image plane can probably be avoided. ----- File: 1992/tr-92-059 Learning Topology-Preserving Maps Using Self-Supervised Backpropagation on a Parallel Machine Arnfried Ossen tr-92-059 September 1992 Self-supervised backpropagation is an unsupervised learning procedure for feedforward networks, where the desired output vector is identical with the input vector. For backpropagation, we are able to use powerful simulators running on parallel machines. Topology-preserving maps, on the other hand, can be developed by a variant of the competitive learning procedure. However, in a degenerate case, self-supervised backpropagation is a version of competitive learning. A simple extension of the cost function of backpropagation leads to a competitive version of self-supervised backpropagation, which can be used to produce topographic maps. We demonstrate the approach applied to the Traveling Salesman Problem (TSP). The algorithm was implemented using the backpropagation simulator (CLONES) on a parallel machine (RAP). ----- File: 1992/tr-92-060 Ring Array Processor: Programmer's Guide to the RAP Libraries Michael C. Greenspon tr-92-060 September 1992 The RAP machine is a high performance DSP-based distributed memory parallel processor developed at ICSI as described in previous technical reports. This report documents the application program interfaces to the high-level computational routines provided by the RAP class libraries corresponding to software release 1.0. It is intended as both an introductory guide and standard library reference for C++ and C programmers undertaking software development for the RAP machine. The RAP library classes and methods documented in this report transparently implement data-parallel operations on distributed memory objects. Thus client programs written to these interfaces automatically achieve scalability across different sized RAP machines. Additionally, the high-level interfaces provide a degree of general hardware independence, increasing the likelihood that client code will port easily to future parallel platforms under development at ICSI. This report also provides an introduction to the internals of the distributed object implementation with tips and examples for programmers wishing to extend the libraries in a structured fashion. ----- File: 1992/tr-92-061 Can we Utilize the Cancellation of the Most Significant Digits?
Victor Pan tr-92-061 December 1992 If the sum of several positive and negative numbers has a small magnitude, relative to the magnitudes of the summands, then we show how to decrease the precision of the computation of this sum (without affecting the output precision). Furthermore, if the magnitude of the inner product of two vectors is small and if one of them is filled with "short" binary numbers, each represented with only a few bits, then we decrease the precision of the computation of such an inner product (without affecting the output precision), and we extend this result to the iterative improvement algorithm for a linear system of equations whose coefficients are represented by "short" binary numbers. We achieve this by truncating both the least and the most significant digits of the operands, according to our new scheme of "backward binary segmentation". ----- File: 1992/tr-92-062 The Acquisition of Lexical Semantics for Spatial Terms: A Connectionist Model of Perceptual Categorization Terry Regier tr-92-062 September, 1992 This thesis describes a connectionist model which learns to perceive spatial events and relations in simple movies of 2-dimensional objects, so as to name the events and relations as a speaker of a particular natural language would. Thus, the model learns perceptually grounded semantics for natural language spatial terms. The design and construction of this system have resulted in several technical contributions. The first is a very simple but effective means of learning without explicit negative evidence. This thesis also presents the notion of partially-structured connectionism, a marriage of structured and unstructured network design techniques capturing the best of each paradigm. Finally, the idea of learning within highly specialized structural devices is introduced. Scientifically, the primary result of the work described here is a computational model of the acquisition of visually grounded semantics. This model successfully learns terms for spatial events and relations from a range of languages with widely differing spatial systems, including English, Mixtec (a Mexican Indian language), German, Bengali, and Russian. And perhaps most importantly, the model does more than just recapitulate the data; it also generates a number of falsifiable linguistic predictions regarding the sorts of semantic features, and combinations of features, one might expect to find in lexemes for spatial events and relations in the world's natural languages. ----- File: 1992/tr-92-063 Block Korkin-Zolotarev Bases and Successive Minima C. P. Schnorr tr-92-063 September 1992 Using block Korkin--Zolotarev bases we improve Babai's construction of a nearby lattice point. Given a block Korkin--Zolotarev basis with block size beta of the lattice L and given a point x in the span of L, a lattice point v can be found in time beta^{O(beta)} satisfying |x-v|^2 <= m gamma_beta^{2m/(beta-1)} min_{u in L} |x-u|^2. These results also yield improvements for the method of solving integer programming problems via basis reduction. ----- File: 1992/tr-92-064 Competitive Analysis of Financial Games R. El-Yaniv and A. Fiat and R. Karp and G. Turpin tr-92-064 September 1992 In the unidirectional conversion problem an on-line player is given the task of converting dollars to yen over some period of time. Each day, a new exchange rate is announced, and the player must decide how many dollars to convert.
His goal is to minimize the competitive ratio, defined as $\sup_E P_{OPT}(E)/P_X(E)$, where $E$ ranges over exchange rate sequences, $P_{OPT}(E)$ is the number of yen obtained by an optimal off-line algorithm, and $P_X(E)$ is the number of yen obtained by the on-line algorithm $X$. We also consider a continuous version of the problem, in which the exchange rate varies over a continuous time interval. The on-line player's a priori information about the fluctuation of exchange rates distinguishes different variants of the problem. For three variants we show that a simple threat-based strategy is optimal for the on-line player and determine its competitive ratio. We also derive and analyze an optimal policy for the on-line player when he knows the probability distribution of the maximum value that the exchange rate will reach. Finally, we consider a bidirectional conversion problem, in which the player may trade dollars for yen or yen for dollars. ----- File: 1992/tr-92-065 The Impact of Multimedia Data on Database Management Systems Karl Aberer and Wolfgang Klas tr-92-065 September 1992 NOTE: Many have reported problems printing this file. Thus we have renamed it with a .BAD tag. We offer this techreport "as-is" and cannot offer help printing it. This paper analyzes the impact of multimedia data on database management systems and proposes some solutions which allow for a high degree of integrated handling of multimedia data by a multimedia database system. We first give a characterization of multimedia data with respect to issues like time dependency and amount of data. Then we derive major requirements which need to be satisfied in order to provide the integration. These requirements include, e.g., dynamic data management, non-transparent parallelism, scheduling, several kinds of abstractions, resource distribution transparency, and advanced interaction models satisfying real time constraints. We show how some of the requirements can be met by exploiting concepts from the object-oriented paradigm and database systems. Then we discuss extensions needed with respect to data integration, scheduling, parallelism, and real time streams. ----- File: 1992/tr-92-066 Physical Mapping of Chromosomes: A Combinatorial Problem in Molecular Biology Farid Alizadeh, Richard M. Karp, Lee A. Newberg, Deborah K. Weisser tr-92-066 September 1993 A fundamental tool for exploring the structure of a long DNA sequence is to construct a ``library'' consisting of many cloned fragments of the sequence. Each fragment can be replicated indefinitely and then ``fingerprinted'' to obtain partial information about its structure. A common type of fingerprinting is restriction fingerprinting, in which an enzyme called a restriction nuclease cleaves the fragment wherever a particular short sequence of nucleotides (letters `A', `G', `C', and `T') occurs, and the lengths of the resulting pieces are measured. An important combinatorial problem is to determine, from such fingerprint information, the most probable arrangement of the cloned fragments along the overall sequence. However, for a given arrangement, even the likelihood function involves a complicated multifold integral and is therefore difficult to compute. We propose an approximation to the likelihood function and develop local search algorithms based on this approximate objective function. Our local search techniques are extensions of similar strategies for the travelling salesman problem. We provide some computational results which support our choice of objective function.
We also briefly study alternative approaches based on pairwise probabilities that two fragments overlap. ----- File: 1992/tr-92-067 Integrating a Relational Database System into VODAK using its Metaclass Concept W. Klas, G. Fischer and K. Aberer tr-92-067 August 1992 This paper presents a specific approach to integrating a relational database system into a federated database system. The underlying database integration process consists of three steps: first, the external database systems have to be connected to the integrated database system environment and the external data models have to be mapped into a canonical data model. This step is often called syntactic transformation including structural enrichment and leads to component schemas for each external DBMS. Second, the resulting schemas from the first step are used to construct export schemas which are then integrated into global, individual schemas or views. In this paper we focus on the first step for relational databases, i.e., the connection of a relational database system and the mapping of the relational model into a canonical data model. We take POSTGRES as the relational database system and the object-oriented federated database system VODAK as the integration platform, which provides the open, object-oriented data model as the canonical data model for the integration. We show different variations of mapping the relational model. By exploiting the metaclass concept provided by VML we show how to tailor VML such that the canonical data model meets the requirements of integrating POSTGRES into the global database system VODAK in an efficient way. ----- File: 1992/tr-92-068 Public Randomness in Cryptography Amir Herzberg and Michael Luby tr-92-068 October 1992 The main contribution of this paper is the introduction of a formal notion of public randomness in the context of cryptography. We show how this notion affects the definition of the security of a cryptographic primitive and the definition of how much security is preserved when one cryptographic primitive is reduced to another. Previous works considered the public random bits as a part of the input, and security was parameterized in terms of the total length of the input. We parameterize security solely in terms of the length of the private input, and treat the public random bits as a separate resource. This separation allows us to independently address the important issues of how much security is preserved by a reduction and how many public random bits are used in the reduction.

To exemplify these new definitions, we present reductions from weak one-way permutations to one-way permutations with strong security-preserving properties that are simpler than previously known reductions. ----- File: 1992/tr-92-069 Inductive learning of compact rule sets by using efficient hypotheses reduction Thomas Koch tr-92-069 September 1992 A method is described which reduces the hypotheses space with an efficient and easily interpretable reduction criterion called a-reduction. A learning algorithm based on a-reduction is described and analyzed using results from probably approximately correct (PAC) learning. The results are obtained by reducing a rule set to an equivalent set of kDNF formulas. The goal of the learning algorithm is to induce a compact rule set describing the basic dependencies within a set of data. The reduction is based on a criterion which is very flexible and gives a semantic interpretation of the rules which fulfill it. Comparisons with syntactic hypotheses reduction show that a-reduction improves search and has a smaller probability of misclassification. ----- File: 1992/tr-92-070 On Randomized Algebraic Test Complexity Peter Buergisser, Marek Karpinski, and Thomas Lickteig tr-92-070 October 1992 We investigate the impact of randomization on the complexity of deciding membership in a (semi-)algebraic subset $X \subset \rr^m$. Examples are exhibited where allowing for a certain error probability $\epsilon$ in the answer of the algorithms the complexity of decision problems decreases. A randomized $(\Omega^k,\{=,\leq\})$-decision tree ($k \subseteq \rr$ a subfield) over $m$ will be defined as a pair $(T,\mu)$ where $\mu$ is a probability measure on some $\rr^n$ and $T$ is a $(\Omega^k,\{=,\leq\})$-decision tree over $m+n$. We prove a general lower bound on the average decision complexity for testing membership in an irreducible algebraic subset $X \subset \rr^m$ and apply it to $k$-generic complete intersections of polynomials of the same degree, extending results in [4, 6]. We also give applications to nongeneric cases, such as graphs of elementary symmetric functions, $\mbox{SL}(m,\rr)$, and determinant varieties, extending results in \cite{Li:90}. ----- File: 1992/tr-92-071 An Efficient Parallel Algorithm for Computing a Maximal Independent Set in a Hypergraph of Dimension 3 Elias Dahlhaus, Marek Karpinski, and Peter Kelsen tr-92-071 October 1992 The paper considers the problem of computing a maximal independent set in a hypergraph (see \cite{BL} and \cite{KR}). We present an efficient deterministic NC algorithm for finding a maximal independent set in a hypergraph of dimension $3$: the algorithm runs in $O(\log^4 n)$ time on $n+m$ processors of an EREW PRAM and is optimal up to a polylogarithmic factor. Our algorithm adapts the technique of Goldberg and Spencer (\cite{GS}) for finding a maximal independent set in a graph (or hypergraph of dimension $2$). It is the first efficient NC algorithm for finding a maximal independent set in a hypergraph of dimension greater than 2. ----- File: 1992/tr-92-072 Network Support For Multimedia: A Discussion of the Tenet Approach Domenico Ferrari, Anindo Banerjea and Hui Zhang tr-92-072 October 1992 Multimedia communication can be supported in an integrated-services network in the general framework of realtime communication. The Tenet Group has devised an approach that provides some initial solutions to the realtime communication problem. This paper attempts to identify the principles behind these solutions.
We also describe a suite of protocols, and their implementations in several environments, that embody these principles, and work in progress that will lead towards more complete solutions. ----- File: 1992/tr-92-073 Optimal Traversal of Directed Hypergraphs Giorgio Ausiello, Giuseppe F. Italiano and Umberto Nanni tr-92-073 September 1992 A ``directed hypergraph'' is defined by a set of nodes and a set of ``hyperarcs'', each of which connects a set of ``source'' nodes to a single ``target'' node. Directed hypergraphs are used in several contexts to model different combinatorial structures, such as functional dependencies [20], Horn clauses in propositional calculus [6], AND-OR graphs [17], Petri nets [18]. A ``hyperpath'', similarly to the analogous notion of path in directed graphs, consists of a connection among nodes using hyperarcs. Unlike paths in graphs, hyperpaths admit different definitions of measure, corresponding to different concepts arising in various applications.
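As an illustration of hyperpath traversal (a generic Dijkstra-like sketch, not the parametric algorithm of the report), the following Python fragment computes hyperpath costs from a source node under one simple value-based measure, namely arc weight plus the maximum cost over the source set; all names are illustrative:

    from heapq import heappush, heappop

    def hyperpath_costs(nodes, hyperarcs, source):
        # hyperarcs: list of (sources, target, weight); each hyperarc connects
        # a set of source nodes to a single target node.
        cost = {v: float('inf') for v in nodes}
        cost[source] = 0.0
        unsettled = [len(s) for (s, _, _) in hyperarcs]   # sources not settled yet
        touches = {v: [] for v in nodes}
        for i, (s, _, _) in enumerate(hyperarcs):
            for v in s:
                touches[v].append(i)
        heap, settled = [(0.0, source)], set()
        while heap:
            c, v = heappop(heap)
            if v in settled:
                continue
            settled.add(v)
            for i in touches[v]:
                unsettled[i] -= 1
                if unsettled[i] == 0:        # hyperarc usable: all sources settled
                    s, t, w = hyperarcs[i]
                    new = w + max(cost[u] for u in s)   # one value-based measure
                    if new < cost[t]:
                        cost[t] = new
                        heappush(heap, (new, t))
        return cost

With a binary heap this sketch runs in O(|H| log n); a more careful priority queue would be needed to match the O(|H| + n log n) bound quoted below.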

In this paper we consider the problem of finding optimal hyperpaths according to several optimization criteria. We show that some of these problems are NP-hard but, if the measure function on hyperpaths matches certain conditions (namely if it is ``value-based''), the problem turns out to be tractable. We describe efficient algorithms and data structures to find optimal hyperpaths which can be used with any value-based measure function, since the measure appears in parametric form. The achieved time bound is O(|H| + n log n) for a hypergraph with n nodes and an overall description of size |H|. Dynamic maintenance of optimal hyperpaths is also considered, and the proposed solution supports insertions of hyperarcs. ----- File: 1992/tr-92-074 When is the Assignment Bound Tight for the Asymmetric Traveling-Salesman Problem? Alan Frieze, Richard Karp and Bruce Reed tr-92-074 November 1992 We consider the probabilistic relationship between the value of a random asymmetric traveling salesman problem ATSP(M) and the value of its assignment relaxation AP(M). We assume here that the costs are given by an n\times n matrix M whose entries are independently and identically distributed. We focus on the relationship between Pr(ATSP(M)=AP(M)) and the probability p_n that any particular entry is zero. If np_n\rightarrow \infty with n then we prove that ATSP(M)=AP(M) with probability 1-o(1). This is shown to be best possible in the sense that if np_n\rightarrow c, with c>0 constant, then Pr(ATSP(M)=AP(M))<1-\phi(c) for some positive function $\phi$. Finally, if np_n\rightarrow 0 then Pr(ATSP(M)=AP(M))\rightarrow 0. ----- File: 1992/tr-92-075 Genetic and Non Genetic Operators in Alecsys - Revised Version Marco Dorigo tr-92-075 December 1992 It is well known that standard learning classifier systems, when applied to many different domains, exhibit a number of problems: payoff oscillation, a difficult-to-regulate interplay between the reward system and the background genetic algorithm (GA), instability of rule chains, and instability of default hierarchies, to name only a few. Alecsys is a parallel version of a standard learning classifier system (CS), and as such suffers from these same problems. In this paper we propose some innovative solutions to some of these problems. We introduce the following original features. Mutespec, a new genetic operator used to specialize potentially useful classifiers. Energy, a quantity introduced to measure global convergence in order to apply the genetic algorithm only when the system is close to a steady state. Dynamic adjustment of the classifier set cardinality, in order to speed up the performance phase of the algorithm. We present simulation results of experiments run in a simulated two-dimensional world in which a simple agent learns to follow a light source.

Keywords: learning classifier systems, genetic algorithms, robotics. ----- File: 1992/tr-92-076 Approximate Evaluation of a Polynomial on a Set of Real Points Victor Pan tr-92-076 November 1992 The previous best algorithm for approximate evaluation of a polynomial on a real set was due to Rokhlin and required on the order of $mu + nu^3$ infinite-precision arithmetic operations to approximate [on a fixed bounded set $X(m)$ of $m+1$ real points] a degree $n$ polynomial $p(x) = \sum_{i=0}^{n} p_i x^i$ within the error bound $2^{-u} \sum_{i=0}^{n} |p_i|$. We develop an approximation algorithm which decreases Rokhlin's record estimate to $O(m \log^2 u + n \min(u, \log n))$. For $\log u = o(\log n)$, this result may also be favorably compared with the record bound $O((m+n) \log^2 n)$ on the complexity of the exact multipoint polynomial evaluation. The new algorithm can be performed in the fields (or rings) generated by the input values, which enables us to decrease the precision of the computations [by using modular (residue) arithmetic] and to simplify our computations further in the case where $u = O(\log n)$. Our algorithm allows an NC and simultaneously processor-efficient parallel implementation. Because of the fundamental nature of multipoint polynomial evaluation, our results have further applications to numerical and algebraic computational problems. In passing, we also show a substantial improvement in the Chinese remainder algorithm for integers, based on incorporating Kaminski's fast residue computation. ----- File: 1992/tr-92-077 Polynomial Uniform Convergence and Polynomial-Sample Learnability Alberto Bertoni, Paola Campadelli, Anna Morpurgo, and Sandra Panizza tr-92-077 November 1992 In the PAC model, polynomial-sample learnability in the distribution-dependent framework has been characterized in terms of the minimum cardinality of $\epsilon$-covers. In this paper we propose another approach to the problem by investigating the relationship between polynomial-sample learnability and uniform convergence, in analogy to what was done for the distribution-free setting. First of all, we introduce the notion of polynomial uniform convergence, giving a characterization for it in terms of an entropic measure; then we study its relationship with polynomial-sample learnability. We show that, contrary to what happens in the distribution-independent setting, polynomial uniform convergence is a sufficient but not necessary condition for polynomial-sample learnability. ----- File: 1992/tr-92-078 On Randomized Versus Deterministic Computation Marek Karpinski and Rutger Verbeek tr-92-078 November 1992 In contrast to deterministic or nondeterministic computation, it is a fundamental open problem in randomized computation how to separate different randomized time classes (at this point we do not even know how to separate linear randomized time from $O(n^{\log n})$ randomized time) or how to compare them relative to corresponding deterministic time classes. In other words, we are far from understanding the power of ``random coin tosses'' in computation and the possible ways of simulating them deterministically.

In this paper we study the relative power of linear and polynomial randomized time compared with exponential deterministic time. Surprisingly, we are able to construct an oracle A such that exponential time (with or without the oracle A) is simulated by linear time Las Vegas algorithms using the oracle A. We are also able to prove, for the first time, that in some situations the randomized reductions are exponentially more powerful than deterministic ones (cf. [Adleman, Manders, 1977]).

Furthermore, a set $B$ is constructed such that Monte Carlo polynomial time (BPP) under the oracle $B$ is exponentially more powerful than deterministic time with nondeterministic oracles. This strengthens considerably a result of Stockmeyer [St85] about the polynomial time hierarchy, namely that for some decidable oracle $B$, $\mathrm{BPP}^B \not\subseteq \Delta_2 P^B$. Under our oracle, $\mathrm{BPP}^B$ is exponentially more powerful than $\Delta_2 P^B$, and $B$ does not add any power to $\Delta_2 \mathrm{EXPTIME}$. ----- File: 1992/tr-92-079 Computation of the Additive Complexity of Algebraic Circuits with Root Extracting Marek Karpinski and Rutger Verbeek tr-92-079 November 1992 We design an algorithm for computing the generalized (algebraic circuits with root extraction) ``additive complexity'' of any rational function. It is the first computability result of this sort on the additive complexity of algebraic circuits (cf. [SW80]). ----- File: 1992/tr-92-080 Simulating Threshold Circuits by Majority Circuits Mikael Goldmann and Marek Karpinski tr-92-080 December 1992 We prove that a single threshold gate can be simulated by an explicit polynomial-size depth-2 majority circuit. In general we show that a depth-$d$ threshold circuit can be simulated uniformly by a majority circuit of depth $d+1$. Goldmann, Hastad and Razborov demonstrated that a non-uniform simulation exists. Our construction answers two open questions posed in their work: we give an explicit construction whereas Goldmann, Hastad and Razborov use a randomized existence argument, and we show that such a simulation is possible even if the depth $d$ grows with the number of variables $n$ (the simulation in their work gives polynomial size circuits only when $d$ is constant). ----- File: 1992/tr-92-081 Connectionist Probability Estimation in HMM Speech Recognition Steve Renals and Nelson Morgan tr-92-081 December 1992 This report is concerned with integrating connectionist networks into a hidden Markov model (HMM) speech recognition system. This is achieved through a statistical understanding of connectionist networks as probability estimators, first elucidated by Hervé Bourlard. We review the basis of HMM speech recognition, and point out the possible benefits of incorporating connectionist networks. We discuss some issues involved in the construction of a connectionist HMM recognition system, and describe the performance of such a system, including evaluations on the DARPA database, in collaboration with Mike Cohen and Horacio Franco of SRI International. In conclusion, we show that a connectionist component improves a state-of-the-art HMM system. ----- File: 1992/tr-92-082 Perfect Zero-Knowledge Arguments for NP Can Be Based on General Complexity Assumptions Moni Naor and Rafail Ostrovsky tr-92-082 December 1992 "Zero-knowledge arguments" is a fundamental cryptographic primitive which allows one polynomial-time player to convince another polynomial-time player of the validity of an NP statement, without revealing any additional information in the information-theoretic sense. Despite their practical and theoretical importance, it was only known how to implement zero-knowledge arguments based on specific algebraic assumptions; basing them on a general complexity assumption had been open since their introduction in 1986 [BCC, BC, CH]. In this paper, we finally show a general construction, which can be based on any one-way permutation.
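For background flavor only, here is the standard shape of a bit commitment built from a one-way permutation f plus a Goldreich-Levin hardcore bit, a typical building block in this setting. It is a sketch, not the construction of the report, and the concrete f below (fixed-base modular exponentiation) is merely a hypothetical stand-in for a genuine one-way permutation:

    import secrets

    P = 2**127 - 1   # a Mersenne prime; x -> pow(3, x, P) is used here only as
                     # an assumed "one-way" stand-in, for illustration

    def commit(bit):
        # Commitment: (f(x), r, bit XOR <x, r>), where <x, r> is the inner
        # product mod 2 (the Goldreich-Levin hardcore predicate).
        x = secrets.randbelow(P - 2) + 1
        r = secrets.randbits(127)
        hc = bin(x & r).count("1") & 1
        return (pow(3, x, P), r, bit ^ hc), x    # send first part; keep x to open

    def verify_opening(com, x, bit):
        fx, r, masked = com
        hc = bin(x & r).count("1") & 1
        return pow(3, x, P) == fx and masked == (bit ^ hc)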

We stress that our scheme is "efficient": both players can execute only polynomial-time programs during the protocol. Moreover, the security achieved is "on-line": in order to cheat and validate a false theorem, the prover must break a cryptographic assumption on-line "during the conversation", while the verifier cannot ever obtain any additional information, unconditionally (in the information-theoretic sense). ----- File: 1992/tr-92-083 Invariant Signatures and Non-Interactive Zero-Knowledge Proofs are Equivalent Shafi Goldwasser and Rafail Ostrovsky tr-92-083 December 1992 The standard definition of digital signatures allows a document to have many valid signatures. In this paper, we consider a subclass of digital signatures, called invariant signatures, in which all legal signatures of a document must be identical according to some polynomial-time computable function (of a signature) which is hard to predict given an unsigned document. We formalize this notion and show its equivalence to non-interactive zero-knowledge proofs. ----- File: 1993/tr-93-001 Implicit Parallelism in Genetic Algorithms Alberto Bertoni, Marco Dorigo tr-93-001 January 1993 [November 1993 (Second Edition)] This paper is related to Holland's result on implicit parallelism. Roughly speaking, Holland showed a lower bound of the order of $n^3/(c_1 \sqrt{l})$ on the number of schemata usefully processed by the genetic algorithm in a population of $n = c_1 2^l$ binary strings, with $c_1$ a small integer. We analyze the case of a population of $n = 2^{\beta l}$ binary strings, where $\beta$ is a positive parameter (Holland's result is related to the case $\beta = 1$). In the main result, for all $\beta > 0$ we state a lower bound on the expected number of processed schemata; moreover, we prove that this bound is tight up to a constant for all $\beta \geq 1$ and, in this case, we strengthen the previous result in probability.
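The kind of quantity involved can be made concrete with a short calculation: the expected number of order-k schemata represented by at least one string in a population of n uniform random binary strings of length l. This is an illustrative computation, not the bound proved in the paper:

    from math import comb

    def expected_schemata(l, k, n):
        # There are comb(l, k) * 2**k schemata with k fixed positions, and a
        # fixed schema is matched by one uniform random string w.p. 2**-k.
        p_some = 1.0 - (1.0 - 2.0 ** (-k)) ** n
        return comb(l, k) * (2 ** k) * p_some

    l, beta = 12, 1.0
    n = round(2 ** (beta * l))    # population size n = 2**(beta*l), as above
    total = sum(expected_schemata(l, k, n) for k in range(l + 1))
    print(total)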

Keywords: genetic algorithms, implicit parallelism. ----- File: 1993/tr-93-002 Optimization Problems: Expressibility, Approximation Properties and Expected Asymptotic Growth of Optimal Solutions T. Behrendt and K. Compton and E. Graedel tr-93-002 January 1993 We extend the recent approach of Papadimitriou and Yannakakis that relates the approximation properties of optimization problems to their logical representation.

Our work builds on results by Kolaitis and Thakur who systematically studied the expressibility classes MS_n and MP_n of maximization problems and showed that they form a short hierarchy of four levels. The two lowest levels, MS_0 and MS_1, coincide with the classes Max SNP and Max NP of Papadimitriou and Yannakakis; they contain only problems that are approximable in polynomial time up to a constant factor and thus provide a logical criterion for approximability. However, there are computationally very easy maximization problems, such as Maximum Connected Component (MCC), that fail to satisfy this criterion.

We modify these classes by allowing the formulae to contain predicates that are definable in least fixpoint logic. In addition, we maximize not only over relations but also over constants. We call the extended classes MSF_i and MPF_i. The proof of Papadimitriou and Yannakakis can be extended to MSF_1 to show that all problems in this class are approximable. Some problems, such as MCC, descend from the highest level in the original hierarchy to the lowest level MSF_0 in the new hierarchy. Thus our extended class MSF_1 provides a more powerful sufficient criterion for approximability than the original class MS_1.

We separate the extended classes and prove that a number of important problems do not belong to MSF_1. These include Max Clique, Max Independent Set, V-C Dimension and Max Common Induced Subgraph.

To do this we introduce a new method that characterizes rates of growth of average optimal solution sizes. For instance, it is known that the expected size of a maximal clique in a random graph grows logarithmically with respect to the cardinality of the graph. We show that no problem in MSF_1 can have this property, thus proving that Max Clique is not in MSF_1. This technique is related to limit laws for various logics and to the probabilistic method from combinatorics. We believe that this method may be of independent interest.
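The logarithmic growth mentioned above is easy to observe empirically. The following sketch (illustrative, feasible only for small n) computes the clique number of random graphs by a simple branch-and-bound:

    import random

    def max_clique_size(adj):
        best = 0
        def extend(size, cand):
            nonlocal best
            if size + len(cand) <= best:
                return                        # prune: cannot beat current best
            if not cand:
                best = max(best, size)
                return
            v = cand[0]
            extend(size + 1, [u for u in cand[1:] if u in adj[v]])  # take v
            extend(size, cand[1:])                                  # skip v
        extend(0, list(adj))
        return best

    rng = random.Random(0)
    for n in (8, 16, 32):                     # clique number grows roughly log n
        adj = {v: set() for v in range(n)}
        for u in range(n):
            for v in range(u + 1, n):
                if rng.random() < 0.5:
                    adj[u].add(v); adj[v].add(u)
        print(n, max_clique_size(adj))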

In contrast to the recent results on the non-approximability of many maximization problems, among them Max Clique, our results do not depend on any unproved hypothesis from complexity theory, such as P does not equal NP. ----- File: 1993/tr-93-003 Simple Multivariate Polynomial Multiplication Victor Pan tr-93-003 August 1993 We observe that polynomial evaluation and interpolation can be performed fast over a multidimensional grid (lattice), and we apply this observation in order to obtain the bounds $M(c,m) \leq c^m (1 + m + 1.5m + 2 \log_2 c)$ over the fields of constants supporting FFT on $c$ points, $c$ being a power of 2, and $M(c,m) = O(N \log N \log\log c)$ over any field, where $N = c^m$ and $M(c,m)$ denotes the number of arithmetic operations required in order to multiply (over any field $F$) a pair of $m$-variate polynomials whose product has degree at most $c-1$ in each variable, so that $M(c,m) = O(N \log N)$ if $c = O(1)$, $m \rightarrow \infty$ (over any field $F$), versus the known bound of $O(N \log N \log\log N)$. ----- File: 1993/tr-93-004 Mixture Models and the EM Algorithm for Object Recognition within Compositional Hierarchies. Part 1: Recognition Joachim Utans tr-93-004 January 1993 We apply the Expectation Maximization (EM) algorithm to an assignment problem where, in addition to binary assignment variables, analog parameters must be estimated. As an example, we use the problem of part labelling in the context of model-based object recognition where models are stored in the form of a compositional hierarchy. This problem has been formulated previously as a graph matching problem and stated in terms of minimizing an objective function that a recurrent neural network solves. Mjolsness has introduced a "stochastic visual grammar" as a model for this problem; there the matching problem arises from an index renumbering operation via a permutation matrix. The optimization problem w.r.t. the match variables is difficult, and Mean Field Annealing techniques are used to solve it. Here we propose to model the part labelling problem in terms of a mixture of distributions, each describing the parameters of a part. Under this model, the match variables correspond to the a posteriori estimates of the mixture coefficients. The parts in the input image are unlabelled; this problem can be stated as a missing data problem, and the EM algorithm can be used to recover the labels and estimate parameters. The resulting update equations are identical to the Elastic Net equations; however, the update dynamics differ.
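A minimal sketch of the EM updates for such a mixture model, with the posterior responsibilities playing the role of the soft match variables. A generic one-dimensional Gaussian mixture is assumed here for concreteness; this is not the paper's grammar-based part model:

    import numpy as np

    def em_mixture(x, K, iters=50, seed=0):
        rng = np.random.default_rng(seed)
        mu = rng.choice(x, K)
        var = np.full(K, x.var() + 1e-6)
        pi = np.full(K, 1.0 / K)
        for _ in range(iters):
            # E-step: responsibilities = a posteriori mixture coefficients
            d = x[:, None] - mu[None, :]
            logp = -0.5 * d**2 / var - 0.5 * np.log(2 * np.pi * var) + np.log(pi)
            logp -= logp.max(axis=1, keepdims=True)
            r = np.exp(logp)
            r /= r.sum(axis=1, keepdims=True)
            # M-step: re-estimate parameters from the soft assignments
            nk = r.sum(axis=0)
            mu = (r * x[:, None]).sum(axis=0) / nk
            var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
            pi = nk / len(x)
        return pi, mu, var, r

    x = np.concatenate([np.random.normal(0, 1, 200), np.random.normal(5, 1, 200)])
    pi, mu, var, r = em_mixture(x, K=2)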

Keywords: EM algorithm, object recognition, compositional hierarchy, elastic matching, mean field annealing. ----- File: 1993/tr-93-005 A Dynamic Connection Management Scheme for Guaranteed Performance Services in Packet-Switching Integrated Services Networks Colin Parris tr-93-005 January 1993 With the demand for multimedia and computational science applications, guaranteed performance communication services have become a necessary feature of future high-speed networks. These communications services should possess a high level of sophistication so that they can easily adapt the network to the wide variety of applications soon to be seen, thereby allowing the network to increase its availability and flexibility. Availability is the ability of the network to accommodate as many real-time clients as possible without violating any client's performance guarantees, while flexibility is the ability to adapt to changing network state and client demands in order to maintain the performance guarantees and quality of service promised to the client. Flexibility also refers to the ability of the network to easily increase the variety of real-time services that it offers. It is our contention that availability and flexibility can be enhanced in a network by providing the network with the ability to modify the performance parameters and/or the route of any guaranteed performance connection in the network without violating the previously made performance contracts.
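The contract-preserving test at the heart of such a modification scheme can be caricatured in a few lines. The single-link, rate-based check below is a hypothetical simplification for illustration, not the DCM contracts or algorithms of the paper:

    def can_modify(reserved, conn_id, new_rate, capacity):
        # reserved: dict mapping connection id -> currently reserved rate.
        # Accept the new rate only if every other connection keeps its
        # reservation and the link capacity is still respected, so that
        # previously made performance contracts are not violated.
        others = sum(r for c, r in reserved.items() if c != conn_id)
        return others + new_rate <= capacity

    link = {"a": 30.0, "b": 50.0}
    print(can_modify(link, "a", 45.0, capacity=100.0))   # True: 50 + 45 <= 100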

In this paper, we present a scheme for dynamically managing guaranteed performance service connections, together with experimental results that verify the correctness and usefulness of the scheme. The motivation for this scheme, Dynamic Connection Management (DCM), is discussed, and detailed descriptions of the DCM modification contracts and algorithms are provided. A survey of guaranteed performance service protocols, architectures, and routing algorithms is presented together with their relevance to this work. A simulator has been built, and preliminary experiments and analyses were done on the scheme. The paper concludes with a summary and some topics for future work. ----- File: 1993/tr-93-006 A Characterization of Multi-Party Interactive Multimedia Applications Clemens Szyperski and Giorgio Ventre tr-93-006 January 1993 This document tries to define and characterize a class of applications called Multi-Party Interactive Multimedia (MIM), for which many examples are given. This class includes applications such as CSCW, teleconferencing, and remote education; its consideration in this report is based on the observation that MIM applications are both important and representative of the area of high-performance real-time communication. Purely functional criteria are used to capture the MIM class, i.e., ones that are not related to any particular way of implementation. Thus, future directions are sketched that give some indications on what a network architecture will need to provide in order to effectively support such applications. ----- File: 1993/tr-93-007 On Removing Randomness from a Parallel Algorithm for Minimum Cuts Michael Luby, Joseph Naor, Moni Naor tr-93-007 February 1993 The weighted minimum cut problem in a graph is a fundamental problem in combinatorial optimization. Recently, Karger suggested a randomized parallel algorithm for this problem. We show that a similar algorithm can be implemented using only $O(\log^2 n)$ random bits. We also show that our result holds for computing minimum weight k-cuts, where k is fixed. ----- File: 1993/tr-93-008 Galileo: a Tool for Simulation and Analysis of Real-Time Networks Edward W. Knightly and Giorgio Ventre tr-93-008 March 1993 Galileo is a flexible tool for simulation of heterogeneous real-time communication networks and for development and verification of network protocols. Galileo provides several unique features that make it particularly suitable for the simulation and analysis of networks that provide quality-of-service guarantees. First, its object-oriented programming environment provides the means for a modular, hierarchical, heterogeneous description of networks. Second, its multimedia device interface provides the tools for a qualitative analysis of network protocols. Finally, Galileo's network interface provides interaction with actual networks to access real data and simulate realistic multimedia scenarios. ----- File: 1993/tr-93-009 On Deterministic Approximation of DNF Michael Luby and Boban Velickovic tr-93-009 March 1993 We develop efficient deterministic algorithms for approximating the fraction of truth assignments that satisfy a disjunctive normal form formula. Although the algorithms themselves are deterministic, their analysis is probabilistic and uses the notion of limited independence between random variables.
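The quantity being approximated can be estimated by the classical randomized coverage method, sketched below; the deterministic algorithms of this report replace such random sampling by small sample spaces with limited independence, so this sketch is only the randomized baseline, not the authors' derandomization:

    import random

    def approx_dnf_fraction(terms, n, samples=20000, seed=1):
        # Each term is a dict {variable index: required bool}; an assignment
        # satisfies the formula if it satisfies at least one term.
        rng = random.Random(seed)
        weights = [2 ** (n - len(t)) for t in terms]    # #assignments per term
        W = sum(weights)
        acc = 0.0
        for _ in range(samples):
            i = rng.choices(range(len(terms)), weights)[0]  # term by weight
            a = {v: rng.random() < 0.5 for v in range(n)}   # uniform assignment
            a.update(terms[i])                              # ...forced to satisfy term i
            cover = sum(all(a[v] == b for v, b in t.items()) for t in terms)
            acc += 1.0 / cover                              # cover >= 1 always
        return (W * acc / samples) / 2 ** n

    # Example: (x0 AND x1) OR (NOT x2) over n = 3 variables; exact fraction 5/8
    terms = [{0: True, 1: True}, {2: False}]
    print(approx_dnf_fraction(terms, 3))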
----- File: 1993/tr-93-010 Optimal Speedup of Las Vegas Algorithms Michael Luby and Alistair Sinclair and David Zuckerman tr-93-010 March 1993 Let A be a Las Vegas algorithm, i.e., A is a randomized algorithm that always produces the correct answer when it stops but whose running time is a random variable. We consider the problem of minimizing the expected time required to obtain an answer from A using strategies which simulate A as follows: run A for a fixed amount of time t_1, then run A independently for a fixed amount of time t_2, etc. The simulation stops if A completes its execution during any of the runs. Let S=(t_1,t_2,...) be a strategy, and let $\ell_A = \inf_S T(A,S)$, where T(A,S) is the expected value of the running time of the simulation of A under strategy S.
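Such strategies are easy to simulate. The sketch below runs a toy Las Vegas routine under a fixed strategy, using the universal restart sequence 1,1,2,1,1,2,4,... that is widely associated with this line of work (an assumption here; the abstract itself does not spell the sequence out):

    import random

    def universal(i):
        # i-th term (1-indexed) of the sequence 1,1,2,1,1,2,4,1,1,2,...
        k = 1
        while (1 << k) - 1 < i:
            k += 1
        if (1 << k) - 1 == i:
            return 1 << (k - 1)
        return universal(i - (1 << (k - 1)) + 1)

    def simulate(run, max_runs=100000):
        # run(budget) -> (steps_used, finished): one independent truncated
        # execution of A; total counts all work across restarts.
        total = 0
        for i in range(1, max_runs + 1):
            steps, done = run(universal(i))
            total += steps
            if done:
                return total
        return total

    rng = random.Random(1)
    def toy_las_vegas(budget):       # hypothetical A with geometric runtime
        t = 1
        while rng.random() > 0.01:
            t += 1
        return (min(t, budget), t <= budget)

    print(simulate(toy_las_vegas))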

We describe a simple universal strategy $S^{univ}$, with the property that, for any algorithm A, $T(A,S^{univ}) = O(\ell_A \log(\ell_A))$. Furthermore, we show that this is the best performance that can be achieved, up to a constant factor, by any universal strategy. ----- File: 1993/tr-93-011 Graceful Adaptation of Guaranteed Performance Service Connections Colin Parris, Giorgio Ventre, Hui Zhang tr-93-011 March 1993 Most of the solutions proposed to support real-time communication services in a packet-switching network adopt a connection-oriented and reservation-oriented approach. In this approach, the resource allocation and route selection decisions are made before the start of the application on the basis of resource availability and real-time network load at that time, and are usually kept for the duration of the application. However, such an approach shows two major limitations: first, the communication service provided is usually fixed, with no or limited capability of adaptation to dynamic changes in the clients' requirements; second, a low utilization of the network may be observed. In this paper, we present a flexible management scheme that allows graceful adaptation of guaranteed performance service connections. Mechanisms have been devised to allow changing of the traffic and performance parameters of a real-time communication during its lifetime. These mechanisms, together with an adaptation policy, can make more efficient use of the network resources by performing cooperative, consenting, high-level multiplexing. We distinguish between two types of adaptation: client-initiated adaptation and network-initiated adaptation. We give examples of both types, and we also present results from simulation experiments to verify the correctness of our proposal. ----- File: 1993/tr-93-012 Estimation of noise spectrum and its application to SNR-estimation and speech enhancement Hans-Günter Hirsch tr-93-012 March 1993 One possible solution to improve recognition of noisy speech is the application of noise reduction techniques. Spectral subtraction is one well-known technique to reduce stationary background noise in the case of recordings with a single microphone. An estimation of the noise spectrum is necessary to apply this method. The determination of segments containing just noise is usually a difficult task. This report describes a method to estimate the noise spectrum without the need to distinguish between segments of noisy speech and segments of pure noise. The estimation of noise power inside one subband is based on an analysis of the histogram of a certain number of past short-term energy values inside this subband. This technique for estimating the noise spectrum can be used to estimate the actual signal-to-noise ratio (SNR). Another application is its integration into a spectral subtraction technique for speech enhancement. ----- File: 1993/tr-93-013 Optimal Stochastic Quadrature Formulas For Convex Functions Erich Novak and Knut Petras tr-93-013 March 1993 We study optimal stochastic (or Monte Carlo) quadrature formulas for convex functions. While nonadaptive Monte Carlo methods are not better than deterministic methods, we prove that adaptive Monte Carlo methods are much better. ----- File: 1993/tr-93-014 Optimal Recovery and n-Widths For Convex Classes of Functions Erich Novak tr-93-014 March 1993 We study the problem of optimal recovery in the case of a nonsymmetric convex class of functions. In particular we show that adaptive methods may be much better than nonadaptive methods.
We define certain Gelfand-type widths that are useful for nonsymmetric classes and prove relations to optimal error bounds for adaptive and nonadaptive methods, respectively. ----- File: 1993/tr-93-015 Channel Groups: A Unifying Abstraction for Specifying Inter-stream Relationships Amit Gupta and Mark Moran tr-93-015 March 1993 A single distributed application typically requires setting up a number of real-time connections, or channels. Current schemes usually assume that different channels are independent, when in reality, important relationships often exist between them. We introduce a new abstraction called channel groups that allows network clients to describe these relationships explicitly to the network service provider. For example, by describing sharing relationships between channels, the network client enables the network to share resource allocations among related channels, lowering the cost and improving the scalability of communication. In addition, specification of other relationships, such as inter-stream synchronization, disjoint-path routing, relative dropping priorities, and simultaneous establishment provides a richer, more efficient service. Channel groups provide a unifying abstraction and an easily-extensible interface for specifying these and other relationships. This report presents a general description of the channel group abstraction and demonstrates its usefulness in describing several types of inter-stream relationships. ----- File: 1993/tr-93-016 Accelerated Solution of the Tridiagonal Symmetric Eigenvalue Problem Victor Pan tr-93-016 March 1993 We present new algorithms that accelerate the bisection method for the symmetric eigenvalue problem. The algorithms rely on some new techniques, which include acceleration of Newton's iteration and can also be further applied to acceleration of some other iterative processes, in particular, of iterative algorithms for approximating polynomial zeros. ----- File: 1993/tr-93-017 Efficient Multicasting for Interactive Multimedia Applications Clemens Szyperski and Giorgio Ventre tr-93-017 March 1993 A specific class of multimedia applications is expected to be of importance for future communication networks: Multi-Party Interactive Multimedia (MIM). Based on the isolation and characterization of MIM applications, concrete network support requirements are derived in this paper. The varying degree of connectivity, the vastly different sizes in terms of participants and the reliance on a guaranteed Quality of Service make MIM support a difficult problem. Starting with the definition of multimedia communication abstractions, principles of solutions are sketched. For an important subclass of applications a particularly efficient and practicable alternative implementation based on half-duplex channels is introduced. Finally, interfaces at both the transport and network layers are considered. ----- File: 1993/tr-93-018 Navigation Without Perception of Coordinates and Distances Armin Hemmerling tr-93-018 March 1993 We consider the target-reaching problem in plane scenes for a point robot which has a tactile sensor and can locate the target ray. It might have a compass, too, but it is able neither to perceive the coordinates of its position nor to measure distances. The complexity of an algorithm is measured by the number of straight moves until reaching the target, as a function of the number of vertices of the (polygonal) scene.

It is shown how the target point can be reached by exhaustive search without using a compass, with complexity $\exp(O(n^{2}))$. Using a compass, there is a target-reaching algorithm, based on rotation counting, with complexity $O(n^{2})$.

The decision problem of recognizing whether the target cannot be reached because it belongs to an obstacle cannot be solved by our type of robot. If the behaviour of a robot without a compass is periodic in a homogeneous environment, it cannot solve the target-reaching problem.

Keywords: motion planning, on-line algorithms, labyrinth problems, exhaustive search, rotation counting, trap constructions, power of compass. ----- File: 1993/tr-93-019 Matchings in Lattice Graphs (Preliminary Version) Claire Kenyon, Dana Randall, Alistair Sinclair tr-93-019 March 1993 We study the problem of counting the number of matchings of given cardinality in a d-dimensional rectangular lattice. This problem arises in several models in statistical physics, including monomer-dimer systems and cell-cluster theory. A classical algorithm due to Fisher, Kasteleyn and Temperley counts perfect matchings exactly in two dimensions, but it is not applicable in higher dimensions and does not allow one to count matchings of arbitrary cardinality. In this paper, we present the first efficient approximation algorithms for counting matchings of arbitrary cardinality in (i) d-dimensional ``periodic'' lattices (i.e., with wrap-around edges) in any fixed dimension d; and (ii) two-dimensional lattices with ``fixed boundary conditions'' (i.e., no wrap-around edges). Our technique generalizes to approximately counting matchings in any bipartite graph that is the Cayley graph of some finite group. ----- File: 1993/tr-93-020 Design Principles of Parallel Operating Systems: ---A PEACE Case Study--- Wolfgang Schröder-Preikschat tr-93-020 April 1993 Forthcoming massively parallel systems are distributed memory architectures. They consist of several hundreds to thousands of autonomous processing nodes interconnected by a high-speed network. A major challenge in operating system design for massively parallel architectures is to design a structure that reduces system bootstrap time, avoids bottlenecks in serving system calls, promotes fault tolerance, is dynamically alterable, and is application-oriented. In addition, system-wide message passing is required to have very low latency and very high efficiency. State-of-the-art parallel operating system design must obey the maxim of not punishing an application with unneeded system functions. This requires designing a parallel operating system as a family of program modules, with parallel applications being an integral part of that family, and motivates object orientation to achieve an efficient implementation.

Keywords: MIMD systems, parallel operating systems, microkernel family, object orientation ----- File: 1993/tr-93-021 CNS-1 Architecture Specification: A Connectionist Network Supercomputer Krste Asanovic, James Beck, Tim Callahan, Jerry Feldman, Bertrand Irissou, Brian Kingsbury, Phil Kohn, John Lazzaro, Nelson Morgan, David Stoutamire and John Wawrzynek tr-93-021 April 1993 A Collaboration of the University of California, Berkeley and the International Computer Science Institute. The Connectionist Network Supercomputer, or CNS-1, is a multi-year effort to incorporate recent advances in VLSI design and application-specific computer architecture for the realization of a massively parallel machine. Application targets for the CNS-1 include connectionist networks in the areas of speech recognition, language modeling, vision, and hardware simulation for VLSI. This technical report presents the background and motivation for high-level design decisions, along with descriptions of several hardware and software elements. The document represents a "snapshot" of the design, which is expected to be operational in 1995.

Keywords: connectionist networks, VLSI, computer architecture, Torrent, Hydrant, application-specific, massively parallel. ----- File: 1993/tr-93-022 A Multivalued Evolutionary Algorithm Hans-Michael Voigt, Joachim Born & Ivan Santibanez-Koref tr-93-022 April 1993 With this paper we present a Multivalued Evolutionary Algorithm (MEA) which is inspired by fuzzy set theory. The genetic representation and encoding is done in such a way that no inferences can be drawn from phenotype to genotype. This representation influences the genetic operators used. The basic operators of the algorithm will be explained, and comparisons with recently published results on global optimization problems will be presented. ----- File: 1993/tr-93-023 Training Agents to Perform Sequential Behavior Marco Colombetti, Marco Dorigo tr-93-023 September 1993 This paper is concerned with training an agent to perform sequential behavior. In previous work we have been applying reinforcement learning techniques to control a reactive robot. Obviously, a purely reactive system is limited in the kind of interactions it can learn. In particular, it can only learn what we call pseudo-sequences, that is, sequences of actions in which the transition signal is generated by the appearance of a sensorial stimulus. We discuss the difference between pseudo-sequences and proper sequences, and the implications that these differences have on training procedures. A result of our research is that, in the case of proper sequences, for learning to be successful the agent must have some kind of memory; moreover, it is often necessary to let the trainer and the learner communicate. We therefore study the influence of communication on the learning process. First we consider trainer-to-learner communication, introducing the concept of a reinforcement sensor, which lets the learning robot explicitly know whether the last reinforcement was a reward or a punishment; we also show how the use of this sensor induces the creation of a set of error recovery rules. Then we introduce learner-to-trainer communication, which is used to disambiguate indeterminate training situations, that is, situations in which observation alone of the learner's behavior does not provide the trainer with enough information to decide whether the learner is performing a right or a wrong move. All the design choices we make are discussed and compared by means of experiments in a simulated world.

Keywords: machine learning, adaptive systems, genetic algorithms, learning classifier systems, behavior-based robotics, reinforcement learning. ----- File: 1993/tr-93-024 Generalized Vandermonde Determinants over the Chebyshev Basis Thorsten Werther tr-93-024 April, 1993 The recent developments in the area of interpolation and learnability of sparse polynomials over the reals are based on the nonsingularity of the generalized Vandermonde matrix. In this paper we study real polynomials that admit sparse representations in the Chebyshev basis. The main result of the paper states the analogue of Mitchell's theorem for the Chebyshev case, i.e., the determinant of the generalized Vandermonde matrix built over the Chebyshev basis can be represented in this basis as the product of the standard Vandermonde determinant and a polynomial with nonnegative integer coefficients. An immediate consequence of this result is the nonsingularity of Vandermonde matrices over the Chebyshev basis, provided that the indeterminates take distinct values greater than 1.
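The determinant identity is easy to check numerically. In the Python sketch below (nodes and degrees chosen arbitrarily for illustration), the ratio of the Chebyshev-basis generalized Vandermonde determinant to the standard Vandermonde determinant comes out positive, as the theorem predicts for distinct values greater than 1:

    import numpy as np
    from numpy.polynomial import Chebyshev

    def cheb_vandermonde(xs, degs):
        # Entry (i, j) is T_{degs[j]}(xs[i]).
        return np.array([[Chebyshev.basis(d)(x) for d in degs] for x in xs])

    xs = np.array([1.1, 1.7, 2.3])          # distinct values greater than 1
    degs = [0, 2, 5]                        # a sparse set of Chebyshev degrees
    det = np.linalg.det(cheb_vandermonde(xs, degs))
    vdm = np.prod([xs[j] - xs[i]
                   for i in range(len(xs)) for j in range(i + 1, len(xs))])
    print(det / vdm)                        # positive ratio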

As an application, we investigate the relationship between the number of real roots of a polynomial and its sparsity with respect to the Chebyshev basis. We prove that the number of real zeros of a polynomial, either to the left or to the right of the interval of orthogonality, does not exceed its sparsity with respect to the Chebyshev basis. The bound on the number of real roots is used to prove finiteness of the Vapnik-Chervonenkis dimension (and thereby uniform learnability) of the class of polynomials of bounded sparsity over the Chebyshev basis. ----- File: 1993/tr-93-025 Kohonen Feature Maps and Growing Cell Structures - a Performance Comparison Bernd Fritzke tr-93-025 May 1993 A performance comparison of two self-organizing networks, the Kohonen Feature Map and the recently proposed Growing Cell Structures, is made. For this purpose several performance criteria for self-organizing networks are proposed and motivated. The models are tested with three example problems of increasing difficulty. The Kohonen Feature Map demonstrates slightly superior results only for the simplest problem. For the other, more difficult and also more realistic problems, the Growing Cell Structures exhibit significantly better performance by every criterion. Additional advantages of the new model are that all parameters are constant over time and that size as well as structure of the network are determined automatically.
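For reference, a minimal Kohonen-style update loop (the baseline model in this comparison) might look as follows; the parameter schedules are illustrative and are exactly the kind of time-varying parameters that the Growing Cell Structures model avoids:

    import numpy as np

    def kohonen_1d(data, units=10, iters=2000, seed=0):
        rng = np.random.default_rng(seed)
        w = rng.random((units, data.shape[1]))
        for t in range(iters):
            x = data[rng.integers(len(data))]
            b = int(np.argmin(((w - x) ** 2).sum(axis=1)))   # best-matching unit
            lr = 0.5 * (1.0 - t / iters)                     # decaying rate
            radius = max(1.0, (units / 2.0) * (1.0 - t / iters))
            for j in range(units):
                h = np.exp(-((j - b) ** 2) / (2.0 * radius ** 2))  # neighborhood
                w[j] += lr * h * (x - w[j])
        return w

    data = np.random.rand(500, 2)            # uniform square as toy input
    codebook = kohonen_1d(data)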

Keywords: feature map, incremental learning, Kohonen

* Presented at NIPS 5 in Denver ----- File: 1993/tr-93-026 Growing Cell Structures - a self-organizing network for unsupervised and supervised learning Bernd Fritzke tr-93-026 May 1993 We present a new self-organizing neural network model having two variants. The first variant performs unsupervised learning and can be used for data visualization, clustering, and vector quantization. The main advantage over existing approaches, e.g., the Kohonen feature map, is the ability of the model to automatically find a suitable network structure and size. This is achieved through a controlled growth process which also includes occasional removal of units. The second variant of the model is a supervised learning method which results from the combination of the abovementioned self-organizing network with the radial basis function (RBF) approach. In this model it is possible - in contrast to earlier approaches - to perform the positioning of the RBF units and the supervised training of the weights in parallel. Therefore, the current classification error can be used to determine where to insert new RBF units. This leads to small networks which generalize very well. Results on the two-spirals benchmark and a vowel classification problem are presented which are better than any results previously published.

Keywords: Self-organization, incremental learning, radial basis function, clustering, data visualization, pattern classification, two spiral problem, feature map ----- File: 1993/tr-93-027 A Stochastic Model of Actions and Plans for Anytime Planning under Uncertainty Sylvie Thiebaux, Joachim Hertzberg, William Shoaff, Moti Schneider tr-93-027 May 1993 Building planning systems that operate in real domains requires coping with both uncertainty and time pressure. This paper describes a model of reaction plans, which are generated using a formalization of actions and of state descriptions in probabilistic logic, as a basis for anytime planning under uncertainty.

The model has the following main features. At the action level, we handle incomplete and ambiguous domain information, and reason about alternative action effects whose probabilities are given. On this basis, we generate reaction plans that specify different courses of action, reflecting the domain uncertainty and alternative action effects; if generation time was insufficient, these plans may be left unfinished, but they can be reused, incrementally improved, and finished later. At the planning level, we develop a framework for measuring the quality of plans that takes domain uncertainty and probabilistic information into account using Markov chain theory; based on this framework, one can design anytime algorithms that focus first on those parts of an unfinished plan whose completion promises the most ``gain''. Finally, the plan quality can be updated during execution, according to additional information acquired, and can therefore be used for on-line planning. ----- File: 1993/tr-93-028 pSather: Layered Extensions to an Object-Oriented Language for Efficient Parallel Computation Stephan Murer, Jerome A. Feldman, Chu-Cheow Lim, and Martina-Maria Seidel tr-93-028 June 1993 [November 1993 (2nd edition)] pSather is a parallel extension of the existing object-oriented language Sather. It offers a shared-memory programming model which integrates both control- and data-parallel extensions. This integration increases the flexibility of the language to express different algorithms and data structures, especially on distributed-memory machines (e.g., CM-5). This report describes our design objectives and the programming language pSather in detail. ----- File: 1993/tr-93-029 Labeling RAAM Alessandro Sperduti tr-93-029 May 1993 In this report we propose an extension of the Recursive Auto-Associative Memory (RAAM) by Pollack. This extension, the Labeling RAAM (LRAAM), is able to encode labeled graphs with cycles by representing pointers explicitly. A theoretical analysis of the constraints imposed on the weights by the learning task under the hypothesis of perfect learning and linear output units is presented. Cycles and confluent pointers turn out to be particularly effective in imposing constraints on the weights. Some technical problems encountered in the RAAM, such as the termination problem in the learning and decoding processes, are solved more naturally in the LRAAM framework. The representations developed for the pointers seem to be robust to recurrent decoding along a cycle. Data encoded in an LRAAM can be accessed by pointer as well as by content. Direct access by content can be achieved by transforming the encoder network of the LRAAM into a Bidirectional Associative Memory (BAM). Different access procedures can be defined according to the access key. The access procedures are not wholly reliable; however, they seem to have a high likelihood of success. A geometric interpretation of the decoding process is given, and the representations developed in the pointer space of a two-hidden-unit LRAAM are presented and discussed. In particular, the pointer space turns out to be partitioned in a fractal-like fashion. Some effects on the representations induced by the Hopfield-like dynamics of the pointer decoding process are discussed, and an encoding scheme able to retain the richness of representation devised by the decoding function is outlined. The application of the LRAAM model to the control of the dynamics of recurrent high-order networks is briefly sketched as well.
----- File: 1993/tr-93-030 Sensitivity of Boolean Functions, Harmonic Analysis, and Circuit Complexity Anna Bernasconi and Bruno Codenotti tr-93-030 June 1993 We exploit the notion of sensitivity of Boolean functions to obtain complexity results. We first analyze the distribution of the average sensitivity over the set of all Boolean functions, and show some applications of this analysis. We then use harmonic analysis on the cube to study how the average sensitivity of a Boolean function propagates if the function corresponds, e.g., to an oracle available to compute another function. We use this relation to prove that symmetric functions in $AC^0$ have exponentially decreasing average sensitivity. ----- File: 1993/tr-93-031 On Some Stability Properties of the LRAAM Model Alessandro Sperduti tr-93-031 June 1993 In this report we discuss some mathematical properties of the LRAAM model. The LRAAM model is an extension of the RAAM model by Pollack. It allows one to obtain distributed reduced representations of labeled graphs. In particular, we give sufficient conditions on the asymptotical stability of the decoding process along a cycle of the encoded structure. Data encoded in an LRAAM can also be accessed by content by transforming the LRAAM into an analog Hopfield network with hidden units and an asymmetric connection matrix (CA network). Different access procedures can be defined according to the access key. Each access procedure corresponds to a particular constrained version of the CA network. We give sufficient conditions under which the property of asymptotical stability of a fixed point in one particular constrained version of the CA network can be extended to related fixed points of different constrained versions of the CA network. An example of encoding of a labeled graph on which the theoretical results are applied is given as well. ----- File: 1993/tr-93-032 Repetitive Hidden-Surface-Removal for Polyhedra Marco Pellegrini tr-93-032 July 1993 The repetitive hidden-surface-removal problem can be rephrased as the problem of finding the most compact representation of all views of a polyhedral scene that allows efficient on-line retrieval of a single view. In this paper we present a novel approach to this problem. We assume that a polyhedral scene in 3-space is given in advance and is preprocessed off-line into a data structure. Afterwards, the data structure is accessed repeatedly with view-points given on-line and the portions of the polyhedra visible from each view-point are produced on-line. This mode of operation is close to that of real interactive display systems. The main difficulty is to preprocess the scene without knowing the query view-points.

Let $n$ be the total number of edges, vertices, and faces of the polyhedral objects and let $k$ be the number of vertices and edges of the image. The main result of this paper is that, using an off-line data structure of size $m$ with $n^{1+\epsilon} \leq m \leq n^{2+\epsilon}$, it is possible to answer on-line hidden-surface-removal queries in time $O(k\log n + \min\{n\log n, kn^{1+\epsilon}/m^{1/2}\})$, when the scene is composed of $c$-oriented polyhedra. This data structure accommodates dynamic insertion and deletion of polyhedral objects. The polyhedra may intersect and may have cycles in the dominance relation. We also improve worst-case time/storage bounds for the repetitive hidden-surface-removal problem when the polyhedral scene is composed of unrestricted polyhedra.

A preliminary version of this work appeared in the Proceedings of the 1993 Workshop on Algorithms and Data Structures. ----- File: 1993/tr-93-033 Turning an Action Formalism Into a Planner---A Case Study Joachim Hertzberg, Sylvie Thiebaux tr-93-033 July 1993 The paper describes a case study that explores the idea of building a planner with a neat semantics of the plans it produces, by choosing some action formalism that is ``ideal'' for the planning application and building the planner accordingly. In general---and particularly so for the action formalism used in this study, which is quite expressive---this strategy is unlikely to yield fast and efficient planners if the formalism is used naively. Therefore, we adopt the idea that the planner approximates the theoretically ideal plans, where the approximation gets closer the more run time the planner is allowed. As the particular formalism underlying our study allows a significant degree of uncertainty to be modeled and copes with the ramification problem, we end up with a planner that is functionally comparable to modern anytime uncertainty planners, yet is based on a neat formal semantics. ----- File: 1993/tr-93-034 On Lines Missing Polyhedral Sets in 3-Space Marco Pellegrini tr-93-034 July 1993 We show some combinatorial and algorithmic results concerning sets of lines and polyhedral objects in 3-space. Our main results include:

(1) An $O(n^3 2^{c\sqrt{\log n}})$ upper bound on the worst-case complexity of the set of lines missing a star-shaped compact polyhedron with $n$ edges, where $c$ is a suitable constant.

(2) An $O(n^3 2^{c\sqrt{\log n}})$ upper bound on the worst-case complexity of the set of lines that can be moved to infinity without intersecting a set of $n$ given lines, where $c$ is a suitable constant. This bound is almost tight.

(3) An $O(n^{1.5+\epsilon})$ randomized expected-time algorithm that tests whether a direction $v$ exists along which a set of $n$ red lines can be translated away from a set of $n$ blue lines without collisions.

(4) Computing the intersection of two polyhedral terrains in 3-space with $n$ total edges in time $O(n^{4/3+\epsilon} + k^{1/3}n^{1+\epsilon} + k\log^2 n)$, where $k$ is the size of the output and $\epsilon > 0$ is an arbitrarily small but fixed constant. This algorithm improves on the best previous result of Chazelle et al.

The tools used to obtain these results include Plücker coordinates of lines, random sampling, and polarity transformations in 3-space.

A preliminary version of this work appeared in the Proceedings of the 9th ACM Symposium on Computational Geometry. ----- File: 1993/tr-93-035 Perturbation: An Efficient Technique for the Solution of Very Large Instances of the Euclidean TSP B. Codenotti, G. Manzini, L. Margara and G. Resta tr-93-035 July 1993 In this paper we introduce a technique for building efficient iterated local search procedures. This technique, which we call perturbation, uses global information on TSP instances to speed up heuristic methods and to improve the quality of the tours they find. Experimental results on instances with up to 100,000 cities show that our technique outperforms the known methods for iterating local search on very large instances.
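The perturbation technique of this report exploits global instance information; as a generic reference point only, here is a minimal Python sketch of plain iterated local search for the Euclidean TSP (2-opt descent plus a random double-bridge kick), not the report's method:

import math, random

def tour_length(tour, pts):
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(tour, pts):
    # local search: reverse segments as long as that shortens the tour
    # (naive quadratic neighborhood scan; only suitable for small demos)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 2, len(tour) + 1):
                cand = tour[:i] + tour[i:j][::-1] + tour[j:]
                if tour_length(cand, pts) < tour_length(tour, pts):
                    tour, improved = cand, True
    return tour

def double_bridge(tour):
    # the generic "kick" used to escape local optima (needs >= 5 cities)
    i, j, k = sorted(random.sample(range(1, len(tour)), 3))
    return tour[:i] + tour[j:k] + tour[i:j] + tour[k:]

def iterated_local_search(pts, kicks=100):
    best = two_opt(list(range(len(pts))), pts)
    for _ in range(kicks):
        cand = two_opt(double_bridge(best), pts)
        if tour_length(cand, pts) < tour_length(best, pts):
            best = cand
    return best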

Keywords: TSP, sensitivity, perturbation, heuristics, experimental evaluation. ----- File: 1993/tr-93-036 Sparse Interpolation from Multiple Derivatives Thorsten Werther tr-93-036 July 1993 In this note, we consider the problem of interpolating a sparse function from the values of its multiple derivatives at some given point. We give efficient algorithms for reconstructing sparse Fourier series and sparse polynomials over Sturm-Liouville bases. In both cases, the number of evaluations is linear in the sparsity. ----- File: 1993/tr-93-037 An Algorithm to Learn Read-Once Threshold Formulas, and some generic Transformations between Learning Models (Revised Version) Nader H. Bshouty, Thomas R. Hancock, Lisa Hellerstein, Marek Karpinski tr-93-037 July 1993 We present a membership query (i.e., black-box interpolation) algorithm for exactly identifying the class of read-once formulas over the basis of Boolean threshold functions. We also present a catalogue of generic transformations that can be used to convert an algorithm in one learning model into an algorithm in a different model. ----- File: 1993/tr-93-038 Exploitation of Structured Gating Connections for the Normalization of a Visual Pattern Alessandro Sperduti tr-93-038 July 1993 Structured gating connections can be useful to reduce the complexity of networks with a high number of inputs. An example of their application to the normalization of a visual pattern with respect to scale and position is presented. The use of gating connections allows us to have a number of connections that is linear in the number of pixels. The connections are also very localized. ----- File: 1993/tr-93-039 Building convex space partitions induced by pairwise interior-disjoint simplices Marco Pellegrini tr-93-039 August 1993 Given a set $S$ of $n$ pairwise interior-disjoint $(d-1)$-simplices in $d$-space, for $d \geq 3$, a Convex Space Partition induced by $S$ (denoted $CSP(S)$) is a partition of $d$-space into convex cells such that the interior of each cell does not intersect the interior of any simplex in $S$. In this paper it is shown that a $CSP(S)$ of size $O(n^{d-1})$ can be computed deterministically in time $O(n^{d-1})$. These bounds are worst-case optimal for $d=3$. The results are proved using a variation of the efficient hierarchical cuttings of Chazelle. ----- File: 1993/tr-93-040 Efficient PRAM Simulation on a Distributed Memory Machine R. Karp, M. Luby and F. Meyer auf der Heide tr-93-040 August 1993 We present algorithms for the randomized simulation of a shared memory machine (PRAM) on a Distributed Memory Machine (DMM). In a PRAM, memory conflicts occur only through concurrent access to the same cell, whereas the memory of a DMM is divided into modules, one for each processor, and concurrent accesses to the same module create a conflict. The delay of a simulation is the time needed to simulate a parallel memory access of the PRAM. Any general simulation of an m processor PRAM on an n processor DMM will necessarily have delay at least m/n. A randomized simulation is called time-processor optimal if the delay is O(m/n) with high probability. Using a novel simulation scheme based on hashing we obtain a time-processor optimal simulation with delay O(log log(n) log*(n)). The best previous simulations use a simpler scheme based on hashing and have much larger delay.
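A toy Python model (not the simulation scheme of the abstract above) of why hashing matters in this setting: requests to one module serialize, so the delay of a PRAM step is the load of the busiest module, and a hash function spreads the shared address space over the modules. The function h below is a hypothetical stand-in for the hash families such schemes actually use:

import random

def pram_step_delay(addresses, n_modules, h):
    # each processor issues one request; the step costs as much as
    # the most heavily loaded memory module
    load = [0] * n_modules
    for a in addresses:
        load[h(a)] += 1
    return max(load)

random.seed(0)
_table = {}
def h(a):
    # lazily sampled random function, standing in for a universal hash
    if a not in _table:
        _table[a] = random.randrange(64)
    return _table[a]

requests = [random.randrange(10_000) for _ in range(64)]  # 64 processors, one request each
delay = pram_step_delay(requests, 64, h)  # for random requests, typically O(log n / log log n)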
----- File: 1993/tr-93-041 Optimal Parallelization of Las Vegas Algorithms Michael Luby and Wolfgang Ertel tr-93-041 September 1993 Let $A$ be a Las Vegas algorithm, i.e., $A$ is a randomized algorithm that always produces the correct answer when it stops but whose running time is a random variable. In [LSZ93] a method was developed for minimizing the expected time required to obtain an answer from $A$ using sequential strategies which simulate $A$ as follows: run $A$ for a fixed amount of time $t_1$, then run $A$ independently for a fixed amount of time $t_2$, etc. The simulation stops if $A$ completes its execution during any of the runs.
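A minimal Python sketch of the sequential restart strategy just described, under the assumption that a run of $A$ can be aborted once its budget is exhausted; the schedule shown is the universal sequence 1,1,2,1,1,2,4,... of [LSZ93], and sample_runtime is a hypothetical stand-in for the unknown runtime distribution of $A$ (the parallel strategies of this report are not sketched):

import random

def luby(i):
    # i-th term (1-indexed) of the universal restart sequence 1,1,2,1,1,2,4,...
    k = 1
    while (1 << k) - 1 < i:
        k += 1
    if i == (1 << k) - 1:
        return 1 << (k - 1)
    return luby(i - (1 << (k - 1)) + 1)

def restart_strategy(sample_runtime, max_runs=10_000):
    # run A with budget t_1, abort, rerun independently with budget t_2, ...
    total = 0.0
    for i in range(1, max_runs + 1):
        t = luby(i)
        run = sample_runtime()      # hidden runtime of a fresh, independent run
        if run <= t:
            return total + run      # this run finished within its budget
        total += t                  # budget exhausted: restart
    return float('inf')

# example: a heavy-tailed runtime distribution, where restarts pay off
avg = sum(restart_strategy(lambda: random.paretovariate(0.5))
          for _ in range(1000)) / 1000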

In this paper, we consider parallel simulation strategies for this same problem, i.e., strategies where many sequential strategies are executed independently in parallel using a large number of processors. We present a close-to-optimal parallel strategy for the case when the distribution of $A$ is known. If the number of processors is below a certain threshold, we show that this parallel strategy achieves almost linear speedup over the optimal sequential strategy. For the more realistic case where the distribution of $A$ is not known, we describe a universal parallel strategy whose expected running time is only a logarithmic factor worse than that of an optimal parallel strategy. Finally, the application of the described parallel strategies to a randomized automated theorem prover confirms the theoretical results and shows that in most cases good speedup can be achieved up to hundreds of processors, even on networks of workstations. ----- File: 1993/tr-93-042 Lower Bounds on Complexity of Testing Membership to a Polygon for Algebraic and Randomized Decision Trees Dima Grigoriev, Marek Karpinski tr-93-042 August 1993 We describe a new method for proving lower bounds for algebraic decision trees. We prove, for the first time, that the minimum depth of arbitrary decision trees for the problem of testing membership to a polygon with N nodes is Omega(log N). Moreover, we prove that the corresponding lower bound for randomized decision trees matches the above bound. Finally, we prove that for algebraic exp-log decision trees (cf. [GSY 93]), the minimum depth is Omega(sqrt(log N)). We generalize the last result to the multidimensional case, showing that if an exp-log decision tree tests membership to a semialgebraic set with a sum of Betti numbers M, then the depth of the tree is at least Omega(sqrt(log M)). ----- File: 1993/tr-93-043 Finite Branching Processes and AND/OR Tree Evaluation Richard Karp tr-93-043 December 1993 This paper studies tail bounds on supercritical branching processes with finite offspring distributions. Given a finite supercritical branching process $\{Z_n\}_{n=0}^{\infty}$, we derive upper bounds, decaying exponentially fast as $c$ increases, on the right-tail probability $\Pr[Z_n > c\,E(Z_n)]$. We obtain a similar upper bound on the left-tail probability $\Pr[Z_n < \frac{E(Z_n)}{c}]$ under the assumption that each individual generates at least two offspring. As an application, we observe that the evaluation of an AND/OR tree by a canonical algorithm in certain probabilistic models can be viewed as a two-type supercritical finite branching process, and show that the execution time of this algorithm is likely to concentrate around its expectation. ----- File: 1993/tr-93-044 An application of a neural net for fuzzy abductive reasoning Matthias Kaiser tr-93-044 August 1993 This is a description of a simple system that is capable of performing abductive reasoning over fuzzy data, using a back-propagation neural net for the hypothesis generation process.

I will first outline and exemplify the notion of abduction as a process of building hypotheses on the basis of a given set of data, evaluating them to find the best hypothesis, and explaining the selection made. I extend this notion to account for abductive reasoning over fuzzy data. As an example, I describe the classification of objects, according to fuzzy sensory features, into previously learned categories; each category is represented by a set of objects described by feature-value pairs, from which prototypes forming the center of the category are detected.

Next, a brief description is given of the back-propagation algorithm and the design of a demonstration system capable of carrying out abductive reasoning in a small example domain. The system is able to learn to classify kinds of fruit given certain feature-value pairs and to detect the most prototypical feature-value-pair clusters within a category. The trained neural net is used for the hypothesis generation process. It also provides very critical information for the evaluation and explanation of hypotheses. I then discuss the implementation of an evaluation and explanation component using the specific capabilities of the neural net. ----- File: 1993/tr-93-045 Sather Iters: Object-Oriented Iteration Abstraction Stephan Murer, Steve Omohundro, and Clemens Szyperski tr-93-045 August 1993 Sather iters are a powerful new way to encapsulate iteration. We argue that such iteration abstractions belong in a class' interface on an equal footing with its routines. Sather iters were derived from CLU iterators but are much more flexible and better suited for object-oriented programming. We motivate and describe the construct along with several simple examples. We compare it with iteration based on CLU iterators, cursors, riders, streams, series, generators, coroutines, blocks, closures, and lambda expressions. Finally, we describe how to implement them in terms of coroutines and then show how to transform this implementation into efficient code. ----- File: 1993/tr-93-046 A Performance Analysis of the CNS-1 on Large, Dense Backpropagation Networks Silvia M. Müller tr-93-046 September 1993 We determine in this study the sustained performance of the CNS-1 during training and evaluation of large multilayered feedforward neural networks. Using sophisticated coding, the 128-node machine would achieve up to 111 GCPS and 22 GCUPS. During recall the machine would achieve 87% of the peak multiply-accumulate performance. The training of large nets is less efficient than recall, but only by a factor of 1.5 to 2.

The benchmark is parallelized and the machine code is optimized before the performance is analyzed. Starting from an optimal parallel algorithm, CNS-specific optimizations still reduce the run time by a factor of 4 for recall and by a factor of 3 for training. Our analysis also yields some strategies for code optimization.

The CNS-1 is still being designed, and therefore we have to model the run-time behavior of the memory system and the interconnection network. This gives us the option of changing some parameters of the CNS-1 system in order to analyze their performance impact.

Keywords: CNS, performance analysis, run time model, backpropagation, parallelization. ----- File: 1993/tr-93-047 Source-to-Source Code Generation Based on Pattern Matching and Dynamic Programming Weimin Chen and Volker Turau tr-93-047 August 1993 This paper introduces a new technique for source-to-source code generation based on pattern matching and dynamic programming. This technique can be applied to all source and target languages which satisfy some requirements. The main differences from conventional approaches are the complexity of the target language, the handling of side effects caused by function calls, and the introduction of temporaries. Code optimization is achieved by introducing a new cost model. The technique allows an incremental development based on improvements of the target library. These require only a modification of the rewriting rules, since those are separated from the pattern matching algorithm. Experience from a successful application of our technique is reported. ----- File: 1993/tr-93-048 The Sublogarithmic Space World Maciej Liskiewicz and Ruediger Reischuk tr-93-048 August 1993 (Pages: 42) This paper tries to fully characterize the properties and relationships of space classes defined by Turing machines that use less than logarithmic space -- be they deterministic, nondeterministic, or alternating (DTM, NTM or ATM). We provide several examples of specific languages and show that such machines are unable to accept these languages. The basic proof method is a nontrivial extension of the $1^n \rightarrow 1^{n+n!}$ technique to alternating TMs. ----- File: 1993/tr-93-049 Precise Average Case Complexity Measures Ruediger Reischuk tr-93-049 August 1993 (Pages: 36) A new definition is given for the average growth of a function $f : \Sigma^* \rightarrow \mathbb{N}$ with respect to a probability measure $\mu$ on $\Sigma^*$. This allows us to define meaningful average case distributional complexity classes for arbitrary time bounds (previously, one could not guarantee arbitrarily good precision). It is shown that basically only the ranking of the inputs by decreasing probabilities is of importance.

To compare the average and worst case complexity of problems we study average case complexity classes defined by a time bound and a bound on the complexity of possible distributions. Here, the complexity is measured by the time to compute the rank functions of the distributions. We obtain tight and optimal separation results between these average case classes. The worst case classes can also be embedded into this hierarchy. They are shown to be identical to average case classes with respect to distributions of exponential complexity. ----- File: 1993/tr-93-050 Interior point methods in semidefinite programming with applications to combinatorial optimization Farid Alizadeh tr-93-050 September 1993 We study the semidefinite programming problem (SDP), i.e., the problem of optimizing a linear function of a symmetric matrix subject to linear equality constraints and the additional condition that the matrix be positive semidefinite. First we review the classical cone duality as specialized to SDP. Next we present an interior point algorithm which converges to the optimal solution in polynomial time. The approach is a direct extension of Ye's projective method for linear programming. We also argue that most known interior point methods for linear programs can be transformed in a mechanical way into algorithms for SDP, with proofs of convergence and polynomial time complexity also carrying over in a similar fashion. Finally we study the significance of these results in a variety of combinatorial optimization problems, including general 0-1 integer programs, the maximum clique and maximum stable set problems in perfect graphs, the maximum $k$-partite subgraph problem in graphs, and various graph partitioning and cut problems. As a result, we present barrier oracles for certain combinatorial optimization problems (in particular, the clique and stable set problems for perfect graphs) whose linear programming formulation requires exponentially many inequalities. The existence of such barrier oracles refutes the commonly believed notion that in order to solve a combinatorial optimization problem with interior point methods, one needs its linear programming formulation explicitly. ----- File: 1993/tr-93-051 Dynamic maintenance of approximate solutions of Min-Weighted Node Cover and Min-Weighted Set Cover problems Giorgio Gambosi, Marco Protasi, Maurizio Talamo tr-93-051 September 1993 In this paper, we introduce new algorithms for the dynamic maintenance of approximate solutions of Min-Weighted Node Cover and Min-Weighted Set Cover. For Min-Weighted Node Cover, for any sequence of edge insertions and deletions, the algorithms maintain a solution whose approximation ratio (that is, the ratio between the approximate and the optimum value) is equal to the best asymptotic one for the static case. The algorithms require O(1) time for edge insertion and O(1) amortized time for edge deletion.
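Not the report's algorithm (which also supports deletions in O(1) amortized time), but a minimal Python sketch of the classical local-ratio idea for maintaining a 2-approximate Min-Weighted Node Cover under edge insertions alone:

class DynamicCover:
    def __init__(self, weights):
        self.residual = dict(weights)   # remaining weight of every node
        self.cover = set()

    def insert_edge(self, u, v):
        if u in self.cover or v in self.cover:
            return                      # the new edge is already covered
        d = min(self.residual[u], self.residual[v])
        self.residual[u] -= d           # charge both endpoints equally;
        self.residual[v] -= d           # this bounds the approximation ratio by 2
        for x in (u, v):
            if self.residual[x] == 0:
                self.cover.add(x)       # zero-residual nodes join the cover

cover = DynamicCover({'a': 3, 'b': 1, 'c': 2})
cover.insert_edge('a', 'b')   # b's residual reaches 0, so b enters the cover
cover.insert_edge('a', 'c')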

For Min-Weighted Set Cover, we present dynamic algorithms whose approximation ratio matches one of the two different and incomparable best approximate bounds for the static case. The time complexity for element insertion and the amortized complexity for element deletion are proportional to the maximum redundancy of an element in the approximate solution. ----- File: 1993/tr-93-052 On a Criterion for Minimum Uncertainty Sensing Vincenzo Caglioti tr-93-052 September 1993 (Pages: 34) This is an invited article for the Structural Complexity Column, edited by Juris Hartmanis, which will appear in the Bulletin EATCS in October 1993. The scope of the article is indicated in the following list of Sections: 1. Overview of Information-Based Complexity 2. Breaking Intractability 3. Verification 4. Combinatorial Complexity 5. Similarities and Differences with Discrete Complexity 6. Brief History 7. Appendix 8. References ----- File: 1993/tr-93-053 On a Criterion for Minimum Uncertainty Sensing Vincenzo Caglioti tr-93-053 September 1993 A criterion is presented for the automatic selection of a sensor detection aimed at observing the state of a system, which is described both by discrete variables and by continuous ones. The criterion is based on the expected value of the entropy variation relative to the transition associated with the sensor observation. This criterion is then applied to object recognition and localization tasks, in which the observed system is described by the object class (i.e., a discrete variable) and by the object position (i.e., a vector of continuous parameters). The proposed criterion allows one to account for the information obtained in case the observed object is missed by the measurement. Finally, a simple example is discussed concerning an observed system consisting of a single object. The state of the observed system is described in terms of the object identity and the object position and orientation parameters. The sensors used to observe the system are an orientable range finder and a mobile camera.
Extensive simulation experiments were conducted on the various combinations; their results and analysis are presented in the paper. ----- File: 1993/tr-93-055 Testable Algorithms for Self-Avoiding Walks Dana Randall, Alistair Sinclair tr-93-055 September 1993 We present a polynomial time Monte Carlo algorithm for almost uniformly generating and approximately counting self-avoiding walks in rectangular lattices. These are classical problems that arise, for example, in the study of long polymer chains. While there are a number of Monte Carlo algorithms used to solve these problems in practice, these are heuristic and their correctness relies on unproven conjectures. In contrast, our algorithm relies on a single, widely-believed conjecture that is simpler than preceding assumptions, and, more importantly, is one which the algorithm itself can test. Thus our algorithm is reliable, in the sense that it either outputs answers that are guaranteed, with high probability, to be correct, or finds a counterexample to the conjecture. ----- File: 1993/tr-93-056 Dynamic Join and Leave for Real-Time Multicast Wolfgang Effelsberg, Eberhard Müller-Menrad tr-93-056 October 1993 Many new applications in networks require support for multicast communication. In addition, continuous data streams such as audio and video require real-time performance guarantees as a quality of service. We introduce a model for real-time multicast channels and present a set of scalable algorithms for the dynamic joining and leaving of destination nodes in this environment. In particular, we present an algorithm for finding a good attachment point to the multicast tree. We also describe detailed admission tests that preserve the guarantees given to existing channels. Our algorithm for a leaving node specifies in particular the resources to be released in the network. We also discuss tree reorganization issues.

Keywords: multicast, dynamic, membership, multicast tree ----- File: 1993/tr-93-057 Second Order Backpropagation - Efficient Computation of the Hessian Matrix for Neural Networks Raul Rojas tr-93-057 September 1993 Traditional learning methods for neural networks use some kind of gradient descent in order to determine the network's weights for a given task. Some second order learning algorithms deal with a quadratic approximation of the error function determined from the calculation of the Hessian matrix, and achieve improved convergence rates in many cases. We introduce in this paper second order backpropagation, a method to efficiently calculate the Hessian of a linear network of one-dimensional functions. This technique can be used to get explicit symbolic expressions or numerical approximations of the Hessian and could be used in parallel computers to improve second order learning algorithms for neural networks. It may also be of interest for computer algebra systems.
[A newer version of this method is found in the book: Raul Rojas, Neural Networks, 1996, Springer-Verlag, available in English and German.] ----- File: 1993/tr-93-058 Towards a cognitively based approach of a description of spatial deixis Matthias Kaiser tr-93-058 November 1993 In this presentation an approach towards a description of spatial deixis based on the perceptual and cognitive abilities of humans is outlined. After a basic introduction to space perception and representation, the findings of this part are taken to form the basis for a characterization of the phenomenon of deixis as well as the conceptual components of deictic expressions in a natural language. For the analysis of deictic expressions a cross-linguistic view is applied to find, on the one hand, universal components of those expressions and, on the other, a number of potentially influencing factors. The goal is to find features that may be components of deictic expressions and thus must be considered in a general model of spatial deixis which can serve to classify and describe the meaning of spatial deictic expressions in any natural language. ----- File: 1993/tr-93-059 Constructive Deterministic PRAM Simulation on a Mesh-Connected Computer Andrea Pietracaprina, Geppino Pucci and Jop F. Sibeyn tr-93-059 October 1993 The PRAM model of computation consists of a collection of sequential RAM machines accessing a shared memory in lock-step fashion. The PRAM is a very high-level abstraction of a parallel computer, and its direct realization in hardware is beyond the reach of current (or even foreseeable) technology. In this paper we present a deterministic simulation scheme to emulate PRAM computation on a mesh-connected computer, a feasible machine where each processor has its own memory module and is connected to at most four other processors via point-to-point links. In order to achieve a good worst-case performance, any deterministic simulation scheme has to replicate each variable in a number of copies. Such copies are stored in the local memory modules according to a Memory Organization Scheme (MOS), which is known to all the processors. A variable is then accessed by routing packets to its copies. All deterministic schemes in the literature make use of a MOS whose existence is proved via the probabilistic method, but that cannot be efficiently constructed. We introduce a new constructive MOS, and show how to employ it to simulate an $n$-processor PRAM on an $n$-node mesh-connected computer. Our simulation achieves almost optimal slowdown for small memories. This is the first constructive deterministic PRAM simulation on a bounded-degree network. ----- File: 1993/tr-93-060 Improved Band Matrix Computations Victor Pan tr-93-060 September 1993 We solve a band linear system of equations and compute the determinant of a band matrix in NC over the complex field and its subfields and in RNC over any field. Our algorithms support the optimum bound on the potential work (the product of time and processor bounds); moreover, the algorithms are in NC^1 or RNC^1 if the bandwidth is a constant. These results substantially improve the previous records of [E].
All our algorithms are in NC or RNC and processor efficient; almost all of them reach the optimum bound on the potential work (the product of time and processor bounds). Moreover, these algorithms are in NC^1 or RNC^1 if the bandwidth is a constant. ----- File: 1993/tr-93-062 A Formalization of Viewpoints Giuseppe Attardi, Maria Simi tr-93-062 October 1993 We present a formalisation of the notion of "viewpoint", a construct meant for expressing several varieties of relativised truth. The formalisation consists of a logic which extends first order predicate calculus through an axiomatization of provability and with the addition of proper reflection rules. The extension is not conservative, but consistency is guaranteed. Viewpoints are defined as sets of reified meta-level sentences. A proof theory for viewpoints is developed which enables one to carry out proofs of sentences involving several viewpoints. A semantic account of viewpoints is provided, dealing with issues of self-referential theories and paradoxes, and exploiting the notion of "contextual entailment". Notions such as beliefs, knowledge, truth, and situations can be uniformly modeled as provability in specialised viewpoints, obtained by imposing suitable constraints on viewpoints.

Keywords: meta-level, logics for truth, belief and knowledge, situations, contexts. ----- File: 1993/tr-93-063 A Parallel Object-Oriented System for Realizing Reusable and Efficient Data Abstractions Chu-Cheow Lim tr-93-063 October 1993 (319 pages) We examine the use of an object-oriented language to make programming multiprocessors easier for the general programmer. We choose an object-oriented paradigm because we believe that its support for encapsulation and software reuse allows users who are writing general application programs to reuse class libraries designed by expert library writers.

We describe the design, implementation, and use of a parallel object-oriented language: parallel Sather (pSather). PSather has a shared address space independent of the underlying multiprocessor architecture, because we believe that the cooperative nature of parallel programs is most easily captured by a shared-memory-like model. To account for distributed-memory machines, pSather uses an abstract model in which processors are grouped into clusters. Associated with a cluster is a part of the address space with fast access; access to other parts of the address space is $\leq 2$ orders of magnitude slower. PSather integrates both control and data-parallel constructs to support a variety of algorithmic styles. We have an implementation of pSather on the CM-5. The prototype shows that even on distributed-memory machines without hardware/operating system support for a shared address space, it is still practical and reasonably efficient for the shared address abstraction to be implemented in the compiler/runtime. The experience also helps us understand the features of low-level libraries that are necessary for an efficient realization of a high-level language. For example, even though low message latency is crucial, the message-passing paradigm (active vs. passive, polling vs. interrupt-driven) is also important in deciding how easy and efficient the language implementation will be. We also study certain straightforward compiler optimizations. Several abstractions and applications have been written for the CM-5 using the shared-address cluster model, and we have achieved reasonable speedups. In some cases, we can further demonstrate good absolute performance for pSather programs (by measuring their speedups relative to a one-processor C program). Some of the abstractions are reused in several applications, to show how the object-oriented constructs facilitate code reuse. The work described here supports our optimism that pSather is a practical and efficient parallel object-oriented language. There are, however, still many issues that need to be explored in order to provide parallel programming environments as powerful as the ones we are accustomed to in sequential environments. In the conclusion, we summarize some of the possible future research directions. ----- File: 1993/tr-93-064 Engineering a Programming Language: The Type and Class System of Sather Clemens Szyperski, Stephen Omohundro, Stephan Murer tr-93-064 November 1993 Sather 1.0 is a programming language whose design has resulted from the interplay of many criteria. It attempts to support a powerful object-oriented paradigm without sacrificing either the computational performance of traditional procedural languages or support for safety and correctness checking. Much of the engineering effort went into the design of the class and type system. This paper describes some of these design decisions and relates them to approaches taken in other languages. We particularly focus on issues surrounding inheritance and subtyping and the decision to explicitly separate them in Sather.
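Sather's separation of subtyping from implementation inheritance can be illustrated outside Sather as well. A Python sketch (an analogy only, since Python itself conflates the two notions), in which subtyping is declared against an abstract interface while implementation is reused by inclusion, i.e., composition:

from abc import ABC, abstractmethod

class Stack(ABC):
    # the abstract *type*: subtyping is declared against this interface
    @abstractmethod
    def push(self, x): ...
    @abstractmethod
    def pop(self): ...

class ArrayStack(Stack):
    # one concrete *implementation* of the Stack type
    def __init__(self):
        self._items = []
    def push(self, x):
        self._items.append(x)
    def pop(self):
        return self._items.pop()

class CountingStack(Stack):
    # also a subtype of Stack, but it reuses ArrayStack by inclusion
    # (composition) instead of inheriting from the implementation
    def __init__(self):
        self._impl = ArrayStack()
        self.pushes = 0
    def push(self, x):
        self.pushes += 1
        self._impl.push(x)
    def pop(self):
        return self._impl.pop()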
----- File: 1993/tr-93-065 An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities Andreas Stolcke tr-93-065 November 1993 We describe an extension of Earley's parser for stochastic context-free grammars that computes the following quantities given a stochastic context-free grammar and an input string: a) probabilities of successive prefixes being generated by the grammar; b) probabilities of substrings being generated by the nonterminals, including the entire string being generated by the grammar; c) most likely (Viterbi) parse of the string; d) posterior expected number of applications of each grammar production, as required for reestimating rule probabilities. (a) and (b) are computed incrementally in a single left-to-right pass over the input. Our algorithm compares favorably to standard bottom-up parsing methods for SCFGs in that it works efficiently on sparse grammars by making use of Earley's top-down control structure. It can process any context-free rule format without conversion to some normal form, and combines computations for (a) through (d) in a single algorithm. Finally, the algorithm has simple extensions for processing partially bracketed inputs, and for finding partial parses and their likelihoods on ungrammatical inputs. ----- File: 1993/tr-93-066 Recovering Guaranteed Performance Service Connections from Single and Multiple Faults Anindo Banerjea, Colin Parris and Domenico Ferrari tr-93-066 November 1993 Fault recovery techniques must be reexamined in the light of the new guaranteed performance services that networks will support. We investigate the rerouting of guaranteed performance service connections on the occurrence of link faults, focussing on the aspects of route selection and establishment in the network. In a previous investigation, we explored some components of rerouting in the presence of single link faults in the network. In this paper we study the behavior of our techniques in the presence of multiple link faults in the network, and also examine the technique of retries to improve the success of rerouting. Our schemes are simulated on a cross-section of network workloads, and compared using the criteria of the fraction of the affected traffic that could be rerouted, the time to reroute and the amount of resources consumed in the network. A novel metric, the Queueing Delay Load Index, which captures both the bandwidth and delay demands made on the network by a connection, is used to present and analyze the results. ----- File: 1993/tr-93-067 A Software Reuse System for C Codes Le Van Huu tr-93-067 December 1993 This paper presents PRASSY, a hypertext system for the storage and retrieval of procedure source codes, on the basis of the semantics of their comments. The objective of the system is to provide the program developer with the possibility of retrieving and reusing the source code of C subroutines that have been previously built by his colleagues or that are already present in the system. The approach adopted by PRASSY is the analysis of the source code comments and of the specification documents written in natural language, in order to extract indexing information. Such information is organized in a hypertext structure and the browsing mechanism is used by the user to select reusable software components. The system provides a way for measuring the semantic similarity between the user requirements and the candidate node to be selected. The paper describes the system's architecture and functionalities. 
Some examples of the user interface and the browsing mechanisms are reported. Finally, it describes the algorithm proposed by Aragon-Ramirez and Paice and adopted by PRASSY for defining the semantic similarity among phrases expressed in natural language.

Keywords: hypertext, software reuse, semantic phrase similarity ----- File: 1993/tr-93-068 Lexical Modeling in a Speaker Independent Speech Understanding System Charles Clayton Wooters tr-93-068 November 1993 This thesis presents an algorithm for the construction of models that attempt to capture the variation that occurs in the pronunciations of words in spontaneous (i.e., non-read) speech. A technique for developing alternate pronunciations of words and then estimating the probabilities of the alternate pronunciations is presented. Additionally, we describe the development and implementation of a spoken-language understanding system called the Berkeley Restaurant Project (BeRP). Multiple-pronunciation word models constructed using the algorithm proposed in this thesis are evaluated within the context of the BeRP system. The results of this evaluation show that the explicit modeling of variation in the pronunciation of words improves the performance of both the speech recognition and the speech understanding components of the BeRP system. ----- File: 1993/tr-93-069 On the Definition of Speedup Wolfgang Ertel tr-93-069 November 1993 We propose an alternative definition for the speedup of parallel algorithms. Let A be a sequential algorithm and B a parallel algorithm for solving the same problem. If A and/or B are randomized or if we are interested in their performance on a probability distribution of problem instances, the running times are described by random variables T_A and T_B. The speedup is usually defined as E[T_A]/E[T_B], where E is the arithmetic mean. This notion of speedup delivers just a number, i.e., much information about the distribution is lost. For example, there is no variance of the speedup. To define a measure for possible fluctuations of the speedup, a new notion of speedup is required. The basic idea is to define speedup as M(T_A/T_B), where the functional form of M has to be determined. Also, we argue that in many cases M(T_A/T_B) is more informative than E[T_A]/E[T_B] for a typical user of A and B. We present a set of intuitive axioms that any speedup function M(T_A/T_B) must fulfill and prove that the geometric mean is the only solution. As a result, we now have a uniquely defined speedup function that allows the user of an improved system to talk about the average performance improvement as well as about its possible variations. ----- File: 1993/tr-93-070 An Alphabet-Independent Optimal Parallel Search for Three Dimensional Patterns Marek Karpinski, Wojciech Rytter tr-93-070 November 1993 We give an alphabet-independent optimal parallel algorithm for the searching phase of three-dimensional pattern matching. All occurrences of a three-dimensional pattern P of shape m x m x m in a text T of shape n x n x n are to be found. Our algorithm works in O(log m) time with O(N log m) processors of a CREW PRAM, where N = n^3. The searching phase in three dimensions explores the classification of two-dimensional periodicities of the cubic pattern. Some new projection techniques are developed to deal with three dimensions. The periodicities of the pattern with respect to its faces are investigated. The nonperiodicities imply some sparseness properties, while periodicities imply other special useful properties (i.e., monotonicity) of the set of occurrences. Both types of properties are useful in deriving an efficient algorithm.

The search phase is preceded by a preprocessing phase (computation of the witness table). Our main results concern the searching phase; however, we also briefly present a new approach to the second phase. The usefulness of the dictionaries of basic factors (DBF's), see [CR 91], in the computation of the three-dimensional witness table is presented. The DBF approach gains simplicity at the expense of a small increase in time. It gives a (nonoptimal) O(log m) time algorithm using m processors of a CRCW PRAM. The alphabet-independent optimal preprocessing is very complex even in the case of two dimensions, see [GP 92]. For large alphabets the DBF's give asymptotically the same complexity as the (alphabet-dependent) suffix trees approach (but avoid suffix trees and are simpler).

However, the basic advantage of the DBF approach is the simplicity with which it handles three (or more) dimensions.

The algorithm can be easily adjusted to the case of unequally sided patterns. ----- File: 1993/tr-93-071 Lower Bounds on Testing Membership to a Polyhedron by Algebraic Decision Trees Dima Grigoriev, Marek Karpinski, Nicolai Vorobjov tr-93-071 November 1993 We describe a new method of proving lower bounds on the depth of algebraic decision trees and apply it to prove a lower bound Omega(log N) for testing membership to a convex polyhedron having N facets of all dimensions. This bound apparently does not follow from the methods developed by M. Ben-Or, A. Bjoerner, L. Lovasz, A. Yao ([B 83], [BLY 92]) because the topological invariants used in these methods become trivial for the convex polyhedra. ----- File: 1993/tr-93-072 Software Protection and Simulation on Oblivious RAMs Oded Goldreich, Rafail Ostrovsky tr-93-072 November 1993 Software protection is one of the most important issues concerning computer practice. There exist many heuristics and ad-hoc methods for protection, but the problem as a whole has not received the theoretical treatment it deserves. In this paper we provide theoretical treatment of software protection. We reduce the problem of software protection to the problem of efficient simulation on {\em oblivious\/} RAM.
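For intuition about the notion defined in the next paragraph: the trivial way to make a RAM oblivious is to touch every cell on every access, which hides the accessed address at a linear (rather than the paper's poly-logarithmic) slowdown. A minimal Python sketch of this baseline only:

def oblivious_access(memory, op, addr, value=None):
    # the physical access pattern -- a full sweep of memory in fixed
    # order -- is independent of addr: the trivial oblivious simulation
    result = None
    for i in range(len(memory)):
        cell = memory[i]                 # every cell is read on every access
        if i == addr:
            if op == 'write':
                cell = value
            else:
                result = cell
        memory[i] = cell                 # and written back, needed or not
    return result

mem = [0] * 16
oblivious_access(mem, 'write', 5, 42)
assert oblivious_access(mem, 'read', 5) == 42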

A machine is {\em oblivious\/} if the sequence in which it accesses memory locations is equivalent for any two inputs with the same running time. For example, an oblivious Turing Machine is one for which the movement of the heads on the tapes is identical for each computation. (Thus, it is independent of the actual input.) {\em What is the slowdown in the running time of any machine, if it is required to be oblivious?\/} In 1979 Pippenger and Fischer showed how a two-tape {\em oblivious\/} Turing Machine can simulate, on-line, a one-tape Turing Machine, with a logarithmic slowdown in the running time. We show an analogue result for the random-access machine (RAM) model of computation. In particular, we show how to do an on-line simulation of an arbitrary RAM input by a probabilistic {\em oblivious\/} RAM with a poly-logarithmic slowdown in the running time. On the other hand, we show that a logarithmic slowdown is a lower bound. ----- File: 1993/tr-93-073 One-Way Functions are Essential for Non-Trivial Zero-Knowledge Rafail Ostrovsky, Avi Wigderson tr-93-073 November 1993 It was known that if one-way functions exist, then there are zero-knowledge proofs for every language in $\PSPACE$. We prove that unless very {\em weak} one-way functions exist, Zero-Knowledge proofs can be given only for languages in $\BPP$. For average-case definitions of $\BPP$ we prove an analogous result under the assumption that {\em uniform} one-way functions do not exist.

Thus, very loosely speaking, zero-knowledge is either {\em useless} (exists only for ``easy'' languages), or {\em universal} (exists for every provable language). ----- File: 1993/tr-93-074 How and When to Be Unique Shay Kutten, Rafail Ostrovsky, Boaz Patt-Shamir tr-93-074 November 1993 One of the fundamental problems in distributed computing is how identical processors with identical local memory can choose unique IDs provided they can flip a coin. The variant considered in this paper is the asynchronous shared memory model (atomic registers), and the basic correctness requirement is that upon termination the processes must always have unique IDs.

We study this problem from several viewpoints. On the positive side, we present the first protocol that solves the problem and terminates with probability 1. The protocol terminates in (optimal) $O(\log n)$ expected time, using $O(n)$ shared memory space, where $n$ is the number of participating processes. On the negative side, we show that no protocol can terminate with probability 1 if $n$ is unknown, and that no finite-state protocol can terminate with probability 1 if the schedule is non-oblivious (i.e., may depend on the history of the shared variable).

We also discuss the dynamic setting (where processes may join and leave the system dynamically), and give a deterministic protocol for the read-modify-write model that needs only 3 shared bits. ----- File: 1993/tr-93-075 Matching nuts and bolts Noga Alon, Manuel Blum, Amos Fiat, Sampath Kannan, Moni Naor, Rafail Ostrovsky tr-93-075 November 1993 We describe a procedure which may be helpful to any disorganized carpenter who has a mixed pile of bolts and nuts and wants to find the corresponding pairs of bolts and nuts. The procedure uses our (and the carpenter's) ability to efficiently construct highly expanding graphs. The problem considered is the following: we are given a collection of $n$ bolts of distinct widths and $n$ nuts such that there is a 1-1 correspondence between the nuts and bolts. The goal is to find for each bolt its corresponding nut by comparing nuts to bolts, but not nuts to nuts or bolts to bolts. Our objective is to minimize the number of operations of this kind (as well as the total running time). The problem has a randomized algorithm similar to Quicksort (a sketch of it appears after the next two entries). Our main result is an $n (\log n)^{O(1)}$-time {\em deterministic} algorithm, based on expander graphs, for matching the bolts and the nuts. ----- File: 1993/tr-93-076 Any Non-Private Boolean Function Is Complete For Private Multi-Party Computations Eyal Kushilevitz, Silvio Micali, Rafail Ostrovsky tr-93-076 November 1993 Let $g$ be an $n$-argument boolean function. Suppose we are given a {\em black-box\/} for $g$, to which $n$ honest-but-curious players can secretly give inputs and which broadcasts the result of applying $g$ to these inputs to all the players. We say that $g$ is {\em complete\/} (for multi-party private computations) if for {\em every\/} function $f$, the $n$ players can compute the function $f$ $n$-privately, given the black-box for $g$. In this paper, we characterize the boolean functions which are complete: we show that a boolean function $g$ is complete if and only if $g$ itself cannot be computed $n$-privately (when there is no black-box available). Namely, for boolean functions, the notions of {\bf completeness\/} and {\bf $n$-privacy} are {\em complementary\/}. On the other hand, for non-boolean functions, we show that these two notions are {\em not\/} complementary. Our result can be viewed as a generalization (for multi-party protocols and for $(n\geq 2)$-argument functions) of the two-party case, where it was known that two-argument functions which contain ``embedded-OR'' are complete. ----- File: 1993/tr-93-077 A Cognitive Model of Sentence Interpretation: the Construction Grammar approach Daniel Jurafsky tr-93-077 December 1993 This paper describes a new, psychologically-plausible model of human sentence interpretation, based on a new model of linguistic structure, Construction Grammar. This on-line, parallel, probabilistic interpreter accounts for a wide variety of psycholinguistic results on lexical access, idiom processing, parsing preferences, and studies of gap-filling and other valence ambiguities, including various frequency effects. We show that many of these results derive from the fundamental assumptions of Construction Grammar that lexical items, idioms, and syntactic structures are uniformly represented as grammatical constructions, and argue for the use of probabilistically-enriched grammars and interpreters as models of human knowledge of and processing of language.
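A minimal Python sketch of the Quicksort-style randomized procedure mentioned in the nuts-and-bolts abstract (tr-93-075) above; widths are modeled as numbers, but note that every comparison is between a nut and a bolt, never within one pile. The deterministic expander-based algorithm that is the report's main result is not sketched:

import random

def match_nuts_and_bolts(nuts, bolts):
    # returns a list of (nut, bolt) pairs of equal width
    if not nuts:
        return []
    pivot_bolt = random.choice(bolts)
    pivot_nut = next(n for n in nuts if n == pivot_bolt)   # nut-to-bolt tests
    small_nuts = [n for n in nuts if n < pivot_bolt]
    large_nuts = [n for n in nuts if n > pivot_bolt]
    small_bolts = [b for b in bolts if b < pivot_nut]      # bolt-to-nut tests
    large_bolts = [b for b in bolts if b > pivot_nut]
    return (match_nuts_and_bolts(small_nuts, small_bolts)
            + [(pivot_nut, pivot_bolt)]
            + match_nuts_and_bolts(large_nuts, large_bolts))

widths = random.sample(range(100), 10)
pairs = match_nuts_and_bolts(widths[:], sorted(widths))
assert all(nut == bolt for nut, bolt in pairs)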
----- File: 1993/tr-93-078 An Evaluation of Burst-level Bandwidth Reservation Methods in WAN Environments Makiko Yoshida, Chinatsu Ikeda and Hiroshi Suzuki tr-93-078 February 1994 This paper presents evaluation results for fast bandwidth reservation (FRP) methods applied to bursty traffic, e.g., large file transfers, in ATM networks with long propagation delay. Such traffic requires a large bandwidth for a short time, i.e., it has bursty characteristics, and demands strict cell-loss quality. For this traffic, an FRP, rather than call-level bandwidth reservation, is effective for utilizing network resources under guaranteed QoS. FRP methods with peak-rate controls are compared in terms of transmission completion time under short and long propagation delays. We compare two types of FRP and FRPs with three adaptive peak-rate controls.

Evaluation results show that confirmed-type FRP is preferable to unconfirmed-type FRP. In addition, we see from the results that FRP with peak-rate control using network availability information provides the shortest transmission completion time under light load conditions. However, the results also show that FRP with simple peak-rate control using ACK/NACK provides fair transmission completion times under heavy load and long propagation delay conditions. ----- File: 1993/tr-93-079 On a Sublinear Time Parallel Construction of Optimal Binary Search Trees Marek Karpinski and Wojciech Rytter tr-93-079 December 1993 We design an efficient sublinear time parallel construction of optimal binary search trees. The efficiency of the parallel algorithm corresponds to its total work (the product time x processors). Our algorithm works in $O(n^{1-\epsilon}\log n)$ time with total work $O(n^{2+2\epsilon})$, for an arbitrarily small constant $0 < \epsilon \leq 1/2$. This is optimal within a factor of $n^{2\epsilon}$ with respect to the best known sequential algorithm given by Knuth, which needs only $O(n^2)$ time due to a monotonicity property of optimal binary search trees (see [6]). It is unknown how to exploit this property in an efficient NC construction of binary search trees. Here we show that it can be effectively used in sublinear time parallel computation. Our improvement also relies on the use (in independently processed small subcomputations) of the parallelism present in Knuth's algorithm. The best known sublinear time algorithms for the construction of binary search trees (as an instance of a more general problem) have $O(n^3)$ work for time larger than $n^{3/4}$, see [3] and [7]. For time $\sqrt{n}$ these algorithms need $n^4$ work, while our algorithm needs only $n^3$ work for this time, thus improving the known algorithms by a linear factor. Also, if time is $O(n^{1-\epsilon})$ and $\epsilon$ is very small, our improvement is close to $O(n)$. Such an improvement is similar to the one implied by the monotonicity property in sequential computation (from $n^3$ sequential time for a more general dynamic programming problem to $n^2$ time for the special case of optimal binary search trees). ----- File: 1993/tr-93-080 Dynamic Programming in a Generalized Decision Model Ulrich Huckenbeck tr-93-080 December 1993 (Pages: 40) We present two dynamic programming strategies for a general class of decision processes. Each of these algorithms includes, among others, classical graph-theoretic optimization algorithms as special cases, such as the Ford-Bellman strategy and the Greedy Method discussed below.

In our general decision model, we define several structural properties of cost measures in order to formulate sufficient conditions for the correctness of our algorithms.
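As a concrete reference point for one of the special cases named above, a textbook Ford-Bellman relaxation sketch in Python (the report's generalized strategy itself is not reproduced here):

def ford_bellman(n, arcs, source):
    # single-source shortest paths by repeated relaxation;
    # arcs is a list of (u, v, cost) triples over nodes 0..n-1
    INF = float('inf')
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):              # n-1 relaxation rounds suffice
        for u, v, cost in arcs:
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
    return dist

dist = ford_bellman(4, [(0, 1, 2), (1, 2, -1), (0, 2, 5), (2, 3, 1)], 0)
assert dist == [0, 2, 1, 2]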

Our first algorithm works as fast as the original Ford-Bellman Strategy and the Greedy Method, respectively. Our second algorithm solves a larger class of optimization problems than our first search strategy. ----- File: 1993/tr-93-081 On Valve Adjustments that Interrupt all s-t-Paths in a Digraph Ulrich Huckenbeck tr-93-081 December 1993 (Pages: 15) When searching a path in a digraph, usually the following situation is given: Every node v may be entered by an arbitrary incoming arc (u,v), and v may be left by an arbitrary outgoing arc (v,w).

In this paper, however, we consider graphs with valve nodes, which cannot arbitrarily be entered and left. More precisely, a movable valve is installed in each valve node v. Entering v via (u,v) and leaving it via (v,w) is only possible if the current position of the valve generates a connection between these two arcs; if, however, the current valve adjustment interrupts this connection, then every path using the arcs (u,v) and (v,w) is interrupted, too.

We investigate the complexity of the following problem:

Given a digraph with valve nodes. Let s and t be two nodes of this graph.

Does there exist a valve adjustment that interrupts all paths from s to t?

We show that this problem can be solved in deterministic polynomial time if all valve nodes belong to a particular class of valves; otherwise the problem is NP-complete. ----- File: 1993/tr-93-082 All-to-all Broadcast on the CNS-1 Silvia M. Müller tr-93-082 December 1993 This study deals with the all-to-all broadcast on the CNS-1. We determine a lower bound for the run time and present an algorithm meeting this bound. Since this study points out a bottleneck in the network interface, we also analyze the performance of alternative interface designs. Our analyses are based on a run time model of the network.
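The all-to-all broadcast algorithm and lower bound of tr-93-082 are specific to the CNS-1 network and are not reproduced here; as a generic reference point only, a Python sketch of the classic ring schedule, in which every node forwards in each round the item it received in the previous round:

def ring_all_to_all(items):
    # after n-1 rounds every node has collected all n items
    n = len(items)
    have = [[items[i]] for i in range(n)]
    in_flight = list(items)
    for _ in range(n - 1):
        in_flight = [in_flight[(i - 1) % n] for i in range(n)]  # shift along the ring
        for i in range(n):
            have[i].append(in_flight[i])
    return have

assert all(sorted(h) == list(range(4)) for h in ring_all_to_all(list(range(4))))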

Keywords: CNS, all-to-all broadcast, transfer, performance analysis, parallelization. ----- File: 1994/tr-94-001 Surface Learning with Applications to Lip-Reading Christoph Bregler and Stephen Omohundro tr-94-001 January 1994 Most connectionist research has focused on learning mappings from one space to another (e.g., classification and regression). This paper introduces the more general task of learning constraint surfaces. It describes a simple but powerful architecture for learning and manipulating nonlinear surfaces from data. We demonstrate the technique on low-dimensional synthetic surfaces and compare it to nearest neighbor approaches. We then show its utility in learning the space of lip images in a system for improving speech recognition by lip reading. This learned surface is used to improve the visual tracking performance during recognition. ----- File: 1994/tr-94-002 "Eigenlips" for Robust Speech Recognition Christoph Bregler and Yochai Konig tr-94-002 January 1994 In this study we improve the performance of a hybrid connectionist speech recognition system by incorporating visual information about the corresponding lip movements. Specifically, we investigate the benefits of adding visual features in the presence of additive noise and crosstalk (cocktail party effect). Our study extends previous experiments by using a new visual front end, and an alternative architecture for combining the visual and acoustic information. Furthermore, we have extended our recognizer to a multi-speaker, connected-letter recognizer. Our results show a significant improvement for the combined architecture (acoustic and visual information) over just the acoustic system in the presence of additive noise and crosstalk. ----- File: 1994/tr-94-003 Best-first Model Merging for Hidden Markov Model Induction Andreas Stolcke and Stephen M. Omohundro tr-94-003 January 1994 This report describes a new technique for inducing the structure of Hidden Markov Models from data, based on the general `model merging' strategy (Omohundro 1992). The process begins with a maximum likelihood HMM that directly encodes the training data. Successively more general models are produced by merging HMM states. A Bayesian posterior probability criterion is used to determine which states to merge and when to stop generalizing. The procedure may be considered a heuristic search for the HMM structure with the highest posterior probability.

We discuss a variety of possible priors for HMMs, as well as a number of approximations which improve the computational efficiency of the algorithm. We studied three applications to evaluate the procedure. The first compares the merging algorithm with the standard Baum-Welch approach in inducing simple finite-state languages from small, positive-only training samples. We found that the merging procedure is more robust and accurate, particularly with a small amount of training data. The second application uses labelled speech data from the TIMIT database to build compact, multiple-pronunciation word models that can be used in speech recognition. Finally, we describe how the algorithm was incorporated in an operational speech understanding system, where it is combined with neural network acoustic likelihood estimators to improve performance over single-pronunciation word models. ----- File: 1994/tr-94-004 Near or Far Hermann Härtig tr-94-004 January 1994 To efficiently program massively parallel computers it is important to be aware of nearness or farness of references. It can be a severe performance bug if a reference that is meant by the programmer to be near turns out to be far. This paper presents a simple way to express nearness and farness in such a way that compile-time detection of such performance bugs becomes possible. It also allows for compile-time determination of nearness in many cases, which can be exploited by compile-time optimization techniques that overlap communication with processing. The method relies on the type system of a strongly typed object-oriented language whose type rules are extended by three type coercion rules.

Keywords: massively parallel systems, logical shared address space, distributed memory architectures, programming languages ----- File: 1994/tr-94-005 On the Relation Between BDDs and FDDs Bernd Becker, Rolf Drechsler, Ralph Werchner tr-94-005 January 1994 Data structures for Boolean functions form an essential component of design automation tools, especially in the area of logic synthesis. The state-of-the-art data structure is the ordered binary decision diagram (OBDD), which results from general binary decision diagrams (BDDs), also called branching programs, by ordering restrictions. In the context of EXOR-based logic synthesis another type of decision diagram (DD), called (ordered) functional decision diagram ((O)FDD), becomes increasingly important. BDDs (FDDs) are directed acyclic graphs, where a Shannon decomposition (Reed-Muller decomposition) is carried out in each node.
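
For reference, the two decompositions just mentioned are standard identities (stated here for orientation, not taken from the report): writing f_0 = f|_{x=0} and f_1 = f|_{x=1} for the cofactors, a BDD node realizes the Shannon decomposition f = \bar{x} f_0 \oplus x f_1, while an FDD node realizes the (positive) Reed-Muller decomposition f = f_0 \oplus x (f_0 \oplus f_1).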

We study the relation between BDDs and FDDs. Both BDDs and FDDs result from DDs by defining the represented function in differing ways. If the underlying DD is complete, the relation between both types of interpretation can be described by a well-known Boolean transformation tau. This allows us to relate the OFDD-size of f and the OBDD-size of tau(f). We use this property to derive several results on the computational power of OFDDs and OBDDs. Symmetric functions are shown to have efficient representations as OBDDs and OFDDs as well. Classes of functions are given that have exponentially more concise OFDDs than OBDDs, and vice versa. In contrast to OBDDs, an exponential blow-up may occur in an AND-synthesis operation on two OFDDs. Finally, we demonstrate how the lower bound techniques for OBDDs can be adapted to OFDDs: We prove that both the hidden weighted bit function and multiplication require OFDDs of exponential size, independent of the ordering of the variables. Topics: Algorithms and data structures, complexity and computability, VLSI systems ----- File: 1994/tr-94-006 On Variable Ordering of Ordered Functional Decision Diagrams Bernd Becker, Rolf Drechsler, Michael Theobald tr-94-006 January 1994 In this paper methods for finding good variable orderings for ordered functional decision diagrams (OFDDs) are investigated. We present an algorithm for exact minimization of OFDDs that is applicable for functions up to $n = 14$ variables. We present an upper bound for the size of OFDDs representing tree-like circuits. Various methods for dynamic variable ordering based on the exchange of variables are presented. Experimental results are given to show the efficiency of our approaches. ----- File: 1994/tr-94-007 Precise n-gram Probabilities from Stochastic Context-free Grammars Andreas Stolcke and Jonathan Segal tr-94-007 January 1994 We present an algorithm for computing n-gram probabilities from stochastic context-free grammars, a procedure that can alleviate some of the standard problems associated with n-grams (estimation from sparse data, lack of linguistic structure, among others). The method operates via the computation of substring expectations, which in turn is accomplished by solving systems of linear equations derived from the grammar. We discuss efficient implementation of the algorithm and report our practical experience with it. ----- File: 1994/tr-94-008 A Hybrid Fault Simulator for Synchronous Sequential Circuits Rolf Krieger, Bernd Becker, Martin Keim tr-94-008 January 1994 Fault simulation for synchronous sequential circuits is a very time consuming task. The complexity of the task increases if there is no information about the initial state of the circuit available. In this case, an unknown initial state is assumed, which is usually handled by introducing a three-valued logic. It is known that fault simulation based upon this logic only determines a lower bound for the fault coverage achievable by a test sequence. Therefore, we developed a hybrid fault simulator H-FS combining the advantages of a fault simulator using the three-valued logic and of an exact symbolic fault simulator based upon binary decision diagrams. H-FS is able to handle even the largest benchmark circuits and thereby determines fault coverages much more accurately. ----- File: 1994/tr-94-009 A Performance Analysis of the CNS-1 on Sparse Connectionist Networks Silvia M. Müller and Benedict Gomes tr-94-009 February 1994 This report deals with the efficient mapping of sparse neural networks on CNS-1.
We develop parallel vector code for an idealized sparse network and determine its performance under three memory systems. We use the code to evaluate the memory systems (one of which will be implemented in the prototype), and to pinpoint bottlenecks in the current CNS-1 design.

Keywords: CNS-1, performance analysis, sparse connectionist networks, memory systems, SRAM, SDRAM, RDRAM ----- File: 1994/tr-94-010 A Customisable Memory Management Framework Giuseppe Attardi and Tito Flagella tr-94-010 February 1994 Memory management is a critical issue for many large object-oriented applications, but in C++ only explicit memory reclamation through the 'delete' operator is generally available. We analyse different possibilities for memory management in C++ and present a dynamic memory management framework which can be customised to the needs of specific applications. The framework allows full integration and coexistence of different memory management techniques. The Customisable Memory Management (CMM) is based on a "primary collector" which exploits an evolution of Bartlett's mostly copying garbage collector. Specialised collectors can be built for separate memory heaps. A 'Heap' class encapsulates the allocation strategy for each heap. We show how to emulate different garbage collection styles or user-specific memory management techniques. The CMM is implemented in C++ without any special support in the language or the compiler. The techniques used in the CMM are general enough to be applicable also to other languages.

Keywords: memory management, garbage collection, programming languages, C++. ----- File: 1994/tr-94-011 Object-Oriented Parallel Programming: Design and Development of an Object-Oriented Library for SPMD Programming Jean-Marc Adamo tr-94-011 February 1994 In the process of writing portable applications, one particular way of viewing the parallel programming activity is as an application-centered one. This paper reports on the object-oriented design of a library supporting such an approach. The library has been developed within C++ and implemented on the CM5. The code has been carefully written so that the library could easily be ported to any MIMD machine supporting C++. The library allows parallel program development in the SPMD style. It has been designed so that the compiler can perform a complete type checking of user programs. This was a major requirement: We wanted the library to provide facilities close to those one normally expects from a programming language (i.e. with compiled programming primitives). We were actually interested in checking how far it would be possible to go toward achieving such a goal via the natural object-oriented extension mechanisms available in C++. The present report brings evidence that this is quite achievable. The library consists of a set of four layers providing threads, synchronous message passing, remote read/write facilities, and spread arrays and pointers. ----- File: 1994/tr-94-012 Modeling Dynamics in Connectionist Speech Recognition - The Time Index Model Yochai Konig and Nelson Morgan tr-94-012 March 1994 Here, we introduce an alternative to the Hidden Markov Model (HMM) as the underlying representation of speech production. HMMs suffer from well known limitations, such as the unrealistic assumption that the observations generated in a given state are independent and identically distributed (i.i.d.). We propose a time index model that explicitly conditions the emission probability of a state on the time index, i.e., on the number of ``visits'' in the current state of the Markov chain in a sequence. Thus, the proposed model does not require an i.i.d. assumption. The connectionist framework enables us to represent the dependence on the time index as a non-parametric distribution and to share parameters between different speech unit models. Furthermore, we discuss an extension to the basic time index model by incorporating information about the duration of the phone segments. Our initial results show that given the position of the boundaries between basic speech units, e.g., phones, we can improve our current connectionist system performance significantly by using this model. However, we still do not know whether these boundaries can be estimated reliably, nor do we know how much benefit we can obtain from this method given less accurate boundary information. Currently we are experimenting with two possible approaches: trying to learn smooth probability densities for the boundaries, and getting a set of reasonable segmentations from an N-Best search. In both cases we will need to consider the effect of incorrect boundaries, since they will undoubtedly occur. ----- File: 1994/tr-94-013 Processing Joins With User-Defined Functions Volker Gaede and Oliver Günther tr-94-013 March 1994 Most strategies for the computation of relational joins (such as sort-merge or hash-join) face major difficulties if the join predicate involves complex, user-defined functions rather than just simple arithmetic comparisons.
In this paper, we identify a class of user-defined functions that can be included in a join predicate, such that a join between two sets R and S can still be computed efficiently, i.e., in time significantly less than O(|R|x|S|). For that purpose, we introduce the notion of the $\phi$-function, an operator to process each set element separately with respect to the user-defined function(s) being used. Then any particular join query containing those functions can be computed by a variation of some traditional join strategy. After demonstrating this technique on a spatial database example, we present the results of a theoretical analysis and a practical performance evaluation.
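
As a toy illustration of how a per-element operator enables an efficient join (names and API ours, not the paper's): apply the user-defined functions to each element separately, then fall back on an ordinary hash join over the preprocessed keys.

    def phi_join(R, S, phi_r, phi_s):
        """Sketch: preprocess each element with a user-defined function,
        then hash-join on the results, avoiding the O(|R|*|S|) nested loop
        (assuming few elements share a preprocessed key)."""
        index = {}
        for s in S:
            index.setdefault(phi_s(s), []).append(s)
        return [(r, s) for r in R for s in index.get(phi_r(r), [])]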

Keywords: functional join, query processing, user-defined predicates, z-ordering, query optimization, extensible and object-oriented database systems ----- File: 1994/tr-94-014 Integration of Bottom-Up and Top-Down Cues for Visual Attention Using Non-Linear Relaxation Ruggero Milanese, Harry Wechsler, Sylvia Gil, Jean-Marc Bost and Thierry Pun tr-94-014 March 1994 Active and selective perception seeks regions of interest in an image in order to reduce the computational complexity associated with time-consuming processes such as object recognition. We describe in this paper a visual attention system that extracts regions of interest by integrating multiple image cues. Bottom-up cues are detected by decomposing the image into a number of feature and conspicuity maps, while a-priori knowledge (i.e. models) about objects is used to generate top-down attention cues. Bottom-up and top-down information is combined through a non-linear relaxation process using energy minimization-like procedures. The functionality of the attention system is expanded by the introduction of an alerting (motion-based) system able to explore and avoid obstacles. Experimental results are reported, using cluttered and noisy scenes. ----- File: 1994/tr-94-015 Designing and Integrating User Interfaces of Geographic Database Applications Agnes Voisard tr-94-015 March 1994 In this paper, we investigate the problem of designing graphical geographic database user interfaces (GDUIs) and of integrating them into a database management system (DBMS). Geographic applications may vary widely but they all have common aspects due to the spatial component of their data: Geographic data are not standard data and they require appropriate tools for (i) editing them (i.e., displaying and modifying them) and (ii) querying them. The conceptual problems encountered in designing GDUIs are partly due to the merger of two independent fields, geographic DBMSs on the one hand, and graphical user interfaces (GUIs) on the other hand. Although these areas have evolved considerably during the past ten years, little effort has been made to understand the problems of connecting them in order to efficiently manipulate geographic data on a display. This issue raises the general problem of coupling a DBMS with specialized modules (in particular, the problem of strong vs. weak integration), and more generally of the role of a DBMS in a specific application. After giving the functionalities that a GDUI should provide, we study the possible conceptual integrations between a GUI and a DBMS. Finally, a map editing model as well as a general and modular GDUI architecture are presented.

Keywords: Geographic database management systems, graphical user interfaces. ----- File: 1994/tr-94-016 A stable integer relation algorithm Carsten Rössner and C. P. Schnorr tr-94-016 April 1994 We study the following problem: given x \in {\RR}^n either find a short integer relation m \in {\ZZ}^n, so that \langle x,m \rangle = 0 holds for the inner product \langle \cdot,\cdot \rangle, or prove that no short integer relation exists for x. Hastad, Just, Lagarias and Schnorr (1989) give a polynomial time algorithm for the problem.
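
To illustrate the notion (example ours, not from the report): m = (1,1,-1) is an integer relation for x = (1,2,3), since \langle x,m \rangle = 1 + 2 - 3 = 0.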

We present a stable variation of the HJLS--algorithm that preserves lower bounds on lambda(x) for infinitesimal changes of x. Given x \in {\RR}^n and \alpha \in \NN this algorithm finds a nearby point x' and a short integer relation m for x'. The nearby point x' is 'good' in the sense that no very short relation exists for points \bar{x} within half the x'--distance from x. On the other hand if x'=x then m is, up to a factor 2^{n/2}, a shortest integer relation for \mbox{x.}

Our algorithm uses, for arbitrary real input x, at most \mbox{O(n^4(n+\log \alpha))} many arithmetical operations on real numbers. If x is rational the algorithm operates on integers having at most \mbox{O(n^5+n^3 (\log \alpha)^2 + \log (\|q x\|^2))} many bits where q is the common denominator for x. ----- File: 1994/tr-94-017 Black Box Cryptanalysis of Hash Networks based on Multipermutations C. P. Schnorr and S. Vaudenay tr-94-017 April 1994 Black box cryptanalysis applies to hash algorithms consisting of many small boxes, connected by a known graph structure, so that the boxes can be evaluated forward and backwards by given oracles. We study attacks that work for any choice of the black boxes, i.e. we scrutinize the given graph structure. For example we analyze the graph of the fast Fourier transform (FFT). We present optimal black box inversions of FFT-compression functions and black box constructions of collisions. This determines the minimal depth of FFT-compression networks for collision-resistant hashing. We propose the concept of multipermutation, which is a pair of orthogonal latin squares, as a new cryptographic primitive that generalizes the boxes of the FFT. Our examples of multipermutations are based on the operations circular rotation, bitwise xor, addition and multiplication. ----- File: 1994/tr-94-018 Dextrous Object Manipulation with Robot Hands Including Rolling and Slipping: Improved Motion & Force Computation Method Günter Wöhlke tr-94-018 April 1994 This paper deals with the two fundamental problems that occur when objects are manipulated with multi-finger robot hands: the determination of the joint motions to perform a manipulation according to a given object trajectory, and the optimization of the joint torques needed to ensure a stable and secure grip configuration. The consideration of the effect of rolling and slipping of the fingertips on the object surface leads to a set of linear differential equations for the joint angles and to a partly non-linear optimization problem for the joint torques solved by the Hooke-Jeeves algorithm. The removal of redundant information reduces the computational effort to about 40% of the operations required for the standard procedure. In particular, the resulting object motions are demonstrated with an example: the rotation of an ellipsoid-like object with the fingers of the Karlsruhe Dextrous Hand. ----- File: 1994/tr-94-019 A Preliminary Study of the Semantics of Reduplication Terry Regier tr-94-019 April 1994 There is a universal component to the semantics of reduplication, which can be expressed as a radial category of concepts. I present this radial category, along with supporting evidence from a range of languages, and motivations for the links between the various senses. The structure of the radial graph gives rise to a number of predicted implicational universals. I also show that the radial category for reduplication shares an entire subsystem of concepts with the radial category for the Russian verbal prefix raz-. This sharing of subsystems of concepts across separate radial categories suggests that there is a single universal core conceptual network, with individual constructions covering different, possibly overlapping, regions. ----- File: 1994/tr-94-020 Experiments with the Tenet Real-Time Protocol Suite on the Sequoia 2000 Wide Area Network Anindo Banerjea, Edward W. Knightly, Fred L.
Templin, and Hui Zhang tr-94-020 April 1994 Emerging distributed multimedia applications have stringent performance requirements in terms of bandwidth, delay, delay-jitter, and loss rate. The Tenet real-time protocol suite provides the services and mechanisms for delivering such performance guarantees, even during periods of high network load and congestion. The protocols achieve this by using resource management, connection admission control, and appropriate packet service disciplines inside the network. The Sequoia 2000 network employs the Tenet Protocol Suite at each of its hosts and routers, making it one of the first wide area packet-switched networks to provide end-to-end per-connection performance guarantees. This paper presents experiments with the Tenet protocols on the Sequoia 2000 network, including measurements of the performance of the protocols, the service received by real multimedia applications using the protocols, and comparisons with the service received by applications that use the Internet protocols (UDP/IP). We conclude that the Tenet protocols successfully protect the real-time channels from other traffic in the network, including other real-time channels, and continue to meet the performance guarantees, even when the network is highly loaded. ----- File: 1994/tr-94-021 Parsing Neural Networks Combining Symbolic and Connectionist Approaches Christel Kemke tr-94-021 May 1994 In this paper we suggest combining symbolic and subsymbolic approaches in order to build fast parsers based on context-free grammars. Symbol-based parsers well known in Artificial Intelligence (AI) and Computational Linguistics (CL) provide highly developed tools and techniques, but they suffer from certain limitations, for example in processing ambiguous sentences or ungrammatical structures.

Connectionist parsers, on the other hand, have problems with representing recursive structures, processing sequences, and handling variables. But they have the advantage of being fault-tolerant and of representing syntactic and semantic knowledge in a distributed manner.

We analyzed the existing work on connectionist parsers and developed three different systems (PAPADEUS, INKAS, and INKOPA) in order to tackle the problems of symbolic and connectionist approaches described above. The main common characteristic of all three systems is the dynamic generation of the parse tree and thus of the parsing network. This technique draws on known parsing techniques from AI and CL, especially chart parsing; the use of context-free grammars also has its source in these fields.

The purpose of this paper is to give a deeper insight into existing methods (rather than introducing new ones). Although the algorithms we chose for our investigation might not be the most valuable ones from the viewpoint of applications, they illustrate important and interesting principles of constraint satisfaction.

Keywords: constraint satisfaction, exhaustive search, synthesizing, tagging ----- File: 1994/tr-94-023 Computational Complexity and Knowledge Complexity Oded Goldreich, Rafail Ostrovsky and Erez Petrank tr-94-023 June 1994 We study the computational complexity of languages which have interactive proofs of logarithmic knowledge complexity. We show that all such languages can be recognized in ${\cal BPP}^{\cal NP}$. Prior to this work, for languages with greater-than-zero knowledge complexity (and specifically, even for knowledge complexity 1) only trivial computational complexity bounds (i.e., only recognizability in ${\cal PSPACE}={\cal IP}$) were known. In the course of our proof, we relate statistical knowledge-complexity with perfect knowledge-complexity; specifically, we show that, for the honest verifier, these hierarchies coincide, up to a logarithmic additive term (i.e., ${\cal SKC}(k(\cdot))\subseteq{\cal PKC}(k(\cdot)+\log(\cdot))$). ----- File: 1994/tr-94-024 The Design and Evaluation of Routing Algorithms for Real-time Channels Ron Widyono tr-94-024 June 1994 The Tenet Scheme specifies a real-time communication service that guarantees performance through network connections with reserved resources, admission control, and rate control. Within this framework, we develop and evaluate algorithms that find routes for these multicast connections. The main goals are the establishment of the routed connection, the maximization of the useful utilization of the network, and timeliness. The problem to be solved is finding a minimum cost tree where each source to destination path is constrained by a delay bound. This problem is NP-complete, so heuristics based mainly on minimum incremental cost are developed. Algorithms we develop use those heuristics to calculate paths that are merged into a tree. We evaluate our design decisions through simulation, measuring success through the number of successfully established connections. ----- File: 1994/tr-94-025 Fast and Efficient Parallel Algorithms for Problems in Control Theory B. Codenotti, B. N. Datta, K. Datta, M. Leoncini tr-94-025 August 1994 Remarkable progress has been made in both theory and applications of all important areas of control. On the other hand, progress in computational aspects of control theory, especially in the area of large-scale and parallel computations, has been painfully slow. In this paper we address some central problems arising in control theory, namely the controllability and the eigenvalue assignment problems, and the solution of the Lyapunov and Sylvester observer matrix equations. For all these problems we give parallel algorithms that run in almost linear time on a Parallel Random Access Machine model. The algorithms make efficient use of the processors and are scalable, which makes them of practical worth also in the case of limited parallelism.

Keywords: parallel algorithms, linear algebra, control theory, controllability, eigenvalue assignment, Lyapunov equation, Sylvester equation ----- File: 1994/tr-94-026 A Formal Framework for Weak Constraint Satisfaction Based on Fuzzy Sets Hans Werner Guesgen tr-94-026 June 1994 Recent work in the field of artificial intelligence has shown that many problems can be represented as a set of constraints on a set of variables, i.e., as a constraint satisfaction problem. Unfortunately, real world problems tend to be inconsistent, and therefore the corresponding constraint satisfaction problems do not have solutions. A way to circumvent inconsistent constraint satisfaction problems is to make them fuzzy. The idea is to associate fuzzy values with the elements of the constraints, and to combine these fuzzy values in a reasonable way, i.e., a way that directly corresponds to the way crisp constraint problems are handled.
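
A minimal sketch of the combination idea (API ours; min is the usual fuzzy analogue of conjunction, and a crisp CSP is the special case where every degree is 0 or 1):

    def fuzzy_satisfaction(assignment, constraints):
        """Degree to which an assignment satisfies a set of fuzzy constraints.
        Each constraint maps an assignment to a degree in [0, 1]; the
        degrees are combined with min. Illustrative only."""
        return min((c(assignment) for c in constraints), default=1.0)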

Keywords: weak constraint satisfaction, constraint relaxation, fuzzy sets ----- File: 1994/tr-94-027 Some MPEG Decoding Functions on Spert -- An Example for Assembly Programmers Arno Formella tr-94-027 October 1994 We describe our method for implementing C program sequences in Torrent (T0) assembler code while no efficient automatic tool is available. We use re-structuring of the source code, vectorization, dataflow graphs, a simple scheduling strategy and a straightforward register allocation algorithm. We derive lower and upper bounds for the expected run time. For two functions, namely the color transformation and the reverse DCT, we achieve almost 54 and 16 times, respectively, the performance of a Sparc 2 workstation. ----- File: 1994/tr-94-028 On the parallel complexity of Gaussian Elimination with Pivoting M. Leoncini tr-94-028 August 1994 Consider the Gaussian Elimination algorithm with the well-known Partial Pivoting strategy for improving numerical stability (GEPP). Vavasis proved that the problem of determining the pivot sequence used by GEPP is log space-complete for {\bf P}, and thus inherently sequential. Assuming ${\rm\bf P}\ne{\rm\bf NC}$, we prove here that either the latter problem cannot be solved in parallel time $O(N^{1/2-\epsilon})$ or all the problems in {\bf P} admit polynomial speedup. Here $N$ is the order of the input matrix and $\epsilon$ is any positive constant. This strengthens the P-completeness result mentioned above. We conjecture that the result proved in this paper holds for the stronger bound $O(N^{1-\epsilon})$ as well, and provide supporting evidence for the conjecture. Note that this is equivalent to asserting the asymptotic optimality of the naive parallel algorithm for GEPP (modulo ${\rm\bf P}\ne{\rm\bf NC}$).

Keywords: Gaussian Elimination with Partial Pivoting, P-complete problems, NC class, polynomial speedup, strict P-completeness ----- File: 1994/tr-94-029 Efficient Approximation Algorithms for Sparse Polynomials over Finite Fields Marek Karpinski and Igor Shparlinski tr-94-029 July 1994 We obtain new lower bounds on the number of non-zeros of sparse polynomials and give a fully polynomial time $(\epsilon,\delta)$ approximation algorithm for the number of non-zeros of multivariate sparse polynomials over a finite field of q elements and degree less than q - 1. This partially answers an open problem of D. Grigoriev and M. Karpinski. Also, probabilistic and deterministic algorithms for testing identity to zero of a sparse polynomial given by a "black box" are given. Finally, we propose an algorithm to estimate the size of the image of a univariate sparse polynomial. ----- File: 1994/tr-94-030 Simulating Threshold Circuits by Majority Circuits (Extended Version) Mikael Goldmann and Marek Karpinski tr-94-030 August 1994 We prove that a single threshold gate with arbitrary weights can be simulated by an explicit polynomial-size depth 2 majority circuit. In general we show that a depth d threshold circuit can be simulated uniformly by a majority circuit of depth d + 1. Goldmann, Hastad, and Razborov showed in [10] that a non-uniform simulation exists. Our construction answers two open questions posed in [10]: we give an explicit construction whereas [10] uses a randomized existence argument, and we show that such a simulation is possible even if the depth d grows with the number of variables n (the simulation in [10] gives polynomial-size circuits only when d is constant). ----- File: 1994/tr-94-031 Massively Parallel Real-Time Reasoning with Very Large Knowledge Bases: An Interim Report D. R. Mani and Lokendra Shastri tr-94-031 August 1994 We map structured connectionist models of knowledge representation and reasoning onto existing general purpose massively parallel architectures with the objective of developing and implementing practical, real-time reasoning systems. SHRUTI, a connectionist knowledge representation and reasoning system which attempts to model reflexive reasoning, serves as our representative connectionist model. Realizations of SHRUTI are developed on the Connection Machine CM-2--an SIMD architecture--and on the Connection Machine CM-5--an MIMD architecture.

Though SIMD implementations on the CM-2 are reasonably fast--requiring a few seconds to tens of seconds for answering queries--experiments indicate that SPMD message passing systems are vastly superior to SIMD systems and offer hundred-fold speedups. The CM-5 implementation can encode large knowledge bases with several hundred thousand (randomly generated) rules and facts, and respond in under 500 milliseconds to a range of queries requiring inference depths of up to eight.

This work provides some new insights into the simulation of structured connectionist networks on massively parallel machines and is a step toward developing large yet efficient knowledge representation and reasoning systems. ----- File: 1994/tr-94-032 Detection of Side-Effects in Function Procedures Robert Griesemer tr-94-032 September 1994 Procedural programming languages usually do not support side-effect free functions but merely a form of function procedures. We argue that functions should be free of (non-local) side-effects, if they are considered as an abstraction mechanism for expressions. While it is easy to statically detect side-effects in functions that do not dynamically allocate variables, this is no longer the case for functions that do create new data structures. After giving a classification of different levels of side-effects, we describe a simple and efficient method that allows for their dynamic detection while retaining assignments, i.e., without referring to a pure functional implementation. The method has been implemented for an experimental subset of Oberon. ----- File: 1994/tr-94-033 Admission Control in Networks with Bounded Delay Services Jorg Liebeherr, Dallas E. Wrege and Domenico Ferrari tr-94-033 August 1994 To support the requirements for the transmission of continuous media, such as audio and video, multiservice packet switching networks must provide service guarantees to connections, including guarantees on throughput, network delays, and network delay variations. For the most demanding applications, the network must offer a service which can provide deterministic guarantees for the maximum delay of packets from all connections, referred to as bounded delay service. The admission control functions in a network with a bounded delay service must have available schedulability conditions that detect violations of delay guarantees in a network switch. In this study, exact schedulability conditions are presented for packet switches which transmit packets based on an Earliest-Deadline-First (EDF) or a Static-Priority (SP) algorithm. The schedulability conditions are given in terms of a general traffic model, making the conditions applicable to a large class of traffic specifications. A comparison of the new schedulability conditions with existing, less accurate, conditions shows the efficiency gain obtained by using exact conditions. Examples are presented that show how the selection of a particular traffic specification and a schedulability condition impacts the efficiency of a bounded delay service.
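
For intuition, a deterministic EDF schedulability test of the general kind described above can be phrased as follows (discrete-time sketch with names ours; the report's exact conditions differ):

    def edf_feasible(flows, link_rate, horizon):
        """Illustrative bounded-delay admission test for an EDF switch.
        flows: list of (env, d) pairs, where env(t) bounds the bits a
            source can emit in any window of length t (env(t) == 0 for
            t <= 0) and d is the flow's delay bound.
        Checks that, at every time t, the traffic that must have departed
        by t does not exceed the link capacity link_rate * t."""
        for t in range(1, horizon + 1):
            demand = sum(env(t - d) for (env, d) in flows)
            if demand > link_rate * t:
                return False
        return True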

Keywords: Multiservice Networks, Real-time Networks, Bounded Delay Service, Multiplexing, Quality of Service, Packet Scheduling, Admission Control, Static-Priority, Earliest-Deadline-First. ----- File: 1994/tr-94-034 Design and Analysis of a High-Performance Packet Multiplexer for Multiservice Networks with Delay Guarantees Jorg Liebeherr and Dallas E. Wrege tr-94-034 August 1994 A major challenge for the design of multiservice networks with quality of service guarantees is an efficient implementation of a bounded delay service, that is, a service that guarantees maximum end-to-end delays for every packet from a single traffic stream. A crucial component of a bounded delay service is the packet multiplexing technique employed at network switches that must keep the variable statistical multiplexing delays below a predetermined threshold. To achieve a high utilization of network resources, the multiplexing technique must be sufficiently sophisticated to support a variable set of delay bounds for a large number of traffic streams. On the other hand, since multiplexing of packets is to be performed at the data rate of the network links, the complexity of the multiplexer should be strictly limited. A novel packet multiplexing technique, called Rotating Priority Queues (RPQ), is presented which exploits the tradeoff between efficiency, i.e., the ability to support many connections with delay bounds, and low complexity. The operations required by the RPQ multiplexer are similar to those of the simple, but inefficient, Static Priority (SP) multiplexer. The efficiency of RPQ can be made arbitrarily close to the highly efficient, yet complex, Earliest Deadline First (EDF) multiplexer. Exact expressions for the worst case delays in an RPQ multiplexer are derived and compared to expressions for an EDF multiplexer. ----- File: 1994/tr-94-035 Output Sets, Halting Sets and an Arithmetical Hierarchy for Ordered Subrings of the Real Numbers under Blum/Shub/Smale Computation Rose Saint John tr-94-035 August 1994 The original exposition of Blum/Shub/Smale computation for subrings and subfields of real numbers (1989) asks how generally output and halting sets coincide. Aspects of this question were subsequently addressed by Michaux, Byerly, and Friedman/Mansfield. This document synthesizes, simplifies, and extends their answers.

Distinguishing output sets from halting sets in the reals and subrings of the reals leads to a natural arithmetical hierarchy of non-computable sets. Operators analogous to the Jump operator of classical recursion theory are used to build an arithmetical hierarchy from the empty set. As expected, the classical arithmetical hierarchy for the natural numbers occurs as a special case. Additional special cases arise in other subrings and subfields of the real numbers. ----- File: 1994/tr-94-036 On finding a minimal enclosing parallelogram Christian Schwarz, Jürgen Teich, Emo Welzl and Brian Evans tr-94-036 August 1994 Given a convex polygon C with n vertices, we show how a parallelogram with minimal area enclosing C can be computed in linear time O(n). The problem is of interest in digital signal processing. ----- File: 1994/tr-94-037 Faster Computation On Directed Networks of Automata Rafail Ostrovsky, Daniel Wilkerson tr-94-037 August 1994 We show how an arbitrary strongly-connected {\em directed} network of synchronous finite-state automata (with bounded in- and out-degree) can accomplish a number of basic distributed network tasks in $O(ND)$ time, where $D$ is the diameter of the network and $N$ is the number of processors. The tasks include (among others) the Firing Synchronization Problem; Network Search and Traversal; building outgoing and incoming Spanning Trees; Wake-up and Report When Done; and simulating a step of an undirected network protocol for the underlying graph of the directed network. Our approach compares favorably to the best previously known $O(N^2)$ algorithms of Even, Litman and Winkler \cite{elw} for all these problems. ----- File: 1994/tr-94-038 MBP on T0: mixing floating- and fixed-point formats in BP learning Davide Anguita and B. Gomes tr-94-038 August 1994 We examine the efficient implementation of back prop type algorithms on T0 [4], a vector processor with a fixed point engine, designed for neural network simulation. A matrix formulation of back prop, Matrix Back Prop [1], has been shown to be very efficient on some RISCs [2]. Using Matrix Back Prop, we achieve an asymptotically optimal performance on T0 (about 0.8 GOPS) for both forward and backward phases, which is not possible with the standard on-line method. Since high efficiency is futile if convergence is poor (due to the use of fixed point arithmetic), we use a mixture of fixed and floating point operations. The key observation is that the precision of fixed point is sufficient for good convergence, if the range is appropriately chosen. Though the most expensive computations are implemented in fixed point, we achieve a rate of convergence that is comparable to the floating point version. The time taken for conversion between fixed and floating point is also shown to be reasonable.

We define a measure for a set of priorities, called the rate, which dictates how much information about the message must be contained in each bit of the encoding. We develop systems for implementing any set of priorities with rate equal to one. We also give an information-theoretic proof that there is no system that implements a set of priorities with rate greater than one.

This work has immediate applications to multimedia and high-speed networks, especially those with bursty sources and multiple receivers with heterogeneous capabilities. Implementations of the system show promise of being practical. ----- File: 1994/tr-94-040 Introducing resources management in IP-based nodes Pietro Manzoni tr-94-040 October 1994 The Internet Protocol was designed to be used with packet-switched communication networks and, as originally designed, does not provide the characteristics necessary to support voice and video transmission. The lack of control over the number of connections supported leads to highly variable delays for packets and often to packet loss.

In this paper, an enhancement of an IP based node (called IP') is presented to allow a simple management of the node's resources. We introduce higher interaction between the transport and the network layers through additional processes and functions. The paper also presents, as an example, a transport layer protocol that shows how to take advantage of the new functionalities provided by the IP' nodes.

Two fundamental hypotheses guided the design process: 1) the effort in moving an IP-based node to an IP'-based node had to be smaller than the effort required in moving to a completely different protocol suite, and 2) the regular Internet traffic should not be affected or modified at all.

Simulation results are presented to show that this approach can actually bound the variation of delay and throughput. In addition, this approach can also control the number of packets lost. ----- File: 1994/tr-94-041 Approaching the 5/4-Approximation for Rectilinear Steiner Trees Piotr Berman, Ulrich Fössmeier, Marek Karpinski, Michael Kaufmann and Alexander Zelikovsky tr-94-041 August 1994 The rectilinear Steiner tree problem asks for a shortest tree connecting a given set of terminal points in the plane with rectilinear distance. We show that the performance ratio of Zelikovsky's heuristic [17] is between 1.3 and 1.3125 (before, it was only bounded from above by 1.375), while the performance ratio of the heuristic of Berman and Ramaiyer [1] is at most 1.271 (while the previous bound was 1.347). Moreover, we provide $O(n \log^2 n)$-time algorithms that satisfy these performance ratios. ----- File: 1994/tr-94-042 Counting Curves and Their Projections Joachim von zur Gathen, Marek Karpinski and Igor Shparlinski tr-94-042 August 1994 Some deterministic and probabilistic methods are presented for counting and estimating the number of points on curves over finite fields, and on their projections. The classical question of estimating the size of the image of a univariate polynomial is a special case. For curves given by sparse polynomials, the counting problem is #P-complete via probabilistic parsimonious Turing reductions. ----- File: 1994/tr-94-043 On the Computational Complexity of Matching on Chordal and Strongly Chordal Graphs Elias Dahlhaus and Marek Karpinski tr-94-043 August 1994 In this paper we study the computational complexity (both sequential and parallel) of the maximum matching problem for chordal and strongly chordal graphs. We show that there is a linear time greedy algorithm for a maximum matching in a strongly chordal graph, provided a strongly perfect elimination ordering is known. This algorithm can also be turned into a parallel algorithm. The technique used can also be extended to the multidimensional matching for chordal and strongly chordal graphs, yielding the first polynomial time algorithms for these classes of graphs (the multidimensional matching is NP-complete in general). ----- File: 1994/tr-94-044 Feature Binding through Synchronized Neuronal Oscillations: A Preliminary Study Ruggero Milanese tr-94-044 August 1994 In this report we analyze the feature binding problem, a combinatorial complexity problem that affects connectionist networks using multiple topographic representations of an image. Inspired by some evidence about the human visual system, we suggest that a solution to this problem may derive from the combined use of attention mechanisms and from exploiting the temporal synchrony of neuronal firing. To this end, a new framework is proposed in terms of a neuronal model, and of a computational architecture capable of producing synchronized firing in distributed assemblies of neurons. This synchronized behavior only affects neurons selected by the network to represent objects of interest. The architecture is structured into a set of feature, conspicuity, and saliency maps, whose neurons are connected in a feedback loop. A number of mechanisms are proposed in order to implement each of these stages, including strategies for reinforcing the synchronous firing of the selected neurons. ----- File: 1994/tr-94-045 Development of Parallel BLAS with ARCH Object-Oriented Parallel Library, Implementation on CM-5 J. M.
Adamo tr-94-045 August 1994 This paper reports on the development of BLAS classes using the ARCH library. The BLAS library consists of two new SpreadMatrix and SpreadVector classes that are simply derived from the ARCH SpreadArray class. Their implementation essentially makes use of the ARCH remote read and write functions together with barrier-synchronization. They provide a good illustration of how ARCH can contribute to the development of loosely-synchronous systems. This paper describes the architecture of the SpreadMatrix and SpreadVector classes and illustrates their use through the construction of a neural-network simulator. ----- File: 1994/tr-94-046 Object Oriented Design of a BP Neural Network Simulator and Implementation on the Connection Machine (CM-5) J. M. Adamo and D. Anguita tr-94-046 September 1994 In this paper we describe the implementation of the backpropagation algorithm by means of an object-oriented library (ARCH). The use of this library relieves the user of the details of a specific parallel programming paradigm and at the same time allows greater portability of the generated code.

To provide a comparison with existing solutions, we survey the most relevant implementations of the algorithm proposed so far in the literature, both on dedicated and general purpose computers.

Extensive experimental results show that the use of the library does not hurt the performance of our simulator; on the contrary, our implementation on a Connection Machine (CM-5) is comparable with the fastest in its category. ----- File: 1994/tr-94-047 Traffic Characterization and Switch Utilization using a Deterministic Bounding Interval Dependent Traffic Model Edward W. Knightly and Hui Zhang tr-94-047 August 1994 Compressed digital video is one of the most important types of traffic in the future integrated services networks. It is difficult to support this class of traffic since, on one hand, compressed video is bursty, while on the other hand, it requires performance guarantees from the network. The common belief is that we are unlikely to achieve a high network utilization while providing performance guarantees to bursty traffic. While this is certainly true for traditional data traffic, compressed video is much more "regular" and "smooth" than data traffic. In this paper, we propose a deterministic bounding interval-dependent (BIND) model to capture the source's characteristics. We use the BIND model together with a tighter analysis technique to show that, contrary to common belief, reasonable network utilization can be achieved for compressed video even when deterministic guarantees are provided. In the study, we used several 10-minute-long MPEG compressed video sequences to demonstrate the effectiveness of the new model. Since sources may be multiplexed beyond their peak rate even if all packets are deterministically guaranteed to meet their loss and delay bounds, we define the Deterministic Multiplexing Gain (DMG) as the fraction above a peak-rate allocation scheme that is achieved while still providing a deterministic performance guarantee. We show that with the new BIND model, network utilizations as high as 60% and DMGs of up to 2.8 are achievable for MPEG video.
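
To make the DMG figure concrete (numbers ours, purely illustrative): on a 155 Mb/s link carrying video streams with a 10 Mb/s peak rate, peak-rate allocation admits 15 streams; a DMG of 2.8 then corresponds to admitting 42 streams while every packet still meets its deterministic delay and loss bounds.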

Keywords: quality of service, video traffic characterization, deterministic multiplexing gain ----- File: 1994/tr-94-048 Comparison of Rate-Controlled Static Priority and Stop-and-Go Hui Zhang and Edward W. Knightly tr-94-048 August 1994 To support emerging real-time applications, high speed integrated services networks need to provide end-to-end performance guarantees on a per-connection basis in a networking environment. In addition to the issue of how to allocate resources to meet diverse QOS requirements in a single switch, resource management algorithms also need to account for the fact that traffic may get burstier and burstier as it traverses the network due to complex interaction among packet streams at each switch. To address this problem, several non-work-conserving packet service disciplines have been proposed that fully or partially reconstruct the traffic pattern of the original source inside the network. This is achieved by a policing or delay-jitter control mechanism in which packets may be held at intermediate switches in order to keep the traffic from becoming burstier. In this paper, we compare two non-work-conserving disciplines: Stop-and-Go and Rate-Controlled Static Priority or RCSP. Stop-and-Go uses a multi-level framing strategy to allocate resources in a single switch and to ensure traffic smoothness throughout the network. RCSP decouples the server functions by having two components: a regulator to partially or fully reconstruct the traffic pattern and a static priority scheduler to allocate delay bounds in a single switch. We compare the two service disciplines in terms of traffic specification, scheduling mechanism, buffer space requirement, end-to-end delay characteristics, connection admission control algorithms, and achievable network utilization. The comparison is first done analytically, and then using MPEG compressed video traces for numerical investigations into the properties of practical real-time network sources. ----- File: 1994/tr-94-049 Lower Space Bounds for Randomized Computation Rusins Freivalds and Marek Karpinski tr-94-049 September 1994 It is a fundamental open problem in randomized computation how to separate different randomized time or randomized small space classes (cf., e.g., [KV 87], [KF 88]). In this paper we study lower space bounds for randomized computation, and prove lower space bounds up to log n for the specific sets computed by the Monte Carlo Turing machines. This enables us, for the first time, to separate randomized space classes below log n (cf. [KV 87], [KV 88]), allowing us to separate, say, the randomized space O(1) from the randomized space O(log* n). We prove also lower space bounds up to log log n and log n, respectively, for specific sets computed by probabilistic Turing machines, and one-way probabilistic Turing machines. ----- File: 1994/tr-94-050 Scalable resource reservation for multi-party real-time communication Amit Gupta, Wingwai Howe, Mark Moran and Quyen Nguyen tr-94-050 October 1994 Current approaches to supporting real-time communication allocate network resources either to individual connections, or to aggregates of connections, based on type of traffic, protocol, or performance requirements. The first approach provides well-defined performance guarantees that are independent of other network traffic. The second approach may achieve higher utilization of network resources, but the expected performance is less well-defined since it is dependent on the behavior of unrelated (possibly unknown) connections.
Resource sharing is a new approach that exploits known relationships between related connections to allow network resources to be shared without sacrificing well-defined guarantees. Most importantly, for large conferences with a bounded number of concurrent speakers, resource requirements do not increase with the number of potential speakers. Therefore, resource sharing is an important tool for providing real-time performance guarantees for large conferences. This paper presents a fully distributed technique for using resource sharing to provide real-time guarantees in a general internetworking environment. The technique is described in the context of its implementation in the next generation of the Tenet real-time protocols. However, the underlying principles are equally applicable to other communication paradigms and techniques. A companion report presents the results of simulation experiments; the simulations show that resource sharing leads to large gains in connection acceptance rates and a significant reduction in computational overhead associated with admission control for real-time communication. ----- File: 1994/tr-94-051 Evaluation of resource sharing benefits Amit Gupta, Wingwai Howe, Mark Moran and Quyen Nguyen tr-94-051 October 1994 Current approaches to supporting real-time communication allocate network resources either to individual connections, or to aggregates of connections, based on type of traffic, protocol, or performance requirements. The first approach provides well-defined performance guarantees that are independent of other network traffic. The second approach may achieve higher utilization of network resources, but the expected performance is less well-defined since it is dependent on the behavior of unrelated (possibly unknown) connections. We previously presented resource sharing, a new approach that exploits known relationships between related connections to allow network resources to be shared without sacrificing well-defined guarantees. Resource sharing is very important for large conferences with a bounded number of concurrent speakers, resource requirements do not increase with the number of potential speakers. In this paper, we evaluate resource sharing benefits by analysis and by simulation. Results show that resource sharing leads to a large gain in the connection acceptance rate, and a significant reduction in the computational overhead associated with admission control. ----- File: 1994/tr-94-052 Automatic Induction of Finite State Transducers for Simple Phonological Rules Dan Gildea and Dan Jurafsky tr-94-052 October 1994 This paper presents a method for learning phonological rules from sample pairs of underlying and surface forms, without negative evidence. The learned rules are represented as finite state transducers that accept underlying forms as input and generate surface forms as output. The algorithm for learning them is an extension of the OSTIA algorithm for learning general subsequential finite state transducers. Although OSTIA is capable of learning arbitrary s.f.s.t's in the limit, large dictionaries of actual English pronunciations did not give enough samples to correctly induce phonological rules. We then augmented OSTIA with two kinds of knowledge specific to natural language phonology, representing a naturalness bias from ``universal grammar''. A bias that underlying phones are often realized as phonetically similar or identical surface phones was implemented by using alignment information between the underlying and surface strings. 
A bias that phonological rules apply across natural phonological classes was implemented by learning decision trees based on phonetic features on each state of the transducer. The additions helped in learning more compact, accurate, and general transducers than the unmodified OSTIA algorithm. An implementation of the algorithm successfully learns a number of English postlexical rules, including flapping, t-insertion and t-deletion. ----- File: 1994/tr-94-053 Software Reliability via Run-Time Result-Checking Manuel Blum and Hal Wasserman tr-94-053 October 1994 We review the field of result-checking, discussing simple checkers and self-correctors. We argue that such checkers could profitably be incorporated in software as an aid to efficient debugging and reliable functionality. We consider how to modify traditional checking methodologies to make them more appropriate for use in real-time, real-number computer systems. In particular, we suggest that checkers should be allowed to use "stored randomness": i.e., that they should be allowed to generate, pre-process, and store random bits prior to run-time, and then to use this information repeatedly in a series of run-time checks. In a case study of checking a general real-number linear transformation (for example, a Fourier Transform), we present a simple checker which uses stored randomness, and a self-corrector which is particularly efficient if stored randomness is allowed.
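
The linear-transformation case study lends itself to a compact sketch (code and names ours, not the authors' construction): draw a random vector once, precompute its product with the matrix before run time, and use the stored result for cheap run-time checks.

    import random

    def make_linear_checker(A, tol=1e-9, rng=random.Random(0)):
        """Checker for a claimed linear transform y = A x using "stored
        randomness": r is drawn and s = r^T A is computed once, offline;
        each run-time check verifies the scalar identity r^T y == s^T x
        (up to floating-point tolerance) in O(n) time instead of
        recomputing the O(n^2) product. Illustrative only."""
        n = len(A)
        r = [rng.gauss(0.0, 1.0) for _ in range(n)]
        s = [sum(r[i] * A[i][j] for i in range(n)) for j in range(len(A[0]))]
        def check(x, y):
            lhs = sum(r[i] * y[i] for i in range(n))
            rhs = sum(s[j] * x[j] for j in range(len(x)))
            return abs(lhs - rhs) <= tol * (1.0 + abs(lhs) + abs(rhs))
        return check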

Keywords: result checking, instance checking, verification, testing. ----- File: 1994/tr-94-054 Therapy Plan Generation in Complex Dynamic Environments Oksana Arnold and Klaus P. Jantke tr-94-054 October 1994 A methodology has been developed for the automatic synthesis of therapy plans for complex dynamic systems. An algorithm has been implemented and tested. This forms the core of a control synthesis module which is embedded in a larger knowledge-based system for control, diagnosis and therapy. There are several applications.

The approach is based on certain concepts of structured graphs. The overall search space is a family of hierarchically structured plans. Together with some goal specification, it forms a so-called rooted family. Simple concepts of graph substitution and rewriting are introduced. The output of the planner is a hierarchically structured plan, which has a uniquely determined normal form that is taken for execution.

Plan generation is interpreted as inductive program synthesis. Indeed, the planner developed and implemented works as an inductive inference machine. It turns out that consistency and executability are two fundamental but distinct concepts. When describing the program synthesis algorithm, we focus on constraint monitoring. This is taken as a basis for generating programs that are consistent with the underlying technology representation. ----- File: 1994/tr-94-055 Counting in Lattices: Combinatorial Problems from Statistical Mechanics Dana Randall tr-94-055 October 1994 In this thesis we consider two classical combinatorial problems arising in statistical mechanics: counting matchings and self-avoiding walks in lattice graphs. The first problem arises in the study of the thermodynamical properties of monomers and dimers (diatomic molecules) in crystals. Fisher, Kasteleyn and Temperley discovered an elegant technique to exactly count the number of perfect matchings in two dimensional lattices, but it is not applicable for matchings of arbitrary size, or in higher dimensional lattices. We present the first efficient approximation algorithm for computing the number of matchings of any size in any periodic lattice in arbitrary dimension. The algorithm is based on Monte Carlo simulation of a suitable Markov chain and has rigorously derived performance guarantees that do not rely on any assumptions. In addition, we show that these results generalize to counting matchings in any graph which is the Cayley graph of a finite group.
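The abstract does not spell the chain out; the following is a sketch of the standard add/delete/slide dynamics used in Markov chain Monte Carlo algorithms of this kind (representation and names are ours). The transitions are symmetric, so the stationary distribution is uniform over matchings; turning near-uniform samples into an approximate count then follows by standard self-reducibility arguments.

```python
import random

def step(edges, M):
    """One transition of the add/delete/slide matchings chain.
    M is a set of frozenset({u, v}) edges forming a matching."""
    u, v = random.choice(edges)
    e = frozenset((u, v))
    covered = {x for f in M for x in f}
    if e in M:
        M.remove(e)                               # delete e
    elif u not in covered and v not in covered:
        M.add(e)                                  # add e
    elif (u in covered) != (v in covered):        # exactly one endpoint busy
        w = u if u in covered else v
        (blocker,) = [f for f in M if w in f]
        M.remove(blocker)
        M.add(e)                                  # slide: e replaces the blocker
    return M                                      # both busy: stay put

# toy 2x2 grid; after many steps M is a near-uniform random matching
edges = [("a", "b"), ("c", "d"), ("a", "c"), ("b", "d")]
M = set()
for _ in range(10000):
    step(edges, M)
```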

The second problem is counting self-avoiding walks in lattices. This problem arises in the study of the thermodynamics of long polymer chains in dilute solution. While there are a number of Monte Carlo algorithms used to count self-avoiding walks in practice, these are heuristic and their correctness relies on unproven conjectures. In contrast, we present an efficient algorithm which relies on a single, widely-believed conjecture that is simpler than preceding assumptions and, more importantly, is one which the algorithm itself can test. Thus our algorithm is reliable, in the sense that it either outputs answers that are guaranteed, with high probability, to be correct, or finds a counterexample to the conjecture. In either case we know we can trust our results and the algorithm is guaranteed to run in polynomial time. This is the first algorithm for counting self-avoiding walks in which the error bounds are rigorously controlled. ----- File: 1994/tr-94-056 Multi-level Architecture of object-oriented Operating Systems Sven Graupner, Winfried Kalfa, and Frank Schubert tr-94-056 November 1994 Applications should be provided with optimal infrastructures at their run time. The proposed architecture encourages structuring a system into sets of interacting instances supported by optimal infrastructures at multiple levels. Infrastructures are organized as sets of instances as well, but of more elementary quality. Thus, a recursive architecture results, with related infrastructures and instance areas forming an n-ary tree. Each instance area provides the infrastructure for higher instance areas and itself requires a lower-level infrastructure. Processing is considered as performing services among instances.

Object-orientation proves to be suitable for structuring instance areas and infrastructures. Instances performing services are objects. A discussion of general principles of object-orientation gives the background for applying it to this architecture. Most existing object-oriented systems consider only one kind or ``quality'' of objects, which is, however, inadequate for operating systems. The paper discusses what essentially distinguishes the different ``qualities'' of objects at different levels and how activities are related to them.

The last section presents the design and implementation of a lowest-level infrastructure, taken from an operating system prototype that follows the proposed architecture and is under development in our group. ----- File: 1994/tr-94-057 Information Theory and Noisy Computation William S. Evans tr-94-057 November 1994 Thesis. The information carried by a signal unavoidably decays when the signal is corrupted by random noise. This occurs when a noisy channel transmits a message as well as when a noisy component performs computation. We first study this signal decay in the context of communication and obtain a tight bound on the decay of the information carried by a signal as it crosses a noisy channel. We then use this information theoretic result to obtain depth lower bounds in the noisy circuit model of computation defined by von Neumann. In this model, each component fails (produces 1 instead of 0 or vice-versa) independently with a fixed probability, and yet the output of the circuit should be correct with high probability. Von Neumann showed how to construct circuits in this model that reliably compute a function and are no more than a constant factor deeper than noiseless circuits for the function. Our result implies that such a multiplicative increase in depth is necessary for reliable computation. The result also indicates that above a certain level of component noise, reliable computation is impossible.
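A worked instance of the communication half of this setting, with our own toy numbers rather than the paper's bound: a uniform input bit crossing k independent binary symmetric channels, each flipping its input with probability eps, is flipped overall with probability p_k = (1 - (1 - 2 eps)^k) / 2, so the surviving mutual information 1 - H(p_k) decays geometrically with depth k.

```python
import math

def H(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def end_to_end_information(eps, k):
    """Mutual information between a uniform bit and its image after
    k cascaded binary symmetric channels with flip probability eps."""
    p_k = (1 - (1 - 2 * eps) ** k) / 2
    return 1 - H(p_k)

for k in (1, 2, 4, 8):
    print(k, end_to_end_information(0.1, k))   # approx. 0.53, 0.32, 0.12, 0.02
```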

We use a similar technique to lower bound the size of reliable circuits in terms of the noise and complexity of their components, and the sensitivity of the function they compute. Our bound is asymptotically equivalent to previous bounds as a function of sensitivity, but unlike previous bounds, its dependence on component noise implies that as this noise increases to 1/2, the size of reliable circuits must increase unboundedly. In all cases, the bound is strictly stronger than previous results.

Using different techniques, we obtain the exact threshold for component noise, above which noisy formulas cannot reliably compute all functions. We obtained an upper bound on this threshold in studying the depth of noisy circuits. The fact that this bound is only slightly larger than the true threshold indicates the high precision of our information theoretic techniques. ----- File: 1994/tr-94-058 Hierarchical Encoding of MPEG Sequences Using Priority Encoding Transmission (PET) Christian Leicher tr-94-058 November 1994 Priority Encoding Transmission (PET) is a new approach to the transmission of prioritized information over lossy packet-switched networks. The basic idea is that the source assigns different priorities to different segments of data, and then PET encodes the data using multi-level redundancy and disperses the encoding into the packets to be transmitted. The property of PET is that the destination is able to recover the data in priority order based on the number of packets received per message.
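A minimal sketch of the erasure-coding building block behind such a scheme (the toy prime field and all names are ours; PET additionally assigns a different redundancy level, i.e. a different k/n ratio, to each priority class): k message symbols determine a polynomial of degree k-1, n evaluations of it are dispersed into the packets, and any k received packets suffice to recover the message.

```python
P = 257  # prime field for byte-sized symbols (toy stand-in for GF(2^8))

def lagrange_eval(points, x):
    """Evaluate the unique degree-(k-1) polynomial through `points` at x, mod P."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P  # Fermat inverse
    return total

def encode(symbols, n):
    """Disperse k message symbols into n packets (points of the polynomial)."""
    base = list(enumerate(symbols))                 # systematic points 0..k-1
    return [(x, lagrange_eval(base, x)) for x in range(n)]

def decode(packets, k):
    """Recover the k message symbols from any k received packets."""
    return [lagrange_eval(packets[:k], i) for i in range(k)]

msg = [72, 105, 33]                     # k = 3 symbols
pkts = encode(msg, 7)                   # 7 packets: tolerates any 4 losses
assert decode([pkts[1], pkts[4], pkts[6]], 3) == msg
```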

This work addresses the hierarchical encoding of MPEG video streams in a PET scenario. Its focus is on the recovery aspect rather than on computational issues. The basic idea is that inter-frames are less redundantly encoded than intra-frames. It introduces a scenario intended to demonstrate the feasibility of our design considerations and describes simulation results with different MPEG sequences.

Keywords: Packet video, PET, MPEG, Erasure Codes ----- File: 1994/tr-94-059 Tenet Real-Time Protocol Suite: Design, Implementation, and Experiences Anindo Banerjea, Domenico Ferrari, Bruce A. Mah, Mark Moran, Dinesh C. Verma, and Hui Zhang tr-94-059 November 1994 Many future applications will require guarantees on network performance, such as bounds on throughput, delay, delay jitter, and reliability. To address this need, the Tenet Group at the University of California at Berkeley has designed, simulated, and implemented a suite of network protocols to support {\em real-time channels} (network connections with mathematically provable performance guarantees). The protocols, which constitute the prototype Tenet Real-Time Protocol Suite ({\em Suite 1}), run on a packet-switching internetwork, and can coexist with the popular Internet Suite. We rely on the use of connection-oriented communication, admission control, and channel rate control.

This protocol suite is the first complete set of communication protocols that can transfer real-time streams with guaranteed quality in packet-switching internetworks. Our initial development was done on a local-area FDDI network. We have since installed our protocols on the experimental wide-area internetwork of Project Sequoia 2000, where they have been running for several months. We have performed a number of experiments and demonstrations in this environment using continuous-media loads (particularly video). Our results show that our approach is both feasible and practical to build, and that it can successfully provide performance guarantees to real-time applications. This paper describes the design and implementation of the suite, the experiments we performed, and selected results, along with the lessons we learned. ----- File: 1994/tr-94-060 Feature selection for object tracking in traffic scenes Sylvia Gil, Ruggero Milanese, and Thierry Pun tr-94-060 November 1994 This paper describes a motion-analysis system, applied to the problem of vehicle tracking in real-world highway scenes. The system is structured in two stages. In the first one, a motion-detection algorithm performs a figure/ground segmentation, providing binary masks of the moving objects. In the second stage, vehicles are tracked for the rest of the sequence, by using Kalman filters on two state vectors, which represent each target's position and velocity. A vehicle's motion is represented by an affine model, taking into account translations and scale changes. Three types of features have been used for the vehicle's description state vectors. Two of them are contour-based: the bounding box and the centroid of the convex polygon approximating the vehicle's contour. The third one is region-based and consists of the 2-D pattern of the vehicle in the image. For each of these features, the performance of the tracking algorithm has been tested, in terms of the position error, stability of the estimated motion parameters, trace of the motion model's covariance matrix, as well as computing time. A comparison of these results favors the use of the bounding box features. ----- File: 1994/tr-94-061 Resource partitioning for multi-party real-time communication Amit Gupta, Domenico Ferrari tr-94-061 November 1994 For real-time communication services to achieve widespread usage, it is important that the network's management be allowed to control the services effectively. An important management capability concerns resource partitioning, i.e., distributing the different resources available at any given server (network node or link) among a number of partitions, where the admission control and establishment computations for a given connection need to consider only the connections in the same partition, and are completely independent of the connections accepted in other partitions. Resource partitioning is useful for a number of applications, including the creation of virtual private subnetworks, and of mechanisms for advance reservation of real-time network services, fast establishment of real-time connections, and mobile computing with real-time communication. In previous work, we presented a scheme for resource partitioning in a guaranteed performance networking environment with EDD-based packet scheduling disciplines. We now present the results of our continuing research, giving admission control tests for resource partitioning for two additional scheduling disciplines, FIFO and RCSP, as well.
We also simulate our resource partitioning scheme in a multi-party application scenario. Our simulations confirm that resource fragmentation losses due to resource partitioning are small, and that resource partitioning reduces the admission control computation overhead. A somewhat surprising result from the simulation experiments is that, under circumstances that arise naturally in multi-party communication scenarios, resource partitioning results in a higher overall connection acceptance rate. ----- File: 1994/tr-94-062 Sather 1.0 Tutorial Michael Philippsen tr-94-062 December 1994 This document provides basic information on how to obtain your copy of the Sather 1.0 system and gives several pointers to articles discussing Sather 1.0 in more detail. We thoroughly describe the implementation of a basic chess program. By carefully reading this document and the discussed example program, you will learn enough about Sather 1.0 to start programming in Sather 1.0 yourself. This document is intended for programmers familiar with object oriented languages such as Eiffel or C++. The main features of Sather 1.0 are explained in detail: we cover the difference between subtyping and implementation inheritance and explain the implementation and usage of iters. Moreover, the example program introduces all the class elements (constants, shared and object attributes, routines, and iters). Most statements and most expressions are also discussed. Where appropriate, the usage of some basic features which are provided by the Sather 1.0 libraries is demonstrated. The Tutorial is completed by showing how an external class can be used to interface to a C program. ----- File: 1994/tr-94-063 Approximating Minimum Cuts under Insertion Monika Rauch Henzinger tr-94-063 November 1994 This paper presents insertions-only algorithms for maintaining the exact and approximate size of the minimum edge and vertex cut of a graph. The algorithms are optimal in the sense that they match the performance of the best static algorithm for the problem. We first give an incremental algorithm that maintains a $(2+\epsilon)$-approximation of the size of the minimum edge cut in amortized time $O(1/\epsilon^2)$ per insertion and $O(1)$ per query. Next we show how to maintain the exact size $\lambda$ of the minimum edge cut in amortized time $O(\lambda \log n)$ per operation. Combining these algorithms with random sampling finally gives a randomized Monte-Carlo algorithm that maintains a $(1+\epsilon)$-approximation of the minimum edge cut in amortized time $O((\log \lambda) ((\log n)/\epsilon)^2)$ per insertion.

Finally we present the first 2-approximation algorithm for the size $\kappa$ of the minimum vertex cut in a graph. It takes time $O(n^2 \min (\sqrt n, \kappa))$. This is an improvement of a factor of $\kappa$ over the time for the best algorithm for computing the exact size of the minimum vertex cut, which takes time $O(\kappa^2 n^2 + \kappa^3 n^{1.5})$. We also give the first algorithm for maintaining a $(2+\epsilon)$-approximation of the minimum vertex cut under insertions. Its amortized insertion time is $O(n /\epsilon)$. The algorithms output the approximate or exact size $k$ in constant time and a cut of size $k$ in time linear in its size.

Keywords: dynamic graph algorithms, data structures, analysis and design of algorithms. ----- File: 1994/tr-94-064 Remap: Recursive Estimation and Maximization of a Posteriori Probabilities Herve Bourlard, Yochai Konig and Nelson Morgan tr-94-064 November 1994 In this report, we describe the theoretical formulation of REMAP, an approach for the training and estimation of posterior probabilities using a recursive algorithm that is reminiscent of the EM (Expectation Maximization) algorithm for the estimation of data likelihoods. Although very general, the method is developed in the context of a statistical model for transition-based speech recognition using Artificial Neural Networks (ANN) to generate probabilities for hidden Markov models (HMMs). In the new approach, we use local conditional posterior probabilities of transitions to estimate global posterior probabilities of word sequences given acoustic speech data. Although we still use ANNs to estimate posterior probabilities, the network is trained with targets that are themselves estimates of local posterior probabilities. These targets are iteratively re-estimated by the REMAP equivalent of the forward and backward recursions of the Baum-Welch algorithm to guarantee regular increase (up to a local maximum) of the global posterior probability. Convergence of the whole scheme is proven.

Unlike most previous hybrid HMM/ANN systems that we and others have developed, the new formulation determines the most probable word sequence, rather than the utterance corresponding to the most probable state sequence. Also, in addition to using all possible state sequences, the proposed training algorithm uses posterior probabilities at both local and global levels and is discriminant in nature. ----- File: 1994/tr-94-065 Complexity Issues for Solving Triangular Linear Systems in Parallel Eunice E. Santos tr-94-065 December 1994 We consider the problem of solving triangular linear systems on parallel distributed-memory machines. Working with the LogP model, we present tight asymptotic bounds for solving these systems using forward/backward substitution. Specifically, in this paper we present lower bounds on execution time independent of the data layout, lower bounds for data layouts in which the number of data items per processor is bounded, and lower bounds for specific data layouts commonly used in designing parallel algorithms for this problem. Furthermore, algorithms are provided which have running times within a constant factor of the lower bounds described. Finally, we present a generalization of the lower bounds to banded triangular linear systems. ----- File: 1994/tr-94-066 Side Effect Free Functions in Object-Oriented Languages Noemi Rodriguez and Roberto Jerusalimschy tr-94-066 December 1994 Mathematical functions have always been considered an important abstraction to be incorporated in programming languages. However, in most imperative languages this abstraction is not really supported, since any kind of side effect is allowed in a function, with at most a warning in the manual that such effects are not good programming practice.

Several levels of control over side effects may be identified, ranging from this total lack of control up to functions that use only the functional subset of the language. In this paper we study the class of functions (called {\em side effect free}, or \sef) which may not change old values in memory, but may create new values. A method is described for statically ensuring that a function is \sef in the programming language School, an imperative object oriented language whose main design goal is to achieve good flexibility with a secure static type system. The proposed algorithm runs entirely at compile time, integrated with the type checking. It ensures that any function accepted as \sef\ cannot modify pre-existent objects, that is, objects created prior to the function activation.

A formal memory model for the execution of School is presented in the paper; this allows a precise definition of \sef methods to be given.

The method for checking that a function is \sef relies on the concept of \old objects. An object is \old, from the point of view of a function invocation, if it was created before this invocation. Such objects are seen through a special filter, called the \old transformation, during checking of a \sef function, disallowing the invocation of methods which may cause any modification to them. One important point is that, since types in School are used solely at compile time, the use of this filter does not imply any runtime conversions. After applying the \old transformation, checking of \sef functions reduces to the normal type checking in School, with only one extra rule: assignments to instance variables are forbidden. This means that the introduction of side effect free methods does not imply much extra implementation effort or complexity of understanding.
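To make the \old filter concrete, here is a loose Python analogue of our own devising (School enforces the discipline statically through its type system; this run-time proxy merely mimics the effect): inside a \sef function, a pre-existing object is seen through a wrapper that permits reads but rejects mutation, while the function remains free to create and return new objects.

```python
class Old:
    """Read-only view of a pre-existing object (a run-time mimic of the
    old-transformation; School does this purely at compile time)."""
    def __init__(self, obj):
        object.__setattr__(self, "_obj", obj)
    def __getattr__(self, name):
        value = getattr(self._obj, name)
        return Old(value) if hasattr(value, "__dict__") else value
    def __setattr__(self, name, value):
        raise TypeError("a sef function may not modify a pre-existing object")

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

def mirrored(p):        # sef-style: reads old objects, builds a new one
    p = Old(p)          # the argument existed before this invocation
    return Point(-p.x, p.y)

q = mirrored(Point(3, 4))   # fine: a new Point is created
# an assignment such as p.x = 0 inside mirrored would raise TypeError
```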

Guaranteeing ``side effect free-ness'' is in general associated with a lack of flexibility. We believe that the proposed method achieves good results in this direction, since \sef functions can do most tasks which are in fact side-effect free. This is discussed in the paper with the use of some examples. ----- File: 1994/tr-94-067 Fundamental Limits and Tradeoffs of Providing Deterministic Guarantees to VBR Video Traffic E. Knightly, D. Wrege, J. Liebeherr, and H. Zhang tr-94-067 December 1994 Compressed digital video is one of the most important traffic types in future integrated services networks. However, a network service that supports delay-sensitive video imposes many problems since compressed video sources are variable bit rate (VBR) with a high degree of burstiness. In this paper, we consider a network service that can provide deterministic guarantees on the minimum throughput and the maximum delay of VBR video traffic. A common belief is that due to the burstiness of VBR traffic, such a service will not be efficient and will necessarily result in low network utilization. We investigate the fundamental limits and tradeoffs in providing deterministic performance guarantees to video and use a set of 10 to 90 minute long MPEG-compressed video traces for evaluation. Contrary to conventional wisdom, we are able to show that a deterministic service can be provided to video traffic even while maintaining a high level of network utilization. We first consider an ideal network environment that employs the most accurate video traffic characterizations, Earliest-Deadline-First packet schedulers, and exact admission control conditions. The utilization achievable in this situation provides the fundamental limits of a deterministic service. We then investigate the utilization limits in a network environment that takes into account practical constraints, such as the need for fast policing mechanisms, simple packet scheduling algorithms, and efficient admission control tests. Even when considering these practical tradeoffs, we demonstrate that a considerably high network utilization is achievable by a deterministic service. ----- File: 1994/tr-94-068 LOG-Space Polynomial End-to-End Communication Eyal Kushilevitz and Rafail Ostrovsky and Adi Rosen tr-94-068 December 1994 Communication between processors is the essence of distributed computing: clearly, without communication distributed computation is impossible. However, as networks become larger and larger, the frequency of link failures increases. End-to-End Communication is a classical problem that asks how to carry out fault-free communication between two processors over a network, in spite of such {\em frequent} communication faults. The sole minimum assumption is that the two processors that are trying to communicate are not permanently disconnected (i.e., communication should proceed even if at no point in time does there exist a complete operational path between the two processors). For the first time, we present a protocol which solves this fundamental problem with logarithmic-space and polynomial-communication at the same time. This is an {\em exponential memory improvement} to {\em all} previous polynomial-communication solutions. That is, all previous polynomial-communication solutions needed at least {\em linear} (in $n$, the size of the network) amount of memory per edge.
Our algorithm maintains a simple-to-compute $O(\log n)$-bit potential function at each edge in order to perform routing, and uses a novel technique of packet canceling which allows us to keep only {\em one} packet per edge. We stress that both the computation of our potential function and our packet-canceling policy are totally local in nature; we believe that they are applicable to other settings as well. ----- File: 1994/tr-94-070 Automatic Alignment of Array Data and Processes To Reduce Communication Time on DMPPs Michael Philippsen tr-94-070 December 1994 This paper investigates the problem of aligning array data and processes in a distributed-memory implementation. We present complete algorithms for compile-time analysis, the necessary program restructuring, and subsequent code-generation, and discuss their complexity. We finally evaluate the practical usefulness by quantitative experiments. The technique presented analyzes complete programs, including branches, loops, and nested parallelism. Alignment is determined with respect to offset, stride, and general axis relations. Both placement of data and processes are computed in a unifying framework based on an extended preference graph and its analysis. Furthermore, dynamic redistribution and replication are considered in the same technique. The experimental results are very encouraging. The optimization algorithms implemented in the Modula-2* compiler, developed at the University of Karlsruhe, improved the execution times of the programs by over 40% on average on a MasPar MP-1 with 16384 processors. Updated March 1995 ----- File: 1994/tr-94-071 Improved Randomized On-Line Algorithms for the List Update Problem Susanne Albers tr-94-071 December 1994 The best randomized on-line algorithms known so far for the list update problem achieve a competitiveness of $\sqrt{3} \approx 1.73$. In this paper we present a new family of randomized on-line algorithms that beat this competitive ratio. Our improved algorithms are called TIMESTAMP algorithms and achieve a competitiveness of $\max\{2-p, 1+p(2-p)\}$, for any real number $p\in[0,1]$. Setting $p = (3-\sqrt{5})/2$, we obtain a $\phi$-competitive algorithm, where $\phi = (1+\sqrt{5})/2\approx 1.62$ is the Golden Ratio. TIMESTAMP algorithms coordinate the movements of items using some information on past requests. We can reduce the required information at the expense of increasing the competitive ratio. We present a very simple version of the TIMESTAMP algorithms that is $1.68$-competitive. The family of TIMESTAMP algorithms also includes a new deterministic 2-competitive on-line algorithm that is different from the MOVE-TO-FRONT rule. ----- File: 1995/tr-95-001 Polynomial Bounds for VC Dimension of Sigmoidal Neural Networks Marek Karpinski, Angus Macintyre tr-95-001 January 1995 We introduce a new method for proving explicit upper bounds on the VC Dimension of general functional basis networks, and prove as an application, for the first time, the VC Dimension of analog neural networks with the sigmoid activation function $\sigma(y)=1/(1+e^{-y})$ to be bounded by a quadratic polynomial in the number of programmable parameters. ----- File: 1995/tr-95-002 A Tower Architecture for Meta-Level Inference Systems Based on Omega-Ordered Horn Theories Pierre E. Bonzon tr-95-002 January 1995 We present a simple meta-level inference system based on a non-ground representation of both base and meta-knowledge given under the form of omega-ordered Horn theories.
Processing is done via an extension of the traditional "vanilla" interpreter for logic programs, whose novel lifting mechanism allows one to hop up and down the hierarchy of theories. The resulting computational system closely resembles the tower architecture defined for functional programming. While lifting does prevent infinite recursion, successful termination depends on the actual ordering of theories. In the end, this amounts to facing yet another, meta-meta-level search problem.
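For orientation, the propositional core of such a "vanilla" interpreter is tiny; the sketch below is ours and elides exactly what makes the system interesting (first-order terms with unification, the omega-ordering of theories, and the lifting mechanism for hopping between meta-levels):

```python
def solve(goals, theory):
    """Prove a tuple of atoms from propositional Horn clauses (head, body)."""
    if not goals:
        return True
    first, rest = goals[0], goals[1:]
    return any(head == first and solve(body + rest, theory)
               for head, body in theory)

theory = [
    ("mortal", ("human",)),   # mortal :- human.
    ("human", ()),            # human.
]
assert solve(("mortal",), theory)
```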

The expressive power of this system is illustrated with the solutions to various problems from the current literature, including the 3 wise men problem. It seems reasonable to hypothesize that most (if not all) specialized reasoning performed under the label of "proofs in context" can be formulated within this system. ----- File: 1995/tr-95-003 Understanding Radio Broadcasts On Soccer: The Concept `Mental Image' and Its Use in Spatial Reasoning Jörg R. J. Schirra tr-95-003 January 1995 Most cognitive theories agree that a listener of a sports broadcast on radio usually imagines the scene described; the concept `mental image' appears in a specific sort of explanation. In contrast to this conception, it is argued that this concept should rather be understood as part of a certain kind of grounding (or justifying) explanations of the radio listener's understanding. This particular conception is based on the distinction between `specification' and `implementation' as found in the theory of abstract data types. Its application to the field of spatial concepts leads to a computational system (ANTLIMA) which exemplifies how the expression `mental image' could be used while explaining a speaker's ability to control the resolvability of ambiguities in an objective description of what the speaker sees. ----- File: 1995/tr-95-004 Efficiency Comparison of Real-Time Transport Protocols Pasquale di Genova and Giorgio Ventre tr-95-004 March 1995 In this paper we consider the problem of providing efficient network support to distributed real-time applications with different communication requirements. In the case of resource reservation protocols, the level of efficiency of a transport service connection provided by a communication system is influenced by the applications' requirements, in terms of the amount of network resources needed to provide guaranteed Quality of Service. We consider the Tenet protocol suite, a connection-oriented internetworking set of protocols based upon resource reservation. The suite provides a real-time network service (i.e., a service with guaranteed performance) to two types of applications: continuous media (CM) clients that generate data at regular time intervals (e.g., video and audio); message oriented clients that generate data at arbitrary times (e.g., urgent messages and remote control applications). We compare the performance of the transport protocol for CM clients (CMTP) to that of the transport protocol for message oriented clients (RMTP). In particular, we consider the buffer usage in the underlying real-time internetwork protocol (RTIP). The results of the simulations show that in the CMTP case, by taking advantage of the regular nature of CM clients, proper mechanisms can be adopted to further smooth traffic, so that buffers are used much more efficiently than in the RMTP case. ----- File: 1995/tr-95-005 Emulation of Traffic Congestion on ATM Gigabit Networks Jordi Domingo-Pascual, Andres Albanese, Wieland Holfelder tr-95-005 March 1995 The deployment of gigabit networks and broadband services has started to support multimedia applications; however, these gigabit networks are rarely saturated since only a few applications are able to stress the network. We consider a future scenario where the use of multimedia applications, such as audio and video teleconferencing in a multi-user environment, is expected to grow rapidly.
Therefore, both customers and network providers need to foresee the performance and behavior of the network and the applications in this scenario. From the customer's point of view, it is important to develop procedures to perform traffic measurements and to be able to test the local ATM equipment. In this paper we propose a method to introduce heavy load into an ATM switch and at the User Network Interface (UNI), in order to study the performance of, and forecast, evolved scenarios. In the experiments we use local equipment (ATM switch and workstations), local network management applications and diagnostics software. The emulated load is generated in a workstation, introduced into the ATM switch and intensified by replicating and re-circulating the cells. The method presented is an easy and affordable approach to performance testing and an alternative to traffic modeling. Several experiments have been performed and the measurements obtained are presented. ----- File: 1995/tr-95-006 A Fast Parallel Cholesky Decomposition Algorithm for Tridiagonal Symmetric Matrices Ilan Bar-On, Bruno Codenotti, and Mauro Leoncini tr-95-006 February 1995 We present a new fast and stable parallel algorithm for computing the Cholesky decomposition of real symmetric and positive definite tridiagonal matrices. This new algorithm is especially suited for the solution of linear systems and for computing a few eigenvalues of very large matrices. We demonstrate these results on the Connection Machine CM5, where we obtain a very satisfactory performance. We finally note that the algorithm can be generalized to block tridiagonal and band systems. ----- File: 1995/tr-95-007 Characterization of Video Traffic Rahul Garg tr-95-007 January 1995 ATM networks will carry a wide variety of data over the same packet switching network. A majority of this traffic is expected to be real-time video generated by video on demand, video conferencing systems, etc. We study the characteristics of video data compressed using standard coding algorithms, namely JPEG and MPEG, as well as popular ones such as the video conferencing software NV. A wide range of video sources from movies to a class lecture were analyzed. Most of the traces were longer than an hour. The bit rate of the traces has been characterized using the leaky bucket model. We also show a method of choosing appropriate leaky bucket parameters. A burstiness function is used to characterize the burstiness of the video traffic at different time scales.
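A minimal sketch of the leaky-bucket machinery involved (function and parameter names are ours): testing a frame-size trace for conformance with parameters (sigma, rho), and a linear scan that yields the smallest bucket depth sigma for a given drain rate rho, which is one natural way to choose parameters from a trace as the abstract describes.

```python
def conforms(frame_sizes, sigma, rho):
    """True iff the trace obeys the leaky bucket (sigma, rho): over every
    window of w slots, at most sigma + rho * w bits may be sent."""
    tokens = sigma                       # bucket starts full
    for bits in frame_sizes:
        tokens += rho                    # tokens accrued during this slot
        if bits > tokens:
            return False
        tokens = min(sigma, tokens - bits)
    return True

def min_sigma(frame_sizes, rho):
    """Smallest bucket depth for drain rate rho: the maximum backlog of a
    virtual queue emptied at rho bits per frame slot."""
    need = level = 0.0
    for bits in frame_sizes:
        level = max(0.0, level + bits - rho)
        need = max(need, level)
    return need
```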

Keywords: ATM, Burstiness, Burstiness Function, Characterization, JPEG, Leaky Bucket, MPEG, Networks, NV, Packet Video, Traffic Characterization. ----- File: 1995/tr-95-008 Distributed advance reservation of real-time connections Domenico Ferrari, Amit Gupta, Giorgio Ventre tr-95-008 March 1995 The ability to reserve real-time connections in advance is essential in all distributed multi-party applications (i.e., applications involving multiple human beings) using a network that controls admissions to provide good quality of service. This paper discusses the requirements of the clients of an advance reservation service, and a distributed design for such a service. The design is described within the context of the Tenet Real-Time Protocol Suite 2, a suite being developed for multi-party communication, which will offer advance reservation capabilities to its clients based on the principles and the mechanisms proposed in the paper. Simulation results providing useful data about the performance and some of the properties of these mechanisms are also presented. We conclude that the one described here is a viable approach to constructing an advance reservation service within the context of the Tenet Suites as well as that of other solutions to the multi-party real-time communication problem. ----- File: 1995/tr-95-009 Adaptive Parameter Pruning in Neural Networks Lutz Prechelt tr-95-009 March 1995 Neural network pruning methods on the level of individual network parameters (e.g. connection weights) can improve generalization. An open problem in the pruning methods known today (OBD, OBS, autoprune, epsiprune) is the selection of the number of parameters to be removed in each pruning step (pruning strength). This paper presents a pruning method ``lprune'' that automatically adapts the pruning strength to the evolution of weights and loss of generalization during training. The method requires no algorithm parameter adjustment by the user. The results of extensive experimentation indicate that lprune is often superior to autoprune (which is superior to OBD) on diagnosis tasks unless severe pruning early in the training process is required. Results of statistical significance tests comparing autoprune to the new method lprune as well as to backpropagation with early stopping are given for 14 different problems. ----- File: 1995/tr-95-010 1.757 and 1.267-Approximation Algorithms for the Network and Rectilinear Steiner Tree Problems Marek Karpinski, Alexander Zelikovsky tr-95-010 March 1995 The Steiner tree problem requires finding a shortest tree connecting a given set of terminal points in a metric space. We suggest a fast heuristic with better approximation bounds for the Steiner problem in graphs and in the rectilinear plane. This heuristic finds a Steiner tree at most 1.757 and 1.267 times longer than the optimal solution in graphs and in the rectilinear plane, respectively. ----- File: 1995/tr-95-011 Polynomial Time Approximation Schemes for Dense Instances of $\NP$-Hard Problems Sanjeev Arora, David Karger, Marek Karpinski tr-95-011 March 1995 We present a unified framework for designing polynomial time approximation schemes (PTASs) for ``dense'' instances of many $\NP$-hard optimization problems, including maximum cut, graph bisection, graph separation, minimum $k$-way cut with and without specified sources, and maximum 3-satisfiability. Dense graphs for us are graphs with minimum degree $\Theta(n)$, although some of our algorithms work so long as the graph is dense ``on average''.
(Denseness for non-graph problems is defined similarly.) The unified framework begins with the idea of {\em exhaustive sampling:} picking a small random set of vertices, guessing where they go on the optimum solution, and then using their placement to determine the placement of everything else. The approach then develops into a PTAS for approximating certain {\em smooth\/} integer programs where the objective function is a ``dense'' polynomial of constant degree. ----- File: 1995/tr-95-012 Differential Evolution - a simple and efficient adaptive scheme for global optimization over continuous spaces Rainer Storn and Kenneth Price tr-95-012 March 1995 A new heuristic approach for minimizing possibly nonlinear and non-differentiable continuous space functions is presented. By means of an extensive testbed, which includes the De Jong functions, it will be demonstrated that the new method converges faster and with more certainty than Adaptive Simulated Annealing as well as the Annealed Nelder&Mead approach, both of which have a reputation for being very powerful. The new method requires few control variables, is robust, easy to use and lends itself very well to parallel computation. (A schematic sketch of the DE iteration appears below, after the next entry.) ----- File: 1995/tr-95-013 Communication Performance Models Stefan Böcking tr-95-013 March 1995 Communication performance models enable distributed real-time and multimedia applications to describe their performance requirements with regard to the throughput, delay and loss behavior of a particular communication service. The purpose of this paper is to give a basic understanding of communication performance models by presenting four different models: two models designed by the Tenet Group, the model with which the ATM Forum characterizes ATM channel traffic, and the RFC 1363 Flow Specification of the Internet community. Besides their presentation in a unified terminology, their usability is shown by a video-on-demand example.
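Here is the sketch promised under tr-95-012 above: one common form of the DE iteration, in Python (parameter values are illustrative and mutants are not clipped back into the search box). Each trial vector adds a weighted difference of two population members to a third, crosses the result over with the current member, and survives only if it does not score worse.

```python
import numpy as np

def de_minimize(f, dim, lo=-5.0, hi=5.0, pop_size=20, F=0.5, CR=0.9,
                gens=300, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    cost = np.array([f(x) for x in pop])
    for _ in range(gens):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, 3, replace=False)]
            mutant = a + F * (b - c)              # differential mutation
            cross = rng.random(dim) < CR          # binomial crossover mask
            cross[rng.integers(dim)] = True       # keep >= 1 mutant coordinate
            trial = np.where(cross, mutant, pop[i])
            t_cost = f(trial)
            if t_cost <= cost[i]:                 # greedy selection
                pop[i], cost[i] = trial, t_cost
    return pop[np.argmin(cost)], cost.min()

best, val = de_minimize(lambda x: float(np.sum(x * x)), dim=5)  # sphere test
```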

Keywords: multimedia communication, real-time communication, quality-of-service (QoS) ----- File: 1995/tr-95-014 On the Problem of Masking Special Errors by Signature Analyzers Lutz Voelkel tr-95-014 April 1995 Signature analysis is an important compact method in digital testing. In this method, the test response sequence of a device under test is compressed by a linear feedback shift register (LFSR, for short). Masking occurs if a faulty device yields the same signature as the corresponding good device. Due to the linearity of any LFSR, this happens if and only if the "error sequence", which is obtained by the "exor" operation from the correct and the incorrect sequences, leads to the zero signature. The masking properties of signature analyzers depend largely on their structure, which can be expressed algebraically by properties of their "characteristic polynomials".

There are three main directions of research in masking properties of signature analyzers:

Following the third direction, we present in the lecture a survey of the masking properties of signature analyzers with respect to error sequences of any odd weight. There are some results but also many open problems in this field. Computer simulations have given us some further insights into these problems. ----- File: 1995/tr-95-015 Physical Mapping of Chromosomes Using Unique Probes Farid Alizadeh, Richard M. Karp, Deborah K. Weisser, and Geoffrey Zweig tr-95-015 April 1995 The goal of physical mapping of the genome is to reconstruct a strand of DNA given a collection of overlapping fragments, or clones, from the strand. We present several algorithms to infer how the clones overlap, given data about each clone. We focus on data used to map human chromosomes 21 and Y, in which relatively short substrings, or probes, are extracted from the ends of clones. The substrings are long enough to be unique with high probability. The data we are given is an incidence matrix of clones and probes.

In the absence of error, the correct placement can be found easily using a PQ-tree. The data is never free from error, however, and algorithms are differentiated by their performance in the presence of errors. We approach errors from two angles: by detecting and removing them, and by using algorithms which are robust in the presence of errors.

We have also developed a strategy to recover noiseless data through an interactive process which detects anomalies in the data and retests questionable entries in the incidence matrix of clones and probes.

We evaluate the effectiveness of our algorithms empirically, using simulated data as well as real data from human chromosome 21. ----- File: 1995/tr-95-016 A Combined BIT and TIMESTAMP Algorithm for the List Update Problem Susanne Albers, Bernhard von Stengel, Ralph Werchner tr-95-016 April 1995 A simple randomized on-line algorithm for the list update problem is presented that achieves a competitive factor of 1.6, the best known so far. The algorithm makes an initial random choice between two known algorithms that have different worst-case request sequences. The first is the BIT algorithm that, for each item in the list, alternates between moving it to the front of the list and leaving it at its place after it has been requested. The second is a TIMESTAMP algorithm that moves an item in front of less often requested items within the list.
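A minimal sketch of the BIT half of the combination described in tr-95-016 above (the TIMESTAMP half and the initial random choice between the two algorithms are omitted; class and method names are ours):

```python
import random

class BitList:
    """BIT list-update rule: each item carries a random bit that is flipped
    on every request; the item moves to the front exactly when its bit comes
    up 1, so it alternates between moving and staying put."""
    def __init__(self, items):
        self.items = list(items)
        self.bit = {x: random.randint(0, 1) for x in self.items}
    def access(self, x):
        cost = self.items.index(x) + 1   # standard list-update access cost
        self.bit[x] ^= 1
        if self.bit[x]:
            self.items.remove(x)
            self.items.insert(0, x)      # move to front on alternate requests
        return cost
```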

Keywords: On-line algorithms, analysis of algorithms, competitive analysis, linear lists, list-update ----- File: 1995/tr-95-017 Comparing Algorithms for Dynamic Speed-Setting of a Low-Power CPU Kinshuk Govil, Edwin Chan, & Hal Wasserman tr-95-017 April 1995 To take advantage of the full potential of ubiquitous computing devices, we will need systems which minimize power consumption. Weiser et al. and others have suggested that this may be accomplished in part by a CPU which dynamically changes speed and voltage, thereby saving energy by spreading run cycles into idle time. Here we continue this research, using a simulation to compare a number of policies for dynamic speed-setting. Our work clarifies a fundamental power vs. delay tradeoff, as well as the role of prediction and of speed-smoothing in dynamic speed-setting policies. We conclude that success seems to depend more on simple smoothing algorithms than on sophisticated prediction techniques, but defer to the eventual replication of these results on actual multiple-speed systems.
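To give the flavor of the "simple smoothing" such results favor, here is a generic exponentially weighted policy of our own (the policies actually compared in tr-95-017 differ in their details): the next interval runs at the slowest available speed that covers a smoothed estimate of recent utilization.

```python
def next_speed(avg_util, last_util, alpha=0.5, speeds=(0.25, 0.5, 0.75, 1.0)):
    """Return (updated average, speed for the next scheduling interval)."""
    avg_util = alpha * last_util + (1 - alpha) * avg_util  # smooth demand
    for s in speeds:                       # slowest speed that keeps up
        if s >= avg_util:
            return avg_util, s
    return avg_util, speeds[-1]            # saturated: run flat out
```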

Keywords: ubiquitous, portable, power usage, variable-speed CPU. ----- File: 1995/tr-95-018 Modeling and Optimization of PET-Redundancy Assignment for MPEG Sequences Rainer Storn tr-95-018 May 1995 Priority Encoding Transmission (PET) is an encoding scheme which provides multiple levels of redundancy in order to protect the different contents of a data set according to their importance. The task of optimally assigning redundancies for the PET encoding scheme is investigated for the special case of MPEG-1 encoded video sequences. The prerequisites for this optimization problem and the approach taken to its solution are outlined, and several suggestions for further improvements are given. ----- File: 1995/tr-95-019 Modeling a Copier Paper Path: A Case Study in Modeling Transportation Processes Vineet Gupta and Peter Struss tr-95-019 May 1995 We present a compositional model of paper transportation in a photocopier that is meant to support different problem solving tasks like simulation and diagnosis, and to be applicable to a variety of configurations. Therefore, we try to avoid making hard-wired implicit assumptions about design principles and possible scenarios. In order to simplify our analysis, the model abstracts away from the physical forces and reasons only about velocities. Nonetheless, it succeeds in determining essential features of the motion of the sheet of paper like buckling and tearing. The framework provided is quite generic and can be used as a starting point for developing models of other transportation domains. ----- File: 1995/tr-95-020 Average Case Analysis of Dynamic Graph Algorithms David Alberts, Monika Rauch Henzinger tr-95-020 May 1995 We present a model for edge updates with restricted randomness in dynamic graph algorithms and a general technique for analyzing the expected running time of an update operation. This model is able to capture the average case in many applications, since (1) it allows restrictions on the set of edges which can be used for insertions and (2) the type (insertion or deletion) of each update operation is arbitrary, i.e., {\em not} random. We use our technique to analyze existing and new dynamic algorithms for the following problems: maximum cardinality matching, minimum spanning forest, connectivity, 2-edge connectivity, $k$-edge connectivity, $k$-vertex connectivity, and bipartiteness. Given a random graph $G$ with $m_0$ edges and $n$ vertices and a sequence of $l$ update operations such that the graph contains $m_i$ edges after operation $i$, the expected time for performing the updates for any $l$ is $O(l \log n + n \sum_{i=1}^{l} 1/\sqrt{m_i})$ in the case of minimum spanning forests, connectivity, 2-edge connectivity, and bipartiteness. The expected time per update operation is $O(n)$ in the case of maximum matching. We also give improved bounds for $k$-edge and $k$-vertex connectivity. Additionally we give an insertions-only algorithm for maximum cardinality matching with worst-case $O(n)$ amortized time per insertion. ----- File: 1995/tr-95-021 Exploiting Process Lifetime Distributions for Dynamic Load Balancing Mor Harchol-Balter and Allen B. Downey tr-95-021 May 1995 We propose a preemptive migration scheme that assumes no prior knowledge about the behavior of processes, and show that it significantly outperforms more traditional non-preemptive migration schemes. Our scheme migrates a process only if the process' expected remaining lifetime justifies the cost of migration.
To quantify this heuristic, we perform empirical studies on the distribution of process lifetimes and the distribution of memory use (which dominates migration cost) for a variety of workloads. We use these results to derive a robust criterion for selecting processes for migration. Using a trace-driven simulation based on actual job arrival times and lifetimes, we show that under our preemptive policy the mean slowdown of all processes is 40% less than under an optimistic non-preemptive migration scheme that uses name lists. Furthermore, the preemptive policy reduces the number of severely delayed processes by a factor of ten, compared with the non-preemptive scheme. (A schematic sketch of the migration criterion appears after the next entry.) ----- File: 1995/tr-95-022 Scaling Issues in the Design and Implementation of the Tenet RCAP2 Signaling Protocol Wendy Heffner tr-95-022 May 1995 Scalability is a critical metric when evaluating the design of any distributed system. In this paper we examine Suite 2 of the Tenet Network Protocols, which supports real-time guarantees for multi-party communication over packet switched networks. In particular, we evaluate the scalability of both the system design and the prototype implementation of the signaling protocol, RCAP2. The scalability of the design is analyzed on several levels: with regard to its support for large internetworks, for many multi-party connections, and for a large number of receivers in a single connection. In addition, the prototype implementation is examined to see where decisions have been made that reduce the scalability of the initial system design. We propose implementation alternatives that are more scalable. Finally, we evaluate the scalability of the system design in comparison to those of the ST-II signaling protocol (SCMP) and of RSVP.
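Here is the sketch of tr-95-021's migration criterion promised above (schematic; the constant and names are ours): for the heavy-tailed lifetime distributions the paper measures, a process's expected remaining lifetime grows in proportion to its current age, so migration pays off once the age exceeds a multiple of the migration cost.

```python
def should_migrate(age_sec, migration_cost_sec, k=1.0):
    """Preemptive-migration test: under a roughly Pareto lifetime
    distribution, expected remaining lifetime scales with current age,
    so a process that has already run long enough can amortize the
    cost of being moved."""
    return age_sec >= k * migration_cost_sec
```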

Keywords: scaling, multicast connection, multimedia networking, real-time communication, Tenet protocols ----- File: 1995/tr-95-023 Properties of Stochastic Perceptual Auditory-event-based Models for Automatic Speech Recognition Su-Lin Wu tr-95-023 May 1995 Recently, physiological and psychoacoustic studies have uncovered new evidence supporting the idea that human auditory processes focus on the transitions between spoken sounds rather than on the steady-state portions of spoken sounds for speech recognition. Stochastic Perceptual Auditory-event-based Models (SPAMs) were developed by Morgan, Bourlard, Hermansky and Greenberg to take this new evidence into account for word models in speech recognition by machines. This paper details our efforts to build a speech recognition system based on some of the properties of SPAMs. Although not all aspects of the complete SPAM theory have been implemented, we did find that fairly good recognition is possible with a system that concentrates almost exclusively on the transitions between speech sounds. Additionally, we found that such a system enhanced the more conventional phoneme-based system, which emphasized recognition of steady-state sounds. This blended system performed better than either system alone, especially in the case of noise-obscured speech. ----- File: 1995/tr-95-024 Applying Large Vocabulary Hybrid HMM-MLP Methods to Telephone Recognition of Digits and Natural Numbers Kristine W. Ma tr-95-024 May 1995 The hybrid Hidden Markov Model (HMM) / Neural Network (NN) speech recognition system at the International Computer Science Institute (ICSI) uses a single hidden layer MLP (Multi Layer Perceptron) to compute the emission probabilities of the states of the HMM. This recognition approach was developed and has traditionally been used for large vocabulary size continuous speech recognition. In this report, however, such a recognition scheme is applied directly to three much smaller vocabulary size corpora, the Bellcore isolated digits, the TI connected digits, and the Center for Spoken Language Understanding Numbers'93 database. The work reported here is not only on developing small baseline systems to facilitate all future research experiments, but also on using these systems to evaluate front-end research issues, and the feasibility of using context-dependency for speech recognition under the hybrid approach developed at ICSI. In addition, using the TI connected digits, the performance of ICSI's baseline system on a small vocabulary size speaker-independent task is compared with those of other speech research institutes. ----- File: 1995/tr-95-025 Fuzzy Inferencing: A Novel, Massively Parallel Approach Andrzej Buller tr-95-025 May 1995 The report presents a model of fuzzy control in which decisions are worked out based on results of a competition between groups of agents which, represented by binary words, navigate in a neural working memory. Each agent is endowed with a strategy of its own behavior and carries its opinion. The opinions are symbolic statements encoding facts and/or rules. A Fuzzy Knowledge Base provides rules, as well as the values of membership of given measures to appropriate facts interpreted as fuzzy sets. At a given moment an indoctrinating device generates a fact or its negation with a probability calculated based on the related membership value. A debate in the Society of Agents results in a victory of the adherents of a particular solution. An ultimate decision is based on a poll.
Hardware facilitating this kind of computation is discussed, along with some simulation results.

Keywords: fuzzy control, neural networks, distributed inferencing. ----- File: 1995/tr-95-026 Differential Evolution Design of an IIR-Filter with Requirements for Magnitude and Group Delay Rainer Storn tr-95-026 June 1995 The task of designing an 18-parameter IIR-filter which has to meet tight specifications for both magnitude response and group delay is investigated. This problem must usually be tackled by specialized design methods and requires an expert in digital signal processing for its solution. The usage of the general purpose minimization method Differential Evolution (DE), however, allows one to perform the filter design with a minimum of knowledge about digital filters. ----- File: 1995/tr-95-027 Operations on Multimodal Records: Towards a Computational Cognitive Linguistics Andrzej Buller tr-95-027 June 1995 The report discusses a cognitive model whose key concept is the Multimodal Record (MMR)--an organized aggregate of transcripts of signals representing all information an Agent has continuously acquired over a certain period of time. The MMR consists of a video track, a sound track, and a number of tracks containing transcripts of the values of temperature, pressure, etc., as well as transcripts of states of the Agent's internal structure. Three basic operations on MMRs, i.e. multimodal difference (m-), multimodal union (m+) and multimodal intersection (m*), to be performed using neural networks, are introduced. Based on these operations one can explain and/or implement a number of psycholinguistic phenomena. The MMR may be considered a computable form of Image Scheme--the basic concept of Lakoff-Langacker Cognitive Grammar. Hence, the proposed model seems to be a bridge over the gap between non-computational Cognitive Linguistics and applied neurocomputing. Moreover, it may be considered a step towards a unified symbolic-connectionist paradigm.

Key words: Cognitive Grammar, Neurocomputing, Language acquisition ----- File: 1995/tr-95-028 Tenet Suite 1 and the Continuous Media Toolkit Peter Staunton tr-95-028 June 1995 The Continuous Media Toolkit is a flexible toolkit which facilitates development of local and distributed continuous media applications. Data transfer across a computer network is provided on a connectionless, best-effort basis using a network protocol called Cyclic-UDP. A second set of network protocols, called Tenet Suite 1, has been designed to provide a simplex, unicast, connection-oriented service to realtime traffic in a packet-switched internetwork, with guaranteed performance in terms of data throughput, end-to-end delay, delay jitter, and loss rate. This report describes an extension to the toolkit which allows an application developer to employ the guaranteed network services of Tenet Suite 1. ----- File: 1995/tr-95-029 Direct Methods for Solving Tridiagonal Linear Systems in Parallel Eunice E. Santos tr-95-029 July 1995 We consider the problem of solving tridiagonal linear systems on parallel distributed-memory machines. We present tight asymptotic bounds for solving these systems on the LogP model using two very common direct methods: odd-even cyclic reduction and prefix summing. Specifically, we present lower bounds on execution time independent of data layout, and lower bounds for specific data layouts commonly used in designing parallel algorithms to solve tridiagonal linear systems. Moreover, algorithms are provided which have running times within a constant factor of the lower bounds provided. ----- File: 1995/tr-95-030 Growing a Hypercubical Output Space in a Self-Organizing Map H.-U. Bauer, Th. Villmann tr-95-030 July 1995 Neural maps project data given in a (possibly high-dimensional) input space onto a neuron position in a (usually low-dimensional) output space grid. An important property of this projection is the preservation of neighborhoods; neighboring neurons in output space respond to neighboring data points in input space. To achieve this preservation in an optimal way during learning, the topology of the output space has to roughly match the effective structure of the data in the input space. We here present a growth algorithm, called the GSOM, which enhances a widespread map self-organization process, Kohonen's Self-Organizing Feature Map (SOFM), by an adaptation of the output space grid during learning. During the procedure the output space structure is restricted to a general hypercubical shape, with the overall dimensionality of the grid and its extensions along the different directions being subject of the adaptation. This constraint distinguishes the present algorithm from other, less or not constrained approaches to the problem of map topology adaptation. Depending on the embedding of neural maps in larger information processing systems, a regular neuronal grid can be essential for a successful operation of the overall system. We apply our GSOM-algorithm to three examples, two of which involve real world data. Using recently developed methods for measuring the degree of neighborhood preservation in neural maps, we find the GSOM-algorithm to produce maps which preserve neighborhoods in a nearly optimal fashion. ----- File: 1995/tr-95-031 Parallel Sorting With Limited Bandwidth Micah Adler, John W. Byers, and Richard M. Karp tr-95-031 July 1995 We study the problem of sorting on a parallel computer with limited communication bandwidth.
By using the recently proposed PRAM($m$) model, where $p$ processors communicate through a small, globally shared memory consisting of $m$ bits, we focus on the trade-off between the amount of local computation and the amount of inter-processor communication required for parallel sorting algorithms. We prove a lower bound of $\Omega(\frac{n \log m}{m})$ on the time to sort $n$ numbers in an exclusive-read variant of the PRAM($m$) model. We show that Leighton's Columnsort can be used to give an asymptotically matching upper bound in the case where $m$ grows as a fractional power of $n$. The bounds are of a surprising form, in that they have little dependence on the parameter $p$. This implies that attempting to distribute the workload across more processors while holding the problem size and the size of the shared memory fixed will not improve the optimal running time of sorting in this model. We also show that both the upper and the lower bound can be adapted to bridging models that address the issue of limited communication bandwidth: the LogP model and the BSP model. The lower bounds provide convincing evidence that efficient parallel algorithms for sorting rely strongly on high communication bandwidth. ----- File: 1995/tr-95-032 Scheduling Parallel Communication: The h-relation Problem Micah Adler, John W. Byers, and Richard M. Karp tr-95-032 July 1995 This paper is concerned with the efficient scheduling and routing of point-to-point messages in a distributed computing system with $n$ processors. We examine the $h$-relation problem, a routing problem where each processor has at most $h$ messages to send and at most $h$ messages to receive. Communication is carried out in rounds. Direct communication is possible from any processor to any other, and in each round a processor can send one message and receive one message. The off-line version of the problem arises when every processor knows the source and destination of every message. In this case the messages can be routed in at most $h$ rounds. More interesting, and more typical, is the on-line version, in which each processor has knowledge only of $h$ and of the destinations of those messages which it must send. The on-line version of the problem is the focus of this paper.

The difficulty of the $h$-relation problem stems from {\em message conflicts}, in which two or more messages are sent to the same processor in a given round, but at most one can be received. The problem has been well studied in the OCPC optical network model, but not for other contemporary network architectures which resolve message conflicts using other techniques. In this paper, we study the $h$-relation problem under alternative models of conflict resolution, most notably a FIFO queue discipline motivated by wormhole routing and an arbitrary write discipline motivated by packet-switching networks. In each model the problem can be solved by a randomized algorithm in an expected number of rounds of the form $ch + o(h) + \log^{\Theta(1)} n$, and we focus on obtaining the smallest possible asymptotic constant factor $c$. We first present a lower bound, proving that a constant factor of 1 is not achievable in general. We then present a randomized algorithm for each discipline and show that they achieve small constant factors. ----- File: 1995/tr-95-033 Smoothing and Multiplexing Tradeoffs for Deterministic Performance Guarantees to VBR Video Edward W. Knightly and Paola Rossaro tr-95-033 July 1995 The burstiness of variable bit rate traffic makes it difficult to both efficiently utilize network resources and provide end-to-end network performance guarantees to the traffic sources. Generally, smoothing or shaping traffic sources at the entrance of the network reduces their burstiness to allow higher utilization within the network. However, this buffering introduces an additional delay so that, in effect, lossless smoothing trades queueing delay inside the network for smoothing delay at the network edge. In this paper, we consider the net effect of smoothing on end-to-end performance guarantees where a no-loss, no-delay-violation deterministic guarantee is provided with the D-BIND traffic model. We analytically quantify these tradeoffs and provide a set of general rules for determining under which conditions smoothing provides a net gain. We also empirically investigate these tradeoffs using traces of MPEG compressed video. ----- File: 1995/tr-95-034 H-BIND: A New Approach to Providing Statistical Performance Guarantees to VBR Traffic Edward W. Knightly tr-95-034 July 1995 Current solutions to providing statistical performance guarantees to bursty traffic such as compressed video encounter several problems: 1) source traffic descriptors are often too simple to capture the burstiness and important time-correlations of VBR sources or too complex to be used for admission control algorithms; 2) stochastic descriptions of a source are inherently difficult for the network to enforce or police; 3) multiplexing inside the network's queues may change the stochastic properties of the source in an intractable way, precluding the provision of end-to-end QoS guarantees to heterogeneous sources with different performance requirements. In this paper, we present a new approach to providing end-to-end statistical performance guarantees that overcomes these limitations. We term the approach Hybrid Bounding Interval Dependent (H-BIND) because it uses the Deterministic-BIND traffic model to capture the correlation structure and burstiness properties of a stream; but unlike a deterministic performance guarantee, it achieves a Statistical Multiplexing Gain (SMG) by exploiting the {\em statistical} properties of deterministically-bounded streams.
Using traces of MPEG-compressed video, we show that the H-BIND scheme can achieve average network utilizations of up to 86% in a realistic scenario. ----- File: 1995/tr-95-035 Pairwise Independence and Derandomization Michael Luby and Avi Wigderson tr-95-035 July 1995 This set of notes gives several applications of the following paradigm. The paradigm consists of two complementary parts. The first part is to design a probabilistic algorithm described by a sequence of random variables so that the analysis is valid assuming limited independence between the random variables. The second part is the design of a small probability space for the random variables such that they are somewhat independent of each other. Thus, the analysis of the algorithm holds even when the random variables used by the algorithm are generated according to the small space. ----- File: 1995/tr-95-036 New Approximation Algorithms for the Steiner Tree Problems Marek Karpinski, Alexander Zelikovsky tr-95-036 August 1995 The Steiner tree problem asks for the shortest tree connecting a given set of terminal points in a metric space. We design new approximation algorithms for the Steiner tree problems using a novel technique of choosing Steiner points depending on the possible deviation from the optimal solutions. We achieve the best approximation ratios known to date: 1.644 in an arbitrary metric and 1.267 in the rectilinear plane, respectively.
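For context on the ratios just quoted: a sketch of the classical MST-based 2-approximation that algorithms such as those of tr-95-036 improve upon (this is the standard baseline, not the authors' method), assuming a small connected graph given as an undirected edge-weight dictionary.

    def steiner_2approx_cost(n, w, terminals):
        """Cost of the MST-on-metric-closure Steiner heuristic (ratio <= 2).
        n: vertex count; w: dict {(u, v): weight} of undirected edges;
        terminals: the points that must be connected. Assumes connectivity."""
        INF = float('inf')
        d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
        for (u, v), c in w.items():
            d[u][v] = d[v][u] = min(d[u][v], c)
        for k in range(n):                       # Floyd-Warshall metric closure
            for i in range(n):
                for j in range(n):
                    if d[i][k] + d[k][j] < d[i][j]:
                        d[i][j] = d[i][k] + d[k][j]
        ts = list(terminals)                     # Prim's MST over terminals only
        in_tree, cost = {ts[0]}, 0
        while len(in_tree) < len(ts):
            c, v = min((d[u], t)[0:2] if False else (d[u][t], t)
                       for u in in_tree for t in ts if t not in in_tree)
            in_tree.add(v)
            cost += c
        return cost

Shortcutting the MST edges back into shortest paths yields a feasible Steiner tree of this cost, which is within a factor of 2 of optimal; the tr-95-036 technique of choosing extra Steiner points is what pushes the ratio down to 1.644.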

Keywords: Approximation Algorithms, Network Steiner Tree Problem, Rectilinear Steiner Tree Problem, Approximation Ratio. ----- File: 1995/tr-95-037 A Cognitive Off-line Model for Motor Interpretation of Handwritten Words Claudio M. Privitera tr-95-037 August 1995 The image of a word or a generic hand-made drawing on a piece of paper is usually characterized by a series of interfering zones where the cursive trace intersects itself or printed lines already present on the writing surface. In these zones, the odometric information is ambiguous and no trivial inference about the original pen-tip movement can be made. In this report, starting from some basic cognitive considerations, a general procedure is developed to analyze a generic image of a word or a common hand-made scribble. This approach makes it possible to detect each ambiguous part of the image and then interpret it so as to finally recover a part of the original temporal information. ----- File: 1995/tr-95-038 Context and Vision Vito Di Gesu' and Francesco Isgro' tr-95-038 August 1995 This report deals with problems of representation and handling of concurrent processes in multi-processor machines or in distributed and co-operating systems oriented to image analysis. For this purpose, the definition and some formal properties of a new synchronization engine, named "context", are given. Contexts are introduced as object variables in pictorial languages to represent distributed computation on spatial data. In particular, details of its implementation on the PIctorial C Language (PICL) are given. Operations are defined on the context space; the existing relations between contexts, formal languages, and graphs are considered, and have been used to optimize the implementation of contexts inside PICL.

Keywords: parallel languages, concurrence, graph theory, image analysis ----- File: 1995/tr-95-039 Average Case Analyses of List Update Algorithms, with Applications to Data Compression Susanne Albers and Michael Mitzenmacher tr-95-039 August 1995 We study the performance of the Timestamp(0) (TS(0)) algorithm for self-organizing sequential search on discrete memoryless sources. We demonstrate that TS(0) is better than Move-to-front on such sources, and determine performance ratios for TS(0) against the optimal offline and static adversaries in this situation. Previous work on such sources compared online algorithms only to static adversaries. One practical motivation for our work is the use of the Move-to-front heuristic in various compression algorithms. Our theoretical results suggest that in many cases using TS(0) in place of Move-to-front in schemes that use the latter should improve compression. Tests using implementations on a standard corpus of test documents demonstrate that TS(0) leads to improved compression. ----- File: 1995/tr-95-040 Enabling Compiler Transformations for pSather 1.1 Michael Philippsen tr-95-040 August 1995 pSather 1.1 is a parallel extension of the object-oriented sequential programming language Sather 1.1. A compiler for sequential Sather is available which is written in Sather. This document describes the basic ideas of the extensions of the sequential Sather compiler to handle pSather programs and is thus a high-level documentation of parts of the pSather compiler. Most of the transformations are presented in the form of a transformation from pSather to Sather. ----- File: 1995/tr-95-041 Dealing with negated knowledge and inconsistency in a neurally motivated model of memory and reflexive reasoning Lokendra Shastri and Dean J. Grannes tr-95-041 August 1995 Recently, SHRUTI has been proposed as a connectionist model of rapid reasoning. It demonstrates how a network of simple neuron-like elements can encode a large number of specific facts as well as systematic knowledge (rules) involving n-ary relations, quantification and concept hierarchies, and perform a class of reasoning with extreme efficiency. The model, however, does not deal with negated facts and rules involving negated antecedents and consequents. We describe an extension of SHRUTI that can encode positive as well as negated knowledge and use such knowledge during reflexive reasoning. The extended model explains how an agent can hold inconsistent knowledge in its long-term memory without being ``aware'' that its beliefs are inconsistent, but detect a contradiction whenever inconsistent beliefs that are within a certain inferential distance of each other become co-active during an episode of reasoning. Thus the model is not logically omniscient, but detects contradictions whenever it tries to use inconsistent knowledge. The extended model also explains how limited attentional focus or action under time pressure can lead an agent to produce an erroneous response. A biologically significant feature of the model is that it uses only local inhibition to encode negated knowledge. Like the basic model, the extended model encodes and propagates dynamic bindings using temporal synchrony.
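To make the Move-to-front heuristic discussed in tr-95-039 above concrete: a minimal sketch of the MTF transform as used in compression front ends. TS(0) differs only in where the accessed item is reinserted; that rule is not shown here.

    def mtf_encode(data, alphabet):
        """Move-to-front transform, the baseline heuristic that tr-95-039
        proposes to replace with Timestamp(0) in compression schemes.
        Emits the current list position of each symbol, then moves the
        symbol to the front; recently seen symbols get small outputs."""
        table = list(alphabet)
        out = []
        for sym in data:
            i = table.index(sym)
            out.append(i)
            table.pop(i)
            table.insert(0, sym)
        return out

    # e.g. mtf_encode("banana", "abn") -> [1, 1, 2, 1, 1, 1]

The small, repetitive output values are what a back-end entropy coder then compresses well on sources with locality.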

Keywords: long-term memory; rapid reasoning; dynamic bindings; synchrony; knowledge representation; neural oscillations; working memory; negation; inconsistent knowledge; tractable reasoning. ----- File: 1995/tr-95-042 Complexity and Real Computation: A Manifesto Lenore Blum, Felipe Cucker, Mike Shub and Steve Smale tr-95-042 August 1995 Finding a natural meeting ground between the highly developed complexity theory of computer science--with its historical roots in logic and the discrete mathematics of the integers--and the traditional domain of real computation, the more eclectic, less foundational field of numerical analysis--with its rich history and longstanding traditions in the continuous mathematics of analysis--presents a compelling challenge. Here we illustrate the issues and pose our perspective toward resolution. This article is essentially the introduction to a book of the same title, to be published shortly by Springer. ----- File: 1995/tr-95-043 Performance Oriented Specification for Heterogenous Parallel Systems using Graphical Based Specifications Herwig Unger and Bernd Daene tr-95-043 August 1995 Today, multiprocessor systems can be used even for the solution of small problems. In contrast to this advance on the hardware side, there are only a few methods to specify and to generate efficient parallel programs, especially in the area of heterogeneous systems.

In this report we intend to show that Petri Nets are a suitable description language for doing so. An important point in their favour is that Petri Nets can represent, in a uniform model, both aspects influencing the quality of a solution: the software and the hardware on which the generated program will be executed. In that way the executable program can be derived by compiling the corresponding part of the model. Powerful transformations of a given Petri Net are therefore required in an iterative process. That is why a classification of such transformations is given in the main part of our contribution; furthermore, a new one is introduced. Because run-time input data strongly influence performance, the possibility of a dynamic implementation arising from such a transformation is also discussed. ----- File: 1995/tr-95-044 Complexity of Searching an Immobile Hider in a Graph Bernhard von Stengel, Ralph Werchner tr-95-044 August 1995 We study the computational complexity of certain search-hide games on a graph. There are two players, called searcher and hider. The hider is immobile and hides in one of the nodes of the graph. The searcher selects a starting node and a search path of length at most k. His objective is to detect the hider, which he does with certainty if he visits the node chosen for hiding. Finding the optimal randomized strategies in this zero-sum game defines a fractional path covering problem and its dual, a fractional packing problem. If the length k of the search path is arbitrary, then the problem is NP-hard. The problem remains NP-hard if the searcher may freely revisit nodes that he has seen before. In that case, the searcher selects a connected subgraph of k nodes rather than a path of k nodes. If k is logarithmic in the number of nodes of the graph, then the problem can be solved in polynomial time; this is shown using a recent technique called color-coding due to Alon, Yuster, and Zwick. The same results hold for edges instead of nodes, that is, if the hider hides in an edge and the searcher searches k edges on a path or on a connected subgraph.

Keywords: Covering and packing, game theory, graph search, NP-completeness ----- File: 1995/tr-95-045 Random Walks on Colored Graphs: Analysis and Applications Diane Hernek tr-95-045 August 1995 This thesis introduces a model of a random walk on a colored undirected graph. Such a graph has a single vertex set and $k$ distinct sets of edges, each of which has a color. A particle begins at a designated starting vertex and an infinite color sequence $C$ is specified. At time $t$ the particle traverses an edge chosen uniformly at random from those edges of color $C_t$ incident to the current vertex.
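A minimal simulation of the walk just defined may be useful (a sketch; the convention that the particle stays put when no edge of the requested color is incident to the current vertex is our assumption, not taken from the thesis):

    import random

    def colored_walk(adj, start, colors, steps):
        """Simulate a random walk on a colored undirected graph.
        adj[c][v]: list of neighbors of v via edges of color c (each
        undirected edge listed at both endpoints); colors: a long
        enough iterable giving the color sequence C."""
        v, visited = start, {start}
        for _, c in zip(range(steps), colors):
            nbrs = adj[c][v]
            if not nbrs:        # no incident edge of color c: stay put
                continue        # (an assumed convention for the sketch)
            v = random.choice(nbrs)
            visited.add(v)
        return v, visited

Running this with adversarially chosen color sequences is exactly the experiment whose worst-case cover time the thesis bounds.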

The first part of this thesis addresses the extent to which an adversary, by choosing the color sequence, can affect the behavior of the random walk. In particular, we consider graphs that are covered with probability one on all infinite sequences, and study their expected cover time in the worst case over all color sequences and starting vertices. We prove tight doubly exponential upper and lower bounds for graphs with three or more colors, and exponential bounds for the special case of two-colored graphs. We obtain stronger bounds in several interesting special cases, including random and repeated sequences. These examples have applications to understanding how the entries of the stationary distributions of ergodic Markov chains scale under various elementary operations.

The random walks we consider are closely related to space-bounded complexity classes and a type of interactive proof system. The second part of the thesis investigates these relationships and uses them to obtain complexity results for reachability problems in colored graphs. We also use our techniques to obtain complexity results for problems from the theory of nonhomogeneous Markov chains. We consider the problem of deciding, given a finite set ${\cal C} = \{C_1, \ldots, C_A\}$ of $n \times n$ stochastic matrices, whether every infinite sequence over $\cal C$ forms an ergodic Markov chain, and prove that it is PSPACE-complete. We also show that to decide whether a given finite-state channel is indecomposable is PSPACE-complete. This question is of interest in information theory where indecomposability is a necessary and sufficient condition for Shannon's theorem. ----- File: 1995/tr-95-046 Pet - Priority Encoded Transmission Bernd Lamparter, Andres Albanese, Malik Kalfane, and Michael Luby tr-95-046 August 1995 This paper presents a new Forward Error Correction scheme with several priority levels. It is useful for applications dealing with real-time transport streams like video and audio. Those streams consist of several data parts of different importance. PET allows those parts to be protected with appropriate redundancy and thus guarantees that the more important parts arrive before the less important ones.
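The simplest erasure-resilient building block behind such schemes, sketched for illustration (PET itself layers stronger, Reed-Solomon-style codes with per-priority redundancy; see also tr-95-048 below):

    def xor_parity(packets):
        """One parity packet over equal-length data packets: any single
        lost packet can be rebuilt by XORing the survivors with the
        parity. This is only the most basic erasure code, shown to make
        the recovery idea concrete."""
        parity = bytearray(len(packets[0]))
        for p in packets:
            for i, byte in enumerate(p):
                parity[i] ^= byte
        return bytes(parity)

    # Recovering a single lost data packet: XOR all surviving data
    # packets together with the parity packet; the result is the loss.
    # lost = xor_parity(surviving_packets + [parity_packet])

Assigning more such redundancy to the important parts of a stream and less to the unimportant ones is the priority idea that PET builds on.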

In the video we show the impact of losses on an MPEG video stream with and without PET protection. Due to the fragile nature of MPEG, the unprotected stream breaks up, whereas the PET-protected stream is unaffected by low losses and merely jerky when high losses are present. ----- File: 1995/tr-95-047 The Implementation of PET Bernd Lamparter and Malik Kalfane tr-95-047 August 1995 This report describes the implementation of PET (Priority Encoding Transmission) and its integration into VIC. PET is a new Forward Error Correction (FEC) scheme with several priority levels. It is useful for applications dealing with real-time transport streams like video and audio in a lossy environment. Those streams consist of several data parts of different importance. PET allows those parts to be protected with appropriate redundancy and thus guarantees that the more important parts arrive before the less important ones. ----- File: 1995/tr-95-048 An XOR-Based Erasure-Resilient Coding Scheme Johannes Blömer, Malik Kalfane, Marek Karpinski, Richard Karp, Michael Luby, David Zuckerman tr-95-048 August 1995 An $(m,n,b,r)$-erasure-resilient coding scheme consists of an encoding algorithm and a decoding algorithm with the following properties. The encoding algorithm produces a set of $n$ packets each containing $b$ bits from a message of $m$ packets containing $b$ bits. The decoding algorithm is able to recover the message from any set of $r$ packets. Erasure-resilient codes have been used to protect real-time traffic sent through packet based networks against packet losses. In this paper we construct an erasure-resilient coding scheme that is based on a version of Reed-Solomon codes and which has the property that $r=m$. The encoding and decoding algorithms run in quadratic time and have been customized to give the first real-time implementations of {\it Priority Encoding Transmission\/} (PET) \cite{ABEL},\cite{ABELS} for medium quality video transmission on Sun SPARCstation 20 workstations. ----- File: 1995/tr-95-049 Imperative Concurrent Object-Oriented Languages: An Annotated Bibliography Michael Philippsen tr-95-049 August 1995 The title says it all. ----- File: 1995/tr-95-050 Imperative Concurrent Object-Oriented Languages Michael Philippsen tr-95-050 August 1995 During the last decade object-oriented programming has grown from marginal influence into widespread acceptance. During the same period of time, progress on the side of hardware and networking has changed the computing environment from sequential to parallel. Multi-processor workstations are state-of-the-art. Many proposals have been made to combine both developments. The prime objective has always been to provide the advantages of object-oriented software design combined with the increased power of parallel machines. However, combining both concepts has proven itself to be a notoriously difficult task. Depending on the approach, key characteristics of either the object-oriented paradigm or key performance factors of parallelism are often sacrificed, resulting in unsatisfactory languages.

This survey first recapitulates well-known characteristics of both the object-oriented paradigm and parallel programming, before the design space of a combination is marked out by identifying various interdependencies of key concepts. The design space is then filled with data points: For proposed languages we provide brief characteristics and feature tables. Both feature tables and the comprehensive bibliography listing might help to identify open questions and to prevent re-invention.

For ``Web-Surfers'' we provide a wealth of interesting addresses. ----- File: 1995/tr-95-051 A Security Architecture for Tenet Scheme 2 Rolf Oppliger, Amit Gupta, Mark Moran, and Riccardo Bettati tr-95-051 August 1995 This report proposes a security architecture for Tenet Scheme 2. The basic ideas are (1) to use Internet layer security protocols, such as the IP Security Protocol (IPSP) and Internet Key Management Protocol (IKMP), to establish authentic communication channels between RCAP daemons, (2) to handle client authentication and authorization locally, and (3) to use a proxy-based mechanism to propagate access rights. The security architecture uses as its building blocks a collision-resistant one-way hash function to compute and verify message authentication codes, and a digital signature system. ----- File: 1995/tr-95-052 Reactive Local Search for the Maximum Clique Problem R. Battiti and M. Protasi tr-95-052 September 1995 A new Reactive Local Search (RLS) algorithm is proposed for the solution of the Maximum-Clique problem. RLS is based on local search complemented by a feedback (memory-based) scheme to determine the amount of diversification. The reaction acts on the single parameter that decides the temporary prohibition of selected moves in the neighborhood, in a manner inspired by Tabu Search. The performance obtained in computational tests appears to be significantly better than that of all algorithms tested at the second DIMACS implementation challenge. The worst-case complexity per iteration of the algorithm is O(max{n,m}), where n and m are the number of nodes and edges of the graph. In practice, when a vertex is moved, the number of operations tends to be proportional to its number of missing edges and therefore the iterations are particularly fast in dense graphs.
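A much-simplified sketch in the spirit of tr-95-052 above: grow the current clique greedily and, when stuck, drop a vertex and prohibit it for a fixed number of steps. The defining "reactive" part of RLS, adjusting that prohibition from feedback, is deliberately omitted here.

    import random

    def tabu_clique(adj, n, iters=10000, tenure=7):
        """Tabu-flavored local search for a large clique (a sketch only).
        adj: set of frozensets {u, v} for the edges; n: vertex count.
        The fixed `tenure` stands in for the parameter RLS adapts."""
        def compatible(v, clique):
            return all(frozenset((v, u)) in adj for u in clique)
        clique, best, tabu = set(), set(), {}
        for t in range(iters):
            cand = [v for v in range(n)
                    if v not in clique and tabu.get(v, -1) < t
                    and compatible(v, clique)]
            if cand:
                clique.add(random.choice(cand))      # expand while possible
                if len(clique) > len(best):
                    best = set(clique)
            elif clique:
                v = random.choice(list(clique))      # stuck: diversify
                clique.remove(v)
                tabu[v] = t + tenure                 # prohibit re-adding v
        return best

The invariant that every added vertex is adjacent to all current members guarantees the result is always a clique; only its size is heuristic.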

Keywords: maximum clique problem, heuristic algorithms, tabu search, reactive search ----- File: 1995/tr-95-053 Efficient Implementation of Multi-Methods for Statically Typed Languages V. Turau and W. Chen tr-95-053 September 1995 Some of the benefits of object-oriented programming such as extensibility and reusability are fundamentally based on inheritance and late binding. Dynamic dispatching is not only time consuming but it also prevents the usage of some optimization techniques such as inlining or interprocedural analysis. The situation is even more severe for languages supporting multi-methods, where dispatching is performed based not only on the type of the receiver but also on the types of the arguments. The most efficient way to perform dynamic dispatching is to avoid it as often as possible, without restricting the use of multi-methods. In this paper it is shown how this goal can be achieved through static analysis. We present a technique which discards all method calls which can be statically bound. Furthermore, even if a method cannot be statically bound, we derive information which will at run time speed up the dispatching process considerably.
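A toy run-time multi-method dispatcher, keyed on the classes of both arguments, may clarify exactly which cost tr-95-053 aims to remove through static analysis (a sketch; the names and the fallback rule, walking each argument's method resolution order, are ours):

    def defmulti(table):
        """Dispatch on the runtime types of *both* arguments, falling
        back along each argument's MRO; this per-call table lookup is
        the overhead that static binding eliminates."""
        def call(a, b):
            for ta in type(a).__mro__:
                for tb in type(b).__mro__:
                    if (ta, tb) in table:
                        return table[(ta, tb)](a, b)
            raise TypeError("no applicable method")
        return call

    class Shape: pass
    class Circle(Shape): pass

    collide = defmulti({
        (Circle, Circle): lambda a, b: "circle/circle test",
        (Shape,  Shape):  lambda a, b: "generic bounding-box test",
    })
    # collide(Circle(), Shape()) -> "generic bounding-box test"

When static analysis proves the argument types at a call site, the table lookup collapses to a direct call, which is the paper's goal.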

Keywords: Object-oriented programming languages, multi-methods, dispatching, static analysis ----- File: 1995/tr-95-054 Elementary Proofs of some Results on Representations of p-groups Mohammad A. Shokrollahi tr-95-054 September 1995 A result of Roquette states that if D is an absolutely irreducible representation of a p-group G over the field of complex numbers, then D can be realized in K(chi(g) | g in G), where chi is the character of D and K=Q(i) or K=Q according to whether p=2 or not. Based on Baum and Clausen's algorithm for computing the irreducible representations of supersolvable groups, we give an elementary proof of a theorem which, among other well-known facts on representations of p-groups, implies Roquette's result. ----- File: 1995/tr-95-055 Noisy Information and Computational Complexity: A Short Survey Leszek Plaskota tr-95-055 September 1995 In the modern world, the importance of information can hardly be overestimated. Information also plays a prominent role in scientific computations. A branch of computational complexity which deals with problems for which information is partial, noisy, and priced is called {\em information--based complexity}.

In most of the work on information--based complexity, the emphasis was on partial and exact information. We concentrate our attention on {\em noisy} information. We consider deterministic and random noise. The analysis of noisy information leads to a variety of new algorithms and complexity results.

This short survey has a rich extension in the form of the monograph `Noisy Information and Computational Complexity', to be published by Cambridge University Press. ----- File: 1995/tr-95-056 How to benefit from noise Leszek Plaskota tr-95-056 September 1995 We compare nonadaptive and adaptive designs for estimating linear functionals in the (minimax) statistical setting. It is known that adaptive designs are no better than nonadaptive ones in the worst case setting for convex and symmetric classes, as well as in the average case setting with Gaussian distributions.

In the statistical setting, the opposite is true. Namely, adaptive designs can be significantly better. Moreover, using adaptive designs one can obtain much better estimators for noisy data than for exact data. These results hold because adaptation and noisy data make Monte Carlo simulation possible. ----- File: 1995/tr-95-057 The Sather 1.0 Specification David Stoutamire and Stephen Omohundro tr-95-057 October 1995 This document is a concise specification of Sather 1.0. Sather is an object oriented language designed to be simple, efficient, safe, flexible and non-proprietary. Sather has parameterized classes, object-oriented dispatch, statically-checked strong (contravariant) typing, separate implementation and type inheritance, multiple inheritance, garbage collection, iteration abstraction, higher-order routines and iters, exception handling, assertions, preconditions, postconditions, and class invariants. The ICSI compiler supported this 1.0 specification from 1994 through much of 1995. There are later specifications which supersede this document; check the WWW site http://www.icsi.berkeley.edu/Sather. ----- File: 1995/tr-95-058 The pSather 1.0 Manual David Stoutamire tr-95-058 October 1995 This document describes pSather 1.0, the parallel and distributed extension to Sather 1.0 (see ICSI tech report tr-95-057.ps.gz). pSather adds support for threads, synchronization, communication, and placement of objects and threads. The ICSI compiler supported this 1.0 specification through much of 1995. There are later specifications which supersede this document; check the WWW site http://www.icsi.berkeley.edu/Sather. ----- File: 1995/tr-95-059 Fault handling for multi-party real-time communication Amit Gupta and Kurt Rothermel tr-95-059 October 1995 For real-time communication services to achieve widespread usage, it is important that the network services behave gracefully if any component(s) fail. While other researchers have previously considered failure-handling for non-real-time communication as well as for unicast real-time communication, these failure-recovery techniques must be reexamined in the light of the changes introduced by the new protocols and services for supporting multi-party real-time communication. In this report, we describe techniques and mechanisms for maintaining network services for multi-party real-time communication in the face of failures that may make parts of the network inaccessible. The key goal is that the protocols should provide high performance in the common case (i.e., in the absence of failed components) and the network performance should gracefully degrade in the face of network failures; e.g., in the presence of network faults, the routes selected may not be as good, the connection set-up may take a little more time, or resource allocation may be less efficient. We describe appropriate policies for storing state in the network, as well as the mechanisms for re-establishing connectivity for previously established connections and to permit setting up new connections to existing conferences. We also describe a redundancy-based approach, using forward error correction (FEC), and dispersing the FEC'ed data among disjoint routes. With these mechanisms, we can make multi-party real-time communication protocols robust to single and/or multiple failures in the network, {\em without} diluting the strength of the performance guarantees offered, or sacrificing the system performance in the common case, i.e., when all components work correctly.
----- File: 1995/tr-95-060 Dynamic resource migration for multi-party real-time communication Riccardo Bettati and Amit Gupta tr-95-060 October 1995 With long-lived multi-party connections, resource allocation subsystems in distributed real-time systems or communication networks must be aware of dynamically changing network load in order to reduce call-blocking probabilities. We describe a distributed mechanism to dynamically reallocate (``migrate'') resources without adversely affecting the performance that established connections receive. In addition to allowing systems to dynamically adapt to load, this mechanism allows for distributed relaxation of resources (i.e. the adjustment of overallocation of resources due to conservative assumptions at connection establishment time) for multicast connections. We describe how dynamic resource migration is incorporated in the Tenet Scheme 2 protocols for multiparty real-time communication. ----- File: 1995/tr-95-061 Efficient Input Reordering for the DCT Based on a Real-Valued Decimation in Time FFT Rainer Storn tr-95-061 September 1995 The possibility of computing the Discrete Cosine Transform (DCT) of length N=2**n, n integer, via an N-point Discrete Fourier Transform (DFT) is widely known from the literature. In this correspondence it will be demonstrated that this computation can be done in-place by just employing butterfly swaps if the input reordering--necessary for the DCT computation via the DFT--is combined with the bit-reverse scrambling required by the decimation-in-time Fast Fourier Transform algorithm. ----- File: 1995/tr-95-062 The Supervisor Synthesis Problem for Unrestricted CTL is NP-complete Marco Antoniotti and Bud Mishra tr-95-062 November 1995 The problem of restricting a finite state model (a Kripke structure) in order to satisfy a set of unrestricted \CTL\ \formulas\ is named the {\em ``Unrestricted \CTL\ Supervisor Synthesis Problem''}. The finite state model has the characteristics described by Ramadge and Wonham, that is, its transitions are partitioned between {\em controllable} and {\em uncontrollable} ones. The set of \CTL\ \formulas\ represents a specification of the {\em desired behavior} of the system, which may be achieved through a {\em control action}. This note shows the problem to be $\cal NP$-complete.
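For the bit-reverse scrambling mentioned in tr-95-061 above: a minimal sketch of the standard in-place permutation via pairwise swaps (the report's contribution, folding the DCT-specific input reordering into this permutation, is not reproduced here).

    def bit_reverse_permute(x):
        """In-place bit-reversal scrambling of a list whose length is a
        power of two, as required by decimation-in-time FFTs."""
        n = len(x)
        bits = n.bit_length() - 1
        for i in range(n):
            # index j is i with its `bits` binary digits reversed
            j = int(format(i, '0{}b'.format(bits))[::-1], 2)
            if i < j:                # swap each pair exactly once
                x[i], x[j] = x[j], x[i]
        return x

    # bit_reverse_permute(list(range(8))) -> [0, 4, 2, 6, 1, 5, 3, 7]

Because the permutation is an involution made of disjoint swaps, fusing another reordering into it keeps the whole computation in-place, which is the property the report exploits.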

Keywords: Discrete Event Systems, Temporal Logic, Supervisor Synthesis ----- File: 1995/tr-95-063 Mapping of speech front-end signal processing to high performance vector architectures Paola Moretto tr-95-063 December 1995 Front-end signal processing is a crucial stage for speech recognition systems. The capability of operating in adverse conditions, with high background noise and different channel characteristics, is one of the major goals when developing automatic speech recognition systems for use in real world environments. We describe the study of the mapping of a fundamental part of speech recognition systems - a robust speech front end algorithm called RASTA - to the Torrent architecture. The mapping problem is particularly relevant because at the moment there is no efficient automatic tool for implementing algorithms on this architecture. ----- File: 1995/tr-95-064 On the Power of Randomized Branching Programs Farid Ablayev and Marek Karpinski tr-95-064 November 1995 We define the notion of a randomized branching program in the natural way, analogous to the definition of a randomized circuit. We exhibit an explicit function $f_{n}$ for which we prove that:

Keywords: Randomized Branching Programs, Read-k Branching Programs, Lower Bounds, Two-way Communication Game. ----- File: 1995/tr-95-065 VC Dimension of Sigmoidal and General Pfaffian Neural Networks Marek Karpinski and Angus Macintyre tr-95-065 November 1995 We introduce a new method for proving explicit upper bounds on the VC Dimension of general functional basis networks, and prove as an application, for the first time, that the VC Dimension of analog neural networks with the sigmoidal activation function $\sigma(y)=1/(1+e^{-y})$ is bounded by a quadratic polynomial $O((lm)^2)$ in both the number $l$ of programmable parameters, and the number $m$ of nodes. The proof method of this paper generalizes to a much wider class of Pfaffian activation functions and formulas, and also gives, for the first time, polynomial bounds on their VC Dimension. We also present some other applications of our method.

Keywords: VC Dimension, Pfaffian Activation Functions and Formulas, Neural Networks, Sparse Networks, Boolean Computation. ----- File: 1995/tr-95-066 An Exponential Lower Bound on the Size of Algebraic Decision Trees for MAX Dima Grigoriev, Marek Karpinski and Andrew C. Yao tr-95-066 November 1995 We prove an exponential lower bound on the size of any fixed-degree algebraic decision tree for solving MAX, the problem of finding the maximum of $n$ real numbers. This complements the $n-1$ lower bound of Rabin \cite{R72} on the depth of algebraic decision trees for this problem. The proof in fact gives an exponential lower bound on size for the polyhedral decision problem MAX= of testing whether the $j$-th number is the maximum among a list of $n$ real numbers. Previously, except for linear decision trees, no nontrivial lower bounds on the size of algebraic decision trees for any familiar problems were known. We also establish an interesting connection between our lower bound and the maximum number of minimal cutsets for rank-$d$ hypergraphs on $n$ vertices.

Keywords: Lower Bounds, Algebraic Decision Trees, MAX Problem, Selection Problems, Hypergraphs, Minimal Cutsets. ----- File: 1995/tr-95-067 Making Automatic Speech Recognition More Robust to Fast Speech Nikki Mirghafori, Eric Fosler, and Nelson Morgan tr-95-067 December 1995 Psychoacoustic studies show that human listeners are sensitive to speaking rate variations \cite{summerfield81}. Automatic speech recognition (ASR) systems are even more affected by the changes in rate, as fast speakers have been observed to incur double to quadruple the word recognition error rates of average speakers on many ASR systems \cite{pallett93}. In this work, we have studied the causes of the higher error and concluded that both the {\em acoustic-phonetic} and the {\em phonological} differences are sources of higher word error rates. We have also studied various measures for quantifying rate of speech (ROS), and used simple methods for estimating the speaking rate of a novel utterance using ASR technology. We have implemented mechanisms that make our ASR system more robust to fast speech. Using our ROS estimator to identify fast sentences in the test set, our rate-dependent system has 24.5\% fewer errors on the fastest sentences and 6.2\% fewer errors on all sentences of the WSJ93 evaluation set relative to the baseline HMM/MLP system. These results were achieved using some gross approximations: adjustment for one rate over an entire utterance, hand-tweaked rather than optimal transition parameters, and quantization of rate effects to two levels (fast and not fast).

Keywords: Automatic Speech Recognition, Speaking Rate, Robustness, Duration Modeling ----- File: 1995/tr-95-068 A Lower Bound for Randomized Algebraic Decision Trees Dima Grigoriev, Marek Karpinski, Friedhelm Meyer auf der Heide and Roman Smolensky tr-95-068 December 1995 We extend the lower bounds on the depth of algebraic decision trees to the case of {\em randomized} algebraic decision trees (with two-sided error) for languages being finite unions of hyperplanes and the intersections of halfspaces, solving a long-standing open problem. As an application, among other things, we derive, for the first time, an $\Omega(n^2)$ {\em randomized} lower bound for the {\em Knapsack Problem} which was previously only known for deterministic algebraic decision trees. It is worth noting that for the languages being finite unions of hyperplanes our proof method also yields a new elementary technique for deterministic algebraic decision trees without making use of Milnor's bound on Betti numbers of algebraic varieties.

Keywords: Lower Bounds, Randomized Algebraic Decision Trees, Hyperplanes, Faces, Knapsack Problem, Element Distinctness Problem. ----- File: 1995/tr-95-069 Derandomizing Approximation Algorithms for Hard Counting Problems Michael Luby tr-95-069 December 1995 No Abstract available. ----- File: 1995/tr-95-070 A Quality of Service Management Architecture (QoSMA): A preliminary study Marco Alfano tr-95-070 December 1995 The widespread use of distributed multimedia applications is posing new challenges in the management of resources for guaranteeing Quality of Service (QoS). For applications relying on the transfer of multimedia information, and in particular continuous media, it is essential that QoS is guaranteed at any level of the distributed system, including the operating system, the transport protocol, and the underlying network. Enhanced protocol support for end-to-end QoS negotiation, renegotiation, and indication of QoS degradation is also required. Little attention, however, has so far been paid to the definition of a coherent framework that incorporates QoS interfaces, management and mechanisms across all the layers of a management architecture. This paper describes a preliminary study in the development of an integrated Quality of Service Management Architecture (QoSMA) which offers a framework to specify and manage the required performance properties of multimedia applications over heterogeneous distributed systems.

Keywords: Quality of Service, QoS, Management, Multimedia Applications, Distributed Systems, Real Time. ----- File: 1996/tr-96-001 Interaction Selection and Complexity Control for Learning in Binarized Domains Gerald Fahner tr-96-001 May 1996 We empirically investigate the potential of a novel, greatly simplified classifier design for binarized data. The generic model allocates a sparse, "digital" hidden layer composed of interaction nodes that compute PARITY of selected submasks of input bits, followed by a sigmoidal output node with adjustable weights. Model identification incorporates user-assigned complexity preferences. We discuss two situations: a) when the input space obeys a metric, and b) when the inputs are discrete attributes. We propose a family of respective model priors that make search through the combinatorial space of multi-input interactions feasible. Model capacity and smoothness of the approximation are controlled by two complexity parameters. Model comparison over the parameter plane discovers models with excellent performance. In some cases interpretable structures are achieved. We point out the significance of our novel data mining tool for overcoming scaling problems, impacts on real-time systems, and possible contributions to the development of non-standard computing devices for inductive inference.
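The hidden nodes just described are simple enough to state in one line; a sketch (our encoding: both the input vector and the selected submask are integers whose binary digits are the bit vectors):

    def parity_node(x, mask):
        """One 'digital' hidden node of the kind described in tr-96-001:
        the PARITY of the input bits of x selected by mask."""
        return bin(x & mask).count("1") & 1

    # parity_node(0b1011, 0b0011) -> 0   (two selected bits are set)

The learning problem in the report is then which masks to allocate and how to weight the resulting parity features at the sigmoidal output node.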

Keywords: learning algorithms, feature selection, Walsh-functions, input-space representation, complexity measures, capacity control, model comparison ----- File: 1996/tr-96-002 Computation of Irregular Primes up to Eight Million (Preliminary Report) M. A. Shokrollahi tr-96-002 January 1996 We report on a joint project with Joe Buhler, Richard Crandall, Reiji Ernvall, and Tauno Metsänkylä dealing with the computation of irregular primes and cyclotomic invariants for primes between four and eight million. This extends previous computations of Buhler et al. [4]. Our computation of the irregular primes is based on a new approach which originated in the study of Stickelberger codes [13]. It reduces the problem to that of finding zeros of a polynomial over F_p of degree < (p-1)/2 among the quadratic residues. Use of fast polynomial gcd-algorithms gives an O(p log^2 p log log p)-algorithm for this task. By employing the Schönhage-Strassen algorithm for fast integer multiplication combined with a version of fast multiple evaluation of polynomials we design an algorithm with running time O(p log p log log p). This algorithm is particularly efficient when run on primes p for which p-1 has small prime factors. We also give some improvements on the previous implementations for computing the cyclotomic invariants of a prime. ----- File: 1996/tr-96-003 Ramification and Causality Michael Thielscher tr-96-003 January 1996 The ramification problem in the context of commonsense reasoning about actions and change names the challenge to accommodate actions whose execution causes indirect effects. Not being part of the respective action specification, such effects are consequences of general laws describing dependencies between components of the world description. We present a general approach to this problem which incorporates causality, formalized by directed relations between two single effects stating that, under specific circumstances, the occurrence of the first causes the second. Moreover, the necessity of exploiting causal information in this or a similar way is argued by elaborating the limitations of common paradigms employed to handle ramifications, namely, the principle of categorization and the policy of minimal change. Our abstract solution is integrated, by way of example, into a specific calculus based on the logic programming paradigm.

Keywords: Reasoning About Actions, Causality, Ramification Problem, Logic Programming. ----- File: 1996/tr-96-004 The Rank of Sparse Random Matrices over Finite Fields Johannes Blömer, Richard Karp, Emo Welzl tr-96-004 January 1996 Let M be a random matrix over GF[q] such that for each entry M_ij in M and for each non-zero field element w the probability Pr[M_ij=w] is p/(q-1), where p=(log(n)-c)/n and c is an arbitrary but fixed positive constant. The probability for a matrix entry to be zero is 1-p. It is shown that the expected rank of M is n-O(1). Furthermore, there is a constant A such that the probability that the rank is less than n-k is less than A/q^k. It is also shown that if c grows depending on n and is unbounded as n goes to infinity then the expected difference between the rank of M and n is unbounded. ----- File: 1996/tr-96-005 Computing Irreducible Representations of Supersolvable Groups over Small Finite Fields A. Omrani and A. Shokrollahi tr-96-005 January 1996 We present an algorithm to compute a full set of irreducible representations of a supersolvable group G over a finite field K with char K not dividing |G|, which is not assumed to be a splitting field of G. The main subroutines of our algorithm are a modification of the algorithm of Baum and Clausen [1] to obtain information on algebraically conjugate representations, and an effective version of Speiser's generalization of Hilbert's Theorem 90 stating that H^1(Gal(L/K), GL(n,L)) vanishes for all n >= 1.

Keywords: Computational representation theory, Galois cohomology, ----- File: 1996/tr-96-006 Managing ABR Capacity in Reservation-based Slotted Networks Roya Ulrich, Pieter Kritzinger tr-96-006 January 1996 For slotted networks carrying full multi-media traffic to work successfully, it is essential that connection setup and management is done well under all traffic conditions. Major challenges remain with the current state of the technology, however, particularly on how one copes with traffic bursts. Existing reservation-based networks do not allow the user to dynamically adjust his bandwidth requirements on demand. In this paper we propose a new scheme, called the reservoir scheme, which allows dynamic and distributed resource allocation. The basic idea behind the scheme is to reserve bandwidth with a guaranteed bit rate for each virtual circuit. The user is allowed to decentrally allocate additional bandwidth from an Available Bit Rate (ABR) reservoir to satisfy dynamic changes of Variable Bit Rate (VBR) traffic. The duration and bandwidth of this dynamic access are negotiated in the call setup phase and do not require any renegotiation with the service provider so that this solution overcomes the rigidity of current static bandwidth reservation schemes. The additional management requirements are low compared to other dynamic bandwidth reservation schemes. We also describe an analytic model and simulation which we used to determine whether it would be practical to apply the proposed scheme in a slotted network.

Keywords: Resource management, bandwidth allocation, delay- and loss-sensitive applications, variable bit rate traffic, performance evaluation. ----- File: 1996/tr-96-007 Algebraic Settings for the Problem "P does not equal NP?" Lenore Blum, Felipe Cucker, Mike Shub, and Steve Smale tr-96-007 February 1996 When complexity theory is studied over an arbitrary unordered field K, the classical theory is recaptured with K = Z2. The fundamental result that the Hilbert Nullstellensatz as a decision problem is NP-complete over K allows us to reformulate and investigate complexity questions within an algebraic framework and to develop transfer principles for complexity theory. Here we show that over algebraically closed fields K of characteristic 0 the fundamental problem "P does not equal NP?" has a single answer that depends on the tractability of the Hilbert Nullstellensatz over the complex numbers. A key component of the proof is the Witness Theorem enabling the elimination of transcendental constants in polynomial time. ----- File: 1996/tr-96-008 A Geometric Proof of a Formula for the Number of Young Tableaux of a Given Shape Michael Luby tr-96-008 February 1996 This paper contains a short proof of a formula by Frame, Robinson, and Thrall [1] which counts the number of Young tableaux of a given shape. The proof is based on a simple but novel geometric way of expressing the area of a Ferrers diagram. ----- File: 1996/tr-96-009 Explicit and Implicit Indeterminism: Reasoning About Uncertain and Contradictory Specifications of Dynamic Systems Sven-Erik Bornscheuer and Michael Thielscher tr-96-009 February 1996 A high-level action semantics to specify and reason about dynamic systems is presented which supports both uncertain knowledge (taken as explicit indeterminism) and contradictory information (taken as implicit indeterminism). We start by developing an action description language for intentionally representing nondeterministic actions in dynamic systems. We then study the different possibilities of interpreting contradictory specifications of concurrent actions. We argue that the most reasonable interpretation which allows for exploiting as much information as possible is to take such conflicts as implicit indeterminism. As the second major contribution, we present a calculus for our resulting action semantics based on the logic programming paradigm including negation-as-failure and equational theories. Soundness and completeness of this encoding with respect to the notion of entailment in our high-level action language is proved by taking the completion semantics for equational logic programs with negation.
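The formula proved in tr-96-008 above is easy to state and compute; a small sketch of the Frame-Robinson-Thrall hook length count (standard material, not the paper's geometric proof):

    from math import factorial

    def num_tableaux(shape):
        """Number of standard Young tableaux of a partition shape:
        n! divided by the product of all hook lengths (Frame,
        Robinson, and Thrall)."""
        n = sum(shape)
        prod = 1
        for i, row in enumerate(shape):
            for j in range(row):
                arm = row - j - 1                             # cells to the right
                leg = sum(1 for r in shape[i + 1:] if r > j)  # cells below
                prod *= arm + leg + 1                         # the hook length
        return factorial(n) // prod

    # num_tableaux((2, 1)) -> 2

For the shape (2, 1) the hooks are 3, 1, 1, giving 3!/3 = 2, matching the two tableaux one can list by hand.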

Keywords: reasoning about actions, logic programming. ----- File: 1996/tr-96-010 On Interpolating Polynomials over Finite Fields M. A. Shokrollahi tr-96-010 February 1996 A set of monomials $x^{a_0},\ldots,x^{a_r}$ is called interpolating with respect to a subset $S$ of the finite field $\F_q$, if it has the property that given any pairwise different elements $x_0,\ldots,x_r$ in $S$ and any set of elements $y_0,\ldots,y_r$ in $\F_q$ there are elements $c_0,\ldots,c_r$ in $\F_q$ such that $y_h=\sum_{j=0}^r c_j x_h^{a_j}$ for $0\le h\le r$. In this paper we address the question of determining interpolating sets with respect to $S=\F_q$ and $S=\F_q^\times$. For $q$ a prime and $S=\F_q$ this is a problem of N.~Reingold and D.~Spielman posed by A.~Odlyzko in~\cite[p.~399]{ff}. We call the interpolating set $\{x^{a_0},\ldots,x^{a_r}\}$ trivial if its exponent set coincides with $\{0,b,2b,\ldots,rb\}\bmod (q-1)$ for some $b$ coprime to $q-1$. The question is whether all interpolating sets with respect to $\F_q$ are trivial.

We start by relating this to a problem on cyclic MDS codes. We then show that for $r=2$ and $S=\F_q^\times$ the problem is equivalent to whether or not for some $m$ the polynomial $(x^m -1)/(x-1)$ is a permutation polynomial over $\F_q$. The latter problem has been investigated by R.~Matthews~\cite{matt}. Using B\'ezout's Theorem and results on arcs in projective spaces, we show that in a certain range for $r$ (depending on $q$ and the maximum of the $a_i$) the only interpolating sets with respect to $\F_q^\times$ are trivial. We then proceed to sharpen this result for the special exponent set $0,1,2,\ldots,r-1,m$ where $m$ satisfies $r\le m\le q-2$. Finally, we exhibit an example of a nontrivial interpolating set with respect to $\F_q^\times$ for even $q\ge8$. In the language of finite geometries this is an example of a complete $q$-arc over $\F_q$, and in the language of coding theory this is an example of a cyclic MDS-code which is not equivalent to a generalized Reed-Solomon code.
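A brute-force check of the defining property, feasible only for tiny prime q, may help make the definition above concrete (a sketch; the function names and the restriction to prime q are ours):

    from itertools import combinations

    def invertible_mod(M, p):
        """Gaussian elimination mod a prime p; True iff det(M) != 0."""
        M = [row[:] for row in M]
        k = len(M)
        for c in range(k):
            piv = next((r for r in range(c, k) if M[r][c] % p), None)
            if piv is None:
                return False
            M[c], M[piv] = M[piv], M[c]
            inv = pow(M[c][c], p - 2, p)        # Fermat inverse, p prime
            for r in range(c + 1, k):
                f = M[r][c] * inv % p
                for cc in range(c, k):
                    M[r][cc] = (M[r][cc] - f * M[c][cc]) % p
        return True

    def is_interpolating(exponents, q, S):
        """True iff the matrix (x_h^{a_j}) is invertible mod q for every
        choice of pairwise distinct points x_0,...,x_r in S."""
        for pts in combinations(S, len(exponents)):
            M = [[pow(x, a, q) for a in exponents] for x in pts]
            if not invertible_mod(M, q):
                return False
        return True

    # is_interpolating((0, 1, 2), 5, range(5)) -> True  (the trivial,
    # Vandermonde case; nontrivial sets are what the paper is after)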

Keywords: MDS-Codes, arcs, normal rational curves, cyclic codes, interpolation. ----- File: 1996/tr-96-011 A DSOM hierarchical model for reflexive processing: an application to visual trajectory classification Claudio Privitera and Lokendra Shastri tr-96-011 June 1996 Any intelligent system, whether human or robotic, must be capable of dealing with patterns over time. Temporal pattern processing can be achieved if the system has a short-term memory capacity (STM) so that different representations can be maintained for some time. In this work we propose a neural model wherein STM is realized by leaky integrators in a self-organizing system. The model exhibits compositionality, that is, it has the ability to extract and construct progressively complex and structured associations in a hierarchical manner, starting with basic and primitive (temporal) elements. An important feature of the proposed model is the use of temporal correlations to express dynamic bindings.

Keywords: Dynamic Self-Organizing Map, Short-term Memory, Compositional Knowledge, Representation, Dynamic Bindings ----- File: 1996/tr-96-012 The Sather 1.1 Specification David Stoutamire and Stephen Omohundro tr-96-012 August 1996 This document is a concise specification of Sather 1.1. Sather is an object oriented language designed to be simple, efficient, safe, flexible and non-proprietary. Sather has parameterized classes, object-oriented dispatch, statically-checked strong (contravariant) typing, separate implementation and type inheritance, multiple inheritance, garbage collection, iteration abstraction, closures, exception handling, assertions, preconditions, post conditions, and class invariants.

This 1.1 specification significantly polishes and improves the 1.0 language specification with an introduction, index, and examples. New constructs include `out' arguments, less restrictive overloading, and improved external language interfaces. ----- File: 1996/tr-96-013 The Voice Mail Digits and Their Performance on ICSI's Hybrid HMM/ANN System Rainer Klisch tr-96-013 April 1996 This report describes how we used ICSI's Hidden Markov Model (HMM) / Artificial Neural Network (ANN) speech recognition system to evaluate the Voice Mail (VM) digits corpus. We will present the new database, discuss the structure of the HMM/ANN recognizer, and finally report on the recognition performance we achieved in this initial work. ----- File: 1996/tr-96-014 A Note on Matrix Rigidity M. A. Shokrollahi and V. Stemann tr-96-014 April 1996 The rigidity of a matrix is defined as the number of entries in the matrix that have to be changed in order to reduce its rank below a certain value. Starting from a combinatorial lemma, we give in this paper explicit constructions of $n\times n$ matrices over infinite fields with the property that if we change no more than $c\frac{n^2}{r}\log\frac{n}{r}$ entries in the matrix, the rank remains at least $r$. ($c$ is an absolute constant.) In the second part of the paper we use the theory of algebraic-geometric codes to construct $n\times n$ matrices over a finite field $\Fq$ such that any $\lceil \varepsilon n\rceil\times \lceil \varepsilon n\rceil$ submatrix of such a matrix has rank at least $\lceil \delta n\rceil$, for some constants $\varepsilon$ and $\delta$ depending on $q$. We then apply our combinatorial lemma to these matrices to obtain lower bounds on their rigidity.

Keywords: Matrix rigidity, circuit complexity, communication complexity ----- File: 1996/tr-96-015 Cyclical Local Structural Risk Minimization with Growing Neural Networks Jan Matti Lange tr-96-015 April 1996 This paper introduces a new concept for learning from examples, called Cyclical Local Structural Risk Minimization (CLSRM), which minimizes a global risk by cyclical minimization of residual local risks. The idea is to increase the capacity of the learning machine cyclically only in those regions where the effective loss is high and to do a stepwise local risk minimization, restricted to those regions. An example of the realization of the CLSRM principle is the TACOMA (TAsk Decomposition, COrrelation Measures and local Attention neurons) learning architecture. The algorithm generates a feed-forward network bottom up by cyclical insertion of cascaded hidden layers. The output of a hidden unit is locally restricted with respect to the network input space using a new kind of activation function combining the local characteristic of radial basis functions with sigmoid functions. The insertion of such hidden units increases the capacity only locally and finally leads to a neural network with a capacity well adapted to the distribution of the training data. The performance of the algorithm is shown for classification and function approximation benchmarks. ----- File: 1996/tr-96-016 Deterministic Generalized Automata Dora Giammarresi and Rosa Montalbano tr-96-016 May 1996 A generalized automaton (GA) is a finite automaton where the single transitions are defined on words rather than on single letters. Generalized automata were considered by K. Hashiguchi, who proved that the problem of calculating the size of a minimal GA is decidable.

We define the model of deterministic generalized automaton (DGA) and study the problem of its minimization. A DGA has the restriction that, for each state, the sets of words corresponding to the transitions of that state are prefix sets. We solve the problem of calculating the number of states of a minimal DGA for a given language, by giving a procedure that effectively constructs a minimal DGA starting from the minimal equivalent (conventional) deterministic automaton. ----- File: 1996/tr-96-017 Structural Gr\"obner Basis Detection Bernd Sturmfels and Markus Wiegelmann tr-96-017 May 1996 We determine the computational complexity of deciding whether $m$ polynomials in $n$ variables have relatively prime leading terms with respect to some term order. This problem is NP-complete in general, but solvable in polynomial time for $m$ fixed and for $n-m$ fixed. Our new algorithm for the latter case determines a candidate set of leading terms by solving a maximum matching problem. This reduces the problem to linear programming. ----- File: 1996/tr-96-018 A Management Platform for Global Area ATM Networks Roya Ulrich tr-96-018 May 1996 Technological progress has made it possible to provide numerous new services to a large number of users. Concurrently, we also experience an increased interest in real-time and interactive applications, e.g. teleseminaring, video conferencing and application sharing, in particular because of the worldwide and decentralized character of today's research and development organizations.

The International Computer Science Institute (ICSI) is a participant in the first transatlantic ATM link, which is an integral part of the Multimedia Applications on Intercontinental Highways (MAY) Project. Additionally, ICSI is attached to the Bay Area Gigabit Network (BAGNet), which provides ATM connectivity on a best-effort basis. Both projects provide platforms to identify the key research and development topics in cooperative real-time communication.

This technical report gives a brief introduction to the ATM infrastructure at ICSI and addresses challenging management issues of multimedia applications in such global area ATM networks. We explore three management areas: performance, configuration, and fault management, from the user's point of view. Finally, we introduce a management platform and tools we have been developing which help the user to better predict the quality of service provided and to recover from faults that occur in the system or during a transmission. ----- File: 1996/tr-96-019 An Introduction to Modular Process Nets Dietmar Wikarski tr-96-019 April 1996 Modular process nets are a graphical and formal notation for the representation of technical and business process models containing concurrent activities. They are low-level Petri nets equipped with innovative module and communication concepts, optionally enhanced by the use of a task concept as known from the areas of computer-supported cooperative work (CSCW) and workflow management. Though originally developed for modeling, analysis, simulation and control of workflows and computer-based process control systems, this class of models can also be used in other areas where a formal description of complex processes is needed. After a description of the basic aims and design decisions for modular process nets and a brief introduction to low-level Petri nets, the report gives a detailed description of a hierarchical module concept for nets and introduces the new class of elementary process nets. The module concept is part of a more general "object-based" approach to Petri nets, whereas the main feature of elementary process nets is the definition of constructs for synchronous and asynchronous communication between separately interpreted net instances via events and token passing. The report is intended to be a precise and systematic introduction to modular process nets. At the same time, it is kept as informal as possible in order to provide a broad spectrum of non-specialist users with a comprehensible means of expression for complex processes. Typical application examples are included. ----- File: 1996/tr-96-020 Parallel Balanced Allocation Volker Stemann tr-96-020 June 1996 We study the well-known problem of throwing $m$ balls into $n$ bins. If each ball in the sequential game is allowed to select more than one bin, the maximum load of the bins can be exponentially reduced compared to the `classical balls into bins' game.

We consider a static and a dynamic variant of a randomized parallel allocation where each ball can choose a constant number of bins. All results hold with high probability. In the static case all $m$ balls arrive at the same time. We analyze for $m=n$ a very simple optimal class of protocols achieving maximum load $O \left(\sqrt[r]{\frac{\log n}{\log\log n}}\right)$ if $r$ rounds of communication are allowed. This matches the lower bound of \cite{ACMR95}.
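
For intuition, the following sketch (Python; illustrative only, since the report's protocols are parallel) contrasts the classical one-choice game with the sequential greedy $d$-choice rule that underlies balanced allocation:

    import random

    def max_load(n_balls, n_bins, d, seed=0):
        # Each ball probes d uniformly random bins and joins the least
        # loaded one; d=1 is the classical balls-into-bins game.
        rng = random.Random(seed)
        load = [0] * n_bins
        for _ in range(n_balls):
            choices = [rng.randrange(n_bins) for _ in range(d)]
            best = min(choices, key=lambda b: load[b])
            load[best] += 1
        return max(load)

    n = 100_000
    print("d=1:", max_load(n, n, 1))   # about log n / log log n
    print("d=2:", max_load(n, n, 2))   # about log log n / log 2, exponentially smaller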

Furthermore, we generalize the protocols to the case of $m > n$ balls. An optimal load of $O(m/n)$ can be achieved using $\frac{\log\log n}{\log(m/n)}$ rounds of communication. Hence, for $m = n\frac{\log\log n}{\log\log\log n}$ balls this slackness allows the amount of communication to be hidden. In the `classical balls into bins' game this optimal distribution can only be achieved for $m = n\log n$.

In the dynamic variant $n$ of the $m$ balls arrive at the same time and have to be allocated. Each of these initial $n$ balls has a list of $m/n$ successor-balls. As soon as a ball is allocated, its successor is processed. We present an optimal parallel process that allocates all $m=n\log n$ balls in $O(m/n)$ rounds. Hence, the expected allocation time is constant. The main contribution of this process is that the maximum allocation time is additionally bounded by $O(\log\log n)$. ----- File: 1996/tr-96-021 Randomized Efficient Algorithms for Compressed Strings: the Finger-Print Approach Leszek Gasieniec, Marek Karpinski, Wojciech Plandowski, Wojciech Rytter tr-96-021 June 1996 Denote by LZ(w) the coded form of a string w produced by the Lempel-Ziv encoding algorithm. We consider several classical algorithmic problems for texts in the compressed setting. The first of them is equality testing: given LZ(w) and integers i,j,k test the equality: w[i ... i+k] = w[j ... j+k]. We give a simple and efficient randomized algorithm for this problem using the finger-printing idea. The equality testing is reduced to the equivalence of certain context-free grammars generating single strings. Equality testing is the bottleneck in other algorithms for compressed texts. We relate the time complexity of several classical problems for texts to the complexity Eq(n) of equality testing. Assume n = |LZ(T)|, m = |LZ(P)| and U = |T|. Then we can compute the compressed representations of the sets of occurrences of P in T, periods of T, palindromes of T, and squares of T respectively in times O(n log^2 U * Eq(m) + n^2 log U), O(n log^2 U * Eq(n) + n^2 log U), O(n log^2 U * Eq(n) + n^2 log U) and O(n^2 log^3 U * Eq(n) + n^3 log^2 U), where Eq(n) = O(n log log n). The randomization improves considerably upon the known deterministic algorithms (\cite{KPR} and \cite{KRS}). ----- File: 1996/tr-96-022 Determining Priority Queue Performance from Second Moment Traffic Characterizations Edward W. Knightly tr-96-022 June 1996 A crucial problem in the efficient design and management of integrated services networks is how to best allocate and reserve network resources for heterogeneous and bursty traffic streams in multiplexers that support prioritized service disciplines. In this paper, we introduce a new approach for determining per-connection QoS parameters such as delay-bound violation probability and loss probability in multi-service networks. The approach utilizes a traffic characterization that consists of the variances of a stream's rate distribution over multiple interval lengths, which captures its burstiness properties and autocorrelation structure. The resource allocation scheme is based on application of the Central Limit Theorem over intervals, together with use of stochastic delay-bounding techniques; it results in simple and efficient algorithms for determining QoS parameters. We perform experiments with long traces of MPEG-compressed video and show that the new scheme is accurate enough to capture most of the inherent statistical multiplexing gain, achieving average network utilizations of up to 90% for these traces. ----- File: 1996/tr-96-023 Structural Classification - A Preliminary Report Jana Koehler, Kilian Stoffel and James A. Hendler tr-96-023 July 1996 A new type of classification algorithm is introduced that works on the folded representation of concepts.
The algorithm comprises two phases: a preprocessing phase working on the normal-form representation of concepts to test for unsatisfiability and tautology, and a structural classifier that generates predecessors and successors of concepts by exploiting new optimization techniques not available to standard classifiers.

Working on the folded terminology instead of its expanded and normalized representation allows a significant reduction in the number of subsumption tests that are necessary to correctly classify a concept. We describe the algorithm and prove it sound and complete for two different languages. It can be extended to more expressive languages when combined with a new method for reasoning about number restrictions over role hierarchies based on Diophantine equations.
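
The way a traversal order can save subsumption tests is illustrated by the following generic top-down sketch (Python; hypothetical names, and not the structural algorithm of the report, which avoids normalization altogether):

    def most_specific_subsumers(top, children, subsumes, new):
        # children: dict node -> list of direct subconcepts;
        # subsumes(a, b): True iff concept a subsumes concept b.
        result, visited = set(), set()
        def visit(node):
            if node in visited:
                return
            visited.add(node)
            # Test only the children of nodes already known to subsume
            # `new`; everything below a non-subsumer is never touched.
            below = [c for c in children.get(node, []) if subsumes(c, new)]
            if below:
                for c in below:
                    visit(c)
            else:
                result.add(node)
        if subsumes(top, new):
            visit(top)
        return result

    # Toy model: a concept is a set of required features, and `a`
    # subsumes `b` iff a's features are a subset of b's.
    hier = {"top": ["animal", "machine"], "animal": ["dog"], "machine": []}
    feats = {"top": set(), "animal": {"alive"}, "dog": {"alive", "barks"},
             "machine": {"engine"}, "puppy": {"alive", "barks", "small"}}
    subs = lambda a, b: feats[a] <= feats[b]
    print(most_specific_subsumers("top", hier, subs, "puppy"))   # {'dog'}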

The algorithm is very fast and parallelizes very well, taking less than 4 hours to classify a terminology of 100,000 concepts on an SP2.

Keywords: concept languages, description logics, classification, optimization techniques ----- File: 1996/tr-96-024 Reservoir-based ABR Service in ATM Networks Wolfgang Frohberg and Roya Ulrich tr-96-024 July 1996 ATM technology is emerging as the major networking technology for the Broadband-ISDN. Motivated by the growing amount of Internet traffic, which will be carried over ATM networks, we extend the reservoir-based resource management proposed in [ULR95] to ATM networks, where it can be used to provide an available bit rate (ABR) bearer service. ABR is connection-oriented and performs a variable bit rate data transport without timing constraints. The reservoir-based ABR scheme (ResABR) proposed in this paper assigns bandwidth on demand from the sources, taking into account the network utilization at request time. The basic idea of ResABR is to divide virtual paths into two logical parts. One part contains the bandwidth necessary to guarantee a minimum bandwidth for all connections. The other part of each VP provides a reservoir of extra bandwidth, which can be used by one or more of the ResABR connections for a short time to send bursts. The advantages of the ResABR scheme are: resource management actions are necessary only when a burst occurs; no extra storage of cells inside the network is necessary; the scheme is robust and requires little computational effort; and it is fair among sources. ----- File: 1996/tr-96-025 Space Bounds for Interactive Proof Systems with Public Coins and Bounded Number of Rounds Maciej Liskiewicz and Rudiger Reischuk tr-96-025 July 1996 This paper studies interactive proof systems using public coin tosses (Arthur-Merlin games) with a sublogarithmic space-bounded verifier. We provide examples of specific languages and show that such systems working with a bounded number of rounds of interaction are unable to accept these languages. As a consequence, a separation of the second and the third level of the round/alternation hierarchy is obtained. It is well known that such a property does not hold for the corresponding polynomial time classes: in ["Proceedings of the 17th ACM Symposium on Theory of Computing", ACM Press, 1985, 421-429] Babai showed that the hierarchy of complexity classes AM_k~Time(POL) collapses to the second level. ----- File: 1996/tr-96-026 Qualification and Causality Michael Thielscher tr-96-026 July 1996 In formal theories for reasoning about actions, the qualification problem denotes the problem of accounting for the many conditions which, albeit being unlikely to occur, may prevent the successful execution of an action. While a solution to this problem must involve the ability to assume away by default these abnormal disqualifications of actions, the common straightforward approach of globally minimizing them is inadequate as it lacks an appropriate notion of causality. This is shown by a simple counter-example closely related to the well-known Yale Shooting scenario. To overcome this difficulty, we propose to incorporate causality by treating the fact that an action is qualified as an ordinary fluent, i.e., a proposition which may change its truth value in the course of time by potentially being (indirectly) affected by the execution of actions. Abnormal disqualifications are then initially assumed away, unless there is evidence to the contrary.
Our formal account of the qualification problem includes the proliferation of explanations for surprising disqualifications and also accommodates so-called miraculous disqualifications, which go beyond the agent's explanation capacity. In the second part, we develop a fluent calculus-based encoding of domains that require a proper treatment of abnormal disqualifications. In particular, default rules are employed to account for the intrinsic nonmonotonicity of the qualification problem. The resulting action calculus is proved correct with respect to our formal characterization of the qualification problem.

Keywords: temporal reasoning, qualification problem, causality, nonmonotonic reasoning. ----- File: 1996/tr-96-027 Fractal Behavior of Video and Data Traffic Wolfgang Frohberg tr-96-027 July 1996 A fractal is a function or a process in which an identical motif repeats itself on an ever diminishing scale. The motif of a fractal can be a feature influenced by chance. Fractals can be found everywhere in nature; for instance, the surface of the moon is a fractal, where the motif of craters is repeated on scales from inches to miles. It is created by random collisions with space objects. Fractals are also called self-similar, because they show the same picture when viewed at different scales. Fractals can be found in the load profile of data and video traffic, too. Fractal behavior has serious consequences for the modeling, design and operation of packet switched networks like ATM. These are: 1) no smoothing effect when traffic is multiplexed, and 2) unpredictable burst lengths. This leads to difficulties in buffer dimensioning and in traffic control schemes. Understanding and modeling the fractal behavior is a new research challenge. More knowledge is needed to understand the reasons for the fractal properties and to model them, so that networks, services and even applications can be designed with these properties in mind. There are several methods to detect fractal properties of data and video traffic. One of them, the so-called pox diagram, will be applied. We will show results achieved by applying this approach to measured video traffic. Additionally, results of other measurements in data networks and in the Internet will be presented. ----- File: 1996/tr-96-028 Computability of String Functions Over Algebraic Structures ( Preliminary Version ) Armin Hemmerling tr-96-028 August 1996 We present a model of computation for string functions over single--sorted, total algebraic structures and study some features of a general theory of computability within this framework. Our concept generalizes the Blum--Shub--Smale setting of computability over the reals and other rings. By dealing with strings of arbitrary length instead of tuples of fixed length, some suppositions of deeper results within former approaches to generalized recursion theory become superfluous. Moreover, this gives the basis for introducing computational complexity in a BSS--like manner.

Relationships both to classical computability and to Friedman's concept of eds computability are established. Two kinds of nondeterminism as well as several variants of recognizability are investigated with respect to their dependencies on each other and on properties of the underlying structures. For structures of finite signatures, there are universal programs with the usual characteristics. In the general case (of not necessarily finite signature), the existence of universal functions is equivalent to the effective encodability of the structures, whereas the existence of m--complete sets turns out to be independent of those properties. ----- File: 1996/tr-96-029 JAM: A Java Toolkit for Traffic Analyzing and Monitoring Andreas März, Roya Ulrich tr-96-029 August, 1996 Providing reliable multimedia services requires considerable effort with currently available hardware and software. A major difficulty is coping with changing quality-of-service parameters. The network as well as the operating system can handle these requirements and share resources optimally among several active multimedia applications only if proper information about traffic characteristics is available. Traffic characteristics also help to improve application performance in terms of execution time and required resources. Therefore, monitoring traffic is an essential step to support performance management in any network. However, because of the dynamic traffic behavior, on-line monitoring and on-line analysis of these values become even more important in real-time communication.

In this technical report, a toolkit called JAM (Java toolkit for traffic analyzing and monitoring) is introduced. JAM allows the user to configure a multimedia conference, collects performance statistics for different protocol layers, and provides a graphical user interface for on-line visualization of statistics gained from a running multimedia session. ----- File: 1996/tr-96-030 Generalized Thermography: Algorithms, Implementation, and Application to Go Endgames Martin Müller, Elwyn Berlekamp and Bill Spight tr-96-030 October 1996 Thermography is a powerful method for analyzing combinatorial games. It has been extended by Berlekamp to games that contain loops in their game graph. We survey the main ideas of this method and discuss how it applies to Go endgames. After a brief review of the methodology, we develop an algorithm for generalized thermography and describe its implementation. To illustrate the power and scope of the resulting program, we give an extensive catalog of examples of ko positions and their thermographs. We introduce a new method related to thermography for analyzing ko in the context of a specific ko threat situation. We comment on some well-known Go techniques, terminology, and ``exotic'' Go positions from a thermography point of view. Our analysis shows that a framework based on generalized thermography can be useful for the opening and midgame as well. We suggest that such a framework will serve as the basis for future strong Go programs. Part 2 is found in: tr-96-030b.ps.gz ----- File: 1996/tr-96-031 Reasoning about Sets via Atomic Decomposition Hans Jürgen Ohlbach and Jana Koehler tr-96-031 August 1996 We introduce a new technique that translates cardinality information about finite sets into simple arithmetic terms and thereby enables a system to reason about such set cardinalities by solving arithmetic equation problems.

The atomic decomposition technique separates a collection of sets into mutually disjoint smallest components (``atoms'') such that the cardinality of each set is just the sum of the cardinalities of its atoms.
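
A minimal sketch of the decomposition itself (Python; names are illustrative, not from the report): group the elements of the universe by their membership signature, so that each group is one atom:

    from collections import defaultdict

    def atoms(sets):
        # sets: dict mapping set name -> finite set of elements.
        universe = set().union(*sets.values())
        groups = defaultdict(set)
        for x in universe:
            # The signature records exactly which named sets contain x;
            # elements with equal signatures form one atom.
            signature = frozenset(name for name, s in sets.items() if x in s)
            groups[signature].add(x)
        return groups

    S = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {4, 5, 6}}
    decomposition = atoms(S)
    for sig, atom in decomposition.items():
        print(sorted(sig), "->", sorted(atom))
    # Check: |A| equals the sum over atoms whose signature contains "A".
    assert len(S["A"]) == sum(len(a) for sig, a in decomposition.items() if "A" in sig)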

With this idea it is possible to have languages combining arithmetic formulae with set terms, and to translate the formulae of this combined logic into pure arithmetical formulae.

As a particular application we show how this technique yields new inference procedures for concept languages with so-called number restriction operators.

Keywords: concept languages, description logics, number restrictions, arithmetic reasoning ----- File: 1996/tr-96-032 A Simple Approximation Algorithm in $\Z[e^{2\pi i/8}]$ M. A. Shokrollahi and V. Stemann tr-96-032 August 1996 We describe a very simple and efficient new algorithm for the approximation of complex numbers by algebraic integers in $\Z[e^{2\pi i/8}]$ whose coefficients with respect to the usual basis are bounded in absolute value by a given integer $M$. Its main idea is the use of a novel signature technique. An important application is the reduction of dynamic range requirements for residue number system implementations of the discrete Fourier transform. The algorithm uses at most $10 \log(M)$ arithmetic steps and $2.4\log(M)$ additional memory. It yields approximations within a distance of at most $3.42/M$. Several examples are included which show that the algorithm is very fast in practice. For instance, 50000 complex approximations take less than 0.7 seconds on a SPARC-5.
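
For comparison, a naive exhaustive-search baseline (Python) makes the approximation task concrete; this is not the report's signature-technique algorithm, whose cost is logarithmic in $M$, but it computes the same kind of bounded-coefficient approximation for small $M$:

    import cmath, itertools

    def approx_z8(z, M):
        # Approximate z by a + b*w + c*w^2 + d*w^3, w = exp(2*pi*i/8),
        # with integer coefficients in [-M, M], by brute force.
        w = cmath.exp(2j * cmath.pi / 8)
        basis = [w**k for k in range(4)]
        best, best_err = None, float("inf")
        for coeffs in itertools.product(range(-M, M + 1), repeat=4):
            v = sum(c * b for c, b in zip(coeffs, basis))
            err = abs(z - v)
            if err < best_err:
                best, best_err = coeffs, err
        return best, best_err

    coeffs, err = approx_z8(0.3 + 0.7j, M=3)
    print(coeffs, err)   # the error is already far below 1 for M=3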

Keywords: Fast Fourier transforms, cyclotomic fields, continued fractions. ----- File: 1996/tr-96-033 Approximation of Complex Numbers by Cyclotomic Integers M. A. Shokrollahi and V. Stemann tr-96-033 August 1996 We present a new method of approximating complex numbers by cyclotomic integers in $\Z[e^{2\pi i/2^n}]$ whose coefficients with respect to the basis given by powers of $e^{2\pi i/2^n}$ are bounded in absolute value by a given integer $M$. It has been suggested by Cozzens and Finkelstein~\cite{cofi:85} that such approximations reduce the dynamic range requirements of the discrete Fourier transform. For fixed $n$ our algorithm gives approximations with an error of $O(1/M^{2^{n-2}-1})$. This proves a heuristic formula of Cozzens and Finkelstein. We will also prove a matching lower bound for the worst case error of any approximation algorithm and hence show that our algorithm is essentially optimal. Further, we derive a slightly different and more efficient algorithm for approximation by $16$th roots of unity. The basic ingredients of our algorithm are the explicit Galois theory of cyclotomic fields as well as cyclotomic units. We use a deep number theoretic property of these units related to the class number of the field. Various examples and running times for this case and that of approximation by $32$nd roots of unity are included. Finally, we derive the algebraic and analytic foundations for the generalization of our results to arbitrary algebraic number fields.

Keywords: Discrete Fourier transform, cyclotomic fields, cyclotomic units, complex approximation, integer linear programming. ----- File: 1996/tr-96-034 On the Representative Power of Commented Markov Models Reinhard Blasig and Gerald Fahner tr-96-034 August 1996 A CMM (Commented Markov Model) is a learning algorithm to model and extrapolate discrete sequences. The learning involves the inference of {\em objects}, {\em variables} and {\em classes} describing the sequences. In this paper, all sequences considered will be character sequences. As pointed out in an earlier paper [2], the structures utilized by CMM are powerful enough to represent and evaluate any {\em primitive recursive function}. This paper will provide a formal proof of this claim. We will therefore concentrate on the issues of representation and leave the issues of CMM induction aside. ----- File: 1996/tr-96-035 The Syllable Re-revisited Alfred Hauenstein tr-96-035 August 1996 In this report an approach to speech recognition using syllables as basic modelling units is compared to a state-of-the-art system employing phonemes. The technological framework is ICSI's hybrid HMM-ANN recognition system applied to small to medium vocabulary recognition tasks.

Although the number of units to be classified nearly doubles, it is shown that the syllable can outperform the phoneme slightly but significantly in terms of unit classification capability, measured as frame error rate. Comparing the overall system performance (measured in word error rate), the phoneme-based system still performs clearly better for continuous speech tasks, while the syllable-based system is superior for isolated word recognition tasks on cross-database tests. This suggests the need for further work on the understanding of the interaction of knowledge sources at the frame, word, and sentence levels in current recognition systems.

Keywords: speech recognition, hybrid HMM-ANN classification, syllable ----- File: 1996/tr-96-036 Adaptive Load Sharing based on a Broker Module M. Avvenuti, L. Rizzo, and L. Vicisano tr-96-036 August 1996 This paper describes a dynamic, symmetrically-initiated load sharing scheme which adapts to changing load conditions by varying the algorithm's dependency on system status information. The scheme is hybrid in that it relies on a fully distributed algorithm when the system is heavily loaded, but resorts to a centrally coordinated location policy when parts of the system become idle. The simplicity of the algorithms proposed makes it possible to use a centralized component without incurring scalability problems or instabilities. Both algorithms are very lightweight and do not need any tuning of parameters, so that they are extremely easy to implement, to the point that an inexpensive hardware implementation of the centralized component is capable of handling millions of requests per second. Simulations show that the hybrid approach outperforms existing dynamic algorithms under all load conditions and task generation patterns, is weakly sensitive to processing overhead and communication delays, and scales well (to hundreds of nodes) despite the use of a centralized component.
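
A toy sketch of such a hybrid location policy (Python; all names and details are hypothetical, and the paper's algorithms differ): consult a central broker of idle nodes first, and fall back to a distributed policy, here random probing, when none is registered:

    import random

    class Broker:
        # Central registry of idle nodes, consulted before any probing.
        def __init__(self):
            self.idle = set()
        def report_idle(self, node):
            self.idle.add(node)
        def report_busy(self, node):
            self.idle.discard(node)
        def get_idle(self):
            return self.idle.pop() if self.idle else None

    def place_task(queue_len, broker, probes=3, rng=random):
        # queue_len: dict mapping node -> current queue length.
        target = broker.get_idle()          # centralized fast path
        if target is not None:
            return target
        sample = rng.sample(list(queue_len), probes)   # distributed fallback
        return min(sample, key=queue_len.get)

    nodes = {"n1": 4, "n2": 0, "n3": 2}
    broker = Broker(); broker.report_idle("n2")
    print(place_task(nodes, broker))   # -> "n2", found via the broker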

Keywords: distributed systems, resource management, load sharing, adaptive algorithms, simulation, performance evaluation. ----- File: 1996/tr-96-037 An Analysis of the Divergence of Two Sather Dialects David Stoutamire, Wolf Zimmermann, and Martin Trapp tr-96-037 August 1996 Sather is an object oriented language designed to be simple, efficient, safe, and non-proprietary. It was originally envisioned as a ``cleaned-up'' version of Eiffel, addressing perceived failures in simplicity and efficiency. The first public implementation (Sather 0) was released by ICSI in 1991. Shortly after, a compiler group at the University of Karlsruhe created the first native code compiler. A major effort then began to redesign the language to correct shortcomings in Sather 0 and to make Sather suitable for general-purpose, large scale programming. In part because each compiler group was building a compiler for a moving design target, the two parallel efforts resulted in two dialects, Sather 1 and Sather K. This report analyzes the essential causes of the differences, which result from differences in each group's goals. ----- File: 1996/tr-96-039 System Design by Constraint Adaptation and Differential Evolution Rainer M. Storn tr-96-039 November 1996 A simple optimization procedure for constraint-based problems which works without an objective function is described. The absence of an objective function makes the problem formulation particularly simple. The new method lends itself to parallel computation and is well suited for tasks where a family of solutions is required, trade-off situations have to be dealt with, or the design center has to be found.
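
For reference, a minimal sketch of the classic DE/rand/1/bin scheme (Python): the report's constraint-adaptation procedure dispenses with the objective function, so this shows only the underlying population machinery, with illustrative parameter values:

    import random

    def de(f, bounds, np_=20, F=0.8, CR=0.9, gens=200, rng=random):
        # Classic Differential Evolution: mutate with a scaled difference
        # vector, crossover binomially, keep the trial if it is no worse.
        dim = len(bounds)
        pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(np_)]
        cost = [f(x) for x in pop]
        for _ in range(gens):
            for i in range(np_):
                a, b, c = rng.sample([j for j in range(np_) if j != i], 3)
                jrand = rng.randrange(dim)   # at least one mutated coordinate
                trial = [pop[a][k] + F * (pop[b][k] - pop[c][k])
                         if (rng.random() < CR or k == jrand) else pop[i][k]
                         for k in range(dim)]
                fc = f(trial)
                if fc <= cost[i]:
                    pop[i], cost[i] = trial, fc
        best = min(range(np_), key=cost.__getitem__)
        return pop[best], cost[best]

    x, fx = de(lambda v: sum(t * t for t in v), [(-5, 5)] * 3)
    print(x, fx)   # converges close to the origin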

Keywords: optimization, Monte Carlo techniques, design centering, Differential Evolution. ----- File: 1996/tr-96-040 A Cooperative Multimedia Environment with QOS Control: Architectural and Implementation Issues Marco Alfano and Nikolaos Radouniklis tr-96-040 September 1996 A cooperative multimedia environment allows users to work remotely on common projects by sharing applications (e.g., CAD tools, text editors, white boards) and simultaneously communicating audiovisually. Several dedicated applications (e.g., MBone tools) exist for transmitting video, audio and data between users. Because they have been developed for the Internet, which does not provide any Quality of Service (QoS) guarantees, these applications support user specification of QoS requirements only partially or not at all. In addition, they all come with different user interfaces.

We have developed a Cooperative Multimedia Environment (CME) made up of Cooperative Multimedia Applications (COMMA), one for each user. A COMMA presents a user with a single interface that allows him to invite other users to a cooperative session, select the media services to be used in the session, and specify his Quality of Service (QoS) requirements for the media services throughout the session.

In this work, we describe the architectural details of the CME and its components with particular emphasis on the QoS mapping and control mechanisms. We also present the design and implementation details of an experimental prototype that provides video, audio and white board services.
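
To give a flavor of what QoS mapping means here, consider this toy sketch (Python; the quality classes and parameter values are made up for illustration, not taken from the report): a user-level quality choice is mapped to media parameters and degraded gracefully when bandwidth is short:

    # Hypothetical user-level classes mapped to media-level parameters.
    USER_QOS = {
        "high":   {"video_fps": 25, "video_kbps": 1500, "audio_khz": 44.1},
        "medium": {"video_fps": 15, "video_kbps": 500,  "audio_khz": 22.05},
        "low":    {"video_fps": 5,  "video_kbps": 128,  "audio_khz": 8.0},
    }

    def map_qos(user_level, available_kbps):
        # Pick the requested level, degrading until it fits the link.
        order = ["high", "medium", "low"]
        for level in order[order.index(user_level):]:
            if USER_QOS[level]["video_kbps"] <= available_kbps:
                return level, USER_QOS[level]
        return "low", USER_QOS["low"]

    print(map_qos("high", available_kbps=600))   # degrades to "medium"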

Keywords: Cooperative Multimedia Environment, Quality of Service, QoS, Multimedia Applications, Distributed Systems, Real Time. ----- File: 1996/tr-96-041 Design and Implementation of a Web-based Tool for ATM Connection Management Martin Bernhardt tr-96-041 August 1996 At the International Computer Science Institute (ICSI), there is an ongoing effort to gain experience with ATM and multimedia applications. ICSI is participating in two ATM pilots called Bay Area Gigabit Network (BAGNet) and Multimedia Applications on Intercontinental Highway (MAY). Besides these wide-area trials, ICSI's ATM network is used for local multimedia experiments. The ATM environment at ICSI is heterogeneous. Both local and long-distance traffic are based on permanent virtual connections. The management of this environment has often been cumbersome and time-consuming for a number of reasons: The ATM devices have to be accessed separately in an unintegrated manner. Different vendor-specific tools with different user interfaces are used. Configuration data is stored in an unstructured, redundant and decentralized way. Users cannot set up or verify a connection without knowing device-specific details. Hence, the need arose for a software tool that can minimize the administrative work spent on connection management. This technical report contains my master's thesis, which is about the design and implementation of TOMCAD - a tool for monitoring and configuration of ATM devices. Being a web-based software tool, it can support local and wide-area connectivity and provide access for local and remote users.

Keywords: TOMCAD, ATM, connection management, PVC, Web, Internet ----- File: 1996/tr-96-042 Efficient Oblivious Parallel Sorting on the MasPar MP-1 Klaus Brockmann, Rolf Wanka tr-96-042 September 1996 We address the problem of sorting a large number N of keys on a MasPar MP-1 parallel SIMD machine of moderate size P where the processing elements (PEs) are interconnected as a toroidal mesh and have 16KB local storage each. We present a comparative study of implementations of the following deterministic oblivious sorting methods: Bitonic Sort, Odd-Even Merge Sort, and FastSort. We successfully use the guarded split&merge operation introduced by Rueb. The experiments and investigations in a simple, parameterized, analytical model show that, with this operation, from a certain ratio N/P upwards both Odd-Even Merge Sort and FastSort become faster on average than the previously fastest, sophisticated implementation of Bitonic Sort by Prins. Though it is not as efficient as Odd-Even Merge Sort, FastSort is to our knowledge the first method specially tailored to the mesh architecture that can be, when implemented, competitive on average with a mesh-adaptation of Bitonic Sort for large N/P. ----- File: 1996/tr-96-043 Multidimensional Access Methods Volker Gaede, Oliver Günther tr-96-043 October 1996 Search operations in databases require some special support at the physical level. This is true for conventional databases as well as for spatial databases, where typical search operations include the point query (find all objects that contain a given search point) and the region query (find all objects that overlap a given search region). More than ten years of spatial database research have resulted in a great variety of multidimensional access methods to support such operations. This paper gives an overview of that work. After a brief survey of spatial data management in general, we first present the class of point access methods, which are used to search sets of points in two or more dimensions. The second part of the paper is devoted to spatial access methods, which are able to manage extended objects (such as rectangles or polyhedra). We conclude with a discussion of theoretical and experimental results concerning the relative performance of the various approaches.

Keywords: multidimensional access methods, data structures, spatial databases ----- File: 1996/tr-96-044 MMM: A WWW-Based Method Management System for Using Software Modules Remotely Oliver Günther, Rudolf Müller, Peter Schmidt, Hemant Bhargava, Ramayya Krishnan tr-96-044 October 1996 The World Wide Web has been highly successful as a tool for the distributed publishing and sharing of online documents among large dispersed groups. This raises the question whether the distributed authoring and execution of software modules can be supported in a similar manner. We study this problem by first developing the requirements of a group of developers and users of statistical software at a German national research laboratory. We then propose an information system design that meets these requirements and report on MMM, a prototype implementation. ----- File: 1996/tr-96-045 Coevolutionary Game-Theoretic Multi-Agent Systems: the Application to Mapping and Scheduling Problems Franciszek Seredynski tr-96-045 October 1996 Multi-agent systems based on iterated, noncooperative N-person games with limited interaction are considered. Each player in the game has a payoff function and a set of actions. While each player acts to maximise his payoff, we are interested in the global behavior of the team of players, measured by the average payoff received by the team. To evolve a global behavior in the system, we propose two coevolutionary schemes with evaluation of only local fitness functions. The first scheme we call loosely coupled genetic algorithms, and the second one loosely coupled classifier systems. We present simulation results which indicate that the global behavior in both systems evolves, and is achieved only by a local cooperation between players acting without global information about the system. The models of coevolutionary multi-agent systems are applied to develop parallel and distributed algorithms for dynamic mapping and scheduling of tasks in parallel computers. ----- File: 1996/tr-96-046 Echo Cancellation Techniques for Multimedia Applications - A Survey Rainer M. Storn tr-96-046 November 1996 The problem of acoustical echo in a headset-free full duplex communication environment is explained and the potential solutions are sketched. The different methods for acoustic echo cancellation (AEC) via adaptive filters are outlined and their suitability for a 16-bit fixed point implementation on a digital signal processor (DSP) is evaluated. The current prototype for the ICSI Acoustic Echo Canceller (IAEC), which uses an allpass-based subband adaptive approach, is introduced and directions for future work are given.
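
As background, a textbook NLMS echo canceller fits in a few lines (Python/NumPy; a full-band sketch for illustration, not the allpass-based subband IAEC described above): model the room echo path with an adaptive FIR filter and subtract the estimated echo from the microphone signal:

    import numpy as np

    def nlms_aec(far_end, mic, taps=64, mu=0.5, eps=1e-6):
        w = np.zeros(taps)                # adaptive FIR estimate of the echo path
        x = np.zeros(taps)                # delay line of far-end samples
        out = np.zeros_like(mic)
        for n in range(len(mic)):
            x = np.roll(x, 1); x[0] = far_end[n]
            echo_hat = w @ x
            e = mic[n] - echo_hat         # residual = near-end speech + estimation error
            w += mu * e * x / (x @ x + eps)   # normalized LMS update
            out[n] = e
        return out

    rng = np.random.default_rng(0)
    far = rng.standard_normal(5000)
    room = np.zeros(64); room[[3, 10, 25]] = [0.6, 0.3, 0.1]   # toy echo path
    mic = np.convolve(far, room)[:5000]                        # pure echo, no near speech
    res = nlms_aec(far, mic)
    print(float(np.mean(res[-1000:] ** 2)))   # residual echo power, near zero after convergence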

Keywords: Full duplex communication, acoustical echo, adaptive filters, echo cancellation. ----- File: 1996/tr-96-047 Interactive Proof Systems with Public Coin: Lower Space Bounds and Hierarchies of Complexity Classes Maciej Liskiewicz tr-96-047 November 1996 This paper studies small space-bounded interactive proof systems (IPSs) using public coin tosses, i.e. Turing machines with both nondeterministic and probabilistic states, that work with a bounded number of rounds of interaction. For this model of computation, new impossibility results are shown. As a consequence we prove that for sublogarithmic space bounds, IPSs working in $k$ rounds are less powerful than systems with $2k^{k-1}$ rounds of interaction. It is well known that such a property does not hold for polynomial time bounds. Babai showed that in this case any constant number of rounds can be reduced to 2 rounds. ----- File: 1996/tr-96-048 Transmission of multimedia data over lossy networks (Thesis) Martin Isenburg tr-96-048 August 1996 This thesis addresses quality-oriented improvements for multimedia connections over packet switched and lossy networks. The problems involved in establishing real-time communication over networks such as the Internet have been investigated, and the definite network characteristics that cause these problems have been clearly marked out. The quality of audio communication essentially depends on the number of packets lost and on the variation in packet arrival times. Efficient mechanisms to minimize the impact of delay jitter have already been proposed in the literature, whereas dealing with packet loss remains an active research area. The measurements of the packet loss rate for audio streams over the Internet showed that the number of consecutively lost packets is usually small. This showed that open loop mechanisms that add redundancy on the sending side are suited to cope with the loss of information. We presented two transmission concepts that overcome these network limitations using forward error correction schemes. The `piggyback protected transmission` was introduced - a resilient scheme that has already shown its usefulness in improving full duplex audio communication. The `priority encoded transmission`, which had never been applied to audio streams before, was examined for its capability in protecting the transmission of audio data over lossy networks. We showed that for time critical point-to-point communication the comparatively simple `piggyback protected transmission` is a better choice than `priority encoded transmission`. In a broadcast scenario, on the other hand, where large delays are acceptable, the PET approach will yield better results because of its robustness against long packet loss periods and its capacity to transmit to receivers with widely different network bandwidth. In order to apply the `priority encoded transmission` to audio streams, it was necessary to develop a layered audio encoding scheme. A major part of the thesis is concerned with discussing and analyzing different transformations of an audio signal with respect to time and frequency. Finally we are able to present an audio codec that we have developed from scratch and that yields a compressed and layered representation of the audio signal. In contrast to common standard codecs this encoding scheme is well suited to work together with PET. Furthermore we demonstrated how our new encoding scheme improves the performance of the `piggyback protected transmission`.
By reducing the redundancy within the redundant information, better audio quality can be achieved in the case of isolated packet losses. ----- File: 1996/tr-96-049 Metadata in Geographic and Environmental Data Management Oliver Günther, Agnes Voisard tr-96-049 November 1996 Metadata is used increasingly in geographic and environmental information systems to improve both the availability and the quality of the information delivered. The growing popularity of Internet-based data servers has accelerated this trend even further. In this chapter we give an overview of metadata schemes and implementations that are common in this domain. Case studies include the Content Standards for Digital Geospatial Metadata of the U.S. Federal Geographic Data Committee (FGDC), and the Catalogue of Data Sources (CDS) of the European Environmental Agency. Another activity that we will discuss in somewhat greater detail concerns the UDK project, an international software engineering effort to facilitate access to environmental data. The UDK (Environmental Data Catalogue) is a public meta information system and navigation tool that helps users to identify and retrieve environmental data from the government and other sources. In 1995, first versions of the UDK were made available in Austria and Germany; several other European countries are currently evaluating the system. We will present the UDK data model, its implementation as a distributed information system, and its integration into the World Wide Web. To appear in: W. Klas and A. Sheth (eds.), Managing Multimedia Data: Using Metadata to Integrate and Apply Digital Data, McGraw Hill, 1997. ----- File: 1996/tr-96-050 Randomized $\mathbf{\Omega (n^2)}$ Lower Bound for Knapsack Dima Grigoriev, Marek Karpinski tr-96-050 November 1996 We prove an $\Omega (n^2)$ complexity \emph{lower bound} for the general model of \emph{randomized computation trees} solving the \emph{Knapsack Problem}, and more generally \emph{Restricted Integer Programming}. This is the \emph{first nontrivial} lower bound proven for this model of computation. The method of the proof depends crucially on a new technique for proving lower bounds on the \emph{border complexity} of a polynomial, which could be of independent interest. ----- File: 1996/tr-96-051 The Complexity of Two-Dimensional Compressed Pattern Matching Piotr Berman, Marek Karpinski, Lawrence Larmore, Wojciech Plandowski, Wojciech Rytter tr-96-051 December 1996 We study the computational complexity of two-dimensional compressed pattern matching problems. Among other things, we design an efficient randomized algorithm for the equality problem of two compressed two-dimensional patterns as well as prove the computational {\em hardness} of general two-dimensional compressed pattern matching. ----- File: 1996/tr-96-052 Optimal Trade-Offs Between Size and Slowdown for Universal Parallel Networks Friedhelm Meyer auf der Heide, Martin Storch, and Rolf Wanka tr-96-052 December 1996 A parallel processor network is called $n$-universal with slowdown $s$ if it can simulate each computation of each constant-degree processor network with $n$ processors with slowdown $s$. We prove the following lower bound trade-off: For each constant-degree $n$-universal network of size $m$ with slowdown $s$, $m \cdot s = \Omega(n \log m)$ holds. Our trade-off holds for a very general model of simulations. It covers all previously considered models and all known techniques for simulations among networks.
For $m \geq n$, this improves a previous lower bound by a factor of $\log\log n$, proved for a weaker simulation model. For m

This class contains for example algebraically and real closed fields, Henselian fields (e.g. the p-adic numbers and power series fields), PAC-fields (i.e. pseudo algebraically closed fields), PRC-fields and PpC-fields (of characteristic 0). Further structural properties of \ek are studied. ----- File: 1996/tr-96-056 Torrent Architecture Manual Krste Asanovic and David Johnson tr-96-056 December 1996 This manual contains the specification of the Torrent Instruction Set Architecture (ISA). Torrent is a vector ISA designed for digital signal processing applications. Torrent is based on the 32-bit MIPS-II ISA, and this manual is intended to be read as a supplement to the book "MIPS RISC Architecture" by Kane and Heinrich. Torrent is the ISA of the T0 vector microprocessor which is described in the separate "T0 Engineering Data" technical report.

Keywords: Torrent, T0, Vector Microprocessor ----- File: 1996/tr-96-057 T0 Engineering Data Krste Asanovic and James Beck tr-96-057 December 1996 T0 (Torrent-0) is a single-chip fixed-point vector microprocessor designed for multimedia, human-interface, neural network, and other digital signal processing tasks. T0 includes a MIPS-II compatible 32-bit integer RISC core, a 1 Kbyte instruction cache, a high performance fixed-point vector coprocessor, a 128-bit wide external memory interface, and a byte-serial host interface. T0 implements the Torrent ISA described in a separate "Torrent Architecture Manual" technical report. This manual contains detailed information on the T0 vector microprocessor, including information required to build T0 into a system, instruction execution timings, and information on low level T0 software interfaces required for operating system support.

Keywords: Torrent, T0, Vector microprocessor ----- File: 1996/tr-96-058 Recognition of Handwritten Digits and Human Faces by Convolutional Neural Networks Claus Neubauer tr-96-058 December 1996 Convolutional neural networks provide an efficient method to constrain the complexity of feedforward neural networks by weight sharing. In this paper two variations of convolutional networks - Neocognitron and Neoperceptron - are compared with classifiers based on fully connected feedforward layers (i.e. multilayer perceptron, nearest neighbor classifier, autoencoding network). Besides the original Neocognitron, a modification called Neoperceptron is proposed, which combines perceptron neurons with the localized network structure of the Neocognitron. Instead of error backpropagation, a modular training procedure is applied in this work, whereby layers are trained sequentially from the input to the output layer in order to recognize features of increasing complexity.
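
The weight-sharing idea can be stated in a few lines (Python/NumPy; illustrative only, not the Neocognitron/Neoperceptron implementation): one small kernel is swept over every image position, so the number of free parameters is independent of the image size:

    import numpy as np

    def conv2d_valid(image, kernel):
        # The same kernel weights are reused at every position ("valid"
        # convolution, i.e. no padding), instead of one weight per connection.
        kh, kw = kernel.shape
        H, W = image.shape
        out = np.empty((H - kh + 1, W - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
        return out

    img = np.zeros((8, 8)); img[2:6, 3] = 1.0       # a vertical bar
    vertical_edge = np.array([[1., -1.]] * 3)        # 3x2 shared kernel
    response = conv2d_valid(img, vertical_edge)
    print(np.argwhere(response > 2))                 # fires along the bar only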

For a quantitative experimental comparison with standard classifiers two recognition tasks have been chosen: handwritten digit recognition and face recognition. In the first example, on handwritten digit recognition, the generalization ability of convolutional networks is compared to that of fully connected networks. In several experiments the influence of variations of position, size and orientation of digits is determined, and the relation between training sample size and validation error is observed. In the second example, recognition of human faces is investigated under constrained and variable conditions with respect to face orientation and illumination, and the limitations of convolutional networks are discussed. ----- File: 1996/tr-96-059 Approximating Dense Cases of Covering Problems Marek Karpinski and Alexander Zelikovsky tr-96-059 December 1996 We study dense cases of several covering problems. An instance of the set cover problem with m sets is dense if there is e > 0 such that any element belongs to at least em sets. We show that the dense set cover problem can be approximated with the performance ratio c log n for any c > 0 and it is unlikely to be NP-hard. We construct a polynomial-time approximation scheme for the dense Steiner tree problem in n-vertex graphs, i.e. for the case when each terminal is adjacent to at least en vertices. We also study the vertex cover problem in e-dense graphs. Though this problem is shown to be still MAX-SNP-hard as in general graphs, we find a better approximation algorithm with the performance ratio ***. The superdense cases of all these problems are shown to be solvable in polynomial time. ----- File: 1997/tr-97-001 A Modular Analysis of Network Transmission Protocols Micah Adler, Yair Bartal, John W. Byers, Mike Luby and Danny Raz tr-97-001 April 1997 We describe a new model for the analysis of data transmission protocols in lossy communication networks. We study the performance of protocols in an adversarial setting where the loss pattern and latencies of packets are determined by an adversary. We advocate the modular decomposition of data transmission protocols into a {\em time scheduling policy}, which determines {\em when} packets are to be sent, and a {\em data selection policy}, which determines {\em what} data is to be placed in each sent packet. We concentrate on the data selection policy and require that the protocol achieve high bandwidth utilization in transmitting any prefix of the transmitted message. The simple and universal data selection policy we introduce is provably close to optimal in the following sense: For {\em any} time scheduling policy and {\em any} network behavior, in the worst case prefix measure our data selection policy performs as well as any other data selection policy up to a constant additive term. Our explicit modular decomposition of a transmission protocol into two policies should be contrasted with existing network protocols such as TCP/IP. Our result shows that the overall transmission protocol would not degrade in performance (and could improve dramatically) if it used our universal data selection policy in place of its own. We therefore reduce the problem of designing a data transmission protocol to the task of designing a time scheduling policy. ----- File: 1997/tr-97-002 The Spectro-Microscopy Electronic Notebook Sonia R. Sachs, Carla M. Dal Sasso Freitas, Victor Markowitz, Anna Talis, I-Min A. Chen, Ernest Szeto and Harumi A.
Kuno tr-97-002 January 1997 This paper gives an overview of the Electronic Notebook for the Spectro-Microscopy Collaboratory at the Advanced Light Source Beamline 7 (ALS-BL7). The Spectro-Microscopy Collaboratory project has the goal of using current network and video-conferencing technology to provide remote access to the facilities at ALS-BL7. The Electronic Notebook is a tool that allows physicists accessing the ALS-BL7 facilities to store and retrieve all information generated as they collaborate to run experiments. The Electronic Notebook replaces a multiplicity of manual and automated procedures currently used for storage/retrieval of data associated with experiments at the ALS-BL7. In addition, the Electronic Notebook offers new and powerful capabilities, while providing users with a homogeneous user interface to various tools. This paper outlines the architectural design of the Electronic Notebook, and describes its visual interface, which is used to prompt local and remote users to enter information related to their experiments, and provides query and browsing facilities to enable information retrieval. ----- File: 1997/tr-97-003 Exploiting temporal binding to learn relational rules within a connectionist network Lokendra Shastri tr-97-003 May 1997 Rules encoded by traditional rule-based systems are brittle and inflexible because it is difficult to specify the precise conditions under which a rule should fire. If the conditions are made too specific, a rule does not always fire when it should. If the conditions are made too general, the rule fires even when it should not. In contrast, connectionist networks are considered to be capable of learning soft and robust rules. Work in connectionist learning, however, has focused primarily on classification and feature formation, and the problem of learning rules involving relations and roles (variables) has received relatively little attention. We present a simple demonstration of rule learning involving relations and variables within a connectionist network. The network learns the appropriate correspondence between roles of antecedent and consequent relations as well as the features that role fillers must possess for a rule to be applicable in a given situation. Each rule can be viewed as a mapping from the symbolic level to the symbolic level mediated by a semantic filter embedded within a subsymbolic level. The network uses synchronous firing of nodes to express dynamic bindings.
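
The synchrony idea can be caricatured in a few lines (Python; a static stand-in for the report's spiking dynamics, with hypothetical names): entities own firing phases, and a role is bound to whichever entity's phase it fires in:

    phase = {"John": 0, "book": 1, "Mary": 2}   # one phase slot per entity

    # give(giver, object, recipient) with roles bound by shared phase.
    give = {"giver": phase["John"], "object": phase["book"],
            "recipient": phase["Mary"]}

    def fillers(binding, phase_map):
        # Recover role fillers by inverting the phase assignment.
        inv = {v: k for k, v in phase_map.items()}
        return {role: inv[p] for role, p in binding.items()}

    # A rule like give(x, y, z) -> own(z, y) just copies phases
    # between antecedent and consequent roles:
    own = {"owner": give["recipient"], "owned": give["object"]}
    print(fillers(own, phase))   # {'owner': 'Mary', 'owned': 'book'}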

Key Words: learning; rules; first-order rules; bindings; synchrony; relational rules. ----- File: 1997/tr-97-004 Protocol Enhancement and Compression for X-Based Application Sharing Martin Mauve tr-97-004 February 1997 Application sharing is a technology which allows two or more users located at geographically different places to synchronously work with an unmodified single-user application. To make this technology available to the network-based X Window System, several different software products have been developed. All of them use a protocol similar to the X Window System protocol X11 to display the output of a single-user application on more than one screen and to receive responses from more than one user. However, this protocol was designed to be run over a fast LAN. Used over a high-latency or a low-bandwidth connection, it leads to serious delays and loss of interactivity. While there have been some efforts to make the X11 protocol more suitable for those scenarios, none of them have been integrated into application-sharing software. The objectives of this work are to review existing techniques for enhancement and compression of the X11 protocol, to prove that those techniques can be integrated into application sharing products by providing a prototype integration, and to identify areas of future work. It will be shown that the caching and compression techniques of the prototype integration reduce the synchronicity of application sharing products by up to 74%, and the amount of sent data by an average of 70%. ----- File: 1997/tr-97-005 Mapping Conceptual Geographic Models onto DBMS Data Models Agnes Voisard and Benoit David tr-97-005 March 1997 We study the representation and manipulation of geographic information in a database management system (DBMS). The conceptual geographic model that we use as a basis hinges on a complex object model, whose set and tuple constructors make it efficient for defining not only collections of geographic objects but also relationships between them. In addition, it allows easy manipulation of non-basic types such as spatial data types. We investigate the mapping of our reference model onto major commercial DBMS models, namely a relational model extended to abstract data types (ADT) and an object-oriented model. Our analysis shows the strengths and limits of the two model types for handling highly structured data with spatial components. ----- File: 1997/tr-97-006 Abstraction and Decomposition in Open GIS Agnes Voisard and Heinz Schweppe tr-97-006 March 1997 With the emergence of distributed computing and the increasing trend towards the reuse of geographic data, a new generation of geographic information systems (GIS) is currently being specified. The key characteristics of these open GIS are modularity and extensibility, and they are composed of existing software systems such as database management systems, traditional GIS, statistics packages and simulation models. They can be defined in terms of generic frameworks which facilitate both information exchange between participating systems and the addition of new functionalities. Even though the idea of defining open GISs is not new, the steps necessary to realize them still have to be made precise. In this report, we propose a layer decomposition for the design of an open GIS. Each layer corresponds to a different level of abstraction, starting with the application or user level down to the invocation of system services.
In addition, each such level can be specified by the same set of concepts: data, operation and session (DOS model). The metadata needed for the interaction between levels is indispensable to achieve openness. We believe that the clear definition of such a framework will greatly facilitate open GIS design. ----- File: 1997/tr-97-007 On-line Load Balancing for Related Machines Piotr Berman, Moses Charikar, Marek Karpinski tr-97-007 January 1997 We consider the problem of scheduling permanent jobs on related machines in an on-line fashion. We design a new algorithm that achieves the competitive ratio of $3+\sqrt{8} \approx 5.828$ for the deterministic version, and $3.31/\ln 2.155 \approx 4.311$ for its randomized variant, improving the previous competitive ratios of 8 and $2e \approx 5.436$. We also prove lower bounds of $2.4380$ on the competitive ratio of deterministic algorithms and $1.8372$ on the competitive ratio of randomized algorithms for this problem. ----- File: 1997/tr-97-008 Case-Based Reasoning: A New Technology for Experience Based Construction of Knowledge Systems K. Althoff, M. Richter and W. Wilke tr-97-008 March 1997 We will discuss the role of case-based reasoning - a new emerging technology that contributes to solving the well-known problems of software maintenance, reuse, and quality improvement by storing, retrieving and adapting similar past cases - in this new light. Case-based reasoning, which has proven its practical importance in a large number of industrial/business applications, is a flexible approach to software development that has overcome the indicated difficulties to a large extent. We will point out in which way case-based reasoning takes up the separation issue by a certain decomposition idea in order to offer a useful flexibility required to adapt software production in a changing world. One important contribution of case-based reasoning technology is that it reduces the "update complexity" to a smaller dimension. We will show for which kinds of application tasks case-based reasoning is more flexible than other approaches, and we will illustrate this using the introduced general structure of a case-based reasoning system. From a software engineering perspective, future research on case-based reasoning will deal with analyzing what the "invariants of case-based reasoning" are. These invariants need to be standardized, as well as the corresponding methods. As a conclusion we will draw attention to some points which seem to be important for future directions in research on and applications of case-based reasoning technology. ----- File: 1997/tr-97-009 Generalized Planning and Information Retrieval Michael M. Richter tr-97-009 March 1997 No abstract available. ----- File: 1997/tr-97-010 Perspectives on the Integration of Fuzzy and Case-Based Reasoning Systems Michael M. Richter tr-97-010 March 1997 We discuss relations and differences between fuzzy and case-based reasoning methods in order to indicate possibilities for future research activities. We interpret the basic concepts of each approach in terms of the other one and discuss the computational methods, in particular from a knowledge engineering point of view. ----- File: 1997/tr-97-011 Multilayered Extended Semantic Networks - The MESNET Paradigm Hermann Helbig tr-97-011 March 1997 Semantic Networks (SN) have been used in many applications, especially in the field of natural language understanding (NLU).
The multilayered extended semantic network MESNET presented in this paper on the one hand follows the tradition of SN starting with the work of Quillian. On the other hand, MESNET for the first time consistently and explicitly makes use of a multilayered structuring of a SN, built upon an orthogonal system of dimensions and especially upon the distinction between an intensional and a preextensional layer. Furthermore, MESNET is based on a comprehensive system of classificatory means (sorts and features) as well as on semantically primitive relations and functions. It uses a relatively large but fixed inventory of representational means, encapsulation of concepts and a distinction between immanent and situative knowledge. The whole complex of representational means is independent of special application domains. At the same time, it is fine grained enough to allow for the differentiation of all important nuances of meaning in the knowledge representation. MESNET has been especially developed for natural language understanding in question answering systems (QAS). A first prototype is successfully used for the meaning representation of natural language expressions in the system LINAS. In this paper, MESNET is presented in its double function as a cognitive model and as the target language for the semantic interpretation processes in NLU systems. ----- File: 1997/tr-97-012 User-friendly Information Retrieval in Data Bases and in the World Wide Web Hermann Helbig tr-97-012 March 1997 The paper describes two methods for realizing user-friendly access to distributed information resources. The first method (Method I) is based on a form-driven dialogue, which is used in the project named "MEDOC". It aims at an experienced user who is familiar with the attribute-value structures of database schemes of typical information retrieval systems (IRS) and who knows the definition of Boolean operators. The second method (Method II), applied in the system LINAS, is from the very beginning oriented towards natural language communication between end-user and IRS. Both methods can be used in an interface between the user and an information brokering system, helping him/her to find an appropriate information provider for his/her demands in networked information systems. Method I gives the user a certain guidance in formulating his/her queries but has a restricted expressive power. It almost never supports the user in automatically finding more complicated descriptional elements, as for instance classificators of a standardized classificational system. Method II, on the other hand, is devoted to the "naive user" having no experience with information retrieval techniques. Allowing for unrestricted natural language input, it is distinguished by a great expressive power and gives valuable support in automatically finding descriptors and classificational categories used in the description of documents. In comparison with Method I, there is less guidance in formulating the user's demands. ----- File: 1997/tr-97-013 Differential Evolution: A Method for Optimization of Real Scheduling Problems Martin Rüttgers tr-97-013 March 1997 A new method for optimizing scheduling problems with nonlinear objective functions and multiple dependent restrictions is presented. This method is based on an Evolutionary Algorithm but has special changing operators for a directed search over the entire solution space.
It can be implemented to solve real problems very quickly, requires only a few control variables, is robust, easy to use, and lends itself very well to parallel computation. The implementation for solving a model representing a real scheduling problem in foundries is presented. This application shows good results, and a comparison with a method based on a stochastic Evolutionary Algorithm, which has a reputation for being very powerful, shows that the new method converges faster and with more certainty. (A small illustrative sketch of the underlying update rule follows after the next entry.) ----- File: 1997/tr-97-014 Parallel Optimizations: Advanced Constructs and Compiler Optimizations for a Parallel, Object-Oriented, Shared Memory Language Running on a Distributed System Claudio Fleiner tr-97-014 April 1997 Today's processors provide more and more processing power, yet there are many applications whose processing demands cannot be met by a single processor in the near future. Moreover, the demand for more processing power seems to increase at least as fast as the speed of new processors, and the only way to complete such calculation-intensive programs is to execute them on many processors at once. The history of parallel computers over the last several years suggests that the distributed, parallel computer model will gain widespread acceptance as the most important one. In this model a computer consists of several nodes, each with its own processors and memory. As such a computer does not offer one global memory space, but rather a separate memory per node (distributed memory), it is no longer possible to directly use the shared memory programming paradigm. However, as it is generally easier to program with shared memory than with message-based communication, several new languages and language extensions that simulate shared memory have been suggested. Such a parallel, distributed language not only has to provide special support for managing parallelism and synchronization; its specification and implementation have to address the issue of distributed memory as well. One of the most important issues is the selection of the memory consistency model, which defines when writes of one node are observed by the other nodes of the distributed computer. Many vital optimizations used by compilers for serial languages are often not possible if the memory model is too restrictive, but a weaker memory model makes the language harder to use. This thesis discusses several problems and solutions for such languages. It uses as an example the language pSather, an object-oriented, parallel language developed at the International Computer Science Institute in Berkeley. A very flexible synchronization construct, including different implementations of it, is introduced; it allows the user to define new synchronization primitives and avoids deadlock and starvation in many common cases. Several memory consistency models and their implications for programmers and the compiler, especially regarding optimizations, are discussed. The effects of several optimizations (adaptations of optimizations used in serial compilers as well as special parallel optimizations) and their implementations are shown and measured using test programs written in pSather. The results clearly indicate that a weaker memory model is necessary to achieve the desired efficiency and speedup, even though usage of the language becomes less convenient. However, pSather offers some constructs that solve some of the problems.
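(Sketch referenced from tr-97-013 above.) The report's title refers to differential evolution, a population-based scheme originally proposed at ICSI by Storn and Price. The Python fragment below is a minimal, illustrative sketch of the classic DE/rand/1/bin update, not the foundry scheduling model of the report; the objective function, bounds, population size, and the control parameters F and CR are illustrative assumptions.

  import random

  def differential_evolution(f, bounds, pop_size=20, F=0.5, CR=0.9, generations=200):
      # Minimal DE/rand/1/bin sketch: perturb each member with a scaled
      # difference of two others, cross over gene-wise, keep the better vector.
      dim = len(bounds)
      pop = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
      cost = [f(x) for x in pop]
      for _ in range(generations):
          for i in range(pop_size):
              a, b, c = random.sample([j for j in range(pop_size) if j != i], 3)
              j_rand = random.randrange(dim)  # ensure at least one mutated component
              trial = [pop[a][j] + F * (pop[b][j] - pop[c][j])
                       if (random.random() < CR or j == j_rand) else pop[i][j]
                       for j in range(dim)]
              trial = [min(max(t, lo), hi) for t, (lo, hi) in zip(trial, bounds)]
              t_cost = f(trial)
              if t_cost <= cost[i]:  # greedy one-to-one selection
                  pop[i], cost[i] = trial, t_cost
      best = min(range(pop_size), key=lambda i: cost[i])
      return pop[best], cost[best]

  # toy usage: minimize the 3-dimensional sphere function
  x_best, f_best = differential_evolution(lambda v: sum(t * t for t in v),
                                          [(-5.0, 5.0)] * 3)

The greedy selection step is what gives the method its "directed" character: a trial vector replaces its parent only if it is at least as good.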
----- File: 1997/tr-97-015 Efficiency of PET and MPEG Encoding for Video Streams: Analytical QoS Evaluations Bernd E. Wolfinger tr-97-015 April 1997 A promising approach in the transmission of video streams via communication networks is to use forward error control in order to mask some of the transmission errors and data losses at the receiving side. The redundancy required to achieve error correction without retransmissions, however, will consume some transmission capacity of a network, therefore possibly enforcing stronger compression of the video stream to be transmitted. In this paper we introduce analytical models which allow us to determine the expected frame loss probability of MPEG-encoded video streams, assuming communication via constant bit rate (CBR) virtual circuits with data losses and/or unrecoverable transmission errors. The models can be used to compare the quality-of-service (QoS) as observed at the Application Layer for encoding schemes without and with forward error control, possibly making use of different prioritization of transmitted data units (in particular applying the PET encoding algorithm as designed at ICSI). The models are applied in various case studies to compare the efficiency of the error control schemes covered. ----- File: 1997/tr-97-016 An Approximation Algorithm for the Bandwidth Problem on Dense Graphs Marek Karpinski, Jürgen Wirtgen, Alex Zelikovsky tr-97-016 May 1997 The bandwidth problem is the problem of numbering the vertices of a given graph G such that the maximum difference between the numbers of adjacent vertices is minimal. The problem has a long history and is known to be NP-complete [Papadimitriou, 1976]. Only a few special cases of this problem are known to be efficiently approximable. In this paper we present the first constant-approximation-ratio algorithms for dense instances of this problem. ----- File: 1997/tr-97-018 Empirical Observations of Probabilistic Heuristics for the Clustering Problem Jeff Bilmes, Amin Vahdat, Windsor Hsu, Eun-Jin Im tr-97-018 May 1997 We empirically investigate a number of strategies for solving the clustering problem under the minimum variance error criterion. First, we compare the behavior of four algorithms: 1) randomized minimum spanning tree, 2) hierarchical grouping, 3) randomized maximum cut, and 4) standard k-means. We test these algorithms with a large corpus of both contrived and real-world data sets and find that standard k-means performs best. We find, however, that standard k-means can, with non-negligible probability, do a poor job optimizing the minimum variance criterion. We therefore investigate various randomized k-means modifications. We empirically find that by running randomized k-means only a modest number of times, the probability of a poor solution becomes negligible. Using a large number of CPU hours to experimentally derive the apparently optimal solutions, we also find that randomized k-means has the best rate of convergence to this apparent optimum. ----- File: 1997/tr-97-019 Optimization with the Hopfield network based on correlated noises: an empirical approach Jacek Mandziuk tr-97-019 May 1997 This paper presents two simple optimization techniques based on combining the Langevin Equation with the Hopfield Model. The proposed models - referred to as the Stochastic Model (SM) and the Pulsed Noise Model (PNM) - can be viewed as straightforward stochastic extensions of the Hopfield optimization network.
Optimization with SM, unlike in previous related models, in which $\delta$-correlated Gaussian noises were considered, is based on Gaussian noises with positive autocorrelation times. This is a reasonable assumption from a hardware implementation point of view. In the other model, PNM, Gaussian noises are injected into the system only at certain time instances, as opposed to the continuously maintained $\delta$-correlated noises used in the previous related works. In both models (SM and PNM), the intensities of the noises added to the model are independent of the neurons' potentials. Moreover, instead of impractically long inverse logarithmic cooling schedules, linear cooling is tested. With the above strong simplifications neither SM nor PNM is expected to rigorously maintain Thermal Equilibrium (TE). However, approximate numerical tests based on the canonical Gibbs-Boltzmann distribution show that differences between rigorous and estimated values of TE parameters are relatively small (within a few percent). In this sense both models are said to operate in Quasi Thermal Equilibrium. Optimization performance and Quasi Thermal Equilibrium properties of both models are tested on the Travelling Salesman Problem. ----- File: 1997/tr-97-020 Normal Bases via General Gau\ss\ Periods Joachim von zur Gathen, Sandra Schlink, and M. Amin Shokrollahi tr-97-020 May 1997 Gau\ss\ periods have been used successfully as a tool for constructing normal bases in finite fields. Starting from a primitive $r$th root of unity, one obtains under certain conditions a normal basis for ${\F_{q^n}}$ over ${\F_q}$, where $r$ is a prime and $nk=r-1$ for some integer $k$. We generalize this construction by allowing arbitrary integers $r$ with $nk=\varphi(r)$, and find in many cases smaller values of $k$ than is possible with the previously known approach.

Keywords: Gau\ss\ periods, normal bases, finite fields, cyclotomic fields, algebraic number theory ----- File: 1997/tr-97-021 A Gentle Tutorial on the EM algorithm including Gaussian Mixtures and Baum-Welch Jeff Bilmes tr-97-021 May 1997 We introduce maximum-likelihood estimation, the general EM algorithm, and two examples, Gaussian mixture densities and the Baum-Welch algorithm. We do not discuss the convergence properties. ----- File: 1997/tr-97-022 Polynomial Time Approximation Schemes for Some Dense Instances of NP-Hard Optimization Problems Marek Karpinski tr-97-022 May 1997 We give an overview of recent results on the existence of polynomial time approximation schemes for some dense instances of NP-hard optimization problems. We further indicate some inherent limits on the existence of such schemes for some other dense instances of optimization problems. ----- File: 1997/tr-97-023 Reorganization in Persistent Object Stores Reda Salama, Lutz Wegner and Jens Thamm tr-97-023 May 1997 The Record Identifier (RID) storage concept was initially made popular through IBM's System R. It remains in use in DEC's Rdb and IBM's DB2 and is attractive because of its self-contained nature. It can even be combined with pointer swizzling. Although simple in principle, its details are tricky and little has been released to the public. One particular problem is the reclamation of empty space when a RID-file becomes sparsely populated. Since RIDs, also called Tuple Identifiers (TIDs), are invariant by definition, pages can be deleted physically, but not logically. Therefore, there must be a mapping from "old" to "new" page numbers. If the self-contained nature is to be preserved, this is not to be achieved by a table but rather through some arithmetical "folding" similar to hashing schemes. Page numbers are meant to collide, creating merged pages. The paper explains in detail an efficient division-folding method where f adjacent pages are merged into one.

Keywords: persistent storage, file reorganizations, pointer swizzling, complex objects ----- File: 1997/tr-97-024 Collaboration Support in Networked Distance Learning Bernd Krämer and Lutz Wegner tr-97-024 May 1997 Learning is basically a social process. Experiences with Computer Aided Learning (CAL) over the last thirty years have shown that technology cannot substitute for some of the essential elements of this process, e.g. personal communication, face-to-face collaboration, positive and negative reinforcement through fellow students, etc. Today, with local and wide area networks becoming a reality, there seems to be a chance to simulate some elements of this learning process by a suitable combination of synchronous and asynchronous collaboration techniques. In particular, this paper proposes ways of supporting this interaction within a consistent, representation-independent complex object model. To map the model onto affordable technologies we borrow structures and methods from both database research and current multi-media course development. We present arguments for the suitability of our approach, keeping in mind that distance learning remains a necessity in many circumstances. ----- File: 1997/tr-97-025 Constructing semantic representations using the MDL principle Gabriele Scheler tr-97-025 July 1997 Words receive a significant part of their meaning from use in communicative settings. The formal mechanisms of lexical acquisition, as they apply to rich situational settings, may also be studied in the limited case of corpora of written texts. This work constitutes an approach to deriving semantic representations for lexemes using techniques from statistical induction. In particular, a number of variations on the MDL principle were applied to selected sample sets and their influence on emerging theories of word meaning explored. We found that by changing the definition of description length for data and theory - which is equivalent to different encodings of data and theory - we may customize the emerging theory, augmenting and altering frequency effects. The influence of stochastic properties of the data on the size of the theory is also demonstrated. The results consist of a set of distributional properties of lexemes, which reflect cognitive distinctions in the meaning of words. ----- File: 1997/tr-97-027 Deciding Properties of Polynomials without Factoring T. Sander and M. A. Shokrollahi tr-97-027 August 1997 The polynomial time algorithm of Lenstra, Lenstra, and Lovasz [17] for factoring integer polynomials, and variants thereof, have been widely used to show that various computational problems in number theory have polynomial time solutions. Among them is the problem of factoring polynomials over algebraic number fields, which is itself used as a major subroutine for several other algorithms. Although a theoretical breakthrough, algorithms based on factorization of polynomials over number fields are notoriously slow and hard to implement, with running times ranging between $O(n^{12+\epsilon})$ and $O(n^{18+\epsilon})$ depending on which variant of the lattice basis reduction is used. Here, n is an upper bound for the maximum of the degrees and the bit-lengths of the coefficients of the polynomials involved. On the other hand, in many situations one does not need the full power of factorization, so one may ask whether there exist faster algorithms in these cases.
In this paper we develop more efficient Monte Carlo algorithms to decide certain properties of roots of integer polynomials, without factoring them. Such problems arise, e.g., when solving systems of algebraic equations. Our methods, applied to this situation, thus give information about the solutions of such systems of equations. Assuming the validity of the Extended Riemann Hypothesis, our algorithms run in time $O(n^{1+\epsilon})$ in the worst case, though they usually terminate much faster if the input polynomials do not have the properties the algorithm is testing. Besides this substantial improvement in the running time, our algorithms have the advantage of being conceptually easy. Their building blocks are gcd-computations in polynomial rings over finite fields, and primality tests for integers. However, despite the simplicity of our algorithms, their analysis is involved and uses tools from algebraic and analytic number theory. Our methods yield polynomial time algorithms even in cases where the factorization method does not. We exhibit such an example by showing that the language consisting of pairs (g, m), where g is a monic irreducible polynomial such that all its roots are integral linear combinations of mth roots of unity, is in co-RP. Currently, we do not know of any deterministic polynomial time algorithm to decide this problem, even if we assume the validity of the Extended Riemann Hypothesis. We will also show that computing the minimal m such that (g, m) belongs to this language is intractable by means of present methods: we prove that this problem is polynomial time equivalent to that of computing the largest squarefree divisor of an integer. ----- File: 1997/tr-97-028 Sorting on a Massively Parallel System Using a Library of Basic Primitives: Modeling and Experimental Results Alf Wachsmann and Rolf Wanka tr-97-028 August 1997 We present a comparative study of implementations of the following sorting algorithms on the Parsytec SC320 reconfigurable, asynchronous, massively parallel MIMD machine: Bitonic Sort, Odd-Even Merge Sort without and with guarded split&merge, Periodic Balanced Sort, Columnsort, and two variants of Samplesort. The experiments are performed on 2- up to 5-dimensional wrapped butterfly networks with 8 to 160 processors. We make use of library functions that provide primitives for global variables and synchronization, and we show that it is possible to implement efficient and portable programs easily. We assume the time for accessing a global variable to be linear in the parameters s, d, and c, where s is the size of the variable, d the distance between the accessing processor and the processor holding the variable, and c the contention, i.e., the number of processors accessing the variable simultaneously. In order to predict the performance, we model the runtime of this access by a trilinear function. Similarly, the runtime of a synchronization is described by a bilinear function, depending on the number of processors involved and their maximum distance. Our experiments show that, in the context of parallel sorting, this easily applied model is sufficiently detailed to give good runtime predictions. The experiments confirming the predictions point out that Odd-Even Merge Sort with guarded split&merge is the fastest method if the processors hold few keys. If there are many keys per processor, a variant of Samplesort that uses Odd-Even Merge Sort as a subroutine is the fastest method.
Additionally, we show that the relative behavior of implementations of different algorithms is quite similar to their theoretical relationship. ----- File: 1997/tr-97-029 Playing Tetris on Meshes and Multi-Dimensional Shearsort Miroslaw Kutylowski and Rolf Wanka tr-97-029 August 1997 Shearsort is a classical sorting algorithm working in rounds on 2-dimensional meshes of processors. Its elementary and elegant runtime analysis can be found in various textbooks. There is a straightforward generalization of Shearsort to multi-dimensional meshes. As experiments show, it works fast. However, no method has yet been shown strong enough to provide a tight analysis of this algorithm. In this paper, we present an analysis of the 3-dimensional case and show that on the l x l x l mesh, it suffices to perform 2 log l + 10 rounds, while 2 log l + 1 rounds are necessary. Moreover, tools for analyzing multi-dimensional Shearsort are provided. ----- File: 1997/tr-97-030 Hybrid Approaches to Neural Network-based Language Processing Stefan Wermter tr-97-030 August 1997 In this paper we outline hybrid approaches to artificial neural network-based natural language processing. We start by motivating hybrid symbolic/connectionist processing. Then we suggest various types of symbolic/connectionist integration for language processing: connectionist structure architectures, hybrid transfer architectures, and hybrid processing architectures. Furthermore, we focus particularly on loosely coupled, tightly coupled, and fully integrated hybrid processing architectures. We give particular examples of these hybrid processing architectures and argue that the hybrid approach to artificial neural network-based language processing has a lot of potential to overcome the gap between a neural level and a symbolic conceptual level. ----- File: 1997/tr-97-031 More robust J-RASTA processing using spectral subtraction and harmonic sieving Hiroaki Ogawa tr-97-031 August 1997 We investigated spectral subtraction (SS) and harmonic sieving (HS) techniques as preprocessing for J-RASTA processing to achieve more robust feature extraction for automatic speech recognition. We confirmed that spectral subtraction improved J-RASTA processing, and showed that harmonic sieving further improved J-RASTA+SS. We investigated the performance with the Bellcore isolated digits task corrupted with car noise (additive noise) and a linear distortion filter (convolutional noise). The J-RASTA+SS+HS system reduces the word error rate by 39% given pitch estimated from clean speech, and 35% given pitch estimated from corrupted speech. The system was also tested with several kinds of noise from the NOISEX92 database; each noise sample was added to the speech for a resulting signal-to-noise ratio of 0 dB. SS significantly reduced the word error rate for all noise types (white noise 39%, pink noise 51%, car noise 78%, tank noise 59%, and machine gun noise 19%). Given correct pitch, HS further reduced the word error rate for the first three noises (white noise 7%, pink noise 16%, and car noise 17%). ----- File: 1997/tr-97-032 Parallel Complexity of Numerically Accurate Linear System Solvers Mauro Leoncini, Giovanni Manzini, and Luciano Margara tr-97-032 August 1997 We prove a number of negative results about practical (i.e., work-efficient and numerically accurate) algorithms for computing the main matrix factorizations.
In particular, we prove that the popular Householder and Givens' methods for computing the QR decomposition are P-complete, and hence presumably inherently sequential, under both real and floating point number models. We also prove that Gaussian Elimination (GE) with a weak form of pivoting, which aims only at making the resulting algorithm nondegenerate (but possibly unstable), is likely to be inherently sequential as well. Finally, we prove that GE with partial pivoting is P-complete when restricted to Symmetric Positive Definite matrices, for which it is known that even plain GE does not fail. Altogether, the results of this paper give further formal support to the widespread belief that there is a tradeoff between parallelism and accuracy in numerical algorithms. ----- File: 1997/tr-97-033 Social Carrier Recommendation for Selecting Services in Electronic Telecommunication Markets: A Preliminary Report Beat Liver and Joern Altmann tr-97-033 August 1997 The proliferation of telecommunication services and the need to manage quality of service on an end-to-end basis require an approach for automatically selecting services that provide sufficient quality of service at minimal cost. An agent-based approach is appropriate for such a purpose. For this reason, this paper presents a social carrier recommendation method, which is an essential component of application-level end-to-end quality of service management as well as a way to make the final step towards electronic telecommunication markets. For electronic telecommunication markets, the proposed approach provides a consumer-based evaluation of services as well as "rational" user agents that select services and carriers based on needs, offered prices, and ratings. Therefore, this approach complements existing market mechanisms that either provide means to buy services or intend to improve sales and customer service of carriers. ----- File: 1997/tr-97-035 Sather 2: A Language Design for Safe, High-Performance Computing Benedict Gomes, Welf Loewe, Juergen W. Quittek, and Boris Weissman tr-97-035 December 1997 Consistency of objects in a concurrent computing environment is usually ensured by serializing all incoming method calls. However, for high-performance parallel computing, intra-object parallelism, i.e., concurrent execution of methods on an object, is desirable. Currently, languages supporting intra-object parallelism are based on object models that leave it to the programmer to ensure consistency. We present an object model that ensures object consistency while supporting intra-object concurrency, thereby offering both safety and efficiency. The description starts with a simple and safe, but inefficient, model and gradually increases the sophistication by introducing features for expressiveness and greater efficiency while maintaining safety. Based on this model we define extensions for guarded suspension and data parallel programming. The model and the extensions are defined as a language proposal for a new version of Sather, Sather 2. The proposal is based on Sather 1.1, but replaces the parallel extensions of that version. ----- File: 1997/tr-97-036 Active Threads: an Extensible and Portable Light-Weight Thread System Boris Weissman tr-97-036 September 1997 This document describes a portable light-weight thread runtime system for uni- and multiprocessors targeted at irregular applications.
Unlike most other thread packages, which utilize hard-coded scheduling policies, Active Threads provides a general mechanism for building data-structure-specific thread schedulers and for composing multiple scheduling policies within a single application. This allows modules developed separately to retain their scheduling policies when used together in a single application (a toy sketch of such policy-driven scheduling follows after the next entry). Flexible scheduling policies can exploit the temporal and spatial locality inherent in many applications. In spite of the added flexibility, the Active Threads API is close to that of more conventional thread packages. Simple synchronization is achieved by standard mutexes, semaphores, and condition variables, while more powerful parallel constructs can easily be built from threads, thread bundles (collections of threads with similar properties such as schedulers) and user-defined synchronization objects. Active Threads can be used directly by application and library writers or as a virtual machine target for compilers for parallel languages. The package is retargeted by porting the Active Threads Portability Interface, which includes only eight primitives. Active Threads has been ported to several hardware platforms including SPARC, Intel i386 and higher, DEC Alpha AXP, and HPPA, and has outperformed vendor-provided thread packages by orders of magnitude. A typical thread context switch cost is on the order of dozens of instructions and is only an order of magnitude more expensive than a function call. This document presents a detailed performance analysis and comparisons with other commercial and research parallel runtimes. Active Threads is used as a compilation target for Sather, a parallel object-oriented language under development at ICSI. Active Threads is also being used as a base for a distributed extension of C++ that supports thread migration. ----- File: 1997/tr-97-037 Rapid learning of binding-match and binding-error detector circuits via long-term potentiation Lokendra Shastri tr-97-037 October 1997 It is argued that the memorization of events and situations (episodic memory) requires the RAPID formation of neural circuits responsive to binding errors and binding matches. While the formation of circuits responsive to binding matches can be modeled by associative learning mechanisms, the rapid formation of circuits responsive to binding errors is difficult to explain given their seemingly paradoxical behavior: such a circuit must be FORMED in response to the occurrence of a binding (i.e., a particular pattern in the input), but subsequent to its formation, it must not fire anymore in response to the occurrence of the very binding (i.e., pattern) that led to its formation. A plausible account of the formation of such circuits has not been offered. A computational model is described that demonstrates how a transient pattern of activity representing an event can lead to the rapid formation of circuits for detecting bindings and binding errors as a result of long-term potentiation within structures whose architecture and circuitry are similar to those of the hippocampal formation, a neural structure known to be critical to episodic memory. The model exhibits a high memory capacity and is robust against limited amounts of diffuse cell loss. The model also offers an alternate interpretation of the functional role of region CA3 in the formation of episodic memories, and predicts the nature of memory impairment that would result from damage to various regions of the hippocampal formation.
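(Sketch referenced from tr-97-036 above.) To give a flavor of schedulers built around user-supplied policies, the toy Python fragment below drives generator-based "threads" through a pluggable ready-queue policy. The FIFOPolicy/LIFOPolicy classes and the run loop are hypothetical illustrations of the general idea, not the Active Threads API.

  from collections import deque

  class FIFOPolicy:
      # round-robin flavor: run the thread that has been ready longest
      def __init__(self):
          self.ready = deque()
      def add(self, thread):
          self.ready.append(thread)
      def pick(self):
          return self.ready.popleft() if self.ready else None

  class LIFOPolicy:
      # locality flavor: run the most recently readied thread first
      def __init__(self):
          self.ready = []
      def add(self, thread):
          self.ready.append(thread)
      def pick(self):
          return self.ready.pop() if self.ready else None

  def run(threads, policy):
      # Cooperative driver: "threads" are generators that yield to give up
      # control; the policy object alone decides which thread runs next.
      for t in threads:
          policy.add(t)
      t = policy.pick()
      while t is not None:
          try:
              next(t)          # run the thread for one step
              policy.add(t)    # still alive: mark it ready again
          except StopIteration:
              pass             # thread finished
          t = policy.pick()

  def worker(name, steps):
      for i in range(steps):
          print(name, "step", i)
          yield

  run([worker("a", 2), worker("b", 2)], FIFOPolicy())  # or LIFOPolicy()

Because the driver only ever talks to add/pick, separately written modules can each bring their own policy object, which is the composability property the report emphasizes.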
----- File: 1997/tr-97-038 Thread Migration with Active Threads Michael Holtkamp tr-97-038 September 1997 This thesis introduces thread migration as a tool to ease parallel programming on multiple SMPs connected by fast networks. Simple dynamic load balancing strategies have been implemented that automatically migrate threads between clusters. It has been shown that applications could improve their performance using a very simple load balancing strategy. Even for the worst initial distribution of the application, applications gained speedups of up to the number of processors used overall. The improvements could be achieved for different problems and different numbers of processors. These performance measurements show that load balancing eases the placement problem of parallel applications on multiple SMPs. If the initial distribution of the application is unfavorable, the unbalanced load can be balanced effectively. Furthermore, applications do not have to care about placement at all: speedups are achieved even if all threads of the application are started on one cluster. Active Threads offers a flexible event handler mechanism that makes it possible to implement even more flexible load balancing policies with thread migration than the one used in this work. This might yield further improvements. One can think of migrating bundles of semantically related threads. One can also implement mechanisms to migrate data to improve the locality of the execution. ----- File: 1997/tr-97-039 Positional Logic Algebra - PLA - A Fascinating Alternative Approach Christian M. Hamann and Lev Chtcherbanski tr-97-039 September 1997 The Russian researcher M. Telpiz presented a totally new approach to logic algebra in Russia in 1985 (L. Chtcherbanski, a friend of his, brought the ideas to Berlin, Germany, in 1995). PLA may be an elegant and better representation for some problem domains than Boolean Algebra. Highlights of PLA are: only one simple algorithm holds for all calculations; inverse operators build inverse functions; operators are directly applicable to operators, and therefore compilation of multi-layer networks is possible via simple calculations over operators only. PLA has potential for new applications in logical calculus problems, especially with many variables. Because operators are directly applicable to operators, PLA may be of special interest in the research areas of Genetic Algorithms, Evolution Strategies and Artificial Life. ----- File: 1997/tr-97-040 REx: Learning A Rule and Exceptions Ethem Alpaydin tr-97-040 October 1997 We propose a method where the dataset is explained as a "rule" and a set of "exceptions" to the rule. The rule is a parametric model valid over the whole input space, and exceptions are nonparametric and local. This approach is applicable both to function approximation and classification. We explain how the rule and exceptions can be learned using cross-validation. We investigate three ways of combining the rule and exceptions: (1) In a multistage approach, if the rule is confident of its output, we use it; otherwise, the output is interpolated from a table of stored exceptions. (2) In a multiexpert approach, the exceptions are defined as Gaussian units just like in a radial-basis-function network; the rule can be seen as a parametric input-dependent offset to which the Gaussian exceptions are added. (3) The rule and exceptions can be written as a mixture model as in Mixtures of Experts and can be combined in a cooperative or competitive manner.
The system can be trained using a gradient-based algorithm or, in the case of (3), EM. The model can be combined with Hidden Markov models for sequence processing. We analyse REx as an arcing method and compare it with bagging and boosting. The proposed approaches are tested on several datasets in terms of generalization accuracy, memory requirements, and training time, with good results. ----- File: 1997/tr-97-041 When Push Comes to Shove: A Computational Model of the Role of Motor Control in the Acquisition of Action Verbs David R. Bailey tr-97-041 October 1997 Children learn a variety of verbs for hand actions starting in their second year of life. The semantic distinctions can be subtle, and they vary across languages, yet they are learned quickly. How is this possible? This dissertation explores the hypothesis that to explain the acquisition and use of action verbs, motor control must be taken into account. It presents a model of embodied semantics--based on the principles of neural computation in general and on the human motor system in particular--which takes a set of labelled actions and learns both to label novel actions and to obey verbal commands. A key feature of the model is the executing schema, an active controller mechanism which, by actually driving behavior, allows the model to carry out verbal commands. A hard-wired mechanism links the activity of executing schemas to a set of linguistically important features including hand posture, joint motions, force, aspect and goals. The feature set is relatively small and fixed, helping to make learning tractable. Moreover, the use of traditional feature structures facilitates the use of model merging, a Bayesian probabilistic learning algorithm which rapidly learns plausible word meanings, automatically determines an appropriate number of senses for each verb, and can plausibly be mapped to a connectionist recruitment learning architecture. The learning algorithm is demonstrated on a handful of English verbs, and also proves capable of making some interesting distinctions found crosslinguistically. ----- File: 1997/tr-97-042 Analysis of Random Processes via And-Or Tree Evaluation Michael G. Luby, Michael Mitzenmacher, and M. Amin Shokrollahi tr-97-042 November 1997 We introduce a new set of probabilistic analysis tools based on the analysis of And-Or trees with random inputs. These tools provide a unifying, intuitive, and powerful framework for carrying out the analysis of several previously studied random processes, including random loss-resilient codes, solving random k-SAT formulae using the pure literal rule, and the greedy algorithm for matchings in random graphs. In addition, these tools allow generalizations of these problems not previously analyzed to be analyzed in a straightforward manner. We illustrate our methodology on the three problems listed above. ----- File: 1997/tr-97-043 Java Multimedia Studio v1.0 Giancarlo Fortino tr-97-043 November 1997 Along with the emergence of a new generation of multimedia applications has come a need to facilitate real-virtual teleconferences and automatic generation of content. In this direction Java Multimedia Studio, a completely Java-based tool for editing, recording and playing back multimedia sessions over the Internet MBone, has been developed. Java Multimedia Studio is founded on a QoS-centered Java- and Actor-based framework that manages local and distributed synchronization of streams by mixing, translating and filtering RTP packets.
It not only enhances on-line and enables off-line multimedia conferencing, but also opens up the opportunity to create multimedia sessions with enriched content. ----- File: 1997/tr-97-044 Improved Low-Density Parity-Check Codes Using Irregular Graphs and Belief Propagation Michael G. Luby, Michael Mitzenmacher, M. Amin Shokrollahi, and Daniel A. Spielman tr-97-044 November 1997 We construct new families of error-correcting codes based on Gallager's low-density parity-check codes, which we call irregular codes. When decoded using belief propagation, our codes can correct more errors than previously known low-density parity-check codes. For example, for rate 1/4 codes on 16,000 bits over a binary symmetric channel, previous low-density parity-check codes can correct up to approximately 16% errors, while our codes can correct over 17%. Our improved performance comes from using codes based on irregular random bipartite graphs, based on the work of [7]. Previously studied low-density parity-check codes have been derived from regular bipartite graphs. We report experimental results for our irregular codes on both binary symmetric channels and Gaussian channels. In some cases our results come very close to reported results for turbo codes, suggesting that, with improvements, irregular codes may be able to match turbo code performance. ----- File: 1997/tr-97-045 Analysis of Low Density Codes and Improved Designs Using Irregular Graphs Michael G. Luby, Michael Mitzenmacher, M. Amin Shokrollahi, and Daniel A. Spielman tr-97-045 November 1997 In [6] Gallager introduces a family of codes based on sparse bipartite graphs, which he calls low-density parity-check codes. He suggests a natural decoding algorithm for these codes, and proves a good bound on the fraction of errors that can be corrected. As the codes that Gallager builds are derived from regular graphs, we refer to them as regular codes. Following the general approach introduced in [7] for the design and analysis of loss-resilient codes, we consider error-correcting codes based on random irregular bipartite graphs, which we call irregular codes. We introduce tools based on linear programming for designing linear-time irregular codes with better error-correcting capabilities than possible with regular codes. For example, the decoding algorithm for the rate 1/2 regular codes of Gallager can provably correct up to 5.1% errors, whereas we have found irregular codes for which our decoding algorithm can provably correct up to 6.2%. ----- File: 1997/tr-97-046 Parallel Computing on MultiSpert Philipp Färber tr-97-046 December 1997 This report provides an overview of the MultiSpert parallel computer system and its performance characteristics. We describe the underlying hardware and its limitations, as well as the additional communication layers which provide an efficient remote procedure calling mechanism. Timing measurements on a 5-node prototype confirm MultiSpert's scalability to high levels of performance. ----- File: 1997/tr-97-047 Quicknet on MultiSpert: Fast Parallel Neural Network Training Philipp Färber tr-97-047 December 1997 The MultiSpert parallel system is a straightforward extension of the Spert workstation accelerator, which is predominantly used in speech recognition research at ICSI. In order to deliver high performance for Artificial Neural Network training without requiring changes to the user interfaces, the existing Quicknet ANN library was modified to run on MultiSpert.
In this report, we present the algorithms used in the parallelization of the Quicknet code and analyse their communication and computation requirements. The resulting performance model yields a better understanding of system speed-ups and potential bottlenecks. Experimental results from actual training runs validate the model and demonstrate the achieved performance levels. ----- File: 1997/tr-97-049 Towards Mobile Cryptography Tomas Sander and Christian F. Tschudin tr-97-049 November 1997 Mobile code technology has become a driving force for recent advances in distributed systems. The concept of mobility of executable code raises major security problems. In this paper we deal with the protection of mobile code from possibly malicious hosts. We conceptualize the specific cryptographic problems posed by mobile code, and are able to provide a solution for some of these problems. We present techniques for achieving "non-interactive computing with encrypted programs" in certain cases and give a complete solution for this problem in important instances. We further present a way for an agent to securely perform a cryptographic primitive, digital signing, in an untrusted execution environment. Our results are based on the use of homomorphic encryption schemes and function composition techniques. ----- File: 1997/tr-97-050 Multicasting Multimedia Streams with Active Networks Albert Banchs, Wolfgang Effelsberg, Christian Tschudin, and Volker Turau tr-97-050 March 1998 Active networks allow code to be loaded dynamically into network nodes at run-time. This code can perform tasks specific to a stream of packets or even a single packet. In this paper we compare two active network architectures: the Active Network Transfer System (ANTS) and the Messenger System (M0). We have implemented a robust audio multicast protocol and a layered video multicast protocol with both active network systems. We discuss the differences between the two systems, evaluate architectural strengths and weaknesses, compare the runtime performance, and report practical experience and lessons learned.

Keywords: Active Network, ANTS, M0, robust audio, scalable video, layered video ----- File: 1997/tr-97-051 Multi-Band Speech Recognition: A Summary of Recent Work at ICSI Naghmeh Nikki Mirghafori tr-97-051 December 1997 In this technical report we discuss the recent work on multi-band ASR at ICSI. This exposition consists of three themes. Our first topic is the design and implementation of a multi-band baseline system. Next, we discuss the analysis of multi-band ASR, in terms of phonetic information transmission and the potential advantage of asynchronous merging of sub-band streams. The third topic is motivated by the intuition that some bands are inherently better for classifying some phones, whereas others lack sufficient information for such discrimination. We report on a multi-band system designed on the basis of this hypothesis.

Keywords: speech recognition, multi-band processing. ----- File: 1997/tr-97-053 Constructing Fuzzy Graphs from Examples Michael R. Berthold and Klaus-Peter Huber tr-97-053 December 1997 Methods to build function approximators from example data have gained considerable interest in the past. Methodologies that build models which allow an interpretation have attracted particular attention. Most existing algorithms, however, are either complicated to use or infeasible for high-dimensional problems. This article presents an efficient and easy-to-use algorithm to construct fuzzy graphs from example data. The resulting fuzzy graphs are based on locally independent fuzzy rules that operate solely on selected, important attributes. This enables the application of these fuzzy graphs also to problems in high-dimensional spaces. Using illustrative examples and a real-world data set, it is demonstrated how the resulting fuzzy graphs offer quick insights into the structure of the example data, that is, the underlying model. ----- File: 1997/tr-97-054 A Performance Evaluation of Fine Grain Thread Migration with Active Threads Boris Weissman, Benedict Gomes, Jürgen W. Quittek, and Michael Holtkamp tr-97-054 December 1997 Thread migration is established as a mechanism for achieving dynamic load sharing and data locality. However, migration has not been used with fine-grained parallelism due to the relatively high overheads associated with thread and messaging packages. This paper describes a high performance thread migration system for fine-grained parallelism, implemented with user-level threads and user-level messages. The thread system supports an extensible event mechanism which permits an efficient interface between the thread and messaging systems without compromising the modularity of either. Migration is supported by user-level primitives; applications may implement different migration policies on top of the migration interface provided. The system is portable and can be used directly by application and library writers or serve as a compilation target for parallel programming languages. Detailed performance metrics are presented to evaluate the system. The system runs on a cluster of SMPs, and the performance obtained is orders of magnitude better than other reported measurements. ----- File: 1997/tr-97-055 Type-Safety and Overloading in Sather B. Gomes, D. Stoutamire and B. Weissman tr-97-055 December 1997 Method overloading is a form of statically resolved multi-methods which may be used to express specialization in a type hierarchy [GSWF97]. The design of the overloading rule in Sather is constrained by the presence of multiple subtyping and the ability to add supertyping edges to the type graph after the fact [SO96]. We describe the design of overloading rules which permit method specialization while allowing separate type-checking, i.e., existing code cannot be broken by the after-the-fact addition of supertyping edges. ----- File: 1997/tr-97-056 Portable, Modular Expression of Locality David Stoutamire tr-97-056 December 1997 It is difficult to achieve high performance while programming in the large. In particular, maintaining locality hinders portability and modularity. Existing methodologies are not sufficient: explicit communication and coding for locality require the programmer to violate the encapsulation and compositionality of software modules, while automated compiler analysis remains unreliable. This thesis presents a performance model that makes thread and object locality explicit.
Zones form a runtime hierarchy that reflects the intended clustering of threads and objects, which are dynamically mapped onto hardware units such as processor clusters, pages, or cache lines. This conceptual indirection allows programmers to reason in the abstract about locality without committing to the hardware of a specific memory system. Zones complement conventional coding for locality and may be added to existing code to improve performance without affecting correctness. The integration of zones into the Sather language is described, including an implementation of memory management customized to parameters of the memory system. ----- File: 1997/tr-97-057 Deployment of RASTA-PLP with the Siemens ZT Speech Recognition System Michael L. Shire tr-97-057 December 1997 RelAtive SpecTral Analysis - Perceptual Linear Prediction (RASTA-PLP) is the standard speech feature extraction method used at the International Computer Science Institute. There it has been used primarily in conjunction with a hybrid Artificial Neural Network (ANN) and Hidden Markov Model (HMM) speech recognition system. This work explores the viability of RASTA-PLP as a candidate feature extraction method in the Siemens ZT recognition system. Experiments with a basic RASTA-PLP setup confirm that it provides good performance and is a potentially useful tool which merits further research and experimentation. ----- File: 1997/tr-97-058 A Lower Bound for Integer Multiplication on Randomized Read-Once Branching Programs Farid Ablayev and Marek Karpinski tr-97-058 December 1997 We prove an exponential lower bound $2^{\Omega(n/\log n)}$ on the size of any randomized ordered read-once branching program computing integer multiplication. Our proof depends on proving a new lower bound on Yao's randomized one-way communication complexity of certain Boolean functions. It generalizes to some other common models of randomized branching programs. In contrast, we prove that testing integer multiplication, contrary even to the nondeterministic situation, can be computed by randomized ordered read-once branching programs of polynomial size. It is also known that computing the latter problem with deterministic read-once branching programs is as hard as factoring integers. ----- File: 1997/tr-97-059 Polynomial Time Approximation of Dense Weighted Instances of MAX-CUT W. Fernandez de la Vega and M. Karpinski tr-97-059 December 1997 We give the first polynomial time approximability characterization of dense weighted instances of MAX-CUT, and of some other dense weighted NP-hard problems, in terms of their empirical weight distributions. This also gives the first almost sharp characterization of the inapproximability of unweighted 0,1 MAX-BISECTION instances in terms of their density parameter only. ----- File: 1997/tr-97-060 On Approximation Hardness of the Bandwidth Problem Marek Karpinski and Jürgen Wirtgen tr-97-060 December 1997 The bandwidth problem is the problem of enumerating the vertices of a given graph G such that the maximum difference between the numbers of adjacent vertices is minimal. The problem has a long history and a number of applications and is known to be NP-hard, Papadimitriou [Pa 76]. Little is known, though, about the approximation hardness of this problem. In this paper we show that there are no efficient polynomial time approximation schemes for the bandwidth problem under some plausible assumptions.
Furthermore, we show that there are no polynomial time approximation algorithms with an absolute error guarantee of $n^{1-\epsilon}$ for any $\epsilon > 0$ unless P = NP. ----- File: 1997/tr-97-061 Using Value Semantic Abstractions to Guide Strongly Typed Library Design B. Gomes, D. Stoutamire, B. Weissman and J. Feldman tr-97-061 December 1997 This report addresses typing problems that arise when modelling simple mathematical entities in strongly typed languages such as Sather, problems which are eliminated by a proper distinction between mutable and immutable abstractions. We discuss the reasons why our intuition leads us astray, and provide a solution using statically type-safe specialization through constrained overloading. We also discuss the type relationships between mutable and immutable classes and the notion of freezing objects. ----- File: 1998/tr-98-001 Isoperimetric Functions of Amalgamations of Nilpotent Groups Christian Hidber tr-98-001 January 1998 We consider amalgamations of finitely generated nilpotent groups of class c. We show that doubles satisfy a polynomial isoperimetric inequality of degree $2c^2$. Generalising doubles, we introduce non-twisted amalgamations and we show that they satisfy a polynomial isoperimetric inequality as well. We give a sufficient condition for amalgamations along abelian subgroups to be non-twisted and thereby to satisfy a polynomial isoperimetric inequality. We conclude by giving an example of a twisted amalgamation along an abelian subgroup having an exponential isoperimetric function. ----- File: 1998/tr-98-002 Maximizing Throughput of Reliable Bulk Network Transmissions John W. Byers tr-98-002 January 1998 We study combinatorial optimization and on-line scheduling problems which arise in the context of supporting applications which transmit bulk data over high-speed networks. One of our primary objectives in this thesis work is to formulate appropriate theoretical models in which to develop and analyze efficient algorithms for these problems - models which reflect the experience of network architects, the design of network protocols, and the contributions of theoretical research.

We first consider the optimization problem of maximizing the utilization of a shared resource, network bandwidth, across a set of point-to-point connections. A feasible solution to this allocation problem is an assignment of transmission rates to the connections which does not violate the capacity constraints of the network links. The connections and routers which are responsible for establishing this allocation must do so with incomplete information and limited communication capabilities. We develop a theoretical model which addresses these considerations and study the tradeoff between the quality of the solution we can obtain and the distributed running time. Our main theoretical result is a distributed algorithm for this problem which generates a feasible (1 + e)-approximation to the optimal allocation in a polylogarithmic number of distributed rounds. A sequential implementation of our distributed algorithm gives a simple, efficient approximation algorithm for general positive linear programming. Subsequent experience with an implementation of the algorithm indicates that it is well suited to future deployment in high-speed networks.
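The feasibility notion used above can be stated concretely: an assignment of rates is feasible exactly when, on every link, the rates of the connections routed across that link sum to at most its capacity. The Python sketch below merely checks this constraint for a hypothetical network description; it is not the distributed approximation algorithm developed in the thesis.

  def is_feasible(rates, routes, capacity):
      # rates:    connection -> assigned transmission rate
      # routes:   connection -> list of links the connection traverses
      # capacity: link -> capacity of that link
      load = {link: 0.0 for link in capacity}
      for conn, links in routes.items():
          for link in links:
              load[link] += rates[conn]
      return all(load[link] <= capacity[link] for link in capacity)

  # hypothetical two-link network: c2 shares link l1 with c1
  routes = {"c1": ["l1"], "c2": ["l1", "l2"]}
  print(is_feasible({"c1": 3.0, "c2": 6.0}, routes, {"l1": 10.0, "l2": 8.0}))  # True
  print(is_feasible({"c1": 5.0, "c2": 6.0}, routes, {"l1": 10.0, "l2": 8.0}))  # False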

The next problem we consider is the following on-line scheduling problem, which the sender of a point-to-point bulk transmission must address. Given an on-line sequence of transmission times, determine which data item to transmit at each transmission time, so as to maximize effective throughput to the receiver at all points in time. For this application, we measure effective throughput as the length of the intact prefix of the message at the receiver. This problem is made difficult in practice by factors beyond the sender's control, such as packet loss and wide variance in packet round-trip times.
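The effective-throughput measure just defined is the length of the longest unbroken initial run of packets present at the receiver; a minimal Python sketch, with hypothetical packet indices, follows.

  def intact_prefix(received, total):
      # effective throughput: length of the longest prefix 0..k-1 of the
      # message that is fully present at the receiver
      have = set(received)
      k = 0
      while k < total and k in have:
          k += 1
      return k

  # hypothetical arrivals: packet 3 was lost, so the intact prefix is 3
  print(intact_prefix([0, 1, 2, 4, 5], total=6))  # 3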

Using the method of competitive analysis, we compare the performance of our algorithm to that of an omniscient algorithm. We prove that while all deterministic policies perform poorly in this model, a simple randomized policy delivers near-optimal performance at any given point in time with high probability. Moreover, our theoretical result ensures that typical performance does not degrade significantly - a claim which our empirical studies bear out. Using the models and tools developed for these problems, we then consider analogous problems which arise for multicast bulk transmissions, transmissions targeted to multiple destinations. We show how to tune our bandwidth allocation policy to still deliver a (1 + e)-approximation to the optimal allocation in a polylogarithmic number of distributed rounds. For the scheduling problem, we prove that no on-line scheduling policy can deliver high performance which scales with the number of receivers without using encoding. We then show that by using forward error correction coding techniques, a simple multicast policy delivers effective throughput within a constant factor of optimal, independent of the number of receivers. ----- File: 1998/tr-98-004 Simplified ART: A new class of ART algorithms Andrea Baraldi and Ethem Alpaydin tr-98-004 February 1998 The Simplified Adaptive Resonance Theory (SART) class of networks is proposed to handle problems encountered in Adaptive Resonance Theory 1 (ART 1)-based algorithms when detection of binary and analog patterns is performed. The basic idea of SART is to substitute ART 1-based "unidirectional" (asymmetric) activation and match functions with "bidirectional" (symmetric) function pairs. This substitution makes the class of SART algorithms potentially more robust and less time-consuming than ART 1-based systems. One SART algorithm, termed Fuzzy SART, is discussed. Fuzzy SART employs probabilistic and possibilistic fuzzy membership functions to combine soft competitive learning with outlier detection. Its soft competitive strategy relates Fuzzy SART to the well-known Self-Organizing Map and the Neural Gas clustering algorithm. A new Normalized Vector Distance, which can be employed by Fuzzy SART, is also presented. Fuzzy SART performs better than the ART 1-based Carpenter-Grossberg-Rosen Fuzzy ART in the clustering of a simple two-dimensional data set and the standard four-dimensional IRIS data set. As expected, Fuzzy SART is less sensitive than Fuzzy ART to small changes in input parameters and in the order of the presentation sequence. In the clustering of the IRIS data set, the performance of Fuzzy SART is analogous to or better than that of several clustering models found in the literature.

Keywords: hard and soft competitive learning, cluster detection, ART 1-based systems, Self-Organizing Map, Neural Gas algorithm, fuzzy set theory, fuzzy clustering. ----- File: 1998/tr-98-005 Digital Fountain Approach to Reliable Distribution of Bulk Data John Byers, Michael Luby, Michael Mitzenmacher, and Ashutosh Rege tr-98-005 February 1998 The proliferation of applications that must reliably distribute bulk data to a large number of autonomous clients motivates the design of new multicast and broadcast protocols. We describe an ideal, fully scalable protocol for these applications that we call a digital fountain. A digital fountain allows any number of heterogeneous clients to acquire bulk data with optimal efficiency at times of their choosing. Moreover, no feedback channels are needed to ensure reliable delivery, even in the face of high loss rates. We develop a protocol that closely approximates a digital fountain using a new class of erasure codes that are orders of magnitude faster than standard erasure codes. We provide performance measurements that demonstrate the feasibility of our approach and discuss the design, implementation and performance of an experimental system.

Keywords: digital fountain, reliable data distribution, bulk distribution, on demand download, erasure codes, forward-error correcting (FEC), IP multicast, broadcast, lossy channels, heterogeneous conditions. ----- File: 1998/tr-98-006 Enabling Synchronous Joint-Working In Java Vladimir Minenko tr-98-006 March 1998 This report gives an outlook on technologies for joint-working with Java-based programs - applets and applications. Various approaches and APIs applied to the Java environment are discussed and compared. A new architecture for scalable Java application sharing is presented. Several suggestions are made for possible future JDK features addressing synchronous joint-working.

Keywords: collaboration, joint-working, Swing, Java, JDK, conferencing, application sharing, CSCW. ----- File: 1998/tr-98-008 From GISystems to GIServices: Spatial Computing on the Internet Marketplace Oliver Günther and Rudolf Müller tr-98-008 March 1998 Many of the functions performed by GIS seem to be amenable to a business model that is fundamentally different from the one we see today. At present, GIS users typically own the hardware and software they use. They pay license and maintenance fees to various vendors. The alternative would be a service-oriented approach where users make their input data available to some GIS service center that performs the necessary computations remotely and sends the results back to the user. Customers pay only for that particular usage of the GIS technology - without having to own a GIS. We discuss this business model and associated problems of privacy and ease-of-use. We also give an overview of our MMM system (http://mmm.wiwi.hu-berlin.de), a distributed computing infrastructure that supports this business model. ----- File: 1998/tr-98-009 Image segmentation through contextual clustering A. Baraldi, P. Blonda, F. Parmiggiani and G. Satalino tr-98-009 March 1998 Several interesting strategies are adopted by the well-known Pappas clustering algorithm to segment smooth images. These include exploitation of contextual information to model both class conditional densities and {\it a priori} knowledge in a Bayesian framework. Deficiencies of this algorithm are that: i) it removes from the scene any genuine but small region; and ii) its feature-preserving capability largely depends on a user-defined smoothing parameter. This parameter is equivalent to a clique potential of a Markov Random Field model employed to capture known stochastic components of the labeled scene. In this paper a modified version of the Pappas segmentation algorithm is proposed to process smooth and noiseless images requiring enhanced pattern-preserving capability. In the proposed algorithm: iii) no spatial continuity in pixel labeling is enforced to capture known stochastic components of the labeled scene; iv) local intensity parameters, pixel labels, and global intensity parameters are estimated in sequence; and v) if no local intensity average is available to model one category in the neighborhood of a given pixel, then global statistics are employed to determine whether that category is the one closest to pixel data. Results show that our contextual algorithm can be employed: vi) in cascade with any noncontextual (pixel-wise) hard $c$-means clustering algorithm to enhance detection of small image features; and vii) as the initialization stage of any crisp and iterative segmentation algorithm requiring priors to be neglected in earlier iterations (such as the Iterative Conditional Modes algorithm).
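Items iv) and v) above describe a local-versus-global decision rule. A minimal sketch of one relabeling sweep in that spirit follows; the parameter names, window logic, and toy data are assumptions for illustration, not the authors' code.

    import numpy as np

    def relabel(image, labels, global_means, win=3):
        """One contextual sweep: each pixel is assigned the class whose LOCAL
        mean intensity (over a win x win neighborhood) is closest to the pixel
        value; if a class has no representative in the window, its GLOBAL mean
        is used as a fallback (item v in the abstract)."""
        h, w = image.shape
        r = win // 2
        out = labels.copy()
        for y in range(h):
            for x in range(w):
                y0, y1 = max(0, y - r), min(h, y + r + 1)
                x0, x1 = max(0, x - r), min(w, x + r + 1)
                patch, lpatch = image[y0:y1, x0:x1], labels[y0:y1, x0:x1]
                costs = []
                for c, gmean in enumerate(global_means):
                    sel = patch[lpatch == c]
                    mean = sel.mean() if sel.size else gmean  # local, else global
                    costs.append((image[y, x] - mean) ** 2)
                out[y, x] = int(np.argmin(costs))
        return out

    # Toy image: two noisy flat regions plus a genuine but small bright feature.
    rng = np.random.default_rng(0)
    img = np.zeros((20, 20)) + rng.normal(0, 0.05, (20, 20))
    img[:, 10:] += 1.0
    img[3:5, 3:5] = 2.0
    labels = (img > 0.5).astype(int) + (img > 1.5).astype(int)
    labels = relabel(img, labels, global_means=[0.0, 1.0, 2.0])
    print(np.bincount(labels.ravel()))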

Keywords: Markov Random Field, Bayes' theorem, image segmentation. ----- File: 1998/tr-98-010 Geospatial Information Extraction: Querying or Quarrying? Agnes Voisard and Marcus Juergens tr-98-010 April 1998 We focus here on the access to multiple, distributed, heterogeneous and autonomous information sources storing geospatial data and we study alternatives to integrate them. Common solutions to data integration in the database area nowadays are the data warehouse approach and the wrapper/mediator approach. None of them is really satisfactory to handle a large range of geospatial applications. In this paper we present a novel hybrid approach to data integration based on the two popular paradigms. We believe that such architectures will be of major importance in the geospatial applications of the near future. ----- File: 1998/tr-98-011 CORBA--Based Interoperable Geographic Information Systems H.-Arno Jacobsen and Agnes Voisard tr-98-011 April 1998 A new generation of geographic information systems (GIS) emphasizing an open architecture, interoperability, and extensibility in their design has received a great deal of attention in research and industry over the past few years. The key idea behind these systems is to move away from the traditional monolithic view in system engineering, to an open design embracing many co-existing distributed (sub)-systems, such as database management systems (DBMS), statistic packages, computational geometry libraries and even traditional GIS. While some success has been achieved in the area of geospatial data integration (data models and formats), it is still unclear what common services these open GIS should provide and how their design would benefit from available distributed computing infrastructures. This latter question is especially interesting with regard to the increasing attention that object-oriented distributed computing infrastructures have received recently in the community. In this paper, we describe a generic open GIS with an emphasis on the services it should provide. We then study the design of such a system based on object services and features provided by the Common Object Request Broker Architecture (CORBA). We also report on the use of the CORBA platform for implementing a fully-operational distributed open GIS. We conclude by arguing for a closer integration of GIS functionality into the CORBA architecture, as already done for the medical and financial domains. ----- File: 1998/tr-98-012 Reconstructing Polyatomic Structures from Discrete X-Rays: NP-Completeness Proof for Three Atoms Marek Chrobak and Christoph Dürr tr-98-012 April 1998 We address a discrete tomography problem arising in the study of the atomic structure of crystal lattices. A polyatomic structure T is an integer lattice in dimension D>=2, whose points may be occupied by c types of atoms. To ``analyze'' T, we conduct l measurements that we refer to as discrete X-rays. A discrete X-ray in direction xi determines the number of atoms of each type on each line parallel to xi. Given such l non-parallel X-rays, we wish to reconstruct T.

The complexity of the problem for c=1 (one atom) has been completely determined by Gardner, Gritzmann and Prangenberg, who proved that the problem is NP-complete for any dimension D>=2 and l>=3 non-parallel X-rays, and that it can be solved in polynomial time otherwise.

The NP-completeness result above clearly extends to any c>=2, and therefore when studying the polyatomic case we can assume that l=2. As shown in another article by the same authors, this problem is also NP-complete for c>=6 atoms, even for dimension D=2 and for axis-parallel X-rays. The authors of that article conjecture that the problem remains NP-complete for c = 3, 4, 5.

We resolve this conjecture by proving that the problem is indeed NP-complete for c>=3 in 2D, even for axis-parallel X-rays. Our construction relies heavily on some structural results for the realizations of 0-1 matrices with given row and column sums.
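For intuition on the c=1, l=2 building block underlying such constructions: a 0-1 matrix with prescribed row and column sums can be built greedily, each row placing its ones in the columns with the largest remaining sums; by a Gale-Ryser-type exchange argument this greedy succeeds whenever any realization exists. A sketch for illustration only, not the reduction used in the paper:

    def realize(row_sums, col_sums):
        """Greedy (Ryser-style) construction of a 0-1 matrix with the given
        row and column sums, or None if the demands cannot be met. Each row
        places its ones in the columns with the largest remaining sums."""
        if sum(row_sums) != sum(col_sums):
            return None
        n = len(col_sums)
        remaining = list(col_sums)
        matrix = []
        for r in row_sums:
            if r > n:
                return None
            cols = sorted(range(n), key=lambda j: -remaining[j])[:r]
            if any(remaining[j] == 0 for j in cols):
                return None                 # some column is over-demanded
            row = [0] * n
            for j in cols:
                row[j] = 1
                remaining[j] -= 1
            matrix.append(row)
        return matrix                       # remaining is all zero here

    print(realize([2, 3, 1], [2, 2, 1, 1]))
    # [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 0, 1]]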

Keywords: Discrete Tomography, X-rays, HRTEM, QUANTITEM, Multicommodity flow, Contingency table. ----- File: 1998/tr-98-013 A Digital Fountain Approach to Reliable Distribution of Bulk Data John W. Byers, Michael Luby, Michael Mitzenmacher, Ashutosh Rege tr-98-013 May 1998 The proliferation of applications that must reliably distribute bulk data to a large number of autonomous clients motivates the design of new multicast and broadcast protocols. We describe an ideal, fully scalable protocol for these applications that we call a digital fountain. A digital fountain allows any number of heterogeneous clients to acquire bulk data with optimal efficiency at times of their choosing. Moreover, no feedback channels are needed to ensure reliable delivery, even in the face of high loss rates.

We develop a protocol that closely approximates a digital fountain using a new class of erasure codes that for large block sizes are orders of magnitude faster than standard erasure codes. We provide performance measurements that demonstrate the feasibility of our approach and discuss the design, implementation and performance of an experimental system.

Keywords: erasure codes, Tornado codes, FEC codes, digital fountain, reliable multicast, reliable broadcast, one-way transmission, satellite, wireless, Internet. ----- File: 1998/tr-98-014 Incorporating Information From Syllable-length Time Scales into Automatic Speech Recognition Su-Lin Wu tr-98-014 May 1998 Incorporating the concept of the syllable into speech recognition may improve recognition accuracy through the integration of information over syllable-length time spans. Evidence from psychoacoustics and phonology suggests that humans use the syllable as a basic perceptual unit. Nonetheless, the explicit use of such long-time-span units is comparatively unusual in automatic speech recognition systems for English. The work described in this thesis explored the utility of information collected over syllable-related time-scales. The first approach involved integrating syllable segmentation information into the speech recognition process. The addition of acoustically-based syllable onset estimates (Shire 1997) resulted in a 10% relative reduction in word-error rate. The second approach began with developing four speech recognition systems based on long-time-span features and units, including modulation spectrogram features (Greenberg & Kingsbury 1997). Error analysis suggested the strategy of combining, which led to the implementation of methods that merged the outputs of syllable-based recognition systems with the phone-oriented baseline system at the frame level, the syllable level and the whole-utterance level. These combined systems exhibited relative improvements of 20-40% compared to the baseline system for clean and reverberant speech test cases.
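Frame-level combination of two recognizers, one of the merging strategies mentioned above, is often realized as a weighted average of log posteriors. A minimal sketch under that assumption; this is not necessarily the exact rule used in the thesis:

    import numpy as np

    def combine_frame_posteriors(p_a, p_b, w=0.5):
        """Merge two streams of per-frame phone posteriors (frames x classes)
        by a weighted average of log probabilities, then renormalize."""
        log_p = w * np.log(p_a + 1e-10) + (1 - w) * np.log(p_b + 1e-10)
        p = np.exp(log_p)
        return p / p.sum(axis=1, keepdims=True)

    # Two toy 3-frame, 4-class posterior streams (e.g., one phone-oriented
    # baseline and one syllable-based recognizer).
    rng = np.random.default_rng(1)
    a = rng.dirichlet(np.ones(4), size=3)
    b = rng.dirichlet(np.ones(4), size=3)
    print(combine_frame_posteriors(a, b))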

Keywords: speech recognition, syllable, combination, syllabic onsets, human auditory perception, reverberation, neural network ----- File: 1998/tr-98-015 Incremental Class Learning approach and its application to Handwritten Digit Recognition Jacek Mandziuk and Lokendra Shastri tr-98-015 June 1998 Incremental Class Learning (ICL) provides a feasible framework for the development of scalable learning systems. Instead of learning a complex problem at once, ICL focuses on learning subproblems incrementally, one at a time --- using the results of prior learning for subsequent learning --- and then combining the solutions in an appropriate manner. With respect to multi-class classification problems, the ICL approach presented in this paper can be summarized as follows. Initially the system focuses on one category. After it learns this category, it tries to identify a compact subset of features (nodes) in the hidden layers that are crucial for the recognition of this category. The system then {\em freezes} these crucial nodes (features) by fixing their incoming weights. As a result, these features cannot be obliterated in subsequent learning. These frozen features are available during subsequent learning and can serve as parts of weight structures built to recognize other categories. As more categories are learned, the set of features gradually stabilizes and learning a new category requires less effort. Eventually, learning a new category may only involve combining existing features in an appropriate manner. The approach promotes the {\em sharing} of learned features among a number of categories and also alleviates the well-known {\em catastrophic interference} problem. We present results of applying the ICL approach to the Handwritten Digit Recognition problem, based on a spatio-temporal representation of patterns.
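The freezing step can be pictured as masking weight updates for the recruited features. A toy sketch follows; the layer, names, and update rule are illustrative and do not reproduce the authors' network:

    import numpy as np

    class FreezableLayer:
        """Fully connected layer whose incoming weights can be frozen
        unit-by-unit: frozen hidden units keep their learned features and
        cannot be obliterated when later categories are learned."""
        def __init__(self, n_in, n_hidden, rng):
            self.w = rng.normal(0, 0.1, (n_in, n_hidden))
            self.frozen = np.zeros(n_hidden, dtype=bool)

        def freeze(self, units):
            self.frozen[list(units)] = True        # fix incoming weights

        def update(self, grad, lr=0.1):
            grad = grad * ~self.frozen             # zero updates to frozen units
            self.w -= lr * grad

    rng = np.random.default_rng(0)
    layer = FreezableLayer(4, 3, rng)
    layer.freeze([0])                              # unit 0 is crucial for category A
    before = layer.w[:, 0].copy()
    layer.update(rng.normal(size=(4, 3)))          # learning category B
    assert np.allclose(layer.w[:, 0], before)      # feature 0 untouched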

Keywords: Incremental Class Learning, catastrophic interference problem, supervised learning, spatio-temporal representation, pattern recognition, Handwritten Digit Recognition, neural network ----- File: 1998/tr-98-016 The auditory organization of speech in listeners and machines Martin Cooke and Daniel P.W. Ellis tr-98-016 June 1998 Speech is typically perceived against a background of other sounds. Listeners are adept at extracting target sources from the acoustic mixture reaching the ears. The auditory scene analysis account holds that this feat is the result of a two stage process. In the first stage, sound is decomposed both within and across auditory nuclei. Subsequent processes of perceptual organisation are informed both by cues which suggest a common source of origin and prior experience. These operate on the decomposed auditory scene to extract coherent evidence for one or more sources for subsequent processing. Auditory scene analysis in listeners has been studied for several decades and recent years have seen a steady accumulation of computational models of perceptual organisation. The purpose of this review is to describe the evidence for auditory organization in listeners and to explore the computational models which have been motivated by such evidence. The primary focus is on speech rather than on sources such as polyphonic music or nonspeech ambient backgrounds, although these other domains are equally amenable to auditory organization. The review concludes with a discussion of the relationship between auditory scene analysis and alternative approaches to sound source segregation. ----- File: 1998/tr-98-017 Scatter-partitioning RBF network for function regression and image segmentation: Preliminary results Andrea Baraldi tr-98-017 June 1998 Scatter-partitioning Radial Basis Function (RBF) networks increase their number of degrees of freedom with the complexity of an input-output mapping to be estimated on the basis of a supervised training data set. Due to its superior expressive power a scatter-partitioning Gaussian RBF (GRBF) model, termed Supervised Growing Neural Gas (SGNG), is selected from the literature. SGNG employs a one-stage error-driven learning strategy and is capable of generating and removing both hidden units and synaptic connections. A slightly modified SGNG version is tested as a function estimator when the training surface to be fitted is an image, i.e., a 2-D signal whose size is finite. The relationship between the generation, by the learning system, of disjointed maps of hidden units and the presence, in the image, of pictorially homogeneous subsets (segments) is investigated. Unfortunately, the examined SGNG version performs poorly both as function estimator and image segmenter. This may be due to an intrinsic inadequacy of the one-stage error-driven learning strategy to adjust structural parameters and output weights simultaneously but consistently. In the framework of RBF networks, further studies should investigate the combination of two-stage error-driven learning strategies with synapse generation and removal criteria.

Keywords: RBF networks, supervised and unsupervised learning from data, prototype vectors, synaptic links, Gestaltist theory, image segmentation, low-level vision. ----- File: 1998/tr-98-018 SAR image segmentation exploiting no background knowledge on speckled radiance: A feasibility study Andrea Baraldi and Flavio Parmiggiani tr-98-018 June 1998 This work presents a SAR image segmentation scheme consisting of a sequence of four modules, all selected from the literature. These modules are: i) a speckle model-free contour detector that is the core of the segmentation scheme; ii) a geometrical procedure to detect closed regions from non-connected contours; iii) a region growing procedure whose merging rules exploit local image properties, both topological and spectral, to eliminate artifacts and reduce oversegmentation introduced by the second stage; iv) a neural network clustering algorithm to detect global image regularities in the sequence of within-segment properties extracted from the partitioned image provided by the third stage. In the framework of a commercial image-processing software toolbox, the proposed SAR image segmentation scheme employs a contour detector that is promising because: i) it is easy to use, requiring the user to select only one contrast threshold as a relative number; and ii) it exploits no prior domain-specific knowledge about the data source and the content of the scene, i.e., it is capable of processing SAR images as well as both achromatic and multi-spectral optical images. The segmentation scheme is tested on three images acquired by different SAR sensors. The robustness of the segmentation method is assessed by changing only one parameter of the procedure in the different experiments. Experimental results are interpreted as an encouragement to focus further multidisciplinary research on how to combine responses of multi-scale filter banks in low-level visual systems.

Keywords: speckled radiance, speckle noise, image segmentation, low-level vision. ----- File: 1998/tr-98-019 Decoding Algebraic-Geometric Codes Beyond the Error-Correction Bound M. Amin Shokrollahi and H. Wasserman tr-98-019 June 1998 We generalize Sudan's results for Reed-Solomon codes to the class of algebraic-geometric codes, designing polynomial-time algorithms which decode beyond the error-correction bound (d-1)/2, where d is the minimum distance of the code. We introduce [n,k,e,b]_q-codes, which are linear [n,k]_q-codes such that any Hamming sphere of radius e contains at most b codewords. Using the sequence of Garcia-Stichtenoth function fields, we construct sequences of constant-rate [n,k,e,b]_q-codes for which e/n tends to epsilon>1/2 as n grows large, while b and q remain fixed. Equivalently, we specify arbitrarily large constant-rate codes over a fixed field F_q such that a codeword is efficiently, non-uniquely reconstructible after more than half of its letters have been arbitrarily corrupted. Additionally, we discover a very simple algorithm for conventional decoding of AG-codes. Furthermore, we construct codes such that a codeword is uniquely and efficiently reconstructible after more than half of its letters have been corrupted by noise which is random in a specified sense. We summarize our results in terms of bounds on asymptotic parameters, giving a new characterization of decoding beyond the traditional error-correction bound.
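For orientation, the bound being surpassed can be stated in one line: an $[n,k]_q$ code with minimum distance $d$ admits unique decoding only up to an error fraction below $1/2$, while the $[n,k,e,b]_q$ codes above trade uniqueness for at most $b$ candidates. In standard notation (these are textbook facts, not the report's results):

    % Unique decoding: Hamming spheres of radius t around distinct
    % codewords are disjoint whenever 2t < d, hence
    t_{\max} = \left\lfloor \frac{d-1}{2} \right\rfloor,
    \qquad
    \frac{t_{\max}}{n} < \frac{d}{2n} \le \frac{1}{2}.
    % An [n,k,e,b]_q code relaxes this: any sphere of radius e contains
    % at most b codewords, so after e corruptions at most b candidates
    % remain; the constructions above keep the rate k/n constant while
    % e/n -> \epsilon > 1/2 with b and q fixed.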

Keywords: Algebraic geometric codes, Reed-Solomon codes, decoding. ----- File: 1998/tr-98-020 Reconstructing hv-Convex Polyominoes from Orthogonal Projections Marek Chrobak and Christoph Dürr tr-98-020 July 1998

Tomography is the area of reconstructing objects from projections. Here we wish to reconstruct a set of cells in a two-dimensional grid, given the number of cells in every row and column. The set is required to be an hv-convex polyomino, that is, all its cells must be connected and the cells in every row and column must be consecutive.

A simple, polynomial algorithm for reconstructing hv-convex polyominoes is provided, which is several orders of magnitude faster than the best previously known algorithm, due to Barcucci et al. In addition, the problem of reconstructing a special class of centered hv-convex polyominoes is addressed. (An object is centered if it contains a row whose length equals the total width of the object.) It is shown that in this case the reconstruction problem can be solved in linear time.

Implementations are available online.
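A brute-force reference implementation makes the problem statement concrete; it is exponential and only usable on tiny grids, quite unlike the paper's polynomial algorithm:

    from itertools import product

    def reconstruct(rows, cols):
        """Exhaustive search for an hv-convex polyomino with the given row
        and column sums. A specification, not an efficient algorithm:
        grids beyond roughly 4x4 are infeasible."""
        m, n = len(rows), len(cols)
        for bits in product((0, 1), repeat=m * n):
            grid = [bits[i * n:(i + 1) * n] for i in range(m)]
            if [sum(r) for r in grid] != rows:
                continue
            if [sum(grid[i][j] for i in range(m)) for j in range(n)] != cols:
                continue
            if is_hv_convex(grid) and is_connected(grid):
                return grid
        return None

    def is_hv_convex(grid):
        """Cells in every row and every column must be consecutive."""
        lines = [list(r) for r in grid] + [list(c) for c in zip(*grid)]
        return all("10" not in "".join(map(str, ln)).strip("0") for ln in lines)

    def is_connected(grid):
        """Depth-first search over the 4-neighborhood of the occupied cells."""
        m, n = len(grid), len(grid[0])
        cells = {(i, j) for i in range(m) for j in range(n) if grid[i][j]}
        if not cells:
            return False
        stack, seen = [next(iter(cells))], set()
        while stack:
            i, j = stack.pop()
            if (i, j) in seen:
                continue
            seen.add((i, j))
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                if (i + di, j + dj) in cells:
                    stack.append((i + di, j + dj))
        return seen == cells

    print(reconstruct([1, 3, 2], [2, 3, 1]))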

Keywords: Combinatorial problems, discrete tomography, polyominoes. ----- File: 1998/tr-98-022 Optimal Dynamic Embeddings of Complete Binary Trees into Hypercubes Volker Heun and Ernst W. Mayr tr-98-022 August 1998 The double-rooted complete binary tree is a complete binary tree where the root is replaced by an edge. It is folklore that the double-rooted complete binary tree is a spanning tree of the hypercube of the same size. Unfortunately, the usual construction of an embedding of a double-rooted complete binary tree into the hypercube does not provide any hint as to how this embedding can be extended if each leaf spawns two new leaves. In this paper, we present simple dynamic embeddings of double-rooted complete binary trees into hypercubes which do not suffer from this disadvantage. We also present edge-disjoint embeddings of large binary trees with optimal load and unit dilation. Furthermore, all these embeddings can be efficiently implemented on the hypercube itself such that the embedding of each new level of leaves can be computed in constant time. Since complete binary trees are similar to double-rooted complete binary trees, our results can be immediately transferred to complete binary trees.
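The folklore fact can be checked mechanically for the smallest interesting case, the 8-vertex double-rooted complete binary tree and the 3-dimensional hypercube. The exhaustive search below is illustrative only and does not scale:

    from itertools import permutations

    # Double-rooted complete binary tree on 8 vertices: the root of a
    # 7-vertex complete binary tree is replaced by the edge (0, 1); each
    # of 0 and 1 then carries a 3-vertex complete binary subtree.
    TREE_EDGES = [(0, 1), (0, 2), (2, 3), (2, 4), (1, 5), (5, 6), (5, 7)]

    def spans_hypercube(edges, d=3):
        """Search for a bijection of tree vertices onto the d-cube's nodes
        under which every tree edge joins hypercube neighbours, i.e., a
        spanning-tree embedding with unit dilation."""
        for perm in permutations(range(2 ** d)):
            if all(bin(perm[u] ^ perm[v]).count("1") == 1 for u, v in edges):
                return {v: format(perm[v], f"0{d}b") for v in range(2 ** d)}
        return None

    print(spans_hypercube(TREE_EDGES))   # an embedding exists (folklore)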

Keywords: Simulation of Algorithms, Hypercubes, Graph Embeddings, Complete Binary Trees ----- File: 1998/tr-98-023 Efficient Dynamic Embeddings of Binary Trees into Hypercubes Volker Heun and Ernst W. Mayr tr-98-023 August 1998 In this paper, a deterministic algorithm for dynamically embedding binary trees into hypercubes is presented. Because of a known lower bound, any such algorithm must use either randomization or migration, i.e., remapping of tree vertices, to obtain an embedding of trees into hypercubes with small dilation, load, and expansion simultaneously. Using migration of previously mapped tree vertices, the presented algorithm constructs a dynamic embedding which achieves dilation of at most 9, unit load, nearly optimal expansion, and constant edge- and node-congestion. This is the first dynamic embedding that achieves these bounds simultaneously. Moreover, the embedding can be computed efficiently on the hypercube itself. The amortized time for each spawning step is bounded by O(log^2(L)), if in each step at most L new leaves are spawned. From this construction, a dynamic embedding of large binary trees into hypercubes is derived which achieves dilation of at most 6 and nearly optimal load. Similarly, this embedding can be constructed with nearly optimal load q on the hypercube itself in amortized time O(q log^2(L/q)) per spawning step, if in each step at most L new leaves are added.

Keywords: Simulation of Algorithms, Hypercubes, Binary Trees, Dynamic Graph Embeddings ----- File: 1998/tr-98-026 A Fuzzy Based Load Sharing Mechanism for Distributed Systems Herwig Unger and Thomas Boehme tr-98-026 August 1998 This report presents a load sharing heuristic for distributed computing on workstation clusters. The approach is novel in that it combines the use of predicted resource requirements of processes (CPU-time, memory requirements, density of the I/O-stream) with a fuzzy logic controller which makes the placement decision. The heuristic is distributed, i.e., each node runs a copy of the prediction and load sharing code, and its implementation is based on PVM. Using a benchmark program (Choleski factorization), experiments were conducted to compare the proposed heuristic against standard PVM and an older version of the presented heuristic without the fuzzy logic controller.
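A minimal sketch of a fuzzy placement rule of the general kind described; the membership functions, threshold, and the single rule are invented for illustration, and the report's controller is more elaborate:

    def mu_high(x, lo=0.4, hi=0.8):
        """Membership of 'x is high': 0 below lo, 1 above hi, linear between."""
        return min(1.0, max(0.0, (x - lo) / (hi - lo)))

    def place_remotely(local_load, remote_load, cpu_demand):
        """Tiny Mamdani-style rule: migrate a process if the local node is
        highly loaded AND the remote node is not AND the process is
        CPU-heavy. Min models AND, 1-x models NOT; the rule strength is
        then thresholded to yield a crisp placement decision."""
        strength = min(mu_high(local_load),
                       1.0 - mu_high(remote_load),
                       mu_high(cpu_demand, lo=0.2, hi=0.6))
        return strength > 0.5

    print(place_remotely(local_load=0.9, remote_load=0.1, cpu_demand=0.7))  # True
    print(place_remotely(local_load=0.5, remote_load=0.4, cpu_demand=0.7))  # False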

Keywords: Distributed Systems, Fuzzy Logic, PVM, Workstation Cluster ----- File: 1998/tr-98-027 Face Recognition: a Summary of 1995 - 1997 Thomas Fromherz tr-98-027 August 1998 The development of face recognition over the past years allows an organization into three types of recognition algorithms, namely frontal, profile, and view-tolerant recognition, depending on the kind of imagery and the corresponding recognition algorithms. While frontal recognition certainly is the classical approach, view-tolerant algorithms usually perform recognition in a more sophisticated fashion by taking into consideration some of the underlying physics, geometry, and statistics. Profile schemes as stand-alone systems have a rather marginal significance for identification. However, they are very practical either for fast coarse pre-searches of large face databases to reduce the computational load for a subsequent sophisticated algorithm, or as part of a hybrid recognition scheme.

Such hybrid approaches have a special status among face recognition systems, as they combine different recognition approaches in either serial or parallel order to overcome the shortcomings of the individual components.

Keywords: Face recognition, Identification, Authentication, Hybrid recognition, Classifiers ----- File: 1998/tr-98-028 Learning from data: general issues and special applications of Radial Basis Function networks Andrea Baraldi and N. A. Borghese tr-98-028 August 1998 In the first part of this work some important issues regarding the use of data-driven learning systems are discussed. Next, a special category of learning systems known as artificial Neural Networks (NNs) is presented. Our attention is focused on a specific class of NNs, termed Radial Basis Function (RBF) networks, which are widely employed in classification and function regression tasks. A constructive RBF network, termed Hierarchical RBF (HRBF) model, is proposed. An application where the HRBF model is applied to reconstruct a continuous 3-D surface from range data samples is presented.

Keywords: Inductive and deductive types of inference, learning from data, predictive learning, supervised and unsupervised learning, actual risk and empirical risk, curse of dimensionality, basis function, kernel function, neural networks, Multi-Layer-Perceptron, Radial Basis Function network, data-driven and error-driven learning, hybrid learning, one- and two-stage learning, grid-partitioning and scatter-partitioning network, constructive network, Hierarchical Radial Basis Function network. ----- File: 1998/tr-98-029 Approximate Protein Folding in the HP Side Chain Model on Extended Cubic Lattices Volker Heun tr-98-029 December 1998 One of the most important open problems in computational molecular biology is the prediction of the conformation of a protein based on its amino acid sequence. In this paper, we design approximation algorithms for structure prediction in the so-called HP side chain model. The major drawback of the standard HP side chain model is the bipartiteness of the cubic lattice. To eliminate this drawback, we introduce the extended cubic lattice, which extends the cubic lattice by diagonals in the plane. For this lattice, we present two linear algorithms with approximation ratios of 59/70 and 37/42, respectively. The second algorithm is designed for a `natural' subclass of proteins, which covers more than 99.5% of all sequenced proteins. This is the first time that a protein structure prediction algorithm has been designed for a `natural' subclass of all combinatorially possible sequences.

Keywords: Approximation Algorithm, Protein Folding, Polymer Structure Prediction, HP Model, HP Side Chain Model, Extended Cubic Lattice ----- File: 1998/tr-98-031 MICO: A CORBA 2.2 compliant implementation Arno Puder and Kay Roemer tr-98-031 September 1998 The Common Object Request Broker Architecture (CORBA) describes the architecture of a middleware platform which supports the implementation of applications in distributed and heterogeneous environments. In contrast to other middleware platforms like DCOM from Microsoft, CORBA is a specification that does not prescribe any specific technology. In fact, the specification is freely available from the OMG's homepage and anyone can implement a compliant CORBA system. In this technical report we give an overview of MICO, a freely available CORBA implementation. The acronym MICO, in the spirit of GNU, recursively expands to "Mico Is COrba". ----- File: 1998/tr-98-032 A Security Mechanism for the Resource Management in a Web Operating System Herwig Unger tr-98-032 September 1998 Resource security is perhaps one of the most important features for any distributed computing on the Web. In this article, a new adaptive approach is presented that realizes the authorization and identification of a remote user using fingerprints built from a set of typical system data related to that user. The suggested approach avoids the use of secured, trusted third machines and adapts access rights using a fine-grained set of confidence levels for a possibly changing group of users. ----- File: 1998/tr-98-033 Online Association Rule Mining Christian Hidber tr-98-033 September 1998 We present a novel algorithm to compute large itemsets online. The user is free to change the support threshold any time during the first scan of the transaction sequence. The algorithm maintains a superset of all large itemsets and, for each itemset, a shrinking, deterministic interval on its support. After at most 2 scans the algorithm terminates with the precise support for each large itemset. Typically our algorithm is an order of magnitude more memory-efficient than Apriori or DIC. ----- File: 1998/tr-98-035 The PHiPAC v1.0 Matrix-Multiply Distribution. Jeff Bilmes, Krste Asanovic, Chee-Whye Chin, Jim Demmel tr-98-035 October 1998 Modern microprocessors can achieve high performance on linear algebra kernels but this currently requires extensive machine-specific hand tuning. We have developed a methodology whereby near-peak performance on a wide range of systems can be achieved automatically for such routines. First, by analyzing current machines and C compilers, we've developed guidelines for writing Portable, High-Performance, ANSI C (PHiPAC, pronounced ``fee-pack''). Second, rather than code by hand, we produce parameterized code generators. Third, we write search scripts that find the best parameters for a given system. We report on a BLAS GEMM compatible multi-level cache-blocked matrix multiply generator which produces code that achieves around 90\% of peak on the Sparcstation-20/61, IBM RS/6000-590, HP 712/80i, SGI Power Challenge R8k, and SGI Octane R10k, and over 80\% of peak on the SGI Indigo R4k. In this paper, we provide a detailed description of the PHiPAC V1.0 matrix multiply distribution. We describe the code generator in detail, including the various register and higher level blocking strategies. We also document the organization and parameters of the search scripts. This technical report is an expanded version of a previous paper that appeared in ICS97.
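The cache-blocking idea at the heart of PHiPAC can be shown in a few lines; the generated ANSI C is of course far more elaborate (register blocking, multiple cache levels, software pipelining). A one-level blocking sketch: the arithmetic is identical to the naive triple loop, only the iteration order changes so that a tile of each operand stays in cache while it is reused.

    def blocked_matmul(A, B, nb=32):
        """C = A @ B computed block-by-block so that an nb x nb tile of each
        operand fits in cache; same arithmetic as the naive triple loop,
        but with far better locality on large matrices."""
        n, m, p = len(A), len(B), len(B[0])
        C = [[0.0] * p for _ in range(n)]
        for ii in range(0, n, nb):
            for kk in range(0, m, nb):
                for jj in range(0, p, nb):
                    for i in range(ii, min(ii + nb, n)):
                        for k in range(kk, min(kk + nb, m)):
                            a = A[i][k]
                            for j in range(jj, min(jj + nb, p)):
                                C[i][j] += a * B[k][j]
        return C

    A = [[1.0, 2.0], [3.0, 4.0]]
    B = [[5.0, 6.0], [7.0, 8.0]]
    print(blocked_matmul(A, B))   # [[19.0, 22.0], [43.0, 50.0]]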
----- File: 1998/tr-98-036 Scheduling with Limited Machine Availability Günter Schmidt tr-98-036 October 1998 This paper reviews results related to deterministic scheduling problems where machines are not continuously available for processing. There might be incomplete information about the points in time at which machines change availability. The complexity of single and multi machine problems is analyzed considering criteria on completion times and due dates. The review mainly covers intractability results, polynomial optimization and approximation algorithms. In some places, results from enumerative algorithms and heuristics are also surveyed.

Keywords: scheduling theory, availability constraints, algorithms ----- File: 1998/tr-98-037 Robust Speech Recognition Using Articulatory Information Katrin Kirchhoff tr-98-037 August 1998 This report describes experiments in speech recognition using articulatory information. Previously, articulatory-based speech recognizers have exclusively been developed for clean speech; the potential of an articulatory representation of the speech signal for noisy test conditions, by contrast, has not been explored. Moreover, there have barely been attempts at systematically combining articulatory information with standard acoustic recognizers. This paper investigates these aspects and reports speech recognition results on a variety of acoustic test conditions for individual acoustic and articulatory speech recognizers, as well as for a combined system. On a continuous numbers recognition task, the acoustic system generally performs equal to, or slightly better than, the articulatory system, whereas the articulatory system shows a statistically significant improvement on noisy speech with a low signal-to-noise ratio. The combined system nearly always performs significantly better than either of the individual systems. ----- File: 1998/tr-98-038 A survey of fuzzy clustering algorithms for pattern recognition Andrea Baraldi and Palma Blonda tr-98-038 October 1998 Clustering algorithms aim at modelling fuzzy (i.e., ambiguous) unlabeled patterns efficiently. Our goal is to propose a theoretical framework where clustering systems can be compared on the basis of their learning strategies.

In the first part of this work, the following issues are reviewed: relative (probabilistic) and absolute (possibilistic) fuzzy membership functions and their relationships to the Bayes rule, batch and on-line learning, growing and pruning networks, modular network architectures, topologically perfect mapping, ecological nets and neuro-fuzziness. From this discussion an equivalence between the concepts of fuzzy clustering and soft competitive learning in clustering algorithms is proposed as a unifying framework in the comparison of clustering systems. Moreover, a set of functional attributes is selected for use as dictionary entries in our comparison.

In the second part of this paper, five clustering algorithms taken from the literature are reviewed and compared on the basis of the selected properties of interest. These network clustering models are:
i) Self-Organizing Map (SOM);
ii) Fuzzy Learning Vector Quantization (FLVQ);
iii) Fuzzy Adaptive Resonance Theory (Fuzzy ART);
iv) Growing Neural Gas (GNG); and
v) Fully self-Organizing Simplified Adaptive Resonance Theory (FOSART).

Although our theoretical comparison is fairly simple, it yields observations that may appear paradoxical. Firstly, only FLVQ, Fuzzy ART and FOSART exploit concepts derived from fuzzy set theory (e.g., relative and/or absolute fuzzy membership functions). Secondly, only SOM, FLVQ, GNG and FOSART employ soft competitive learning mechanisms, which are affected by asymptotic misbehaviors in the case of FLVQ, i.e., only SOM, GNG and FOSART are considered effective fuzzy clustering algorithms. ----- File: 1998/tr-98-039 The Virtual Gallery (TVIG) - 3D visualization of a queryable art-database on the Internet Andreas Mueller and Erich Neuhold tr-98-039 August 1998 The still rapidly growing Internet offers new ways to reach an increasing number of people in all areas of life. More and more companies take advantage of this fact by advertising and selling their products through this new electronic medium. Art is a good example of this new approach, because visualization is the most important aspect and the physical presence of the exhibited object has only secondary significance for the buying process, in contrast to other products (e.g. instruments, perfume, cars, etc.). This paper introduces an electronic service for galleries and artists to exhibit their artwork on the Internet easily and efficiently. The Virtual Internet Gallery (TVIG) utilizes a database to offer fast search functionality and performs a 3D visualization of the user's query result, applying VRML. Users who are interested in the exhibited art can contact the gallery or artist directly through the system. ----- File: 1998/tr-98-041 Markov Models and Hidden Markov Models: A Brief Tutorial Eric Fosler-Lussier tr-98-041 December 1998 This tutorial gives a gentle introduction to Markov models and Hidden Markov models as mathematical abstractions, and relates them to their use in automatic speech recognition. This material was developed for the Fall 1995 semester of CS188: Introduction to Artificial Intelligence at the University of California, Berkeley. It is targeted for introductory AI courses; basic knowledge of probability theory (e.g. Bayes' Rule) is assumed. This version is slightly updated from the original, including a few minor error corrections, a short "Further Reading" section, and exercises that were given as a homework in the Fall 1995 class. ----- File: 1998/tr-98-042 Unsupervised Learning from Dyadic Data Thomas Hofmann and Jan Puzicha tr-98-042 December 1998 Dyadic data refers to a domain with two finite sets of objects in which observations are made for dyads, i.e., pairs with one element from either set. This includes event co-occurrences, histogram data, and single stimulus preference data as special cases. Dyadic data arises naturally in many applications ranging from computational linguistics and information retrieval to preference analysis and computer vision. In this paper, we present a systematic, domain-independent framework for unsupervised learning from dyadic data by statistical mixture models. Our approach covers different models with flat and hierarchical latent class structures and unifies probabilistic modeling and structure discovery. Mixture models provide both a parsimonious yet flexible parameterization of probability distributions with good generalization performance on sparse data and structural information about data-inherent grouping structure.
We propose an annealed version of the standard Expectation Maximization algorithm for model fitting, which is empirically evaluated on a variety of data sets from different domains. ----- File: 1998/tr-98-043 Advances in SHRUTI: A neurally motivated model of relational knowledge representation and rapid inference using temporal synchrony. Lokendra Shastri tr-98-043 December 1998 We are capable of drawing a variety of inferences effortlessly, spontaneously, and with remarkable efficiency --- as though these inferences are a reflex response of our cognitive apparatus. This remarkable human ability poses a challenge for cognitive science and computational neuroscience: How can a network of slow neuron-like elements represent a large body of systematic knowledge and perform a wide range of inferences with such speed? The connectionist model SHRUTI attempts to address this challenge by demonstrating how a neurally plausible network can encode a large body of semantic and episodic facts, systematic rules, and knowledge about entities and types, and yet perform a wide range of explanatory and predictive inferences within a few hundred milliseconds. Relational structures (frames, schemas) are represented in SHRUTI by clusters of cells, and inference in SHRUTI corresponds to a transient propagation of rhythmic activity over such cell-clusters wherein dynamic bindings are represented by the synchronous firing of appropriate cells. SHRUTI encodes mappings across relational structures using high-efficacy links that enable the propagation of rhythmic activity, and it encodes items in long-term memory as coincidence and coincidence-error detector circuits that become active in response to the occurrence (or non-occurrence) of appropriate coincidences in the ongoing flux of rhythmic activity. Finally, ``understanding'' in SHRUTI corresponds to reverberant and coherent activity along closed loops of neural circuitry. Over the past several years, SHRUTI has undergone several enhancements that have augmented its expressiveness and inferential power. This paper describes some of these extensions that enable SHRUTI to (i) deal with negation and inconsistent beliefs, (ii) encode evidential rules and facts, (iii) perform inferences requiring the dynamic instantiation of entities, and (iv) seek coherent explanations of observations.

Keywords: knowledge representation; inference; evidential reasoning; dynamic binding; temporal synchrony. ----- File: 1999/tr-99-002 Fast Convergence of the Glauber Dynamics for Sampling Independent Sets: Part I Michael Luby and Eric Vigoda tr-99-002 January 1999 We consider the problem of sampling independent sets of a graph with maximum degree $\delta$. The weight of each independent set is expressed in terms of a fixed positive parameter $\lambda\leq\frac{2}{\delta-2}$, where the weight of an independent set $\sigma$ is $\lambda^{|\sigma|}$. The Glauber dynamics is a simple Markov chain Monte Carlo method for sampling from this distribution. We show fast convergence of this dynamics. This paper gives the more interesting proof for triangle-free graphs. The proof for arbitrary graphs is given in a companion paper. We also prove complementary hardness of approximation results, which show that it is hard to sample from this distribution when $\lambda > \frac{c}{\delta}$ for a constant $c > 0$. ----- File: 1999/tr-99-003 Fast Convergence of the Glauber Dynamics for Sampling Independent Sets: Part II Eric Vigoda tr-99-003 January 1999 This work is a continuation of ICSI technical report tr-99-002. The focus is on the problem of sampling independent sets of a graph with maximum degree $\delta$. The weight of each independent set is expressed in terms of a fixed positive parameter $\lambda\leq\frac{2}{\delta-2}$, where the weight of an independent set $\sigma$ is $\lambda^{|\sigma|}$. The Glauber dynamics is a simple Markov chain Monte Carlo method for sampling from this distribution. In the companion work, we showed fast convergence of this dynamics for triangle-free graphs. This paper proves fast convergence for arbitrary graphs. ----- File: 1999/tr-99-004 A Multi-Band Approach to Automatic Speech Recognition Naghmeh Nikki Mirghafori tr-99-004 January 1999 Multi-band approaches have recently generated a great deal of interest in the automatic speech recognition (ASR) community. In this paradigm, each sub-frequency region of the speech signal is treated as a distinct source of information and the streams are combined after each is processed independently. Motivations for the multi-band paradigm include results from psycho-acoustic studies, robustness to noise, and potential for parallel processing.

The main contribution of this dissertation is the systematic exploration of an area of great interest to many in the research community, showing that multi-band ASR is a viable option, not just for improving recognition accuracy in the presence of noise, but also for clean speech. The work focused on the design and implementation of a multi-band system, analysis of some of its characteristics, and development of extensions to the paradigm.

An analysis in terms of phonetic feature transmission showed multi-band processing to be better than a comparable traditional full-band design in many cases. It was observed that some bands were more accurate in discriminating between some phonetic categories. It was hypothesized that combining the confused sub-band classes would reduce the number of input classes and improve generalization. The size of the input space was reduced by almost 30%, and yet the global frame-level phonetic discrimination improved and the word recognition error did not change (the observed improvement was not statistically significant). The results were consistent with the original hypothesis.

The analysis also showed that the phonetic transitions in the sub-bands do not necessarily occur synchronously and are affected by conditions such as speaking rate and room reverberation. Relaxing the synchrony constraints in the sub-bands during word recognition was investigated. The experimental results suggested that removing the synchrony constraints for all phone-to-phone transitions is unlikely to be advantageous, while significantly increasing computational cost.

The combination of the multi-band and the full-band system was studied. This combination reduced the word recognition error rate for the experimental clean speech task by about 23-29% compared to the baseline system. The results obtained are the best that we know of on the Numbers95 experimental database. ----- File: 1999/tr-99-006 An elementary proof of the Johnson-Lindenstrauss Lemma Sanjoy Dasgupta and Anupam Gupta tr-99-006 March 1999 The Johnson-Lindenstrauss lemma shows that a set of $n$ points in high dimensional Euclidean space can be mapped down into an $O(\log n/\epsilon^2)$ dimensional Euclidean space such that the distance between any two points changes by only a factor of $(1 \pm \epsilon)$. In this report, we prove this lemma using elementary probabilistic techniques and show that it is essentially tight. ----- File: 1999/tr-99-007 A Time-Sensitive Actor Framework in Java for the Development of Multimedia Systems over the Internet MBone Giancarlo Fortino, Libero Nigro, and Andres Albanese tr-99-007 March 1999 This paper describes an architectural framework for the development of Internet-based multimedia systems such as interactive and collaborative media on-demand applications. The programming in-the-small level centres on Java and a variant of the Actor model especially designed for time-dependent distributed systems. The programming in-the-large level can be tuned to exploit current real-time and control protocols proposed for the Internet MBone. A multimedia application is modelled as a collection of autonomous and (possibly) mobile media actors interacting with one another to achieve a common goal. Multiple stream synchronisation is based on reflective actors (QoSsynchronizers) which filter message transmissions and apply to them application-dependent QoS constraints. Admission control of multiple sessions is delegated to a system Broker. The paper describes the actor framework and discusses its application to the construction of Java Multimedia Studio on-Demand, a multimedia system designed to support playback, recording and editing of multimedia presentations. ----- File: 1999/tr-99-008 Sleep Stage Classification using Wavelet Transform and Neural Network Edgar Oropesa, Hans L. Cycon, and Marc Jobert tr-99-008 March 1999 In this paper we present a new method for automatic sleep stage classification. The algorithm consists of basically three modules: a wavelet packet transformation (WPT) applied to 30-second epochs of EEG recordings to provide localized time-frequency information, a feature generator which quantifies the information and reduces the data set size, and an artificial neural network for optimal classification. The classification results, compared to those of a human expert, reached 70 to 80% agreement. ----- File: 1999/tr-99-009 A Biological Grounding of Recruitment Learning and Vicinal Algorithms Lokendra Shastri tr-99-009 April 1999 Biological neural networks are capable of gradual learning based on observing a large number of exemplars over time as well as rapidly memorizing specific events as a result of a single exposure. The primary focus of research in connectionist modeling has been on gradual learning, but some researchers have also attempted the computational modeling of rapid (one-shot) learning within a framework described variously as recruitment learning and vicinal algorithms.
While general arguments for the neural plausibility of recruitment learning and vicinal algorithms based on notions of neural plasticity have been presented in the past, a specific neural correlate of such learning has not been proposed. Here it is shown that recruitment learning and vicinal algorithms can be firmly grounded in the biological phenomena of long-term potentiation (LTP) and long-term depression (LTD). Toward this end, a computational abstraction of LTP and LTD is presented, and an ``algorithm'' for the recruitment of binding-detector cells is described and evaluated using biologically realistic data. It is shown that binding-detector cells of distinct bindings exhibit low levels of cross-talk even when the bindings overlap. In the proposed grounding, the specification of a vicinal algorithm amounts to specifying an appropriate network architecture and suitable parameter values for the induction of LTP and LTD.

Keywords: one-shot learning; memorization; recruitment learning; dynamic bindings; long-term potentiation; binding detection. ----- File: 1999/tr-99-010 Soft-to-hard model transition in clustering: a review A. Baraldi and L. Schenato tr-99-010 September 1999 Clustering analysis often employs unsupervised learning techniques originally developed for vector quantization. In this framework, a frequent goal of clustering systems is to minimize the {\it quantization error}, which is affected by many local minima. To avoid confinement of reference vectors to local minima of the quantization error and to avoid formation of dead units, hard $c$-means clustering algorithms are traditionally adapted by replacing their hard competitive strategy with a soft adaptation rule, where the degree of overlap between receptive fields is proportional to a monotonically decreasing scale (temperature) parameter. By starting at a high temperature, which is carefully lowered to zero, a soft-to-hard competitive clustering model transition is pursued, such that local minima of the quantization error are expected to emerge slowly, thereby preventing the set of reference vectors from being trapped in suboptimal states. A review of the hard $c$-means, Maximum-Entropy, Fuzzy Learning Vector Quantization (FLVQ), Neural Gas (NG), Self-Organizing Map (SOM) and a mixture of Gaussians method is provided, relationships between these methods are highlighted and a possible criterion for discriminating between different soft-to-hard competitive clustering model transitions is suggested.
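The soft-to-hard transition can be seen in a few lines with the Maximum-Entropy (softmax) membership reviewed above. A minimal sketch, assuming squared-distance costs and an externally supplied temperature schedule:

    import numpy as np

    def soft_assign(x, centers, T):
        """Maximum-entropy (softmax) memberships at temperature T; as T -> 0
        these approach the hard c-means winner-take-all assignment."""
        d2 = (x - np.asarray(centers)) ** 2
        e = np.exp(-(d2 - d2.min()) / max(T, 1e-12))   # shift for stability
        return e / e.sum()

    x, centers = 0.9, [0.0, 1.0, 2.0]
    for T in (2.0, 0.5, 0.05):
        print(T, np.round(soft_assign(x, centers, T), 3))
    # High T: nearly uniform memberships (wide receptive-field overlap);
    # low T: the mass concentrates on the nearest reference vector,
    # i.e., the soft-to-hard competitive model transition.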

Keywords: unsupervised learning, soft and hard competitive clustering algorithms, quantization error. ----- File: 1999/tr-99-011 A Spatiotemporal Connectionist Model of Algebraic Rule-Learning Lokendra Shastri and Shawn Chang tr-99-011 July, 1999 Recent experiments by Marcus, Vijayan, Rao, and Vishton suggest that infants are capable of extracting and using abstract algebraic rules such as ``the first item X is the same as the third item Y''. Such an algebraic rule represents a relationship between placeholders or variables for which one can substitute arbitrary values. As Marcus et al. point out, while most neural network models excel at capturing statistical patterns and regularities in data, they have difficulty in extracting algebraic rules that generalize to new items. We describe a connectionist network architecture that can readily acquire algebraic rules. The extracted rules are not tied to features of words used during habituation, and generalize to new words. Furthermore, the network acquires rules from a small number of examples, without using negative evidence, and without pretraining. A significant aspect of the proposed model is that it identifies a sufficient set of architectural and representational conditions that transform the problem of learning algebraic rules to the much simpler problem of learning to detect coincidences within a spatiotemporal pattern. Two key representational conditions are (i) the existence of nodes that encode serial position within a sequence and (ii) the use of temporal synchrony for expressing bindings between a positional role node and the item that occupies this position in a given sequence. This work suggests that even abstract algebraic rules can be grounded in concrete and basic notions such as spatial and temporal location, and coincidence. ----- File: 1999/tr-99-012 Simultaneous speech and speaker recognition using hybrid architecture Dominique Genoud, Dan Ellis, Nelson Morgan tr-99-012 July 1999 The automatic recognition process of the human voice is often divided into speech recognition and speaker recognition. These two areas use the same input signal (the voice), but not for the same purpose: speech recognition aims to recognize the message uttered by any speaker, while speaker recognition aims to identify the person who is talking. However, more and more applications need to use both kinds of information simultaneously. Some examples given below illustrate this tendency.

State-of-the-art speech recognition systems tend to be speaker-independent by using models (phonemes, diphones, triphones) estimated on huge databases containing numerous speakers, and also by using parameterization techniques which try to suppress speaker-dependent characteristics (PLP, RASTA-PLP). However, for some types of applications it can be important to re-adapt the speaker-independent speech recognizer to a given speaker, in order to improve noise robustness, for example, or simply to improve speech recognition performance by adding some knowledge of the speaker. Some recent results show that speaker adaptation of a speech recognizer improves system performance [DARPA, 1998].

Nowadays, numerous applications performing speech information retrieval require the automatic extraction of the content of shows and the retrieval of the speech of a particular speaker on a particular subject. In this case, speech recognition and speaker recognition should be carried out in parallel. Furthermore, the detection of speaker changes in a conversation (speaker A/speaker B, or speaker/music) may also be very useful for indexing and labeling the huge databases available.

Finally, speaker recognition is needed for applications like secured voice access to information (such as a bank account or a voice-mail box). In this case, speaker recognition can be text-independent if the content of the utterance is not checked. However, better results are obtained with text-dependent speaker recognition, both because what is said can be verified and because more accurate models (phonemes, words) can be built. In any case, text-dependent speaker recognition has to be preceded by a speech recognition step to verify and segment the message properly.

All these applications show the need for simultaneous speaker and speech recognition. This report shows that possibilities exist to carry out these two tasks simultaneously. ----- File: 1999/tr-99-013 A Study of Users' Perception of Relevance of Spoken Documents Tassos Tombros and Fabio Crestani tr-99-013 July, 1999 We present the results of a study of users' perception of relevance of documents. Documents retrieved in response to a query are presented to users in a variety of ways, from full text to a machine spoken query-biased automatically-generated summary, and the difference in users' perception of relevance is studied. The aim is to study experimentally how users' perception of relevance varies depending on the form in which retrieved documents are presented. The experimental results suggest that the effectiveness of advanced multimedia Information Retrieval applications may be affected by the low level of users' perception of relevance of retrieved documents. ----- File: 1999/tr-99-014 Robust Transmission of MPEG Video Streams over Lossy Packet-Switching Networks by using PET Andres Albanese and Giancarlo Fortino tr-99-014 June, 1999 Network heterogeneity is a major issue and multimedia applications have to deal with interconnected networks consisting of many sub-networks of non-uniformly distributed resources. Real-time traffic caused by video sources is bursty by nature, resulting in buffer overflow at the switch and unavoidable packet losses. It is therefore desirable that the information be compressed and prioritized in a way that lets the application degrade gracefully under adverse network conditions. Priority Encoding Transmission (PET) is an approach to the transmission of prioritized information over lossy packet-switched networks. The basic idea is that the source assigns different priorities to different segments of data, and then PET encodes the data using multilevel redundancy and disperses the encoding into the packets to be transmitted. The property of PET is that the destination is able to recover the data in priority order, based on the number of packets received per message. This report summarizes the results obtained to date in the PET project and gives directions for ongoing and further work. The paper describes the fundamentals of the theory on which PET is based, the integration of PET with MPEG-1, some experimental results, and an RTP-based application tool, VIC-MPET, which allows encoding and playing robust MPEG video streams over the Internet MBone. ----- File: 1999/tr-99-015 Dynamic Pronunciation Models for Automatic Speech Recognition John Eric Fosler-Lussier tr-99-015 September 1999 As of this writing, the automatic recognition of spontaneous speech by computer is fraught with errors; many systems transcribe one out of every three to five words incorrectly, whereas humans can transcribe spontaneous speech with one error in twenty words or better. This high error rate is due in part to the poor modeling of pronunciations within spontaneous speech. This dissertation examines how pronunciations vary in this speaking style, and how speaking rate and word predictability can be used to predict when greater pronunciation variation can be expected. It includes an investigation of the relationship between speaking rate, word predictability, pronunciations, and errors made by speech recognition systems.
The results of these studies suggest that for spontaneous speech, it may be appropriate to build models for syllables and words that can dynamically change the pronunciations used in the speech recognizer based on the extended context (including surrounding words, phones, speaking rate, etc.). Implementation of new pronunciation models automatically derived from data within the ICSI speech recognition system has shown a 4-5\% relative improvement on the Broadcast News recognition task. Roughly two thirds of these gains can be attributed to static baseform improvements; adding the ability to dynamically adjust pronunciations within the recognizer provides the other third of the improvement. The Broadcast News task also allows for comparison of performance on different styles of speech: the new pronunciation models do not help for pre-planned speech, but they provide a significant gain for spontaneous speech. Not only do the automatically learned pronunciation models capture some of the linguistic variation due to the speaking style, but they also represent variation in the acoustic model due to channel effects. The largest improvement was seen in the telephone speech condition, in which 12\% of the errors produced by the baseline system were corrected.
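As a picture of how a dynamic dictionary might condition the variant choice on speaking rate, purely for illustration: the thesis derives its models from data with decision trees, and the lexicon entries and threshold below are invented.

    def choose_pronunciation(word, rate, lexicon):
        """Pick a pronunciation variant given the local speaking rate
        (syllables/second); fast speech favors the reduced variant."""
        full, reduced = lexicon[word]
        return reduced if rate > 5.0 else full

    # Hypothetical two-variant lexicon entry (ARPAbet-style phone symbols).
    LEXICON = {"probably": (["p", "r", "aa", "b", "ax", "b", "l", "iy"],
                            ["p", "r", "aa", "b", "l", "iy"])}
    print(choose_pronunciation("probably", rate=6.2, lexicon=LEXICON))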

Keywords: speech recognition, pronunciation models, phonetics, speaking rate, word predictability, decision trees, linguistic variation ----- File: 1999/tr-99-016 An Experimental Study of the Effects of Word Recognition Errors in
Spoken Queries on the Effectiveness of an Information Retrieval System Fabio Crestani tr-99-016 October, 1999 The effects of word recognition errors (WRE) in spoken documents on the performance of an Information Retrieval (IR) system have been well studied and well reported in recent IR literature. Most of the research in this direction has been promoted by the Spoken Document Retrieval track of TREC. Much less experimental work has been devoted to studying the effects of WRE in spoken queries. It is easy to imagine that, given the typical length of the user query, the effects of WRE in queries on the performance of an IR system must be destructive. The experimental work reported in this paper sets out to test this. The paper reports on the background of such a study, on the construction of a test collection, and on the first experimental results. The preliminary conclusions drawn from the experimentation enable us to give some useful indications for the design of spoken query systems, despite the recognized limitations of the study. ----- File: 1999/tr-99-017 MetaViz: Visual Interaction with Geospatial Digital Libraries Volker Jung tr-99-017 October 1999 Recent initiatives in geospatial digital libraries provide access to a wealth of distributed data, but offer only basic levels of interactivity and user assistance. Consequently, users find it difficult and time-consuming to browse through data collections and locate those data sets that meet their requirements. The MetaViz project addresses two of the major barriers preventing the extensive use of digital libraries: lack of usability and information overload. This research focuses on geospatial data, making it possible to develop effective visualization and interaction methods that are based on familiar spatial metaphors. The visualization methods developed employ three-dimensional techniques, combining several characteristics or dimensions of metadata into single graphical views. As those visualizations are based on map and landscape metaphors, they are easy to understand and provide instant overviews of complex data characteristics. The visualization methods have been integrated into MetaViz, an interactive system for browsing and searching geospatial data. In MetaViz, graphical views of data characteristics can be created and combined dynamically, levels of detail can be adjusted and the data sets found can be previewed and accessed. MetaViz helps users to locate and select appropriate geospatial data from various sources and to combine and use them in an effective way. ----- File: 1999/tr-99-020 A Model for Combining Semantic and Phonetic Term Similarity for Spoken Document and Spoken Query Retrieval Fabio Crestani tr-99-020 December, 1999 In classical Information Retrieval systems a relevant document will not be retrieved in response to a query if the document and query representations do not share at least one term. This problem is known as ``term mismatch''. A similar problem can be found in spoken document retrieval and spoken query processing, where terms misrecognized by the speech recognition process can hinder the retrieval of potentially relevant documents. We will call this problem ``term misrecognition'', by analogy to the term mismatch problem. Here we present two classes of retrieval models that attempt to tackle both the term mismatch and the term misrecognition problems at retrieval time using term similarity information.
The models assume the availability of complete or partial knowledge of semantic and phonetic term-term similarity in the index term space. ----- File: 1999/tr-99-021 Schematic Maps for Robot Navigation Christian Freksa, Reinhard Moratz, and Thomas Barkowsky tr-99-021 December 1999 An approach to high-level interaction with autonomous robots by means of schematic maps is outlined. Schematic maps are knowledge representation structures to encode qualitative spatial information about a physical environment. A scenario is presented in which robots rely on high-level knowledge from perception and instruction to perform navigation tasks in a physical environment. The general problem of formally representing a physical environment for acting in it is discussed. A hybrid approach to knowledge and perception driven navigation is proposed. Different requirements for local and global spatial information are noted. Different types of spatial representations for spatial knowledge are contrasted. The advantages of high-level / low-resolution knowledge are pointed out. Creation and use of schematic maps are discussed. A navigation example is presented. ----- File: 2000/tr-00-001 Automatic Detection of Prosodic Stress in American English Discourse Rosaria Silipo and Steven Greenberg tr-00-001 March 2000 The goal of this study is twofold. First, it aims to implement an automatic detector of prosodic stress with sufficiently reliable performance. Second, the effectiveness of the acoustic features most commonly proposed in the literature is assessed. That is, the role played by duration, amplitude and fundamental frequency of syllabic nuclei is investigated. Several data-driven algorithms, such as Artificial Neural Networks (ANN), statistical decision trees and fuzzy classification techniques, and a knowledge-based heuristic algorithm are implemented for the automatic transcription of prosodic stress. As reference, two different subsets from the OGI English stories database were hand-labeled in terms of prosodic stress by two individuals trained in linguistics. While the ANN-based approach achieves the highest performance (77\% primarily stressed vocalic nuclei vs.~79\% unstressed vocalic nuclei on average over the two transcribers' data sets), the other methods show that both transcribers grant a major role to duration and (to a slightly lesser degree) to amplitude. Pitch-relevant features of the syllabic nuclei appear to play a much less important role than amplitude and duration. ----- File: 2000/tr-00-002 Broadcasting Time cannot be Approximated within a Factor of 57/56-epsilon Christian Schindelhauer tr-00-002 March 2000 Initially, the information is available only at some sources of a given network. The aim is to inform all nodes of the given network. To this end, every node can inform its neighbors sequentially, and newly informed nodes can proceed in parallel within their neighborhoods. The process of informing one node needs one time unit. The broadcasting problem is to compute the minimum length of such a broadcasting schedule.
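To make the schedule model concrete, a small sketch (an editorial illustration, not from the paper) simulates one round-based schedule in which each informed node informs at most one uninformed neighbor per time unit; the greedy neighbor choice is only a heuristic, since the paper shows below that the true minimum is hard even to approximate.

    # Sketch: simulate a greedy broadcast schedule on an undirected graph.
    def broadcast_time(adj, sources):
        informed = set(sources)
        t = 0
        while len(informed) < len(adj):
            newly = set()
            for u in informed:
                for v in adj[u]:
                    if v not in informed and v not in newly:
                        newly.add(v)  # u spends this time unit informing v
                        break
            if not newly:
                return None  # some nodes are unreachable from the sources
            informed |= newly
            t += 1
        return t

    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    print(broadcast_time(ring, {0}))  # 2 time units on a 4-cycle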

The computational complexity of broadcasting is investigated, and for the first time a constant lower inapproximability bound is stated, i.e. it is NP-hard to distinguish between graphs with broadcasting time smaller than $b$ and larger than $(57/56-\epsilon)b$ for any $\epsilon>0$. This improves on the lower bounds known for multiple and single source broadcasting, which could only state that it is NP-hard to distinguish between graphs with broadcasting time $b$ and $b+1$, for any $b \geq 3$. This statement is proven by reduction from E3-SAT, the analysis of which needs a carefully designed book-keeping and counting argument. ----- File: 2000/tr-00-003 Equation-Based Congestion Control for Unicast Applications: the Extended Version Sally Floyd, Mark Handley, Jitendra Padhye, and Jörg Widmer tr-00-003 March 2000 This paper proposes a mechanism for equation-based congestion control for unicast traffic. Most best-effort traffic in the current Internet is well-served by the dominant transport protocol TCP. However, traffic such as best-effort unicast streaming multimedia could find use for a TCP-friendly congestion control mechanism that refrains from halving the sending rate in response to a single packet drop. With our mechanism, the sender explicitly adjusts its sending rate as a function of the measured rate of loss events, where a loss event consists of one or more packets dropped within a single round-trip time. We use both simulations and experiments over the Internet to explore performance.
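Equation-based control of this kind computes the allowed rate from a TCP response function. Here is a minimal sketch using the well-known response function of Padhye et al., which TFRC-style mechanisms adopt; the default timeout value below is a common simplification, not a parameter taken from this paper.

    from math import sqrt

    def tcp_friendly_rate(s, R, p, t_RTO=None, b=1):
        """Allowed sending rate in bytes/s given segment size s (bytes),
        round-trip time R (s), loss event rate p, and timeout t_RTO (s)."""
        if t_RTO is None:
            t_RTO = 4 * R  # common simplification for the retransmit timeout
        denom = (R * sqrt(2 * b * p / 3)
                 + t_RTO * (3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2))
        return s / denom

    # Example: 1460-byte segments, 100 ms RTT, 1% loss event rate
    print(round(tcp_friendly_rate(1460, 0.1, 0.01)))  # rate in bytes per second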

Equation-based congestion control is also a promising avenue of development for congestion control of multicast traffic, and so an additional reason for this work is to lay a sound basis for the later development of multicast congestion control. ----- File: 2000/tr-00-004 Speech Recognition Experiments on Switchboard Corpus Toshihiko Abe tr-00-004 March 2000 This report shows results of a set of speech recognition approaches performed on the Switchboard corpus, which is a large spontaneous telephone conversation database. The purpose is to improve recognition accuracy on Switchboard with our connectionist hybrid model. The methods include the choice of acoustic features, gender-dependent training, the use of multi-stream features, etc. We will also show that adding a periodicity-measure feature improves recognition accuracy. Finally, we will show a speaker adaptation approach that improves recognition accuracy for speech by a particular speaker. ----- File: 2000/tr-00-005 Acoustic Stress and Topic Detection in American English Spoken Sentences Rosaria Silipo and Fabio Crestani tr-00-005 March 2000 The relationship between acoustic stress and information content of words is investigated. On one side, the average acoustic stress is measured for each word throughout each utterance. On the other side, an Information Retrieval (IR) index, based on the word's frequency within the particular spoken sentence and throughout the collection of analyzed spoken sentences, is calculated. The scatter plots of the two measures (average acoustic stress on the y-axis and IR index on the x-axis) show, in the majority of the analyzed utterances, higher values of average acoustic stress as the information measure of the word increases. A statistically more valid proof of such a relationship is derived from the histogram of the words with high average acoustic stress vs. the IR index. This confirms that a word with high average acoustic stress also has a high value of the IR index. ----- File: 2000/tr-00-006 Acoustic change detection and clustering on Broadcast News Javier Ferreiros, Dan Ellis tr-00-006 March 2000 We have developed a system that breaks input speech into speech segments using an acoustic similarity measure between two segments. The aim is to detect the time points where the acoustic characteristics change. These changes are caused mainly by speaker changes but also by acoustic environment changes. We have also developed another system that performs a clustering of the speech chunks generated by the former system and creates clusters containing the segments with homogeneous acoustic conditions. This clustering is fed back to the acoustic change detector to make more robust decisions based on both the acoustic similarity between two consecutive segments and the distance between the two clusters to which they belong. The interaction between the two systems (acoustic change detection and clustering) improves the results obtained for both aims. ----- File: 2000/tr-00-007 Stream combination before and/or after the acoustic model Daniel P.W. Ellis tr-00-007 April 2000 Combining a number of diverse feature streams has proven to be a very flexible and beneficial technique in speech recognition. In the context of hybrid connectionist-HMM recognition, feature streams can be combined at several points.
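Two of those combination points, contrasted in the next paragraph, are easy to sketch: concatenating feature vectors before the acoustic model, and averaging log posteriors after it (an illustrative sketch, not the report's implementation).

    import numpy as np

    def feature_combination(streams):
        # FC: concatenate per-frame feature vectors into a single vector
        return np.concatenate(streams, axis=-1)

    def posterior_combination(posteriors):
        # PC: average the logs of the per-model posteriors, then renormalize
        log_avg = np.mean([np.log(p) for p in posteriors], axis=0)
        p = np.exp(log_avg)
        return p / p.sum(axis=-1, keepdims=True)

    p1 = np.array([0.7, 0.2, 0.1])  # posteriors from the model on stream 1
    p2 = np.array([0.5, 0.4, 0.1])  # posteriors from the model on stream 2
    print(posterior_combination([p1, p2]))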
In this work, we compare two forms of combination: at the input to the acoustic model, by concatenating the feature streams into a single vector (feature combination or FC), and at the output of the acoustic model, by averaging the logs of the estimated posterior probabilities of each subword unit (posterior combination or PC). Based on four feature streams with varying degrees of mutual dependence, we find that the best combination strategy is a combination of feature and posterior combination, with streams that are more independent, as measured by an approximation to conditional mutual information, showing more benefit from posterior combination. ----- File: 2000/tr-00-008 Variable Packet Size Equation Based Congestion Control Pedro Reviriego Vasallo tr-00-008 April 2000 This report extends previous work in equation-based congestion control for unicast traffic. Most best-effort traffic on the Internet is appropriately served by TCP, the dominant transport protocol on the Internet. However, there is a growing number of multimedia applications for which TCP is not well suited. For those applications, several congestion control mechanisms have been proposed in order to avoid congestion collapse on the Internet. One of them is the recently proposed TCP Friendly Rate Control Protocol (TFRC). It can only be used by flows that have a constant packet size. In this paper, we propose an extension to the TFRC protocol in order to support variable packet size flows. Variable packet sizes have been used for the transmission of video over the Internet and are also used in voice applications, so it is important for a congestion control protocol to support variable packet size flows.

We also explore the concept of fairness among flows when some of the flows send small packets. Currently, these flows are penalized by TFRC because it imitates TCP's behavior, which gives less throughput to flows that use small packets. We argue that if a flow is sending small packets because the application requires it to do so (for example, to minimize delay in a voice-over-IP conversation), it should get the same amount of bandwidth as a TCP session using large packets. This results in a modified concept of TCP friendliness that we introduce in this paper.
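One simple way to realize this modified notion of TCP friendliness (an editorial sketch under stated assumptions, not necessarily the report's mechanism) is to evaluate the TCP equation with a fixed large reference packet size and let the small-packet flow fill the resulting byte rate:

    # Sketch: give a small-packet flow the byte rate a large-packet TCP
    # session would get. REF_PACKET and the example numbers are assumptions.
    REF_PACKET = 1460  # bytes, a typical TCP segment size

    def packets_per_second(fair_rate_bytes, small_packet_bytes):
        # fair_rate_bytes would come from the TCP response function
        # evaluated with s = REF_PACKET (see the earlier sketch).
        return fair_rate_bytes / small_packet_bytes

    print(packets_per_second(164000.0, 160))  # about 1025 small packets/s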

Finally, we analyze some shortcomings of the equation used by TFRC to model TCP behavior and show that the impact of TCP timeouts is not completely modeled by the current TFRC equation. ----- File: 2000/tr-00-009 A Mobile Network Architecture for Vehicles Jörg Widmer tr-00-009 May 2000 In this report, a network architecture for vehicle communication based on Mobile IP is presented. The special network environment of a car allows optimizations but also requires modifications of existing approaches. We identify these issues and discuss the integration of possible solutions into the framework. For example, location information provided by a car navigation system can be used to improve handoff decisions and connectivity. To evaluate the architecture, simulation studies were carried out with the ns Network Simulator. This report also gives an overview of the necessary modifications and extensions to ns and additional tools to simplify future research in this area. ----- File: 2000/tr-00-010 A Scalable Content Addressable Network Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, and Scott Shenker tr-00-010 October 2000 In this paper, we define the concept of a Content-Addressable Network: a system that essentially offers the same functionality as a hash table, i.e. it maps "keys" to "values". The novelty of a hash table in a Content-Addressable Network is that it may span millions of hosts across diverse administrative entities in the Internet.
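As a toy illustration of the hash-table abstraction (far simpler than the CAN design the paper develops, which partitions a d-dimensional coordinate space and routes between neighbors), keys can be hashed onto a set of hosts so that every host computes the same owner for a given key:

    import hashlib

    # Toy content-addressable store: hash a key to a point in [0, 1) and
    # let each node own an equal slice of that space. Node names and the
    # slicing scheme are invented for illustration.
    NODES = ["node-a", "node-b", "node-c", "node-d"]

    def owner(key):
        h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
        point = (h % 10**8) / 10**8  # deterministic point in [0, 1)
        return NODES[int(point * len(NODES))]

    print(owner("some-file.mp3"))  # any host computes the same owner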

We describe our design of a Content Addressable Network that is scalable, highly fault-tolerant and completely self-organizing. We analyse and simulate the performance and robustness properties of our design. Finally, we discuss some of the potential applications for a CAN. Webpubdate: 13 Oct 2000. ----- File: 2000/tr-00-011 Workshop on Design Issues in Anonymity and Unobservability (Preproceedings) Hannes Federrath (Ed.) tr-00-011 July 2000 This workshop addresses the design and realization of anonymity services for the Internet and other communication networks. The main topics of the workshop are Attacks on Systems, Anonymous Publishing, Mix Systems, Identity Management, and Pseudonyms and Remailers. Anonymity and unobservability have become "hot topics" on the Internet. Services that provide anonymous and unobservable access to the Internet are useful for electronic commerce applications (obviously with the need for strong authenticity and integrity of data) as well as for services where the user wants to remain anonymous (e.g. web-based advisory services or consultancy). This workshop was held at the International Computer Science Institute (ICSI), Berkeley, California, July 25-26, 2000. Webpubdate: 7 Feb 2001. ----- File: 2000/tr-00-012 Discriminant Training of Front-End and Acoustic Modeling Stages to Heterogeneous Acoustic Environments for Multi-Stream Automatic Speech Recognition Michael Lee Shire tr-00-012 December 2000 The performance of Automatic Speech Recognition (ASR) systems degrades in the presence of adverse acoustic conditions. A possible shortcoming of the typical ASR system is the reliance on a single stream of front-end acoustic features and acoustic modeling feature probabilities. A single front-end feature extraction algorithm may not be capable of maintaining robustness to arbitrary acoustic environments. Acoustic modeling will also degrade due to distributional changes caused by the acoustic environment. This report explores the parallel use of multiple front-end and acoustic modeling elements to improve upon this shortcoming. Each ASR acoustic modeling component is trained to estimate class posterior probabilities in a particular acoustic environment. In addition to discriminative training of the probability estimator, the temporal processing of existing feature extraction algorithms is modified in such a way as to improve class discrimination in the training environment. Probability streams are generated using multiple front-end acoustic modeling stages trained to heterogeneous acoustic environments. In new sample acoustic environments, simple combinations of these probability streams give rise to word recognition rates that are superior to the individual streams. Webpubdate: 19 Dec 2000. ----- File: 2001/tr-01-001 Controlling High Bandwidth Flows at the Congested Router Ratul Mahajan and Sally Floyd tr-01-001 April 2001 FIFO queueing is simple but does not protect traffic from flows that send more than their share or flows that fail to use end-to-end congestion control. At the other extreme, per-flow scheduling mechanisms provide max-min fairness but are more complex, keeping state for all flows going through the router. This paper proposes RED-PD (RED with Preferential Dropping), a flow-based mechanism that combines simplicity and protection by keeping state for just the high-bandwidth flows.
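A minimal sketch of the preferential-dropping idea, elaborated in the sentences that follow (illustrative only; the identification rule and drop probabilities below are invented, and RED-PD's actual rules are given in the paper):

    import random

    # Sketch: keep state only for flows that appear repeatedly in the
    # recent drop history, and pre-drop their packets with an extra
    # probability in front of the regular FIFO queue.
    monitored = {}  # flow id -> extra drop probability

    def on_congestion_drop(flow):
        # flows seen often in the drop history get monitored more heavily
        monitored[flow] = min(monitored.get(flow, 0.0) + 0.02, 0.5)

    def accept(flow):
        p = monitored.get(flow, 0.0)
        return random.random() >= p  # False means a preferential drop

    on_congestion_drop("flow-7"); on_congestion_drop("flow-7")
    print(accept("flow-7"), accept("flow-1"))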
RED-PD uses the packet drop history at the router to detect high-bandwidth flows in times of congestion and preferentially drop packets from these flows. This paper discusses the design decisions underlying RED-PD, and presents simulations evaluating RED-PD in a range of environments. Webpubdate: 30 Apr 2001. ----- File: 2001/tr-01-002 Identifying the TCP Behavior of Web Servers Jitendra Padhye and Sally Floyd tr-01-002 February 2001 Most of the traffic in today's Internet is carried by the TCP protocol. Since web traffic forms the majority of the TCP traffic, TCP implementations in web servers are of particular interest. TCP has many user-configurable parameters and a wide range of implementations. New congestion control mechanisms and TCP options continue to be developed. Hence, it is necessary to trace the deployment of various TCP mechanisms in the Internet. Also, stability and fairness of the Internet rely on the voluntary use of congestion control mechanisms by end hosts. Therefore, it is important to test TCP implementations for conformant congestion control. We have developed a tool called TCP Behavior Identification Tool (TBIT) to characterize the TCP behavior of web servers. Here, we describe TBIT, and present results about the TCP behaviors of major web servers. We also describe the use of TBIT to detect bugs and non-compliance in TCP implementations. Webpubdate: 30 Apr 2001. ----- File: 2001/tr-01-003 USAIA: Ubiquitous Service Access Internet Architecture Joachim Sokol and Jörg Widmer tr-01-003 February 2001 The next generation Internet will provide high-quality, high-bandwidth connectivity. However, the important aspect of mobility is often neglected. Future Internet users will expect the availability of the full range of Internet applications regardless of the mode of access. We assume that mobile users in particular will use audio-based and video-based applications with specific QoS requirements. The support for these applications that exists in wired networks is therefore also necessary in next generation IP-based wireless networks. In this paper, we present a framework for the seamless integration of QoS and Mobility. Webpubdate: 3 May 2001. ----- File: 2001/tr-01-004 Episodic memory trace formation in the hippocampal system: a model of cortico-hippocampal interaction Lokendra Shastri tr-01-004 April 2001 We readily remember events and situations in our daily lives and acquire memories of specific events by reading a newspaper or watching a newscast. This ability to rapidly acquire ``episodic'' memories has been the focus of considerable research in psychology and neuroscience, and there is a broad consensus that the hippocampal system (HS), consisting of the hippocampal formation and neighboring cortical areas, plays a critical role in the encoding and retrieval of episodic memory. But how the HS subserves this mnemonic function is not fully understood.

This report presents a computational model, SMRITI, that demonstrates how a cortically expressed transient pattern of activity representing an event can be transformed rapidly into a persistent and robust memory trace as a result of long-term potentiation within structures whose architecture and circuitry resemble those of the HS. Memory traces formed by the model respond to highly partial cues, and at the same time, reject similar but erroneous cues. During retrieval, these memory traces acting in concert with cortical circuits encoding semantic, causal, and procedural knowledge can recreate activation-based representations of memorized events. The model explicates the representational requirements of encoding episodic memories, and suggests that the idiosyncratic architecture of the HS is well matched to the representational problems it must solve in order to support episodic memory function. The model predicts the nature of memory deficits that would result from insult to specific HS components and to cortical circuits projecting to the HS. It also identifies the sorts of memories that must remain encoded in the HS for the long term, and helps delineate the semantic and episodic memory distinction. Webpubdate: 27 Apr 2001. ----- File: 2001/tr-01-005 Automatic Labeling of Semantic Roles Daniel Gildea and Daniel Jurafsky tr-01-005 April 2001 We present a system for identifying the semantic relationships, or semantic roles, filled by constituents of a sentence within a semantic frame. Given an input sentence, the system labels constituents with either abstract semantic roles such as Agent or Patient, or more domain-specific semantic roles such as Speaker, Message, and Topic. The system is based on statistical classifiers which were trained on 653 semantic role types from roughly 50,000 sentences. Each sentence had been hand-labeled with semantic roles in the FrameNet semantic labeling project. We compare the usefulness of different features and feature-combination methods in the semantic role labeling task. Webpubdate: 1 May 2001. ----- File: 2001/tr-01-006 An overview of Basque Locational Cases: Old Descriptions, New Approaches Iraide Ibarretxe-Antunano tr-01-006 July 2001 Basque, a language isolate spoken on both sides at the western end of the Pyrenees, has very rich lexical and grammatical resources for expressing space. There are five different locational cases and over thirty postpositions, also inflected with these cases, that allow fine and detailed descriptions of space. Traditional accounts of locational cases are good sources for descriptive as well as etymological information. However, when it comes to the explanation and understanding of the conceptualisation of space and motion in Basque, these studies do not offer any insights. In this paper, I present a critical overview of the semantic descriptions provided by these traditional accounts. Section 1 gives a brief tour of the Basque case system. Section 2 discusses those characteristics particular to locational cases. Section 3 describes the main five locational cases in more detail. Section 4 points out areas for further research, areas that pose problems for traditional accounts and possible ways to solve them. Section 5 briefly outlines the main spatial postpositions and some of their special characteristics.
The main goal of this paper is to provide a useful background on Basque locational cases for future studies on the conceptual system of space and motion in this language. Webpubdate: 6 August 2001. ----- File: 2001/tr-01-007 Exiting Events in Spanish: Boundary I-schema And Move X-schema Carmen Bretones, María Cristóbal, Iraide Ibarretxe tr-01-007 August 2001 This paper analyses the structure and conceptualisation of exiting events in Spanish through the discussion of the construction salir-de, and compares it with an analogous scenario encoded in the English construction out-of. An 'exiting event' in Spanish is defined as the translational motion from a region A (the source) through a boundary. Taking the Embodied Construction Grammar (ECG) model as the theoretical framework, our focus is on the kinds of mental images Spanish speakers construct when understanding this construction. Section 1 presents the main theoretical tenets of the Embodied Construction Grammar and a simplified version of their analysis of the English construction out-of. Section 2 provides a description of the construction salir-de and focuses on two schemas: the Boundary I-schema and the Move X-schema. Section 3 discusses the semantics of the landmarks that take part in this construction. Section 4 addresses more marginal cases where landmarks are portals. Finally, section 5 summarises the conclusions. Webpubdate: 30 August 2001. ----- File: 2001/tr-01-008 Synaesthetic Metaphors in English Carmen M. Bretones Callejas tr-01-008 August 2001 Recent work in metaphorical analysis makes it clear that many of our most basic concepts (and our reasoning via those concepts) are embodied: Lived experiences in our bodies inspire and constrain the way we conceive and articulate many of our other experiences. That is exactly what metaphor is based on, i.e., on an experiential, body-linked, physical core of reasoning abilities (Lakoff and Johnson, 1999). Metaphor has the capacity to "introduce a sensory logic at the semantic level alluding to a more complex scenario of interrelated meanings and experiences of the world" (Cacciari, 1998, p. 128). One of the most common types of metaphoric transfer is synaesthesia, i.e., the transfer of information from one sensory modality to another. I analyze this phenomenon in depth in this paper, taking my data from a corpus of 50 poems written by Seamus Heaney and analyzing examples such as: (1) cold smell (Digging, line 25), (2) stony flavours (From Whatever You Say Say Nothing, line 19) or (3) coarse croaking (Death of a Naturalist, line 26). I then compare my data with Day's (1996) study of synaesthesia in English. Finally, I point out the idea of synaesthetic connections as a possible physical basis for the cognitive process that we call metaphor. Webpubdate: 30 August 2001. ----- File: 2001/tr-01-009 Arriving Events in English and Spanish: A Contrastive Analysis in terms of Frame Semantics Maria Cristobal tr-01-009 September 2001 This paper presents a detailed contrastive frame semantic analysis of arriving events in English and Spanish, attested through a corpus study.

First, we present a formal description of the Arriving frame as a subframe of the Motion frame: arriving encodes a basic subpart of our conceptualization of motion, namely the transition from moving to arriving at a goal.

Second, we carry out a contrastive analysis of the predicates participating in this frame. We discuss cross-linguistic differences through the study of implicit frame elements, conflation and incorporation patterns, profiling, and deixis.

Third, we briefly introduce the question of polysemy. The spatial meaning of arriving is the core sense from which a set of sense extensions derives, pointing to a wide range of independent frames (e.g. Cognition frame, Achievement frame, etc.). The different senses can be described synchronically in terms of frame semantics, while motivation for them is to be found in the cognitive processes of Metaphor (across frames) and Fictive Motion (within frame). Webpubdate: 22 October 2001. ----- File: 2002/tr-02-001 A new view of the medial temporal lobes and the structure of memory Charan Ranganath, Lokendra Shastri, and Mark D'Esposito tr-02-001 February 2002 Recent research in cognitive neuroscience has supported the idea that active rehearsal of information over short delays, or working memory maintenance, is accomplished by activating long-term memory representations. Nonetheless, it is widely assumed that although the human hippocampus and related medial temporal lobe structures may be critical for the formation of long-term memories, they are not involved in working memory maintenance. Here, we reconsider this issue and review evidence suggesting that humans and nonhuman primates with large medial temporal lobe lesions have difficulty retaining complex, novel information even across short delays. These results suggest that perirhinal and entorhinal regions, and under some circumstances, even the hippocampus, may be necessary for some forms of working memory as well as long-term memory. Moreover, neurophysiological and neuroimaging evidence suggests that all of these medial temporal regions exhibit activity associated with the active maintenance of novel information. Finally, we review a neurally plausible computational model of cortico-hippocampal interactions that points to a special role of the hippocampus in the representation of relational codes in memory. Our analyses suggest that the hippocampus plays this special role not only in episodic long-term memory, but also in working memory maintenance. Collectively, these results are consistent with the hypothesis that the active maintenance of complex, novel information is accomplished through the sustained activation of long-term memory representations bound together by the hippocampus and medial temporal cortical regions. Webpubdate: 15 February 2002. ----- File: 2002/tr-02-004 Embodied Construction Grammar in Simulation-Based Language Understanding Benjamin K. Bergen and Nancy C. Chang tr-02-004 February 2002 We present Embodied Construction Grammar, a formalism for linguistic analysis designed specifically for integration into a simulation-based model of language understanding. As in other construction grammars, linguistic constructions serve to map between phonological forms and conceptual representations. In the model we describe, however, conceptual representations are also constrained to be grounded in the body's perceptual and motor systems, and more precisely to parameterize mental simulations using those systems. Understanding an utterance thus involves at least two distinct processes: "analysis" to determine which constructions the utterance instantiates, and "simulation" according to the parameters specified by those constructions. In this report, we outline a construction formalism that is both representationally adequate for these purposes and specified precisely enough for use in a computational architecture.
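Purely as a toy rendering of the two processes just named, analysis followed by simulation (all structures, names, and parameters below are invented for illustration and are not the ECG formalism itself):

    # Toy: a "construction" pairs a form with a schema whose parameters
    # feed a simulation. Invented for illustration only.
    CONSTRUCTIONS = {
        "slide": {"schema": "Motion", "params": {"manner": "slide"}},
    }

    def analyze(utterance):
        # find the constructions the utterance instantiates
        return [CONSTRUCTIONS[w] for w in utterance.split() if w in CONSTRUCTIONS]

    def simulate(instantiated):
        for c in instantiated:
            print("simulating", c["schema"], "with", c["params"])

    simulate(analyze("the blocks slide down"))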
Webpubdate: 25 February 2002. ----- File: 2002/tr-02-005 Analysis of Composite Corridors Teigo Nakamura and Elwyn Berlekamp tr-02-005 February 2002 This work began as an attempt to find and catalog the mean values and temperatures of a well-defined set of relatively simple common Go positions, extending a similar but smaller catalog in Table E.10, Appendix E of the book, "Mathematical Go". The major surprises of our present work include the following: (1) A position of chilled value *2 (previously unknown in Mathematical Go), (2) A surprisingly "warm" position, whose temperature is routinely underestimated even by very strong Go players, (3) More insights into decompositions. Some positions decompose as a beginner might naively hope; others don't. One set of those which don't provides a basis for an extension of the "multiple invasions" theorem in the Mathematical Go book. This appears in our Section 5. In the new set of positions, like the old, a potential future shortage of liberties of the invading group results in a surprisingly hot temperature at one well-defined but far-from-obvious point along the invading group's frontier. It is hoped that these results may someday provide the basis for further new insights and generalizations. Webpubdate: 27 February 2002. ----- File: 2002/tr-02-006 Improving TCP's Performance under Reordering with DSACK Ming Zhang, Brad Karp, Sally Floyd, and Larry Peterson tr-02-006 July 2002 TCP performs poorly on paths that reorder packets significantly, where it misinterprets out-of-order delivery as packet loss. The sender responds with a fast retransmit though no actual loss has occurred. These repeated false fast retransmits keep the sender's window small, and severely degrade the throughput it attains. Persistent reordering occasionally occurs on present-day networks. Moreover, TCP's requirement of nearly in-order delivery complicates the design of such beneficial systems as DiffServ, multi-path routing, and parallel packet switches. Toward relaxing this constraint on Internet architecture, we present enhancements to TCP that improve the protocol's robustness to reordered and delayed packets. We extend the sender to detect and recover from false fast retransmits using DSACK information, and to avoid false fast retransmits proactively, by adaptively varying dupthresh. Our algorithm adaptively balances increasing dupthresh, to avoid false fast retransmits, and limiting the growth of dupthresh, to avoid unnecessary timeouts. Finally, we demonstrate that delayed packets negatively impact the accuracy of TCP's RTO estimator, and present enhancements to the estimator that ensure it is sufficiently conservative, without using timestamps or additional TCP header bits. Our simulations show that these enhancements significantly improve TCP's performance over paths that reorder or delay packets. Webpubdate: 8 July 2002. ----- File: 2002/tr-02-007 A Syllable, Articulatory-Feature, and Stress-Accent Model of Speech Recognition Shuangyu Chang tr-02-007 September 2002 Current-generation automatic speech recognition (ASR) systems assume that words are readily decomposable into constituent phonetic components ("phonemes").
A detailed linguistic dissection of state-of-the-art speech recognition systems indicates that the conventional phonemic "beads-on-a-string" approach is of limited utility, particularly with respect to informal, conversational material. The study shows that there is a significant gap between the observed data and the pronunciation models of current ASR systems. It also shows that many important factors affecting recognition performance are not modeled explicitly in these systems.

Motivated by these findings, this dissertation analyzes spontaneous speech with respect to three important, but often neglected, components of speech (at least with respect to English ASR). These components are articulatory-acoustic features (AFs), the syllable, and stress accent. Analysis results provide evidence for an alternative approach to speech modeling, one in which the syllable assumes preeminent status and is melded to the lower as well as the higher tiers of linguistic representation through the incorporation of prosodic information such as stress accent. Using concrete examples and statistics from spontaneous speech material, it is shown that there exists a systematic relationship between the realization of AFs and stress accent in conjunction with syllable position. This relationship can be used to provide an accurate and parsimonious characterization of pronunciation variation in spontaneous speech. An approach to automatically extract AFs from the acoustic signal is also developed, as is a system for the automatic stress-accent labeling of spontaneous speech.
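As a concrete, invented illustration of the kind of systematic relationship just described, such statistics can be tabulated as conditional probabilities of an AF's realization given stress accent and syllable position (the numbers below are made up; the dissertation derives such statistics from spontaneous-speech data):

    # Invented illustration: probability that a /t/ is fully released,
    # conditioned on stress accent and position within the syllable.
    release_prob = {
        ("stressed", "onset"): 0.90,
        ("stressed", "coda"): 0.40,
        ("unstressed", "onset"): 0.60,
        ("unstressed", "coda"): 0.15,
    }

    def expected_realization(stress, position):
        p = release_prob[(stress, position)]
        return "released" if p >= 0.5 else "reduced or unreleased"

    print(expected_realization("unstressed", "coda"))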

Based on the results of these studies, a syllable-centric, multi-tier model of speech recognition is proposed. The model explicitly relates AFs, phonetic segments and syllable constituents to a framework for lexical representation, and incorporates stress-accent information into recognition. A test-bed implementation of the model is developed using a fuzzy-based approach for combining evidence from various AF sources and a pronunciation-variation modeling technique using AF-variation statistics extracted from data. Experiments on a limited-vocabulary speech recognition task using both automatically derived and fabricated data demonstrate the advantage of incorporating AF and stress-accent modeling within the syllable-centric, multi-tier framework, particularly with respect to pronunciation variation in spontaneous speech. Webpubdate: 20 Sept 2002. ----- File: 2002/tr-02-008 A Connectionist Encoding of Parameterized Schemas and Reactive Plans Lokendra Shastri, Dean Grannes, Srini Narayanan, and Jerome Feldman tr-02-008 October 2002 We present a connectionist realization of parameterized schemas that can model high-level sensory-motor processes and be a candidate representation for implementing reactive behaviors. The connectionist realization involves a number of ideas including the use of focal-clusters and feedback loops to control a distributed process without a central controller and the expression and propagation of dynamic bindings via temporal synchrony. We employ a uniform mechanism for interaction between schemas, low-level somatosensory and proprioceptive processes, and high-level reasoning and memory processes. Our representation relates to work in connectionist models of rapid - reflexive - reasoning and also suggests solutions to several problems in language acquisition and understanding. Webpubdate: 11 Oct 2002. ----- File: 2002/tr-02-009 FrameNet: Theory and Practice Christopher R. Johnson, Charles J. Fillmore, Miriam R. L. Petruck, Collin F. Baker, Michael Ellsworth, Josef Ruppenhofer, and Esther J. Wood tr-02-009 October 2002 Describes Frame Semantics as applied in the FrameNet project, what is annotated and why, how annotators deal with missing or conflated frame elements, and the differences in annotating sentences with verb, noun or adjective target words. Explains the phrase types and grammatical functions used in FrameNet annotation, and briefly describes lexical entries and frame-to-frame relations. (This paper is also included in Release 1.0 of the FrameNet data.) Webpubdate: 17 Oct 2002. ----- File: 2002/tr-02-010 A Proposed Formalism for ECG Schemas, Constructions, Mental Spaces, and Maps Jerome A. Feldman tr-02-010 September 2002 The traditional view has been that Cognitive Linguistics (CL) is incompatible with formalization. Cognitive linguistics is serious about embodiment and grounding, including imagery and image-schemas, force-dynamics, real-time processing, discourse considerations, mental spaces, context, and so on. It remains true that some properties of embodied language, such as context sensitivity, cannot be fully captured in a static formalism, but a great deal of CL can be stated formally in a way that is compatible with a full treatment. It appears that we can specify rather complete embodied construction grammars (ECG) using only four types of formal structures: schemas, constructions, maps, and spaces.
The purpose of this note is to specify these structures and present simple examples of their use. Webpubdate: 11 April 2003. ----- File: 2002/tr-02-011 The Meaning of Reference in Embodied Construction Grammar Jerome A. Feldman tr-02-011 September 2002 The ECG formalism is quite general, specifying only the ways to write and combine the four basic structure types: schemas, constructions, maps, and spaces. Grammars in ECG are deeply cognitive, with meaning being expressed in terms of conceptual primitives such as image schemas, force dynamics, etc. The hypothesis is that a modest number of universal primitives will suffice to provide the core meaning component for the grammar. Referent descriptors entered the ECG formalism as the way of specifying the participants in a semantic specification. This note discusses how to specify entity-like referents, focuses on the key issues in Reference, and treats some of the more problematic ones in some detail. It assumes a general knowledge of the NTL paradigm and is not self-contained. Webpubdate: 11 April 2003. ----- File: 2003/tr-03-001 Pitch-based Vocal Tract Length Normalization Arlo Faria tr-03-001 November 2003 This paper investigates the correlation between fundamental frequency and resonant frequencies in speech, exploiting this relation for vocal tract length normalization (VTLN). By observing a speaker's average pitch, it is possible to estimate the appropriate frequency warping factor which will transform a spectral representation into one with less variation of the formants. I use a function of pitch that maps to a corresponding frequency warping factor. An exploration of speaker and vowel characteristics in the TIMIT speech corpus is used to optimize the parameters of this function. The approach presented here is a potentially simpler alternative to existing VTLN algorithms which derive the warping factor by other means. Recognizer results indicate that the pitch-based approach compares favorably against other methods; furthermore, performance could be further improved by using a warping function that is not strictly linear. Webpubdate: 3 November 2003. ----- File: 2003/tr-03-002 Scaling Up: Learning Large-scale Recognition Methods from Small-scale Recognition Tasks Nelson Morgan, Barry Y. Chen, Qifeng Zhu, Andreas Stolcke tr-03-002 September 2003 Despite the common wisdom that lessons learned from small experimental speech recognition tasks often do not scale to larger tasks, many important algorithms used in larger tasks were first developed with small systems applied to small tasks. In this paper we report experiments with the OGI Numbers task that led to the adoption of a number of engineering decisions for the design of an acoustic front end. We then describe a three-stage process of scaling to the larger conversational telephone speech (CTS) task. Much of the front end design required no change at all for the more difficult task, yielding significant improvements over our baseline front end. Webpubdate: 25 September 2003. ----- File: 2003/tr-03-003 Identification of Protein Complexes by Comparative Analysis of Yeast and Bacterial Protein Interaction Data Roded Sharan, Trey Ideker, Brian Kelley, Ron Shamir and Richard M. Karp tr-03-003 September 2003 Mounting evidence shows that many protein complexes are conserved in evolution.
Here we use conservation to find complexes that are common to the yeast S. cerevisiae and the bacterium H. pylori. Our analysis combines the protein interaction data available for each of the two species with orthology information based on protein sequence comparison. We develop a detailed probabilistic model for protein complexes in a single species, and a model for the conservation of complexes between two species. Using these models, one can recast the question of finding conserved complexes as a problem of searching for heavy subgraphs in an edge- and node-weighted graph, whose nodes are orthologous protein pairs.
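A minimal sketch of that graph construction (not the authors' probabilistic model): nodes are orthologous protein pairs, an edge is weighted by interaction evidence in both species, and heavy subgraphs of the result are candidate conserved complexes. The interaction scores and ortholog table below are invented stand-ins.

    import itertools

    # Invented toy data standing in for the paper's interaction scores.
    yeast_ppi = {("Y1", "Y2"): 0.9, ("Y2", "Y3"): 0.8}
    bact_ppi = {("B1", "B2"): 0.7, ("B2", "B3"): 0.6}
    orthologs = {"Y1": "B1", "Y2": "B2", "Y3": "B3"}

    def pair_weight(ya, yb):
        # evidence is required in both species for a heavy edge
        w_yeast = yeast_ppi.get((ya, yb)) or yeast_ppi.get((yb, ya)) or 0.0
        ba, bb = orthologs[ya], orthologs[yb]
        w_bact = bact_ppi.get((ba, bb)) or bact_ppi.get((bb, ba)) or 0.0
        return w_yeast * w_bact

    edges = {(a, b): pair_weight(a, b)
             for a, b in itertools.combinations(sorted(orthologs), 2)}
    print(max(edges, key=edges.get), max(edges.values()))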

We tested this approach on the data currently available for yeast and bacteria and detected 11 significantly conserved complexes. Several of these complexes match very well with prior experimental knowledge on complexes in yeast only, and serve to validate our methodology. The complexes suggest new functions for a variety of uncharacterized proteins. By identifying a conserved complex whose yeast proteins function predominantly in the nuclear pore complex, we propose that the corresponding bacterial proteins function as a coherent cellular membrane transport system. We also compare our results to two alternative methods for detecting complexes, and demonstrate that our methodology obtains a much higher specificity. Webpubdate: 30 September 2003. ----- File: 2003/tr-03-004 A Discriminative Model for Identifying Spatial cis-Regulatory Modules Eran Segal and Roded Sharan tr-03-004 October 2003 Transcriptional regulation is mediated by the coordinated binding of transcription factors to the upstream region of genes. In higher eukaryotes, the binding sites of cooperating transcription factors are organized into short sequence units, called cis-regulatory modules. In this paper we propose a method for identifying modules of transcription factor binding sites in a set of co-regulated genes, using only the raw sequence data as input. Our method is based on a novel probabilistic model that describes the mechanism of cis-regulation, including the binding sites of cooperating transcription factors, the organization of these binding sites into short sequence modules, and the regulation of a gene by its modules. We show that our method is successful in discovering planted modules in simulated data and known modules in yeast. More importantly, we applied our method to a large collection of human gene sets, and found 83 significant cis-regulatory modules, which included 36 known motifs and many novel ones. Thus, our results provide one of the first comprehensive compendiums of putative cis-regulatory modules in human. Webpubdate: 17 October 2003. ----- File: 2004/tr-04-001 SchemaDB - An Extensible Schema Database System Using ECG Representation Manli Li tr-04-001 September 2003 How are our language, concepts and thoughts formed? Schemas, as the most primitive conceptual units, contribute to forming languages and thoughts. Schemas are studied by linguists, cognitive scientists, psychologists and computer scientists with various emphases. However, there is no existing systematic collection of schemas in a formalized representation. As part of the MetaNet Project at ICSI, SchemaDB is an extensible database that aims not only to collect all existing schemas through a user-friendly, web-based interface, but also to formalize schemas using ECG (Embodied Construction Grammar). SchemaDB is to be used in cataloging, examining, and computing metaphor, and in many other language and cognitive science studies. The goal of the SchemaDB project is to create a user-friendly web-based application in order to collect as many cross-cultural, cross-language schemas as possible in a complete, widely accessible, and human/machine readable manner. Using a client/server architecture together with the PHP (PHP: Hypertext Preprocessor) scripting language and the MySQL relational database, the SchemaDB system enables secure interactions between users and the database server. Webpubdate: 10 January 2004.
----- File: 2004/tr-04-002 Meeting Recorder Project: Dialog Act Labeling Guide Rajdip Dhillon, Sonali Bhagat, Hannah Carvey, Elizabeth Shriberg tr-04-002 February 2004 Dialog act annotation potentially provides a means to aid information retrieval and summarization of meeting data. This work presents an in-depth view of the annotation methods of both the dialog act annotation and adjacency pair labeling schemes used for the Meeting Recorder data. Additionally, detailed descriptions of the individual tags within the Meeting Recorder Dialog Act tagset are provided. Issues such as utterance segmentation as well as numerous examples from the meeting data are found within this work. Webpubdate: 17 May 2004. ----- File: 2004/tr-04-005 On the Impact of BER on Realistic TCP Traffic in Satellite Networks Priya Narasimhan, Hans Kruse, Shawn Ostermann, Mark Allman tr-04-005 November 2004 There are many factors governing the performance of TCP-based applications traversing satellite channels. The end-to-end performance of TCP is known to be degraded by the delay, noise and asymmetry inherent in geosynchronous systems. This result has been largely based on experiments that evaluate the performance of TCP in single-flow tests. While single-flow tests are useful for deriving information on the theoretical behavior of TCP and allow for easy diagnosis of problems, they do not represent a broad range of realistic situations and therefore cannot be used to authoritatively comment on performance issues. The experiments discussed in this report test TCP's performance in a more dynamic environment with competing traffic flows from hundreds of TCP connections running simultaneously across the satellite channel. Another aspect we investigate is TCP's reaction to bit errors on satellite channels. TCP interprets loss as a sign of network congestion. This causes TCP to reduce its transmission rate, leading to reduced performance when loss is due to corruption. We allowed the bit error rate on our satellite channel to vary and tested the performance of TCP as a function of these bit error rates. Our results show that the average performance of TCP on satellite channels is good even under conditions of loss as high as bit error rates of 10^-5. Webpubdate: 17 November 2004. -----
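For intuition about the bit-error-rate conditions studied in tr-04-005 above, the per-packet loss probability implied by independent bit errors is 1 - (1 - BER)^n for an n-bit packet (a standard back-of-the-envelope approximation, not the report's channel model):

    # Loss probability of an n-bit packet under independent bit errors.
    def packet_loss_prob(ber, packet_bytes):
        n_bits = packet_bytes * 8
        return 1 - (1 - ber) ** n_bits

    # A 1500-byte packet at BER 10^-5 is lost about 11% of the time.
    print(round(packet_loss_prob(1e-5, 1500), 3))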