I've mainly summarized the Structured/Sparse references and included some notes for some of the other papers. I could not obtain a few of the papers and have not had a chance to read others. The comments are minimal, but please tell me if I've misunderstood or misrepresented your work and I will correct the error. If you have seen more recent publications than the ones found here, I would appreciate the information and will include it in this list. My thanks to D.R. Mani, Stan Franklin, Joydeep Ghosh, S.C. Kak, Phil Kohn, Jacob M. Murre, Soheil Shams and Jim Steck for responding to my request for references. As noted elsewhere, most of the references are from D.R. Mani's dissertation proposal.

Overview Papers
---------------

Nordstrom92 - survey of implementations of backprop, recurrent and Kohonen nets on a variety of architectures, including specialized machines.

Singer90 - survey of CM-2 implementations.

Structured/Sparse Connectionist Networks
========================================

Model of Computation
--------------------

There does not seem to be much literature on mapping semantic networks onto parallel machines. The work by chung92 and evett89 (and other PARKA references which I have not yet read) is targeted at semantic nets, while the work in blell87, wei91 and shams92 is mainly intended for sparse backprop-style networks but might be applied to semantic nets as well.

Since large, sparse networks are often specified hierarchically, ghosh89 provides a model that describes the net hierarchically, with the highest connectivity locally (within a "core"), lower within an "influence" region, and lowest in the remote region. In addition, conditional connection probabilities model the "clumpiness" of connections. These conditional probabilities reflect the fact that (roughly speaking) a connection between two clusters of a network affects the likelihood that there are other connections as well.

murre93b investigates general sparsity of connections and concludes that low uniform sparsity is not very helpful and that processor topology does not matter much. However, if we partition the neural network into modules, and if the connections between modules are sparse, it does help. The conditional probability used by ghosh89 generates clumpy links, somewhere in between uniform sparsity and "modular" sparsity.

SIMD Mappings
-------------

In different SIMD mappings of sparse nets onto the CM-2, blell87 assigns two processors per weight and one per node, while chung92 puts all the outgoing weights on the same processor. Lange90 describes a simulator (DESCARTES) that is implemented on the CM-2 and uses the same mapping as blell87. The simulator has mechanisms for dealing with control - which nets are active and with what frequency they are cycled.

evett89 and chung92 both represent semantic nets with arbitrary topologies and labelled arcs between nodes. Chung's system also provides for weights on the links while evett89's PARKA system does not. The data structures used by evett89 and chung92 are actually quite similar (both essentially store a set of out-pointers from every node), but evett89 forces every node to have a pointer (even if it is null) for every possible arc label available, whereas chung92 explicitly labels arcs with a (4-bit) tag. Clearly there are various tradeoffs, like the ease of finding all arcs with a particular label, the number of labels allowed (maximum of 16 in chung92), space requirements, etc.
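To make the tradeoff concrete, here is a small Python sketch of the two storage schemes as I understand them; the label names, the LABEL_CODE encoding and the arcs_with_label helper are all mine and purely illustrative, not taken from either paper:

    LABELS = ["isa", "part-of", "color", "owner"]        # hypothetical global label set

    # evett89/PARKA-style: every node carries one slot per possible label,
    # with None marking an absent arc.  Space grows with the label set.
    parka_node = {"isa": "mammal", "part-of": None, "color": "brown", "owner": None}

    # chung92-style: a node stores only the arcs it actually has, each
    # explicitly tagged with a small integer code (4 bits, so at most 16 labels).
    LABEL_CODE = {name: i for i, name in enumerate(LABELS)}
    chung_node = [(LABEL_CODE["isa"], "mammal"), (LABEL_CODE["color"], "brown")]

    def arcs_with_label(node, label):
        """Find targets of outgoing arcs with a given label, either representation."""
        if isinstance(node, dict):                        # slot-per-label: direct lookup
            return [] if node[label] is None else [node[label]]
        code = LABEL_CODE[label]                          # tagged list: scan the arcs
        return [target for (tag, target) in node if tag == code]

Here arcs_with_label(parka_node, "isa") is a single slot lookup, while the chung92-style list has to be scanned but only stores the arcs that actually exist.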
wei91 is meant for backprop on sparse networks and takes advantage of an efficient permutation algorithm on a mesh-structured machine (certain types of communication must be permitted) to shuffle the activations around to the weights and the results back to the activations. shams92 develops his own mesh SIMD machine, the DREAM machine, and a constraint-net optimization of the assignment and scheduling problems.

MIMD Machines
-------------

lange90 deals with massively parallel MIMD machines.

A very different sort of paper is hinton81, which reformulates the semantic net problem as one of associative recall of missing members of triplets (node1 rel node2), with properties like inheritance falling out of his representation. The resulting neural net is fully connected. bic87 recasts the problem in a predicate logic framework which captures both the structure of the network and the nature of permissible queries.

Data Parallel Approach
----------------------

Training set parallel approaches may be (usually trivially) applied to any style of network.

Speed of Implementation
-----------------------

Many of the fastest implementations of backprop-style networks turn out to be training set parallel, when the speed is measured in connections per second. However, as pugh93 makes clear and has been discussed on connectionists, this may not translate into any decrease in training time. Singer90 uses training set parallelism to obtain > 1 GCPS on the CM-2. Other CM-2 approaches sometimes combine a training set parallel approach with another mapping method.

There is a well-known trade-off between the speed of convergence of backprop and the amount of training-set parallelism used. Batch updating, the extreme case, leads to much slower convergence, and this effect dilutes the speedup. pugh93 explores this tradeoff and obtains empirical rules for the learning rate (not considering momentum) and block size to maximize speedup.

Multi-Layer Backprop Style Networks
-----------------------------------

Analysis
--------

chu92 applies integer programming techniques to obtain mappings of a layered net (full connectivity between layers) that are within epsilon of optimal. It formulates the problem with linear constraints and a non-linear objective, and simplifies by partitioning the processor graph and making assumptions about the dominance of computation within a partition.

SIMD Machines
-------------

In addition to the CM-2, there are a large number of specialized processors, and implementations are surveyed in Nordstrom92. blell87 may be used for the dense case as well as the sparse case. Zhang90 uses a "vertical" partitioning of the network, i.e. a processor gets a single node from each layer (as well as the accompanying weights); the basic communication is cycling the activation vector around within a layer (a rough sketch follows at the end of this subsection). singer90 surveys other methods and also proposes a very fast data parallel approach which would be most applicable to relatively small networks. chinn90 applies systolic techniques to implement a dense net on a MasPar MP-1. kak88 deals with an efficient systolic-style matrix-vector algorithm. shams92a deals with the design of a SIMD mesh architecture with specialized routing switches that permit efficient implementation of multiple communication rings, and with algorithms for mapping networks onto the DREAM machine.
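As promised above, here is a rough Python sketch of how I picture the Zhang90-style vertical partitioning working. The function name ring_forward, the uniform layer width and the sigmoid nonlinearity are my own simplifying assumptions, so treat this as an illustration of the idea rather than a description of the actual implementation:

    import math

    def ring_forward(weights, activations):
        """Vertical partitioning sketch: processor p owns node p of every layer
        plus the weights into that node in the next layer.  The current layer's
        activation vector is rotated around a ring of processors; after a full
        cycle every processor has accumulated its node's complete weighted sum.

        weights[p][j]  - weight from node j (current layer) into node p (next layer)
        activations[p] - activation initially held by processor p
        """
        n = len(activations)             # number of processors = layer width
        held = list(activations)         # value currently sitting on each processor
        src = list(range(n))             # which node's activation each processor holds
        acc = [0.0] * n                  # partial sums, one per processor

        for _ in range(n):
            for p in range(n):           # all processors act in lock-step (SIMD)
                acc[p] += weights[p][src[p]] * held[p]
            # rotate the activation vector one step around the ring
            held = held[1:] + held[:1]
            src = src[1:] + src[:1]

        # apply the nonlinearity locally; processor p now holds node p of the next layer
        return [1.0 / (1.0 + math.exp(-a)) for a in acc]

For example, ring_forward([[0.1, 0.2], [0.3, 0.4]], [1.0, -1.0]) produces the two next-layer activations, each computed entirely on the processor that owns the corresponding node, with only nearest-neighbour ring communication.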
MIMD Machines
-------------

There are an enormous number of implementations on MIMD machines, especially the transputer, e.g. Morgan90, chu92, shams92a, kung89 and many others.

Hopfield Style Networks
-----------------------

Forrest87 implements the Durbin and Willshaw elastic net, an associative memory, and the Geman and Geman algorithm on the DAP.

wang89 - implementation on a shared-memory vector processor.

Manner89 - Hopfield net considerations are used to derive constraints on a parallel processor built from 68020s. Different updating schemes are considered (a generic sketch of what such schemes look like appears at the end of this list).

Biologically Realistic Simulations
----------------------------------

Nelson89 deals with general issues in parallel algorithm analysis and proceeds to describe the implementation of compartmental models, a chief issue being the partitioning of the compartmental model.
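For readers unfamiliar with what "different updating" refers to, here is a generic Python sketch of the standard textbook Hopfield update rule in its synchronous and asynchronous forms. It is not taken from Manner89 or any other paper above, and the names are mine:

    import random

    def hopfield_step(weights, state, asynchronous=True):
        """Textbook Hopfield update: s_i <- sign(sum_j w[i][j] * s_j).
        Asynchronous updating changes one randomly chosen unit at a time;
        synchronous updating recomputes every unit from the old state at once."""
        n = len(state)

        def new_value(i):
            total = sum(weights[i][j] * state[j] for j in range(n))
            return 1 if total >= 0 else -1

        if asynchronous:
            i = random.randrange(n)
            updated = list(state)
            updated[i] = new_value(i)
            return updated
        return [new_value(i) for i in range(n)]

The choice matters for a parallel implementation: synchronous updating maps naturally onto lock-step hardware but has different convergence behaviour from the asynchronous rule.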