Commentary on Neural blackboard architectures of combinatorial structures in cognition by Frank van der Velde and Marc de Kamps.
Abstract: 62 words
Main Text: 1934 words
References: 222 words
Total Text: 2218 words
+1-510-666-2910 (for correspondence)
shastri@icsi.berkeley.edu
http://www.icsi.berkeley.edu/~shastri
Contrary to the assertions made in the target article, temporal synchrony, coupled with an appropriate choice of representational primitives, leads to a functionally adequate and neurally plausible architecture that addresses the massiveness of the binding problem, the problem of 2, the problem of variables, and the transformation of activity-based transient representations of events and situations into structure-based persistent encodings of the same.
Table 1 compares two sets of solutions to the challenges posed by Jackendoff (2002). One set is provided by the SHRUTI architecture, which uses temporal synchrony for encoding dynamic bindings (Shastri & Ajjanagadde, 1993; Mani & Shastri, 1993; Shastri, 1999; Shastri & Wendelken, 2000); the other is provided by the target article. This comparison is clearly at odds with what is stated in the target article. The following discusses the bases of the comparison and points out some of the factual errors and faulty analyses underlying the target article’s flawed evaluation of the temporal synchrony approach.
Table 1: An evaluation of two architectures that provide solutions to Jackendoff’s four challenges to Cognitive Neuroscience.
| Jackendoff’s challenges to Cognitive Neuroscience | Temporal synchrony based SHRUTI architecture | Neural blackboard (NBB) architecture |
| Massiveness of the binding problem | A | A |
| Problem of 2 (multiple instantiation) | B | B |
| Problem of variables (reasoning with abstract rules) | A | D |
| Activity-based (dynamic) bindings and long-term bindings | A | I (incomplete) |
The SHRUTI architecture represents relations (or predicates), types, entities, and causal rules using focal-clusters. Figure 1 depicts focal-clusters for relations (e.g., give), types (e.g., Person), entities (e.g., John), and the rule give(x,y,z) ⇒ own(y,z). Within a focal-cluster, the activity of the + node represents a degree of belief, the activity of the ? node represents querying of information, and the synchronous firing of a role node (e.g., giver) and an entity’s (or type’s) + node represents the dynamic binding of the role and the entity (or type). Type + nodes are further differentiated to encode quantification (e for existential and v for universal). Thus the sustained activity of +:give together with the firing of +:John, +:Mary, and +e:Book in synchrony with giver, recipient, and give-object, respectively, encodes the active belief: “John gave Mary a book.” This activity immediately leads to the inference “Mary owns a book” because of synchronous activity propagating along connected nodes (e.g., owner synchronizes with recipient, and hence, with +:Mary).
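The binding-by-synchrony idea described above can be illustrated with a minimal sketch. It is not SHRUTI’s implementation: firing phases are reduced to small integers, and every name in the code (entity_phase, RULE_GIVE_OWN, propagate, read_bindings) is a hypothetical label introduced only for illustration.

```python
# Hypothetical sketch: integers stand in for firing phases within one cycle.
# None of these names come from SHRUTI itself.

# Each active entity (or type) + node fires in its own phase.
entity_phase = {"John": 0, "Mary": 1, "Book": 2}

# A role node is bound to a filler by firing in the filler's phase.
give_roles = {
    "giver": entity_phase["John"],
    "recipient": entity_phase["Mary"],
    "give-object": entity_phase["Book"],
}

# The rule give(x,y,z) => own(y,z), written as links between role nodes.
RULE_GIVE_OWN = {"recipient": "owner", "give-object": "own-object"}


def propagate(antecedent_roles, role_links):
    """Each consequent role starts firing in the phase of its linked antecedent role."""
    return {target: antecedent_roles[source] for source, target in role_links.items()}


def read_bindings(roles, phases):
    """Recover role -> filler bindings by matching firing phases."""
    by_phase = {phase: name for name, phase in phases.items()}
    return {role: by_phase[phase] for role, phase in roles.items()}


own_roles = propagate(give_roles, RULE_GIVE_OWN)
print(read_bindings(own_roles, entity_phase))
# {'owner': 'Mary', 'own-object': 'Book'}
```

In this toy version the inference “Mary owns a book” falls out of phase propagation alone; nothing in the sketch stores the fact itself.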

Figure 1: A simplified depiction of how relations, entities, types, and rules are represented in SHRUTI. Labels such as +, ?, and giver are nodes. Each node corresponds to a small ensemble of cells. A bidirectional link is a short-hand notation for two links, one in either direction. Only some of the nodes and links are shown; for example, Book and Car, like Person, are also subtypes of Thing.
Contrary to what is claimed in the target article (Section 3, para 2; Section 3.2, para 1), no pre-existing fact nodes or synchrony detectors are required for the activity-based encoding of “John gave Mary a book,” and no such nodes/detectors are required for drawing inferences based on this fact. Furthermore, as long as SHRUTI is told that Dumbledore is an entity or an instance of an existing type, it will have no problem encoding the novel event “John gave Dumbledore a book” and productively inferring that Dumbledore owns the book.
This brings out the first major error in the authors’ understanding of the temporal synchrony-based SHRUTI architecture. Contrary to their claim, SHRUTI does not require pre-wired fact nodes for all possible facts. SHRUTI only requires fact nodes (actually, fact circuits) for encoding memorable facts in its long-term memory.
Problem of 2: While the simple network shown in Figure 1 permits an entity to simultaneously fill multiple roles in different relations, it cannot simultaneously encode multiple instances of the same event-type (e.g., “John gave Mary a book” and “Mary gave John a pen”) without binding confusions. Shastri & Ajjanagadde (1993) presented a solution to this problem within the temporal synchrony framework. The solution required having a small number of copies of each relational focal-cluster and encoding rules by interconnecting antecedent and consequent focal-clusters via a switching circuit (Mani & Shastri, 1993; see Wendelken & Shastri, 2004 for an alternate solution).
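The binding confusion, and the way multiple copies of a focal-cluster remove it, can be sketched as follows. This is an illustrative toy under stated assumptions, not the switching-circuit solution itself: phases are integers, events are assigned to copies in round-robin order, and the names encode and is_ambiguous are hypothetical.

```python
# Hypothetical sketch of the problem of 2; all names are illustrative only.
phase = {"John": 0, "Mary": 1, "Book": 2, "Pen": 3}

events = [
    {"giver": "John", "recipient": "Mary", "give-object": "Book"},
    {"giver": "Mary", "recipient": "John", "give-object": "Pen"},
]


def encode(events, copies):
    """Assign events to focal-cluster copies; each role node records the phases it fires in."""
    clusters = [dict() for _ in range(copies)]
    for index, event in enumerate(events):
        cluster = clusters[index % copies]  # assumed round-robin assignment
        for role, filler in event.items():
            cluster.setdefault(role, set()).add(phase[filler])
    return clusters


def is_ambiguous(cluster):
    """A role node firing in more than one phase no longer identifies a single filler."""
    return any(len(phases) > 1 for phases in cluster.values())


single = encode(events, copies=1)
double = encode(events, copies=2)
print([is_ambiguous(c) for c in single])  # [True]
print([is_ambiguous(c) for c in double])  # [False, False]
```

With a single give focal-cluster, giver fires in the phases of both John and Mary, so it is no longer possible to tell who gave the book and who gave the pen; with two copies each event keeps its own bindings.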
The authors correctly point out that the use of multiple copies of focal-clusters makes it difficult to learn regularities between relations. Since an occurrence of a situation wherein a give event leads to an own event would engage only one focal-cluster each of give and own, respectively, only this pair of focal-clusters will learn the causal link between give and own. Learning the link between all pairs of give and own focal-clusters would require either a mechanism to automatically perform the requisite weight changes across all pairs of focal-clusters or a large number of occurrences (sooner or later each pair of own and give focal-clusters would participate in a relevant situation and learn the link).
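To give a rough sense of what “a large number of occurrences” could mean, the following sketch computes how many give-own situations would be needed, on average, before every pair of copies has taken part in one. It rests on an assumption that is not in the source: each occurrence engages one pair of focal-clusters chosen uniformly at random and independently of earlier occurrences. Under that assumption the question is the classical coupon-collector problem over k × k pairs. The function name expected_occurrences is hypothetical.

```python
from fractions import Fraction


def expected_occurrences(copies):
    """Expected occurrences until every (give copy, own copy) pair has been engaged once.

    Assumes each occurrence picks one of copies * copies pairs uniformly at random
    (coupon-collector expectation: n * (1 + 1/2 + ... + 1/n) for n pairs).
    """
    pairs = copies * copies
    harmonic = sum(Fraction(1, k) for k in range(1, pairs + 1))
    return float(pairs * harmonic)


for copies in (1, 2, 3):
    print(copies, round(expected_occurrences(copies), 2))
# 1 1.0
# 2 8.33
# 3 25.46
```

Even with three copies per relation, this idealized estimate already exceeds twenty-five occurrences for a single rule, and real usage is unlikely to be uniform.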
Unfortunately, the solution proposed by the authors suffers from exactly the same drawback as that suffered by the solution developed for SHRUTI. Instead of using multiple copies of relational focal clusters, the NBB architecture uses multiple copies of assemblies for each linguistic constituent (e.g., S, NP, and VP). The use of multiple copies renders the learning of structural dependencies and syntactic constraints between constituents very difficult, since any regularity learned from a sentence will be recorded only in the copy of a constituent that was used to encode the sentence; it will not generalize to all the other copies of the constituent. Thus, while the noun subassembly of N1 (or S1 or C1) and the verb subassembly of V1 (or S1 or C1) may learn the correct constraints/rules about noun-verb agreement, the noun and verb subassemblies of other copies of NP, S, and C assemblies will not. The lack of generalization over multiple copies will also manifest itself in the learning of (i) interactions between constituents and control circuits governing sentence parsing and (ii) gating circuits for controlling the flow of activity between subassemblies.
Problem of variables (reasoning with abstract rules): The NBB architecture can look up and extract bindings from compositional structures, but contrary to what is said in Section 6.6, it cannot perform reasoning with such structures. The authors state that “information of the form own(X,?) can be transformed into information of the form give(-,X,?) on the basis of a long-term association between own-agent and give-recipient (as in the model of Shastri & Ajjanagadde, 1993).” Indeed, SHRUTI can make such transformations rapidly. Referring back to Figure 1, if the network state is initialized such that ?:own fires and owner and own-object fire in synchrony with ?:Tom and ?e:Thing, respectively, the resulting network state represents the active query: “Does Tom own some thing?” Spreading activation within the type hierarchy and rules would transform this query into a large number of queries including “Did a person give Tom something?”
But SHRUTI’s ability to make such transformations has no bearing on the ability of the NBB architecture to perform reasoning. SHRUTI can make such transformations rapidly because it explicitly encodes (i) each semantic role of relations such as give and own (e.g., give-recipient and own-agent) and (ii) the systematic associations between these roles (see the encoding of give(x,y,z) ⇒ own(y,z) in Figure 1). In contrast, NBB captures the compositional structure of sentences involving give and own by binding the role-fillers of give and own in a given sentence to generic linguistic constituents and generic thematic roles (Figure 14; target article). NBB, however, does not explicitly encode semantic roles such as give-recipient and own-agent - neither in its activity-based representation, nor in its long-term memory, and it does not have any representational machinery for capturing the associations between such semantic roles. Consequently, it cannot readily transform own(X,?) into give(-,X,?).
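The dependence of this transformation on explicitly encoded role associations can be made concrete with a small sketch. It is a simplification under stated assumptions: the tables RELATION_ROLES and RULES and the function transform_query are hypothetical names, unbound roles are marked with None, and the type hierarchy (which would further constrain the giver to a Person) is left out.

```python
# Hypothetical sketch: explicit semantic roles and the associations between them.
RELATION_ROLES = {
    "give": ("giver", "recipient", "give-object"),
    "own": ("owner", "own-object"),
}

# For give(x,y,z) => own(y,z): consequent relation -> (antecedent, role associations).
RULES = {"own": ("give", {"owner": "recipient", "own-object": "give-object"})}


def transform_query(relation, bindings):
    """Rewrite a query about the consequent relation as a query about the antecedent."""
    antecedent, role_map = RULES[relation]
    query = {role: None for role in RELATION_ROLES[antecedent]}  # None = unbound role
    for role, filler in bindings.items():
        query[role_map[role]] = filler
    return antecedent, query


print(transform_query("own", {"owner": "Tom", "own-object": "Thing"}))
# ('give', {'giver': None, 'recipient': 'Tom', 'give-object': 'Thing'})
```

The rewrite is possible only because role_map exists; a representation that binds fillers to generic constituents and thematic roles, with no table relating own roles to give roles, offers nothing for transform_query to consult.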
Interaction between activity-based (dynamic) bindings and long-term bindings: The authors’ proposal about encoding constituent binding in long-term memory suffers from two problems. First, the authors entertain a variant of the long-term learning problem that is of limited relevance from the standpoint of cognition. Second, they present a solution that is unlikely to work.
The idea of storing sentence structure in long-term memory seems ill-motivated. It is widely believed that we remember the semantic content of what we hear and read and not the actual linguistic input used to convey the content. At the sentence level, this semantic content corresponds, typically, to events and situations that can be viewed as instantiations of multi-dimensional relations (e.g., semantic frames). Language, then, is a means of conveying the bindings of roles and parameters in a multi-dimensional relational instance using a one-dimensional stream of words, and parsing is the process by which the appropriate bindings of a semantic frame are extracted from a stream of words. The sentence structure (or parse tree) is only a means to this end. Thus what need to be memorized are events and situations conveyed by sentences, not sentence structures.
The authors’ solution to the problem of one-trial learning of sentence structure using the hippocampal complex (HC) (Section 6.5) rests on untested assumptions and is unlikely to work. The solution requires HC to create a conjunctive encoding of the activity of the delay assemblies of all the memory circuits involved in representing the sentence structure (Section 6.5.1). The number of such assemblies - for even moderately complex sentences - is likely to be large (> 10), and it is not clear whether the requisite conjunctive representations can be recruited in the HC while keeping interference within reasonable bounds. It is also not clear how many distinct sentence structures can be memorized. The answers to these questions depend on, among other things, the anatomy of the HC, the number of cells in different subregions of the HC, the density of projections between subregions, and the physiological parameters governing LTP (e.g., how many concurrent inputs are required to induce LTP). An analysis of some of the available data about the human HC suggests that if one wants to maintain a low level of cross-talk, the maximum number of bindings that can be grouped together in a conjunct is likely to be small (~7) (Shastri, 2001b; In revision). In view of this, it would seem difficult to memorize any but the simplest of sentence structures.
Fortunately, the problem of one-shot memorization of events and situations described by sentences seems more amenable to a biologically plausible solution. A computational model of cortico-hippocampal interactions (Shastri, 2001a; 2001b; 2002; In revision) demonstrates that a cortically expressed transient pattern of activity representing any arbitrary event can be transformed rapidly into a persistent and robust memory trace in the HC as long as the number of distinct role-fillers specified in the event remains within ~7.
Nested Structure: As discussed in Shastri & Ajjanagadde (1993b), a possible way of increasing nesting levels is to use a richer temporal structure whereby bindings at a deeper level of nesting are represented by synchronization at faster frequencies (e.g., gamma band) and bindings at a shallow level of nesting by synchronization at slower frequencies (e.g., theta band). Moreover, as discussed in Shastri & Ajjanagadde (1993b; Section R2.5), many problems that seem to require deeper levels of nesting can be reformulated so as to require shallow nesting.
Parsing: The NBB architecture parses English sentences; it encodes rules and constraints pertaining to grammatical knowledge and performs inferences required for parsing sentences (cf. Section 6.8.4). Given the focus of the NBB architecture on parsing, it is odd that the authors did not compare their approach to that of Henderson (1994), who developed an online, incremental parser motivated by the temporal synchrony framework.
References
Henderson, J. (1994) Connectionist Syntactic Parsing Using Temporal Variable Binding. Journal of Psycholinguistic Research, 23: 353-379.
Mani, D. & Shastri, L. (1993) Reflexive reasoning with multiple-instantiation in a connectionist reasoning system with a type hierarchy. Connection Science, 5: 205-242.
Shastri, L. (1999) Advances in SHRUTI - a neurally motivated model of relational knowledge representation and rapid inference using temporal synchrony. Applied Intelligence, 11: 79-108.
Shastri, L. (2001a) A computational model of episodic memory formation in the hippocampal system. Neurocomputing, 38-40: 889-897.
Shastri, L. (2001b) Episodic memory trace formation in the hippocampal system: a model of cortico-hippocampal interaction. ICSI Technical Report tr-01-00,
Shastri, L. (2002) Episodic memory formation and cortico-hippocampal interactions. Trends in Cognitive Science, 6: 162-168.
Shastri, L. (In revision) From transient patterns to persistent structures. Submitted to Behavioral and Brain Sciences. Available at www.icsi.berkeley.edu/~shastri/psfiles/shastri_em.pdf
Shastri, L. & Ajjanagadde, V. (1993) From simple associations to systematic reasoning. Behavioral and Brain Sciences, 16: 417-494.
Shastri, L. & Ajjanagadde, V. (1993b) A step toward reflexive reasoning. Behavioral and Brain Sciences, 16: 477-494.
Shastri, L. & Wendelken, C. (2000) Seeking coherent explanations - a fusion of structured connectionism, temporal synchrony, and evidential reasoning. In: Proceedings of the Twenty-Second Conference of the Cognitive Science Society, 453-458. Philadelphia, PA.
Wendelken, C. & Shastri, L. (2004) Multiple instantiation and rule mediation in SHRUTI. Connection Science, 16: 211-217.