Paper written for the Cognitive Modelling Workshop of the Seventh Australian Conference on Neural Networks, Australian National University Canberra, 9 April 1996.

Case Study #??: SHRUTI's treatment of negation and inconsistency

A connectionist treatment of negation and inconsistency

Lokendra Shastri and Dean J. Grannes
International Computer Science Institute
1947 Center St., Ste. 600
Berkeley, CA 94704

Target Paper: L. Shastri, D.J. Grannes (1996). A connectionist treatment of negation and inconsistency. To appear in the Proceedings of the Eighteenth Annual Conference of the Cognitive Science Society, July 1996.

Simulation: An agent may act in an erroneous manner due to time pressure or limited attention. Focused evaluation or an appropriate cue might make the necessary information available and lead to a correct response.

Introduction: SHRUTI is an attempt to understand how a system of simple neuron-like elements can encode a large body of background knowledge and perform certain inferences rapidly (reflexively). This includes the ability to deal with negated and inconsistent knowledge.

The ability to perform inferences in order to establish referential and causal coherence and generate expectations plays a crucial role in understanding language. Given the rate at which we can understand language, it is apparent that we can perform the requisite inferences rapidly --- as though they were a reflex response of our cognitive apparatus. In view of this, Shastri & Ajjanagadde (1993) have described such reasoning as reflexive. Certain types of negated knowledge also play a role in such reasoning. Given ``John has been to Canada'' and ``John has not been to Europe'', one can readily answer the questions (i) ``Has John been to North America?'', (ii) ``Has John been to France?'' and (iii) ``Has John been to Australia?'' with ``yes'', ``no'', and ``don't know'', respectively. Similarly, given ``John is a bachelor'', one can readily answer ``no'' to ``Is John married to Susan?'' Observe that answering this question involves the use of negated knowledge that may be approximated as ``A bachelor is not married to anyone''.

The encoding of negated knowledge raises the possibility of inconsistencies in an agent's long-term memory (LTM). We often hold inconsistent beliefs in our LTM without being explicitly aware of such inconsistencies. But at the same time, we often recognize contradictions in our beliefs when we try to bring inconsistent knowledge to bear on a particular task. In view of this, a cognitively plausible model of memory and reasoning should allow inconsistent facts and rules to co-exist in its LTM. But at the same time, it should be capable of detecting contradictions whenever inconsistent beliefs become co-active during an episode of reasoning.

Finally, any agent with limited resources must sometimes act with only limited attentional focus and often under time pressure. This means that an agent may sometimes overlook relevant information and act in an erroneous manner. Focused evaluation or an appropriate cue, however, might make the necessary information available and lead to a correct response.

Memory: Memory in SHRUTI exists in both dynamic (i.e., transient) and long-term forms. While long-term memory is stored in the patterns of interconnections in the network (as described in the section entitled "Structure" below), dynamic memory is expressed as rhythmic patterns of activation (described in the section entitled "Time" below).

Time: Time is important to the SHRUTI system in that role-filler bindings are accomplished through the synchronous firing of appropriate role and filler nodes. If two nodes are firing synchronously, they are considered to be bound.
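To make the binding-by-synchrony convention concrete, here is a minimal sketch (our idealization, not SHRUTI's circuitry) in which a node's periodic firing is summarized by its phase within the oscillation cycle, and two nodes count as bound when their phases coincide within a tolerance. The node names and the tolerance value are illustrative.

```python
# Minimal sketch of binding-by-synchrony (an idealization): a node's
# periodic firing is summarized by its phase within one cycle.

PHASE_TOLERANCE = 0.05  # fraction of a cycle; an assumed value

def is_bound(phase_a: float, phase_b: float) -> bool:
    """Two nodes are bound iff they fire in (near) synchrony."""
    diff = abs(phase_a - phase_b) % 1.0
    return min(diff, 1.0 - diff) <= PHASE_TOLERANCE

# Example: the role 'lover' and the filler 'Mary' fire at the same phase.
phases = {"lover": 0.25, "Mary": 0.25, "lovee": 0.75, "Tom": 0.75}
assert is_bound(phases["lover"], phases["Mary"])      # bound
assert not is_bound(phases["lover"], phases["Tom"])   # not bound
```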

Change: SHRUTI does not model rule learning. In SHRUTI, the interesting notion of change is the dynamic evolution of the network state after a query has been posed, or a new fact has been conveyed, to it. As the system state evolves, "new" facts are inferred, explanations are generated, and predictions are made. This process can be viewed as the propagation of a rhythmic pattern of activity over the LTM network. Rules in SHRUTI are interconnection patterns among ensembles of cells that cause the propagation and transformation of rhythmic patterns of activity. Long-term facts are subnetworks that act as temporal pattern matchers that become active under suitable circumstances and create reverberatory patterns of activity.


General representation:

Figure 1a illustrates the representation of a predicate and entities. A node such as John corresponds to a focal node of the representation of the entity ``John''. Information about the various features of John and the roles he fills in various events is encoded by linking the focal node to appropriate nodes distributed throughout the network (see Feldman 1989, Shastri 1988).

Encoding of predicates: predicate clusters as convergence zones:

Consider the encoding of the binary predicate love with two roles: lover and lovee. This predicate is encoded by a cluster of nodes consisting of two role nodes depicted as circular nodes and labeled lover and lovee; an enabler node depicted as a pentagon pointing upwards and labeled e:love; and two collector nodes depicted as pentagons pointing downwards and labeled +c:love and --c:love respectively. In general, the cluster for an n-ary predicate contains n role nodes, one enabler node, and two collector nodes. The circular nodes are rho-btu nodes while the pentagon shaped nodes are tau-and nodes. The computational behavior of these nodes will be described shortly.
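The cluster structure can be pictured with a small data sketch; the class and field names below are ours, introduced only to mirror the description above (n role nodes, one enabler, two collectors), not SHRUTI's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Node:
    """A named node with a scalar activation level (an idealization)."""
    name: str
    activation: float = 0.0

@dataclass
class PredicateCluster:
    """Cluster for an n-ary predicate: n rho-btu role nodes, one tau-and
    enabler node, and two tau-and collector nodes."""
    predicate: str
    roles: Dict[str, Node] = field(default_factory=dict)

    def __post_init__(self):
        self.enabler = Node(f"e:{self.predicate}")          # "is this instance supported?"
        self.pos_collector = Node(f"+c:{self.predicate}")   # support for the instance
        self.neg_collector = Node(f"--c:{self.predicate}")  # support for its negation

def make_cluster(predicate: str, role_names: List[str]) -> PredicateCluster:
    return PredicateCluster(predicate, {r: Node(r) for r in role_names})

love = make_cluster("love", ["lover", "lovee"])  # the cluster of Figure 1a
```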

The cluster of nodes described above acts as an anchor for the complete encoding of a predicate. All rules and facts that involve a predicate converge on its cluster, and all such rules and facts can be accessed by fanning out from this cluster. This representation of a predicate is consistent with the notion of ``convergence zones'' (Damasio 1989).

The semantic import of the enabler and collector nodes is as follows. Assume that the roles of a predicate P are dynamically bound to some fillers, thereby representing a dynamic instance of P (we will see how, shortly). The activation of the enabler e:P means that the system is trying to determine whether the currently active dynamic instance of P is supported by the knowledge in its memory. The request for such an explanation might be generated internally by the reasoning system, or be communicated to it by some other subsystem (e.g., the planning module). The semantic import of the two collectors +c:P and --c:P is the complement of that of the enabler node. The system activates the positive collector +c:P when the currently active dynamic instance of P is supported by the knowledge encoded in the system. In contrast, the system activates the negative collector --c:P when the negation of the active instance is supported by the system's knowledge. Neither collector becomes active if the system does not have sufficient information about the currently active dynamic instance. The collectors can also be used by an external process. For example, the language understanding process might activate +c:love and establish the bindings (lover=John, lovee=Mary) upon hearing the utterance ``John loves Mary''. Since the two collectors encode mutually contradictory information, they have mutually inhibitory links.

Detecting a contradiction: The levels of activation of the positive and negative collectors of a predicate measure the effective degree of support offered by the system to the currently active predicate instance. These levels of activation are the result of the activation incident on the collectors from the rest of the network and the mutual inhibition between the two collectors. The two activation levels encode a graded belief ranging continuously from ``no'' on the one extreme --- where only the negative collector is active, to ``yes'' on the other --- where only the positive collector is active, with ``don't know'' in between --- where neither collector is very active. If the two collectors receive comparable and strong activation, however, both can be in a high state of activity. When this happens, a contradiction is detected by an additional node within each predicate cluster (not shown in Figure 1a).
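The graded yes/no/don't-know reading of the two collector levels, and the contradiction signal, can be sketched as follows. The threshold value and the linear form of the inhibition are illustrative assumptions, not values from the paper.

```python
def read_answer(pos: float, neg: float, theta: float = 0.5) -> str:
    """Interpret the activation levels of +c:P and --c:P (both in [0,1]).
    The threshold theta is an assumed value."""
    if pos > theta and neg > theta:
        return "contradiction"   # both collectors strongly co-active
    if pos > theta and pos > neg:
        return "yes"
    if neg > theta and neg > pos:
        return "no"
    return "don't know"          # neither collector is very active

def mutual_inhibition(pos: float, neg: float, w: float = 0.5):
    """One step of mutual inhibition between the two collectors
    (a linear form, assumed for illustration)."""
    return max(0.0, pos - w * neg), max(0.0, neg - w * pos)

print(read_answer(0.9, 0.1))   # yes
print(read_answer(0.1, 0.8))   # no
print(read_answer(0.2, 0.3))   # don't know
print(read_answer(0.8, 0.8))   # contradiction
# A strong negative collector suppresses a moderate positive one:
print(read_answer(*mutual_inhibition(0.7, 1.0)))  # no
```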

Computational behavior of idealized nodes: If a rho-btu node A is connected to another rho-btu node B then the activity of B synchronizes with the activity of A. In particular, a periodic firing of A leads to a periodic and in-phase firing of B. A tau-and node becomes active on receiving a pulse (or a burst of activity) exceeding a minimum duration, pi. Thus a tau-and node behaves like a temporal and node. On becoming active, it produces an output pulse similar to the input pulse.
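Discretely, the two node types can be sketched like this; the pulse representation is our simplification of the idealized behavior described above.

```python
# Idealized node behaviors, sketched discretely (our simplification).

def rho_btu_output(input_phase: float) -> float:
    """A rho-btu node fires in phase with the node driving it: a periodic
    input at some phase yields a periodic, in-phase output."""
    return input_phase

def tau_and_output(pulse: list, pi: int) -> list:
    """A tau-and node is a temporal AND: it becomes active only if the
    input pulse (a run of 1s) lasts at least pi steps, and it then emits
    an output pulse similar to the input."""
    longest = run = 0
    for x in pulse:
        run = run + 1 if x else 0
        longest = max(longest, run)
    return pulse if longest >= pi else [0] * len(pulse)

print(tau_and_output([0, 1, 1, 1, 0], pi=3))  # long enough: passed through
print(tau_and_output([0, 1, 1, 0, 0], pi=3))  # too brief: no output
```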

Encoding dynamic bindings: Dynamic bindings are represented by the synchronous firing of appropriate role and filler nodes. With reference to Figure 1a, the rhythmic pattern of activity shown in Figure 1b represents the dynamic bindings (lover=Mary, lovee=Tom) (i.e., the dynamic fact love(Mary,Tom)). Observe that Mary and lover are firing in synchrony and Tom and lovee are firing in synchrony. The absolute phase of firing of nodes is not significant. Also, since e:love is firing, the system is essentially ``asking'' whether it believes that Mary loves Tom.
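The rhythmic pattern of Figure 1b can be summarized as an assignment of phases to nodes; only which nodes share a phase is significant. The dictionary form below is our shorthand, not SHRUTI's representation.

```python
def encode_bindings(bindings: dict) -> dict:
    """bindings: {role: filler}. Give each binding its own phase slot;
    the absolute phase values are arbitrary."""
    phases = {}
    for slot, (role, filler) in enumerate(bindings.items()):
        phases[role] = phases[filler] = slot / len(bindings)
    return phases

# The dynamic fact love(Mary,Tom): lover fires with Mary, lovee with Tom.
state = encode_bindings({"lover": "Mary", "lovee": "Tom"})
assert state["lover"] == state["Mary"] and state["lovee"] == state["Tom"]
assert state["lover"] != state["lovee"]
# With e:love firing as well, this pattern *asks* whether Mary loves Tom.
```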

Encoding long-term facts: Memory as a temporal pattern matcher: A long-term fact behaves like a temporal pattern matcher that becomes active whenever the static bindings it encodes match the dynamic bindings represented in the system's state of activation. Figure 2a illustrates the encoding of the long-term facts love(John,Mary) and ~love(Tom,Susan). Given the query love(John,Mary)?, the fact node F1 will become active and activate the collector +c:love, indicating a ``yes'' answer. Similarly, given the query love(Tom,Susan)?, the fact node F2 will become active and activate the collector --c:love, indicating a ``no'' answer. Finally, given the query love(John,Susan)?, neither +c:love nor --c:love would become active, indicating that the system can neither affirm nor deny whether John loves Susan.
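A fact node can thus be pictured as a check that every statically stored role-filler pair is in synchrony in the current state, with the polarity of the fact selecting which collector it drives. The code below is a schematic sketch of this matching, not SHRUTI's fact circuit.

```python
def fact_matches(static_bindings: dict, phases: dict) -> bool:
    """static_bindings: {role: filler} stored in the long-term fact.
    phases: the current dynamic phase of each active node."""
    return all(role in phases and filler in phases
               and phases[role] == phases[filler]
               for role, filler in static_bindings.items())

F1 = ({"lover": "John", "lovee": "Mary"}, "+c:love")   # love(John,Mary)
F2 = ({"lover": "Tom", "lovee": "Susan"}, "--c:love")  # ~love(Tom,Susan)

# Query love(John,Susan)?: lover~John share one phase, lovee~Susan another.
query = {"lover": 0.0, "John": 0.0, "lovee": 0.5, "Susan": 0.5}
for bindings, collector in (F1, F2):
    if fact_matches(bindings, query):
        print("activate", collector)
# Nothing is printed: neither collector fires, so the answer is "don't know".
```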

Encoding of rules: A rule is encoded by (i) linking the roles of the antecedent and consequent predicates so as to reflect the correspondence between these roles specified by the rule, (ii) connecting the enabler of the consequent predicate to the enabler of the antecedent predicate, and (iii) connecting the appropriate collectors of the antecedent predicates to the appropriate collector of the consequent predicate. The collector link originates from the positive (negative) collector of an antecedent predicate if the predicate appears in its positive (negated) form in the antecedent. Similarly, the link terminates at the positive (negative) collector of the consequent predicate if the predicate appears in a positive (negated) form in the consequent. Figure 2b shows the encoding of the rule bachelor(x) --> ~married(x,y). Observe that a rule and its contrapositive are two distinct rules and one may be encoded in the LTM without the other.
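Steps (i)-(iii) can be pictured as a wiring recipe; the (source, target) pairs below are our shorthand over the node names used in the figures, not SHRUTI's data structures.

```python
# Schematic wiring for bachelor(x) --> ~married(x,y).
links = [
    # (i) role correspondence: the first role of married maps to the sole
    # role of bachelor; role y has no counterpart in the antecedent
    ("married:role1", "bachelor:role1"),
    # (ii) the enabler of the consequent drives the enabler of the
    # antecedent, so a query propagates to the knowledge supporting it
    ("e:married", "e:bachelor"),
    # (iii) bachelor appears positively in the antecedent and married is
    # negated in the consequent, so +c:bachelor feeds --c:married
    ("+c:bachelor", "--c:married"),
]

for src, dst in links:
    print(f"{src} --> {dst}")
```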

The encoding of rules makes use of weighted links between predicates. These weights distinguish categorical rules from soft (default) rules and also lead to a gradual weakening of activation along a chain of inference.
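One simple way to picture the weakening: if support is attenuated multiplicatively by the weight of each link crossed (an assumption for illustration; the text states only that the weights produce a gradual weakening), a conclusion reached via a default rule is weaker than one reached via categorical rules alone.

```python
from math import prod

def chain_support(weights: list) -> float:
    """Effective support after crossing a chain of rules, assuming
    multiplicative attenuation (an illustrative assumption)."""
    return prod(weights)

print(chain_support([1.0, 1.0, 1.0]))  # all categorical rules: 1.0
print(chain_support([1.0, 0.7, 1.0]))  # one default rule in the chain: 0.7
```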

Conclusions: Several interesting consequences of dealing with negated knowledge are captured in the following scenario (which we will refer to as the Post Office Example): John runs into Mary on the street. ``Where are you going?'' asks John. ``To the post office,'' replies Mary. ``But isn't today Presidents' Day?'' remarks John. ``Oops! I forgot that today was a federal holiday,'' says Mary after a momentary pause and heads back.

Clearly, Mary had sufficient knowledge to infer that ``today'' was a postal holiday. But the fact that she was going to the post office indicates that she had assumed that the post office was open. So in a sense, Mary held inconsistent beliefs. John's question served as a trigger that brought the relevant information to the surface and made Mary realize her mistake.

We model Mary's knowledge as follows (refer to Figure 3):

(i) presidents-day(day) --> federal-holiday(day)
(ii) 3rd-Mon-Feb(day) --> presidents-day(day)
(iii) 3rd-Mon-Feb(20-Feb-95)
(iv) ~3rd-Mon-Feb(21-Feb-95)
(v) weekday(day) ^ post-office(x) --> open(x,day) (with a medium weight)
(vi) weekend(day) ^ post-office(x) --> ~open(x,day)
(vii) federal-holiday(day) ^ post-office(x) --> ~open(x,day)
(viii) post-office(PO)

The significance of items (i), (v), (vi), and (vii) is fairly obvious. Item (ii) specifies that the third Monday in February is Presidents' Day. Ideally, 3rd-Mon-Feb would be realized as a mental process; we simulate such a process indirectly by assuming that it is accessed via the predicate 3rd-Mon-Feb in order to determine whether the day bound to its role is the third Monday in February. In this example, this mental ``calendar'' consists of the two facts stated in items (iii) and (iv). Item (viii) states that PO is a particular post office. Items (i), (ii), (vi), and (vii) are categorical rules about the domain and have a high weight, but item (v) corresponds to default, defeasible information and hence has a medium weight. In the current implementation, default rules have a weight of 0.70 while categorical rules have a weight of 1. We assume that ``Today'' is a concept that is bound each day to the appropriate date and to ``weekday'' or ``weekend'' depending on the day. These bindings are assumed to be available as facts in the agent's memory.
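Items (i)-(viii) can be transcribed directly in a schematic form; the weights follow the text (0.70 for the default rule (v), 1 for the categorical rules), while the notation itself is our shorthand.

```python
# Mary's knowledge, items (i)-(viii), in a schematic notation (ours).
# A rule is (antecedents, consequent, weight); "~" marks negation.
CATEGORICAL, DEFAULT = 1.0, 0.70

rules = [
    (["presidents-day(day)"], "federal-holiday(day)", CATEGORICAL),             # (i)
    (["3rd-Mon-Feb(day)"], "presidents-day(day)", CATEGORICAL),                 # (ii)
    (["weekday(day)", "post-office(x)"], "open(x,day)", DEFAULT),               # (v)
    (["weekend(day)", "post-office(x)"], "~open(x,day)", CATEGORICAL),          # (vi)
    (["federal-holiday(day)", "post-office(x)"], "~open(x,day)", CATEGORICAL),  # (vii)
]

facts = [
    "3rd-Mon-Feb(20-Feb-95)",   # (iii)
    "~3rd-Mon-Feb(21-Feb-95)",  # (iv)
    "post-office(PO)",          # (viii)
]
```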

Imagine it is 20-Feb-95, which is Presidents' Day, and Mary is planning a trip to the post office (PO). Her ``go-to-post-office'' schema has the precondition that the post office must be open, so it poses the query open(PO,Today)? Assume that after posing the query the schema monitors the activity of +c:open and --c:open and accepts an answer based on the following criterion: accept a ``yes'' (``no'') answer if the positive (negative) collector stays ahead of the other and exceeds a threshold, theta, for some minimum length of time, delta. Once the schema accepts an answer, it terminates the query and proceeds with its execution.
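The acceptance criterion can be sketched as a monitor over the two collector traces (one activation value per cycle); theta and delta are the paper's threshold and minimum duration, while the loop structure is ours.

```python
def accept_answer(pos_trace, neg_trace, theta=0.5, delta=10):
    """Return 'yes'/'no' as soon as one collector has stayed ahead of the
    other and above theta for delta consecutive cycles, else None."""
    yes_run = no_run = 0
    for pos, neg in zip(pos_trace, neg_trace):
        yes_run = yes_run + 1 if (pos > theta and pos > neg) else 0
        no_run = no_run + 1 if (neg > theta and neg > pos) else 0
        if yes_run >= delta:
            return "yes"
        if no_run >= delta:
            return "no"
    return None  # the query remains open

# E.g., +c:open crosses theta after cycle 12 and stays up while --c:open is 0:
pos = [0.0] * 12 + [0.8] * 20
neg = [0.0] * 32
print(accept_answer(pos, neg))  # 'yes' -- Mary sets off to the post office
```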

Since ``Today'' is bound to 20-Feb-95, the fact weekday(20-Feb-95) is present in Mary's memory. When the schema asks the query open(PO,Today)?, the default rule about post offices remaining open on weekdays becomes active first and activates the positive collector +c:open (refer to Figure 3 and Figure 4). If we assume theta to be 0.5, the activation of +c:open exceeds theta after 12 cycles and stays above threshold for about 20 cycles. During this time, the negative collector does not receive any activation and stays at 0. If we assume that delta is 10 cycles, the schema will accept the ``yes'' answer signaled by +c:open and withdraw the query. So Mary will set off to the post office.

Had the query remained active, the inference process would have eventually inferred that the post office is not open today. The result of the inferential process, if the query open(PO,Today)? had not been terminated by the schema, is shown in Figure 5. The dark lines show the activation of the collectors of open while the dotted lines show the activations of the collectors of some other relevant predicates. First it is inferred that today is a weekday. Next it is inferred that today is the third Monday in February. As a result, the inference that today is Presidents' Day, and hence, a federal holiday, follows. This in turn leads to the inference that the post office is not open today.

Subsequently, John asks Mary: ``Isn't today Presidents' Day?'' This causes the language process to activate e:presidents-day and bind its role to 20-Feb-95. This leads to the activation of e:3rd-Mon-Feb and then +c:3rd-Mon-Feb (via the fact 3rd-Mon-Feb(20-Feb-95)). The activation from +c:3rd-Mon-Feb works its way back and activates --c:open. Since this activation is due to categorical rules (rules ii, i, and vii), it is stronger than that arriving at +c:open from the default rule (item v). The mutual inhibition between the highly activated --c:open and the moderately activated +c:open results in the suppression of +c:open, making Mary realize that the post office is not open (see Figure 6).



References:

Cottrell, G.W. (1993) From symbols to neurons: Are we yet there? Behavioral and Brain Sciences, 16:3, p. 454.

Damasio, A.R. (1989) Time-locked multiregional retroactivation: A systems-level proposal for the neural substrates of recall and recognition. Cognition, 33, p. 25-62.

Feldman, J.A. (1989) Neural Representation of Conceptual Knowledge. In Neural Connections, Mental Computation, ed. L. Nadel, L.A. Cooper, P. Culicover, & R.M. Harnish. MIT Press.

Shastri, L. (1988) Semantic Networks: An Evidential Formulation and its Connectionist Realization. Pitman/Morgan Kaufmann.

Shastri, L. & Ajjanagadde, V. (1993) From simple associations to systematic reasoning. Behavioral and Brain Sciences, 16:3, p. 417-494.

Shastri, L. & Grannes, D.J. (1995) Dealing with negated knowledge and inconsistency in a neurally motivated model of memory and reflexive reasoning. TR-95-041, ICSI, Berkeley.
