Questions and Summaries of Srini Narayanan's articles
discussed by Fred Dick
Summary of:
Talking the talk is like walking the walk: A computational
model of verbal aspect
The Narayanan model proposes to investigate the grounding of verbal aspect in
embodied action. Aspect (to refresh the memories of non-linguists) directs the
attentional spotlight to the "temporal character" of a particular event
(whereas tense relates that event to other situations in time). Aspect appears
to be a near-universal phenomenon, although it is coded quite differently
across languages.
The hypothesis being looked at: linguistic aspect is a direct byproduct of
sensory-motor schemas. The existence of sensorimotor schemas is assumed, as is
a "controller" of these schemas (rather like the Baddeley "central executive").
Like the Bailey model, this simulation attempts to bi-directionally map verbal
aspect and x-schemas. Again, the x-schema is a serial algorithm (shown in
the flow-chart) which should produce a set of motor activities, like walking.
Narayanan has derived a set of "process primitives" which permit the
construction of such an algorithm. They are: duration of movement, periodicity
(or lack of it) in movement, level of resources allocated, goal of movement,
and current conditions of system.
The schema controller (the central executive) is characterized by the 2nd flow
chart (fig 2) and is itself an x-schema. It sends signals to particular motor
schemas, and serves to control their sequencing. The schema controller moves
through a series of states, and can chart possible trajectories through those
states.
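The controller's behavior can be sketched as a small state machine. This is an
illustrative reconstruction, not Narayanan's implementation: the state names
("ready", "ongoing", "suspended", "done") and the transition signals are my
assumptions based on the paper's description of the controller moving through
a series of states and charting trajectories through them.

```python
# Hypothetical sketch of a schema controller as a state machine.
# State and signal names are invented for illustration.

CONTROLLER_TRANSITIONS = {
    ("ready",     "start"):   "ongoing",
    ("ongoing",   "suspend"): "suspended",
    ("suspended", "resume"):  "ongoing",
    ("ongoing",   "finish"):  "done",
    ("ongoing",   "iterate"): "ongoing",   # periodic schemas loop back
}

def step(state, signal):
    """Advance the controller; unknown signals leave the state unchanged."""
    return CONTROLLER_TRANSITIONS.get((state, signal), state)

def run(signals, state="ready"):
    """Chart one trajectory through the controller's states."""
    trajectory = [state]
    for sig in signals:
        state = step(state, sig)
        trajectory.append(state)
    return trajectory

print(run(["start", "suspend", "resume", "finish"]))
# ['ready', 'ongoing', 'suspended', 'ongoing', 'done']
```

The point of the sketch is only that the controller's "trajectory through
states" is a well-defined computational object that aspect markers could
address.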
It is further hypothesized that these levels of schemas are instantiated
neurally, and that they powerfully influence language. (from bottom of 2nd
page) "...Aspect Modifiers or other grammatical devices are like knobs which
when set activate the corresponding controller node, sanctioning which
inferences can be made by the hearer, given the same underlying schema (verb
form). Languages may differ in which knob settings they allow, and hence may
vary which aspects and how much bandwidth they allow the speaker." To me,
this seems very much like a version of Principles and Parameters theory,
although I am sure that nothing of the kind is intended by Mr. Narayanan!
Like the Bailey model, the aspect model uses Petri nets and serial algorithms
as basic structures. [Perhaps Mike Hayward and Dan could comment upon the
particular technical features of this model, as I am not at all familiar with
the issues surrounding Petri Nets...]
Information flows through the model in the following way: aspect markers (like
-ing, start, stop, -ed etc.) send feature tokens to the controller. These
interact with the controller in terms of directing paths through parameters of
primitives. I hope that Mr. Narayanan will be able to clarify exactly what
happens as the model is trained, as the description is somewhat opaque, as are
some of the conclusions. For example, it is unclear why and how the model,
after hearing the sentence "Jack has walked to the store" would activate the
"perfect" state (associated with "result" in the semantic vector and "walk" in
the motor vector ) and would be able to make an inference that Jack is at the
store and is possibly tired. Other examples of aspect as a byproduct of
sensorimotor schemas are those given by Lakoff last week, such as motor
iteration as imperfective aspect ("-ing").
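One way to make the "Jack has walked to the store" example concrete is a toy
mapping from aspect markers to controller states, and from states to licensed
inferences. Everything here is a hypothetical illustration of the information
flow described above; the tables are not taken from the paper.

```python
# Hypothetical sketch: aspect markers set a controller node, which
# licenses inferences given the underlying verb schema. Both tables
# below are invented for illustration.

MARKER_TO_STATE = {
    "-ing":    "ongoing",   # progressive: process underway
    "has -ed": "perfect",   # perfect: result state holds
    "start":   "ready",
}

STATE_INFERENCES = {
    "ongoing": ["process is underway", "goal not yet reached"],
    "perfect": ["process completed", "result state holds now"],
}

def infer(marker, schema):
    """Return the inferences licensed by this marker for this schema."""
    state = MARKER_TO_STATE.get(marker)
    return [f"{schema}: {i}" for i in STATE_INFERENCES.get(state, [])]

# "Jack has walked to the store" -> perfect state of the walk schema
print(infer("has -ed", "walk-to-store"))
# ['walk-to-store: process completed', 'walk-to-store: result state holds now']
```

On this picture, "Jack is at the store" falls out of the perfect state's
"result state holds" inference combined with the goal encoded in the walk
schema itself.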
The paper unfortunately does not contain any data or results, undoubtedly due
to an early submission deadline or similar difficulty. It is unclear to me
what the purpose of the flowcharts on "imperfect/perfect", "die", and "start"
is. Perhaps it is simply to illustrate that verbal aspect can be characterized
using flowcharts depicting basic serial processing, a method that has also been
applied to motor control problems in AI.
A few very general questions:
What is the advantage of attempting to model embodied language using a typical
AI serial modular processor vis-à-vis a connectionist approach?
What neurobiological evidence is there for motor schemas?
What are the difficulties in modeling the co-evolution of language and motor
processes, particularly in a distributed approach?
How difficult will it be to scale up to a less well-defined problem, e.g., are the
results of the model (whatever they are) due to the fact that the semantic and
motor possibilities are so carefully tailored to a particular theoretical bent,
or will the model generate the same results given a more realistic set of
possibilities?
Overview of "Modeling Embodied Lexical Development"
Bailey, Feldman, Narayanan, and Lakoff.
The goal of the model was to develop a structured connectionist system of
neurally plausible conceptual, representational, and lexical learning. The
basic approach assumed that lexical development was an ideal venue for testing
hypotheses about embodied systems as it (development) can be linguistically and
conceptually broken down into simple and tractable models. The first
instantiation of this was the Regier model presented last week. The Bailey
paper emphasizes that an essential assumption of this model is that all people
have the same visual system, and that visual concepts MUST arise from its
capacities.
This particular model addresses learning of "hand-motion" verbs, such as
"yank", "pull", "slide", and so forth. Stated at the outset is the claim
that, in order to model a child learning to label its own actions, the
"program" must be able to act out these actions. Bailey et al make the
trenchant observation that standard accounts of lexical development tend to
ignore the functional aspects of concepts and their labeling. [Why this paper
claims that a PDP account (in particular one that employs backprop) is unable
to account for function is unclear to me.]
The basic layout of the model is this: motor acts, represented in the bottom
layer of the model, relate action features and world states (at the middle
layer) to the top word level. Information flow is bi-directional. As
per last week's discussion, the Bailey group suggests that there should be
four "requirements" of language models: 1) that they involve multiple levels
of representation, 2) that they be computationally testable, 3) that they be
*reducible* to "structured" connectionist models, and 4) that representations
be able to use "computational learning algorithms." Bailey also finds it
useful to employ the motor (or X-) schemas in such models, which are
essentially programmed movements strung together in a particular algorithm.
This model, like the others discussed, uses Petri nets as the basic
computational device, as they "cleanly capture concurrent and event-based
asynchronous control", and also can sequence, be hierarchically ordered, as
well as be "parameterized". The model itself is put together with "places" and
"transitions" (circles and rectangles in the flow chart). If enough places are
active (be filled with tokens) around a transition, it turns "on" and may fire,
annulling the old inputs and shunting the new tokens into outputs. The
transitions can also trigger motor algorithms. Also utilized are feature
structures (f-structs) which essentially encode feature vectors. [Evidently
this is a concept borrowed from unification grammars.] The features are not
on/off, but have a range of values. The "linking f-struct" in the middle of
figure 2 is the mediating artifact (to borrow a term) between motor and
linguistic representations. Note that Bailey claims that the requirements of
parameterizing the x-schemas are the principal determinants of how the
semantic features are encoded. In addition, the x-schema features were chosen
in order to allow the model to learn relevant verbs from any language. It
would be interesting to know 1) how these features were hit upon, and 2) how
verbs were selected across languages. Verbs, like the features in the linking
f-struct, are represented by feature vectors with probabilistic values.
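For those of us unfamiliar with Petri nets, the firing rule described above
can be written down in a few lines. This is a minimal sketch of the
place/transition/token mechanics only; the place names are invented, and real
x-schema nets additionally support hierarchy and parameterization.

```python
# Minimal Petri-net firing rule. A marking maps place names to token
# counts; place and transition names here are illustrative only.

def enabled(marking, inputs):
    """A transition is enabled when every input place holds a token."""
    return all(marking.get(p, 0) >= 1 for p in inputs)

def fire(marking, inputs, outputs):
    """Consume one token from each input place, add one to each output.
    If the transition is not enabled, the marking is unchanged."""
    if not enabled(marking, inputs):
        return marking
    new = dict(marking)
    for p in inputs:
        new[p] -= 1
    for p in outputs:
        new[p] = new.get(p, 0) + 1
    return new

marking = {"ready": 1, "energy": 1}
marking = fire(marking, inputs=["ready", "energy"], outputs=["walking"])
print(marking)   # {'ready': 0, 'energy': 0, 'walking': 1}
```

In the model, firing a transition like this is also the point at which a
motor algorithm can be triggered.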
Error reduction in the model is carried out using a Bayesian algorithm (see
figure in paper) which tries to maximize fit in both linguistic and motor
outputs. The assumptions about human lexical learning that are built into the
model are: 1) that children label experience, 2) that acquisition of motor
schemas precedes their labeling, 3) that an informant provides the necessary
verb in the correct context, and 4) that children learn words without negative
evidence and through fast mapping (e.g. Plunkett, von der Malsburg). The
essential problem presented to the model consists of finding the correct number
of senses for each verb, as well as mapping on not only the correct features to
those verbs, but also the correct distribution of weights. It is somewhat
unclear in the paper exactly what the results of the model are (presumably
these will be addressed tomorrow).
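The sense-selection problem can be sketched generically, even without the
paper's results: treat each verb sense as a probabilistic f-struct over
features, and pick the sense that maximizes prior times likelihood. The verb
senses, feature names, priors, and probabilities below are all invented for
illustration; this is a generic maximum-a-posteriori sketch, not Bailey's
actual algorithm or data.

```python
# Hypothetical sketch of probabilistic sense selection. Each sense
# pairs a prior with P(feature = 1) for each binary feature; all
# numbers and names below are made up.

SENSES = {
    "push-1": (0.6, {"force": 0.9, "away": 0.8, "iterated": 0.1}),
    "push-2": (0.4, {"force": 0.9, "away": 0.2, "iterated": 0.8}),
}

def posterior(sense, observation):
    """Unnormalized posterior: prior times feature likelihoods."""
    prior, feats = SENSES[sense]
    p = prior
    for f, v in observation.items():
        pf = feats[f]
        p *= pf if v else (1.0 - pf)
    return p

def best_sense(observation):
    """Pick the sense maximizing the unnormalized posterior."""
    return max(SENSES, key=lambda s: posterior(s, observation))

obs = {"force": 1, "away": 0, "iterated": 1}
print(best_sense(obs))   # push-2, under these made-up numbers
```

Learning the right *number* of senses, and the feature distributions within
each, is the harder part of the problem the model is said to address.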
*************************************************************************
Frederic Dick
Dept. of Cognitive Science
UCSD
(619) 453-8926
fdick@cogsci.ucsd.edu