[CS] Advance Notice: Talk by Fernando Pereira, AT&T Labs on 10 May
Margery Ishmael
marge at cs.uchicago.edu
Tue Mar 28 12:04:40 CST 2000
Wednesday, 10 May
Ryerson 251 at 2:30 pm
(followed by refreshments in Ryerson 255)
FERNANDO PEREIRA
AT&T Labs -- Research
presents:
"Declarative Programming for a Messy World"
Programming languages emerged from the tidy world of tabular data and
artificial languages associated with formalized scientific and business
disciplines. In contrast, naturally-occurring data is highly variable and
ambiguous, making it a poor match for current programming techniques.
Nevertheless, advances in data storage, transmission and encoding are
bringing into the digital realm ever growing masses of natural data such as
text, speech, images, and biological sequences. Effective processing of
such data poses two related problems: how to model individual information
sources, and how to specify how the evidence from multiple sources should
be combined to answer particular information-processing needs. To address
the first problem, my department has led advances in machine learning,
information retrieval, natural-language processing and their connections to
statistics and speech processing.
In this talk, however, I will focus on the second problem and discuss a new
approach to the integration of multiple uncertain information sources.
This approach is based on declarative application-oriented languages whose
semantics directly reflects the calculation of alternative results with
different weights of evidence, and the appropriate rules for combining
those weights. As the main example, I will describe the library I
developed with Mehryar Mohri and Michael Riley for combining finite-state
information sources in speech and text processing, which is used in all
current speech recognition and synthesis projects at AT&T Labs. The
semantic foundation of the library supports powerful combination and
optimization methods generalizing standard finite-state algorithms, and
allows a variety of implementation techniques derived from lazy evaluation
and memoization in declarative programming. As a second example, I will
outline William Cohen's WHIRL system, which uses a Datalog-like query
language with approximate field matching and evidence weighting drawn from
vector-space information retrieval to answer queries over semi-structured
data from multiple Web sites. WHIRL relies on an efficient query
processing algorithm combining deductive database techniques with heuristic
search.
I will conclude by drawing connections with current research on languages
for probabilistic reasoning in graphical models that suggest interesting
directions for further work.
Host: Michael O'Donnell
========================================================
Margery Ishmael
Project Assistant
Department of Computer Science
1100 E. 58th St.
Chicago, IL 60637
tel: 773 834-3152 fax: 773 702-8487
marge at cs.uchicago.edu