Seminar Announcement

Andrew McCallum

University of Massachusetts, Amherst

"An Introduction to Information Extraction, with Conditional Random Fields"

Date: Tuesday, November 16, 2004
Time: 1:00 p.m.
Place: Building 451, Room 1025 (White Room)
P Clearance / Unclassified
Contact: Tina Eliassi-Rad, (925) 422-1552, or Leslie Bills, (925) 423-8927

Sponsored by: ISCR and CASC.


Abstract:

Information extraction is the process of filling a structured database from unstructured text. It is a complex statistical and computational problem, often involving hundreds of thousands of variables, complex algorithms, and noisy, sparse data. Recently there has been significant success with conditionally trained alternatives to joint probabilistic models such as hidden Markov models. In this talk I will briefly describe the landscape of information extraction problems and solutions, introduce Conditional Random Fields (CRFs), and present four pieces of recent work: (1) feature induction for these models, applied to named entity extraction; (2) a random field method for co-reference resolution that has strong ties to graph partitioning; (3) an extension of CRFs to factorial state representations, enabling simultaneous part-of-speech tagging and noun-phrase segmentation; and (4) an integrated model of segmentation and co-reference that improves the performance of both.

Joint work with colleagues at UMass: Charles Sutton, Ben Wellner, Michael Hay, Fuchun Peng, Khashayar Rohanimanesh, Wei Li.

Email: mccallum@cs.umass.edu

Speaker's web page: http://www.cs.umass.edu/~mccallum/

Research web page: http://www.cs.umass.edu/~mccallum/research.html

Institution web page: http://www.umass.edu/

UCRL-MI-125922