Seminar Announcement
Andrew McCallum, University of Massachusetts, Amherst: "An Introduction to Information Extraction, with Conditional Random Fields"
Date: Tuesday, November 16, 2004
Abstract: Information extraction is the process of filling a structured database from unstructured text. It is a complex statistical and computational problem, often involving hundreds of thousands of variables, complex algorithms, and noisy, sparse data. Recently there has been significant success with conditionally trained alternatives to joint probabilistic models such as hidden Markov models. In this talk I will briefly describe the landscape of information extraction problems and solutions, then introduce Conditional Random Fields (CRFs), and present four pieces of recent work: (1) feature induction for these models, applied to named entity extraction; (2) a random field method for co-reference resolution that has strong ties to graph partitioning; (3) an extension of CRFs to factorial state representation, enabling simultaneous part-of-speech tagging and noun-phrase segmentation; and (4) an integrated model of segmentation and co-reference that improves the performance of both. Joint work with colleagues at UMass: Charles Sutton, Ben Wellner, Michael Hay, Fuchun Peng, Khashayar Rohanimanesh, and Wei Li.
Email: mccallum@cs.umass.edu
Speaker's web page: http://www.cs.umass.edu/~mccallum/
Research web page: http://www.cs.umass.edu/~mccallum/research.html
Institution web page: http://www.umass.edu/