Introduction to the Project
The Berkeley FrameNet project is creating an online lexical resource for English, based on frame semantics and supported by corpus evidence. The 'starter lexicon' will be available to the public by May 2000 and will contain at least 2000 items - verbs, nouns, and adjectives - representative of a wide range of semantic domains. The aim is to document the range of semantic and syntactic combinatory possibilities (valences) of each word in each of its senses, through manual annotation of example sentences and automatic capture and organization of the annotation results. The FrameNet database is in a platform-independent format and can be displayed and queried via the web and other interfaces.
A semantic frame (henceforth, frame) is a script-like structure of inferences, linked by linguistic convention to the meanings of linguistic units - in our case, lexical items. Each frame identifies a set of frame elements (FEs) - participants and props in the frame. A frame semantic description of a lexical item identifies the frames which underlie a given meaning and specifies the ways in which FEs, and constellations of FEs, are realized in structures headed by the word.
Valence descriptions provide, for each word sense, information about the sets of combinations of FEs, grammatical functions and phrase types attested in the corpus.
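As a rough illustration of what such a valence description collects, the sketch below counts the distinct combinations of frame element, grammatical function, and phrase type found across a handful of annotated sentences. Everything in it is an assumption made for the example: the `Realization` class, the `valence_patterns` function, the sample word "buy", and the labels (`Buyer`, `Ext`, `NP`, and so on) are hypothetical and are not taken from the FrameNet database or its tools.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Realization:
    """One annotated phrase: frame element, grammatical function, phrase type."""
    fe: str
    gf: str
    pt: str


def valence_patterns(annotations):
    """Count each distinct combination of realizations across sentences."""
    counts = Counter()
    for phrases in annotations:
        # Sort so the pattern does not depend on phrase order in the sentence.
        pattern = tuple(sorted(phrases, key=lambda r: (r.fe, r.gf, r.pt)))
        counts[pattern] += 1
    return counts


# Hypothetical annotations for one sense of "buy" (all labels are illustrative).
buy_annotations = [
    [Realization("Buyer", "Ext", "NP"), Realization("Goods", "Obj", "NP")],
    [Realization("Buyer", "Ext", "NP"), Realization("Goods", "Obj", "NP"),
     Realization("Seller", "Comp", "PP")],
    [Realization("Goods", "Obj", "NP"), Realization("Buyer", "Ext", "NP")],
]

patterns = valence_patterns(buy_annotations)
for pattern, n in patterns.most_common():
    print(n, [(r.fe, r.gf, r.pt) for r in pattern])
```

Each key in `patterns` is one attested combination, and its count is the number of example sentences showing that combination, which is the kind of summary a valence table for a word sense might present.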
The annotated sentences are the building blocks of the database. These are marked up in XML and form the basis of the lexical entries. This format supports searching by lemma, frame, frame element, and combinations of these.
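To make the idea of searching XML-annotated sentences concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The markup is invented for the example: the `corpus`, `sentence`, and `fe` element names, the `lemma`, `frame`, and `name` attributes, and the sample sentences are all assumptions, not the project's actual XML format. The `search` function is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented markup for illustration only; not the real FrameNet schema.
SAMPLE = """
<corpus>
  <sentence lemma="buy" frame="Commerce">
    <fe name="Buyer">Kim</fe> bought <fe name="Goods">a car</fe>
    <fe name="Seller">from Pat</fe>.
  </sentence>
  <sentence lemma="sell" frame="Commerce">
    <fe name="Seller">Pat</fe> sold <fe name="Goods">a car</fe>.
  </sentence>
</corpus>
"""


def search(root, lemma=None, frame=None, fe=None):
    """Return the text of sentences matching every criterion that is given."""
    hits = []
    for sentence in root.iter("sentence"):
        if lemma is not None and sentence.get("lemma") != lemma:
            continue
        if frame is not None and sentence.get("frame") != frame:
            continue
        # Require at least one phrase annotated with the requested frame element.
        if fe is not None and all(e.get("name") != fe for e in sentence.iter("fe")):
            continue
        # Join all text fragments and collapse runs of whitespace.
        hits.append(" ".join("".join(sentence.itertext()).split()))
    return hits


root = ET.fromstring(SAMPLE)
print(search(root, frame="Commerce"))
print(search(root, fe="Buyer"))
print(search(root, lemma="sell", fe="Seller"))
```

Criteria left as `None` are ignored, so the same function covers queries by lemma, by frame, by frame element, or by any combination of the three.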
The FrameNet database acts as both a dictionary and a thesaurus. The dictionary features include definitions (from the Concise Oxford Dictionary, 10th Edition, courtesy of Oxford University Press), tables showing how frame elements are syntactically expressed in sentences containing each word, annotated examples from the corpus, and an alphabetical index. As in a thesaurus, words are linked to the semantic frames in which they participate, and frames, in turn, are linked to wordlists and to related frames.
The FrameNet corpus is the 100-million-word British National
Corpus (BNC), used through the courtesy of Oxford University Press
(OUP). The semantic annotation is carried out using the Alembic
Workbench (MITRE Corporation). The syntactic annotation, which
adds grammatical function and phrase type to each annotated phrase,
is handled by an in-house tagging program. Each FrameNet entry will
provide links to other lexical resources, including WordNet synsets
and the COMLEX subcategorization frames.
The project's deliverables will consist of the FrameNet database itself:
(Researchers interested in obtaining tools for doing similar annotation work should contact the FrameNet Project directly.)
| PI: | Charles J. Fillmore |
| Technical Director: | J. B. Lowe |
| Consultants: | B. T. Atkins, Ulrich Heid |
| Techies: | Collin Baker, Jane Edwards, Hiroaki Sato, Qibo Zhu |
| Lexicographers: | Hans Boas, Michael Ellsworth, Susanne Gahl, Christopher Johnson, Michael Locke, Monica Oliver, Miriam Petruck, Paula Rogers, Josef Ruppenhofer, Christopher Struett, Marianne Tolley, Margaret Urban, Nancy Urban, Ursula Wagner, Peter Wong, Esther Wood |