Applications of Computational Linguistics, WS 01/02

Answer Extraction

Alexander Deubelbeiss, February 1, 2002


An answer extraction system: ExtrAns

An answer extraction system accepts a question formulated in natural language as its input. The answers are not generated by the system from a "knowledge database" but extracted from a body of human-formulated texts. These texts typically belong to a highly restricted domain but were originally written for other humans in syntactically unrestricted language.

Aim: to achieve better precision (relevance of results to the query) than Information Retrieval methods that do not use NLP while maintaining their recall (completeness of results), and to do so without the cost and difficulty of developing a language-understanding question-answering system or of porting an existing one to a new domain. According to the developers of ExtrAns, answer extraction is well suited to medium-sized static corpora.
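To make the two measures concrete, here is a minimal sketch using the standard Information Retrieval definitions: precision is the share of returned results that are relevant, and recall is the share of relevant items that were returned. The function name and the man page identifiers are made up for illustration and do not come from ExtrAns.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: items the system returned
    relevant:  items a human judged relevant to the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant items that were actually returned
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four passages returned, three truly relevant.
retrieved = ["cp.1", "mv.1", "ls.1", "rm.1"]
relevant = ["cp.1", "mv.1", "dd.1"]
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case half of the returned passages are relevant, and two of the three relevant passages were found. Raising the first figure without lowering the second is the stated aim.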

The answers are not entire documents but only those parts that seem to provide a direct answer to the specific question the user asks. Each question is treated independently: the system stores no information about earlier questions, so it cannot make sense of elliptical follow-up questions ("Which of these commands is suitable for text files?"). Answer extraction aims to supply all relevant answers (which may contradict each other) to each question, not to create the impression of having a conversation with a computer.

ExtrAns (University of Zürich, Institute of Computational Linguistics) extracts its answers from Unix man (manual) pages, i.e., the documentation of a computer operating system's commands. A new version that uses the (substantially larger) technical manuals of the Airbus A320 as its data is currently being developed.


 
 

ExtrAns converts its source data into "logical forms", a representation of both the syntactic structure and the semantic content of a sentence. This explains why a static corpus is useful: the conversion needs to be done only once. Queries are subjected to the same treatment. The database of logical forms derived from the source text is then searched for sentences whose logical form is equivalent to the logical form of the query.
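The idea of converting once and then matching can be sketched in a few lines. This is a toy illustration, not ExtrAns's implementation. The logical forms are written by hand instead of being produced by a parser. The predicate names (`object`, `evt`), the `?` prefix for query variables, and the helper names `match` and `answers` are assumptions made for this example.

```python
# Toy index: each source sentence is stored with its logical form, a set of
# predicate tuples. In a real system a parser would build this once, offline.
INDEX = {
    "cp copies files.": {
        ("object", "cp", "x1"),
        ("object", "file", "x2"),
        ("evt", "copy", "e1", "x1", "x2"),
    },
    "rm removes files.": {
        ("object", "rm", "x1"),
        ("object", "file", "x2"),
        ("evt", "remove", "e1", "x1", "x2"),
    },
}


def is_var(term):
    """Query variables are marked with a leading '?' (assumed convention)."""
    return term.startswith("?")


def match(query, facts, binding=None):
    """Return a variable binding under which every query predicate
    occurs in facts, or None if no consistent binding exists."""
    binding = binding or {}
    if not query:
        return binding
    first, rest = query[0], query[1:]
    for fact in facts:
        if len(fact) != len(first):
            continue
        candidate = dict(binding)
        consistent = True
        for q, f in zip(first, fact):
            if is_var(q):
                # Bind the variable, or check it against its earlier binding.
                if candidate.setdefault(q, f) != f:
                    consistent = False
                    break
            elif q != f:
                consistent = False
                break
        if consistent:
            result = match(rest, facts, candidate)
            if result is not None:
                return result
    return None


def answers(query):
    """Return every indexed sentence whose logical form satisfies the query."""
    return [s for s, facts in INDEX.items() if match(query, facts) is not None]


# Hand-written logical form for "Which command copies files?"
QUERY = [("evt", "copy", "?e", "?x", "?y"), ("object", "file", "?y")]
print(answers(QUERY))
```

With these two toy entries only the `cp` sentence is selected, because the `rm` sentence has no `copy` event. The index is built once and reused for every query, which is why a static corpus is convenient.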
 
 

Problems:

References

Berri, Jawad; Mollá Aliod, Diego; Hess, Michael. Extraction automatique de réponses: implémentations du système ExtrAns. http://www.ifi.unizh.ch/CL/berri/taln98.ps.gz, retrieved in January 2002.

Hess, Michael. Mixed-Level Knowledge Representations and Variable-Depth Inference in Natural Language Processing. http://www.ifi.unizh.ch/CL/hess/ijait.ps.gz, retrieved in January 2002.

Mollá Aliod, Diego; Hess, Michael. On the Scalability of the Answer Extraction System "ExtrAns". http://www.ifi.unizh.ch/CL/molla/klagenfurt.ps.gz, retrieved in January 2002.

Mollá Aliod, Diego; Berri, Jawad; Hess, Michael. A Real World Implementation of Answer Extraction. http://www.ifi.unizh.ch/CL/hess/nlis.ps.gz, retrieved in January 2002.
 
 

The ExtrAns project

Information about the project
Demo