Language & Information Lab.
Reusability and Research Programmes in Computational Linguistics
Pius ten Hacken
This is a description of my Habilitationsschrift. My Habilitation was granted
in November 2000. The manuscript (333 pages) is currently under review with
a publisher.
Motivation of the work
Computational linguistics (CL) is not usually considered from a philosophical
perspective. Researchers concentrate their efforts on developing programs
and tools and making them work rather than on such philosophical questions
as to what extent their work is scientific. In this book a systematic attempt
is made to create the basis for an analysis of CL in terms of a philosophy
of science. By focusing on reusability, a connection with one of the most
persistent themes in recent work in CL is established.
Since philosophy of science takes as its primary object natural science
rather than linguistics and is heavily biased towards empirical rather than
applied science, there are two barriers to be overcome before it can be applied
to CL. For this reason, a large part of the book is devoted to the groundwork
of producing a substantial link between philosophy of science, theory of
grammar, and CL. Theory of grammar is used as a bridge between philosophy
of science and CL, separating the two barriers.
Research programmes in theory of grammar
Available discussions of philosophical issues in theory of grammar often
show a lack of mutual understanding. The two sides in such a discussion fail
to make proper contact and tend to compensate for this by rhetorical violence.
This situation is reminiscent of what Thomas Kuhn analysed as a discussion
between representatives of different paradigms. A problem with the use of
Kuhn's theory is that it has itself become the topic of a discussion which
has generated a lot of confusion. In order to stay clear of the problems
related to this confusion, I develop a new concept of research programme,
which, though inspired by Kuhn's concept of paradigm, leaves the social aspects
of science out of account and concentrates exclusively on the epistemological
ones. A research programme is the specification of assumptions creating an
environment in which explanation of observations by a theory is possible.
The application of the concept of research programme to the recent history
of linguistics yields an analysis of the research programme of Chomskyan
linguistics and its relationship to some of its main competitors. Questions
addressed in this context include whether the rise of Chomskyan linguistics
and its replacing Post-Bloomfieldian linguistics constitutes a scientific
revolution; whether different stages of Chomsky's theory are part of the
same research programme; and whether competing theories such as LFG, GPSG,
Montague Grammar, and HPSG are part of the same research programme or not.
While much of the discussion of these issues in the literature available
so far is marked by a lack of subtlety, often resulting in a caricature of
the opposite viewpoint, I interpret each of the approaches under discussion
as a coherent, scientific approach to linguistics in its own right. Rather
than trying to spot contradictions in a presentation, I assume the intention
of presenting a coherent framework as given and consider apparent contradictions
as indications that my interpretation of the framework should be adapted.
In this way, the legitimate differences between research programmes stand
out.
Computational linguistics as an applied science
The barrier between empirical science and applied science is instantiated
by the opposition between theory of grammar and CL. I approach this opposition
first by an analysis of applied science focusing on the role of problem-solving
and explanation. The questions to be answered by a research programme in
CL can be made to follow from this analysis. They constitute the background
for the discussion of the contrast between probabilistic and linguistic approaches
to CL, which has sometimes been analysed in terms of a distinction between
Kuhnian paradigms or similar concepts. Such an analysis is problematic in
view of the by now widespread practice of combining these approaches. Applying
the method developed in the discussion of competing approaches in theory
of grammar, I develop an analysis which explains on the one hand the profound
difference felt between the two approaches and on the other hand the possibility
of combining them.
Reusability
A key notion in recent discussion in CL has been the reusability of resources
and components. The application of the concept of research programme to CL
raises the question of the extent to which reusability is threatened in principle
by the incompatibility of research programmes. In the discussion of Kuhn's
concept of paradigm in the literature, one of the most controversial aspects
has turned out to be Kuhn's statement that different paradigms are incommensurable:
the merits of theories formulated in different paradigms cannot be compared
on the basis of a common set of criteria. In the presentation of the concept
of research programme and its application to theory of grammar, I have taken
special care to establish and elucidate the relationship between the concept
of research programme and the property of incommensurability. As illustrated
by the discussion of the contrast between probabilistic and linguistic approaches
to CL, in applied science a deep mutual misunderstanding, pointing to incommensurability,
can coexist with the possibility of combining the results of the two approaches.
Case studies: Machine Translation and reusable lexical resources
Two case studies serve the combined goals of exploring what determines a
contrast between research programmes and how such a contrast affects the
reusability of resources and components. The areas chosen for the case studies
are machine translation (MT) and the development of reusable lexical resources.
The former is one of the most thoroughly studied fields of CL; the latter
is probably the field of CL in which reusability has been discussed in most
depth. In both cases, a systematic analysis of the decisions involved and
the approaches pursued is undertaken. These analyses result in an outline
of two research programmes in CL and an account of the extent to which reusability
is restricted across research programme boundaries.
18-Aug-2000 Pius ten Hacken