Language & Information Lab.

Reusability and Research Programmes
in Computational Linguistics

Pius ten Hacken

This is a description of my Habilitationsschrift. My Habilitation was granted in November 2000. The manuscript (333 pages) is currently under review with a publisher.

Motivation of the work

Computational linguistics (CL) is not usually considered from a philosophical perspective. Researchers concentrate their efforts on developing programs and tools and making them work rather than on philosophical questions such as the extent to which their work is scientific. In this book a systematic attempt is made to create the basis for an analysis of CL in terms of a philosophy of science. The focus on reusability establishes a connection with one of the most persistent themes in recent work in CL.

Since philosophy of science takes as its primary object natural science rather than linguistics and is heavily biased towards empirical rather than applied science, there are two barriers to be overcome before it can be applied to CL. For this reason, a large part of the book is devoted to the groundwork of producing a substantial link between philosophy of science, theory of grammar, and CL. Theory of grammar is used as a bridge between philosophy of science and CL, separating the two barriers.

Research programmes in theory of grammar

Available discussions of philosophical issues in theory of grammar often show a lack of mutual understanding. The two sides in such a discussion fail to make proper contact and tend to compensate for this by rhetorical violence. This situation is reminiscent of what Thomas Kuhn analysed as a discussion between representatives of different paradigms. A problem with the use of Kuhn's theory is that it has itself become the topic of a discussion which has generated a lot of confusion. In order to steer clear of the problems related to this confusion, I develop a new concept of research programme, which, though inspired by Kuhn's concept of paradigm, leaves the social aspects of science out of account and concentrates exclusively on the epistemological ones. A research programme is the specification of assumptions creating an environment in which explanation of observations by a theory is possible.

The application of the concept of research programme to the recent history of linguistics yields an analysis of the research programme of Chomskyan linguistics and its relationship to some of its main competitors. Questions addressed in this context include whether the rise of Chomskyan linguistics and its replacement of Post-Bloomfieldian linguistics constitute a scientific revolution; whether different stages of Chomsky's theory are part of the same research programme; and whether competing theories such as LFG, GPSG, Montague Grammar, and HPSG are part of the same research programme or not. While much of the discussion of these issues in the literature available so far is marked by a lack of subtlety, often resulting in a caricature of the opposite viewpoint, I interpret each of the approaches under discussion as a coherent, scientific approach to linguistics in its own right. Rather than trying to spot contradictions in a presentation, I assume the intention of presenting a coherent framework as given and consider apparent contradictions as indications that my interpretation of the framework should be adapted. In this way, the legitimate differences between research programmes stand out.

Computational linguistics as an applied science

The barrier between empirical science and applied science is instantiated by the opposition between theory of grammar and CL. I approach this opposition first by an analysis of applied science focusing on the role of problem-solving and explanation. The questions to be answered by a research programme in CL can be made to follow from this analysis. They constitute the background for the discussion of the contrast between probabilistic and linguistic approaches to CL, which has sometimes been analysed in terms of a distinction between Kuhnian paradigms or similar concepts. Such an analysis is problematic in view of the by now widespread practice of combining these approaches. Applying the method developed in the discussion of competing approaches in theory of grammar, I develop an analysis which explains on the one hand the profound difference felt between the two approaches and on the other hand the possibility of combining them.

Reusability

A key notion in recent discussion in CL has been the reusability of resources and components. The application of the concept of research programme to CL raises the question of the extent to which reusability is threatened in principle by the incompatibility of research programmes. In the discussion of Kuhn's concept of paradigm in the literature, one of the most controversial aspects has turned out to be Kuhn's statement that different paradigms are incommensurable. The merits of theories formulated in different paradigms cannot be compared on the basis of a common set of criteria. In the presentation of the concept of research programme and its application to theory of grammar, I have taken special care to establish and elucidate the relationship between the concept of research programme and the property of incommensurability. As illustrated by the discussion of the contrast between probabilistic and linguistic approaches to CL, in applied science a deep mutual misunderstanding, pointing to incommensurability, can coexist with the possibility of combining the results of the two approaches.

Case studies: Machine Translation and reusable lexical resources

Two case studies serve the combined goals of exploring what determines a contrast between research programmes and how such a contrast affects the reusability of resources and components. The areas chosen for the case studies are machine translation (MT) and the development of reusable lexical resources. The former is one of the most thoroughly studied fields of CL; the latter is probably the field of CL in which reusability has been discussed in the greatest depth. In both cases, a systematic analysis of the decisions involved and the approaches pursued is undertaken. These analyses result in an outline of two research programmes in CL and an account of the extent to which reusability is restricted across research programme boundaries.

18-Aug-2000 Pius ten Hacken