Browse Prior Art Database

Algorithm Schemata and Data Structures in Syntactic Processing

IP.com Disclosure Number: IPCOM000128913D
Original Publication Date: 1980-Oct-01
Included in the Prior Art Database: 2005-Sep-20

Publishing Venue

Software Patent Institute

Related People

Kay, M.: AUTHOR


This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 4% of the total text.



© Xerox Palo Alto Research Center, October, 1980

Algorithm Schemata and Data Structures in Syntactic Processing

by Martin Kay

CSL 80-12 October 1980 © Xerox Corporation 1980

Abstract: The space in which models of human parsing strategy are to be sought is large. This paper is an exploration of that space, attempting to show what its dimensions are and what some of the choices are that the psycholinguist must face. Such an exploration as this may provide some protection against the common tendency to let some choices go by default.

A notion of configuration tables is used to locate algorithms on two dimensions according as (1) they work top-down or bottom-up, and (2) they are directed or undirected. The algorithms occupying a particular place in this two dimensional space constitute an algorithm schema. The notion of a chart is used to show how to limit the amount of work a parser must do by ensuring that nothing is done more than once. Finally, the notion of an agenda is introduced to show how a rich variety of psychological strategies can be combined in a principled way with a given algorithm schema to yield an algorithm.
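The interplay of chart and agenda described above can be sketched in a few lines of code. This is a minimal illustration, not Kay's own implementation: the toy grammar, lexicon, and the function name `parse` are invented for the example, and the strategy is a simple bottom-up, undirected one.

```python
# Minimal sketch of chart parsing with an explicit agenda (illustrative only).
# Toy grammar in Chomsky normal form: each rule is (lhs, (left, right)).
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
]
LEXICON = {"the": "Det", "dog": "N", "cat": "N", "saw": "V"}

def parse(words):
    """Bottom-up, undirected chart parsing driven by an agenda of edges.

    An edge (label, start, end) enters the chart at most once, so no
    constituent is ever rebuilt -- the economy the chart is meant to ensure.
    """
    chart = set()
    # Seed the agenda with one lexical edge per word.
    agenda = [(LEXICON[w], i, i + 1) for i, w in enumerate(words)]
    while agenda:
        edge = agenda.pop()  # the psychological strategy lives in which edge is popped
        if edge in chart:
            continue         # never do the same work twice
        chart.add(edge)
        label, start, end = edge
        # Combine the new edge with adjacent chart edges under each rule.
        for lhs, (left, right) in GRAMMAR:
            if label == left:
                for other, s, e in list(chart):
                    if other == right and s == end:
                        agenda.append((lhs, start, e))
            if label == right:
                for other, s, e in list(chart):
                    if other == left and e == start:
                        agenda.append((lhs, s, end))
    return ("S", 0, len(words)) in chart
```

Replacing the `agenda.pop()` policy (stack, queue, or priority order) changes the strategy without touching the schema, which is the separation the paper argues for.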

A version of this paper will appear in the proceedings of the Nobel Symposium on Text Processing, Gothenburg, 1980.

CR Categories: 3.42, 3.36, 3.65, 5.23
Key words and phrases: Natural Language, Psycholinguistics, Parsing, Syntax.

XEROX PALO ALTO RESEARCH CENTER
3333 Coyote Hill Road / Palo Alto / California 94304

1. INTRODUCTION

Interest in context-free grammars and associated parsing techniques has recently been rekindled among theoretical linguists and psycholinguists. A number of reasons for this suggest themselves. Chomsky's program for linguistics values a grammar more highly the more constrained the theoretical framework in which it is written. It is only by assuming that the human linguistic faculty is specialized for a narrow class of languages that we may hope to explain the prodigious feats of language learning accomplished by even the dullest of children in their first six years. So goes the argument. Generally, the most desirable constraints have been held to be those that place a theory low on the scale of weak generative power. In other words, the larger the class of languages that a theory cannot characterize at all (provided this class contains no natural languages), the better the theory.1 However, determined attempts to modify transformational grammar so as to reduce its weak generative power in a reasonably motivated manner have been largely unsuccessful. The overall theory remains relatively unconstrained. Bids for explanatory adequacy on its behalf continue to be based almost entirely on lists of constraints. These are more or less arbitrary from the point of view of the theory, and appended to it as riders and footnotes.

1 This is not an accurate characterization of Chomsky's own current view.
