Discontinous Constituents in Generalized Categorial Grammar
Emmon W. Bach
University of Massachusetts, Amherst
© 1981 Emmon Bach
0. Introduction. Categorial grammars were invented by K. Ajdukiewicz
(1935), on the basis of ideas of Lesniewski, Husserl, and ultimately
Aristotle. They have attracted the attention of linguists and philosophers
since 1935, but mostly outside the main trends of generative grammar
. Recently renewed interest in non-transformational approaches
to syntax
suggests that it might be well to take another look at
categorial grammars, since they seem to have been neglected largely because
they had been shown to be equivalent to context-free phrase structure grammars
in weak generative capacity and it was believed that such grammars were
incapable of describing natural languages in a natural way. It is my purpose
here to sketch a theory of grammar which represents a generalization of
categorial grammars of the sort invented by Ajdukiewicz (1935) and further
elaborated especially by Bar-Hillel (1953) and Lambek (1961).
Impetus toward this theory has come also from a quite different quarter: Hale,
Jeanne, and Platero (1977) and others have suggested that not all languages
are phrase-structure languages. It is my hope to develop a theory in which
phrase-structure grammars are seen as a special case of a more general type of
grammar, one which allows only a very small and restricted set of
transformation-like operations. In the present paper, I will concentrate on
the problem of "discontinuous constituents."
[2]
I take it as a necessary part of our job to provide a theory of syntax and
semantics (and phonology) for natural languages. It's worth noting that the
primary motivation behind Ajdukiewicz's construction, in so far as I can see
it, was to give a theory with a clear and consistent relation between syntax
and semantics.
In Section 1, I will review briefly the basic ideas of categorial grammar; in
Section 2, I will discuss the problem of discontinuous constituents; in
Section 3 we will look at a generalization of categorial grammars, and in
Section 4 I will present a sample grammar for a small part of English syntax.
1. Categorial grammars. The basic idea of categorial grammar is this:
think of a phrase as composed of a seequence of words, each assigned to a
category which acts either as an argument or as a function taking arguments of
a certain category to make a certain resultant category. A sequence of words,
each assigned to a category, is a phrase of category a, if it is
possible to find a way of assigning functions to arguments repeatedly in such
a way as to end up with just the category a by exhausting the elements
of the string. The system has obvious analogies to ordinary multiplication and
my first example is couched in these terms.
We first define an infinite set of categories (CAT) as follows:
- 2, 3, 5 ∈ CAT
- If α ∈, and β ∈ CAT, then
(α|β) ∈ CAT
- (iii) Nothing else is in CAT
Elements of CAT may be used as indices on sets of expressions. Let's
use the categories for the sets of expressions consisting of the categories
themselves.
BA = {A}
(BA denotes the set of basic, or lexical, elements of category A.)
So:
B2 = {2}
B2|3 = {2|3}, etc.
(I will usually suppress outermost parentheses.)
We can now define the language induced by the set of categories as the
union of the sets of phrases in each category A, PA, where we can
make explicit the ideas sketched above as follows:
[3]
- If α ∈ BA, then α ∈ PA
- If α ∈ PA|B, β ∈ PB,
then αβ and βα ∈ PA
Thus, for example, (2|3) ∈ P2|3, 3 ∈
P3
so (2|3)3 and 3(2|3) ∈ P2
We can specify a semantics for this "language" quite simply by letting our
domain of individuals be the positive rational numbers and interpreting each
basic expression as the number ordinarily denoted by it, "2" denotes two,
(2|3) denotes two-thirds, etc., and letting concatenation be interpreted as
multiplication.
Let's now consider a linguistically more interesting example. Let 2
stand for the category of sentences (with no basic members). Members of other
categories might be:
B3 = {John, Mary, Bill}
B2|3 = {walks, runs, talks}
B(2|3)|3 = {sees, kisses, loves}
B(2|3)|(2|3) = {slowly, enjoyably}
Then the language induced by the set of categories and lists of basic
expression includes these sorts of phrases:
P2: John walks, Mary slowly walks, walks Bill, sees John Mary,...
P2|3: sees Mary, walks, Mary kisses enjoyably,...
Some expressions are excluded
:
*slowly John Mary sees
slowly John Mary sees
-------(2|3)|(2|3) --------------2
John Mary sees
----3 ---------2|3
Mary sees
----3 ----(2|3)|3
This follows from the fact that functions must be immediately contiguous to
their arguments.
[4]
The kind of language we have with a system of categories of the above type is
what I have called an "M-language" (M for "mobile"): all and only permutations
of siblings are allowed. (Every M-language is obviously context-free but not
all CF languages are M-languages, e.g. {x | x = anbn} is
not, as Stanley Peters pointed out to me, cf. Bach, 1975.)
We can introduce ordering restrictions into our system by adopting a notation
introduced by Bar-Hillel (1953) and Lambek (1961). Let a/b
(b\a) index the set of expressions "looking for" arguments of category
b to their "right"("left") to make expressions of category a.
To approximate English we can replace the above categories by these:
B3\2, B(3\2)/3, B(2|3)|(2|3)
The system now generates these well-formed members of P2 (= S):
John enjoyably kisses Mary.
Mary walks slowly.
The system is obviously still context-free (now completely equivalent to CF
grammars in weak generative capacity). [Exercise 1: make a set of categories
for the language {x|x = anbn.] But we are still unable
to do discontinuous constituents.
2. Discontinuous constituents. If all natural languages could be
described by plain vanilla two-way categorial grammars of the sort we've
looked at, we'd be home free. But natural languages notoriously display, or
appear to display, discontinuous constructions. Some examples from English
are these:
1. (a) John looked up the number.
(b) John looked the number up.
2. (a) I persuaded to leave all the men in the room.
(b) I persuaded him to leave.
3. (a) A package from India is here.
(b) A package is here from India.
In each of the (a) cases there is a continuous string of elements which is
arguably a constituent, while in the (b) cases the strings [5] are broken up.
Examples like the above constituted one of the primary motivations for
transformations in the first place.
A more extreme case is provided by languages which allow a lot of freedom in
ordering. Here linguists have typically set up grammars which specify a fixed
base order and then appealed to operations of scrambling (but see Hale,
Jeanne, and Platero, 1977; Lapointe, 1980). Hale (cf. above reference and
Hale, 1979) has claimed that Walbiri allows for permutations of all words
around a fixed, second-place element in each clause. And there are many
examples of free constituent order in relatively fixed-order languages like
English.
Imagine a language in which any order of constituents is possible, i.e. for
any sentence all "scrambles" or words (or morphemes) is also a sentence. We
could represent functions that are looking for arguments "anywhere" as
a||b. [Exercise 2: show that the free "scramble" of a context-free
language is not necessarily context-free.] Similarly, we could introduce the
notation a//b (b\\a) for functions looking for arguments
indefinitely far to their "right" ("left").
The introduction of such categories entails that we will not be able to
construct phrase-markers for all possible permutations. For example,
consider a sentence that is disallowed for "contiguous English" but allowed in
"scrambled English." We can analyze the sentence after the fashion of a
Montague analysis tree, but the analysis can't be mapped onto a phrase-marker:
4. sees slowly John Mary
---(2|3)||3 ---(2|3)||(2|3) --3 --3
sees slowly John Mary
slowly sees...John
sees John
(This is only one of two possible analyses; in the other John is the
subject.)
Of course, no language has such complete freedom of order; typically, certain
constituents act as "boundary domains" for scrambling. (I conjecture that
such bounding as the effect of keep them all "almost context-free" (if not
context-free) i.e [6] that such grammars with bounding on recursive nodes do
not generate the full class of context-sensitive languages.) We can achieve
this result by stating general conditions, e.g. that S's be coherent.
3. Generalized Categorial Grammars. Let me sum up what I've suggested
as a generalization of the syntactic categories of categorial grammar in the
form of a recursive definition schema (for universal grammar):
(i) a, b ∈ CAT (a. b primitive categories)
(ii) If a, b ∈ CAT, so are
(a) a/b
(b) b\a
(c) a|b
(d) a//b
(e) b\\a
(f) a||b
(iii) Nothing else is in CAT.
I assume that this basic schema will be supplemented by a system of features,
so that we can include e.g. a/1b to index a function that
requires an argument immediately to its right with the additional stipulation
that the argument have the case value 1 (e.g. accusative, more on this in Bach
ms [1982]).
So much for syntax. The more difficult question is the semantics. For a
first step. I'll assume (along the lines of Montague's PTQ) that there is a
uniform mapping from categories to types and that every function argument rule
(cancellation rule) carries with it automatically a rule of interperetation
which says that the interpretation of the resultant category is the result of
applying the interpretation of the function expression to the intension of the
interpretation of the argument expression.
I've also not said very precisely how we stipulate how to analyze (or
construct) sentences with these "long-distance" functions. I consider it
still and open problem to try to give a general characterization of such
syntactic operations. In the next section I'll show how to accomplish a
syntactic/semantic for several particular cases in English. The basic idea is
this: we let functions expressions of category a//b act syntactically
as expressions of category a, and supply a free variable of the type of
intensions of the type corresponding to b in the translation. This
variable remains free until we find an expression of category b, then
we bind the variable with a lambda and let it loose. [7]
4. Some English Constructions. I now want to show how a generalized
categorial grammar can be used to capture certain facts about English. I will
exhibit a grammar which includes persuade and promise and
so-called heavy NP-shifts.
The primitive categories (I ignore further analysis) are S (sentence),
T (term-phrase, NP [=DP], CN (common noun), IV (tenseless
VP's), ĪV̅ (to VP's). The basic categories we need are
these, with some typical members.
BCN = {woman, fish, elephant,...}
BT = {Bill, Mary, John,...}
BT/CN = {every, the, a,...}
BT\S = {walks, runs,...}
B(T\S)/T = {hits, sees, loves,...}
B(T\S/ĪV̅ = {tries,...}
B((T\S/T))//ĪV̅ = {persuades,...}
B((T\S)/ĪV̅)/T = { ,...}
B(T\S)\(T\S) = {slowly, today,...}
BĪV̅/IV = {to,...}
We need the following Rule Schemata (from Universal Grammar). The rules may be
read as follows: A:= B C: If α ∈ B, β ∈ C, then
α β ∈ A, and if α translates as α' (i.e. 1') and
β as β' (2') then α β translates as (0'=) whatever follows =
under the rule. I ignore completely, for simplicity, conditions on feature
values for constituents, except for two; VST ("variable store") is a function
which tells us what free variables have been put into the translation and
remain to be bound on at appropriate points, while [-A] is a feature telling
us that there is "missing" phrase of category A (comparable to Gazdar's slash
categories, Gazdar, forthcoming [1982]). Note further that the rules are
schemata, and they (as well as the list of categories) must be further specified for particular languages.
| 0. |
PA := BA |
| | | | | | | |
| 0'= 1' | | | | | | | | | | |
| | | | | | | | | | | |
| 1a. | PA := PA/B PB | |
| 1b. | PA := PB | PB\A
|
| PA := PA//B PB | |
| |
PA := PB | PB\\A
| | | | | | |
| 0' = 1'(^2') | | | | 0' =
2'(^1') | | | | | [8] | | |
| 2. |
PA/C := P(A/B)//C PB |
|
|
| | | | | | | |
|
0' = λXi, <s,C>[1'(Xi, <s,C>)(^2')] |
(RIGHT-WRAP) | | | | | | | | | |
| 3. |
PA := PA/B |
|
|
|
|
|
|
|
|
|
|
|
|
[-B] |
|
|
|
|
|
|
0' = 1'(Xi,<s,B>) Add Xi,<s,B> to VST(0) |
"variable store" |
| 4. |
PA := PA |
|
|
|
|
|
|
|
|
|
[-B] |
|
|
|
0' = λXi,<s,B>[1'](^2') |
Remove Xi,<s,B> from VST(0) |
Here are two examples of derivations, one involving heavy NP-shift (using an
instantiation of Rule Schema 3) and one using Rule Schema 2. (This corresponds
to sub-operation Right-wrap of Bach, 1979, misleadingly called a subfunction
there.
Mary sees today the elephant
--- ---- ---- ---------
T (T\S)/T (T\S)\(T\S) T
----
T\S VST:℘3
---------------- T\S
T\S VST:℘3
[-T]
-------------------------------------------
T\S (VST:empty)
-----------------------T\S
---------------------------------------------------------S
Translation:
λ℘3[today'(see'(℘3))](^the(^elephant))[^Mary') ==> see'(the_elephant')(Mary')
Mary persuaded Bill to go
---- --------- ----- -- --
T ((T\S)/T)//ĪV̅ T ĪV̅/IV IV
---------------- --------
(T\S)/ĪV̅ ĪV̅
----------------------------------
T\S
-----------------------------------------------
S
Translation:
λP4[persuade'(P4)(^Bill')](^go')(^Mary') ==> persuade'(^go')(^Bill')(^Mary') [9]
Finally, let' consider phrases like give Bill a fish (not in our
fragment). I've argued (Bach, 1980) that give (in this usage) belongs
to the category B(IV/T)//T. The standard derivation follows
exactly that of persuade Bill to go.
give Bill ∈ PIV/T (by R2)
and is translated as λ℘[give'(℘)(^Bill')], so by 1a we have
give Bill a fish ∈
λ℘[give'(℘)(^Bill')](^a_fish')
==> give'(^a_fish)(^Bill')
Note that the inner argument of give, in this usage, is the thing given
and that Bill is the direct object of the TVP give a fish.
This same standard derivation will assign to the phrase give a fish
Bill an interpretation that reduces to this:
give'(^Bill)(^a_fish') and means the same as give Bill to a
fish.
What remains to be explained is why the option of R3 and R4 is not available
to yield "heavy NP-shift" interpretations of phrases like give a fish the
elephant (that is heavy). Note that R1a allows this derivation˙
give ∈ P(IV/T)//T give'
give a fish ∈ PIV/T give'(^a_fish')
give a fish the elephant ∈ PIV give'(^a_fish')(^the_elephant')
(Compare above: this is identical with the the result of the standard
derivation of give Bill a fish.) We don't want to block the option of
letting double slash categories take immediately following arguments
("anywhere to the right" must include "immediately following"). I propose
that what is crucial here is the identity of the inner and outer arguments, so
we can amend the rule schema (second option) as follows:
PA := PA//B PB unless A = (X/B) for some
X
,
.
Footnotes
- This is an exact transcription of the originally
published paper in NELS 11:1-12, (1981), except for updating and adding
references, a few corrections and improvements, and new footnotes 3a, 6. For
typographical convenience, I have substituted a different notation for
Ajdukiewicz's and my fractional notation: for A over B, I write here
"A|B." NB the argument category is on the right. I thank Wynn Chao
for comments and suggestions.
- []
-
- Among those who have taken up categorial grammars are
Lyons (1966), Lewis (1972), Geach (1972), Montague (1974: Paper 8, henceforth
PTQ).
- []
-
- To be mentioned here are Harman (1963), Brame (1978),
Bresnan (1978), Schachter (1978), Hudson (1976), K. Ross (1980), Lapointe
(1980, Gazdar (forthcoming [1982]). I am building in this paper on ideas of K. Ross
(1978), R. Saenz, S. Lapointe, and M. Flynn (1980)
- []
-
- This paper is a companion piece to Bach (ms. [1982]), where
I explore especially the kind of detailed dependencies found in the auxiliary
system of English with a generalized categorial system. For details on the
kind of system of which I am proposing a categorial variant, see K. Ross
(1980), Bach and Partee (1980), Partee and Bach (1980).
- []
-
- The original had a different example here, which
was in fact generable on a different analysis than the one cited (gratia Wynn
Chao). Analyses are presented in a different format here than in the
original, where they took the form of Montagovian analysis trees.
- []
-
-
For an interesting exploration of the question of providing general principles
determining what particular directional categories a language may pick,
see M. Flynn, 1980.
- []
-
-
Answers to exercises:
Exercise 1:
BS/S' = {a}
BS/b = {b}
BS\S' = {b}
Bb = {b}
Exercise 2: Let L = {X|X = (abc)n. L is CF (in fact regular). But
L'' = Scramble(L) is not CF. For let L' = {X|X =
anbmck then L'' ∩ L' = {X|X =
anbncn} is not CF,
but since the intersection of a CF language and a regular language is CF,
L'' (= Scramble(L)) can't be CF.
[]
Since this paper was written, there has been a profusion of research and
publication on categorial grammar. New references in the bibliography, which
yield a good sampling of this rich literature are: Carpenter 1997; Dowty 1982,
2007; Jacobson 1999; Moortgat 1988; Morrill 1995; Steedman 2000; Wood 1993.
REFERENCES
-
-
Ajdukiewicz, Kasimierz.
1935.
Die syntaktische Konnexita̎t.
Studia Philosophica
1: 1-27.
Translated as `Syntactic connexion' in Storrs McCall, ed., Polish Logic: 1920-1939 (Oxford: Oxford University Press, 1967).
-
Bach, Emmon.
1983.
Generalized categorial grammars and the English auxiliary.
In Frank Heny and Barry Richards, eds., Linguistic Categories: Auxiliaries and
Related Puzzles (Dordrecht: Reidel), II,
pp. 101-120.
-
Bach, Emmon.
1975.
Order in Base Structures.
In Charles N. Li (ed.), Word Order and Word Order Change, pp.
Austin, Texas:
University of Texas Press.
Pp. 307-43.
-
Bach, Emmon, and Barbara H. Partee.
1980.
Anaphora and Semantic Structure.
Proceedings of the Chicago Linguistic Society, l6:
Parasession on Anaphora.
-
Bar-Hillel, Yehoshua.
1953.
A quasi-arithmetical notation of syntactic description.
Language 29: 47-58.
-
Brame, Michael.
1978.
Base-generated Syntax.
Seattle: Noit Amrofer Press.
-
Bresnan, Joan
1978.
A realistic transformational grammar.
In Morris Halle, Joan Bresnan, and George A. Miller, eds.,
Linguistic Theory and Psychological Reality
(Cambridge, MA: MIT Press).
-
Carpenter, Bob.
1997.
Type-Logical Semantics.
Cambridge, Massachusetts / London:
MIT Press.
-
Dowty, David R.
1982.
Grammatical relations and Montague grammar.
In Pauline Jacobson and Geoffrey K. Pullum, eds.,
The Nature of Syntactic Representation
(Dordrecht: Reidel), pp. 79-130.
-
Dowty, David.
2007.
Compositionality as an Empirical Problem.
In Chris Barker and Pauline Jacobson, eds.,
Direct Compositionality
(Oxford: Oxford University Press), pp. 23-101.
-
Flynn, Michael.
1981.
Structure building operations and word order.
Ph.D. dissertation: The University of Massachusetts, Amherst (G.L.S.A.)
-
Gazdar, Gerald.
forthcoming [1982].
Phrase structure grammar.
In Pauline Jacobson and Geoffrey K. Pullum, eds., The Nature of Syntactic
Representation (Dordrecht: Reidel), pp. 131-186.
-
Geach, Peter.
1972.
A program for syntax.
In Donald Davidson and Gilbert Harman, eds.,
Semantics of Natural Language
(Dordrecht: Reidel), pp. 483-497.
-
Hale, Kenneth (ms.) On the position of Walbiri in a typology of the base.
(Unpublished paper, MIT, 1979).
-
Hale, Kenneth, LaVerne Jeanne, and Paul Platero.
1977.
Three cases of overgeneration.
In P. Culicover, T. Wasow, and A. Akmajian, eds., Formal Syntax (New
York: Academic Press, pp. 379-416.
-
Harman, Gilbert.
1963.
Generative grammars without transformation rules: a defense of phrase structure.
Language 39 (1963) pp. 597- 616.
-
Hudson, Richard A.
1976.
Arguments for a Non-transformational Grammar.
Chicago: University of Chicago Press.
-
Jacobson, Pauline.
1999.
Towards a variable-free semantics.
Linguistics and Philosophy
22:117-184.
-
Lambek, Joachim.
1961.
On the calculus of syntactic types.
In R. Jakobson, ed.,
Structure of Language and its Mathematical Aspects.
Providence: Proceedings of Symposia in Applied Mathematics, XII, American Mathematical Society. Pp. 166-178.
-
Lapointe, Steven G.
1980.
A theory of grammatical agreement.
Ph.D. dissertation:
The University of Massachusetts, Amherst.
-
Lewis, David.
1970 [1972].
General semantics.
Synthese:
22: 18-67.
Reprinted in Donald Davidson and Gilbert Harman, eds.,
Semantics of Natural Language (1972:Dordrecht: Reidel), pp. 169-218.
-
Lyons, John.
1966.
Towards a `notional' theory of the `parts of speech.'
Journal of Linguistics
2: 209-236.
-
Montague, Richard.
1974.
Formal Philosophy.
Edited by Richmond H. Thomason.
New Haven:
Yale University Press.
-
Moortgat, Michael.
1988.
Categorial Investigations.
Academisch Proefschrift: Universiteit van Amsterdam.
-
Morrill, G.
1995.
Discontinuity in categorial Grammar.
Linguistics and Philosophy.
18:175-219.
-
Partee, Barbara H., and Emmon Bach.
1980 [1981].
Quantification, Pronouns, and VP Anaphora.
In J. Groenendijk, T. Janssen, and M. Stokhof (eds.),
Formal Methods in the Study of Language
Proceedings of the Third Amsterdam Colloquium, 445-48l.
-
Ross, Kenneth.
1980.
Parsing English phrase structure.
University of Massachusetts, Amherst.
Unpublished Ph.D. dissertation.
-
Schachter, Paul.
1978.
Review of Hudson, 1976.
Language 54: 348-376.
-
Steedman, Mark.
2000.
The Syntactic Process.
Cambridge, Massachusetts / London:
MIT Press.
-
Wood, Mary McGee.
1993.
Categorial Grammars.
London and New York:
Routledge.
Linguistic Theory Guides.