Discontinous Constituents in Generalized Categorial Grammar *

Emmon W. Bach
University of Massachusetts, Amherst
© 1981 Emmon Bach

0. Introduction. Categorial grammars were invented by K. Ajdukiewicz (1935), on the basis of ideas of Lesniewski, Husserl, and ultimately Aristotle. They have attracted the attention of linguists and philosophers since 1935, but mostly outside the main trends of generative grammar [1]. Recently renewed interest in non-transformational approaches to syntax [2] suggests that it might be well to take another look at categorial grammars, since they seem to have been neglected largely because they had been shown to be equivalent to context-free phrase structure grammars in weak generative capacity and it was believed that such grammars were incapable of describing natural languages in a natural way. It is my purpose here to sketch a theory of grammar which represents a generalization of categorial grammars of the sort invented by Ajdukiewicz (1935) and further elaborated especially by Bar-Hillel (1953) and Lambek (1961).

Impetus toward this theory has come also from a quite different quarter: Hale, Jeanne, and Platero (1977) and others have suggested that not all languages are phrase-structure languages. It is my hope to develop a theory in which phrase-structure grammars are seen as a special case of a more general type of grammar, one which allows only a very small and restricted set of transformation-like operations. In the present paper, I will concentrate on the problem of "discontinuous constituents." [3]

[2] I take it as a necessary part of our job to provide a theory of syntax and semantics (and phonology) for natural languages. It's worth noting that the primary motivation behind Ajdukiewicz's construction, in so far as I can see it, was to give a theory with a clear and consistent relation between syntax and semantics.

In Section 1, I will review briefly the basic ideas of categorial grammar; in Section 2, I will discuss the problem of discontinuous constituents; in Section 3 we will look at a generalization of categorial grammars, and in Section 4 I will present a sample grammar for a small part of English syntax.

1. Categorial grammars. The basic idea of categorial grammar is this: think of a phrase as composed of a seequence of words, each assigned to a category which acts either as an argument or as a function taking arguments of a certain category to make a certain resultant category. A sequence of words, each assigned to a category, is a phrase of category a, if it is possible to find a way of assigning functions to arguments repeatedly in such a way as to end up with just the category a by exhausting the elements of the string. The system has obvious analogies to ordinary multiplication and my first example is couched in these terms.

We first define an infinite set of categories (CAT) as follows:

  1. 2, 3, 5 ∈ CAT
  2. If α ∈, and β ∈ CAT, then (α|β) ∈ CAT
  3. (iii) Nothing else is in CAT
Elements of CAT may be used as indices on sets of expressions. Let's use the categories for the sets of expressions consisting of the categories themselves.

BA = {A}

(BA denotes the set of basic, or lexical, elements of category A.) So:

B2 = {2}
B2|3 = {2|3}, etc.
(I will usually suppress outermost parentheses.)

We can now define the language induced by the set of categories as the union of the sets of phrases in each category A, PA, where we can make explicit the ideas sketched above as follows:


  1. If α ∈ BA, then α ∈ PA
  2. If α ∈ PA|B, β ∈ PB, then αβ and βα ∈ PA
Thus, for example, (2|3) ∈ P2|3, 3 ∈ P3 so (2|3)3 and 3(2|3) ∈ P2
We can specify a semantics for this "language" quite simply by letting our domain of individuals be the positive rational numbers and interpreting each basic expression as the number ordinarily denoted by it, "2" denotes two, (2|3) denotes two-thirds, etc., and letting concatenation be interpreted as multiplication.

Let's now consider a linguistically more interesting example. Let 2 stand for the category of sentences (with no basic members). Members of other categories might be:

B3 = {John, Mary, Bill}
B2|3 = {walks, runs, talks}
B(2|3)|3 = {sees, kisses, loves}
B(2|3)|(2|3) = {slowly, enjoyably}
Then the language induced by the set of categories and lists of basic expression includes these sorts of phrases:
P2: John walks, Mary slowly walks, walks Bill, sees John Mary,...
P2|3: sees Mary, walks, Mary kisses enjoyably,...
Some expressions are excluded [3a]:
*slowly John Mary sees

slowly                                  John Mary sees
-------(2|3)|(2|3)                   --------------2

                                  John          Mary sees
                                  ----3         ---------2|3
                                            Mary                sees                                      
                                            ----3               ----(2|3)|3
This follows from the fact that functions must be immediately contiguous to their arguments.

[4] The kind of language we have with a system of categories of the above type is what I have called an "M-language" (M for "mobile"): all and only permutations of siblings are allowed. (Every M-language is obviously context-free but not all CF languages are M-languages, e.g. {x | x = anbn} is not, as Stanley Peters pointed out to me, cf. Bach, 1975.)

We can introduce ordering restrictions into our system by adopting a notation introduced by Bar-Hillel (1953) and Lambek (1961). Let a/b (b\a) index the set of expressions "looking for" arguments of category b to their "right"("left") to make expressions of category a. To approximate English we can replace the above categories by these:

B3\2, B(3\2)/3, B(2|3)|(2|3)

The system now generates these well-formed members of P2 (= S):

John enjoyably kisses Mary.
Mary walks slowly.

The system is obviously still context-free (now completely equivalent to CF grammars in weak generative capacity). [Exercise 1: make a set of categories for the language {x|x = anbn.] But we are still unable to do discontinuous constituents.

2. Discontinuous constituents. If all natural languages could be described by plain vanilla two-way categorial grammars of the sort we've looked at, we'd be home free. But natural languages notoriously display, or appear to display, discontinuous constructions. Some examples from English are these:

1. (a) John looked up the number.   
    (b) John looked the number up.  
2. (a) I persuaded to leave all the men in the room. (b) I persuaded him to leave.
3. (a) A package from India is here. (b) A package is here from India.

In each of the (a) cases there is a continuous string of elements which is arguably a constituent, while in the (b) cases the strings [5] are broken up. Examples like the above constituted one of the primary motivations for transformations in the first place.

A more extreme case is provided by languages which allow a lot of freedom in ordering. Here linguists have typically set up grammars which specify a fixed base order and then appealed to operations of scrambling (but see Hale, Jeanne, and Platero, 1977; Lapointe, 1980). Hale (cf. above reference and Hale, 1979) has claimed that Walbiri allows for permutations of all words around a fixed, second-place element in each clause. And there are many examples of free constituent order in relatively fixed-order languages like English.

Imagine a language in which any order of constituents is possible, i.e. for any sentence all "scrambles" or words (or morphemes) is also a sentence. We could represent functions that are looking for arguments "anywhere" as a||b. [Exercise 2: show that the free "scramble" of a context-free language is not necessarily context-free.] Similarly, we could introduce the notation a//b (b\\a) for functions looking for arguments indefinitely far to their "right" ("left").

The introduction of such categories entails that we will not be able to construct phrase-markers for all possible permutations. For example, consider a sentence that is disallowed for "contiguous English" but allowed in "scrambled English." We can analyze the sentence after the fashion of a Montague analysis tree, but the analysis can't be mapped onto a phrase-marker:

4.      sees                 slowly      John     Mary
        ---(2|3)||3     ---(2|3)||(2|3)  --3      --3

        sees slowly John           Mary

     slowly      sees...John

                 sees      John
(This is only one of two possible analyses; in the other John is the subject.)

Of course, no language has such complete freedom of order; typically, certain constituents act as "boundary domains" for scrambling. (I conjecture that such bounding as the effect of keep them all "almost context-free" (if not context-free) i.e [6] that such grammars with bounding on recursive nodes do not generate the full class of context-sensitive languages.) We can achieve this result by stating general conditions, e.g. that S's be coherent.
3. Generalized Categorial Grammars. Let me sum up what I've suggested as a generalization of the syntactic categories of categorial grammar in the form of a recursive definition schema (for universal grammar):

  (i) a, bCAT  (a. b primitive categories)
 (ii) If a, bCAT, so are

     (a) a/b
     (b) b\a
     (c) a|b
     (d) a//b
     (e) b\\a
     (f) a||b

(iii) Nothing else is in CAT. 

I assume that this basic schema will be supplemented by a system of features, so that we can include e.g. a/1b to index a function that requires an argument immediately to its right with the additional stipulation that the argument have the case value 1 (e.g. accusative, more on this in Bach ms [1982]).

So much for syntax. The more difficult question is the semantics. For a first step. I'll assume (along the lines of Montague's PTQ) that there is a uniform mapping from categories to types and that every function argument rule (cancellation rule) carries with it automatically a rule of interperetation which says that the interpretation of the resultant category is the result of applying the interpretation of the function expression to the intension of the interpretation of the argument expression.

I've also not said very precisely how we stipulate how to analyze (or construct) sentences with these "long-distance" functions. I consider it still and open problem to try to give a general characterization of such syntactic operations. In the next section I'll show how to accomplish a syntactic/semantic for several particular cases in English. The basic idea is this: we let functions expressions of category a//b act syntactically as expressions of category a, and supply a free variable of the type of intensions of the type corresponding to b in the translation. This variable remains free until we find an expression of category b, then we bind the variable with a lambda and let it loose. [7]

4. Some English Constructions. I now want to show how a generalized categorial grammar can be used to capture certain facts about English. I will exhibit a grammar which includes persuade and promise and so-called heavy NP-shifts.

The primitive categories (I ignore further analysis) are S (sentence), T (term-phrase, NP [=DP], CN (common noun), IV (tenseless VP's), ĪV̅ (to VP's). The basic categories we need are these, with some typical members.

BCN = {woman, fish, elephant,...}
BT = {Bill, Mary, John,...}
BT/CN = {every, the, a,...}
BT\S = {walks, runs,...}
B(T\S)/T = {hits, sees, loves,...}
B(T\S/ĪV̅ = {tries,...}
B((T\S/T))//ĪV̅ = {persuades,...}
B((T\S)/ĪV̅)/T = { ,...}
B(T\S)\(T\S) = {slowly, today,...}
BĪV̅/IV = {to,...}

We need the following Rule Schemata (from Universal Grammar). The rules may be read as follows: A:= B C: If α ∈ B, β ∈ C, then α β ∈ A, and if α translates as α' (i.e. 1') and β as β' (2') then α β translates as (0'=) whatever follows = under the rule. I ignore completely, for simplicity, conditions on feature values for constituents, except for two; VST ("variable store") is a function which tells us what free variables have been put into the translation and remain to be bound on at appropriate points, while [-A] is a feature telling us that there is "missing" phrase of category A (comparable to Gazdar's slash categories, Gazdar, forthcoming [1982]). Note further that the rules are schemata, and they (as well as the list of categories[4]) must be further specified for particular languages.

0. PA := BA
0'= 1'
1a. PA := PA/B PB          1b. PA := PB PB\A
PA := PA//B PB          PA := PB PB\\A
0' = 1'(^2') 0' = 2'(^1') [8]
2. PA/C := P(A/B)//C PB
0' = λXi, <s,C>[1'(Xi, <s,C>)(^2')] (RIGHT-WRAP)
3. PA := PA/B
0' = 1'(Xi,<s,B>) Add Xi,<s,B> to VST(0) "variable store"
4. PA := PA
0' = λXi,<s,B>[1'](^2') Remove Xi,<s,B> from VST(0)
Here are two examples of derivations, one involving heavy NP-shift (using an instantiation of Rule Schema 3) and one using Rule Schema 2. (This corresponds to sub-operation Right-wrap of Bach, 1979, misleadingly called a subfunction there.
Mary          sees         today                 the elephant          
---           ----         ----                  ---------      
 T            (T\S)/T      (T\S)\(T\S)              T
              T\S VST:℘3
               ----------------  T\S
              T\S VST:℘3
                T\S (VST:empty)



λ℘3[today'(see'(℘3))](^the(^elephant))[^Mary') ==> see'(the_elephant')(Mary')  

Mary persuaded Bill to go ---- --------- ----- -- -- T ((T\S)/T)//ĪV̅ T ĪV̅/IV IV ---------------- -------- (T\S)/ĪV̅ ĪV̅ ---------------------------------- T\S ----------------------------------------------- S Translation: λP4[persuade'(P4)(^Bill')](^go')(^Mary') ==> persuade'(^go')(^Bill')(^Mary') [9]

Finally, let' consider phrases like give Bill a fish (not in our fragment). I've argued (Bach, 1980) that give (in this usage) belongs to the category B(IV/T)//T. The standard derivation follows exactly that of persuade Bill to go.

give Bill ∈ PIV/T (by R2)
and is translated as λ℘[give'(℘)(^Bill')], so by 1a we have
give Bill a fish
==> give'(^a_fish)(^Bill')
Note that the inner argument of give, in this usage, is the thing given and that Bill is the direct object of the TVP give a fish.

This same standard derivation will assign to the phrase give a fish Bill an interpretation that reduces to this: give'(^Bill)(^a_fish') and means the same as give Bill to a fish.

What remains to be explained is why the option of R3 and R4 is not available to yield "heavy NP-shift" interpretations of phrases like give a fish the elephant (that is heavy). Note that R1a allows this derivation˙

give ∈ P(IV/T)//T give'
give a fish ∈ PIV/T give'(^a_fish')
give a fish the elephant ∈ PIV give'(^a_fish')(^the_elephant')
(Compare above: this is identical with the the result of the standard derivation of give Bill a fish.) We don't want to block the option of letting double slash categories take immediately following arguments ("anywhere to the right" must include "immediately following"). I propose that what is crucial here is the identity of the inner and outer arguments, so we can amend the rule schema (second option) as follows:
PA := PA//B PB unless A = (X/B) for some X [5], [6].

This is an exact transcription of the originally published paper in NELS 11:1-12, (1981), except for updating and adding references, a few corrections and improvements, and new footnotes 3a, 6. For typographical convenience, I have substituted a different notation for Ajdukiewicz's and my fractional notation: for A over B, I write here "A|B." NB the argument category is on the right. I thank Wynn Chao for comments and suggestions.
Among those who have taken up categorial grammars are Lyons (1966), Lewis (1972), Geach (1972), Montague (1974: Paper 8, henceforth PTQ).
To be mentioned here are Harman (1963), Brame (1978), Bresnan (1978), Schachter (1978), Hudson (1976), K. Ross (1980), Lapointe (1980, Gazdar (forthcoming [1982]). I am building in this paper on ideas of K. Ross (1978), R. Saenz, S. Lapointe, and M. Flynn (1980)
This paper is a companion piece to Bach (ms. [1982]), where I explore especially the kind of detailed dependencies found in the auxiliary system of English with a generalized categorial system. For details on the kind of system of which I am proposing a categorial variant, see K. Ross (1980), Bach and Partee (1980), Partee and Bach (1980).
The original had a different example here, which was in fact generable on a different analysis than the one cited (gratia Wynn Chao). Analyses are presented in a different format here than in the original, where they took the form of Montagovian analysis trees.
For an interesting exploration of the question of providing general principles determining what particular directional categories a language may pick, see M. Flynn, 1980.
Answers to exercises:
Exercise 1:
BS/S' = {a}
BS/b = {b}
BS\S' = {b}
Bb = {b}

Exercise 2: Let L = {X|X = (abc)n. L is CF (in fact regular). But L'' = Scramble(L) is not CF. For let L' = {X|X = anbmck then L'' ∩ L' = {X|X = anbncn} is not CF, but since the intersection of a CF language and a regular language is CF, L'' (= Scramble(L)) can't be CF.
Since this paper was written, there has been a profusion of research and publication on categorial grammar. New references in the bibliography, which yield a good sampling of this rich literature are: Carpenter 1997; Dowty 1982, 2007; Jacobson 1999; Moortgat 1988; Morrill 1995; Steedman 2000; Wood 1993.


Ajdukiewicz, Kasimierz. 1935. Die syntaktische Konnexita̎t. Studia Philosophica 1: 1-27. Translated as `Syntactic connexion' in Storrs McCall, ed., Polish Logic: 1920-1939 (Oxford: Oxford University Press, 1967).
Bach, Emmon. 1983. Generalized categorial grammars and the English auxiliary. In Frank Heny and Barry Richards, eds., Linguistic Categories: Auxiliaries and Related Puzzles (Dordrecht: Reidel), II, pp. 101-120.
Bach, Emmon. 1975. Order in Base Structures. In Charles N. Li (ed.), Word Order and Word Order Change, pp. Austin, Texas: University of Texas Press. Pp. 307-43.
Bach, Emmon, and Barbara H. Partee. 1980. Anaphora and Semantic Structure. Proceedings of the Chicago Linguistic Society, l6: Parasession on Anaphora.
Bar-Hillel, Yehoshua. 1953. A quasi-arithmetical notation of syntactic description. Language 29: 47-58.
Brame, Michael. 1978. Base-generated Syntax. Seattle: Noit Amrofer Press.
Bresnan, Joan 1978. A realistic transformational grammar. In Morris Halle, Joan Bresnan, and George A. Miller, eds., Linguistic Theory and Psychological Reality (Cambridge, MA: MIT Press).
Carpenter, Bob. 1997. Type-Logical Semantics. Cambridge, Massachusetts / London: MIT Press.
Dowty, David R. 1982. Grammatical relations and Montague grammar. In Pauline Jacobson and Geoffrey K. Pullum, eds., The Nature of Syntactic Representation (Dordrecht: Reidel), pp. 79-130.
Dowty, David. 2007. Compositionality as an Empirical Problem. In Chris Barker and Pauline Jacobson, eds., Direct Compositionality (Oxford: Oxford University Press), pp. 23-101.
Flynn, Michael. 1981. Structure building operations and word order. Ph.D. dissertation: The University of Massachusetts, Amherst (G.L.S.A.)
Gazdar, Gerald. forthcoming [1982]. Phrase structure grammar. In Pauline Jacobson and Geoffrey K. Pullum, eds., The Nature of Syntactic Representation (Dordrecht: Reidel), pp. 131-186.
Geach, Peter. 1972. A program for syntax. In Donald Davidson and Gilbert Harman, eds., Semantics of Natural Language (Dordrecht: Reidel), pp. 483-497.
Hale, Kenneth (ms.) On the position of Walbiri in a typology of the base. (Unpublished paper, MIT, 1979).
Hale, Kenneth, LaVerne Jeanne, and Paul Platero. 1977. Three cases of overgeneration. In P. Culicover, T. Wasow, and A. Akmajian, eds., Formal Syntax (New York: Academic Press, pp. 379-416.
Harman, Gilbert. 1963. Generative grammars without transformation rules: a defense of phrase structure. Language 39 (1963) pp. 597- 616.
Hudson, Richard A. 1976. Arguments for a Non-transformational Grammar. Chicago: University of Chicago Press.
Jacobson, Pauline. 1999. Towards a variable-free semantics. Linguistics and Philosophy 22:117-184.
Lambek, Joachim. 1961. On the calculus of syntactic types. In R. Jakobson, ed., Structure of Language and its Mathematical Aspects. Providence: Proceedings of Symposia in Applied Mathematics, XII, American Mathematical Society. Pp. 166-178.
Lapointe, Steven G. 1980. A theory of grammatical agreement. Ph.D. dissertation: The University of Massachusetts, Amherst.
Lewis, David. 1970 [1972]. General semantics. Synthese: 22: 18-67. Reprinted in Donald Davidson and Gilbert Harman, eds., Semantics of Natural Language (1972:Dordrecht: Reidel), pp. 169-218.
Lyons, John. 1966. Towards a `notional' theory of the `parts of speech.' Journal of Linguistics 2: 209-236.
Montague, Richard. 1974. Formal Philosophy. Edited by Richmond H. Thomason. New Haven: Yale University Press.
Moortgat, Michael. 1988. Categorial Investigations. Academisch Proefschrift: Universiteit van Amsterdam.
Morrill, G. 1995. Discontinuity in categorial Grammar. Linguistics and Philosophy. 18:175-219.
Partee, Barbara H., and Emmon Bach. 1980 [1981]. Quantification, Pronouns, and VP Anaphora. In J. Groenendijk, T. Janssen, and M. Stokhof (eds.), Formal Methods in the Study of Language Proceedings of the Third Amsterdam Colloquium, 445-48l.
Ross, Kenneth. 1980. Parsing English phrase structure. University of Massachusetts, Amherst. Unpublished Ph.D. dissertation.
Schachter, Paul. 1978. Review of Hudson, 1976. Language 54: 348-376.
Steedman, Mark. 2000. The Syntactic Process. Cambridge, Massachusetts / London: MIT Press.
Wood, Mary McGee. 1993. Categorial Grammars. London and New York: Routledge. Linguistic Theory Guides.