Return to the home page for Patrick Kellogg

Context Sensitive Grammars and Chomsky's Hierarchy

In 1957, Noam Chomsky described five classifications for generative grammars. According to Chomsky:

"…a generative grammar is a system of many hundreds of rules of several different types, organized in accordance with certain fixed principles of ordering and applicability and containing a certain fixed substructure which, along with the general principles of organization, is common to all languages." (Chomsky, 1968, pp. 87-88)

There might be grammars that are created by random or stochastic rules, but those are not included in Chomsky's hierarchy. The first classification is the most general category in the hierarchy; the next category is a subset of that one, and so on, down to the most restrictive category.

1. Recursive grammar.
A recursively enumerable grammar is one that can be instantiated by an algorithm. Another way of saying this is: "there is a Turing machine that decides it" (Lewis, 1998, p. 195). Given an input that is "valid" for the grammar, the Turing machine will always halt in a correct "answer" state. When written using standard notation, both sides of the rewrite rules can have as many symbols as are needed (example, A B C D ® M N O P Q (etc.)). A grammar of this type could be said to be "incomplete", since the behavior of the Turing machine is not specified for all inputs. This is the same problem that Bertrand Russell and Alfred North Whitehead had when constructing their unified formal logic "Principia Mathematica" (Hofstadter, 1979, pp 17-619).

2. Recursively enumerable grammar.
A subset of #1, a grammar is recursively enumerable if (and only if) the language and its inverse are both recursive. So, for any input, the Turing machine is guaranteed to halt, and there are no undecidable inputs that might cause a "halting problem". Therefore, if A ® B is true for a given grammar, B ® A is true for its inverse. This means that given any assumption, we can tell in finite time whether it is true or false. Recursively enumerable grammars are sometimes referred to as "type-0" grammars (Manna, 1974, pp. 41-43).

3. Context-sensitive grammars.
Also known as "type-1" grammars, this class is restricted to Turing machines that can only examine states immediately before and after the current state. So, the grammar might include a rule to turn a sequence A B C into A X C (A B C ® A X B). Note that there must be the same number of symbols on both the left and right sides of the rule. Rules that are applied to symbols must consider the "context" of the symbol. For example, in the previous rule, a "B" can be turned into an "X" if (but not necessarily only if) it has a preceding "A" and a following "C". This makes it difficult to modify assumptions since individual symbols cannot be interpreted on their own. Also, the number of rules needed to fully specify the grammar might become very large, since there could be many special cases that need to be considered.

4. Context-free grammars.
Familiarly known as "CFGs", or "type-2" grammars. Here, there can only be one nonterminal symbol on the left-hand "assumption" side of the rule. Luckily, any rule can be rewritten to move all symbols to the right-hand "conclusion" side as needed (for example, A ¬ B ® C D is the same as A ® B C D. Rules can be applied to individual symbols since there is only one given assumption (like the "A" in the previous example). This means "CFGs are popular for natural language grammars." (Russell, et. al, 1995, p. 656)

5. Regular grammars.
This is the most specific (or "type-3") grammar and is the smallest subset. To be a regular grammar, every rule has a single nonterminal symbol on the left-hand side (as in #4), and only one terminal symbol on the right-hand side (or at most, one terminal symbol followed by a nonterminal symbol) as in the form A ® B. These grammars are similar to finite state machines, and can be implemented using a lookup table for each symbol that needs to be translated. However, "they are poorly suited for programming languages because, for example, they cannot represent constructs such as balanced opening and closing parenthesis. (Russell, et. al, 1995, p. 656)


Chomsky, Noam, "Language and Mind", Harcourt Brace Jovanovich, San Diego. 1968

Hofstadter, Douglas R., "Gödel, Escher, Bach: An Eternal Golden Braid", Vintage Books, New York. 1979

Manna, Zohar, "Mathematical Theory of Computation", McGraw-Hill, New York. 1974

Russell, Stuart J., and Peter Norvig, "Artificial Intelligence: A Modern Approach", Prentice Hall, New Jersey. 1995