Part three of a mini-series on using grammars as generators:
- Part I was about Context-Free Grammars and natural language parsing.
- Part II was about Lindenmeyer systems, Context Free design grammars, and Structure Synth.
- This third part is about generative poetry and Tree-Adjoining Grammars.
- Part IV will be about Style Grammars and Shape Grammars.
Issue 1: Fall 2008
(From Issue 1: Fall 2008)
In October 2008, Stephen McLaughlin and Jim Carpenter released ‘Issue 1: Fall 2008′, a 3785 page compilation of poems by more than 3000 contemporary American poets. (Download here).
Besides the stunning size of the book, this would probably have gone relatively unnoticed, if it weren’t for the fact that the 3000+ authors had not submitted any poems to the editors. Neither were they able to recognize any of the poems as their own. This of course created quite a stir.
It turns out that the book was generated by Erica T Carter, which actually is the name of a computer program written by Jim Carpenter.
Erica T Carter
Erica T Carter (ETC3) uses Tree-Adjoining Grammars (TAGs) for generating poems. TAGs are related to formal grammars, but instead of production rules operating on symbolic sequences, TAGs operate on trees.
Considering the example above, a tree may be inserted (substituted) into another tree, if its root node matches a terminal node (the a-2 and a-3 trees may be attached to the NP non-terminal). Besides the substitution operation, it is also possible to do adjoining operations, where a tree may be expanded by inserting another tree inside it (this paper has a short introduction to TAGs).
‘Erica T Carter’ is open source and may be downloaded (or tested) here.
Judging from the source and comments, the phrase structures in the TAGs were handpicked by Jim Carpenter after analyzing works by Frank O’Hara, Sylvia Plath, Gary Snyder, and Rachel Blau DuPlessis (not that I have read anything by them). In the online version of Erica, the default grammar choice is the 166-trees ‘Mimetic’ grammar, which is described as an amalgam of the rest of the grammars – I’m guessing that this is the one used for ‘Issue 1: Fall 2008′.
In addition to the trees, Erica uses a lot of extra information from a 30MB database (public available from the site too). The database contains lexicon tables with entries like ‘abashless = adj’ and ‘abate = v’ apparently based on Emily Dickinson Poems and Joseph Conrad’s Heart of Darkness (‘Apocalypse Now’). The database also contains more exotic tables, like ‘pronunciation’ (‘anachronism = AH0 N AE1 K R AH0 N IH2 Z AH0 M’) which may be used to construct rhymes.
The auto-generated poems seems very impressive, but after skimming through the text some patterns emerge, which seems unlikely if the poems were to be written by different (human) authors: for instance 2205 lines start with “Like a …” (that is 4% of the 55685 total lines!).
On the other hand the poems impress by containing cross serial dependencies as in this example:
Where both the title and content of the poem refer to ‘background’. This particular feature – where different parts of the generated sentences correlate – is difficult to achieve with context free grammars, since the substitutions of symbols does not depend on how other symbols are expanded.
Tree Adjoined Grammars and Context Sensitivity
Tree Adjoined Grammars fits in between the Context Free and Context Sensitive classes in the Chomsky Hierarchy: they describe Mildly Context-sensitive languages.
Where as Context Free grammars have the pleasant property of being parsable in polynomial space and time, they are not adequate for describing for example the natural languages (features such as cross serial dependencies and agreements are difficult to capture: for instance a formal grammar for a natural language may contain ‘noun’ and ‘verb’ non-terminals. But nouns and verbs depend on each other – something which is not easily expressed in a context free grammar, leading to output such as ‘Words is funny’).
So the Mildly context-sensitive languages provide a middle ground – they are better at capturing the structure of natural languages, but still not as difficult to handle as a true context-sensitive language.
Generative Art and the Freedom of Restrictions
Now, why even care about the complexity of the grammars? After all we are not interested in analyzing syntax or deriving mathematical properties of the grammars, we are interested in generating structure.
For instance, why restrict Context Free Art and Structure Synth to context free systems? The cost of generating structures does not increase by switching to a more powerful grammar, only the cost of analyzing the systems.
I think the answer relates to the very essence of ‘generative art’.
For me, generative art is all about exploring systems, without being too much in control. When generating structures, it should not be possible to anticipate how a given structure turns out by looking at the rules. There should be a sense of indeterminism and surprise in the result. The system needs not to necessarily be driven by random choices in order to achieve this – for instance the Mandelbrot set is a good example of this: nobody would have been able to imagine how complex images a simple system like “z -> z^2+c” could create, yet there is nothing stochastic in the generation of the set.
Choosing to work within a restricted rule system is a way to give up some control and force yourself to think differently. You have to explore and work within the limitations of the system, which may lead to interesting and surprising results.
More generic languages, for instance the popular Java-based Processing, have no limitations in expressiveness (Java, like all other general purpose programming languages, is Turing complete). Is Processing not suitable for generative art, because of its universal expressive power? Well, the answer is of course that Processing is very suitable and it is indeed widely used for producing generative art. In fact any Structure Synth or Context Free Art system could be emulated in Processing/Java because of this universal power. But Processing is also suitable for many other applications, like for instance Data Visualization and other non-generative tasks. Context Free Art and Structure Synth on the other hand force you to explore generative systems.