OuPoCo, the Combinatorial Poetry Workbench

Abstract: Oupoco (L’ouvroir de poésie combinatoire) is a project inspired by Raymond Queneau's book Cent mille milliards de poèmes, published in 1961. Queneau’s book is a collection of ten sonnets whose verses can be freely recombined to form new poems. The book can be seen as composed of ten sheets, each cut into fourteen horizontal strips, each strip carrying a verse on its front. The reader can choose, for each verse, one of the ten versions proposed by Queneau. The ten versions of each verse have the same scansion and rhyme, which ensures that each sonnet thus assembled is regular in shape [Queneau, 1961]. It would be tempting to develop a computer-based version of Queneau’s work, but the book is still under copyright, and it is by definition limited to its ten original sonnets. To overcome these problems, we developed the Oupoco project, which aims to provide a sonnet generator based on the recombination of a large collection of 19th-century French sonnets. The challenge is thus more complex than the one originally set by Queneau, since our sonnets do not share the same scansion and rhyme. From this point of view, even though the project is intended to generate new sonnets, it relies largely on the development of analysis tools able to identify the scansion, the rhyme and the structure of the original sonnets. It is thus very different from the numerous projects dedicated to the pure generation of poetry, whether with symbolic [Gervás, 2013] or neural methods [Ghazvininejad et al., 2017] [Van De Cruys, 2019] (among many others). Oupoco is currently based on a collection of 788 sonnets by 16 authors from the 19th century, and this database is regularly expanded. Each sonnet is encoded in an XML format along with related metadata, and a TEI version of the database is available. The project requires a formal representation of rhymes [Beaudouin, 2002].
To build this representation, the first step is to get a phonetic transcription of the last word of each verse, but this is not enough: for example, aimé and aimée have the same phonetic transcription but do not rhyme, because feminine and masculine endings do not rhyme under the classical rules of French poetry. Conversely, there are cases where the phonetic transcriptions diverge but the words actually rhyme (for example with sounds like [e] and [ɛ]). A series of rules thus had to be defined to derive a proper rhyme analysis from the phonetic transcription of the last word of each verse. The generator uses this analysis to produce random sonnets, with different possible structures, respecting the rules of French versification (the code and the resources used, especially the sonnet database, are open source and freely available for research, see:
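
As an illustration of the two kinds of rules mentioned above, the following is a minimal sketch in Python, not the actual Oupoco code. It assumes that the phonetic transcription of each line-final word is already available as a list of IPA symbols; the names EQUIVALENT_SOUNDS, is_feminine, rhyme_key and assemble are hypothetical, the spelling test for feminine endings is deliberately crude, and the only sound equivalence encoded is the [e] / [ɛ] case cited in the text.

```python
import random
from collections import defaultdict

# Assumed, partial inventory of French vowel symbols (illustration only).
VOWELS = {"a", "ɑ", "e", "ɛ", "i", "o", "ɔ", "u", "y", "ø", "œ", "ə",
          "ɑ̃", "ɛ̃", "ɔ̃", "œ̃"}

# Hypothetical table of sounds treated as rhyming although they differ.
EQUIVALENT_SOUNDS = {"ɛ": "e"}


def is_feminine(word):
    """Crude spelling heuristic: a final unaccented e (or es) is feminine."""
    return word.lower().endswith(("e", "es"))


def rhyme_key(word, phonemes):
    """Return (sounds from the last vowel onward, gender of the ending)."""
    last_vowel = 0
    for i, p in enumerate(phonemes):
        if p in VOWELS:
            last_vowel = i
    tail = tuple(EQUIVALENT_SOUNDS.get(p, p) for p in phonemes[last_vowel:])
    return tail, "F" if is_feminine(word) else "M"


def assemble(lines, scheme, rng=random):
    """Draw one verse per position of a rhyme scheme such as "ABBA".

    lines: iterable of (verse_text, last_word, phonemes) tuples.
    Raises ValueError when the corpus cannot satisfy the scheme.
    """
    classes = defaultdict(list)
    for text, word, phonemes in lines:
        classes[rhyme_key(word, phonemes)].append(text)

    need = {letter: scheme.count(letter) for letter in set(scheme)}
    keys = list(classes)
    rng.shuffle(keys)

    chosen = {}
    # Serve the most demanding letters first so a valid assignment is found
    # whenever one exists.
    for letter in sorted(need, key=need.get, reverse=True):
        key = next((k for k in keys if len(classes[k]) >= need[letter]), None)
        if key is None:
            raise ValueError("not enough rhyming verses for scheme " + scheme)
        keys.remove(key)  # two different letters must not share a rhyme
        chosen[letter] = rng.sample(classes[key], need[letter])
    return [chosen[letter].pop() for letter in scheme]
```

With this sketch, rhyme_key("aimé", ["e", "m", "e"]) and rhyme_key("aimée", ["e", "m", "e"]) share the same final sound but differ in gender, so the two words fall into different rhyme classes, while two masculine words ending in [e] and [ɛ] respectively fall into the same one. A full sonnet would be requested with a fourteen-letter scheme, for example "ABBAABBACCDEED".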
