From Cyclopaedia to Encyclopédie: experiments in machine translation and sequence alignment

Figure 1. Title page from the 1745 prospectus of the first Encyclopédie project. This page image is taken from ARTFL’s 18th Volume of the Encyclopédie.

It is well known that the Encyclopédie ou dictionnaire raisonné des sciences, des arts et des métiers began first as a modest translation project of Ephraim Chambers’ Cyclopaedia in 1745. Over the next few years, Diderot and D’Alembert would replace the original editors and the project would be duly transformed from a simple translation into an effort to compile and organise the sum total of the world’s knowledge. Over the course of their editorial work, Diderot, and most notably D’Alembert, were not shy in incorporating these translations of the Cyclopaedia as filler for the Encyclopédie. Indeed, ‘ils ont laissé une bonne partie de ces articles presque inchangés, ou avec des modifications insignifiantes’ (Paolo Quintili, ‘D’Alembert “traduit” Chambers. Les articles de mécanique de la Cyclopædia à l’Encyclopédie’, Recherches sur Diderot et sur l’Encyclopédie 21 (1996), p.75). The philosophes were nonetheless conscious of their debt to their English predecessor Chambers. His name appears some 1154 times in the text of the Encyclopédie and he is referenced as sole or contributing source to 1081 articles, where his name appears in italics at the end of a section or article. Given the scale of the two works under consideration, systematic evaluation of the extent of the philosophes’ use of Chambers has remained, even today, a daunting task. John Lough, in 1980, framed the problem nicely: ‘So far no one has had the patience to make a detailed study of the exact relationship between the text of Diderot’s Encyclopédie and the work of Ephraim Chambers. This would no doubt require several years of arduous toil devoted to comparing the two works article by article’ (‘The Encyclopédie and Chambers’ Cyclopaedia’, SVEC 185 (1980), p.221).

Recent developments in machine translation and sequence alignment now offer new possibilities for the systematic comparison of digital texts across languages. The following post outlines some recent experimental work in leveraging these new techniques in an effort to reduce the ‘arduous toil’ of textual comparison, giving some preliminary examples of the kinds of results that can be achieved, and providing some cursory observations on the advantages and limitations of such systems for automatic text analysis.

Our two comparison datasets are the ARTFL Encyclopédie (v. 1117) and the recently digitised ARTFL edition of the 1741 Chambers’ Cyclopaedia (link). The 1741 edition was selected as it was one of the likely sources for the original translation project and we were able to work from high-quality page images provided by the University of Chicago Library. (On the possible editions of the Cyclopaedia used by the encyclopédistes, see Irène Passeron, ‘Quelle(s) édition(s) de la Cyclopœdia les encyclopédistes ont-ils utilisée(s)?’, Recherches sur Diderot et sur l’Encyclopédie 40-41 (2006), p.287-92.) In a nutshell, our approach was to generate a machine translation of all of the Cyclopaedia articles into French and then use ARTFL’s Text-PAIR sequence alignment system to identify similar passages between this virtual French Cyclopaedia and the Encyclopédie, with the translation providing links back to the original English edition of the Chambers as well as links to the relevant passages in the Encyclopédie.

For the English to French machine translation of Chambers, we examined two of the most widely used resources in this domain, Google Translate and DeepL. Both systems provide useful Application Programming Interfaces [APIs] as part of their respective subscription services, and both provide translations based on cutting-edge neural network language models. We compared results from various samples and found, in general, that both systems worked reasonably well, given the complications of eighteenth-century vocabularies (in both English and French) and many uncommon and archaic terms (this may be the subject of a future post). While DeepL provided somewhat more satisfying translations from a reader’s perspective, we ultimately opted to use Google Translate for the ease of its API and its ability to parse the TEI encoding of our documents with little difficulty. The latter is of critical importance, since we wanted to keep the overall document structure of our dictionaries to allow for easy navigation between the versions.

Operationally, we segmented the text of the Cyclopaedia into short blocks, split at paragraph breaks, and sent them for automatic translation via the Google API, with a short delay between blocks. This worked relatively well, though the system would occasionally throw timeout or other errors, which required a query resend. You can inspect the translation results here – though this virtual French edition of the Chambers is not really meant for public consumption. Each article has a link at the bottom to the corresponding English version for the sake of comparison. It is important to note that the objective here is NOT to produce a good translation of the text or even one that might serve as the basis for a human edition. Rather, this machine-generated edition exists as a ‘pivot-text’ between the English Chambers and the French Encyclopédie, allowing for an automatic comparison of the two (or three) versions using a highly fault-tolerant sequence aligner designed to pick out commonalities in very noisy document spaces. (See Clovis Gladstone, Russ Horton, and Mark Olsen, ‘TextPAIR (Pairwise Alignment for Intertextual Relations)’, ARTFL Project, University of Chicago, 2008-2021, and, more specifically, Mark Olsen, Russell Horton and Glenn Roe, ‘Something borrowed: sequence alignment and the identification of similar passages in large text collections’, Digital Studies / Le Champ numérique 2.1 (2011).)
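The segment-and-resend routine described above can be sketched roughly as follows. This is a minimal illustration and not the code used for the project: the `translate` callable is a hypothetical wrapper around whichever translation service is chosen, and the delay and retry limit are assumed values.

```python
import time
from typing import Callable, List


def split_into_blocks(text: str) -> List[str]:
    """Split text at blank lines (paragraph breaks), dropping empty blocks."""
    return [block.strip() for block in text.split("\n\n") if block.strip()]


def translate_blocks(
    blocks: List[str],
    translate: Callable[[str], str],  # hypothetical wrapper around a translation API
    delay: float = 0.5,               # assumed pause between requests, in seconds
    max_retries: int = 3,             # assumed limit on attempts per block
) -> List[str]:
    """Translate each block in order, resending a block when a request fails."""
    translated = []
    for block in blocks:
        for attempt in range(1, max_retries + 1):
            try:
                translated.append(translate(block))
                break
            except Exception:
                if attempt == max_retries:
                    raise  # give up on this block after the last attempt
                time.sleep(delay * attempt)  # wait a little longer before each resend
        time.sleep(delay)  # short delay between blocks
    return translated
```

Passing the translation function in as an argument keeps the sketch independent of any particular service and makes the retry logic easy to test with a stand-in function.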

The next step was to establish workable parameters for the Text-PAIR alignment system. The challenge here was to find commonalities between the French translations created by eighteenth-century authors and translators and machine translations produced by a modern automatic translation system. Additionally, the editors and authors of the Encyclopédie were not necessarily constrained to produce an exact translation of the text in question, but could, and did, make significant modifications to the original in terms of length, style, and content. To address this challenge we ran a series of tests with different matching parameters such as n-gram construction (e.g., number of words that constitute an n-gram), minimum match lengths, maximum gaps between matches, and decreasing match requirements as a match length increased (what we call a ‘flex gap’) among others on a representative selection of 100 articles from the Encyclopédie where Chambers was identified as the possible source. It is important to note that even with the best parameters, which we adjusted to get favorable recall and precision results, we were only able to identify 81 of the 100 articles. (See comparison table. The primary parameters chosen were bigrams, stemmer=true, word len=3, maxgap=12, flexmatch=true, minmatchingngrams=5. Consult the TextPair documentation and configuration file for a description of these values.) Some articles, even where clearly affiliated, were missed by the aligner, due to the size of the articles (some are very small) and fundamental differences in the translation of the English. For example, the article ‘Compulseur’ is attributed by Mallet to Chambers, but the machine translation of ‘Compulsor’ is a rather more literal and direct translation of the English article than what is offered by Mallet. Further relaxing matching parameters could potentially find this example, but would increase the number of false positives, in effect drowning out the signal with increased noise.
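To give a feel for how parameters of this kind interact, here is a simplified sketch of bigram matching with a gap tolerance. It is not Text-PAIR’s implementation: the function names are hypothetical, stemming and flex matching are left out, and reading ‘word len=3’ as a minimum word length is an assumption.

```python
import re
from typing import List, Tuple


def content_words(text: str, min_word_len: int = 3) -> List[str]:
    """Lowercase the text and keep alphabetic words of at least min_word_len letters."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    return [w for w in words if len(w) >= min_word_len]


def ngrams(words: List[str], n: int = 2) -> List[Tuple[str, ...]]:
    """Build overlapping n-grams (bigrams by default) from a word list."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def has_alignment(source: str, target: str, n: int = 2, max_gap: int = 12,
                  min_matching_ngrams: int = 5) -> bool:
    """Report whether source contains a run of shared n-grams long enough to count.

    A run is broken when more than max_gap non-matching n-grams separate
    two consecutive matches (a simplification made for this sketch).
    """
    target_ngrams = set(ngrams(content_words(target), n))
    run = 0
    last_match = None
    for i, gram in enumerate(ngrams(content_words(source), n)):
        if gram not in target_ngrams:
            continue
        if last_match is not None and i - last_match - 1 > max_gap:
            run = 0  # gap too wide: start a new run
        run += 1
        last_match = i
        if run >= min_matching_ngrams:
            return True
    return False
```

Even in this toy version the trade-off is visible: lowering `min_matching_ngrams` or raising `max_gap` recovers short or heavily reworded articles, but lets more chance overlaps through.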

All things considered, we were quite happy with the aligner’s performance given the complexity of the comparison task and the multiple potential variations between historical text and modern machine translations. To give an example of how fine-grained and at the same time highly flexible our matching parameters needed to be, see the below article ‘Gynaecocracy’, which is a fairly direct translation on a rather specialised subject, but that nonetheless matched on only 8 content words (fig. 2).

Figure 2. Comparisons of the article ‘Gynaecocracy’.

Other straightforward articles were, however, missed due to differences in the translation and sparse matching n-grams; see, for example, the small article on ‘Occult’ lines in geometry below, where the 6 matching words weren’t enough to constitute a match for the aligner (fig. 3).

Figure 3. Comparisons of the geometry article ‘Occult’.

Obviously this is a rather inexact science, reliant on an outside process of automatic translation and the ability to match a virtual text that in reality never existed. Nonetheless the 81% recall rate we attained on our sample corpus seemed more than sufficient for this experiment and allowed us to move forward towards a more general evaluation of the entirety of identified matches.

Once settled on the optimal parameters, we then used Text-PAIR to generate both an alignment database, for interactive examination, and a set of static files. Both of these result formats are used for this project. The alignment database contains some 7304 aligned passage pairs. The system allows queries on metadata, such as author and article title as well as words or phrases found in the aligned passages. The system also uses faceted browsing to allow the user to summarize results by the various metadata (for more on this, see Note below). Each aligned passage is presented as a facing page representation and the user can toggle a display of all of the variations between the two aligned passages. As seen below, the variations between the texts can be extensive (fig. 4).

Figure 4. Text-PAIR interface showing differences in the article ‘Air’.

Text-PAIR also contextualises results back to the original document(s). For example, the following is the article ‘Almanach’ by D’Alembert, showing the aligned passage from Chambers in blue (fig. 5).

Figure 5. Article ‘Almanach’ with shared Chambers passages in blue.

In this instance, D’Alembert reused almost all of Chambers’ original article ‘Almanac’, with some minor variations, but does not appear to have indicated the source of the first part of his article (page image).

The alignment database is a useful first pass to examine the results of the alignment process, but it is limited in at least two ways. It identifies each aligned passage, but does not merge multiple passages identified in article pairs. Thus we find 5 shared passages between the articles ‘Constellation’. The interface also does not attempt to evaluate the alignments or identify passages that occur between different articles. For example, D’Alembert’s article ‘ATMOSPHERE’ indeed has a passage from Chambers’ article ‘Atmosphere’, but also many longer passages from the article ‘Generation’.

To accumulate results and to refine evaluation, we subsequently processed the raw Text-PAIR alignment data as found in the static output files. We developed an evaluation algorithm for each alignment, with parameters based on the length of the matching passages and the degree to which the headwords were close matches. This simple evaluation model eliminated a significant number of false positives, which we found were typically short text matches between articles with different headwords. The output of this algorithm resulted in two tables, one for matches that were likely to be valid and one that was less likely to be valid, based on our simple heuristics – see a selection of the ‘YES’ table below (fig. 6). We are, of course, making this distinction based on the comparison of the machine translated Chambers headwords and the headwords found in the Encyclopédie, so we expected that some valid matches would be identified as invalid.
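A heuristic of this general shape might look like the sketch below. It is an illustration only, not the evaluation algorithm used for the project: the similarity measure (Python’s `difflib.SequenceMatcher`), the two thresholds, and the function names are all assumptions.

```python
from difflib import SequenceMatcher


def headword_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two headwords, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify_alignment(
    source_headword: str,
    target_headword: str,
    passage_words: int,
    min_similarity: float = 0.8,    # assumed cut-off for 'close' headwords
    long_passage_words: int = 100,  # assumed length at which a match stands on its own
) -> str:
    """Sort an alignment into the 'YES' or 'NO' table.

    Close headwords are accepted; otherwise only long passages are kept,
    since short matches between different headwords are the typical false positive.
    """
    if headword_similarity(source_headword, target_headword) >= min_similarity:
        return "YES"
    if passage_words >= long_passage_words:
        return "YES"
    return "NO"
```

For instance, `classify_alignment("Almanach", "Almanac", 20)` lands in the ‘YES’ table on headword similarity alone, while a 30-word match between ‘Atmosphère’ and ‘Génération’ would be set aside for human review.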

Figure 6. Table of possible article borrowings.

The next phase of the project included the necessary step of human evaluation of the identified matches. While we were able to reduce the work involved significantly by generating a list of reasonably solid matches to be inspected, there is still no way to eliminate fully the ‘arduous toil’ of comparison referenced by Lough. More than 5000 potential matches were scrutinised, looking in essence for ‘false negatives’, i.e., matches that our evaluation algorithm classed as negative (based primarily on differences in headword translations) but that were in reality valid. The results of this work were then merged into a single table of what we consider to be valid matches, a list that includes some 3700 Encyclopédie articles with at least one matching passage from the Cyclopaedia. These results will form the basis of a longer article that is currently in preparation.

Conclusions

In all, we found some 3778 articles in the Encyclopédie that upon evaluation seem highly similar in both content and structure to articles in the 1741 edition of Chambers’ Cyclopaedia. Whether or not these articles constitute real acts of historical translation is the subject for another, or several other, articles. There are simply too many outside factors at play, even in this rather straightforward comparison, to make blanket conclusions about the editorial practices of the encyclopédistes based on this limited experiment. What we can say, however, is that of the 1081 articles that include a ‘Chambers’ reference in the Encyclopédie, we only found 689 with at least one matching passage. Obviously this recall rate of 63.7% is well below the 81% we attained on our sample corpus, probably due to overfitting the matching algorithm to the sample, which warrants further investigation. But beyond testing this ground truth, we are also left with the rather astounding fact of 3089 articles with no reference to Chambers whatsoever, all of which seem, at first blush, to be at least somewhat related to their English predecessors.
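The figures in this paragraph follow from simple arithmetic on the three counts reported above, which can be checked in a few lines (the variable and function names are ours, for illustration):

```python
def recall_rate(found: int, expected: int) -> float:
    """Share of expected items that were actually found."""
    return found / expected


matched_articles = 3778        # Encyclopédie articles with a validated match
chambers_referenced = 1081     # articles that cite Chambers explicitly
referenced_and_matched = 689   # cited articles with at least one matching passage

recall = recall_rate(referenced_and_matched, chambers_referenced)
unattributed = matched_articles - referenced_and_matched

print(f"Recall on Chambers-referenced articles: {recall:.1%}")
print(f"Matched articles with no Chambers reference: {unattributed}")
```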

The overall evaluation of these results remains ongoing, and the ‘arduous toil’ of traditional textual comparison continues apace, albeit guided somewhat by the machine’s heavy hand. Indeed, the use of machine translation as a bridge between documents to find similar passages, be they reuses, plagiarisms, etc., is, as we have attempted to show here, a workable approach for future research, although not without certain limitations. The Chambers–Encyclopédie task outlined above is fairly well constrained and historically bounded. More general applications of these same methods may well yield less useful results. These reservations notwithstanding, the fact that we were able to unearth many thousands of valid potential intertextual relationships between documents in different languages is a feat that even a few years ago might not have been possible. As large-scale language models become ever more sophisticated and historically aware, the dream of intertextual bridges between multilingual corpora may yet become a reality. (For more on ‘intertextual bridges’ in French, see our current NEH project.)

Note

The question of the Dictionnaire de Trévoux is one such factor, as it is known that both Chambers and the encyclopédistes used it as a source for their own articles – so matches we find between the Chambers and Encyclopédie may indeed represent shared borrowings from the Trévoux and not a translation at all. Or, more interestingly, perhaps Chambers translated a Trévoux article from French to English, which a dutiful encyclopédiste then translated back to French for the Encyclopédie – in this case, which article is the ‘source’ and which the ‘translation’? For more on these particular aspects of dictionary-making, see our previous article ‘Plundering philosophers: identifying sources of the Encyclopédie’, Journal of the Association for History and Computing 13.1 (Spring 2010) and Marie Leca-Tsiomis’ response, ‘The use and abuse of the digital humanities in the history of ideas: how to study the Encyclopédie’, History of European ideas 39.4 (2013), p.467-76.

– Glenn Roe and Mark Olsen

Annotation in scholarly editions and research

It has been, alas, almost exactly a year since our last face-to-face Besterman Workshop at 99 Banbury Road. Of course, webinars allow more people to join, and to do so, most importantly, from the comfort of their homes, where they can sit comfortably and set their thermostats to the temperature that suits them best. The advent of the Zoom/Teams era, however, has brought with it a number of unfortunate consequences: discussions are not as lively as they used to be, asking a follow-up question is nearly impossible, and so are chats with friends and colleagues, before, during, or after the talk. Worst of all, we no longer get a chance to eat our beloved Leibniz or Belgian biscuits – but those, to be fair, had already become something of a rarity towards the beginning of 2018. Anyway: those of you who did attend our last face-to-face Besterman Workshops may remember this gloomy and cumbersome poster of mine hanging from the mantelpiece.

This poster was presented at a conference in Wuppertal, Germany, at the end of February 2019: ‘Annotation in Scholarly Editions and Research: Function – Differentiation – Systematization’. Organised by Julia Nantke (Universität Hamburg) and Frederik Schlupkothen (Bergische Universität Wuppertal), this two-day bilingual Anglo-German colloquium was a wonderful occasion to reflect on the age-old human habit of glossing, commenting, and generally interfering with other people’s work.

Alongside some theoretical papers (to mention but one, Willard McCarty’s brilliant keynote lecture on annotation as a knowledge-producing practice), the symposium featured several more practice-oriented talks that would have certainly been of interest to many of our Digital Humanities followers: some focused on how best to structure and visualise annotation in digital scholarly editions; others raised the question as to how to annotate audio-visual materials; and yet others investigated the extent to which annotation can be automated.

Some of the papers given at the ‘Annotation in Scholarly Editions and Research’ conference can now be read in a volume published last year (yes, in 2020!) by De Gruyter and available in print as well as an Open Access eBook.

My own contribution to the volume (which you can find here, should you want to read it) presents what I think might be an efficient and user-friendly three-level annotation system, the ‘reversible annotation system’, which I developed while working on Digital d’Holbach, a born-digital scholarly edition of Paul-Henri Thiry d’Holbach’s complete works. On this model, I argue, a single set of notes can be so structured as to cater to very different audiences, meaning that the edition can hope simultaneously to be user-friendly and cost-efficient. Should you have any comments or suggestions for improvement, please do not hesitate to let me know!

Ruggero Sciuto, University of Oxford

Digitising the margins: a classification of Voltaire’s scribbles

The most famous squiggly lines relating to eighteenth-century writing are almost certainly to be found in Tristram Shandy. Sterne uses them to illustrate the non-linearity of stories (see about halfway down that page) and digressions from the main narrative, before reviving the device several volumes later to render graphically for his readers the movement of the stick brandished by the character Trim. But these squiggles from 1761-1762 are far from alone. Both before and after Sterne’s foray into wiggly line design, Voltaire was peppering the margins of his books with marginalia, which involved both verbal and non-verbal elements – that is, words and squiggles.

When a team of Russian scholars began to publish the marginalia from his library in the 1970s in the Corpus des notes marginales de Voltaire, they decided that a facsimile edition would be both too expensive and not sufficiently clear to read. They settled on a compromise editorial policy, which entailed transcribing Voltaire’s words and reproducing graphically any accompanying marks and lines (usually made in ink or lead pencil, but also comprising scratches or indentations in the paper, for example crosses scored with the thumb-nail). When the edition passed to the Voltaire Foundation, we adhered to the same principles for the remaining volumes, much to the chagrin of our typesetter, who nevertheless heroically drew hundreds of scribbles electronically to incorporate into the typeset file.

Vauvenargues, p.90; OCV, vol.145, p.484.

The example above and those that follow are from books that Voltaire annotated with the intent of returning them to their authors with suggestions for improvement. In principle this should mean a greater likelihood that any shapes drawn should be intelligible and contribute to the meaning of the verbal marginalia. Indeed, in the first case, in a copy of Vauvenargues’s Introduction à la connaissance de l’esprit humain, we can see that the vertical wavy line in the margin brackets the passage generally, and is connected with the note ‘peu déve / lop[p]é’ (poorly developed), while the second + sign links ‘sage’ in the printed text to ‘fort’ in the margin, indicating that rather than referring to a wise person, the author should be talking about a strong person (in opposition to the weak person indicated by the first + sign higher up).

Vauvenargues, p.48; OCV, vol.145, p.477.

Here Voltaire uses + signs again to flag the word ‘dans’ twice at the top of the page, and indicates by the curved line and a further ‘dans’ in the margin that Vauvenargues should be consistent in beginning each in the series of adverbial clauses with the same preposition. At the beginning of the new section lower down, he uses a sort of Greek gamma in the margin to show that an insertion should be made. All very clear for the addressee of the annotations. And between those two? The squiggly line in the margin is hard to interpret and may simply bear testament to his reading: did he stumble on this passage? Did he dislike it? Perhaps he wanted to write a criticism or a suggestion but couldn’t decide on what to say. At any rate, the squiggle draws our eye, nearly 300 years after it was penned, to a passage to which Voltaire must also have paid particular attention.

Frederick, p.122; OCV, vol.145, p.156.

This final example is a bit different insofar as it is not actually in Voltaire’s hand, but is a careful copy made of an original that was subsequently destroyed in the bombing of Berlin during the Second World War. Slanted crosses, several with double verticals (reminiscent of the letter H), indicate lines of verse by Frederick, king of Prussia, with which Voltaire, preeminent poet of his day, was unhappy and which are commented on in the margins. The ‘gamma’ again probably draws the king’s attention to the replacement word written over the line. Here, the limits of the typeset page become apparent as the slashing lines and crosses come so thick and fast that it becomes difficult to fit them all in. An apparatus of notes at the bottom of the page helps, but the effect at first glance is really not quite the same.

Digitising these volumes, as part of the Voltaire Foundation’s new initiative Digital Enlightenment, poses new challenges, but can it also bring new solutions? On first analysis the infinitely flexible nature of Voltaire’s squiggles seems to be at odds with the ordered discipline inherent in our approach to digitising the Œuvres complètes. We soon decided that we were not going to scan every mark in the source volume and virtually paste it into the digital text – not only would madness likely that way lie, but also considerable expense, and it would be a distinctly inelegant way of solving the problem. The more you look at the corpus of squiggles, however, the more you see that although in strict terms you have a very large number of different marks, you have a much smaller number of different types of mark, and if we can successfully classify and label those types, we can use that classification and those labels when we digitise the content. Instead of the data saying ‘here’s a picture of a squiggle’, it will instead say ‘at this point there’s a mark of type X.’

How, then, to classify these marks? If you think of what makes up a mark or a squiggle, it will be one or more line-type marks, and where there is more than one line-type mark, they may meet or cross each other at a particular point. We call the line-type marks edges, and the points where they meet or cross nodes, and if you count the number of edges and nodes you find you have a ready-made way of classifying – and even sorting – your squiggles. For example:

 

has one edge, and no nodes:

 

has two edges, but still no nodes, and:

has one edge and one node. If we turn these counts into parts of a label (e.g. n0e1) we can start to distil order out of infinite variety, and we can pretty soon have an easy lookup for our digitisers to use:

There is, of course, a degree of discretion involved here in grouping marks according to type – there is a slanted line 10º from the vertical and another 10º from the horizontal, but what if we find a line precisely 45º from both? Or a vertical line that wiggles not once or twice but… seven times? Well, we may then need to add a shape and a code, but the method allows that, and if there’s one thing this digitisation exercise has taught us, it’s that until you’ve marked up the final full stop, novelty may at any time appear before you. Expect, and accommodate, the unexpected.

Using this method, we will be able to allow readers to search for particular marks. Or, more correctly, for particular classifications of marks, e.g. for ‘a straight line slanting from bottom left to top right at an angle of inclination less than 45º from the horizontal’ rather than for a specific slanting line. But the classification should be sufficiently specific that a reader encountering a mark in one text, and wondering where else Voltaire has used it, should be able to see the other relevant instances.
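As a rough illustration of the node-and-edge labels, the sketch below builds a sorted lookup from a handful of marks. The `Mark` structure, the function names, and the sample descriptions are hypothetical, and a full scheme would need further codes to tell apart shapes that share the same counts (a vertical and a slanted line, say).

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Mark(NamedTuple):
    description: str
    nodes: int  # points where line-type marks meet or cross
    edges: int  # line-type marks


def label(mark: Mark) -> str:
    """Turn the node and edge counts into a label such as 'n0e1'."""
    return f"n{mark.nodes}e{mark.edges}"


def build_lookup(marks: List[Mark]) -> Dict[str, List[str]]:
    """Group mark descriptions under their label, ordered by node then edge count."""
    lookup = defaultdict(list)
    for mark in sorted(marks, key=lambda m: (m.nodes, m.edges)):
        lookup[label(mark)].append(mark.description)
    return dict(lookup)
```

Searching for every instance of a given type then reduces to a lookup on its label, for example `build_lookup(marks)["n0e1"]`.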

How will we deal with squiggles that defy classification? We defy squiggles to defy this classification! Time will, of course, tell, but we’re confident that we can accommodate anything that Voltaire felt necessary to add to the texts he was reading, blissfully unaware of the coding system that awaited his scribbles.

– Gillian Pink and Dan Barker, dancan Ltd

 

Exploring Voltaire’s letters: between close and distant readings

‘La lettre au fil du temps: philosophe.’

A stamp produced by the French post office in 1998 celebrates the art of letter-writing by depicting Voltaire writing letters with both hands. It’s true that Voltaire wrote a lot of letters – over 15,000 are known, and more turn up all the time – but even so it’s not altogether clear that an ambidextrous letter-writer is someone we entirely want to trust. Voltaire’s correspondence is full of difficulties and traps, and faced by such a huge corpus, it is hard to know where to start. Without question, the Besterman ‘definitive’ edition (1968-77), digitised in Electronic Enlightenment, has had a major impact on Enlightenment scholarship: historians and literary critics make frequent use of these letters, but usually in an instrumental way, adducing a single passage in a letter as evidence in support of a date or an interpretation.

Nicholas Cronk and Glenn Roe, Voltaire’s correspondence: digital readings (CUP, 2020).

Voltaire’s letters can be notoriously ‘unreliable’, however, and they really need to be read and interpreted – like all his texts – as literary performances. Few critics have attempted to examine the corpus of the correspondence in its entirety and to understand it as a literary whole. In our new book, Voltaire’s correspondence: digital readings, we have experimented with a range of digital humanities methods, to explore to what extent they might help us identify new interpretative approaches to this extraordinary correspondence. The size of the corpus seems intimidating to the critic, but it is precisely this that makes these texts a perfect test-case for digital experimentation: we can ask questions that we would simply not have been able to ask before.

For example, we looked at the way Voltaire signs off his letters – and were surprised to find that only 13% of the letters are actually signed ‘Voltaire’; while over a third of the letters are signed with a single letter, ‘V’. Then Voltaire is hugely inventive in the way he plays with the rules of epistolary rhetoric, posing as a marmot to the duc de Choiseul. And if you want to know why in a letter (D18683) to D’Alembert he signs off ‘Miaou’, the answer is to be found in a fable by La Fontaine…

We studied Voltaire as a neologist. Critics have usually described Voltaire as an arch-classicist adhering rigorously to the norms of seventeenth-century French classicism. True, yet at the same time he is hugely energetic in coining new words, an aspect of his literary style that has been insufficiently studied. Here, corpus analysis tools, coupled with available lexicographical digital resources, allow us to consider Voltaire’s aesthetic of lexical innovation. In so doing, we can test the hypothesis that Voltaire uses the correspondence as a laboratory in which he can experiment with new formulations, ideas, and words – some of which then pass into his other works. We identified 30 words first coined by Voltaire in his letters, and another 36 words first used in his other works, many of which are then reused in the correspondence. Emmanuel Macron has encouraged the description of himself as a ‘président jupitérien’, so it’s good to discover that ‘jupitérien’ is one of the words first coined by Voltaire.

A letter in Voltaire’s hand, sent from the city of Colmar to François Louis Defresnay (D5612, dated 1753/1754).

A reader of Voltaire’s letters cannot fail to be struck by the frequency of his literary quotations. We explore this phenomenon through the use of sequence alignment algorithms – similar to those used in bioinformatics to sequence genetic data – to identify similar or shared passages. Using the ARTFL-Frantext database of French literature as a comparison dataset, we attempt a detailed quantification and description of French literary quotations contained in Voltaire’s correspondence. These citations, taken together, give us a more comprehensive understanding of Voltaire’s literary culture, and provide invaluable insights into his rhetoric of intertextuality. No surprise that he quotes most often the authors of ‘le siècle de Louis XIV’, though it was a surprise to find that Les Plaideurs is the Racine play most frequently cited. And who expected to find two quotations from poems by Fontenelle (neither of them identified in the Besterman edition)?! Quotations in Latin also abound in Voltaire’s letters, many of these drawn, predictably enough, from the famous poets he would have memorised at school, Horace, Virgil, and Ovid – but we also identified quotations, hitherto unidentified, from lesser poets, such as a passage from Manilius’ Astronomica. By examining as a group the correspondents who receive Latin quotations, and assigning to them social and intellectual categories established by colleagues working at Stanford, we were able to establish clear networks of Latin usage throughout the correspondence, and confirm a hunch about the gendered aspect of quotation in Latin: Voltaire uses Latin only to his élite correspondents, and even then, with notably rare exceptions such as Emilie Du Châtelet, only to men.

The woman on the left, a trainee pilot in the Brazilian air force, is an unwitting beneficiary of Voltaire’s bravura use of Latin quotation. The motto of the Air Force Academy is a stirring (if slightly macho) Latin quotation: ‘Macte animo, generose puer, sic itur ad astra’ (Congratulations, noble boy, this is the way to the stars). The quotation is one that Voltaire uses repeatedly in some dozen letters, and it is found later, for example in Chateaubriand’s Mémoires d’outre-tombe. On closer investigation it turns out that this piece of Latin is an amalgam of quotations from Virgil and Statius – in effect, a piece of pure Voltairean invention.

In the end, Voltaire’s correspondence is undoubtedly one of his greatest literary masterpieces – but it is arguably one that only becomes fully legible through the use of digital resources and methods. Our intention with this book was to affirm the simple postulate that digital collections – whether composed of letters, literary works, or historical documents – can, and should, enable multiple reading strategies and interpretative points of entry, both close and distant. As such, digital resources should continue to offer inroads to traditional critical practices while at the same time opening up new, unexplored avenues that take full advantage of the affordances of the digital. Not only can digital humanities methods help us ask traditional literary-critical questions in new ways – benefitting from economies of both scale and speed – but, as we show in the book, they can also generate new research questions from historical content, providing interpretive frameworks that would have been impossible in a pre-digital world.

The size and complexity of Voltaire’s correspondence make it an almost ideal corpus for testing the two dominant modes of (digital) literary analysis: on the one hand, ‘distant’ approaches to the corpus as a whole and its relationship to a larger literary culture; on the other, fine-grained analyses of individual letters and passages that serve to contextualise the particular in terms of the general, and vice versa. The core question at the heart of the book is thus one that remains largely untreated in the wider world: how can we use digital ‘reading’ methods – both close and distant – to explore and better understand a literary object as complex and multifaceted as Voltaire’s correspondence?

– Nicholas Cronk & Glenn Roe, Co-directors of the Voltaire Lab at the VF

Voltaire’s correspondence: digital readings will be published in print and online at the end of October. The online version is available free of charge for two weeks to personal and institutional subscribers.

Digitising Candide


Candide, title page of edition 299L (see OCV, vol.48, p.88).

In what is arguably his most widely known work, Voltaire describes the extraordinary journey that his eponymous hero undertakes through geography and understanding, and for us digitising the novel is the first step on the long and – we hope and trust – exciting journey to digitise the whole of the complete works, the OCV. As such it has been a proof of concept, a baptism of reassuringly gentle fire, and a taste of things to come.

For a digital file that’s worth its bytes we need much more than just electronic words. We need a format that will encode structure and meaning so that people and – just as importantly – programs can understand the extra information we’re embedding into the file, and use it to help make readers’ and scholars’ use of the material richer and easier.

Thankfully many others have trodden a similar path. Since the 1980s countless digital humanities minds have contributed to the Text Encoding Initiative, simultaneously a sophisticated tag set for marking up scholarly material, and a community engaged in maintaining that model, supporting the people who use it, and improving it based on collective experience, wisdom, and usage. We had no need to invent a wheel – TEI is beautifully adapted for our journey. We used it to design a tailored model to suit the particular needs of the OCV and Digital d’Holbach. This is being applied for us by our supplier, Apex CoVantage, who are assembling a specialist team and developing automated tools to streamline the workflow, and using the first dozen volumes as tools to train both people and software. Candide was their introduction to this fascinating marriage of the Enlightenment and the computer.

The structural tagging – for things like introductions and notes – will allow readers to see as much or as little detail and complexity as they wish: at one end of the scale, just the edited version of Voltaire’s words; at the other, the full panoply of editorial introduction, notes, bibliographic citations, and textual variants; and a range of choices between the two extremes. It will also help readers navigate through and across the various parts of the volume, enabling their own particular journey.

Tagging for meaning – what we call the semantic tagging – is what will allow the dataset to communicate within itself, to other datasets, and also to humans. It’s what can help make search fully useful rather than just a literal echo of what a user types, and it can help a reader see a wider range of ‘next steps’ by making meaningful connections beyond those possible with just words and spaces. We tag people, places, dates, works, and institutions, and we’re also going to be developing a full set of metadata to accompany the datasets, as a rich and consistent layer describing the entire corpus in disciplined detail – we aim for this to be our contribution to the semantic web. We tag for primary and secondary content, and every piece of text has a language code associated with it so that if machine translation were applied to the dataset we could choose which parts of an edition are translated (e.g. the introduction) and which are left in the original language (e.g. primary content quotes). Again, our work enables control and choice.
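
As a rough illustration of how language codes make that choice possible, the Python sketch below reads a small TEI-style fragment and collects only the text whose language matches the editorial language, leaving the French quotation untouched. The fragment, the `type` values, and the function name are invented for this example and are not taken from the OCV model; `xml:lang` is the standard XML language attribute, and a child element without one inherits its parent's value.

```python
import xml.etree.ElementTree as ET

# The predefined XML namespace, as ElementTree reports it for xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical fragment: editorial prose in English, primary quotation in French.
SAMPLE = """<div type="introduction" xml:lang="en">
  <p>Voltaire closes the novel with a famous line.</p>
  <quote type="primary" xml:lang="fr">Il faut cultiver notre jardin.</quote>
  <p>The sentence has been much discussed.</p>
</div>"""


def translatable_segments(xml_text, editorial_lang="en"):
    """Collect element text whose effective language is the editorial one."""
    root = ET.fromstring(xml_text)
    segments = []

    def walk(element, inherited_lang):
        # An element without its own xml:lang inherits the parent's language.
        lang = element.get(XML_LANG, inherited_lang)
        text = (element.text or "").strip()
        if text and lang == editorial_lang:
            segments.append(text)
        for child in element:
            walk(child, lang)

    walk(root, None)
    return segments


print(translatable_segments(SAMPLE))
```

For brevity the sketch ignores text that follows a child element inside mixed content; a fuller version would need to handle that too.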


Part of the digital file of Candide, showing the end of the text of the novel.


The end of the novel in the Paris, Lambert, 1759 edition.

These two aspects turn a dataset into something akin to a machine (with the metadata as the auxiliary power unit), with multiple interlocking components that make it much easier for readers to summon or suppress the parts of the edition they need.

A machine needs precision in its gears and smoothness in its moving parts, and digitisation is revealing the odd snag and missing bolt where the tools we now have to analyse the workings were not available forty years ago. The exercise is therefore an opportunity to collate points we might wish to address in a revised edition (as well as revealing the occasional typographic error). But overall it’s gratifying how the abstract model we designed ahead of any full-scale digitisation has proved to be fit for purpose, and allows us to interrogate and improve the digital Candide by program, benefits which will increase exponentially as more volumes are added to the electronic corpus. The whole, we think, will be very much greater than the sum of its parts.

While the ultimate consumer of the digital files we’re creating will be human readers, the immediate consumers as intermediaries will be machines and processes, and even a cursory look at the ‘raw’ file of Candide shows you why. Character-for-character there is much more tagging than text, and for the eye simply to read the novel is near impossible; we keep tripping over indexing, line breaks, page breaks, emphasis, witness references … the list of tags is seemingly endless. What we see is ‘noise’ since we’re not programmed to filter one thing from another, but a program can be told to do exactly that, allowing any amount of filtering, cross-referencing, formatting, and even transformation to render the volume exactly as a reader requires. In order to ensure simplicity, but allow richness, and to enable choice, we have to make sure we start from complexity.
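
The kind of filtering described here can be sketched in a few lines of Python. The fragment below is invented for illustration (the element names follow common TEI usage, but the markup is not copied from the actual Candide file), and `reading_text` is a hypothetical helper: it walks the tree, drops whichever elements the reader has chosen to suppress, and returns plain reading text.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with a page break, an editorial note and emphasis.
SAMPLE = ('<p>Cela est bien dit, <pb/>répondit Candide, '
          '<note>An editorial note.</note> mais il faut '
          '<hi rend="italic">cultiver</hi> notre jardin.</p>')


def reading_text(xml_text, skip=("note",)):
    """Flatten the markup to plain text, omitting the elements named in skip."""
    root = ET.fromstring(xml_text)
    parts = []

    def walk(element):
        if element.tag not in skip:
            parts.append(element.text or "")
            for child in element:
                walk(child)
        # Text after an element belongs to the surrounding flow, so keep it
        # even when the element itself is suppressed.
        parts.append(element.tail or "")

    walk(root)
    # Collapse runs of whitespace left behind by removed elements.
    return " ".join("".join(parts).split())


print(reading_text(SAMPLE))            # the novel's words only
print(reading_text(SAMPLE, skip=()))   # the same passage with the note shown
```

The same tree could equally be rendered with the notes shown, the emphasis styled, or the page breaks marked; the point is only that the choice is made at display time, not at encoding time.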

Digitisation and the accompanying process of metadata curation are all about preserving content, extending reach, and adding value. If we get this right, we should be laying the foundations for globally accessible tools of immense richness which will add to – and not detract from – the core material and scholarship on which it is all built. We have a responsibility to use the digital tools available to help as many people as possible find, read, and understand the extraordinary legacy of Voltaire and his contemporaries. Il faut cultiver nos données.

Dan Barker, dancan Ltd.

The Digitizing Enlightenment ‘twitterstorm’ of 3 August 2020

This past week our publication partner, Liverpool University Press, shipped out copies of Digitizing Enlightenment: digital humanities and the transformation of eighteenth-century studies, edited by Simon Burrows and Glenn Roe, the July volume of Oxford University Studies in the Enlightenment.


Frontispiece and title page of the first edition of Rousseau’s Premier Discours, on the question ‘Si le rétablissement des sciences et des arts a contribué à épurer les mœurs’.

To help launch this important book, on Monday 3 August Burrows and Roe, joined by Melanie Conroy, one of the contributors, organized a ‘twitterstorm’, inviting dix-huitiémistes working on digital humanities projects of any sort to post links of their work on Twitter, tagged with #DigitizingEnlightenment.

Over the course of 48 hours stretching from first light Sunday morning in eastern Australia to midnight Monday night on the Pacific coast of the United States, 112 unique tweets were posted from 28 accounts. The sequence of posts may be read, in reverse chronological order, here.

To enlighten and enliven the discussion, and in the spirit of eighteenth-century intellectual exchange, the Voltaire Foundation sponsored a competition, asking for the most creative and thoughtful response to the question: ‘Has the rise of #dh been a boon or a barrier to #C18 studies?’

Twelve individuals posted responses, and the jury – consisting of Burrows, Roe and Conroy – deploying a sophisticated algorithm, ranked the entries and identified three runners-up and two winners.

The three runners-up were:

Helen Williams

https://twitter.com/helen189/status/1290261481062375425?s=20

As a first-gen scholar in the North East teaching & researching at a post-92 institution, #DigitizingEnlightenment is a boon, making the #18thcentury accessible & bringing diverse new voices, projects & approaches to scholarship & study. Many of us wouldn’t be here without it.

– Helen Williams (@helen189) August 3, 2020

Bryan Banks

https://twitter.com/BryanBanksPhD/status/1290245758059388929?s=20

Really excited to see this book come out!@SimonBu86342933 @glennhroe @MelanieConroy1 put the #DH in 𝐝ix-𝐡uitiemistes.

Today’s organized hashtag #DigitizingEnlightenment, like much DH work more broadly, makes the #18thC more legible and accessible to us today./1 https://t.co/IajlYLtPWk

– Bryan Banks (@BryanBanksPhD) August 3, 2020

Russell Goulbourne

https://twitter.com/FrenchProfessor/status/1290215320091635720?s=20

Definitely a boon – because it’s the #DH analysis of huge numbers of texts that allows us to see that it’s precisely in the 1760s, at the height of the Enlightenment, that boon comes to mean “a benefit enjoyed”. QED. #DigitizingEnlightenment

– Russell Goulbourne (@FrenchProfessor) August 3, 2020

And now the winners:

Chad Wellmon

As Kant wrote 200+ years ago, DH has been a boon to #C18 studies. It’s a no-brainer @VoltaireOxford: “It is so easy to be immature. If I have a [computer] that has understanding for me, surely I do not need to trouble myself.” I. Kant, “An Answer to the Question ‘What is DH?’” https://t.co/wIGQJDT7p4

– chad wellmon (@cwellmon) August 3, 2020

https://twitter.com/cwellmon/status/1290310792156450819?s=20

Meghan K. Roberts

I hate to be the lone skeptic, but I am concerned about the influence of #DH and #DigitizingEnlightenment on the field. Some projects are wonderful for research and teaching, but I worry that others place too much emphasis on an extremely select group of French philosophes.

– Meghan Roberts (@MeghanKRoberts) August 3, 2020

https://twitter.com/MeghanKRoberts/status/1290281237089665024?s=20

Both winners received copies of Digitizing Enlightenment as well as OSE’s June 2019 title, another volume of essays which deployed digital humanities methods to study the eighteenth century, Networks of Enlightenment, edited by Chloe Edmondson and Dan Edelstein.

As a supplement to the printed books, the data visualizations, tables and figures, as well as a portion of the text for each of these two volumes, are available in open access on the OSE ‘Digital Collaboration Hub’, built on the Manifold Scholar platform and hosted by Liverpool University Press. These may be accessed, appropriately, at http://digitizingenlightenment.com

Thanks to all who participated – and we all hope to be able to renew the annual ‘Digitizing Enlightenment’ symposium in July 2021, to be hosted at the University of Montpellier, in the context of the ‘Enquête sur la globalisation des Lumières’ initiative.

– Gregory S. Brown

NB: For the month of August, copies of Digitizing Enlightenment are available for purchase at a 25% discount. Purchasers in North America may order from the OUP-Global site using the code “DISTRO25” and purchasers anywhere else in the world, including UK, Europe and Australia, may order from the LUP site using the code “DIGITIZING25”.

Digitizing the Enlightenment

As country after country has gone into COVID-19 lockdown, we have all had to learn to communicate, network, teach, study and relate online in ways unimaginable a few short years – or even months – ago. This phenomenon is just the latest stage in the information-technology revolution and part and parcel of the ongoing development of an increasingly digital society. This revolution has touched almost every aspect of our lives, from how we work, study, shop and relax to how we make and maintain personal relationships. But it is also transforming scholarship and the way we conduct and communicate academic research. Thus, it is perhaps apt, and with consummate good timing, that Oxford University Studies in the Enlightenment has chosen to subject tag our new volume as ‘History of Scholarship (Principally of Social Sciences and Humanities)’. Yet this is certainly not how we and our collaborators envisaged our project at the outset, nor can any single tag capture the content of our volume and its collaborative agenda in its entirety.


The Digitizing Enlightenment workshop logo, designed by Evan Casey for the Voltaire Foundation, featured on the cover of Digitizing Enlightenment.

Ironically, as we write, Digitizing Enlightenment is also a living movement – or at least a loose network of scholars who meet annually in pursuit of a common agenda. That agenda was born in a series of conversations that took place from 2010, culminating in Dan Edelstein’s post-panel suggestion at the American Historical Association conference at Montreal in April 2014 that we should hold periodic meetings between like-minded digital projects relating to the Enlightenment. The aim of these meetings would be to establish common conventions and digital standards, with a view to linking our resources and realising the enormous and still largely untapped potential of Linked Open Data. Those present for Dan’s suggestion – Simon Burrows, Jeff Ravel, Sean Takats and Dan himself – have all provided chapters for our book, but much of the energy behind Digitizing Enlightenment since has come from Glenn Roe, whom Simon had first encountered a month earlier in Australia, where they had both recently taken up academic positions.

It was this fortuitous coincidence, underpinned by the fertile combination of Simon’s professorial establishment funds and Glenn’s energy, together with their mutual contact books, that led to Western Sydney University hosting the first Digitizing Enlightenment symposium in July 2016. Among the projects discussed there, and in our book, were large-scale treatments of Enlightenment correspondences, theatre attendance records, and textual corpora including the mid-eighteenth century Encyclopédie; bibliometric projects were presented on the production and dissemination of literature; together with presentations on mapping and data visualization growing out of these projects. The symposium was so well received that it has been an annual event ever since. It was held at Radboud University in Nijmegen (2017), Oxford (2018), and Edinburgh (2019). In 2020, but for COVID-19, it would have been held in Montpellier.

It was not entirely by chance that such a project coalesced around the guiding notion of the ‘Enlightenment’. For the long eighteenth century has been blessed by a number of high-profile and long-established digital projects. These include ground-breaking commercial datasets such as Gale-Cengage’s Eighteenth-Century Collections Online (ECCO), which features in several of our chapters, semi-commercial projects such as the Electronic Enlightenment and large academic consortiums such as the Franco-American ARTFL project. This made the Enlightenment a natural laboratory for exploring the possibilities and achievements of the Digital Humanities for transforming scholarship on a single historical era. Further, as our book emphasises, our discussions built on a long tradition of digital innovation in eighteenth-century studies that can be traced back at least as far as the twin Livre et société dans la France du XVIIIe siècle volumes produced by a team led by François Furet in 1965 and 1970. It might further be added that our over-arching subject material lends itself to digital-historical analysis; the Enlightenment might after all be viewed as the long-run culmination of the intellectual turmoil and – as several contributors point out – information overload unleashed by a previous technological and communications revolution.


Digitizing Enlightenment is the July volume in the Oxford University Studies in the Enlightenment series.

With this in mind, then, we offer up Digitizing Enlightenment: Digital Humanities and the Transformation of Eighteenth-Century Studies as rather more than a contribution to the history of scholarship. Certainly, we have offered a sample of Digital Humanities c. 2016-2020, as it relates to the technologies available and their application to Enlightenment studies broadly construed. In addition, the first half of the book offers detailed accounts of the origins and development of key Enlightenment digital projects up until that point, accompanied by valuable and sometimes disarming insights on the dangers and delights of digital research from foremost practitioners in the field. These chapters, as well as some later contributions, are helping to reshape some dominant meta-narratives of the Enlightenment, not least by hinting simultaneously at the enduring aristocratic leadership of the French Enlightenment and the extent to which Enlightenment literary production and consumption was infused with religious content. However, our contributors also showcase other ways that Digital Humanities scholarship is in the process of changing the field through the transparency, methodological rigour, and collaborative imperatives that are necessary concomitants of this new kind of research. Finally, the book offers a collaborative roadmap for future digital research – at a moment when, as our final contributor, Sean Takats, points out, the Enlightenment is fast losing its privileged position as the most richly digitized century of the modern era. As a corollary, we hope that our volume may be as useful to scholars of other periods as to Enlightenment scholars themselves.

– Simon Burrows (Western Sydney University) and Glenn Roe (Sorbonne University)

Simon Burrows and Glenn Roe are the editors of the July volume in the Oxford University Studies in the Enlightenment series, Digitizing Enlightenment: Digital Humanities and the Transformation of Eighteenth-Century Studies, which is the first book-length survey of the impact of digital humanities on our understanding of a key historical period and paradigm.

This post is reblogged from Liverpool University Press.

A la portée de tout le monde

That was then: d’Holbach in print…


When I came upon the baron d’Holbach in the early 1960s – my undergraduate senior thesis was on d’Holbach’s atheism and the response of Voltaire and others to the Système de la nature, a choice of subject that played a not insignificant role in my scholarly life – there were few secondary sources to guide me, and there were precious few of his works accessible from a university library. As a graduate student studying his salon, his atheism, and his relationships, the only passage to his writings required taking a paper number in the waiting room of the Bibliothèque Nationale, rue de Richelieu, praying for a siège to open up, and waiting patiently for tomes that might or might not be the right editions to be delivered to my desk. Where was this extraordinary digital resource when I most needed it?

The secondary works about d’Holbach still cited by scholars of the Enlightenment when I began my research had been written in 1875, 1914, 1928, 1935, 1943, 1946, and 1955. There was a near universal consensus that it would be in the coterie holbachique that I would find the major movement of organized atheism in the Enlightenment (there were indeed a handful of atheists there, but the vast majority of devotees of the salon were not), and, for most scholars, diversely conspiratorial efforts to bring down the Ancien Régime. As each of these operational hypotheses dissolved in the light of the evidence – to my great disappointment at the time – I understood that one must learn to read texts, archives, correspondence, and contemporaneous records without prepossessed ideas about d’Holbach or his world, however ‘settled’ the scholarship appeared to be. Too many historians, claiming d’Holbach for a particular ideological camp, cited and presented him selectively and tendentiously, leaving one wholly unprepared for and startled by the dimensions and aspects of his work that they did not address or explicate. This narrative extended to the coterie itself.

… and this is now: screenshot from Tout d’Holbach.


D’Holbach’s salon was not a collaborative conspiracy or undertaking, pace many a historian and literary historian. The recent release of the truly collaborative Tout d’Holbach database coupled with the Voltaire Foundation’s project to create a digital scholarly edition of d’Holbach’s works (Digital d’Holbach) will allow debates about d’Holbach’s meanings and intentions to engage a large number of scholars and students in a variety of fields, informed by the array of texts on religion, superstition, philosophy, ethics, politics, and happiness that he bequeathed to his contemporaries and to us. Claims about his scepticism or dogmatism, elitism or egalitarianism, pessimism or optimism, happiness as contentment or happiness as hedonistic pleasure, for example, now can receive truly critical reception based upon unfiltered access to his actual works. Those of us who wrote about these matters might well have hoped that readers and students would go beyond our own claims and citations, testing the value of these by setting what we explicated in the context of the larger text and intellectual context, but now that is possible on a whole new scale. It is an exciting opportunity, and it is not only à la portée de tous ceux qui s’y intéressent, but it will expand such interest sizably and noticeably.

– Alan Charles Kors
Henry Charles Lea Professor Emeritus of History, University of Pennsylvania

Gillian Pink at the Voltaire Foundation: thirteen years and counting

As we approach the completion of the Œuvres complètes de Voltaire, I sat down with team co-ordinator Gillian Pink to find out more about how joining the editorial team led to becoming a researcher in her own right.


Gillian Pink and Birgit Mikus.

You are one of the research editors working on the critical edition, a huge task. How did you come to work for the VF? Did you start editing OCV immediately?

I came to the VF almost by accident. I was studying for an MA in Publishing at Oxford Brookes University and Clare Fletcher, who was responsible for work placements on the MA, also did marketing here. She took one look at my CV – which at that point included work on a critical edition of an eighteenth-century sequel to Candide – and said ‘I think I know someone who would be very interested in this CV!’ That person turned out to be Janet Godden.

I arrived at 99 Banbury Road one afternoon in January 2007 for what I think I expected to be an interview, and was put to work straight away collating variants for Le Pyrrhonisme de l’histoire [since published in OCV, vol.67]. The rest, as they say… I did work briefly on Electronic Enlightenment before I started my full time employment on OCV in the autumn of that year, so an early introduction to digital editing, checking instances of words using non-Latin alphabets, as well as certain types of metadata.

So you have been at the VF for thirteen years – how many volumes have you worked on? Do you have a favourite text or volume?

Oh my! How many volumes… Taking a quick look at the shelves… twenty-five, perhaps, depending on your definition of ‘worked on’, and there are still a few more to go too. I don’t know if I have a single all-time favourite, but many favourites, which tend to be the ones I’ve contributed to as an author, rather than only as an in-house editor.


The complete set of Questions sur l’Encyclopédie on the VF bookshelf.

One of my favourite Voltaire texts, I suppose, would have to be the Questions sur l’Encyclopédie, a glorious collection of mostly short articles summing up his thoughts on just about every topic under the sun as he approached the end of his life. I had some involvement with all of the eight volumes that make up the set in OCV, was lead in-house editor on six of those and annotated articles in four. Last year, along with the general editors Nicholas Cronk and Christiane Mervaud, we published a version of this text for a wider readership with Robert Laffont. But I also love the very humorous poem ‘Le Pauvre Diable’ that I edited in volume 51A, and of course the notebook fragments just published in the latest volume, 84, and the marginalia in volumes 136-145 are close to my heart and research interests as well…

Tell me more about the marginalia, please! What is your research interest in them?

If you had told me when I first joined the VF that a few years down the line I’d have completed a D.Phil. and become an expert on Voltaire’s marginalia, I’d have found it quite hard to believe. As you may know, the project of publishing Voltaire’s marginal notes was begun by colleagues in St Petersburg at the National Library of Russia, but after the Berlin Wall came down, their publisher, Akademie Verlag, went through a period of upheaval and the project stalled. The VF picked it up and incorporated it (quite rightly) into OCV.

But the lady in St Petersburg who had been writing all the editorial notes had sadly died before she got to the final volume, so it was suggested that I might like to take this on as a doctoral project. In the end, I did a more typical thesis, while the annotation ended up being a separate project. Until then, while the marginalia had been studied to some degree, by far most of the articles published looked at Voltaire as a reader of a particular author. There was no proper study at that point looking at the marginalia as an ensemble, as a genre, looking for patterns in what we present as a corpus, although of course it wasn’t conceived as a corpus by Voltaire at all – rather like his correspondence in that way. And I was lucky to have an excellent supervisor in Nicholas [Cronk]. The result of all this was my book, Voltaire à l’ouvrage (Voltaire at work), which came out – nearly two years ago already!

Since then I played a leading role in bringing out a final volume of Voltaire’s marginalia in OCV, based on an even more disparate corpus, which is to say those books and manuscripts that for various reasons are not part of his library in St Petersburg, and so were not part of the original Russian project. While I still find marginalia fascinating for the direct insights they provide into readers’ responses to books (although they can’t always be taken completely at face value), I’m now extending this interest to reading notes in a broader sense, and Voltaire’s notebooks are a wonderfully challenging mix of reading notes, ideas of various sorts, and jottings that probably reflected snippets that he gleaned from oral sources.

We all know that the paper publication of OCV is nearing its completion this year. Do you have a new project lined up, for example regarding Voltaire’s notebooks you mentioned?

You’re quite right to ask. I do have several research ideas concerning the notebooks. I can’t go into too much detail because a couple of them need to be finalised with publishers and/or other colleagues, but I think there is much to be done in this area.

I’ll be talking about the notebooks at the annual ‘Journées Voltaire’ conference at the Sorbonne in June. I think the notebooks can be perceived as a bit ‘scary’, in part because of the wide variety of topics and the considerable lack of order within them, but also the fact that they were amongst the first volumes published in OCV. In those days scholarly practices didn’t demand the fuller sort of annotation that we tend to provide for readers nowadays, so Besterman’s notes are quite laconic and his perspective perhaps isn’t quite the one we would adopt these days either. For me, as someone whose approach tends to be based on material bibliography, I find it really helpful and revealing to look at the original manuscripts. Often, physical characteristics will strongly suggest – for example from the colour of the ink, the margins, the spacing – which sections were written at the same time, and so give a sense of which bits belong together or not. This is an area in which I hope our future digital edition of Voltaire’s complete works may build on the print and add real value, as there would be an opportunity to supplement the print transcription with digitised images.

Of course, the really interesting question to me is how Voltaire used his notebooks and other loose papers, how they were generated, and how they fed into his more public writings. I think there are still discoveries to be made in this area, and I’m lucky to be able to work with a great network of colleagues, from friends based in Voltaire’s library in St Petersburg, to digital humanities scholars at the Sorbonne and the University of Chicago, and research groups interested in textual genetics and the extract as a genre at ITEM [in Paris] and the IZEA [Halle, Germany]. So the future is full of exciting possibilities.

Birgit Mikus with Gillian Pink

The Salons Project: a digital approach to eighteenth-century French salons

We are currently finalising the programme for Digitizing Enlightenment IV, a day-long workshop that will take place on 15 July as part of the ISECS Congress in Edinburgh this summer. In order to expand our network of Digitizing Enlightenment projects and researchers, we encourage those working in any aspect of digital humanities across the interdisciplinary spectrum of eighteenth-century studies to attend the event, if in Edinburgh, or contact us for more information.

In the meantime, below is the second post in our series of follow-up discussions based on work presented at the Digitizing Enlightenment III workshop.

– Glenn Roe, Voltaire Lab

Eighteenth-century French salons have developed a mystical aura as sites of elite sociability and (more controversially) as potential workshops of Enlightenment philosophy. They were, however, ordinary face-to-face gatherings in many ways – not unlike unscheduled conferences and meetings with loose agendas today; the one consistent difference is that they were held in private homes instead of conference rooms and organized by individuals (normally women) rather than groups or committees. The nineteenth-century term “salon” grouped together a variety of meetings with certain characteristics: salons were held in private homes with relatively elite participants, conversation was the primary activity, and they occurred on set days and at times that were part of a larger social calendar. Aside from these very general characteristics, salons had a wide variety of purposes, publics, and activities.

Niclas Lafrensen [Nicolas Lavreince] (1737-1807), A French salon.

The most celebrated salons, notably Tencin’s, Graffigny’s, Geoffrin’s, and Lespinasse’s, have become associated with great writers, philosophes, and mathematicians, like Voltaire and D’Alembert. Antoine Lilti has challenged the view that salons were primarily counter-cultural venues for philosophical debate, showing that aristocratic traditions influenced notions of politesse in the salons and emphasizing the aristocratic habitus of many salon hostesses even when they had philosophes as guests. Disagreements over the character of salons may amount to differences of degree rather than of kind, since historians generally agree that the salons were mixed environments, but these debates do demonstrate the importance, now more than ever, of working through who was in attendance, in order to identify the social characteristics of eighteenth-century French salons.

I am the co-director with Chloe Edmondson of The Salons Project, a database of primarily eighteenth- and nineteenth-century European salon participants. We completed our pilot project of French salons from 1700 to 1800 last year and have some preliminary results, which will appear in the volume Digitizing Enlightenment, edited by Glenn Roe and Simon Burrows, in 2019. As expected, we found a great deal of evidence for social mixité in eighteenth-century salons, including patterns of mixed gender, age, occupation, interests, and social status. We also found that both women and literary figures were present in all of the major salons, including salons like Deffand’s which were not known for their openness to the philosophes. We found that nobles were present in all salons, as were gens de lettres, and that these people were often one and the same.

Our list of more than 600 salon participants is far from a complete record of eighteenth-century French salon attendees, but it is the largest and most complete database that we are aware of. The purpose of our study was not only to create a database, but also to create a method and a format for sharing data about salons and other informal networks. This method uses the robust data model created by the Electronic Enlightenment project, such that our data are compatible with the many other Enlightenment-era projects that are inspired by that database. We also use the schema “Procope”, which we developed along with Maria Teodora Comsa, Dan Edelstein, and Claude Willan to classify Early Modern European individuals, and which is described in our article “The French Enlightenment network”.

Salon, correspondence, and knowledge networks in French salons, 1650 to 1815 (data from The Salons Project, Conroy and Edmondson).

Within our larger dataset (1650 to 1815), we found that the letters and salon networks remained well integrated, and that philosophes, though a minority, were well integrated into the core of the network (see diagram). The most central figures are the ones whose networks are most associated with each field of knowledge (for example, Lespinasse’s salon is strongly associated with the “Letters_Philosophical” network, whereas Praslin’s is not; Voltaire’s correspondence network is more strongly associated with the encyclopédistes than is Necker’s; the Letters networks and “Letters_Philosophical” network are themselves tightly connected and central to salon networks). Whereas the best-known salons of the era were well integrated into the letters and philosophical networks, it is important to remember that many of the salon attendees were not otherwise part of the French Enlightenment network, especially women, lower-status individuals, family members of other salon participants, and foreigners. By adding these people to the records on eighteenth-century French sociability, we hope to open up new avenues for finding little-known social relations among participants on the edges of the Enlightenment. Even where we were not able to learn much about some of these more minor figures, including them in this preliminary dataset increases the chances that we will learn more about them in the future.

– Melanie Conroy, University of Memphis

Melanie Conroy is assistant professor of French at the University of Memphis and the co-director with Chloe Summers Edmondson (PhD candidate, Stanford University) of The Salons Project, a database of European salon participants. She can be reached at mrconroy@memphis.edu or @MelanieConroy. The Salons Project is online at salonsproject.org. The Salons Project is collaborative and invites new researchers to adopt its methods and share their data.