From Cyclopaedia to Encyclopédie: experiments in machine translation and sequence alignment

Figure 1. Title page from the 1745 prospectus of the first Encyclopédie project. This page image is taken from ARTFL’s 18th Volume of the Encyclopédie.

It is well known that the Encyclopédie ou dictionnaire raisonné des sciences, des arts et des métiers began first as a modest translation project of Ephraim Chambers’ Cyclopaedia in 1745. Over the next few years, Diderot and D’Alembert would replace the original editors and the project would be duly transformed from a simple translation into an effort to compile and organise the sum total of the world’s knowledge. Over the course of their editorial work, Diderot, and most notably D’Alembert, were not shy in incorporating these translations of the Cyclopaedia as filler for the Encyclopédie. Indeed, ‘ils ont laissé une bonne partie de ces articles presque inchangés, ou avec des modifications insignifiantes’ (Paolo Quintili, ‘D’Alembert “traduit” Chambers. Les articles de mécanique de la Cyclopædia à l’Encyclopédie’, Recherches sur Diderot et sur l’Encyclopédie 21 (1996), p.75). The philosophes were nonetheless conscious of their debt to their English predecessor Chambers. His name appears some 1154 times in the text of the Encyclopédie and he is referenced as sole or contributing source to 1081 articles, where his name appears in italics at the end of a section or article. Given the scale of the two works under consideration, systematic evaluation of the extent of the philosophes’ use of Chambers has remained, even today, a daunting task. John Lough, in 1980, framed the problem nicely: ‘So far no one has had the patience to make a detailed study of the exact relationship between the text of Diderot’s Encyclopédie and the work of Ephraim Chambers. This would no doubt require several years of arduous toil devoted to comparing the two works article by article’ (‘The Encyclopédie and Chambers’ Cyclopaedia’, SVEC 185 (1980), p.221).

Recent developments in machine translation and sequence alignment now offer new possibilities for the systematic comparison of digital texts across languages. The following post outlines some recent experimental work in leveraging these new techniques in an effort to reduce the ‘arduous toil’ of textual comparison, giving some preliminary examples of the kinds of results that can be achieved, and providing some cursory observations on the advantages and limitations of such systems for automatic text analysis.

Our two comparison datasets are the ARTFL Encyclopédie (v. 1117) and the recently digitised ARTFL edition of the 1741 Chambers’ Cyclopaedia (link). The 1741 edition was selected as it was one of the likely sources for the original translation project and we were able to work from high-quality page images provided by the University of Chicago Library. (On the possible editions of the Cyclopaedia used by the encyclopédistes, see Irène Passeron, ‘Quelle(s) édition(s) de la Cyclopœdia les encyclopédistes ont-ils utilisée(s)?’, Recherches sur Diderot et sur l’Encyclopédie 40-41 (2006), p.287-92.) In a nutshell, our approach was to generate a machine translation of all of the Cyclopaedia articles into French and then use ARTFL’s Text-PAIR sequence alignment system to identify similar passages between this virtual French Cyclopaedia and the Encyclopédie, with the translation providing links back to the original English edition of the Chambers as well as links to the relevant passages in the Encyclopédie.

For the English to French machine translation of Chambers, we examined two of the most widely used resources in this domain, Google Translate and DeepL. Both systems provide useful Application Programming Interfaces [APIs] as part of their respective subscription services, and both provide translations based on cutting-edge neural network language models. We compared results from various samples and found, in general, that both systems worked reasonably well, given the complications of eighteenth-century vocabularies (in both English and French) and many uncommon and archaic terms (this may be the subject of a future post). While DeepL provided somewhat more satisfying translations from a reader’s perspective, we ultimately opted to use Google Translate for the ease of its API and its ability to parse the TEI encoding of our documents with little difficulty. The latter is of critical importance, since we wanted to keep the overall document structure of our dictionaries to allow for easy navigation between the versions.

Operationally, we segmented the text of the Cyclopaedia into short blocks, split at paragraph breaks, and sent them for automatic translation via the Google API, with a short delay between blocks. This worked relatively well, though the system would occasionally throw timeout or other errors, which required a query resend. You can inspect the translation results here – though this virtual French edition of the Chambers is not really meant for public consumption. Each article has a link at the bottom to the corresponding English version for the sake of comparison. It is important to note that the objective here is NOT to produce a good translation of the text or even one that might serve as the basis for a human edition. Rather, this machine-generated edition exists as a ‘pivot-text’ between the English Chambers and the French Encyclopédie, allowing for an automatic comparison of the two (or three) versions using a highly fault-tolerant sequence aligner designed to pick out commonalities in very noisy document spaces. (See Clovis Gladstone, Russ Horton, and Mark Olsen, ‘TextPAIR (Pairwise Alignment for Intertextual Relations)’, ARTFL Project, University of Chicago, 2008-2021, and, more specifically, Mark Olsen, Russell Horton and Glenn Roe, ‘Something borrowed: sequence alignment and the identification of similar passages in large text collections’, Digital Studies / Le Champ numérique 2.1 (2011).)
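The segment, send, pause, and resend loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script actually used: the function names, the blank-line paragraph convention, the delay, and the retry limit are all hypothetical, and the `translate` argument stands in for whatever client call is used to reach the translation service.

```python
import time
from typing import Callable, List


def split_into_blocks(article_text: str) -> List[str]:
    """Split an article into blocks at blank-line paragraph breaks (an assumed convention)."""
    blocks = [block.strip() for block in article_text.split("\n\n")]
    return [block for block in blocks if block]


def translate_blocks(
    blocks: List[str],
    translate: Callable[[str], str],  # hypothetical stand-in for the real API call
    delay_seconds: float = 1.0,       # assumed pause between requests
    max_attempts: int = 3,            # assumed limit on resends
) -> List[str]:
    """Translate each block in turn, pausing between requests and resending on errors."""
    translated = []
    for block in blocks:
        for attempt in range(1, max_attempts + 1):
            try:
                translated.append(translate(block))
                break
            except Exception:
                # Timeouts and similar errors trigger a resend until attempts run out.
                if attempt == max_attempts:
                    raise
                time.sleep(delay_seconds)
        time.sleep(delay_seconds)
    return translated
```

Passing the translation call in as a parameter keeps the sketch independent of any particular service and makes the retry behaviour easy to exercise with a dummy function.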

The next step was to establish workable parameters for the Text-PAIR alignment system. The challenge here was to find commonalities between the French translations created by eighteenth-century authors and translators and machine translations produced by a modern automatic translation system. Additionally, the editors and authors of the Encyclopédie were not necessarily constrained to produce an exact translation of the text in question, but could, and did, make significant modifications to the original in terms of length, style, and content. To address this challenge we ran a series of tests with different matching parameters, such as n-gram construction (e.g., the number of words that constitute an n-gram), minimum match lengths, maximum gaps between matches, and decreasing match requirements as a match length increased (what we call a ‘flex gap’), among others, on a representative selection of 100 articles from the Encyclopédie where Chambers was identified as the possible source. It is important to note that even with the best parameters, which we adjusted to get favorable recall and precision results, we were only able to identify 81 of the 100 articles. (See comparison table. The primary parameters chosen were bigrams, stemmer=true, word len=3, maxgap=12, flexmatch=true, minmatchingngrams=5. Consult the TextPair documentation and configuration file for a description of these values.) Some articles, even where clearly affiliated, were missed by the aligner, due to the size of the articles (some are very small) and fundamental differences in the translation of the English. For example, the article ‘Compulseur’ is attributed by Mallet to Chambers, but the machine translation of ‘Compulsor’ is a rather more literal and direct translation of the English article than what is offered by Mallet. Further relaxing matching parameters could potentially find this example, but would increase the number of false positives, in effect drowning out the signal with increased noise.
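To make the interplay of these parameters more concrete, here is a toy sketch of n-gram construction and gap-tolerant matching. It is not TextPAIR's implementation: it omits stemming and the flex gap, it ignores the order of n-grams in the second text, and the parameter names merely echo the settings quoted above, with their exact meaning here being an assumption.

```python
from typing import List, Tuple

Ngram = Tuple[str, ...]


def make_ngrams(text: str, n: int = 2, min_word_len: int = 3) -> List[Ngram]:
    """Lowercase, strip punctuation, drop short words, and build word n-grams."""
    words = [w.strip(".,;:!?()'\"’‘").lower() for w in text.split()]
    words = [w for w in words if len(w) >= min_word_len]
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def longest_match_run(source: List[Ngram], target: List[Ngram], max_gap: int = 12) -> int:
    """Count the longest run of source n-grams also present in the target,
    where consecutive hits are separated by at most max_gap non-matching n-grams."""
    target_set = set(target)
    best = current = 0
    last_hit = None
    for position, ngram in enumerate(source):
        if ngram not in target_set:
            continue
        if last_hit is not None and position - last_hit - 1 > max_gap:
            current = 0  # gap too wide: start a new run
        current += 1
        best = max(best, current)
        last_hit = position
    return best


def is_candidate(source_text: str, target_text: str, min_matching_ngrams: int = 5) -> bool:
    """Flag a pair of passages when enough n-grams match within one run."""
    run = longest_match_run(make_ngrams(source_text), make_ngrams(target_text))
    return run >= min_matching_ngrams
```

Even in this simplified form the trade-off is visible: lowering `min_matching_ngrams` or raising `max_gap` lets short or loosely translated articles through, but also admits more chance matches.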

All things considered, we were quite happy with the aligner’s performance given the complexity of the comparison task and the multiple potential variations between historical text and modern machine translations. To give an example of how fine-grained and at the same time highly flexible our matching parameters needed to be, see the below article ‘Gynaecocracy’, which is a fairly direct translation on a rather specialised subject, but that nonetheless matched on only 8 content words (fig. 2).

Figure 2. Comparisons of the article ‘Gynaecocracy’.

Other straightforward articles were, however, missed owing to differences in the translation and sparse matching n-grams; see, for example, the small article on ‘Occult’ lines in geometry below, where the 6 matching words weren’t enough to constitute a match for the aligner (fig. 3).

Figure 3. Comparisons of the geometry article ‘Occult’.

Obviously this is a rather inexact science, reliant on an outside process of automatic translation and the ability to match a virtual text that in reality never existed. Nonetheless the 81% recall rate we attained on our sample corpus seemed more than sufficient for this experiment and allowed us to move forward towards a more general evaluation of the entirety of identified matches.

Once settled on the optimal parameters, we then used Text-PAIR to generate both an alignment database, for interactive examination, and a set of static files. Both of these result formats are used for this project. The alignment database contains some 7304 aligned passage pairs. The system allows queries on metadata, such as author and article title, as well as words or phrases found in the aligned passages. The system also uses faceted browsing to allow the user to summarize results by the various metadata (for more on this, see Note below). Each aligned passage is presented as a facing page representation and the user can toggle a display of all of the variations between the two aligned passages. As seen below, the variations between the texts can be extensive (fig. 4).

Figure 4. Text-PAIR interface showing differences in the article ‘Air’.

Text-PAIR also contextualises results back to the original document(s). For example, the following is the article ‘Almanach’ by D’Alembert, showing the aligned passage from Chambers in blue (fig. 5).

Figure 5. Article ‘Almanach’ with shared Chambers passages in blue.

In this instance, D’Alembert reused almost all of Chambers’ original article ‘Almanac’, with some minor variations, but does not appear to have indicated the source of the first part of his article (page image).

The alignment database is a useful first pass to examine the results of the alignment process, but it is limited in at least two ways. It identifies each aligned passage, but does not merge multiple passages identified in article pairs. Thus we find 5 shared passages between the articles ‘Constellation’. The interface also does not attempt to evaluate the alignments or identify passages that occur between different articles. For example, D’Alembert’s article ‘ATMOSPHERE’ indeed has a passage from Chambers’ article ‘Atmosphere’, but also many longer passages from the article ‘Generation’.
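Merging passages per article pair is, in principle, a simple grouping operation. The sketch below illustrates the idea only: the `Alignment` record, its field names, and the sample word counts are invented for the example and do not reflect the actual Text-PAIR output format or real results.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Alignment(NamedTuple):
    """Hypothetical record for one aligned passage."""
    chambers_headword: str
    encyclopedie_headword: str
    passage_words: int


def merge_by_article_pair(
    alignments: List[Alignment],
) -> Dict[Tuple[str, str], Dict[str, int]]:
    """Accumulate passage counts and total length for each article pair."""
    merged = defaultdict(lambda: {"passages": 0, "total_words": 0})
    for alignment in alignments:
        key = (alignment.chambers_headword, alignment.encyclopedie_headword)
        merged[key]["passages"] += 1
        merged[key]["total_words"] += alignment.passage_words
    return dict(merged)


# Invented records for illustration only; the word counts are not real results.
sample = [
    Alignment("Atmosphere", "ATMOSPHERE", 40),
    Alignment("Generation", "ATMOSPHERE", 120),
    Alignment("Generation", "ATMOSPHERE", 95),
]
summary = merge_by_article_pair(sample)
```

Keying on both headwords keeps same-article and cross-article borrowings distinct, so a case like ‘ATMOSPHERE’ shows up as two separate pairs rather than one undifferentiated list of passages.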

To accumulate results and to refine evaluation, we subsequently processed the raw Text-PAIR alignment data as found in the static output files. We developed an evaluation algorithm for each alignment, with parameters based on the length of the matching passages and the degree to which the headwords were close matches. This simple evaluation model eliminated a significant number of false positives, which we found were typically short text matches between articles with different headwords. The output of this algorithm resulted in two tables, one for matches that were likely to be valid and one that was less likely to be valid, based on our simple heuristics – see a selection of the ‘YES’ table below (fig. 6). We are, of course, making this distinction based on the comparison of the machine translated Chambers headwords and the headwords found in the Encyclopédie, so we expected that some valid matches would be identified as invalid.

Figure 6. Table of possible article borrowings.
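A heuristic of this kind, combining passage length with headword similarity, might look something like the following. This is a sketch under assumptions rather than the algorithm used for the project: the thresholds are hypothetical, the rule for combining the two signals is a guess, and headword closeness is measured here with Python's standard `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def headword_similarity(translated_headword: str, encyclopedie_headword: str) -> float:
    """Return a 0-1 similarity ratio between two headwords, ignoring case."""
    return SequenceMatcher(
        None, translated_headword.lower(), encyclopedie_headword.lower()
    ).ratio()


def classify_alignment(
    translated_headword: str,
    encyclopedie_headword: str,
    passage_words: int,
    min_similarity: float = 0.8,    # hypothetical threshold
    long_passage_words: int = 100,  # hypothetical threshold
) -> str:
    """Label an alignment 'YES' (likely valid) or 'NO' (less likely valid)."""
    if headword_similarity(translated_headword, encyclopedie_headword) >= min_similarity:
        return "YES"
    # Different headwords: keep only long shared passages, since short ones
    # between unrelated articles were the typical false positives.
    if passage_words >= long_passage_words:
        return "YES"
    return "NO"
```

Because the headword comparison runs on machine-translated headwords, a rule like this will inevitably send some valid matches to the ‘NO’ table, which is why a second pass over the rejected alignments remains necessary.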

The next phase of the project included the necessary step of human evaluation of the identified matches. While we were able to reduce the work involved significantly by generating a list of reasonably solid matches to be inspected, there is still no way to eliminate fully the ‘arduous toil’ of comparison referenced by Lough. More than 5000 potential matches were scrutinised, looking in essence for ‘false negatives’, i.e., matches that our evaluation algorithm classed as negative (based primarily on differences in headword translations) but that were in reality valid. The results of this work were then merged into a single table of what we consider to be valid matches, a list that includes some 3700 Encyclopédie articles with at least one matching passage from the Cyclopaedia. These results will form the basis of a longer article that is currently in preparation.

Conclusions

In all, we found some 3778 articles in the Encyclopédie that upon evaluation seem highly similar in both content and structure to articles in the 1741 edition of Chambers’ Cyclopaedia. Whether or not these articles constitute real acts of historical translation is the subject for another, or several other, articles. There are simply too many outside factors at play, even in this rather straightforward comparison, to make blanket conclusions about the editorial practices of the encyclopédistes based on this limited experiment. What we can say, however, is that of the 1081 articles that include a ‘Chambers’ reference in the Encyclopédie, we only found 689 with at least one matching passage. Obviously this recall rate of 63.7% is well below the 81% we attained on our sample corpus, probably due to overfitting the matching algorithm to the sample, which warrants further investigation. But beyond testing this ground truth, we are also left with the rather astounding fact of 3089 articles with no reference to Chambers whatsoever, all of which seem, at first blush, to be at least somewhat related to their English predecessors.
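The figures in this paragraph follow from simple arithmetic on the counts given above, which can be checked in a few lines (the variable names are ours, for illustration):

```python
articles_citing_chambers = 1081  # Encyclopédie articles with a 'Chambers' reference
cited_with_match = 689           # of those, articles with at least one matching passage
all_matched_articles = 3778      # all Encyclopédie articles with a validated match

recall = cited_with_match / articles_citing_chambers
uncited_matches = all_matched_articles - cited_with_match

print(f"Recall on cited articles: {recall:.1%}")          # 63.7%
print(f"Matches without a citation: {uncited_matches}")   # 3089
```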

The overall evaluation of these results remains ongoing, and the ‘arduous toil’ of traditional textual comparison continues apace, albeit guided somewhat by the machine’s heavy hand. Indeed, the use of machine translation as a bridge between documents to find similar passages, be they reuses, plagiarisms, etc., is, as we have attempted to show here, a workable approach for future research, although not without certain limitations. The Chambers–Encyclopédie task outlined above is fairly well constrained and historically bounded. More general applications of these same methods may well yield less useful results. These reservations notwithstanding, the fact that we were able to unearth many thousands of valid potential intertextual relationships between documents in different languages is a feat that even a few years ago might not have been possible. As large-scale language models become ever more sophisticated and historically aware, the dream of intertextual bridges between multilingual corpora may yet become a reality. (For more on ‘intertextual bridges’ in French, see our current NEH project.)

Note

The question of the Dictionnaire de Trévoux is one such factor, as it is known that both Chambers and the encyclopédistes used it as a source for their own articles – so matches we find between the Chambers and Encyclopédie may indeed represent shared borrowings from the Trévoux and not a translation at all. Or, more interestingly, perhaps Chambers translated a Trévoux article from French to English, which a dutiful encyclopédiste then translated back to French for the Encyclopédie – in this case, which article is the ‘source’ and which the ‘translation’? For more on these particular aspects of dictionary-making, see our previous article ‘Plundering philosophers: identifying sources of the Encyclopédie’, Journal of the Association for History and Computing 13.1 (Spring 2010) and Marie Leca-Tsiomis’ response, ‘The use and abuse of the digital humanities in the history of ideas: how to study the Encyclopédie’, History of European ideas 39.4 (2013), p.467-76.

– Glenn Roe and Mark Olsen

Annotation in scholarly editions and research

It has been, alas, almost exactly a year since our last face-to-face Besterman Workshop at 99 Banbury Road. Of course, webinars allow more people to join, and to do so, most importantly, from the comfort of their homes, where they can sit comfortably and set their thermostats to the temperature that suits them best. The advent of the Zoom/Teams era, however, has brought with it a number of unfortunate consequences: discussions are not as lively as they used to be, asking a follow-up question is nearly impossible, and so are chats with friends and colleagues, before, during, or after the talk. Worst of all, we no longer get a chance to eat our beloved Leibniz or Belgian biscuits – but those, to be fair, had already become something of a rarity towards the beginning of 2018. Anyway: those of you who did attend our last face-to-face Besterman Workshops may remember this gloomy and cumbersome poster of mine hanging from the mantelpiece.

This poster was presented at a conference in Wuppertal, Germany, at the end of February 2019: ‘Annotation in Scholarly Editions and Research: Function – Differentiation – Systematization’. Organised by Julia Nantke (Universität Hamburg) and Frederik Schlupkothen (Bergische Universität Wuppertal), this two-day bilingual Anglo-German colloquium was a wonderful occasion to reflect on the age-old human habit of glossing, commenting, and generally interfering with other people’s work.

Alongside some theoretical papers (to mention but one, Willard McCarty’s brilliant keynote lecture on annotation as a knowledge-producing practice), the symposium featured several more practice-oriented talks that would have certainly been of interest to many of our Digital Humanities followers: some focused on how best to structure and visualise annotation in digital scholarly editions; others raised the question as to how to annotate audio-visual materials; and yet others investigated the extent to which annotation can be automated.

Some of the papers given at the ‘Annotation in Scholarly Editions and Research’ conference can now be read in a volume published last year (yes, in 2020!) by De Gruyter and available in print as well as an Open Access eBook.

My own contribution to the volume (which you can find here, should you want to read it) presents what I think might be an efficient and user-friendly three-level annotation system, the ‘reversible annotation system’, which I developed while working on Digital d’Holbach, a born-digital scholarly edition of Paul-Henri Thiry d’Holbach’s complete works. On this model, I argue, a single set of notes can be so structured as to cater to very different audiences, meaning that the edition can hope simultaneously to be user-friendly and cost-efficient. Should you have any comments or suggestions for improvement, please do not hesitate to let me know!

Ruggero Sciuto, University of Oxford

The Comédie-Française by the numbers, 1680-1793

The Comédie-Française in 1790, by Antoine Meunier. (Bibliothèque en ligne Gallica, ARK btv1b10303194d)

Almost every evening at the playhouse of the Comédie-Française in Paris from 1680 to 1793, once the curtain had fallen and the theatre crowd had gone home, a designated member of the troupe retired to the box office (no doubt with a verre!) to count the evening’s proceeds, and enter the ticket sales by category in a folio-sized register. One hundred and thirteen of these registers, which allowed the troupe’s actors to divvy up the nightly proceeds, have remained in the possession of the troupe for over three centuries.

Register for the 1680-81 season (Paris, 1680).

During the past decade an international team of scholars and developers has made digital versions of the registers available on the website of the Comédie-Française Registers Project (CFRP), and extracted the data they contain into a searchable database. Now a new volume of open-access, bilingual essays, Databases, Revenues, and Repertory: The French Stage Online, 1680-1793 | Données, recettes et répertoire. La Scène en ligne (1680-1793), published exclusively online by the MIT Press, scrutinizes the data assiduously recorded by the eighteenth-century actors to come up with new and surprising conclusions about the business of the stage in the Age of Enlightenment, as well as observations about the potentials and perils of the digital humanities for contemporary scholarship.

Databases, Revenues and Repertory: The French Stage Online, 1680-1793 (MIT, 2020).

Scholars of the French eighteenth century know that the plays of the seventeenth-century greats, Molière, Racine, and Pierre Corneille, were frequently performed, but the troupe’s full repertory in this 113-year period consisted of more than 1000 plays written by over 300 authors, spread across more than 33,000 nightly performances. Essays in this new volume explore how politics, economics, and social conflict shaped the troupe’s repertory and affected its finances, and reveal some surprising conclusions. First, contributors Pierre Frantz and Lauren Clay underscore the fact that Voltaire, who wrote over two dozen plays that have largely been forgotten, was the financial mainstay of the troupe in the eighteenth century. By the second half of the century, revenue from the staging of his plays had overtaken that generated by the works of the seventeenth-century triumvirate, the authors that literary and theatre historians today tend to associate with the French theatre before 1800. The implication is that Voltaire was a box office draw because of his passion for political causes, thereby suggesting that the theatre was far more politicized in this period than we may have imagined.

The Crowning of Voltaire after the sixth performance of Irène in 1778, by Charles-Etienne Gaucher (1741-1804), after Jean-Michel Moreau (1741-1814). (Art Institute of Chicago, public domain)

Second, as economic historian François Velde points out, this extraordinarily complete business archive, detailing the expenditures and revenues of a major cultural enterprise over more than a century, offers important financial and economic insights into Enlightenment France. After 1750 the box office revenues of the troupe grew every year, suggesting both increasing prosperity and growing interest in cultural activity among many classes in the decades leading up to the French Revolution of 1789. The actors adapted accordingly, adjusting ticket prices and altering their repertory to appeal to changing public taste. The nightly record of plays staged and box office receipts provides surprising insight into the changing political culture of eighteenth-century France.

This volume and the initial phase of the CFRP were focused on the nightly box office receipt data for 113 seasons. An essay by project co-director Jeffrey Ravel in the recent Oxford University Studies in the Enlightenment volume Digitizing Enlightenment: Digital humanities and the transformation of eighteenth-century studies (eds. Simon Burrows and Glenn Roe), charts the history of the project and addresses questions of audience in the digital humanities. In subsequent phases of the CFRP, already underway, the team will be recording data on the troupe’s daily expenditures and its casting decisions for each night’s plays. The expenditure data, when analyzed alongside the box office receipts, will tell us much more about the troupe’s aesthetic and financial decisions during this key period of French political and cultural history. The record of casting choices promises important insights into the history of celebrity and its financial impact on political and cultural institutions in both the past and the present. The team will also be digitizing the registers from 1799 through 1914, thereby providing an unparalleled run of over two centuries of box office receipt data for one of the major theatrical and cultural institutions in the world in this period.

If only those lonely, tired actors counting their livres tournois each evening had known the uses to which their labours would be put by interested scholars three hundred years later!

Jeffrey S. Ravel

Exploring Voltaire’s letters: between close and distant readings

‘La lettre au fil du temps: philosophe.’

A stamp produced by the French post office in 1998 celebrates the art of letter-writing by depicting Voltaire writing letters with both hands. It’s true that Voltaire wrote a lot of letters – over 15,000 are known, and more turn up all the time – but even so it’s not altogether clear that an ambidextrous letter-writer is someone we entirely want to trust. Voltaire’s correspondence is full of difficulties and traps, and faced by such a huge corpus, it is hard to know where to start. Without question, the Besterman ‘definitive’ edition (1968-77), digitised in Electronic Enlightenment, has had a major impact on Enlightenment scholarship: historians and literary critics make frequent use of these letters, but usually in an instrumental way, adducing a single passage in a letter as evidence in support of a date or an interpretation.

Nicholas Cronk and Glenn Roe, Voltaire’s correspondence: digital readings (CUP, 2020).

Voltaire’s letters can be notoriously ‘unreliable’, however, and they really need to be read and interpreted – like all his texts – as literary performances. Few critics have attempted to examine the corpus of the correspondence in its entirety and to understand it as a literary whole. In our new book, Voltaire’s correspondence: digital readings, we have experimented with a range of digital humanities methods, to explore to what extent they might help us identify new interpretative approaches to this extraordinary correspondence. The size of the corpus seems intimidating to the critic, but it is precisely this that makes these texts a perfect test-case for digital experimentation: we can ask questions that we would simply not have been able to ask before.

For example, we looked at the way Voltaire signs off his letters – and were surprised to find that only 13% of the letters are actually signed ‘Voltaire’; while over a third of the letters are signed with a single letter, ‘V’. Then Voltaire is hugely inventive in the way he plays with the rules of epistolary rhetoric, posing as a marmot to the duc de Choiseul. And if you want to know why in a letter (D18683) to D’Alembert he signs off ‘Miaou’, the answer is to be found in a fable by La Fontaine…

We studied Voltaire as a neologist. Critics have usually described Voltaire as an arch-classicist adhering rigorously to the norms of seventeenth-century French classicism. True, yet at the same time he is hugely energetic in coining new words, an aspect of his literary style that has been insufficiently studied. Here, corpus analysis tools, coupled with available lexicographical digital resources, allow us to consider Voltaire’s aesthetic of lexical innovation. In so doing, we can test the hypothesis that Voltaire uses the correspondence as a laboratory in which he can experiment with new formulations, ideas, and words – some of which then pass into his other works. We identified 30 words first coined by Voltaire in his letters, and another 36 words first used in his other works, many of which are then reused in the correspondence. Emmanuel Macron has encouraged the description of himself as a ‘président jupitérien’, so it’s good to discover that ‘jupitérien’ is one of the words first coined by Voltaire.

A letter in Voltaire’s hand, sent from the city of Colmar to François Louis Defresnay (D5612, dated 1753/1754).

A reader of Voltaire’s letters cannot fail to be struck by the frequency of his literary quotations. We explore this phenomenon through the use of sequence alignment algorithms – similar to those used in bioinformatics to sequence genetic data – to identify similar or shared passages. Using the ARTFL-Frantext database of French literature as a comparison dataset, we attempt a detailed quantification and description of French literary quotations contained in Voltaire’s correspondence. These citations, taken together, give us a more comprehensive understanding of Voltaire’s literary culture, and provide invaluable insights into his rhetoric of intertextuality. No surprise that he quotes most often the authors of ‘le siècle de Louis XIV’, though it was a surprise to find that Les Plaideurs is the Racine play most frequently cited. And who expected to find two quotations from poems by Fontenelle (neither of them identified in the Besterman edition)?! Quotations in Latin also abound in Voltaire’s letters, many of these drawn, predictably enough, from the famous poets he would have memorised at school, Horace, Virgil, and Ovid – but we also identified quotations, hitherto unidentified, from lesser poets, such as a passage from Manilius’ Astronomica. By examining as a group the correspondents who receive Latin quotations, and assigning to them social and intellectual categories established by colleagues working at Stanford, we were able to establish clear networks of Latin usage throughout the correspondence, and confirm a hunch about the gendered aspect of quotation in Latin: Voltaire uses Latin only to his élite correspondents, and even then, with notably rare exceptions such as Emilie Du Châtelet, only to men.

The woman on the left, a trainee pilot in the Brazilian air force, is an unwitting beneficiary of Voltaire’s bravura use of Latin quotation. The motto of the Air Force Academy is a stirring (if slightly macho) Latin quotation: ‘Macte animo, generose puer, sic itur ad astra’ (Congratulations, noble boy, this is the way to the stars). The quotation is one that Voltaire uses repeatedly in some dozen letters, and it is found later, for example in Chateaubriand’s Mémoires d’outre-tombe. On closer investigation it turns out that this piece of Latin is an amalgam of quotations from Virgil and Statius – in effect, a piece of pure Voltairean invention.

In the end, Voltaire’s correspondence is undoubtedly one of his greatest literary masterpieces – but it is arguably one that only becomes fully legible through the use of digital resources and methods. Our intention with this book was to affirm the simple postulate that digital collections – whether comprised of letters, literary works, or historical documents – can, and should, enable multiple reading strategies and interpretative points of entry; both close and distant readings. As such, digital resources should continue to offer inroads to traditional critical practices while at the same time opening up new, unexplored avenues that take full advantage of the affordances of the digital. Not only can digital humanities methods help us ask traditional literary-critical questions in new ways – benefitting from economies of both scale and speed – but, as we show in the book, they can also generate new research questions from historical content; providing interpretive frameworks that would have been impossible in a pre-digital world.

The size and complexity of Voltaire’s correspondence make it an almost ideal corpus for testing the two dominant modes of (digital) literary analysis: on the one hand, ‘distant’ approaches to the corpus as a whole and its relationship to a larger literary culture; on the other, fine-grained analyses of individual letters and passages that serve to contextualise the particular in terms of the general, and vice versa. The core question at the heart of the book is thus one that remains largely untreated in the wider world: how can we use digital ‘reading’ methods – both close and distant – to explore and better understand a literary object as complex and multifaceted as Voltaire’s correspondence?

– Nicholas Cronk & Glenn Roe, Co-directors of the Voltaire Lab at the VF

Voltaire’s correspondence: digital readings will be published in print and online at the end of October. The online version is available free of charge for two weeks to personal and institutional subscribers.

From the mundane to the philosophical: topic-modelling Voltaire and Rousseau’s correspondence

Voltaire’s and Rousseau’s correspondences are two fascinating collections which have perhaps not received as much attention as they could have, owing to the nature of these texts. Written over five decades, these letters cover a wide range of topics, from the mundanity of everyday concerns to more elaborate subjects. Getting an overall picture of these correspondences is challenging for the ordinary reader. This is unfortunate, since these correspondences not only constitute a window into the private lives of Voltaire and Rousseau and show an unfiltered expression of their respective thoughts, but are also an example of the eclecticism professed by the philosophes. Fortunately, modern computational techniques can help in providing an overview of the content of these letters and hopefully recapture – in a somewhat organized fashion – this very eclecticism of the Lumières. Thanks to the collaboration between the Voltaire Foundation and the ARTFL Project, I will briefly discuss how topic-modeling can be used to draw an overall picture of these correspondences, and show a couple of examples from the model built from the Voltaire letters.

The ARTFL Project has long been engaged in exploring 18th-century discourses using digital tools, and the thematic opacity of correspondences is an ideal use-case for topic-modelling. This particular algorithm was designed to generate clusters of closely related words (or topics) by analyzing all word co-occurrences in any given corpus. Because these topics are extracted from their source texts, they are understood to describe the contents of the corpus analyzed. We recently released a topic-modelling browser – called TopoLogic – which was designed to explore such clusters of co-occurring words, and ran a preliminary experiment against the French Revolutionary Collection, the results of which can be seen here. When we built the topic models for Voltaire and Rousseau’s correspondences, we made sure to use the same parameters for both collections such that 40 topics (or discourses) were generated from each set of letters. We also only used those letters written by Voltaire on one side, and Rousseau on the other, hoping that we could perhaps make some comparisons between both models.
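To make the idea of word co-occurrence concrete, here is a minimal sketch of the raw signal a topic model draws on: counting how often two words appear in the same letter. It is not the TopoLogic implementation, and the three miniature "letters" are invented for illustration only; a real topic model goes further and infers weighted clusters from such patterns.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count how often each unordered word pair appears in the same document."""
    counts = Counter()
    for text in documents:
        # Each distinct word counts once per document; sorting keeps pair order stable.
        words = sorted(set(text.lower().split()))
        counts.update(combinations(words, 2))
    return counts


# Hypothetical miniature "letters", invented for illustration only.
letters = [
    "philosophe religion tolérance",
    "philosophe religion livre",
    "santé médecin",
]
pairs = cooccurrence_counts(letters)
```

Here the pair (philosophe, religion) is counted twice because both words occur together in two letters, while words that never share a letter get no count at all. Clusters of words that keep company in this way are what the model surfaces as topics.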

Let’s start with the Voltaire model, from which you can see the first 20 topics below:

As a first view into the topic model, the browser gives us the top 10 words for each topic, as well as their overall prevalence in the letters by Voltaire. From there we can further explore any topic, such as 16, which seems to map to Voltaire’s idea of the philosophe fighting against religious intolerance. By clicking on the topic, however, we get an overview of how the topic is distributed in time, the most important words in the topic, correlated topics, as well as documents where the topic is prominent (see figure below).

Let’s focus on several sections of this overview. We note below that the terms philosophe and philosophie are weighted far more heavily than any other term, suggesting perhaps that all other words in this cluster may just constitute different characteristics of the philosophe in Voltaire’s eyes: religious concerns (prêtre, jésuite, religion, tolérance), attributes (honnête, sage), means of expression (article, livre).

All of these observations can of course be verified by exploring letters that feature topic 16 in a prominent way, which the browser does list. We can also see how the philosophe discourse evolves over the more than sixty years of Voltaire’s letters. Unsurprisingly, as his public involvement in religious affairs increases, the prevalence of such terms discussing his idea of the philosophe rises as well in his letters.
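The evolution of a topic over time can be approximated by averaging its weight across the letters of each year. The sketch below assumes each letter is reduced to a year and a dictionary of topic weights; the years and weights are invented for illustration and are not taken from the Voltaire model, and the browser may compute its timeline differently.

```python
from collections import defaultdict


def prevalence_by_year(letters, topic):
    """Return the mean weight of one topic per year, ordered by year."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for year, weights in letters:
        # A letter that does not feature the topic contributes a weight of 0.
        totals[year] += weights.get(topic, 0.0)
        counts[year] += 1
    return {year: totals[year] / counts[year] for year in sorted(totals)}


# Hypothetical (year, topic weights) records, invented for illustration only.
sample_letters = [
    (1750, {16: 0.1}),
    (1750, {16: 0.3}),
    (1765, {16: 0.5, 5: 0.2}),
]
trend = prevalence_by_year(sample_letters, topic=16)
```

Plotting such yearly averages for topic 16 would give the kind of rising curve described above.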

Among the discourses which tend to follow the same trend over time (see figure below), the cluster of terms related to justice (topic 5) stands out, once again showing that his public involvement is mirrored in his private correspondence. While these aspects are nothing really new, they provide for the prospective reader an easy way to find those letters that do discuss these topics.

Another interesting aspect of topic-modeling is that we can also examine the discursive make-up of any of Voltaire’s letters, and see if there are any other letters that share the same themes. Let’s examine Voltaire’s famous letter to Rousseau in which he mocks the citoyen de Genève’s position on the impact of literature in the second discourse (see figure below): ‘Les Lettres nourissent l’âme, la rectifient, la consolent’.

When we look at the topical representation of this letter in the browser, we can note that the model found a number of different topics within this letter, which when combined do provide an overview of its contents. In it, Voltaire discusses – with much irony – his own experience as a writer (topic 33), which includes his role as historiographe du roi (topic 36), as well as the many controversies he was involved in (topic 10). He sarcastically laments the fact that he cannot afford to live with savages in a distant land (topic 25) because his health requires him to be treated by a doctor (topics 26 and 35). And as a whole, he defends the role of literature as a positive good for man (topic 0). Of course, one could argue that this topical structure is approximate and open to discussion, and this is certainly true. However, this approximation is now available for all 15,000 letters, which then allows the computer to compare and group letters by this very topical structure. In this same document view, we can see documents which share a similar mixture of topics, such as a letter to Ivan Shuvalov from 1757 where Voltaire discusses his writing of history while displaying a very keen concern for the perception and impact of his writing, or another to D’Alembert where he complains about his bad health while stressing the importance of writing about useful things (‘il y avait cent choses utiles à dire qu’on n’a point dittes encore’).
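Comparing letters by their topical structure amounts to comparing vectors of topic weights. The sketch below uses cosine similarity, which is an assumption on our part: the browser may rely on a different measure. The topic mixtures and letter names are hypothetical and invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two topic-weight vectors (1.0 means same mixture)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(target, candidates):
    """Return candidate names ordered from most to least similar to the target."""
    return sorted(
        candidates,
        key=lambda name: cosine_similarity(target, candidates[name]),
        reverse=True,
    )


# Hypothetical mixtures over four topics, invented for illustration only.
target_letter = [0.5, 0.3, 0.2, 0.0]
other_letters = {
    "letter_a": [0.4, 0.4, 0.2, 0.0],
    "letter_b": [0.0, 0.1, 0.1, 0.8],
}
ranking = most_similar(target_letter, other_letters)
```

A letter dominated by the same topics as the target ranks first, whatever its length or vocabulary, which is what makes this kind of grouping possible across thousands of letters.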

One last aspect of the topic model is to examine the individual uses of words and the different contexts in which they are used. If we look at the uses of écrivain in the correspondences (see figure below), we can see that its uses span different types of discourses related to reason, the writing of history, or the public role of the writer. Looking at the actual word associations, we also note potentially interesting patterns. In the case of words that share similar topic distributions (used with a similar mix of discourses), a group of terms related to ignorance seems to dominate: fausseté, mensonge, ignorance, vérité, erreur, fable… This may allude to a sense of mission in Voltaire’s writings: to correct inaccuracies, to dispel lies, to reestablish the truth in the face of ignorance. Looking this time at words that tend to co-occur with écrivain, we get a very different picture, with terms that relate more to the activity of writing and the product of that writing. These two views on word associations do not contradict one another, but suggest different ways of thinking of the role of the écrivain as depicted in Voltaire’s letters.

To finish, let’s take a look at the topic model of Rousseau’s correspondence, and in particular how we can relate it to that of Voltaire. A quick overview of the first 20 topics in Rousseau’s letters reveals a similar – yet distinct – picture of the topical composition of his correspondence (see figure below).

Using the browser, we could track down Rousseau’s response to Voltaire’s criticism of the second discourse, and see if other letters discuss similar themes. This is all within the scope of this browser. For the sake of brevity however, and to show how topic models can be used to run comparative experiments, we wanted to focus on Rousseau’s usage of the word écrivain in order to see if and how it differed from what was suggested in the Voltaire model. As we can see below, Rousseau tends to use the term in similar contexts: the écrivain is invoked first and foremost as a conveyor of truth. But looking more closely at word associations, a distinctive pattern does emerge: such terms as lâche, haine, hypocrite, acharnement, or jalousie highlight a well-known trait of Rousseau, his paranoia in the face of his success as a writer. Clicking on any of these words in the browser would allow a researcher to track down the individual uses of these terms as they relate to écrivain, and find those letters that discuss his persecution complex.

To conclude, we are well aware that any analysis provided here is built purely on the patterns derived from the topic models and, as such, remains unproven until verified by a close reading of the letters themselves. However, we hope to have shown how using a tool such as topic modeling can potentially provide new insights into the correspondences of Voltaire and Rousseau, or at the very least offer better guidance to scholars working on these two incredibly rich collections.

Clovis Gladstone

This article was first published in the Café Lumières blog in June 2020.

Clovis Gladstone’s Rousseau et le matérialisme appeared in Oxford University Studies in the Enlightenment 2020:8.

 

Digitising Candide

Candide, title page of edition 299L (see OCV, vol.48, p.88).

In what is arguably his most widely known work, Voltaire describes the extraordinary journey that his eponymous hero undertakes through geography and understanding, and for us digitising the novel is the first step on the long and – we hope and trust – exciting journey to digitise the whole of the complete works, the OCV. As such it has been a proof of concept, a baptism of reassuringly gentle fire, and a taste of things to come.

For a digital file that’s worth its bytes we need much more than just electronic words. We need a format that will encode structure and meaning so that people and – just as importantly – programs can understand the extra information we’re embedding into the file, and use it to help make readers’ and scholars’ use of the material richer and easier.

Thankfully many others have trodden a similar path. Since the 1980s countless digital humanities minds have contributed to the Text Encoding Initiative, simultaneously a sophisticated tag set for marking up scholarly material, and a community engaged in maintaining that model, supporting the people who use it, and improving it based on collective experience, wisdom, and usage. We had no need to invent a wheel – TEI is beautifully adapted for our journey. We used it to design a tailored model to suit the particular needs of the OCV and Digital d’Holbach. This is being applied for us by our supplier, Apex CoVantage, who are assembling a specialist team and developing automated tools to streamline the workflow, and using the first dozen volumes as tools to train both people and software. Candide was their introduction to this fascinating marriage of the Enlightenment and the computer.

The structural tagging – for things like introductions and notes – will allow readers to see as much or as little detail and complexity as they wish, choosing anything from, at one end of the scale, just the edited version of Voltaire’s words to, at the other, the full panoply of editorial introduction, notes, bibliographic citations, and textual variants, with a range of options between the two extremes. It will also help readers navigate through and across the various parts of the volume, enabling their own particular journey.

Tagging for meaning – what we call the semantic tagging – is what will allow the dataset to communicate within itself, to other datasets, and also to humans. It’s what can help make search fully useful rather than just a literal echo of what a user types, and it can help a reader see a wider range of ‘next steps’ by making meaningful connections beyond those possible with just words and spaces. We tag people, places, dates, works, and institutions, and we’re also going to be developing a full set of metadata to accompany the datasets, as a rich and consistent layer describing the entire corpus in disciplined detail – we aim for this to be our contribution to the semantic web. We tag for primary and secondary content, and every piece of text has a language code associated with it so that if machine translation were applied to the data set we can choose which parts of an edition are translated (e.g. the introduction) and which are left in the original language (e.g. primary content quotes). Again, our work enables control and choice.
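As an illustration of how a language code on every piece of text enables this kind of choice, here is a minimal sketch that selects passages by language from a simplified, TEI-like fragment. The fragment, the `type` values and the function name are hypothetical and are not taken from the OCV model; real TEI files also declare the TEI namespace, which is omitted here for brevity.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the standard xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical, simplified TEI-like fragment, invented for illustration only.
sample = """<text>
  <div type="introduction" xml:lang="en">
    <p>Candide was first published in 1759.</p>
  </div>
  <div type="primary" xml:lang="fr">
    <p>Il faut cultiver notre jardin.</p>
  </div>
</text>"""


def passages_by_language(xml_string, lang):
    """Return the paragraph texts of every div whose xml:lang matches lang."""
    root = ET.fromstring(xml_string)
    results = []
    for div in root.iter("div"):
        if div.get(XML_LANG) == lang:
            results.extend("".join(p.itertext()).strip() for p in div.iter("p"))
    return results


# Only the English editorial matter would be handed to a translation step.
to_translate = passages_by_language(sample, "en")
```

The French primary text is left untouched simply because it never matches the requested language code.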

Part of the digital file of Candide, showing the end of the text of the novel.

The end of the novel in the Paris, Lambert, 1759 edition.

These two aspects turn a dataset into something akin to a machine (with the metadata as the auxiliary power unit), with multiple interlocking components that make it much easier for readers to summon or suppress the parts of the edition they need.

A machine needs precision in its gears and smoothness in its moving parts, and digitisation is revealing the odd snag and missing bolt where the tools we now have to analyse the workings were not available forty years ago. The exercise is therefore an opportunity to collate points we might wish to address in a revised edition (as well as revealing the occasional typographic error). But overall it’s gratifying how the abstract model we designed ahead of any full-scale digitisation has proved to be fit for purpose, and allows us to interrogate and improve the digital Candide by program, benefits which will increase exponentially as more volumes are added to the electronic corpus. The whole, we think, will be very much greater than the sum of its parts.

While the ultimate consumer of the digital files we’re creating will be human readers, the immediate consumers as intermediaries will be machines and processes, and even a cursory look at the ‘raw’ file of Candide shows you why. Character-for-character there is much more tagging than text, and for the eye simply to read the novel is near impossible; we keep tripping over indexing, line breaks, page breaks, emphasis, witness references … the list of tags is seemingly endless. What we see is ‘noise’ since we’re not programmed to filter one thing from another, but a program can be told to do exactly that, allowing any amount of filtering, cross-referencing, formatting, and even transformation to render the volume exactly as a reader requires. In order to ensure simplicity, but allow richness, and to enable choice, we have to make sure we start from complexity.
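A short sketch shows how a program can filter the ‘noise’ and recover a plain reading text. The fragment below is hypothetical: the tag names follow common TEI usage (`pb` for page break, `note`, `hi` for emphasis), but the page number and the note are invented, and the real Candide file is far richer.

```python
import xml.etree.ElementTree as ET

# Elements left out of the reading text: notes, page breaks, line breaks.
SKIP = {"note", "pb", "lb"}

# Hypothetical fragment, invented for illustration only.
sample = (
    '<p>Cela est bien dit,<pb n="237"/> répondit '
    '<hi rend="italic">Candide</hi>,<note>Hypothetical editorial note.</note>'
    " mais il faut cultiver notre jardin.</p>"
)


def reading_text(element):
    """Concatenate text, dropping skipped elements but keeping the text after them."""
    parts = [element.text or ""]
    for child in element:
        if child.tag not in SKIP:
            parts.append(reading_text(child))
        # The tail is the running text that follows the child inside its parent.
        parts.append(child.tail or "")
    return "".join(parts)


def render(xml_string):
    """Return the reading text with whitespace normalised."""
    root = ET.fromstring(xml_string)
    return " ".join(reading_text(root).split())


plain = render(sample)
```

Changing the contents of `SKIP` is all it takes to show or hide the notes, which is the sense in which starting from complexity allows both simplicity and choice.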

Digitisation and the accompanying process of metadata curation are all about preserving content, extending reach, and adding value. If we get this right, we should be laying the foundations for globally accessible tools of immense richness which will add to – and not detract from – the core material and scholarship on which it is all built. We have a responsibility to use the digital tools available to help as many people as possible find, read, and understand the extraordinary legacy of Voltaire and his contemporaries. Il faut cultiver nos données.

Dan Barker, dancan Ltd.

The Digitizing Enlightenment ‘twitterstorm’ of 3 August 2020

This past week our publication partner, Liverpool University Press, shipped out copies of Digitizing Enlightenment: digital humanities and the transformation of eighteenth-century studies, edited by Simon Burrows and Glenn Roe, the July volume of Oxford University Studies in the Enlightenment.

Frontispiece and title page of the first edition of Rousseau’s Premier Discours, on the question ‘Si le rétablissement des sciences et des arts a contribué à épurer les mœurs’.

To help launch this important book, on Monday 3 August Burrows and Roe, joined by Melanie Conroy, one of the contributors, organized a ‘twitterstorm’, inviting dix-huitiémistes working on digital humanities projects of any sort to post links of their work on Twitter, tagged with #DigitizingEnlightenment.

Over the course of 48 hours stretching from first light Sunday morning in eastern Australia to midnight Monday night on the Pacific coast of the United States, 112 unique tweets were posted from 28 accounts. The sequence of posts may be read, in reverse chronological order, here.

To enlighten and enliven the discussion, and in the spirit of eighteenth-century intellectual exchange, the Voltaire Foundation sponsored a competition, asking for the most creative and thoughtful response to the question: ‘Has the rise of  #dh been a boon or a barrier to #C18 studies?’

Twelve individuals posted responses, and the jury – consisting of Burrows, Roe and Conroy – deploying a sophisticated algorithm, ranked the entries and identified three runners-up and two winners.

The three runners-up were:

Helen Williams

https://twitter.com/helen189/status/1290261481062375425?s=20

As a first-gen scholar in the North East teaching & researching at a post-92 institution, #DigitizingEnlightenment is a boon, making the #18thcentury accessible & bringing diverse new voices, projects & approaches to scholarship & study. Many of us wouldn’t be here without it.

– Helen Williams (@helen189) August 3, 2020

Bryan Banks

https://twitter.com/BryanBanksPhD/status/1290245758059388929?s=20

Really excited to see this book come out!@SimonBu86342933 @glennhroe @MelanieConroy1 put the #DH in 𝐝ix-𝐡uitiemistes.

Today’s organized hashtag #DigitizingEnlightenment, like much DH work more broadly, makes the #18thC more legible and accessible to us today./1 https://t.co/IajlYLtPWk

– Bryan Banks (@BryanBanksPhD) August 3, 2020

Russell Goulbourne

https://twitter.com/FrenchProfessor/status/1290215320091635720?s=20

Definitely a boon – because it’s the #DH analysis of huge numbers of texts that allows us to see that it’s precisely in the 1760s, at the height of the Enlightenment, that boon comes to mean “a benefit enjoyed”. QED. #DigitizingEnlightenment

– Russell Goulbourne (@FrenchProfessor) August 3, 2020

And now the winners:

Chad Wellmon

As Kant wrote 200+ years ago, DH has been a boon to #C18 studies. It’s a no-brainer @VoltaireOxford: “It is so easy to be immature. If I have a [computer] that has understanding for me, surely I do not need to trouble myself.” I. Kant, “An Answer to the Question ‘What is DH?’” https://t.co/wIGQJDT7p4

– chad wellmon (@cwellmon) August 3, 2020

https://twitter.com/cwellmon/status/1290310792156450819?s=20

Meghan K. Roberts

I hate to be the lone skeptic, but I am concerned about the influence of #DH and #DigitizingEnlightenment on the field. Some projects are wonderful for research and teaching, but I worry that others place too much emphasis on an extremely select group of French philosophes.

– Meghan Roberts (@MeghanKRoberts) August 3, 2020

https://twitter.com/MeghanKRoberts/status/1290281237089665024?s=20

Both winners received copies of Digitizing Enlightenment as well as OSE’s June 2019 title, another volume of essays which deployed digital humanities methods to study the eighteenth century, Networks of Enlightenment, edited by Chloe Edmondson and Dan Edelstein.

As a supplement to the printed books, the data visualizations, tables and figures, as well as a portion of the text for each of these two volumes, are accessible on open access on the OSE ‘Digital Collaboration Hub’, built on the Manifold Scholar platform and hosted by Liverpool University Press. These may be accessed, appropriately, at http://digitizingenlightenment.com

Thanks to all who participated – and we all hope to be able to renew the annual ‘Digitizing Enlightenment’ symposium in July 2021, to be hosted at the University of Montpellier, in the context of the ‘Enquête sur la globalisation des Lumières’ initiative.

– Gregory S. Brown

NB: For the month of August, copies of Digitizing Enlightenment are available for purchase at a 25% discount. Purchasers in North America may order from the OUP-Global site using the code “DISTRO25” and purchasers anywhere else in the world, including UK, Europe and Australia, may order from the LUP site using the code “DIGITIZING25”.

Digitizing the Enlightenment

As country after country has gone into COVID-19 lockdown, we have all had to learn to communicate, network, teach, study and relate online in ways unimaginable a few short years – or even months – ago. This phenomenon is just the latest stage in the information-technology revolution and part and parcel of the ongoing development of an increasingly digital society. This revolution has touched almost every aspect of our lives, from how we work, study, shop and relax to how we make and maintain personal relationships. But it is also transforming scholarship and the way we conduct and communicate academic research. Thus, it is perhaps apt, and with consummate good timing, that Oxford University Studies in the Enlightenment has chosen to subject tag our new volume as ‘History of Scholarship (Principally of Social Sciences and Humanities)’. Yet this is certainly not how we and our collaborators envisaged our project at the outset, nor can any single tag capture the content of our volume and its collaborative agenda in its entirety.

The Digitizing Enlightenment workshop logo, designed by Evan Casey for the Voltaire Foundation, featured on the cover of Digitizing Enlightenment.

Ironically, as we write, Digitizing Enlightenment is also a living movement – or at least a loose network of scholars who meet annually in pursuit of a common agenda. That agenda was born in a series of conversations that took place from 2010, culminating in Dan Edelstein’s post-panel suggestion at the American Historical Association conference at Montreal in April 2014 that we should hold periodic meetings between like-minded digital projects relating to the Enlightenment. The aim of these meetings would be to establish common conventions and digital standards, with a view to linking our resources and realising the enormous and still largely untapped potential of Linked Open Data. Those present for Dan’s suggestion – Simon Burrows, Jeff Ravel, Sean Takats and Dan himself – have all provided chapters for our book, but much of the energy behind Digitizing Enlightenment since has come from Glenn Roe, whom Simon had first encountered a month earlier in Australia, where they had both recently taken up academic positions.

It was this fortuitous coincidence, underpinned by the fertile combination of Simon’s professorial establishment funds and Glenn’s energy, together with their mutual contact books, that led to Western Sydney University hosting the first Digitizing Enlightenment symposium in July 2016. Among the projects discussed there, and in our book, were large-scale treatments of Enlightenment correspondences, theatre attendance records, and textual corpora including the mid-eighteenth century Encyclopédie; bibliometric projects were presented on the production and dissemination of literature, together with presentations on mapping and data visualization growing out of these projects. The symposium was so well received that it has been an annual event ever since. It was held at Radboud University in Nijmegen (2017), Oxford (2018) and Edinburgh (2019). In 2020, but for COVID-19, it would have been held in Montpellier.

It was not entirely by chance that such a project coalesced around the guiding notion of the ‘Enlightenment’. For the long eighteenth century has been blessed by a number of high-profile and long-established digital projects. These include ground-breaking commercial datasets such as Gale-Cengage’s Eighteenth-Century Collections Online (ECCO), which features in several of our chapters, semi-commercial projects such as the Electronic Enlightenment and large academic consortiums such as the Franco-American ARTFL project. This made the Enlightenment a natural laboratory for exploring the possibilities and achievements of the Digital Humanities for transforming scholarship on a single historical era. Further, as our book emphasises, our discussions built on a long tradition of digital innovation in eighteenth-century studies that can be traced back at least as far as the twin Livre et société dans la France du XVIIIe siècle volumes produced by a team led by François Furet in 1965 and 1970. It might further be added that our over-arching subject material lends itself to digital-historical analysis; the Enlightenment might after all be viewed as the long-run culmination of the intellectual turmoil and – as several contributors point out – information overload unleashed by a previous technological and communications revolution.

Digitizing Enlightenment is the July volume in the Oxford University Studies in the Enlightenment series.

With this in mind, then, we offer up Digitizing Enlightenment: Digital Humanities and the Transformation of Eighteenth-Century Studies as rather more than a contribution to the history of scholarship. Certainly, we have offered a sample of Digital Humanities c. 2016-2020, as it relates to the technologies available and their application to Enlightenment studies broadly construed. In addition, the first half of the book offers detailed accounts of the origins and development of key Enlightenment digital projects up until that point, accompanied by valuable and sometimes disarming insights on the dangers and delights of digital research from foremost practitioners in the field. These chapters, as well as some later contributions, are helping to reshape some dominant meta-narratives of the Enlightenment, not least by hinting simultaneously at the enduring aristocratic leadership of the French Enlightenment and the extent to which Enlightenment literary production and consumption was infused with religious content. However, our contributors also showcase other ways that Digital Humanities scholarship is in the process of changing the field through the transparency, methodological rigour, and collaborative imperatives that are necessary concomitants of this new kind of research. Finally, the book offers a collaborative roadmap for future digital research – at a moment when, as our final contributor, Sean Takats, points out, the Enlightenment is fast losing its privileged position as the most richly digitized century of the modern era. As a corollary, we hope that our volume may be as useful to scholars of other periods as to Enlightenment scholars themselves.

– Simon Burrows (Western Sydney University) and Glenn Roe (Sorbonne University)

Simon Burrows and Glenn Roe are the editors of the July volume in the Oxford University Studies in the Enlightenment series, Digitizing Enlightenment: Digital Humanities and the Transformation of Eighteenth-Century Studies, which is the first book-length survey of the impact of digital humanities on our understanding of a key historical period and paradigm.

This post is reblogged from Liverpool University Press.

A la portée de tout le monde

That was then: d’Holbach in print…

When I came upon the baron d’Holbach in the early 1960s – my undergraduate senior thesis was on d’Holbach’s atheism and the response of Voltaire and others to the Système de la nature, a choice of subject that played a not insignificant role in my scholarly life – there were few secondary sources to guide me, and there were precious few of his works accessible from a university library. As a graduate student studying his salon, his atheism, and his relationships, the only passage to his writings required taking a paper number in the waiting room of the Bibliothèque Nationale, rue de Richelieu, praying for a siège to open up, and waiting patiently for tomes that might or might not be the right editions to be delivered to my desk. Where was this extraordinary digital resource when I most needed it?

The secondary works about d’Holbach still cited by scholars of the Enlightenment when I began my research had been written in 1875, 1914, 1928, 1935, 1943, 1946, and 1955. There was a near universal consensus that it would be in the coterie holbachique that I would find the major movement of organized atheism in the Enlightenment (there were indeed a handful of atheists there, but the vast majority of devotees of the salon were not), and, for most scholars, diversely conspiratorial efforts to bring down the Ancien Régime. As each of these operational hypotheses dissolved in the light of the evidence – to my great disappointment at the time – I understood that one must learn to read texts, archives, correspondence, and contemporaneous records without prepossessed ideas about d’Holbach or his world, however ‘settled’ the scholarship appeared to be. Too many historians, claiming d’Holbach for a particular ideological camp, cited and presented him selectively and tendentiously, leaving one wholly unprepared for and startled by the dimensions and aspects of his work that they did not address or explicate. This narrative extended to the coterie itself.

… and this is now: screenshot from Tout d’Holbach.

D’Holbach’s salon was not a collaborative conspiracy or undertaking, pace many a historian and literary historian. The recent release of the truly collaborative Tout d’Holbach database coupled with the Voltaire Foundation’s project to create a digital scholarly edition of d’Holbach’s works (Digital d’Holbach) will allow debates about d’Holbach’s meanings and intentions to engage a large number of scholars and students in a variety of fields, informed by the array of texts on religion, superstition, philosophy, ethics, politics, and happiness that he bequeathed to his contemporaries and to us. Claims about his scepticism or dogmatism, elitism or egalitarianism, pessimism or optimism, happiness as contentment or happiness as hedonistic pleasure, for example, now can receive truly critical reception based upon unfiltered access to his actual works. Those of us who wrote about these matters might well have hoped that readers and students would go beyond our own claims and citations, testing the value of these by setting what we explicated in the context of the larger text and intellectual context, but now that is possible on a whole new scale. It is an exciting opportunity, and it is not only à la portée de tous ceux qui s’y intéressent, but it will expand such interest sizably and noticeably.

– Alan Charles Kors
Henry Charles Lea Professor Emeritus of History, University of Pennsylvania

A ‘Taste’ of Voltaire

Roseanne Silverwood has just received an MA in Translation (French and Spanish) from the University of Bristol (with distinction). Her dissertation project was to translate Voltaire’s article Goût from Questions sur l’Encyclopédie into English. She studied Modern Languages as an undergraduate at St Hilda’s College at the University of Oxford.

To say I was daunted is probably an understatement. Who was I to translate a previously untranslated text of Voltaire’s from French to English? Voltaire, the highly esteemed eighteenth-century French writer who still tops bestseller charts in France today.  However, for some reason this translation project piqued my attention and I knew, despite the hard work that it would entail, that I did not want to turn down the opportunity to be part of something that had the potential to be bigger than just my own academic studies.

From Questions sur l’Encyclopédie (1771), vol.6.

I first spoke to Adrienne Mason, the intermediary between the Voltaire Foundation and myself, in late 2018 to find out more about how the collaboration between the University of Bristol and the Voltaire Foundation would work. However, it was early 2019 before I really embarked upon the project of translating Voltaire’s article Goût from Questions sur l’Encyclopédie, which formed the basis of my Master’s dissertation. After two years of studying for my MA Translation (part-time) as a distance learner at the University of Bristol, whilst working full-time in a completely different industry, I knew that taking on the translation of Goût (p.280-98) was going to be an enormous challenge, not just because I would be continuing to work full-time during the week, but also because I was planning a wedding at the same time!

What surprised me the most was how accessible Voltaire’s writing was to me as a modern-day reader, especially given that I am not a native French speaker. As part of the commission I was also tasked with translating the scholarly peritext (i.e. the footnotes to the article Goût in Volume 42A of the Œuvres complètes de Voltaire), and I have to admit that I enjoyed translating Voltaire’s own writing much more than the critical annotations that accompanied his article. I expect this is because Voltaire’s writing was more free-flowing and abstract, whereas the academic peritext was factual and punctuated with constant references and quotations from other authors, which presented many challenges in translation.

I spent much of my spare time in spring and summer that year locked away in my study working through all twenty pages of the commission. If I could at least do a first draft of one page whenever I had a spare day at the weekend, I had a chance of getting my dissertation project completed by the September deadline. I was impressed that, equipped with my student library privileges, there were so many resources that I could access online, even more, I think, than when I was completing my undergraduate degree between 2007 and 2011.

Title page of Questions sur l’Encyclopédie (1771), vol.6.

One of the hardest aspects of this translation was where Voltaire, or the author of the peritextual material, quoted from different authors in languages other than English. In these cases, I had to first search for an existing translation of the quotation, and only if this did not exist could I translate the fragment myself. Therefore, I found myself trawling through eighteenth- and nineteenth-century texts online for this purpose, and was truly amazed at what was available at the click of a button. As I write this blog post, most of the world is under some form of lockdown due to Covid-19, and this certainly brings to the forefront the importance of initiatives such as the Voltaire Lab – where my translation can now be consulted – a ‘virtual space for cutting-edge research and experimentation on Voltaire’, so that scholarship can still flourish in the modern day despite the challenges that could be posed by distance learning, or even global pandemics.

I have always loved studying languages and hope to make a career as a translator one day, so during this project it was interesting to consider translation as a profession, and in particular how it can be perceived as an undervalued discipline. This, in turn, can mean that academic texts in other languages, such as the critical annotations to Voltaire’s article, or even the essay Goût itself if considered as a social science text, are often overlooked because they are not originally written in the academic lingua franca, English. Furthermore, if a spotlight can reveal the important role that translations can play in advancing scholarship about a writer such as Voltaire, perhaps the discipline of translation as a whole could be elevated to greater heights. After all, as I well know, it is an academically demanding and time-consuming process.

Fortunately, I managed to complete my dissertation project by the deadline so that I could go and celebrate my own hen party guilt-free the following weekend. I got married a month later, and once the excitement of the wedding had subsided I was delighted to get the fantastic news that my dissertation had received a mark of 82%, which tipped me over into a distinction for my Master’s grade overall. I cannot thank my supervisor, Clare Siviter, Adrienne Mason and the Voltaire Foundation enough for the opportunity to participate in such a pioneering research project that highlights the importance of digitisation in academia and the fruits that can be borne by collaboration between different universities.

– Roseanne Silverwood