Exploring multilingual digital editions

The Taylor Institution Library recently launched a new course teaching digital editing, with students able to create digital editions in any language of their choice. I was delighted to be able to contribute by designing the accompanying website on which the texts are published:

I am the editor and developer of several academic resources, including the award-winning Eighteenth Century Poetry Archive (currently English-language poetry only) and the Thomas Gray Archive. My interest in working with multilingual materials was sparked by part of this resource: ‘Gray’s Elegy in Translation’. According to the Digital Miscellanies Index (DMI), Thomas Gray’s ‘Elegy Written in a Country-Churchyard’ (1751) is the most anthologized poem of the eighteenth century, and it is one of the most widely and frequently translated, paraphrased, and imitated poems in the English language. With at least 266 translations into at least forty languages to date, the Elegy has inspired translators ever since the earliest translations into Latin appeared in the early 1760s. Since those early translations, the Elegy has been influential in the history of many national literatures, particularly in the context of the evolution of European Romanticism.

Drawing on the extensive collection of Elegy translations compiled by Tom Turk,(1) the purpose of the project is firstly to enable the study of the evolution of translations of the poem in a single language and culture, and secondly to allow for a comparative study of the translations across languages and literatures, initially within, but ultimately beyond European boundaries. The first phase of the project covers the period up to 1805, comprising fifty-seven verse and prose translations of the Elegy in eleven languages (Danish, French, German, Italian, Latin, Polish, Portuguese, Russian, Spanish, Swedish, and Welsh). The translations variably highlight changes in the understanding and interpretation of Gray’s poem, reflect cultural borrowings and transfers, betray changes in literary taste, and may even allow us to uncover the circumstances, agency, and purpose of their production in the first place. Thanks to James D. Garrison’s outstanding work,(2) we have a sense of the national context of Gray’s significance in both France and Italy, but for many other European and particularly non-European languages, these histories remain to be written.

The two main objectives for the project website were to provide an intuitive interface to the translations that allows for easy comparison of equivalent passages and to allow users to comment on any part of the original or any of the translations. In the full-text view up to three texts (in any combination of languages) can be explored side-by-side in their entirety; ‘equivalent’ passages are highlighted when hovering over any stanza or paragraph:

In the detailed view any one stanza from any text can be compared with its ‘equivalents’ (if available) in either all of the translations or the translations in a particular language:

Users can add a translation or comment on a translation of any section of any of the texts using a simple click and drag action to mark the section to be annotated:

I would love to gain a better understanding from practitioners on which avenues to pursue (linguistic, stylistic, semantic etc.) for both the enhanced mark-up of the translations and the development of tools (and/or integration of external services, such as dictionaries/thesauri) to provide via the interface. Having caught the multilingual bug, I am also very keen to expand another resource of which I am editor, the Eighteenth Century Poetry Archive (already mentioned above), to include poems in other languages, along with tools for their analysis. Anyone reading this who might be interested in contributing to this endeavour, please get in touch!

I hope you will have a chance to explore the translations, and would love to hear about your experience with the current interface and any changes, improvements, or additions you would like to see in the future. If you can see the potential for any of the techniques mentioned to be applied in the Taylor Editions website, then I would be very happy to explore this further. Please do not hesitate to contact me with your feedback.

Alexander Huber (Bodleian Libraries, University of Oxford)
Editor, Thomas Gray Archive

(1) Thomas N. Turk, ‘Search and Rescue: An Annotated Checklist of Translations of Gray’s Elegy’, Translation and Literature 22(1) (Spring 2013): 45-73.

(2) James D. Garrison, A Dangerous Liberty (Newark, University of Delaware, 2009). Garrison covers a wide range of languages, with particular emphasis on French and Italian, and to a lesser extent German, Russian and Spanish.


A born-digital edition of Voltaire’s Dialogue entre un brahmane et un jésuite

Just as the print edition of the Œuvres Complètes de Voltaire is fast approaching its completion, we at the Voltaire Foundation are starting work on two new, highly ambitious digital projects thanks to the generosity of the Andrew W. Mellon Foundation: a digital edition of Voltaire’s works based on the Œuvres complètes (Digital Voltaire), and a born-digital edition of the works of Paul-Henri Thiry d’Holbach (Digital d’Holbach).

With a view to gaining the skills required to begin my work on Digital d’Holbach, in autumn 2018 I attended an intensive course on digital editions run by the Taylorian Institution Library. Taught by Emma Huber in collaboration with Frank Egerton and Johanneke Sytsema, the course takes students through all the phases of the digital edition workflow, from transcription to publication and dissemination. It is a goal-focused, hands-on course during which students are warmly encouraged to create a born-digital edition of a short text from the Taylorian’s collections.

Although short and apparently light in tone, the piece that I chose to edit – Voltaire’s Dialogue entre un brahmane et un jésuite sur la nécessité et l’enchaînement des choses – is a key text in the evolution of Voltaire’s philosophical views. As the title suggests, the Dialogue hinges on the question of determinism (or fatalisme, in eighteenth-century French parlance) and touches on such crucial notions as moral freedom, causation, and the problem of evil. It was first published anonymously in the Abeille du Parnasse of 5 February 1752, and it then went through several reprints during Voltaire’s lifetime, with very few variants.

My edition of the Dialogue is of course not meant to replace the one already available in OCV. Rather, it was conceived to meet the needs of the broader public – and more specifically those of students. A very short introduction, displayed on the right-hand side, provides essential information on the philosophical issues at stake while situating the Dialogue in relation to other key texts by Voltaire. An original translation into English by Kelsey Rubin-Detlev makes the text more widely accessible, allowing students working in fields other than modern languages (e.g. philosophy) to engage with Voltaire’s ideas. High-quality pictures of the 1756 edition, which provides the base text, aim to give non-specialists a taste of what it feels like to leaf through a (dusty) eighteenth-century book. Finally, a modernised version of the text is available next to the facsimile, and a rich corpus of annotations – displaying in both the French transcription and the English translation and featuring links to several other digital resources (the ARTFL Encyclopédie and Tout Voltaire, but also Wikipedia and BibleGateway!) – aims to render the reading experience as informative and rewarding as possible.

But there is more to this edition than first meets the eye! For example, clicking on ‘Downloads’ in the menu bar brings up a fifth column from which the user is invited to download pictures as well as TEI/XML files, which can then be used as models to generate further digital editions. Also, a drop-down menu in the transcription column allows users to choose between two different versions of the text in addition to the modernised version displayed by default: a diplomatic transcription of the 1756 edition and a diplomatic transcription of a 1768 edition, which comes with its own set of images that are also available for download under a Creative Commons Licence. By looking at these texts, users will get a sense of how radically French spelling evolved in the mid-eighteenth century.

Readers of this blog are most cordially invited to browse my edition. Any feedback on content or presentation (e.g. the way footnotes or variants are displayed) would be greatly appreciated as I work towards an edition of a considerably longer text by d’Holbach. But more on that in the coming months!

Ruggero Sciuto

The Salons Project: a digital approach to eighteenth-century French salons

We are currently finalising the programme for Digitizing Enlightenment IV, a day-long workshop that will take place on 15 July as part of the ISECS Congress in Edinburgh this summer. In order to expand our network of Digitizing Enlightenment projects and researchers, we encourage those working in any aspect of digital humanities across the interdisciplinary spectrum of eighteenth-century studies to attend the event, if in Edinburgh, or contact us for more information.

In the meantime, below is the second post in our series of follow-up discussions based on work presented at the Digitizing Enlightenment III workshop.

– Glenn Roe, Voltaire Lab

Eighteenth-century French salons have developed a mystical aura as sites of elite sociability and (more controversially) as potential workshops of Enlightenment philosophy. They were, however, ordinary face-to-face gatherings in many ways – not unlike unscheduled conferences and meetings with loose agendas today; the one consistent difference is that they were held in private homes instead of conference rooms and organized by individuals (normally women) rather than groups or committees. The nineteenth-century term “salon” grouped together a variety of meetings with certain characteristics: salons were held in private homes with relatively elite participants, conversation was the primary activity, and they occurred on set days and at times that were part of a larger social calendar. Aside from these very general characteristics, salons had a wide variety of purposes, publics, and activities.


Niclas Lafrensen [Nicolas Lavreince] (1737-1807), A French salon.

The most celebrated among salons, notably Tencin’s, Graffigny’s, Geoffrin’s, and Lespinasse’s, have become associated with great writers, philosophes, and mathematicians, like Voltaire and D’Alembert. Antoine Lilti has challenged the view that salons were primarily counter-cultural venues for philosophical debate, showing that the aristocratic traditions influenced notions of politesse in the salons and emphasizing the aristocratic habitus of many salon hostesses even when they had philosophes as guests. Disagreements over the character of salons may amount to differences more of degree than of type, since historians generally agree that the salons were mixed environments, but these debates do demonstrate the importance, now more than ever, of working through who was in attendance, in order to identify the social characteristics of eighteenth-century French salons.

I am the co-director with Chloe Edmondson of The Salons Project, a database of primarily eighteenth- and nineteenth-century European salon participants. We completed our pilot project of French salons from 1700 to 1800 last year and have some preliminary results, which will appear in the volume Digitizing Enlightenment, edited by Glenn Roe and Simon Burrows, in 2019. As expected, we found a great deal of evidence for social mixité in eighteenth-century salons, including patterns of mixed gender, age, occupation, interests, and social status. We also found that both women and literary figures were present in all of the major salons, including salons like Deffand’s which were not known for their openness to the philosophes. We found that nobles were present in all salons, as were gens de lettres, and that these people were often one and the same.

Our list of more than 600 salon participants is far from a complete record of eighteenth-century French salon attendees, but it is the largest and most complete database that we are aware of. The purpose of our study was not only to create a database, but also to create a method and a format for sharing data about salons and other informal networks. This method uses the robust data model created by the Electronic Enlightenment project, such that our data are compatible with the many other Enlightenment-era projects that are inspired by that database. We also use the schema “Procope”, which we developed along with Maria Teodora Comsa, Dan Edelstein, and Claude Willan to classify Early Modern European individuals, and which is described in our article “The French Enlightenment network”.


Salon, correspondence, and knowledge networks in French salons, 1650 to 1815 (data from The Salons Project, Conroy and Edmondson).

Within our larger dataset (1650 to 1815), we found that the letters networks and salon networks remained well integrated, and that philosophes were a minority but well integrated into the core of the network (see diagram). The most central figures are the ones whose networks are most associated with each field of knowledge (for example, Lespinasse’s salon is strongly associated with the “Letters_Philosophical” network, whereas Praslin’s is not; Voltaire’s correspondence network is more strongly associated with the encyclopédistes than is Necker’s; the Letters networks and “Letters_Philosophical” network are themselves tightly connected and central to salon networks). Whereas the best known salons of the era were well integrated into the letters and philosophical networks, it is important to remember that many of the salon attendees were not otherwise part of the French Enlightenment network, especially women, lower-status individuals, family members of other salon participants, and foreigners. By adding these more marginal people to the records on eighteenth-century French sociability, we hope to open up new avenues for finding social relations that are not well known among these more marginal participants on the edges of the Enlightenment. Even where we were not able to learn much about some of these more minor figures, including them in this preliminary dataset increases the chances that we will learn more about them in the future.

– Melanie Conroy, University of Memphis

Melanie Conroy is assistant professor of French at the University of Memphis and the co-director with Chloe Summers Edmondson (PhD candidate, Stanford University) of The Salons Project, a database of European salon participants. She can be reached at mrconroy@memphis.edu or @MelanieConroy. The Salons Project is online at salonsproject.org. The Salons Project is collaborative and invites new researchers to adopt its methods and share their data.

 

The humanist world of Voltaire’s correspondence

We know from reading Voltaire’s letters that he likes quoting – French literature in abundance, but also a fair amount of Latin. There is often a strong sense that he is quoting from memory, which is more than likely the lasting mark of his Jesuit teachers at Louis-le-Grand, who put Latin at the centre of the curriculum. Indeed, Voltaire had the benefit of some renowned Jesuit scholars as his teachers, notably Le Père Porée, who famously taught a ‘Senecan’ prose style, and Le Père Thoulier (later the abbé d’Olivet), a distinguished Cicero scholar who remained on friendly terms with Voltaire throughout his career.

Latin verse, in particular, played a preponderant role in Voltaire’s education, as poets were at the heart of college teaching, and Virgil, Ovid, and Horace had been by far the big three since at least the 16th century.[1] The Jesuits taught primarily by way of daily recitals (recitatio) of verse required of all students: ‘On attachait à la recitatio une importance dont nous n’avons pas idée aujourd’hui…’ (Dainville, p.175). Thus, students at Louis-le-Grand all committed large chunks of Latin verse to memory, both as models to imitate in learning to write and as a method of retaining information, as Voltaire would elsewhere describe the pedagogical approach of the Jesuit Claude Buffier: ‘Il a fait servir les vers (je ne dis pas la poésie) à leur premier usage, qui était d’imprimer dans la mémoire des hommes les événements dont on voulait garder le souvenir’.[2]

Collège de Louis le Grand, circa 1789.


Given this background, we aimed to examine Voltaire’s use of Latin quotations across his massive collection of correspondence, described by Christiane Mervaud as ‘perhaps his greatest masterpiece’. The Besterman edition of Voltaire’s correspondence, originally published in some 50 print volumes, and digitised in the early 2000s as part of the Electronic Enlightenment project, contains 21,256 letters of which 15,414 are written by Voltaire himself. It is astonishing, then, that this masterpiece remains relatively unstudied. Besterman identifies Latin passages when they are from the major writers (Horace, Virgil, Ovid, Lucretius) – the authors for whom there were concordances easily available in the 1950s and 1960s. In the case of lesser poets like Manilius, however, Besterman was obliged to leave the passages unannotated. These passages can now be easily identified thanks to new methods developed in the digital humanities. In particular, as part of this year’s research programme in the Voltaire Lab, we compared all of Voltaire’s letters to Latin digital sources in an effort to systematically identify all of his Latin quotations, while at the same time, as we’ll see below, exploring the social and intellectual networks over which these quotations were exchanged.

Marcus Manilius, Astronomicon, 1767.

Using sequence alignment algorithms designed to identify literary text re-use at scale – developed in collaboration with the ARTFL Project at the University of Chicago – we identified some 672 Latin citations in Voltaire’s correspondence by comparing the letters to the Packard Humanities Institute’s Classical Latin Texts (PHI) digital corpus. The PHI contains essentially all Latin literary texts written before A.D. 200, as well as some texts selected from later antiquity. The resulting alignments allow us to move beyond Besterman’s ad hoc manner of identifying quotations towards a more systematic understanding of Voltaire’s use of Latin authors.
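To give a rough sense of how this kind of matching works, here is a minimal sketch in Python. It is not the ARTFL aligner: it uses `difflib.SequenceMatcher` from the standard library on word tokens as a stand-in, and the function names, the normalisation step, and the `min_words` threshold are all assumptions made for illustration. The example texts are the two Virgil verses and the opening of the letter to the duchesse de Choiseul quoted later in this post.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase, expand the ligature 'æ' and split into word tokens."""
    text = text.lower().replace("æ", "ae")
    return re.findall(r"[a-z]+", text)


def shared_passages(source, target, min_words=4):
    """Return word sequences of at least min_words shared by both texts."""
    src, tgt = tokenize(source), tokenize(target)
    # autojunk=False stops difflib from discarding very frequent words.
    matcher = SequenceMatcher(None, src, tgt, autojunk=False)
    return [
        " ".join(src[m.a:m.a + m.size])
        for m in matcher.get_matching_blocks()
        if m.size >= min_words
    ]


# Stand-in for a passage in the Latin source corpus.
virgil = ("Ante leves ergo pascentur in aethere cervi "
          "quam nostro illius labatur pectore vultus")

# Opening of the letter, with its different spelling and capitalisation.
letter = ("Pour moi, Madame, qui les aime passionément je vous dirai "
          "Ante leves ergo pascentur in æthere cervi "
          "Quam nostro illius labatur pectore vultus.")

matches = shared_passages(virgil, letter)
print(matches)
```

In this sketch the minimum length is what separates a quotation from a chance overlap, so a formula of only three words would fall below `min_words=4` and go undetected.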

After some data pruning – the inclusion of several commentators and grammarians from Late Antiquity in the PHI dataset meant that there were some repeated matches that were spurious – we reduced our set of Latin passages to 342 citations used by Voltaire himself to his various correspondents. Here is a list of these quotations by Latin author in descending order:

Table 1. 342 individual Latin passages found in letters by Voltaire.
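The pruning step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the placeholder labels in `EXCLUDED_SOURCES`, and the toy records are invented here and are not the project's data or code.

```python
# Hypothetical source labels to exclude; a real list would come from
# inspecting the corpus metadata for late commentators and grammarians.
EXCLUDED_SOURCES = {"Commentator A", "Grammarian B"}


def prune(matches, letter_writer="Voltaire"):
    """Keep one match per (letter, passage), dropping excluded sources
    and letters not written by letter_writer."""
    seen = set()
    kept = []
    for m in matches:
        if m["source_author"] in EXCLUDED_SOURCES:
            continue  # spurious match via a later writer quoting the passage
        if m["letter_writer"] != letter_writer:
            continue  # letter written by a correspondent
        key = (m["letter_id"], m["passage"])
        if key in seen:
            continue  # repeated match for the same passage in the same letter
        seen.add(key)
        kept.append(m)
    return kept


# Invented toy records, for illustration only.
raw = [
    {"letter_id": "L1", "letter_writer": "Voltaire",
     "source_author": "Virgil", "passage": "ante leves ergo"},
    {"letter_id": "L1", "letter_writer": "Voltaire",
     "source_author": "Commentator A", "passage": "ante leves ergo"},
    {"letter_id": "L2", "letter_writer": "Correspondent",
     "source_author": "Horace", "passage": "carpe diem"},
    {"letter_id": "L1", "letter_writer": "Voltaire",
     "source_author": "Virgil", "passage": "ante leves ergo"},
]

pruned = prune(raw)
print(len(pruned), pruned[0]["source_author"])
```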


Overwhelmingly Voltaire prefers to quote Latin poets; and that Horace, Virgil and Ovid should be the top three is hardly surprising, though the presence of Horace is dominant. There is breadth as well as depth here, and the list goes beyond the usual suspects to include minor figures such as Manilius, Statius, and Cato the Elder. Does this mean, for instance, that Voltaire is quoting someone like Manilius from memory? If so, how interesting and altogether unexpected.

The next important question we broached was concerned with the recipients of Latin passages, i.e., who are the addressees of the letters in which these Latin quotations appear? In all we found 101 different recipients of at least some Latin, out of 1,465 total recipients in Voltaire’s correspondence (roughly 6.9%). This is quite small, as a proportion of addressees overall. So how can we gloss these names as members of a group, or network of Latin quotations?

Table 2. Addressees with more than five Latin quotations.


Using the ‘Procope’ social network ontology of the French Enlightenment, established by Dan Edelstein et al., at Stanford,[3] we were able to automatically assign social categories to our list of addressees, which while not a perfect system, nonetheless helped us understand the fundamentally ‘elite’ status of this sub-set of Voltaire’s correspondents.

Gender is an obvious criterion that is apparently lacking: all addressees are male apart from one. Given that men learned Latin, and women didn’t, the use of Latin quotations is self-evidently gendered in this case. This is further reinforced by the manner in which Voltaire uses two verses by Virgil with La Duchesse de Choiseul, his one female addressee, in a letter from 1771:

‘Pour moi, Madame, qui les aime passionément je vous dirai
Ante leves ergo pascentur in æthere cervi
Quam nostro illius labatur pectore vultus.’

‘Vous entendez le latin, Madame, vous savez ce que celà veut dire:
Les cerfs iront paître dans l’air avant que j’oublie son visage.’
 [4]

After quoting the two lines from the Bucolics, Voltaire goes on to translate them for Madame de Choiseul, even though she can presumably understand the Latin – a case of early-modern ‘mansplaining’ in action.

Within the group of 101 addressees, there is a clearly-defined social group of old, close friends from school (those with whom he had learned Latin), as well as an overlapping sub-group in Normandy, or in one case from Voltaire’s early law career:

Addressees from Louis-le-Grand, where Voltaire learned Latin:

  • The Marquis d’Argenson (later foreign minister)
  • The Comte d’Argenson (later war minister)
  • The Duc de Richelieu (soldier and leading courtier)
  • The Comte d’Argental, conseiller au parlement de Paris
  • Pierre-Robert Le Cornier de Cideville, conseiller au parlement de Rouen

Other old friends from the overlapping Normandy/law group:

  • Formont, a wealthy, talented light poet who was also friends with Cideville.
  • Theriot, an early friend of Voltaire’s, from when they were both young apprentice lawyers, who was also friends with Formont and Cideville.

Otherwise, we find many cultivated acquaintances in this list who are themselves authors: Frederick, Algarotti, D’Alembert, etc.; along with one of Voltaire’s teachers from Louis-le-Grand: d’Olivet, translator of Cicero and Demosthenes into French, elected to the Académie in 1723. Clearly, Voltaire’s use of Latin was a means of determining readership. By constructing an epistolary community with selected groups of correspondents, Voltaire underscored their shared experiences and humanist culture.

But, to what extent was this sort of cultural exchange reciprocal? I.e., if Voltaire writes to you quoting Latin poets, do you feel obliged to respond in kind? What does it mean, for instance, that Voltaire uses Latin in so many letters to Frederick, and yet the prince never once uses Latin in return? Socially, the 41 respondents identified belong by-and-large to the same ‘elite’ categories of government or aristocracy, although there is a markedly greater presence of hommes de lettres (an ‘intellectual network’ that overlaps with the ‘social networks’ drawn from Procope) in this second list. See Table 3.

Table 3. Respondents with more than two Latin citations.


These are just some of the preliminary results we have begun to process in the context of a larger project on Voltaire’s culture of text re-use (including his penchant for ‘self-plagiarism’). As with most digital humanities projects, initial computational analyses don’t always produce ‘clean’ results, or cut-and-dried interpretations: some of the results have to be examined carefully, and some – as was the case for the grammarians and commentators mentioned above – will prove spurious or misleading. One begins by asking one set of questions – can we identify Voltaire’s use of Latin and verify Besterman’s attributions – and ends up with new ones: e.g., with whom did Voltaire use Latin, and how? Equally, we could extend these questions by examining other literary quotations, e.g., from French or Italian authors, and by including other correspondence collections, comparing Diderot and Rousseau’s use of Latin, for instance, to that of Voltaire.

Ideally, this sort of experimental research approach also generates new research questions, ones that would have been difficult to frame outside of the digital environment. In this case, we were quickly confronted with the notion of what constitutes an instance of ‘re-use’ as opposed to an allusion or more oblique cultural reference. For example, our algorithm identified this passage from Cicero’s epistles:

‘Vale. CICERO BASILO S. Tibi gratulor, mihi gaudeo. te amo, tua tueor. a te amari et quid agas quidque agatur certior fieri volo…’

as a potential re-use employed by Voltaire in a letter to Marmontel from 1749:

‘Si vous recevez ma lettre ce soir, vous pourrez m’envoyer votre poulet pour m. de Richelieu, que je ferai partir sur le champ. Te amo, tua tueor, te diligo, te plurimum, &c.’ [5]

Is this re-use or not? Besterman makes no mention of Cicero in his annotation, but rather places this passage into a more generic class of ‘Roman epistolary formulas’. But perhaps there is more going on here; perhaps the model of Cicero’s epistles – central to the Jesuit syllabus – remains at the forefront of Voltaire’s mind when he himself is in the act of letter-writing. With the sorts of addressees for whom Voltaire uses Latin quotations he may likewise use a Ciceronian subscription. Here the Ciceronian model shapes Voltaire’s epistolary rhetoric.

Finally, pushing this line of enquiry a bit further, we came across another discovery: there are reduced versions of the passage, “Vale. Te amo”, which Voltaire uses extensively in the correspondence, and in particular with the social network of old school friends outlined above. This passage is in fact too small to be identified by our matching algorithms, and we would furthermore be a bit hard-pressed to classify it as a singularly Ciceronian borrowing. And yet…

– Nicholas Cronk and Glenn Roe

[1] See François de Dainville, L’Education des jésuites (XVIe-XVIIIe siècles) (Paris, Minuit, 1978).

[2] Voltaire, Siècle de Louis XIV, ‘Catalogue des écrivains’, OCV, vol.12.

[3] See Maria Teodora Comsa, Melanie Conroy, Dan Edelstein, Chloe Summers Edmondson, and Claude Willan, ‘The French Enlightenment Network’, The Journal of Modern History 88, no. 3 (September 2016): 495-534.

[4] [D17251]. Voltaire [François Marie Arouet], ‘Voltaire [François Marie Arouet] to Louise Honorine Crozat Du Châtel, duchesse de Choiseul [née Crozat]: Monday, 17 June 1771’. In Electronic Enlightenment Scholarly Edition of Correspondence, University of Oxford.

[5] [D3918]. Voltaire [François Marie Arouet], “Voltaire [François Marie Arouet] to Jean François Marmontel: Friday, 2 May 1749”, in Electronic Enlightenment Scholarly Edition of Correspondence, University of Oxford.

The Newberry French Revolution Collection at ARTFL

As we begin planning Digitizing Enlightenment IV, which will take place in the context of the ISECS Congress in Edinburgh in July 2019, we are keen to broaden the scope and breadth of the Digitizing Enlightenment community in order to highlight new, and existing, digital projects across the interdisciplinary spectrum of eighteenth-century studies. This post, based on work presented at the Digitizing Enlightenment III workshop held in Oxford in July 2018, demonstrates how to identify text reuse – citations, borrowings, plagiarisms – as well as other techniques for leveraging freely available large data-sets from the 18C.
– Glenn Roe, Voltaire Lab

The incredible richness of the Newberry Library’s French Revolution Collection (FRC) has long been known. It consists of more than 30,000 pamphlets and more than 23,000 issues of 180 periodicals published between 1780 and 1810, representing the opinions of all the factions that opposed and defended the monarchy during the turbulent period between 1789 and 1799, and also contains innumerable ephemeral publications of the early First Republic. The Newberry has released digital copies of more than 35,000 pamphlets totalling approximately 850,000 pages. Not only has the Newberry made the collection available to the public, but it has released a data feed of the entire collection, consisting of the Library’s exceptional metadata describing each object, the OCR text data, and links to the digital facsimiles accessible from the Internet Archive, encouraging researchers and instructors to incorporate the digital collection in new kinds of scholarship and engagement.

In order to facilitate experimental work at ARTFL on this unparalleled resource, we have loaded two versions of this collection – based on a download of the collection from the Newberry’s GitHub repository in November 2017 – into PhiloLogic4, the latest release of ARTFL’s text analysis software. The full version contains all 38,377 documents dating from the 16th century to the end of the 19th century. Our second build attempts to eliminate duplicate documents, is restricted to the period 1787-1799, and thus contains 26,445 documents. Additional implementation information and full open access to both versions of the FRC collection are available online. The quality and coverage of the FRC texts make it an ideal environment in which to test a variety of experiments and algorithms to enhance access and open new kinds of approaches using the 1787-99 sample data. At the bottom of the ARTFL FRC page, we have provided links to several different models for examining the collection which are based on extensions to the PhiloLogic4 package.

The simplest model is a document level search which returns matching documents by relevancy ranking based on Python Whoosh. This functions somewhat like a Google search on the collection, with links to the page images of the document or specific instances of the search words in context. For example, the results of a search for “conspirateurs aristocrates ennemis étrangères royalistes” can be seen here.
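For readers unfamiliar with relevancy ranking, the idea can be illustrated with a small Python sketch. It does not use Whoosh and does not reproduce its scoring; it is a bare-bones TF-IDF ranking written with the standard library only, and the function names and the three invented snippets standing in for pamphlets are assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Split a text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def rank(query, docs):
    """Return (doc_id, score) pairs sorted by a simple TF-IDF score."""
    counts = {doc_id: Counter(tokens(text)) for doc_id, text in docs.items()}
    n_docs = len(docs)
    scores = {}
    for doc_id, tf in counts.items():
        score = 0.0
        for term in set(tokens(query)):
            # Document frequency: how many documents contain the term.
            df = sum(1 for c in counts.values() if term in c)
            if df == 0:
                continue
            # Rarer terms weigh more than terms found everywhere.
            score += tf[term] * math.log(1 + n_docs / df)
        if score > 0:
            scores[doc_id] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented snippets standing in for pamphlets.
docs = {
    "p1": "les conspirateurs et les aristocrates sont les ennemis du peuple",
    "p2": "adresse aux citoyens sur les assignats",
    "p3": "les royalistes et les conspirateurs",
}

results = rank("conspirateurs aristocrates ennemis royalistes", docs)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

Documents that match more of the query terms, and rarer ones, rise to the top, while documents that match none are left out of the results altogether.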

The second approach is the application of a Topic Model algorithm to the collection. Topic Models, which are widely used in digital humanities, are a set of unsupervised learning algorithms that divide a collection into a specified number of clusters based on the vocabulary of each document. The results of the Topic Model have been added to the metadata of the PhiloLogic4 build of the 1787-99 sample data. Each document is identified as having a first and second topic, denoted as A or B, with a number from 00-49 as listed in this TABLE. The first column is the topic number; the second is one or more English keywords, which can also be searched. The third column is the top 3 weighted words (features) of that topic, and the fourth column is the rest of the top 10, all of which are shown in relative weight order. Thus, A29 will return the documents that have money assignats as the top weighted topic. Searching for “money” in topic models will return documents with this as either the first or second topic. An alternative use of this data is to copy some or all of the terms in columns 3 and 4 into the Whoosh search form and get the documents in ranked relevancy order.
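As a small illustration of the labelling scheme, the following Python sketch shows how codes such as A29 and B07 could be derived from a document's topic weights. It assumes that a topic model has already produced one weight per topic for each document; the function names and the weights are invented for the example, and this is not the ARTFL pipeline itself.

```python
def topic_labels(weights):
    """Return codes such as ('A29', 'B07') for the two heaviest topics."""
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    first, second = ranked[0], ranked[1]
    # 'A' marks the top-weighted topic, 'B' the second; numbers run 00-49.
    return f"A{first:02d}", f"B{second:02d}"


def matches_topic(labels, code):
    """True if a code like 'A29' is among a document's labels."""
    return code in labels


# Invented weights over 50 topics for a single document.
weights = [0.0] * 50
weights[29] = 0.6
weights[7] = 0.3
weights[12] = 0.1

labels = topic_labels(weights)
print(labels)
```

A search for A29 would then match this document, whereas a search for B29 would not, since topic 29 is its first rather than its second topic.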

Our first presentation at Digitizing Enlightenment III showed results from applying the latest version of our sequence aligner to detect the reuse of pre-Revolutionary texts – citations, borrowings, plagiarisms, and so on – in documents from the Revolutionary period. Sequence alignment is a family of algorithms used in a surprising range of disciplines, from genetics to text analysis, to identify similar segments of arbitrary length. For this work, we aligned the FRC 1787-99 sample against ARTFL’s Frantext pre-1788 collection. The Frantext sample contains 1,263 documents and is particularly strong in 18th century holdings. We loaded the results of the alignment run in a dedicated database which can be queried in a variety of ways, such as by source and/or target metadata as well as by words in matching passages.

The public database (June 22, 2018 build) found 8,937 aligned passages, of which around 1,000 were identified algorithmically as banalities. Filtering out shorter alignments, those of fewer than 10 words, leaves just under 7,000 passages. It is important to note that these numbers are only approximate, since they can vary significantly depending on the approach we use to identify and merge, where appropriate, longer passages. The general frequencies are not particularly surprising. The following list gives the number of borrowed passages in the FRC by author.

Montesquieu – 1,315

Rousseau – 1,133

Voltaire – 979

Mably – 303

Aulony – 263

Racine – 168

Helvétius – 167

D’Holbach* – 146

Saint-Simon – 135

Bossuet – 110

La Fontaine – 94

Diderot – 85

Corneille – 72

Mirabeau – 71

Boileau – 69

Bernardin – 67

Montaigne – 65

*D’Holbach appears as two entries due to slight metadata differences.

The yearly distribution of borrowings from the top three Enlightenment authors again follows a reasonable pattern.

The annual distribution in the FRC of the 536 passages derived from Rousseau’s Contrat Social also seems reasonable and matches expectations based on what we know from other sources.

While the global numbers are interesting, if not very surprising, there are a number of specific texts and authors that warrant further investigation. There are numerous chapbooks, such as the Calendrier moral, 1794, which are interesting because of their selection of inspiring passages from various authors. Jean-Jacques Barthélemy’s L’Accord de la religion et de la liberté (1791) features some 25 long extracts from d’Holbach’s Système social.

The alignment database is available to the public and has a variety of useful features. This link will run a search for all of the aligned passages in the FRC from Rousseau’s Contrat Social that are longer than 10 words. The report is laid out chronologically (in this case by FRC year). Each instance shows the matching passages with available metadata, links to the context of each passage, and a button to highlight the differences in each matching pair. The facets on the right allow you to get frequencies by author, title, year and so on. Clicking on those will return the corresponding text pairs.
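The kind of filtering and faceting described here can be approximated in a few lines. The sketch below assumes each aligned pair is a record with a source author, a source title, an FRC year, and the matching passage; these field names, the years, and the sample records are hypothetical and do not reflect the actual database schema or its contents.

```python
from collections import Counter

# Hypothetical alignment records; the real database schema may differ.
ALIGNMENTS = [
    {"source_author": "Rousseau", "source_title": "Contrat social",
     "target_year": 1791,
     "passage": "l'homme est né libre et partout il est dans les fers"},
    {"source_author": "Rousseau", "source_title": "Contrat social",
     "target_year": 1793,
     "passage": "la volonté générale est toujours droite"},
    {"source_author": "Montesquieu", "source_title": "De l'esprit des lois",
     "target_year": 1791,
     "passage": "il faut que par la disposition des choses le pouvoir arrête le pouvoir"},
]


def filter_alignments(records, min_words=10, **criteria):
    """Keep records with at least `min_words` words that match all criteria."""
    return [
        rec for rec in records
        if len(rec["passage"].split()) >= min_words
        and all(rec.get(field) == value for field, value in criteria.items())
    ]


def facet(records, field):
    """Count how many records share each value of `field`."""
    return Counter(rec[field] for rec in records)


long_passages = filter_alignments(ALIGNMENTS, min_words=10)
rousseau = filter_alignments(ALIGNMENTS, source_title="Contrat social")
# Present results chronologically, as the report does.
for rec in sorted(rousseau, key=lambda r: r["target_year"]):
    print(rec["target_year"], rec["passage"])
print(facet(long_passages, "source_author"))
```

Calling `facet` on a filtered result set gives the per-author, per-title or per-year counts that a faceted browser would display alongside the matches.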

We anticipate further experimental work on the FRC, most notably using the excellent subject information to assess the accuracy of Topic Modelling and to explore supervised learning algorithms that could further classify the collection by subject.

It is our pleasure to acknowledge that the Newberry Library has released this extraordinary resource under the Open Data Commons Attribution License, ODC-BY 1.0. We believe that this splendid collection and the Newberry’s release of all of the data will facilitate a generation of ground-breaking work in Revolutionary studies. If you find the collection useful, please do contact the Newberry Library to congratulate them on this wonderful initiative and to let them know how their efforts contribute to your research.

We would love to hear from you. Please send comments, suggestions and problem reports to artfl@artfl.uchicago.edu.

– Clovis Gladstone and Mark Olsen

Digitizing Enlightenment III

The Voltaire Foundation, in collaboration with the Cultures of Knowledge project, the Maison Française d’Oxford, the Oxford Centre for European History and the Centre for Early Modern Studies, was pleased to host the third instalment of the Digitizing Enlightenment conference series on the 19th and 20th of July. This was the first academic event organised under the auspices of the Voltaire Lab, and was made possible by further support from the John Fell Fund.

Digitizing Enlightenment (DE) is a conference series that is establishing its domain as a major area of innovation in the Digital Humanities. The first convening of DE was in Sydney in 2016, hosted by Simon Burrows at Western Sydney University. This first meeting launched a set of discussions around a common set of problems and identified topics for collaboration in pursuit of interoperability among six distinguished, and in some cases, long-standing DH projects in the field of Enlightenment Studies:

  1. The ARTFL Project (Chicago);
  2. Mapping the Republic of Letters (Stanford);
  3. The Comédie Française Registres Project (MIT/Paris-Sorbonne/Nanterre);
  4. The French Book Trade in Enlightenment Europe (Western Sydney);
  5. Electronic Enlightenment (Oxford); and
  6. MEDIATE (Radboud).

The second gathering in Nijmegen in June of 2017, hosted by Alicia Montoya at Radboud University, continued these discussions and opened up more lines of communication and possible collaborative research across Europe and expanded our working notion of ‘Enlightenment’ as an historical period. These meetings thus established an international network of major digital humanities projects working on 17th- and 18th-century European intellectual and literary history. As a group, these projects have sought to identify and work collaboratively on shared research problems, solutions, and resources generated by their respective research programs in order to facilitate more comprehensive approaches to some of the major problems in the field today.

Greg Brown and conference attendees, Maison française d’Oxford.

Digitizing Enlightenment III was, by design, more focused than the prior meetings: it was aimed more narrowly at the hot topic of historical prosopography and network analysis, an area in which we felt the DE network can potentially provide leadership, and which could provide technical solutions that might allow for the integration of a whole range of ambitious projects in this field. The first two conferences were modest in size and quite international: 15-20 papers over two days, with 30-40 people in attendance. With our narrower focus, the third meeting was somewhat smaller but even more international, with participants from Australia, Austria, France, Germany, the US, and the UK. Accordingly, its format was more concentrated, in the form of six thematic roundtables, each dedicated to the proposal and discussion of functional solutions to real-world problems already encountered in network analysis and prosopography of this period.

These roundtables were organized around a set of basic questions that allowed participants to engage with the overall theme of the conference without necessarily being experts in the domain. Participants spoke briefly on each proposed question, which left ample time for discussion and questions afterwards. These questions included:

  • Why prosopography? Why networks?
  • What are historical or intellectual networks?
  • What is social network analysis?
  • How to re-construct a social network?
  • Who or what is excluded from networks?
  • What lies beyond networks, beyond prosopography?
  • How to link, sustain, and maintain networks?

A final roundtable was dedicated to discussion of next and future steps in this collaborative work, where it was decided that we should aim to hold another event either during or around next year’s ISECS International Congress on the Enlightenment in Edinburgh.

Greg Brown (standing) and Howard Hotson.

Participants were also treated to a reception and dinner at Balliol College, generously sponsored by the Bodleian Libraries.

Between roundtables, we invited participants to present some of the current projects that are underway in the broad field of digital Enlightenment studies. These short presentations included already established projects, such as Early Modern Letters Online, the Quill Project, and Six Degrees of Francis Bacon, as well as new projects, such as the sequel to Simon Burrows’s FBTEE project, Mapping Print, Charting Enlightenment, and projects not yet fully developed on an early modern digital gazetteer, a new prosopographical model for natural law academics, and a project underway at Stanford on 18th-century salons as ‘networks’.

Our hope is that the Digitizing Enlightenment brand will continue on into the future, both in the form of future meetings – at ISECS in 2019 and perhaps Chicago in 2020 – and in a volume currently being edited for the Oxford University Studies in the Enlightenment series, which draws its content from the first two meetings. Should you have any questions about these projects, or our vision for future Digitizing Enlightenment events, please feel free to contact us at: de3@digitizingenlightenment.com

– Gregory Brown and Glenn Roe

The Voltaire Library Project: using digital humanities to understand Voltaire’s influences

Lena Zlock is a rising senior at Stanford University double-majoring in History and French. She is the principal investigator of the Voltaire Library Project, a digital humanities study of Voltaire’s personal library. She will be working at the Voltaire Lab during the summer of 2018. Her work is supported by the Vice Provost for Undergraduate Education at Stanford University. Lena can be reached at lzlock@stanford.edu or @LZlock89.

In the firmament of the Siècle des Lumières, Voltaire is the sun. His presence in the Enlightenment world is enormous by any metric. In life as in death, Voltaire’s name came to signify those who challenged orthodoxy and convention. When asked why philosopher Jean-Paul Sartre had not been arrested for his polemical critique of the Algerian War, French president Charles de Gaulle simply replied, ‘One does not arrest Voltaire’.

To understand Voltaire’s thinking and impact, where better to look than his massive library of 6700 volumes? Much like its owner, the library has both a fascinating history and afterlife. It was sold to Catherine the Great of Russia by Madame Denis – Voltaire’s niece and lover – shortly after the author’s death in 1778. Catherine the Great was one of Voltaire’s most powerful admirers, writing to him in 1763, ‘By chance your works fell into my hands; and since then I have never stopped reading them, and have not wished to have anything to do with books which were not written as well and from which the same profit could not be derived.’[1] In 1779 the books started the perilous journey from Voltaire’s château in Ferney all the way to the Hermitage in St Petersburg. Under the careful watch of the Empress, the library was sorted and placed into the Hermitage Palace, alongside the library of fellow philosophe Denis Diderot. However, the contents of Diderot’s library were dispersed throughout the Hermitage. Voltaire’s library thankfully remained intact, with an occasional book identified as mistakenly belonging to the sire of Ferney.

Battle of the books

‘Before the Title of the Battle’, frontispiece to the Battle of the Books in the 1710 edition (London) of Jonathan Swift’s A Tale of A Tub.

His library is extraordinary for this period not only because it remains intact, but because it was a working library, rather than a collector’s library. Gorbatov’s thesis of ‘the working library’[2] is especially helpful in understanding Voltaire’s interaction with his books. It was active and even chaotic, akin to Jonathan Swift’s depiction of a ‘battle of the books’. But even with this flurry of activity, Voltaire’s was a curated library for the purpose of research and writing, which means that deliberate choices were made as to what became part of the collection. With marginalia in over half the books, the story of the library is in many ways the origin story of Voltaire’s corpus.

In 1961, Soviet researchers put together a catalogue of the library, including titles; names of authors, editors, translators, and publishers; places of publication; and real and false data for books that were censored or printed underground. In studying the library, researchers usually flip through the catalogue to find a particular book (e.g. did Voltaire have works by John Locke? Yes, but not the ones you think). What if we could study hundreds or even thousands of books at once?

With the advent of digital humanities, we can now visualize the full breadth and depth of Voltaire’s ‘laboratory’. How many works of history did Voltaire own? Science? Theology? Jurisprudence? Did he purchase these books or were they gifted to him? How many were clandestinely printed? Where is the historical weight of the library? The goal of my project is to create a three-dimensional portrait of Voltaire’s ‘life of the mind’.

My current project is building a database of the library. I took the library catalogue and ran it through Optical Character Recognition (OCR) software, making it machine readable (I am lucky to have Russian as my native language, otherwise the structure of the catalogue would have been unintelligible). Each of the 6700 books is organized along 130 metadata categories – including data on authorship, publication dates and locations, and censored works. These categories combine modern typologies with those intelligible to Voltaire and his contemporaries, such as 17th-century bookseller hierarchies of genre.

The database is enriched through linked data, drawing on repositories like Wikidata, Geonames, Virtual International Authority File (VIAF), and BnF Gallica. My research is guided by two questions: what forces – social, literary, geographic, political – shaped Voltaire’s library? And what in the library shaped Voltaire’s corpus? These questions are often two sides of the same coin. Using a big-data approach to the library, we can visualize the patterns that shaped the library, and in turn Voltaire’s own work. My goal is to recreate the experience Voltaire himself had as a researcher in the library. The library database will form part of a larger cross-referencing system. This system will incorporate a digitised version of Voltaire’s marginalia, as well as the current Electronic Enlightenment database. Users of the database will be able to reference his marginalia in the books, as well as letters to and from the individuals – authors, publishers, editors – in the library. By immersing ourselves in the laboratory of Voltaire’s mind, we can gain new insights into the Enlightenment’s lodestar.

– Lena Zlock

[1] Quoted in Inna Gorbatov, ‘From Paris to St Petersburg: Voltaire’s library in Russia’, Libraries & the Cultural Record, vol.42, No.3 (2007), p.308-324 (p.308).

[2] Gorbatov, ‘From Paris to St Petersburg’, p.314.