
Topic Modeling 18th Century American Correspondence

Posted in Digital Humanities, and Percolating Ideas

This is a lightning talk of ongoing research, given at the 2018 American Historical Association meeting on January 4, 2018. I’ve revised the text of the talk to provide more details about this project below.

Initially, this talk was entitled, “Text Mining 18th Century American Correspondence,” but I began my research by using topic modeling to highlight documents that may be most helpful.

I am particularly interested in looking at matters of land, territory, and property in the correspondence records to understand why settlers moved into the Northwest Territory (or the modern Midwest) and how American political leaders viewed these lands. This is part of a larger project that interrogates the origins of American settler colonialism as practiced by the United States in comparison with the formation of the French settler colony in Algeria.

A 1776 petition from settlers in the modern state of Kentucky to the Virginia Assembly planted the seed of an idea for this project. In this missive, the settlers begged for assistance to fend off avaricious land speculators and raiding Native warriors. The petitioners positioned themselves as men and women who had moved across the mountains “in order to provide a subsistence for themselves and their posterity.”[1] This sentiment was repeated in numerous settler letters – both official and private. Americans and newly arrived immigrants flooded into the backcountry to achieve a “competence,” which required independently owned land that would provide a comfortable living for their families and future generations.


To investigate the notion of competency as a motivational factor for the emigration of Euro-American settlers to the “Northwest Territory,” what is now the American Midwest, I decided to topic model the correspondence of members of the Continental Congress and petitions from settlers in the Northwest Territory.

Topic modeling is a text analysis approach that uses an unsupervised machine learning algorithm (there are several options, including LDA) to identify the underlying “topics” that organize a group of documents. My own application of topic modeling does, indeed, use LDA, or Latent Dirichlet Allocation, as implemented in MALLET, an open-source package from the University of Massachusetts, Amherst. For a tutorial on MALLET, check out the lesson in Programming Historian.
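For readers who want to try this themselves, here is a minimal Python sketch of how a MALLET run might be scripted. It only assembles the two command lines (import a folder of text files, then train a topic model) and prints them. The folder names, the output file names, and the path to the MALLET launcher are hypothetical placeholders, and the topic count of 10 is only an example value. This is not necessarily how the models in this post were produced.

```python
from pathlib import Path


def mallet_commands(corpus_dir, work_dir, num_topics, mallet_bin="bin/mallet"):
    """Build the two MALLET command lines as lists of arguments.

    corpus_dir, work_dir, and mallet_bin are hypothetical paths; adjust them
    to wherever MALLET and the plain-text documents actually live.
    """
    work = Path(work_dir)
    corpus_file = work / "corpus.mallet"

    # Step 1: convert a directory of .txt files into MALLET's internal format.
    import_cmd = [
        mallet_bin, "import-dir",
        "--input", str(corpus_dir),
        "--output", str(corpus_file),
        "--keep-sequence",      # required for topic modeling
        "--remove-stopwords",   # drop common English function words
    ]

    # Step 2: train an LDA model and write out the topic keys and the
    # per-document topic proportions (the "composition" file).
    train_cmd = [
        mallet_bin, "train-topics",
        "--input", str(corpus_file),
        "--num-topics", str(num_topics),
        "--output-topic-keys", str(work / "topic_keys.txt"),
        "--output-doc-topics", str(work / "doc_topics.txt"),
    ]
    return import_cmd, train_cmd


import_cmd, train_cmd = mallet_commands("petitions", "model_output", num_topics=10)
print(" ".join(import_cmd))
print(" ".join(train_cmd))
```

To actually run the commands, each list could be passed to `subprocess.run(..., check=True)` on a machine where MALLET is installed.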



Due to the small size of the corpus, I first created a 10-topic model of the 25 petitions and memorials to Congress between 1781 and 1797 using the LDA algorithm built into MALLET and visualized with Lexos. In a review of the model, topic number 4 stood out as significant for this study.[2] This topic includes the following words:

Petitioners Post Rights State
America United-States Friends Industry
Welfare People Indian Tract
Total Bound Duty Husbandry
Pleased Justice Extensive Ancient

These words describe settlers’ petitions for access to land and recognition of prior land rights that Native communities granted settlers before 1784, as well as how settlers used and planned to use the land. I then opened the composition document in Excel to determine which documents were best represented by topic 4 by sorting the data in the Topic 4 column in descending order:

Doc # | Doc Name | Topic 0 | Topic 1 | Topic 2 | Topic 3 | Topic 4
2 | 1787-07-26_PetitionInhabitantsVincennes | 2.40E-05 | 0.467546094 | 2.11E-04 | 1.52E-04 | 0.531214389
25 | 1798-02-01_PetitionSciotoInhabitants | 2.81E-05 | 0.512145799 | 2.47E-04 | 1.78E-04 | 0.481390277
10 | 1788-04-08_MemorialParsonsWarnum | 1.15E-05 | 0.546927174 | 0.002148067 | 7.25E-05 | 0.425866987
1 | 1787-05-08_MemorialParsonsAssociates | 1.63E-05 | 0.58105877 | 1.43E-04 | 1.03E-04 | 0.418101121
22 | 1797-04-01_PetitionGallipolis | 0.002262416 | 0.064927378 | 0.423793312 | 0.014302638 | 0.010407824
4 | 1787-08-27_PetitionIllinoisCountry | 1.03E-04 | 0.002950938 | 0.019261334 | 6.50E-04 | 4.73E-04
6 | 1787-09-15_PetitionIllinoisCountry | 7.73E-05 | 0.057408792 | 6.80E-04 | 4.89E-04 | 3.56E-04
9 | 1788-02-20_Tardiveau_GovStClair | 7.52E-05 | 0.002158484 | 0.040943151 | 4.75E-04 | 3.46E-04
3 | 1787-08-07_PetitionVincennes | 7.32E-05 | 0.198242202 | 6.44E-04 | 0.170451195 | 3.37E-04
23 | 1797-08-07_PetitionKnoxCounty | 6.26E-05 | 0.996477642 | 5.51E-04 | 3.96E-04 | 2.88E-04
15 | 1790-05-01_MemorialFatherGibault | 5.26E-05 | 0.283263622 | 0.49822794 | 0.066075147 | 2.42E-04
11 | 1788-08-08_MemorialRoyalFlint | 3.29E-05 | 0.036138565 | 2.89E-04 | 2.08E-04 | 1.51E-04
19 | 1793_ObservationsOnPetitionFrenchGallipolis | 2.35E-05 | 0.5499958 | 2.07E-04 | 0.448831109 | 1.08E-04
20 | 1796-03-25_PetitionZane | 2.12E-05 | 0.367370892 | 1.86E-04 | 1.34E-04 | 9.74E-05
8 | 1788-03-02_Tardiveau_PresCong | 1.99E-05 | 5.71E-04 | 1.75E-04 | 1.26E-04 | 9.15E-05
18 | 1792-12-22_PetitionFrenchGallipolis | 1.99E-05 | 5.71E-04 | 1.75E-04 | 1.26E-04 | 9.15E-05
12 | 1788-08-29_MemorialEttwein | 1.97E-05 | 0.035717452 | 0.959832655 | 1.24E-04 | 9.06E-05
21 | 1797-02-20_PetitionILCtry | 1.93E-05 | 0.024651803 | 1.70E-04 | 0.003564537 | 8.87E-05
24 | 1797-12-27_PetitionAmInhabitantsVincennes | 1.92E-05 | 0.219371068 | 0.007006659 | 0.615556356 | 8.81E-05
13 | 1788-09-09_GeorgeMorgan | 1.65E-05 | 0.224348033 | 1.45E-04 | 0.771882174 | 7.59E-05
17 | 1792-04-11_MemorialILWabashCo | 1.41E-05 | 0.799949601 | 0.164068675 | 8.93E-05 | 6.50E-05
5 | 1787-08-29_PetitionSymmes | 1.30E-05 | 3.74E-04 | 0.999006598 | 8.25E-05 | 6.00E-05
14 | 1788-08-29_MemorialEttwein | 1.05E-05 | 0.039740304 | 9.25E-05 | 6.65E-05 | 4.84E-05
7 | 1788-02-28_MemorialVincennesILCtry | 8.64E-06 | 0.014126592 | 7.60E-05 | 5.46E-05 | 3.97E-05
16 | 1790-06-09_MemorialFatherGibault | 0.592737459 | 1.14E-04 | 3.48E-05 | 2.50E-05 | 1.82E-05

As the spreadsheet above shows, documents 2, 25, 10, and 1 are most representative of the topic of interest, topic 4. Reviewing these documents, I highlighted the words from topic 4 in which I was most interested: industry, welfare, pleased, rights, duty, husbandry, and settled.
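The same sorting step can be done without a spreadsheet. The sketch below is a small Python illustration, not the procedure used for this post: it takes seven rows copied from the table above (document number, name, and Topic 4 proportion only) and ranks them by Topic 4 in descending order. The column names `doc_id`, `doc_name`, and `topic_4` are hypothetical labels chosen for the example; an actual MALLET composition file is tab-separated and laid out differently depending on the version.

```python
import csv
import io

# Seven rows copied from the table above, deliberately out of order.
# Column names are hypothetical labels for this illustration.
COMPOSITION_CSV = """doc_id,doc_name,topic_4
5,1787-08-29_PetitionSymmes,6.00E-05
1,1787-05-08_MemorialParsonsAssociates,0.418101121
23,1797-08-07_PetitionKnoxCounty,2.88E-04
2,1787-07-26_PetitionInhabitantsVincennes,0.531214389
22,1797-04-01_PetitionGallipolis,0.010407824
10,1788-04-08_MemorialParsonsWarnum,0.425866987
25,1798-02-01_PetitionSciotoInhabitants,0.481390277
"""


def rank_by_topic(csv_text, column):
    """Return the rows sorted by the given topic column, largest first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # float() handles both plain decimals and scientific notation (6.00E-05).
    return sorted(rows, key=lambda row: float(row[column]), reverse=True)


ranked = rank_by_topic(COMPOSITION_CSV, "topic_4")
for row in ranked[:4]:
    print(row["doc_id"], row["doc_name"], row["topic_4"])
```

With these rows, the four highest-ranked documents are 2, 25, 10, and 1, matching the order in the spreadsheet.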


In a close reading of these documents, with the addition of document #3, which I identified as related through my familiarity with the sources, I found that settlers’ desire to achieve a “competency,” or the means to provide for their families on independently owned land, motivated at least the petitioners’ decision to make a life in the western territories. They sought governmental assistance to access these lands at reasonable prices and the recognition and validation of pre-existing claims. To determine how representative these letters are, I will need to compare the text of these petitions with a larger corpus of settler correspondence, and the list of signatories with the list of all settlers who appear in correspondence records.

I then became interested in how the settlers, themselves, defined competency. These documents show that petitioners also described competency as the ability to “rear children in a comfortable manner” and “raise a subsistence by their [own] industry.” In order to achieve these aims, they begged Congress for reasonable land prices. Their petitions also expressed an underlying notion of a “right” to the western lands, which they believed they had earned through the sacrifice of leaving loved ones to immigrate west, their labor, and the improvements that they had made (or planned to make) to the land.

The next step was to compare the settlers’ petitions with the correspondence of Continental Congress members to understand how political leaders involved in the creation of the United States viewed the western land and its purpose. Given the much larger size of the corpus, I created a 125-topic model of the Continental Congress members’ correspondence and followed the same procedure of looking through the topics to identify those of most interest and then reviewed the letters that had the highest proportion of words from these topics, as well as additional political leaders’ letters from the Territorial Papers. Of these missives, Arthur St. Clair, governor of the Northwest Territory, best articulated the political leaders’ vision for these lands:

This extensive Region is blessed with a fertile Soil and desirable Climate in every part of it which has yet been explored; and the Inhabitants of the neighbouring States, very early discovered a strong Disposition to take Possession of it:—Congress, in order to turn that Disposition to the public Advantage, and to secure to the united States the Benefits that were expected to flow from the right to the Soil, as a fund for discharging the Domestic Debts, gave orders that it should be sold, and established a form of Government for the future Inhabitants…
(St. Clair to Washington, Aug. 1789, Territorial Papers of the United States, v. 2).

Whereas settlers were most interested in the opportunities for personal independence, upward social mobility, and an endowment for their children that access to western land provided, the correspondence of members of the Continental Congress focused on its financial benefits. Political leaders were most interested in selling the lands that Native communities ceded to pay off the national debt.

If we consider the underlying definition of competency as providing an independent, comfortable living, without burdensome debt that returns one to a position of dependency, then this is precisely what both the settlers and the American political leaders were attempting to achieve, albeit through different means. However, the government could not achieve its goals without people to buy the land; therefore, the desires of both the settlers and political leaders proved mutually reinforcing.

There was a high price to be paid for Euro-American independence though, as we can see, particularly in the topics generated from the Continental Congress members’ correspondence records:

Notice, in particular, the following words from these topics: transmitted, negotiations, ceding, extinguishment, extinguishing, and, ominously, funeral. Most of these words are more or less neutral, but they elide the brutal effects of American land acquisition and expropriation from Native communities.

This project is only a first step in analyzing these, and other, correspondence records from the late 18th century to understand this region’s development within the framework of settler colonization and the impulses that brought the Northwest Territory into the United States.

Next Steps:

  • Consider the significance of the findings from the topic model, including issues of representativeness of the topics and findings shared above
  • Continue cleaning Congress members’ correspondence records
  • Integrate additional sources in American political leaders’ corpus and extend time period covered to 1797
  • Integrate additional source materials into settler correspondence corpus
  • Use techniques such as part-of-speech tagging and collocation analysis to study each population’s understanding of the Northwest Territory, motivations, and objectives.
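As a rough illustration of what the collocation step in the list above could look like, here is a minimal Python sketch that scores adjacent word pairs by pointwise mutual information (PMI), one common collocation measure. The sample sentences are invented placeholders, not text from the petitions, and the function names and the `min_count` threshold are assumptions made for the example. A real analysis would run over the cleaned corpus and would likely use an established text analysis library.

```python
import math
import re
from collections import Counter

# Invented placeholder sentences, NOT taken from the petitions.
SAMPLE_TEXT = (
    "The western lands were settled by families. "
    "The western lands were sold by Congress. "
    "Families sought land."
)


def tokenize(text):
    """Lowercase the text and keep only alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bigram_pmi(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information.

    PMI = log2( P(w1, w2) / (P(w1) * P(w2)) ). Pairs seen fewer than
    min_count times are skipped, because PMI is unreliable for rare pairs.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_tokens = len(tokens)
    n_pairs = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_pairs
        p_w1 = unigrams[w1] / n_tokens
        p_w2 = unigrams[w2] / n_tokens
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    # Highest-scoring (most strongly associated) pairs first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


collocations = bigram_pmi(tokenize(SAMPLE_TEXT))
for pair, score in collocations:
    print(pair, round(score, 2))
```

Part-of-speech tagging is left out here because it needs a trained tagger from an external library.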

[1] “Petition from the Inhabitants of Kentucky, 15 June 1776,” in George Rogers Clark Papers, Illinois Historical Collections 8: 11-12.

[2] For an introduction to analyzing topic models, see Miriam Posner’s blog post. For those familiar with, or willing to learn, R, see how to run MALLET from within R and view the results with Graham, Milligan & Weingart’s “Topic Modeling with R,” in The Historian’s Macroscope.
