AHA Roundtable Presentation
Ashley Sanders, Ph.D.
Digital Scholarship Librarian, Claremont Colleges
New York City, January 2, 2015
In an animated discussion with other graduate students gathered around a homemade dinner in Aix-en-Provence, I discovered that I was not wandering alone in the darkness as I sought out methods to search the archives and organize my findings. Each of us had questions about the best methodologies and techniques to use in our research, as well as how to take advantage of the new digital tools available.
In my brief comments today, I will outline just a few of the challenges and opportunities that archival research in the digital age presents, along with several tools and techniques with which to tackle the problems posed.
My biggest challenge when I conducted dissertation research was an unusual one for a historian. I discovered an overwhelming mountain of sources on the conquest and colonization of Algeria in the French colonial archives and an equally intimidating number on the colonization of the American Midwest.
In addition to the paralyzing plethora of sources I found, I also faced two corollary dilemmas: (1) how to find relevant documents amidst the avalanche of materials in which I was immersed, and (2) how to organize everything once I found it. Sifting (or filtering) and organizing are two challenges that most historians are just now beginning to grapple with.
Problem 1: Abundance. Ever-expanding access to what historian Dr. William Turkel has dubbed the “infinite archive” of digital sources has opened up new vistas for research, but greater access to vast materials also poses a new set of challenges. The shift from what the late Roy Rosenzweig has called a “culture of scarcity to a culture of abundance” is a new experience for many historians, one that necessarily changes our workflow as well as how we educate the next generation of historians. As T. Mills Kelly has observed, both historians and students have to learn how to “mak[e] sense of a million sources.” Digital tools are therefore essential to tackling the challenges and opportunities that digitization’s abundance presents. In fact, this is the essence of Dan Cohen’s definition of digital history “as the theory and practice of bringing technology to bear on the abundance we now confront.”
Problem 2: Filtering. A corollary problem is that of searching and filtering. To find both primary and secondary materials today, nearly all of us use web-based searches through a university library, the site of the archives we wish to explore, and search engines like Google. This means that whether or not we call ourselves “digital historians,” we are, by Cohen’s definition. But do we understand how the search capabilities of each of these portals function to retrieve results? And, “why,” you may ask, “is this important?” Most significantly, we must understand that the algorithms that search engines like Google and Bing use to return results attempt to tailor or customize our results based on our web-searching and click patterns. The algorithms try to predict what we are most interested in seeing, which means that potentially relevant results are filtered out in the process. It also means we are seeing results from a narrower perspectival field. Both of these consequences present obstacles for a historian attempting to understand and interpret events and people, particularly when the subject involves views or organizations that differ from the researcher’s personal interests or political views. For example, a left-leaning male academic interested in studying the conservative backlash to the women’s liberation movement of the 1960s will see very different Google search results from those of a right-leaning male or female scholar, which introduces an unknown, invisible bias.
The difficulty of appropriately filtering results is a second hidden problem associated with digital searching. In my own research, for example, I had difficulty tracking down whether a source referenced a father or son by the last name of De Lesseps. Attaching name references to data in the Name Authority File would clarify this issue and make search results more accurate. It would also address the problem of finding sources that contain alternate spellings of names, women whose names often change upon marriage, and pseudonyms. Linked Open Data, or the Semantic Web, offers great promise for historians, but we are still years away from seeing a fully linked web of data.
Even though filtering remains a challenge, scholars can now more easily search for and find secondary works both within and outside their field. Searching and finding tools have advanced to include customizable Google searches and alerts, tools like Serendipomatic, RSS feeds, and digital library notifications. Historians can also find relevant content through regular expression searches, text mining, and document and topic clustering.
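To make the idea of a regular expression search concrete, here is a minimal sketch in Python. It assumes transcribed or OCR'd text is already available as plain strings; the pattern, the function name, and the sample snippets are all invented for illustration and do not come from any archive. It looks for spelling variants of a surname such as De Lesseps.

```python
import re

# Hypothetical pattern: matches "De Lesseps", "de Lesseps", "Delesseps",
# and "DeLesseps" by allowing an optional space and ignoring case.
NAME_PATTERN = re.compile(r"\bde\s?lesseps\b", re.IGNORECASE)


def find_name_mentions(documents):
    """Return {doc_id: [matched spellings]} for documents that mention the name."""
    hits = {}
    for doc_id, text in documents.items():
        matches = NAME_PATTERN.findall(text)
        if matches:
            hits[doc_id] = matches
    return hits


# Invented sample snippets, for illustration only.
sample_documents = {
    "box12_item3": "Letter forwarded to M. de Lesseps regarding the consulate.",
    "box12_item7": "Report signed Delesseps, countersigned by the prefect.",
    "box14_item1": "Inventory of grain shipments for the spring season.",
}

if __name__ == "__main__":
    for doc_id, spellings in find_name_mentions(sample_documents).items():
        print(doc_id, spellings)
```

A search like this only gathers the variant spellings in one place. It cannot tell a father from a son who share the surname; that kind of disambiguation still depends on context or on authority records such as the Name Authority File.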
Possibility 1: A much wider range in the scale of historical inquiries now exists. We have access to increasingly better tools to help us find, read, annotate, analyze, process, store, and make connections between vast amounts of data. For micro-histories, it is now far easier to delve deeper into the historical record and make more nuanced connections, thanks to better search features and the greater accessibility of materials through digital archiving projects. Through new tools and developments in cyberinfrastructure for and by humanists, such as the SNAC Project (Social Networks and Archival Context), which reveals the social networks of historical figures to gauge an individual’s influence, historians have more ways than ever to investigate the impact of an individual or small group of people.
According to Brown University social historian Jo Guldi, the reason we turn to the Digital Humanities is for projects of scale. She suggests that we are in the midst of a turn from micro to macro histories. Her own toolkit, Paper Machines, co-designed with Christopher Johnson-Roberson, is an example of the shift she describes. This digital tool assists historians in aggregating and analyzing numerous documents to examine how textual themes change over time. It is now possible to track various themes in specific journals through the decades, compare expansive text sets, and investigate larger patterns in the spread and influence of ideas over the longue durée.
Possibility 2: The digital environment in which we now work also makes possible larger collaborative, comparative, and interdisciplinary projects. In my own work, as one example, I am developing a digital repository for materials related to settler colonialism that also allows scholars to build online exhibits, including geospatial temporal exhibits. It is my intention to use the Omeka platform to facilitate such collaborative, comparative projects with scholars from many different locations. This ambitious project would not be possible without a digital hub that allows for easy communication and the sharing of resources.
The new H-Net Commons is another example: every day, scholars are working together to create innovative projects on the new Drupal platform. Some networks, like the American Studies network, are creating their own image archive. Others are collaborating across networks to facilitate and host important conversations (H-Material Culture and SciMedTech), and we are periodically developing “crossroads” networks for even larger collaborative undertakings. The first crossroads network was devoted to this past summer’s World Cup, but the most recent one focuses on World War I and is even more robust, with multiple networks contributing content, projects, pages, syllabi, and discussions.
In art history, the Getty Research Institute is building a scholar workspace as an open source digital platform with accompanying toolset, as well as technical and methodological manuals. The digital workspace and tools tailored specifically to the study of visual objects will enable scholars to collaboratively analyze digital representations of primary art objects and create born-digital publications.
In the early stages of my research, I experimented with Zotero, NoteBook, and Nota Bene. It took months of trying out different tools and reading all of the articles I could find on digital workflows to create one that was comfortable and appropriate given the limitations of my hardware and time. To create reference files of the sources I found, I eventually settled on taking digital photos of my sources, collating related pages from each document, tagging, and filing them. I also moved all of my relevant notes into Evernote and began adding tags. As I conducted research, I took notes in Evernote about how the contents of each microfilm and file box related to my research, essentially creating my own finding aid. It’s a simple system that has worked relatively well, but there are many things I will do differently in the future, including creating a database of my sources, as Rachel will describe next. To address the challenges of abundance and organization, Rachel will discuss how database management and writing software can be used to make sense of historical sources once we find them.
 Roy Rosenzweig, “Scarcity or Abundance? Preserving the Past,” Clio Wired: The Future of the Past in the Digital Age (New York: Columbia University Press, 2011).
 Dan Cohen et al., “Interchange: The Promise of Digital History,” Journal of American History 95, no. 2 (2008); www.journalofamericanhistory.org/issues/952/interchange
 This problem is eloquently outlined in Seth Denbo’s article “Linking the Past: History and the Semantic Web,” Perspectives on History (October 2014). http://historians.org/publications-and-directories/perspectives-on-history/october-2014/linking-the-past (Accessed: 30 November 2014). Library of Congress Name Authority File: http://id.loc.gov/authorities/names.html. (Accessed 19 December 2014).
 Tom J. Lynch, “Social Networks and Archival Context Project: A Case of Emerging Cyberinfrastructure,” Digital Humanities Quarterly 8, no. 3 (2014). http://www.digitalhumanities.org/dhq/vol/8/3/000184/000184.html (Accessed 19 December 2014).
 “History: The Key to Decoding Big Data,” Times Higher Education (2 October 2014). Web. https://www.timeshighereducation.com/features/history-the-key-to-decoding-big-data/2016026.article.
 Cf. “Research Projects,” The Getty Research Institute. http://getty.edu/research/scholars/research_projects/index.html; Judith Dobrzynski, “Modernizing Art History,” The Wall Street Journal (28 April 2014) http://www.wsj.com/articles/SB10001424052702304518704579519632304010744; Francesca Albrezzi, “Creating ‘Getty Scholars’ Workspace’: Lessons from the Digital Humanities Trenches,” The Getty Iris (6 March 2014). http://blogs.getty.edu/iris/creating-getty-scholars-workspace-lessons-from-the-digital-humanities-trenches/