Breaking New Ground in the Digital Humanities

Author: Aaron Smith

Matthew Wilkens, an assistant professor in Notre Dame’s Department of English, recently won a prestigious fellowship from the American Council of Learned Societies (ACLS) for his groundbreaking digital humanities research.

In naming Wilkens one of seven scholars to receive its 2014 Digital Innovation Fellowship, ACLS described his project, Literary Geography at Scale, as “one of the largest humanities text-mining projects to date and the first truly large-scale study of 20th- and 21st-century literature.” The work is also supported by a $100,000 grant from the Notre Dame Office of Research.

Stephen Fallon, John J. Cavanaugh Professor in the Humanities and chair of the Department of English, said the award “confirms Matt’s role at the cutting edge of the rapidly growing field of digital humanities. His project will not only contribute to his book on geography and American fiction, it will also lead to a marvelous digital resource for countless researchers.”

Literary Geography at Scale

Wilkens, who studies contemporary American fiction and develops computational methods of literary criticism, said his project seeks to determine how literature uses geographic space—what kinds of locations it pays attention to, how those places change over time, and what factors might be driving those changes.

“You could try to do that in the usual way that English professors do, reading a few important books and showing what they’re up to,” he said. “But I want to work at a different scale, not on a handful of books, but on literature over hundreds of years. You’re not going to do that by reading millions of books.”

Instead, Wilkens is using a computer to pick out the place names in a collection of about 10 million digitized volumes. “I’ll then match those names with geographic data, plot them on different maps, and see what they look like,” he explained.
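At its core, that workflow is a named-entity-recognition and geocoding pipeline. The sketch below is a minimal illustration of the general approach, not Wilkens’s actual code; it assumes spaCy’s pretrained English model for entity recognition and substitutes a three-entry toy gazetteer for real geographic data.

```python
# Illustrative sketch of the pipeline described above (not Wilkens's actual
# code): find place names with off-the-shelf named-entity recognition, match
# them against geographic coordinates, and plot the result. Assumes spaCy,
# matplotlib, and spaCy's small English model are installed:
#   pip install spacy matplotlib && python -m spacy download en_core_web_sm
from collections import Counter

import matplotlib.pyplot as plt
import spacy

nlp = spacy.load("en_core_web_sm")

# Toy gazetteer standing in for a real geographic database (hypothetical
# entries; a full study would resolve names against millions of records).
GAZETTEER = {
    "Chicago": (41.88, -87.63),
    "New York": (40.71, -74.01),
    "Paris": (48.86, 2.35),
}

def extract_place_counts(text):
    """Count geopolitical (GPE) and location (LOC) entities in a text."""
    doc = nlp(text)
    return Counter(ent.text for ent in doc.ents if ent.label_ in ("GPE", "LOC"))

sample = "She left Chicago for Paris, though New York was never far from her mind."
counts = extract_place_counts(sample)

# Match extracted names to coordinates and plot them; marker size reflects
# how often each place is mentioned.
matched = [(name, GAZETTEER[name], n) for name, n in counts.items() if name in GAZETTEER]
lats = [lat for _, (lat, _), _ in matched]
lons = [lon for _, (_, lon), _ in matched]
sizes = [40 * n for _, _, n in matched]
plt.scatter(lons, lats, s=sizes)
plt.xlabel("Longitude")
plt.ylabel("Latitude")
plt.show()
```

At the scale Wilkens describes, the same three steps run over millions of volumes rather than a single sentence, which is why the project needs supercomputing time rather than a laptop.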

The digital corpus is administered by the HathiTrust Research Center, a descendant of the Google Books project and notable not just for its size—the largest in the world—but also for the fact that it includes material published during the 20th century.

“That’s really important,” Wilkens said. “A huge portion of the world’s literature has been written in the last hundred years, but almost all of the computational research that’s been done on books so far has been limited to the 19th century and earlier. This work will be one of the first to make use of newer texts.”

The real stakes are only partly about geography, he noted. “What I’m really trying to get at is how literature changes over time on large scales, how it develops differently in different places, and its relationship to the cultures that produce it.”

Wilkens, who is on a full-year sabbatical to focus on his research, said the ACLS award offers access to supercomputers to run the calculations involved and will pay for the design and hosting of a public website for the final data.

Text Mining the Novel

In addition to his work on Literary Geography at Scale, Wilkens is a co-investigator on a six-year project called “Text Mining the Novel: Establishing the Foundations of a New Discipline” (http://novel-tm.ca/?page_id=22), or NovelTM for short.

The project, which recently received a $1.8 million grant from Canada’s Social Sciences and Humanities Research Council, seeks to produce the first large-scale, cross-cultural study of the novel using quantitative methods.

Led by Andrew Piper at McGill University, the NovelTM team includes experts from a range of humanities and computer science backgrounds.

“We’re aiming above all to figure out what works and what doesn’t when we bring computation to literary studies,” Wilkens said. “There’s a lot of excitement about literature and big data, and there’s been real progress joining the two already, but it’s early days and a lot of the work so far has been relatively isolated. We want to see what’s possible with a larger and more systematic collaboration.”

Each year, he said, the team will carry out a set of studies organized around a single theme, using a range of computational techniques. “So, for instance, one year we might work on the problem of influence or genre or nationality, pursuing as many different approaches as we can. What we’ll end up with is, in the best case, quite a bit of new knowledge about how these issues play out at large scale and also a better sense of which techniques are suited to particular classes of problems. Both of those are really important.

“It’s hard to emphasize enough how little we know at the moment about either literature as a whole beyond the traditional canon or about the best ways to get critically useful information from millions of digitized books.”
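As a concrete example of what one of those themed studies might look like computationally, the sketch below trains a simple bag-of-words genre classifier. It is purely illustrative and assumes scikit-learn; the texts and labels are invented, and nothing here represents the NovelTM team’s actual methods.

```python
# Illustrative sketch of one kind of quantitative method a project like
# NovelTM might try: a bag-of-words genre classifier (hypothetical data and
# labels; not the team's actual methodology). Assumes scikit-learn:
#   pip install scikit-learn
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus: short snippets standing in for full novels (invented examples).
texts = [
    "The detective studied the bloodstain on the carpet.",
    "Her heart raced as he took her hand beneath the stars.",
    "The inspector questioned the butler about the missing key.",
    "They danced until dawn, and she knew she was in love.",
]
labels = ["mystery", "romance", "mystery", "romance"]

# TF-IDF word weights feed a logistic-regression classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["The constable found a second clue in the garden."]))
```

Scaled up to thousands of novels, results from models like this one are the kind of evidence the team hopes to compare across approaches, which is how it would learn which techniques suit which classes of problems.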

Scientific Approach to Literature

As an undergraduate, Wilkens majored in chemistry and philosophy at the College of William and Mary. He went on to receive a master’s in English at the University of Wisconsin and a master’s in physical chemistry from the University of California at Berkeley before completing his Ph.D. in literature at Duke University.

He continues to blend science and the humanities as a faculty member in Notre Dame’s Department of English and as part of the faculty for the new Computing and Digital Technologies Minor in the College of Arts and Letters.

Wilkens is also the current president of Digital Americanists, a scholarly society dedicated to the study of American literature, culture, and digital media.

When he returns to the classroom next fall, Wilkens will continue to encourage his undergraduate and graduate students to apply computational methods in their educational and career pursuits.

“The combination of close reading and computational methods is just really powerful,” he said. “Learning to move between interpreting individual texts and analyzing data about them opens up all kinds of possibilities for a better understanding of the world. And it means that students will never be entirely out of their depth, no matter where their work takes them.”