Text Analysis

Why digital text analysis?
  • For exploration, hypothesis generation
    • Playing around with text (even if you don’t know what you’re doing) can be an effective way to gain a new perspective on the texts you study
  • To expand the audience for under-recognized texts

What questions do you want to answer by analyzing your text? Whether you have a specific hypothesis in mind or your work is more exploratory, analyzing the frequency of words in your text can be a good place to start. The results may confirm your suspicions or yield surprising outputs that lead to further research.


Word Frequency

We can visualize word frequency using Voyant Tools. Voyant Tools is a modular online tool set for visualizing one or many documents.
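If you are curious what happens behind a word frequency display, the counting step can be sketched in a few lines of Python. This is only an illustration and not Voyant's own code: the function name `word_frequencies`, the sample sentence, and the simple rule for what counts as a word are all assumptions made for this example.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count how often each lowercased word appears in text."""
    # Simplifying assumption: a word is a run of letters, optionally
    # with internal straight apostrophes (as in "don't").
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(words)

# A short sample sentence used only for illustration.
sample = "I went to the woods because I wished to live deliberately."
counts = word_frequencies(sample)

# Print the words with the highest counts first.
for word, count in counts.most_common(5):
    print(word, count)
```

Real tools make more careful decisions about punctuation, hyphens, and non-English characters, so their counts may differ slightly from a sketch like this.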


Basics of Voyant


Tool Interactions

An essential feature of Voyant is that its tools are linked: an event in one tool, such as clicking on a word, can cause changes in other tools (the exact interactions depend on a number of factors, including which tools are visible).


Stopword List

A stopword is a word (usually a commonly used word) that an application has been programmed to ignore. Stopword lists typically contain common words such as a, an, the, and, to, and from.

Stopword lists are customizable. Depending on your research question, it can be useful to add other frequent words, such as character names or place names, so that they are left out of the analysis.

Voyant automatically applies a stopword list when it analyzes your text, but you can edit or remove this list as you see fit.
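To make the idea concrete, here is a minimal sketch of stopword filtering in Python. It is not how Voyant is implemented: the tiny `STOPWORDS` set, the function name `frequencies_without_stopwords`, and the `extra` parameter are hypothetical and exist only for this example.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; real lists are much longer.
STOPWORDS = {"a", "an", "the", "and", "to", "from", "of", "i"}

def frequencies_without_stopwords(text, stopwords=STOPWORDS, extra=()):
    """Count words in text, skipping stopwords and any extra words."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    # Combine the base list with any words the researcher adds.
    ignore = set(stopwords) | {w.lower() for w in extra}
    return Counter(w for w in words if w not in ignore)

# An invented sample sentence.
sample = "The pond and the woods: I went from the pond to the woods."

# Default list only.
counts = frequencies_without_stopwords(sample)

# Customized list: also ignore a place word.
counts_custom = frequencies_without_stopwords(sample, extra=["pond"])

print(counts)
print(counts_custom)
```

Adding "pond" through `extra` mirrors the customization described above: the word disappears from the counts, which lets less obvious words come forward.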


Stopwords Make a Difference

Compare the following two screenshots. The first uses Voyant’s Auto-detect stopword setting. In this case, it has recognized this text as English, so it is using a standard English stopword list. The second uses Voyant’s None stopword setting.

[Figure: Voyant's Cirrus visualization of Walden with an English stopword list]
[Figure: Voyant's Cirrus visualization of Walden without a stopword list]

If you have your own stopword list, you can upload it instead, or you can simply add words to one of Voyant’s ready-made lists. To add or edit stopwords, click on the options icon in the top right of the visualization. This opens a dialog box where you can select a list or edit it.

[Figure: Click on the 'options' tab (upper right)]
[Figure: Voyant's stoplist options]

Contexts

Word counts give us some information about a text, but often we need more context to make sense of how those words are being used. The Voyant “contexts” tool shows each occurrence of a word with its surrounding text. The tool lets you choose from the text’s most frequent words, or you can type a word of your choice into the search box in the lower left-hand corner of the panel. Move the “context” slider next to the search box to view more words before and after your search term.

[Figure: A search for the contexts of 'heard' in Walden]
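The kind of listing a contexts tool produces is often called a keyword-in-context (KWIC) view. A minimal sketch in Python, assuming a simple whitespace split and a hypothetical function `keyword_in_context` whose `window` parameter plays the role of the context slider:

```python
import re

def keyword_in_context(text, keyword, window=5):
    """Return (left, match, right) for each occurrence of keyword,
    with up to `window` words of context on each side."""
    tokens = re.findall(r"\S+", text)
    target = keyword.lower()
    lines = []
    for i, token in enumerate(tokens):
        # Strip surrounding punctuation before comparing.
        if token.strip(".,;:!?\"'()").lower() == target:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, token, right))
    return lines

# An invented sample sentence.
sample = "I heard the owl at night, and I heard the loon at dawn."
results = keyword_in_context(sample, "heard", window=2)

for left, word, right in results:
    print(f"{left:>12} | {word} | {right}")
```

Raising `window` returns more words before and after each match, much like moving the slider.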

To see how often words appear in close proximity to each other, you can also open the “Links” tool in the top left-hand panel of the main Voyant view (the same panel where you see “Cirrus”).
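Counting words that appear near a chosen word can be sketched as follows. This is a simplified illustration under stated assumptions, not the method Voyant uses for Links: the function name `collocates` and the fixed `window` of neighboring words are choices made for this example.

```python
import re
from collections import Counter

def collocates(text, keyword, window=2):
    """Count the words found within `window` words of each
    occurrence of keyword."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    target = keyword.lower()
    counts = Counter()
    for i, word in enumerate(words):
        if word == target:
            start = max(0, i - window)
            # Words just before and just after this occurrence.
            neighbors = words[start:i] + words[i + 1:i + 1 + window]
            counts.update(neighbors)
    return counts

# An invented sample sentence.
sample = "I heard the owl at night, and I heard the loon at dawn."
near_heard = collocates(sample, "heard", window=2)

for word, count in near_heard.most_common():
    print(word, count)
```

In practice you would usually remove stopwords first, since words like "the" and "I" tend to dominate the neighbors of almost any term.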


Other Voyant Tools

Voyant has robust documentation. In it, you’ll find many tools that don’t appear on the landing page. You can access these tools by hovering your cursor over the grid icon in the upper right-hand corner of each tool box.



Word Tree

Want to compare similarities between your words’ contexts? Visualizing your text using a word tree can help with this.

A word tree is a visual search tool for text. It takes a word or phrase you choose and groups the patterns in the text that follows it. These contexts are arranged in a tree-like branching structure that reveals recurrent themes and phrases. The results are sometimes unsurprising and sometimes revealing, and they can lead to further research.

[Figure: Word Tree visualization of Bob Dylan's 'Blowin' in the Wind']
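The branching structure behind a word tree can be sketched as nested dictionaries, where each level holds the words that follow the previous one. This is an illustrative sketch only: the function names `build_word_tree` and `print_tree`, the `depth` parameter, and the sample text are all assumptions made for this example.

```python
import re

def build_word_tree(text, root, depth=3):
    """Group the words that follow each occurrence of root into a
    nested dictionary, up to `depth` words deep."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    target = root.lower()
    tree = {}
    for i, word in enumerate(words):
        if word == target:
            node = tree
            # Walk down the tree, creating branches as needed.
            for following in words[i + 1:i + 1 + depth]:
                node = node.setdefault(following, {})
    return tree

def print_tree(tree, indent=0):
    """Print the tree with indentation showing each branch level."""
    for word, children in tree.items():
        print(" " * indent + word)
        print_tree(children, indent + 2)

# An invented sample text.
sample = "The pond was still. The pond was deep. The woods were quiet."
tree = build_word_tree(sample, "the", depth=3)
print_tree(tree)
```

Phrases that start the same way share a branch, so repeated openings such as "pond was" appear once with their different endings beneath them.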