Text Analysis Reading Group Syllabus

Text Analysis Reading Group – Spring 2017

Every other Friday: 1/13, 1/27, 2/10, 2/24, 3/10, 3/24, 4/14, 4/28, 5/12
2:00-3:00pm
Mudd Conference Room (3rd Floor Mudd, next to Keck 2)

Meeting 1 — January 13

Preprint of Rockwell, Geoffrey. “What is Text Analysis, Really?”, Literary and Linguistic Computing Vol. 18, no. 2 (2003): 209-219. http://geoffreyrockwell.com/publications/WhatIsTAnalysis.pdf
Ted Underwood, “Seven Ways Humanists Are Using Computers to Understand Text,” The Stone and the Shell, June 4, 2015, https://tedunderwood.com/2015/06/04/seven-ways-humanists-are-using-computers-to-understand-text/.
O’Connor, Brendan, David Bamman, and Noah A. Smith. “Computational Text Analysis for Social Science: Model Assumptions and Complexity.” Second Workshop on Computational Social Science and Wisdom of the Crowds (NIPS 2011). (Be sure to check the endnotes for articles in your field of interest; they serve as examples of text analysis in discipline-specific research.)
OPTIONAL – Skill Development:
- Play with your own texts in Voyant Tools. You can also check out the documentation for each tool in this suite.
- If you choose to learn R for text analysis:
  - Matthew L. Jockers, Text Analysis with R for Students of Literature (Springer, 2014), http://link.springer.com/10.1007/978-3-319-03164-4, chs. 1–2.
  - Install R and the RStudio Desktop IDE (you may wish to install the preview version to get notebook support). Start to become familiar with the basics of R as described in the introductory chapters of either Jockers or Arnold and Tilton. You should also start to become familiar with the basics of the Unix-style command line (see Shotts, Linux Command Line, as a reference).
- If you would prefer to learn Python for text analysis:
  - Exploratory Programming for the Arts and Humanities, Introduction and chapters 1-2.

Meeting 2 — January 27

Matthew K. Gold et al., “Forum: Text Analysis at Scale,” in Debates in the Digital Humanities 2016 (University of Minnesota Press, 2016), 525–568, http://dhdebates.gc.cuny.edu/debates/text/93.
Tim Hitchcock and William J. Turkel, “The Old Bailey Proceedings, 1674–1913: Text Mining for Evidence of Court Behavior,” Law and History Review 34, no. 4 (November 2016), 929-955. doi:10.1017/S0738248016000304.
OPTIONAL – Skill Development:
- R: Taylor Arnold and Lauren Tilton, Humanities Data in R (Springer, 2015), http://link.springer.com/10.1007/978-3-319-20702-5, chs. 1–2.
- Python: Exploratory Programming for the Arts and Humanities, Chapters 3-4.

Meeting 3 — February 10

Julia Flanders and Fotis Jannidis, “Data Modeling,” in A New Companion to the Digital Humanities, ed. Susan Schreibman, Ray Siemens, and John Unsworth (Wiley Blackwell, 2016), 229–37.
Stéfan Sinclair and Geoffrey Rockwell, “Text Analysis and Visualization: Making Meaning Count,” in A New Companion to the Digital Humanities, ed. Susan Schreibman, Ray Siemens, and John Unsworth (Wiley Blackwell, 2016), 274–90.
OPTIONAL – Skill Development:
- R: Arnold and Tilton, Humanities Data in R, chs. 9-10.
- Jure Leskovec, Anand Rajaraman, and Jeff Ullman, Mining of Massive Datasets, 2nd ed. (Cambridge University Press, 2014), http://www.mmds.org/, ch. 1.
- Python: Exploratory Programming for the Arts and Humanities, Chapters 5-6.

Meeting 4 — February 24

Shawn Graham, Ian Milligan, and Scott Weingart, Exploring Big Historical Data: The Historian’s Macroscope (Imperial College Press, 2015), ch. 3. [You can request through ILL]
Michael Witmore, “Text: A Massively Addressable Object,” in Debates in the Digital Humanities 2012 (University of Minnesota Press, 2012), http://dhdebates.gc.cuny.edu/debates/text/28.
OPTIONAL – Skill Development:
- R: Jockers, Text Analysis, chs. 11–12.
- Python: Exploratory Programming for the Arts and Humanities, Chapter 7.

Meeting 5 — March 10

Matthew L. Jockers and Ted Underwood, “Text-Mining the Humanities,” in A New Companion to the Digital Humanities, ed. Susan Schreibman, Ray Siemens, and John Unsworth (Wiley Blackwell, 2016), 291–306.
D. Sculley and Bradley M. Pasanek, “Meaning and Mining: The Impact of Implicit Assumptions in Data Mining for the Humanities,” Literary and Linguistic Computing 23, no. 4 (2008): 409–424, http://llc.oxfordjournals.org/content/23/4/409.short.
OPTIONAL – Skill Development:
- R: Gareth James et al., An Introduction to Statistical Learning with Applications in R (Springer, 2013), chs. 2, 3, 10.
- Leskovec, Rajaraman, and Ullman, Mining of Massive Datasets, ch. 3. http://www.mmds.org/
- Python: Exploratory Programming for the Arts and Humanities, Chapter 10.

Meeting 6 — March 24

Ryan Cordell, “Reprinting, Circulation, and the Network Author in Antebellum Newspapers,” American Literary History 27, no. 3 (September 1, 2015): 417–445, doi:10.1093/alh/ajv028.
David A. Smith, Ryan Cordell, and Abby Mullen, “Computational Methods for Uncovering Reprinted Texts in Antebellum Newspapers,” American Literary History 27, no. 3 (September 1, 2015): E1–E15, doi:10.1093/alh/ajv029.
OPTIONAL – Skill Development:
- Leskovec, Rajaraman, and Ullman, Mining of Massive Datasets, chs. 3 & 7. http://www.mmds.org/
- Python: Exploratory Programming for the Arts and Humanities, Chapter 11.

Meeting 7 — April 14 (Topic Modeling)

Journal of Digital Humanities 2, no. 1 (Winter 2012), http://journalofdigitalhumanities.org/2-1/.
Robert K. Nelson and Digital Scholarship Lab, University of Richmond, “Mining the Dispatch,” 2011, http://dsl.richmond.edu/dispatch/.
OPTIONAL – Skill Development:
- Python: Exploratory Programming for the Arts and Humanities, Chapter 14.

Meeting 8 — April 28 (Topic Modeling)

Graham, Milligan, and Weingart, Exploring Big Historical Data, ch. 4.
David J. Newman and Sharon Block, “Probabilistic Topic Decomposition of an Eighteenth-Century American Newspaper,” Journal of the American Society for Information Science and Technology 57, no. 6 (2006): 753–767, http://onlinelibrary.wiley.com/doi/10.1002/asi.20342/full.
OPTIONAL – Skill Development:
- R: Jockers, Text Analysis, ch. 13.

Meeting 9 — May 12

Benjamin Schmidt, “Vector Space Models for the Digital Humanities,” October 25, 2015, http://bookworm.benschmidt.org/posts/2015-10-25-Word-Embeddings.html.
Michael A. Gavin, “The Arithmetic of Concepts: A Response to Peter de Bolla,” Modeling Literary History, September 18, 2015, http://modelingliteraryhistory.org/2015/09/18/the-arithmetic-of-concepts-a-response-to-peter-de-bolla/.
Matt J. Kusner et al., “From Word Embeddings to Document Distances,” in Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), 2015, 957–966, http://www.jmlr.org/proceedings/papers/v37/kusnerb15.pdf.
OPTIONAL – Skill Development:
- GloVe vignette from text2vec and wordVectors documentation.

Most of the content for this reading group is based on Lincoln Mullen’s “Text Analysis for Historians” class at George Mason University. See http://lincolnmullen.com/courses/text-analysis.2016/ for his full syllabus.

Ashley R. Sanders, Ph.D.