Text Analysis Reading Group Syllabus
Text Analysis Reading Group – Spring 2017
Every other Friday: 1/13, 1/27, 2/10, 2/24, 3/10, 3/24, 4/14, 4/28, 5/12
2:00-3:00pm
Mudd Conference Room (3rd Floor Mudd, next to Keck 2)
Meeting 1 — January 13:
- Preprint of Rockwell, Geoffrey. “What is Text Analysis, Really?”, Literary and Linguistic Computing Vol. 18, no. 2 (2003): 209-219. http://geoffreyrockwell.com/publications/WhatIsTAnalysis.pdf
- Ted Underwood, “Seven Ways Humanists Are Using Computers to Understand Text,” The Stone and the Shell, June 4, 2015, https://tedunderwood.com/2015/06/04/seven-ways-humanists-are-using-computers-to-understand-text/.
- O’Connor, Brendan, David Bamman and Noah A. Smith. “Computational Text Analysis for Social Science: Model Assumptions and Complexity.” Second Workshop on Computational Social Science and Wisdom of the Crowds (NIPS 2011). (Be sure to check out the endnotes for articles in your field of interest as examples of using text analysis in discipline-specific research.)
- OPTIONAL – Skill Development:
- Play with your own texts in Voyant Tools. You can also check out the documentation for each tool in this suite.
- If you choose to learn R for text analysis:
- Matthew L. Jockers, Text Analysis with R for Students of Literature (Springer, 2014), http://link.springer.com/10.1007/978-3-319-03164-4, chs. 1–2.
- Install R and the RStudio Desktop IDE (you may wish to install the preview version to get notebook support). Start to become familiar with the basics of R as described in the introductory chapters of either Jockers or Arnold and Tilton. You should also start to become familiar with the basics of the Unix-style command line (see Shotts, Linux Command Line, as a reference).
- If you would prefer to learn Python for text analysis:
- Exploratory Programming for the Arts and Humanities, Introduction and chapters 1-2.
Meeting 2 — January 27:
- Matthew K. Gold et al., “Forum: Text Analysis at Scale,” in Debates in the Digital Humanities 2016 (University of Minnesota Press, 2016), 525–568, http://dhdebates.gc.cuny.edu/debates/text/93.
- Tim Hitchcock and William J. Turkel, “The Old Bailey Proceedings, 1674–1913: Text Mining for Evidence of Court Behavior,” Law and History Review 34, no. 4 (November 2016), 929-955. doi:10.1017/S0738248016000304.
- OPTIONAL – Skill Development:
Meeting 3 — February 10
- Julia Flanders and Fotis Jannidis, “Data Modeling,” in A New Companion to the Digital Humanities, ed. Susan Schreibman, Ray Siemens, and John Unsworth (Wiley Blackwell, 2016), 229–37.
- Stéfan Sinclair and Geoffrey Rockwell, “Text Analysis and Visualization: Making Meaning Count,” in A New Companion to the Digital Humanities, ed. Susan Schreibman, Ray Siemens, and John Unsworth (Wiley Blackwell, 2016), 274–90.
- OPTIONAL – Skill Development:
- R: Arnold and Tilton, Humanities Data in R, chs. 9-10.
- Jure Leskovec, Anand Rajaraman, and Jeff Ullman, Mining of Massive Datasets, 2nd ed. (Cambridge University Press, 2014), http://www.mmds.org/, ch. 1.
- Python: Exploratory Programming for the Arts and Humanities, Chapters 5-6.
Meeting 4 — February 24
- Shawn Graham, Ian Milligan, and Scott Weingart, Exploring Big Historical Data: The Historian’s Macroscope (Imperial College Press, 2015), ch. 3. [You can request through ILL]
- Michael Witmore, “Text: A Massively Addressable Object,” in Debates in the Digital Humanities 2012 (University of Minnesota Press, 2012), http://dhdebates.gc.cuny.edu/debates/text/28.
- OPTIONAL – Skill Development:
- R: Jockers, Text Analysis, chs. 11–12.
- Python: Exploratory Programming for the Arts and Humanities, Chapter 7.
Meeting 5 — March 10
- Matthew L. Jockers and Ted Underwood, “Text-Mining the Humanities,” in A New Companion to the Digital Humanities, ed. Susan Schreibman, Ray Siemens, and John Unsworth (Wiley Blackwell, 2016), 291–306.
- D. Sculley and Bradley M. Pasanek, “Meaning and Mining: The Impact of Implicit Assumptions in Data Mining for the Humanities,” Literary and Linguistic Computing 23, no. 4 (2008): 409–424, http://llc.oxfordjournals.org/content/23/4/409.short.
- OPTIONAL – Skill Development:
- R: Gareth James et al., An Introduction to Statistical Learning with Applications in R (Springer, 2013), chs. 2, 3, 10.
- Leskovec, Rajaraman, and Ullman, Mining of Massive Datasets, ch. 3. http://www.mmds.org/
- Python: Exploratory Programming for the Arts and Humanities, Chapter 10.
Meeting 6 — March 24
- Ryan Cordell, “Reprinting, Circulation, and the Network Author in Antebellum Newspapers,” American Literary History 27, no. 3 (September 1, 2015): 417–445, doi:10.1093/alh/ajv028.
- David A. Smith, Ryan Cordell, and Abby Mullen, “Computational Methods for Uncovering Reprinted Texts in Antebellum Newspapers,” American Literary History 27, no. 3 (September 1, 2015): E1–E15, doi:10.1093/alh/ajv029.
- OPTIONAL – Skill Development:
- Leskovec, Rajaraman, and Ullman, Mining of Massive Datasets, chs. 3 & 7. http://www.mmds.org/
- Python: Exploratory Programming for the Arts and Humanities, Chapter 11.
Meeting 7 — April 14 (Topic Modeling)
Meeting 8 — April 28 (Topic Modeling)
- Graham, Milligan, and Weingart, Exploring Big Historical Data, ch. 4.
- David J. Newman and Sharon Block, “Probabilistic Topic Decomposition of an Eighteenth-Century American Newspaper,” Journal of the American Society for Information Science and Technology 57, no. 6 (2006): 753–767, http://onlinelibrary.wiley.com/doi/10.1002/asi.20342/full.
- OPTIONAL – Skill Development:
Meeting 9 — May 12
- Benjamin Schmidt, “Vector Space Models for the Digital Humanities,” October 25, 2015, http://bookworm.benschmidt.org/posts/2015-10-25-Word-Embeddings.html.
- Michael A. Gavin, “The Arithmetic of Concepts: A Response to Peter de Bolla,” Modeling Literary History, September 18, 2015, http://modelingliteraryhistory.org/2015/09/18/the-arithmetic-of-concepts-a-response-to-peter-de-bolla/.
- Matt J. Kusner et al., “From Word Embeddings to Document Distances,” in Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), 2015, 957–966, http://www.jmlr.org/proceedings/papers/v37/kusnerb15.pdf.
- OPTIONAL – Skill Development:
Most of the content for this reading group is based on Lincoln Mullen’s “Text Analysis for Historians” class at George Mason University. See http://lincolnmullen.com/courses/text-analysis.2016/ for his full syllabus.