Invited talk at the UK Institute of Historical Research – Digital History Seminar.
Bias detection remains an area of interest for digital humanists, computational linguists, and information studies scholars, who point to biases inherent in our algorithms, software, tools, and platforms. We are only just beginning, however, to examine how computational methods could be used to interrogate our primary textual sources. This project presents a method for bias detection that can be used at a study’s outset with little initial knowledge of the corpus, requires little pre-processing, and is both beginner-friendly and language-agnostic. The method pairs topic modeling with sentiment analysis and targeted close reading of the documents most closely related to topics of interest (based on document-topic weights). In this study, it uncovered the stories of lesser-known actors in the history of Ottoman Algeria, as well as biases inherent in the writing of their histories. The anti-Arab and/or anti-Turkish sentiments one might expect to observe in French colonial texts were absent, but a latent anti-Semitic sentiment appeared in the topic models, indicated by the sentiment analysis scores of topics related to Jewish people and verified through a close reading of related passages.
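To make the workflow concrete, the following is a minimal Python sketch of its last two steps: scoring each topic's sentiment and ranking documents by document-topic weight to build a close-reading list. It is an illustration only, not the code used in the project. The lexicon, topic words, document ids, weights, and function names are all invented placeholders, and a real study would take the topic words and weights from a topic model and the polarity scores from an established sentiment resource for the corpus language.

```python
from statistics import mean

# Hypothetical toy lexicon (word -> polarity in [-1, 1]); an assumption for
# illustration, not a resource named in the talk.
LEXICON = {"peace": 0.8, "prosper": 0.7, "famine": -0.8, "ruin": -0.9}


def topic_sentiment(top_words, lexicon=LEXICON):
    """Mean polarity of a topic's top words found in the lexicon (0.0 if none)."""
    scores = [lexicon[word] for word in top_words if word in lexicon]
    return mean(scores) if scores else 0.0


def documents_for_close_reading(doc_topic_weights, topic, n=2):
    """Return the n document ids with the highest weight for `topic`."""
    ranked = sorted(
        doc_topic_weights,
        key=lambda doc_id: doc_topic_weights[doc_id][topic],
        reverse=True,
    )
    return ranked[:n]


# Invented stand-ins for topic model output: top words per topic, and one
# row of topic weights per document.
topics = {
    0: ["harvest", "famine", "ruin"],
    1: ["treaty", "peace", "prosper"],
}
doc_topic_weights = {
    "doc_a": [0.7, 0.3],
    "doc_b": [0.2, 0.8],
    "doc_c": [0.9, 0.1],
}

# Score every topic, pick the most negative one, and list the documents
# most strongly associated with it for close reading.
scores = {topic: topic_sentiment(words) for topic, words in topics.items()}
most_negative = min(scores, key=scores.get)
reading_list = documents_for_close_reading(doc_topic_weights, most_negative)

print(most_negative, reading_list)
```

A strongly negative topic score in a sketch like this is only a prompt for attention. As the abstract stresses, the finding has to be verified by reading the passages themselves.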
More details and slides are available on the page for this talk.