Text mining refers to the automated extraction of high-quality information from (digitized) text, and it is becoming an indispensable tool for the modern humanities scholar. In this workshop, we offer a practical introduction to text mining with Python. We will show examples of (1) how to compute word frequencies, (2) how to easily determine the language of a text, (3) how to compute the similarity between texts, and (4) how to effectively cheat your way through reading a dense novel. Because ‘text’ can be construed broadly, this workshop will cover a diverse array of genres, from 19th-century literature to subtitles of present-day TV shows. Although we will use actual Python code during the workshop, no prior programming experience is required.
You can find the workshop materials here.
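To give a flavour of topics (1) and (3), the sketch below counts word frequencies and compares two texts using only the Python standard library. It is not taken from the workshop materials: the function names `word_frequencies` and `cosine_similarity`, the simple regular-expression tokenizer, the choice of cosine similarity as the measure, and the two sample sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each lowercased word occurs in `text`."""
    # Assumption: a "word" is a run of ASCII letters, digits, or apostrophes.
    # Real tokenizers handle accents, hyphens, and punctuation more carefully.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tokens)


def cosine_similarity(freq_a, freq_b):
    """Cosine similarity between two word-frequency tables (0.0 to 1.0)."""
    # Only words that occur in both texts contribute to the dot product.
    shared = set(freq_a) & set(freq_b)
    dot = sum(freq_a[word] * freq_b[word] for word in shared)
    norm_a = math.sqrt(sum(count * count for count in freq_a.values()))
    norm_b = math.sqrt(sum(count * count for count in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical sample sentences, invented for this example.
    text_a = "The whale surfaced, and the crew watched the whale in silence."
    text_b = "The crew watched the sea, but the whale did not surface."

    freq_a = word_frequencies(text_a)
    freq_b = word_frequencies(text_b)

    print(freq_a.most_common(3))  # the three most frequent words in text_a
    print(round(cosine_similarity(freq_a, freq_b), 2))
```

A score near 1 means the two texts use largely the same words in similar proportions, and a score of 0 means they share no words at all. Raw counts like these are dominated by common function words such as "the", so in practice such words are often filtered out or down-weighted before comparing texts.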