These are the materials for the course "Text recognition and analysis", given 6-7 Feb. 2025 at the Leibniz-Institut für Europäische Geschichte (IEG), Mainz. This book serves as a reference during the course and as a general introduction to Handwritten Text Recognition / Optical Character Recognition (HTR/OCR).
This reference gives an overview of the most common tools for historical (handwritten) text recognition. I will also briefly discuss the initial digitization and potential citizen science components of such projects, drawing on my experience leading the Congo basin eco-climatological data recovery and valorisation project. The focus is on the practical issues of such projects and how to resolve them efficiently and cost-effectively. This course is a practical tool, not a theoretical machine learning reference; it should give you an idea of what it takes to start a data recovery effort.