ATR Teaching Resource

Automated Text Recognition

A teaching resource on OCR and HTR for historical documents — from classical pipelines and open-source tools to modern transformer and vision-language models.

What you will find here

Introduction

What is OCR and HTR? The recognition pipeline, key metrics (CER/WER), and the challenges of historical documents.
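
As a rough illustration of the two metrics named above, the following is a minimal sketch in plain Python. It assumes the common definitions: CER is the character-level edit distance divided by the number of reference characters, and WER is the same ratio computed over whitespace-separated words. The function names (`edit_distance`, `cer`, `wer`) and the sample strings are hypothetical and not taken from this resource; real evaluations usually also fix a normalization policy (case, punctuation, abbreviations) before comparing.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits per reference character."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits per reference word (whitespace tokens)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Hypothetical example: one misread character in a three-word line.
ground_truth = "in the beginning"
prediction = "in tne beginning"
print(f"CER = {cer(ground_truth, prediction):.4f}")
print(f"WER = {wer(ground_truth, prediction):.4f}")
```

A single wrong character counts once toward CER but makes the whole word wrong for WER, which is why WER is typically much higher than CER on the same transcription.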

eScriptorium

The open-source annotation and transcription platform built on Kraken. Workflow, training, export, and further resources.

Open-Source Models

A curated overview of publicly available Kraken/eScriptorium models for medieval and early modern manuscripts.

HTR-United

The community initiative for sharing HTR training data and models under standardized metadata.

TrOCR

Microsoft’s transformer-based OCR model — architecture, pre-training, fine-tuning, and when to use it.

Vision Language Models

GPT-4o, Gemini, and open VLMs applied to historical text recognition — possibilities and limitations.

Quiz

Test your knowledge with a Kahoot-style multiple-choice quiz covering all sections of this resource.

Literature

Key readings from the Zotero group Automated Text Recognition, searchable and organized by year.

Workshops

Schedules and materials for hands-on ATR workshops. Next: DMSI 2026 Kalamazoo, 13 May 2026.

Reuse

CC BY 4.0
