Presentation Information     2014-12-15 (14:15)   •  The seminar room at Vi2

Speaker Alicia Fornés
Comment Computer Vision Center - Universitat Autònoma de Barcelona
Title Recognition of the Historical Marriage License Books of the Cathedral of Barcelona.
Abstract The analysis and recognition of historical document images has attracted growing interest in recent years. Mass digitization and document image understanding allow the preservation, access and indexing of this artistic, cultural and technical heritage. This talk will focus on the recognition of historical handwritten documents, specifically the Marriage License Books preserved at the Archives of the Cathedral of Barcelona. Because of its continuity over five centuries, the source constitutes a unique instrument for studying the dynamics of population distribution. In this context, Document Image Analysis and Recognition (DIAR) techniques can be used to automatically extract the information contained in these handwritten documents. First, we propose a hybrid language model for recognizing these syntactically structured documents, which is able to recognize out-of-dictionary words (e.g. new surnames, occupations) in controlled parts of the recognition, while keeping a closed-vocabulary restriction for other parts. Second, we propose several word spotting techniques (query-by-example and query-by-string) that allow the user to search and extract information in these documents when a full transcription is not feasible or available.