2012-12-10

Speaker Fredrik Wahlberg
Title Word-spotting, what is it and why do we do it.
Abstract In many disciplines of humanities research, the only sources are handwritten manuscripts. They are often degraded, written in old script styles and in almost forgotten languages. When answering a specific question, the available data is therefore often quite limited and must be selected carefully. The holy grail for these researchers would be a Google for handwritten sources. However, a full automatic transcription of handwritten text is still a very challenging problem, both for humans and machines. One way of supporting research in fields like philology, history and linguistics is to compromise on the full transcription and just try to find specific words or tokens. This is what, in the technical part of the field, is called word-spotting. If specific images of words can be reliably matched throughout a manuscript, searching is possible without solving the harder problem of a full transcription. I will present an overview of the last decade of research in word-spotting, along with possible applications. I will try to present this research from both a humanist and a more technical perspective.