Presentation Information: 2015-11-09 (14:15) • The seminar room at Vi2

Speaker Tomas Wilkinson  (CBA)
Title A Novel Word Segmentation Method Based on Object Detection and Deep Learning
Abstract The segmentation of individual words is a crucial step in several data mining methods for historical handwritten documents. Example applications include visual search for query words (word spotting) and character-by-character text recognition. In this talk, we present a novel method for word segmentation adapted from recent advances in computer vision, deep learning and generic object detection. Our method has several distinctive capabilities and has found practical use in our current research project: it can easily be trained for different kinds of historical documents, uses full grayscale information, and requires neither binarization as pre-processing nor prior segmentation of individual text lines. We evaluate its performance using established error metrics previously used in word segmentation competitions, and demonstrate its usefulness on a 15th-century handwritten document. This project is part of q2b, From quill to bytes, a framework program sponsored by the Swedish Research Council (Vetenskapsrådet, Dnr 2012-5743) and Uppsala University. The work is done in part in collaboration with the Swedish Museum of Natural History (Naturhistoriska riksmuseet).