Transcription of digitized documents is an important task that is gaining a lot of attention from the image processing and pattern recognition research community. Degradation of the
documents over time, variation in writing styles, and inconsistent document backgrounds that depend on the writing medium are but a few of the challenges encountered when working in this field.
Hidden Markov models (HMMs), which are based on sound mathematical formulations, are among the methods that have been successful in this area. These models have their
origin in speech recognition and are good at handling data sequences. Since text is often written along one direction, handling written text as a sequence of characters fits well
within the HMM paradigm.
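
As an illustration of how an HMM scores a sequence, the following is a minimal sketch of the scaled forward algorithm for a discrete-output HMM. It is not the framework used in the presentation: the function name, the two-state left-to-right toy model, and the use of symbols 0 and 1 as stand-ins for quantized feature vectors are all assumptions made for the example.

```python
import math


def forward_log_likelihood(start, trans, emit, obs):
    """Return log P(obs | model) for a discrete HMM via the scaled forward algorithm.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][k]  : probability of emitting symbol k while in state i
    obs         : non-empty list of observed symbol indices
    """
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n = len(start)
    # Initialisation: start in state i and emit the first symbol.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Recursion: sum over predecessor states, then emit symbol obs[t].
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        # Rescale to avoid underflow on long sequences; accumulate the log.
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Hypothetical two-state left-to-right model (no transition back to state 0).
START = [1.0, 0.0]
TRANS = [[0.5, 0.5],
         [0.0, 1.0]]
EMIT = [[0.9, 0.1],
        [0.2, 0.8]]

score = forward_log_likelihood(START, TRANS, EMIT, [0, 1])
```

In a recognition setting, one such model would typically be trained per character class, and a feature sequence would be assigned to the class whose model yields the highest log-likelihood.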
This presentation discusses an HMM-based approach to evaluating some of the feature extraction procedures frequently used in the character recognition literature, by comparing them
within a common Hidden Markov Model framework. This should facilitate a better understanding of the features and also help capture the information that aids each feature
extractor in the classification task. The effects of model topologies and data normalization are also studied on different handwritten and synthetic datasets.