
Presentation Information

2012-10-19 (14:15) • The seminar room at Vi2

Speaker: Xuan Tuan Trinh
Type: Master thesis presentation
Title: Online Learning of Multi-class SVMs
Abstract: Support Vector Machines (SVMs) are state-of-the-art learning algorithms for classification problems due to their strong theoretical foundation and their good performance in practice. However, their extension from two-class to multi-class classification problems is not straightforward. While some approaches solve a series of binary problems, other, theoretically more appealing, methods solve a single optimization problem. Training SVMs amounts to solving a convex quadratic optimization problem, but even with a carefully tailored quadratic program solver, training all-in-one multi-class SVMs takes a long time on large-scale datasets. We first consider the problem of training the multi-class SVM proposed by Lee, Lin and Wahba (LLW), the first Fisher-consistent multi-class SVM proposed in the literature, which has recently been shown to exhibit good generalization performance on benchmark problems. Inspired by previous work on online optimization of binary and multi-class SVMs, a fast approximate online solver for the LLW SVM is derived. It makes use of recent developments for efficiently solving all-in-one multi-class SVMs without a bias term. In particular, it uses working sets of size two instead of following the paradigm of sequential minimal optimization. After the online LLW SVM solver has been implemented, it is modified to also support the popular multi-class SVM formulation by Crammer and Singer. This is done using a recently established unified framework for a larger class of all-in-one multi-class SVMs, which makes it easy to adapt the novel online solver to further multi-class SVM formulations in the future. The results suggest that online solvers may provide a robust and well-performing way of obtaining an approximate solution to the full problem, constituting a good trade-off between optimization time and reaching a sufficiently high dual value.
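
The abstract's central algorithmic idea, online dual-style updates that touch only two variables per step, can be illustrated with a minimal sketch. The code below is not the thesis's LLW or Crammer-and-Singer solver; it is the closely related multi-class Passive-Aggressive (PA-I) update, a known online algorithm whose step likewise modifies exactly two class weight vectors. The class name OnlineMulticlassSVM, the parameter C (playing a role analogous to the SVM regularization constant), and the toy data are all illustrative assumptions, not artifacts of the thesis.

    import numpy as np

    class OnlineMulticlassSVM:
        # One weight vector per class; prediction picks the highest-scoring class.
        def __init__(self, n_features, n_classes, C=1.0):
            self.W = np.zeros((n_classes, n_features))
            self.C = C  # cap on the per-step update size (illustrative analogue of SVM's C)

        def partial_fit(self, x, y):
            scores = self.W @ x
            true_score = scores[y]
            scores[y] = -np.inf                 # exclude the true class from the search
            p = int(np.argmax(scores))          # most violating competitor class
            loss = max(0.0, 1.0 - true_score + scores[p])  # multi-class hinge loss
            xx = float(x @ x)
            if loss > 0.0 and xx > 0.0:
                # Closed-form step size; the factor 2 appears because the update
                # changes exactly two rows of W (a "working set of size two").
                tau = min(self.C, loss / (2.0 * xx))
                self.W[y] += tau * x
                self.W[p] -= tau * x

        def predict(self, x):
            return int(np.argmax(self.W @ x))

    # Toy usage on a synthetic stream: 200 Gaussian samples, 4 classes.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    labels = (X[:, 0] > 0).astype(int) + 2 * (X[:, 1] > 0).astype(int)
    clf = OnlineMulticlassSVM(n_features=5, n_classes=4)
    for xi, yi in zip(X, labels):
        clf.partial_fit(xi, yi)
    print(sum(clf.predict(xi) == yi for xi, yi in zip(X, labels)) / len(X))

Each example is seen once and updates at most two rows of the weight matrix, which is the sense in which such solvers trade a small amount of dual progress per step for very cheap iterations, the trade-off the abstract highlights.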