Decentralized Online Learning With Kernels
Abstract
We consider multiagent stochastic optimization problems over reproducing kernel Hilbert spaces. In this setting, a network of interconnected agents aims to learn decision functions, i.e., nonlinear statistical models, that are optimal in terms of a global convex functional that aggregates data across the network, with access only to locally and sequentially observed samples. We propose solving this problem by allowing each agent to learn a local regression function while enforcing consensus constraints. We use a penalized variant of functional stochastic gradient descent operating simultaneously with low-dimensional subspace projections. These subspaces are constructed greedily by applying orthogonal matching pursuit to the sequence of kernel dictionaries and weights. By tuning the projection-induced bias, we propose an algorithm that allows each individual agent to learn, based on its locally observed data stream and message passing with its neighbors only, a regression function that is close to the globally optimal regression function. That is, we establish that with constant step-size selections, agents’ functions converge to a neighborhood of the globally optimal one while satisfying the consensus constraints as the penalty parameter is increased. Moreover, the complexity of the learned regression functions is guaranteed to remain finite. On both multiclass kernel logistic regression and multiclass kernel support vector classification with data generated from class-dependent Gaussian mixture models, we observe stable function estimation and state-of-the-art performance for distributed online multiclass classification. Experiments on the Brodatz textures further substantiate the empirical validity of this approach.
Journal: IEEE Transactions on Signal Processing
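The update described in the abstract (a penalized functional stochastic gradient step followed by a greedy projection onto a smaller kernel dictionary) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Gaussian kernel, a squared loss for scalar regression rather than the multiclass losses used in the experiments, a quadratic consensus penalty evaluated at each agent's own sample, and a simplified destructive matching-pursuit-style pruning rule. All names (`Agent`, `prune`, `step`, `run_demo`) and parameter values (`eta`, `lam`, `c`, `eps`) are hypothetical.

```python
import numpy as np

def gauss_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

class Agent:
    """One agent's function f(x) = sum_m w[m] * k(D[m], x)."""
    def __init__(self, dim):
        self.D = np.zeros((0, dim))  # dictionary of retained sample points
        self.w = np.zeros(0)         # kernel expansion weights

    def predict(self, x):
        if len(self.w) == 0:
            return 0.0
        return float(gauss_kernel(x[None, :], self.D)[0] @ self.w)

def prune(D, w, eps):
    """Greedily drop atoms while the RKHS-norm error to the input function stays <= eps."""
    K = gauss_kernel(D, D)
    target_sq = w @ K @ w              # squared RKHS norm of the unpruned function f
    keep, beta = list(range(len(w))), w.copy()
    while len(keep) > 1:
        best = None
        for j in keep:
            rest = [m for m in keep if m != j]
            K_rr = K[np.ix_(rest, rest)]
            b = K[rest] @ w            # inner products <k(d_m, .), f>
            coef = np.linalg.lstsq(K_rr, b, rcond=None)[0]  # projection weights
            err_sq = target_sq - 2.0 * coef @ b + coef @ K_rr @ coef
            err = np.sqrt(max(err_sq, 0.0))
            if best is None or err < best[0]:
                best = (err, rest, coef)
        if best[0] > eps:              # removing any further atom exceeds the budget
            break
        _, keep, beta = best
    return D[keep], beta

def step(agents, neighbors, samples, eta=0.3, lam=1e-3, c=0.5, eps=0.05):
    """One synchronous round: penalized functional SGD step, then pruning."""
    grads = []
    for i, (x, y) in enumerate(samples):
        fi = agents[i].predict(x)
        g = fi - y                     # derivative of the squared loss (assumed loss)
        # Assumed message passing: neighbor j reports f_j(x) at agent i's sample x.
        g += c * sum(fi - agents[j].predict(x) for j in neighbors[i])
        grads.append(g)
    for i, (x, _) in enumerate(samples):
        a = agents[i]
        D = np.vstack([a.D, x[None, :]])                   # append the new kernel atom
        w = np.append((1.0 - eta * lam) * a.w, -eta * grads[i])
        a.D, a.w = prune(D, w, eps)                        # control model complexity

def run_demo(T=100, seed=0):
    """Two connected agents learn y = sin(x) from separate noisy sample streams."""
    rng = np.random.default_rng(seed)
    agents = [Agent(dim=1), Agent(dim=1)]
    neighbors = {0: [1], 1: [0]}
    for _ in range(T):
        samples = []
        for _agent in agents:
            x = rng.uniform(-3.0, 3.0, size=1)
            samples.append((x, np.sin(x[0]) + 0.1 * rng.standard_normal()))
        step(agents, neighbors, samples)
    return agents
```

In this sketch the pruning budget `eps` plays the role of the projection-induced bias: a larger value yields smaller dictionaries at the cost of a coarser approximation, while the penalty weight `c` controls how strongly neighboring agents are pulled toward agreement. Calling `run_demo()` returns the two agents, whose dictionary sizes can be inspected via `len(agent.w)`.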