This paper proposes model rotation as a general approach to parallelizing big data machine learning applications. To address the big model problem in parallelization, we distribute the model parameters across inter-node workers and rotate the model parts among them in a ring topology. The advantage of model rotation comes from maximizing the effect of parallel model updates on algorithm convergence while minimizing communication overhead. We formulate a solution using computation models, programming interfaces, and system implementations as design principles, and derive a machine learning framework with three algorithms built on top of it: Latent Dirichlet Allocation using Collapsed Gibbs Sampling, Matrix Factorization using Stochastic Gradient Descent, and Matrix Factorization using Cyclic Coordinate Descent. Performance results on an Intel Haswell cluster with up to 60 nodes show that our solution achieves faster model convergence and higher scalability than previous work.
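The rotation scheme described above can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the framework's actual interface: the names `rotate_ring` and `model_rotation_epoch` are hypothetical, the sequential loop over workers stands in for truly parallel inter-node workers, and a simple additive update stands in for the real per-algorithm updates (Collapsed Gibbs Sampling, Stochastic Gradient Descent, or Cyclic Coordinate Descent).

```python
from typing import Dict, List, Tuple

Partition = Dict[int, float]          # model parameters keyed by integer id
Shard = List[Tuple[int, float]]       # local training data: (parameter id, value)


def rotate_ring(slots: list) -> list:
    """Shift every item to the next worker in the ring (worker w sends to w + 1)."""
    return [slots[-1]] + slots[:-1]


def model_rotation_epoch(
    data_shards: List[Shard],
    partitions: List[Partition],
    part_ids: List[int],
) -> Tuple[List[Partition], List[int]]:
    """Run one epoch: every model partition visits every worker exactly once.

    data_shards[w] is the data held by worker w and never moves.
    partitions[w] is the model part currently held by worker w.
    part_ids[w] is the id of that model part (assumption: parameter `key`
    belongs to partition `key % num_workers`).
    """
    num_workers = len(data_shards)
    for _ in range(num_workers):
        # Conceptually parallel: each worker touches only the partition it holds,
        # so no two workers update the same parameters at the same time.
        for w in range(num_workers):
            pid = part_ids[w]
            part = partitions[w]
            for key, value in data_shards[w]:
                if key % num_workers == pid:
                    # Placeholder update; a real algorithm would apply its own rule.
                    part[key] = part.get(key, 0.0) + value
        # Communication step: pass each partition to the ring neighbor.
        partitions = rotate_ring(partitions)
        part_ids = rotate_ring(part_ids)
    return partitions, part_ids


if __name__ == "__main__":
    shards = [[(0, 1.0), (1, 2.0)], [(0, 10.0), (3, 4.0)]]
    parts, ids = model_rotation_epoch(shards, [{}, {}], [0, 1])
    print(ids, parts)
```

In this toy setting each worker sends and receives one model partition per step, and after as many steps as there are workers every partition has been updated against every data shard and is back where it started.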