A study of inference methods for latent Dirichlet allocation (LDA), including collapsed Gibbs sampling and variational inference, covering both single-machine and distributed implementations.
FastLDA: Fast Collapsed Gibbs Sampling For Latent Dirichlet Allocation, KDD'2008
SparseLDA: Efficient Methods for Topic Model Inference on Streaming Document Collections, KDD'2009
AliasLDA: Reducing the Sampling Complexity of Topic Models, KDD'2014
F+LDA: A Scalable Asynchronous Distributed Algorithm for Topic Modeling, 2014
LightLDA: Big Topic Models on Modest Compute Clusters, 2014
WarpLDA: a Cache Efficient O(1) Algorithm for Latent Dirichlet Allocation, VLDB'2016
CVB: A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation, 2007
Stochastic Variational Inference, 2013
Stochastic methods avoid scanning the whole dataset at each iteration, which is the time-consuming step in batch mode.
Each iteration subsamples the data and then adjusts the hidden structure based only on that subsample.
Variational inference is amenable to stochastic optimization because the variational objective decomposes into a sum of terms, one for each data point in the analysis.
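The subsample-then-update loop can be illustrated with a minimal sketch. It is not the algorithm from the paper: as a simplifying assumption it uses LDA reduced to a single topic, so there are no per-document local variables and the only hidden structure is one Dirichlet parameter over the vocabulary. The function name `svi_single_topic`, the toy corpus, and the hyperparameters `eta`, `tau`, and `kappa` are all hypothetical choices made for illustration.

```python
import random


def svi_single_topic(docs, vocab_size, eta=0.1, batch_size=2, n_steps=200,
                     tau=1.0, kappa=0.7, seed=0):
    """Stochastic updates of a Dirichlet parameter over words.

    docs: list of documents, each a list of integer word ids.
    Returns the variational Dirichlet parameter `lam` (one entry per word).
    """
    rng = random.Random(seed)
    n_docs = len(docs)
    lam = [1.0] * vocab_size  # global variational parameter (arbitrary init)
    for t in range(n_steps):
        # 1) Subsample the data instead of scanning the whole corpus.
        batch = rng.sample(docs, batch_size)
        # 2) Noisy estimate of the full-data update: the objective is a sum
        #    over documents, so batch counts rescaled by n_docs / batch_size
        #    estimate the full count vector without bias.
        scale = n_docs / batch_size
        lam_hat = [eta] * vocab_size
        for doc in batch:
            for w in doc:
                lam_hat[w] += scale
        # 3) Move toward the estimate with a decreasing step size
        #    (0.5 < kappa <= 1 for the Robbins-Monro conditions).
        rho = (t + tau) ** (-kappa)
        lam = [(1.0 - rho) * old + rho * new for old, new in zip(lam, lam_hat)]
    return lam


# Toy corpus (hypothetical): 6 documents of 3 words over a 4-word vocabulary.
toy_docs = [[0, 0, 1], [0, 2, 0], [0, 0, 3], [1, 0, 0], [0, 0, 2], [0, 3, 0]]
toy_lam = svi_single_topic(toy_docs, vocab_size=4)
print([round(v, 2) for v in toy_lam])
```

In this toy setting the batch answer is simply `eta` plus the full word counts, so the stochastic estimate can be compared against it directly. With more than one topic, a per-document local step would run between steps 1 and 2 to compute the expected topic counts for the sampled documents.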
Stochastic Collapsed Variational Bayesian Inference for Latent Dirichlet Allocation, KDD'2013
CVB0: On Smoothing and Inference for Topic Models, 2009
Sparse stochastic inference for latent Dirichlet allocation, ICML'2012
Distributing the Stochastic Gradient Sampler for Large-Scale LDA, KDD'2016
Learning Topic Models by Belief Propagation, 2012
Residual Belief Propagation for Topic Modeling, 2012
Memory-Efficient Topic Modeling
Two types of implementation:
- Share the word-topic matrix using a parameter server (PS)
- Shuffle the doc-word matrix together with its topic assignments
WarpLDA: a Cache Efficient O(1) Algorithm for Latent Dirichlet Allocation, VLDB'2016
On Smoothing and Inference for Topic Models, 2009
Rethinking Collapsed Variational Bayes Inference for LDA, ICML'2012