Previous research has established several methods of online learning for latent Dirichlet allocation (LDA). However, streaming learning for LDA, which allows only one pass over the data and requires constant storage complexity, is not as well explored. We use reservoir sampling to reduce the storage complexity of a previously studied online algorithm, the particle filter, to constant. We then show that a simpler particle filter implementation performs just as well, and that the quality of the initialization dominates other factors of performance.
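To illustrate the constant-storage idea, the following is a minimal sketch of generic reservoir sampling (the classic "Algorithm R"), which keeps a uniform random sample of fixed size `k` from a stream seen only once. It is not the particle filter from this work, and how the sample would be used inside that filter is not shown here. The function name `reservoir_sample` and its parameters are hypothetical, chosen only for this example.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from a one-pass stream.

    Hypothetical helper for illustration only: storage is O(k) no matter
    how long the stream is, which is the property the abstract relies on.
    """
    rng = rng or random.Random()
    reservoir = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the n-th item with probability k / n by drawing a
            # slot index uniformly from [0, n) and replacing on a hit.
            j = rng.randrange(n)
            if j < k:
                reservoir[j] = item
    return reservoir
```

For example, `reservoir_sample(range(1000), 10)` returns 10 items drawn from the 1000 without ever holding more than 10 in memory. A stream shorter than `k` is returned in full.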