Given a set of documents \(\mathscr{D}\), each containing terms \(T_d\), learn a set of "topics" \(\mathscr{T}\) representative of the "semantic" content of the corpus.

Each topic is characterized by a distribution over terms, \(\phi_t\). Each document has a latent mixture over topics, \(\theta_d\).
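
To make the roles of \(\phi_t\) and \(\theta_d\) concrete, here is a minimal sketch of one possible generative process for such a corpus. The Dirichlet priors, their parameters `alpha` and `beta`, the per-token topic draw `z`, and the function name `generate_corpus` are all assumptions for illustration; the text above does not specify them.

```python
import numpy as np

def generate_corpus(n_docs=5, n_topics=3, vocab_size=20, doc_len=50,
                    alpha=0.5, beta=0.5, seed=0):
    """Draw a toy corpus from an assumed Dirichlet-based topic model."""
    rng = np.random.default_rng(seed)
    # phi[t] is topic t's distribution over the vocabulary (assumed Dirichlet prior).
    phi = rng.dirichlet(np.full(vocab_size, beta), size=n_topics)
    # theta[d] is document d's mixture over topics (assumed Dirichlet prior).
    theta = rng.dirichlet(np.full(n_topics, alpha), size=n_docs)
    docs = []
    for d in range(n_docs):
        # Pick a topic for every token position, then a term from that topic.
        z = rng.choice(n_topics, size=doc_len, p=theta[d])
        terms = np.array([rng.choice(vocab_size, p=phi[t]) for t in z])
        docs.append(terms)
    return phi, theta, docs

phi, theta, docs = generate_corpus()
```

Each row of `phi` and of `theta` sums to one, and each entry of `docs` is an array of term indices. Learning the topics amounts to running this process in reverse: given only `docs`, recover plausible values of `phi` and `theta`.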

We want to draw samples from the posterior distribution over our latent random variables, e.g. \(\phi\) and \(\theta\), given the observed terms \(T\). Sampling from it directly is very difficult because the event space is so high-dimensional.

Instead, we iteratively examine subsets of our variables and draw each subset from its conditional distribution given the current values of all the other variables.
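
The iteration described above can be sketched as a blocked sampler that redraws one group of variables at a time while holding the rest fixed. This is only an illustration under stated assumptions: the Dirichlet priors (`alpha`, `beta`), the per-token topic assignments `z`, and the name `gibbs_lda` are hypothetical and do not come from the text.

```python
import numpy as np

def gibbs_lda(docs, n_topics, vocab_size, n_iters=200,
              alpha=0.5, beta=0.5, seed=0):
    """Blocked sampler for an assumed Dirichlet-based topic model."""
    rng = np.random.default_rng(seed)
    docs = [np.asarray(doc, dtype=int) for doc in docs]
    n_docs = len(docs)
    # z[d][i]: topic assigned to token i of document d (assumed auxiliary variable).
    z = [rng.integers(n_topics, size=len(doc)) for doc in docs]
    theta = np.full((n_docs, n_topics), 1.0 / n_topics)
    phi = np.full((n_topics, vocab_size), 1.0 / vocab_size)

    for _ in range(n_iters):
        # Block 1: theta_d given z is Dirichlet(alpha + topic counts in document d).
        for d in range(n_docs):
            counts = np.bincount(z[d], minlength=n_topics)
            theta[d] = rng.dirichlet(alpha + counts)

        # Block 2: phi_t given z and the terms is Dirichlet(beta + term counts for topic t).
        term_counts = np.zeros((n_topics, vocab_size))
        for d, doc in enumerate(docs):
            np.add.at(term_counts, (z[d], doc), 1)
        for t in range(n_topics):
            phi[t] = rng.dirichlet(beta + term_counts[t])

        # Block 3: each z[d][i] given theta and phi is proportional to
        # theta[d, t] * phi[t, term].
        for d, doc in enumerate(docs):
            probs = theta[d][:, None] * phi[:, doc]  # shape (n_topics, len(doc))
            probs /= probs.sum(axis=0)
            cdf = probs.cumsum(axis=0)
            u = rng.random(len(doc))
            # Inverse-CDF draw per token; clamp guards against rounding in the CDF.
            z[d] = np.minimum((u >= cdf).sum(axis=0), n_topics - 1)

    return theta, phi, z

toy_docs = [np.array([0, 1, 0, 1, 0]), np.array([2, 3, 3, 2])]
theta, phi, z = gibbs_lda(toy_docs, n_topics=2, vocab_size=4, n_iters=50)
```

Each block is a low-dimensional draw that is easy to perform, which is the point of the scheme: no step ever has to sample all variables jointly. The returned values are the state after the final sweep; in practice one would typically discard early sweeps and average over later ones.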