[algorithm] What is the difference between a generative and a discriminative algorithm?

The short answer

Many of the answers here rely on the widely used mathematical definition [1]:

  • Discriminative models directly learn the conditional predictive distribution p(y|x).
  • Generative models learn the joint distribution p(x,y) (or rather, p(x|y) and p(y)).
    • The predictive distribution p(y|x) can then be obtained with Bayes' rule.
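To make the two routes to p(y|x) concrete, here is a minimal sketch on a made-up toy dataset (one binary feature, one binary label). The data and the function names are my own assumptions for illustration, not taken from any of the sources. The generative route estimates p(y) and p(x|y) and applies Bayes' rule; the discriminative route estimates p(y|x) directly and never models x.

```python
from collections import Counter

# Hypothetical toy data: pairs (x, y) with a binary feature x and a binary label y.
data = [(0, 0), (0, 0), (0, 0), (1, 0), (0, 1), (1, 1), (1, 1), (1, 1)]

def fit_generative(pairs):
    """Estimate the prior p(y) and the likelihood p(x|y) by counting."""
    n = len(pairs)
    y_counts = Counter(y for _, y in pairs)
    xy_counts = Counter(pairs)
    prior = {y: c / n for y, c in y_counts.items()}
    likelihood = {(x, y): c / y_counts[y] for (x, y), c in xy_counts.items()}
    return prior, likelihood

def posterior_from_generative(prior, likelihood, x):
    """Bayes' rule: p(y|x) = p(x|y) p(y) / sum over y' of p(x|y') p(y')."""
    joint = {y: likelihood.get((x, y), 0.0) * prior[y] for y in prior}
    evidence = sum(joint.values())  # this is p(x)
    return {y: j / evidence for y, j in joint.items()}

def fit_discriminative(pairs):
    """Estimate p(y|x) directly by counting labels per x; p(x) is never modeled."""
    x_counts = Counter(x for x, _ in pairs)
    xy_counts = Counter(pairs)
    return {(x, y): c / x_counts[x] for (x, y), c in xy_counts.items()}

prior, likelihood = fit_generative(data)
gen_posterior = posterior_from_generative(prior, likelihood, x=1)
disc = fit_discriminative(data)

print("generative p(y=1|x=1):    ", gen_posterior[1])
print("discriminative p(y=1|x=1):", disc[(1, 1)])
```

In this fully tabular case both routes give the same p(y|x), because nothing restricts either model. The two approaches start to differ once you impose modeling assumptions, e.g. naive Bayes versus logistic regression. Note also that only the generative fit gives you p(x) (the `evidence` term) as a by-product.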

Although very useful, this narrow definition assumes the supervised setting and is less handy when examining unsupervised or semi-supervised methods. It also doesn't apply to many contemporary approaches to deep generative modeling. For example, we now have implicit generative models such as Generative Adversarial Networks (GANs), which are sampling-based and don't even explicitly model the probability density p(x) (instead, they learn a divergence measure via the discriminator network). We still call them "generative models" because they are used to generate (high-dimensional [10]) samples.

A broader and more fundamental definition [2] seems equally fitting for this general question:

  • Discriminative models learn the boundary between classes.
    • So they can discriminate between different kinds of data instances.
  • Generative models learn the distribution of data.
    • So they can generate new data instances.
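Here is a rough sketch of that broader view in one dimension. Everything in it is an assumption chosen for illustration: the synthetic data, the Gaussian class models, and the simple threshold classifier. The generative part fits a distribution per class and can sample new points from it; the discriminative part only learns where the boundary lies and cannot produce new data.

```python
import random
import statistics

random.seed(0)

# Hypothetical 1-D data: class 0 centered at -2, class 1 centered at +2.
xs0 = [random.gauss(-2.0, 1.0) for _ in range(200)]
xs1 = [random.gauss(2.0, 1.0) for _ in range(200)]

# Generative view: model how the data of each class is distributed
# (here, one Gaussian per class, described by its mean and standard deviation).
class_models = {
    0: (statistics.mean(xs0), statistics.stdev(xs0)),
    1: (statistics.mean(xs1), statistics.stdev(xs1)),
}

def generate(label):
    """Draw a new synthetic x from the fitted Gaussian of the given class."""
    mu, sigma = class_models[label]
    return random.gauss(mu, sigma)

# Discriminative view: learn only a boundary. Here it is the threshold that
# misclassifies the fewest training points (predict class 1 when x > threshold).
def training_errors(threshold):
    return sum(x > threshold for x in xs0) + sum(x <= threshold for x in xs1)

boundary = min(sorted(xs0 + xs1), key=training_errors)

def discriminate(x):
    """Classify x using nothing but the learned boundary."""
    return 1 if x > boundary else 0

print("new sample from class 1:", generate(1))
print("learned boundary:", boundary)
print("labels for x=3.0 and x=-3.0:", discriminate(3.0), discriminate(-3.0))
```

The threshold classifier knows nothing about what the data looks like on either side of the boundary, which is exactly why it cannot be used to generate samples.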

Image source: http://primo.ai/index.php?title=Discriminative_vs._Generative


A closer look

Even so, this question implies something of a false dichotomy [3]. The generative-discriminative "dichotomy" is in fact a spectrum, along which you can even smoothly interpolate [4].

As a consequence, the distinction can become arbitrary and confusing, especially since many popular models do not fall neatly into one category or the other [5,6], or are in fact hybrid models (combinations of classically "discriminative" and "generative" models).

Nevertheless, it's still a highly useful and common distinction to make. We can list some clear-cut examples of generative and discriminative models, both canonical and recent:

  • Generative: Naive Bayes, latent Dirichlet allocation (LDA), Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), normalizing flows.
  • Discriminative: Support vector machines (SVMs), logistic regression, most deep neural networks.

There is also a lot of interesting work deeply examining the generative-discriminative divide [7] and spectrum [4,8], and even transforming discriminative models into generative models [9].

In the end, definitions are constantly evolving, especially in this rapidly growing field :) It's best to take them with a pinch of salt, and maybe even redefine them for yourself and others.


Sources

  1. Possibly originating from "Machine Learning - Discriminative and Generative" (Tony Jebara, 2004).
  2. Crash Course in Machine Learning by Google
  3. The Generative-Discriminative Fallacy
  4. "Principled Hybrids of Generative and Discriminative Models" (Lasserre et al., 2006)
  5. @shimao's question
  6. Binu Jasim's answer
  7. Comparing logistic regression and naive Bayes:
  8. https://www.microsoft.com/en-us/research/wp-content/uploads/2016/04/DengJaitly2015-ch1-2.pdf
  9. "Your classifier is secretly an energy-based model" (Grathwohl et al., 2019)
  10. Stanford CS236 notes: Technically, a probabilistic discriminative model is also a generative model of the labels conditioned on the data. However, the term generative models is typically reserved for high dimensional data.