We'll be training a neural network to sample from the simple 1-D normal distribution $\mathcal{N}(-1,1)$. Let $D$, $G$ be small 3-layer perceptrons, each with a meager 11 hidden units in total.

$G$ takes as input a single sample of a noise distribution: $z \sim \text{uniform}(0,1)$.
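As a rough sketch of this setup, the snippet below draws data samples from $\mathcal{N}(-1,1)$, draws noise samples (here taken to be uniform on $[0, 1)$), and pushes the noise through an untrained 3-layer perceptron. It uses only the Python standard library instead of TensorFlow so that it runs anywhere. The 6 + 5 split of the 11 hidden units, the tanh activations, the random initialization, and the helper names (`sample_data`, `sample_noise`, `init_mlp`, `mlp_forward`) are illustrative assumptions, not details fixed by the text above.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def sample_data(n, mu=-1.0, sigma=1.0):
    """Draw n samples from the target distribution N(mu, sigma^2)."""
    return [random.gauss(mu, sigma) for _ in range(n)]

def sample_noise(n):
    """Draw n noise samples z from uniform [0, 1), the input to G."""
    return [random.random() for _ in range(n)]

def init_mlp(sizes=(1, 6, 5, 1)):
    """Random weights for a 3-layer perceptron.

    The hidden sizes 6 and 5 (11 units in total) are an assumed split.
    """
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        w = [[random.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        layers.append((w, b))
    return layers

def mlp_forward(layers, x):
    """Forward pass for a scalar input: tanh on hidden layers, linear output."""
    h = [x]
    for i, (w, b) in enumerate(layers):
        h = [sum(wij * hj for wij, hj in zip(row, h)) + bi for row, bi in zip(w, b)]
        if i < len(layers) - 1:
            h = [math.tanh(v) for v in h]
    return h[0]

G = init_mlp()                              # untrained generator
x = sample_data(5)                          # real samples
z = sample_noise(5)                         # noise samples
x_fake = [mlp_forward(G, zi) for zi in z]   # generated samples G(z)
```

Before any training, `x_fake` has no reason to resemble `x`; closing that gap is the job of the adversarial training procedure.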

The input to $D_1$ is a single sample of the legitimate data distribution: $x \sim p_{data}$, so when optimizing the discriminator $D$ we want the quantity $D_1(x)$ to be maximized.

$D_2$ takes as input $x^\prime$ (the fake data generated by $G$), so when optimizing $D$ we want $D_2(x^\prime)$ to be minimized.
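These two goals (push $D_1(x)$ up, push $D_2(x^\prime)$ down) are combined in Goodfellow's paper into a single quantity that $D$ maximizes: $\log D_1(x) + \log(1 - D_2(x^\prime))$. Below is a minimal sketch of that quantity, assuming $D$ outputs a probability in $(0, 1)$. The function name `discriminator_objective` and the `eps` clamp are illustrative choices, not part of the paper.

```python
import math

def discriminator_objective(d1_real, d2_fake, eps=1e-12):
    """Batch average of log D1(x) + log(1 - D2(x')), which D tries to maximize.

    d1_real: D's outputs on real samples x, each in (0, 1)
    d2_fake: D's outputs on generated samples x', each in (0, 1)
    eps:     small clamp to avoid log(0) (an assumption for numerical safety)
    """
    assert len(d1_real) == len(d2_fake)
    total = 0.0
    for d1, d2 in zip(d1_real, d2_fake):
        total += math.log(max(d1, eps)) + math.log(max(1.0 - d2, eps))
    return total / len(d1_real)

# A confident, correct discriminator scores higher than an undecided one.
good = discriminator_objective([0.9, 0.8], [0.1, 0.2])
unsure = discriminator_objective([0.5, 0.5], [0.5, 0.5])
```

The objective is always non-positive and approaches 0 only when $D$ assigns probability near 1 to real samples and near 0 to generated ones.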

You might ask: how can George generate samples from $p_{data}$ if he doesn't know $p_{data}$ in the first place?

We can create computationally indistinguishable samples without understanding the "true" underlying generative process [1].

Let $X$, $Y$ be the "observed" and "target" random variables.

Goodfellow's paper proposes a very elegant way to teach neural networks a generative model for any (continuous) probability density function.

This is a tutorial on implementing Ian Goodfellow's Generative Adversarial Nets paper in TensorFlow.

(where "6" corresponds to the categorical class label for "tabby cat").

MNIST LeNet, AlexNet, and other classifiers are examples of discriminative models.

On the other hand, a generative model allows us to evaluate the joint probability $P(X, Y)$.
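To make the distinction concrete: a discriminative classifier outputs only the conditional $P(Y|X)$, whereas the joint $P(X, Y)$ lets us derive both that conditional and the marginal $P(X)$. The toy sketch below uses a made-up joint table; the probabilities, the labels, and the helper names are purely illustrative.

```python
# Hypothetical joint distribution P(X, Y) over two observations and two labels.
joint = {
    ("x0", "y0"): 0.30, ("x0", "y1"): 0.10,
    ("x1", "y0"): 0.15, ("x1", "y1"): 0.45,
}

def marginal_x(joint, x):
    """P(X = x), obtained by summing the joint over all labels."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def conditional_y_given_x(joint, y, x):
    """P(Y = y | X = x) = P(x, y) / P(x): what a discriminative model outputs."""
    return joint[(x, y)] / marginal_x(joint, x)

p_x0 = marginal_x(joint, "x0")
p_y1_given_x1 = conditional_y_given_x(joint, "y1", "x1")
```

Going the other way is not possible: knowing $P(Y|X)$ alone does not determine $P(X)$, and therefore does not determine the joint.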

