Depending on which algorithm you use, you end up with a different way of estimating the density within each class.
In Linear Discriminant Analysis (LDA), we assume that the density within each class is a Gaussian distribution.
In LDA we further assume that the Gaussian distributions for different classes share the same covariance matrix. In Quadratic Discriminant Analysis (QDA) we don't impose this constraint, and each class has its own covariance matrix. You will see the difference later.
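To make the contrast concrete, here is a minimal sketch comparing the two in scikit-learn; the data are simulated for illustration and are not part of the text:

```python
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)

rng = np.random.default_rng(0)
# Two simulated Gaussian classes with *different* covariance matrices.
X0 = rng.multivariate_normal([0, 0], [[1.0, 0.0], [0.0, 1.0]], size=200)
X1 = rng.multivariate_normal([2, 2], [[1.0, 0.8], [0.8, 1.0]], size=200)
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)

# LDA pools a single covariance estimate across classes,
# which yields a linear decision boundary.
lda = LinearDiscriminantAnalysis().fit(X, y)
# QDA estimates one covariance matrix per class,
# which yields a quadratic decision boundary.
qda = QuadraticDiscriminantAnalysis().fit(X, y)

print("LDA training accuracy:", lda.score(X, y))
print("QDA training accuracy:", qda.score(X, y))
```

Because the two classes here truly have different covariances, QDA's extra flexibility can pay off; when the shared-covariance assumption holds, LDA's fewer parameters make it the more stable choice.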
You can also use general nonparametric density estimates, for instance kernel density estimates and histograms.
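As a sketch of the nonparametric route, the snippet below fits a separate kernel density estimate to each class and classifies a new point by comparing the estimated densities (equal priors assumed; the data and bandwidth are illustrative, not from the text):

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(1)
# Simulated one-dimensional samples from two classes.
x_class0 = rng.normal(0.0, 1.0, size=(300, 1))
x_class1 = rng.normal(3.0, 1.5, size=(300, 1))

# Fit one kernel density estimate per class.
kde0 = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(x_class0)
kde1 = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(x_class1)

# Classify a new point by comparing the estimated class densities
# (with equal priors, this is the generative classification rule).
x_new = np.array([[0.5]])
log_dens0 = kde0.score_samples(x_new)[0]  # log of estimated density f_0(x)
log_dens1 = kde1.score_samples(x_new)[0]  # log of estimated density f_1(x)
predicted = 0 if log_dens0 > log_dens1 else 1
print(predicted)
```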
There is a well-known algorithm called Naive Bayes. Its basic assumption is that, given the class label, all the variables are independent. Therefore, to estimate the class density, you can separately estimate the density for each dimension and then multiply the estimates to obtain the joint density. This makes the computation much simpler.
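The factorization can be sketched directly: for one class, estimate a one-dimensional density per dimension and take the product. Here each dimension is fit with a univariate Gaussian on hypothetical simulated data:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
# Simulated two-dimensional data from a single class:
# independent dimensions with means (1, -1) and std devs (1, 2).
X_c = rng.normal(loc=[1.0, -1.0], scale=[1.0, 2.0], size=(500, 2))

# Estimate each dimension's density separately
# (here, by fitting a one-dimensional Gaussian).
mu = X_c.mean(axis=0)
sigma = X_c.std(axis=0)

def class_density(x):
    # Naive Bayes: joint density = product of per-dimension estimates.
    return np.prod(norm.pdf(x, loc=mu, scale=sigma))

print(class_density(np.array([1.0, -1.0])))
```

Note that instead of estimating a full two-dimensional density (which for a Gaussian would include a covariance term between the dimensions), we only estimate two one-dimensional densities, which is where the computational savings come from.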
X may also be discrete rather than continuous. Instead of a density, we then work with a probability mass function. In this case, we would estimate a probability mass function for each dimension and multiply them to obtain the joint probability mass function.
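For the discrete case, the per-dimension estimate is just an empirical frequency table. A minimal sketch, using hypothetical data where each dimension takes values in {0, 1, 2}:

```python
import numpy as np

rng = np.random.default_rng(3)
# Simulated discrete data from one class: 400 samples, 2 dimensions,
# each dimension taking values in {0, 1, 2}.
X_c = rng.integers(0, 3, size=(400, 2))

def fit_pmf(column, n_values=3):
    # Empirical probability mass function for one dimension.
    counts = np.bincount(column, minlength=n_values)
    return counts / counts.sum()

pmfs = [fit_pmf(X_c[:, j]) for j in range(X_c.shape[1])]

def joint_pmf(x):
    # Naive Bayes: joint pmf = product of the per-dimension pmfs.
    return np.prod([pmfs[j][x[j]] for j in range(len(x))])

p = joint_pmf(np.array([0, 2]))
print(p)  # roughly (1/3) * (1/3) here, since the sampling is uniform
```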
You can see that we have swept through several prominent methods for classification. You should also see that they all fall under the Generative Modeling idea: the only essential difference among them is in how you estimate the density within each class.