In the previous section, we introduced Bayesian inference and saw how conditional probabilities, together with the law of total probability, can be used in a predictive context. We saw how Maximum Likelihood (MLE) and Maximum A Posteriori (MAP) estimation can be used to calculate the posterior probability by learning some unknown parameter $\theta$ from the available data. In this lesson, we shall look at using this Bayesian setting in a Gaussian context, i.e. when the underlying random variables are normally distributed.
You will be able to:
- Understand and describe how MLE works with normal distributions
- Calculate the MLE estimates for the expected mean and variance
Given some parameterized distribution $P(X \mid \theta)$, the likelihood of observing the data $x_1, x_2, \dots, x_n$ is

$$L(\theta) = \prod_{i=1}^{n} P(x_i \mid \theta)$$
We maximize this function with respect to $\theta$ to identify the maximum, and take its log to simplify the likelihood equation, as shown below:

$$\log L(\theta) = \sum_{i=1}^{n} \log P(x_i \mid \theta)$$
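To see why maximizing the log-likelihood gives the same answer as maximizing the likelihood itself, here is a minimal sketch using a Bernoulli (coin-toss) likelihood, matching the coin-toss setting used so far. The data values and the grid of candidate $\theta$ values are made up for illustration:

```python
import math

# Hypothetical data: 10 coin tosses, 7 heads (1) and 3 tails (0)
data = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]

def likelihood(theta, data):
    # Product of P(x_i | theta): theta for heads, (1 - theta) for tails
    result = 1.0
    for x in data:
        result *= theta if x == 1 else (1 - theta)
    return result

def log_likelihood(theta, data):
    # Sum of log P(x_i | theta): the product becomes a sum
    return sum(math.log(theta) if x == 1 else math.log(1 - theta) for x in data)

# Search a grid of candidate theta values between 0 and 1
thetas = [i / 100 for i in range(1, 100)]
best_plain = max(thetas, key=lambda t: likelihood(t, data))
best_log = max(thetas, key=lambda t: log_likelihood(t, data))
print(best_plain, best_log)
```

Both searches land on the same maximizer (here 0.7, the sample proportion of heads), because the log is monotonic and does not move the maximum; it only turns an unwieldy product into a sum.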
Here we explicitly write the likelihood as a function of the parameter $\theta$, conditioning the observed data on it.
So far we have been looking at coin toss experiments and working with binomial distributions to build our understanding. Let's take this a bit further and work with Gaussian/normal distributions. Now consider the same idea as above, but with a normal distribution. The parameters used to describe a normal distribution are $\mu$ and $\sigma^2$, where
$$x_1, x_2, \dots, x_n \sim N(\mu, \sigma^2)$$

A normal distribution is typically written using this notation.
So just like above, we can set up a likelihood equation using the normal density:

$$P(x_i \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)$$
As long as the observations $x_1, \dots, x_n$ are independent, the joint probability is the product of the individual densities.
And our likelihood function becomes:

$$L(\mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)$$
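This likelihood can be evaluated numerically as a product of Gaussian densities. A minimal pure-Python sketch (the data values here are made up for illustration):

```python
import math

def normal_pdf(x, mu, sigma2):
    # Density of N(mu, sigma^2) evaluated at x
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def gaussian_likelihood(data, mu, sigma2):
    # L(mu, sigma^2): product of the density at each observation
    result = 1.0
    for x in data:
        result *= normal_pdf(x, mu, sigma2)
    return result

# Made-up sample clustered around 5.0
data = [4.8, 5.1, 5.0, 4.9, 5.2]

# The likelihood is larger near the sample's center than far from it:
print(gaussian_likelihood(data, 5.0, 0.1) > gaussian_likelihood(data, 3.0, 0.1))
```

Note that multiplying many small densities quickly underflows for large samples, which is another practical reason the log-likelihood is preferred.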
After taking the log-likelihood and dropping constant terms, we get the following equation:

$$\log L(\mu, \sigma^2) = -\frac{n}{2}\log\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$$

Setting the derivative with respect to $\mu$ to zero gives the MLE for the mean:

$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Note: the MLE for the mean $\mu$ is simply the sample average of the observations.
We can similarly calculate the MLE for the variance, following the same steps, to get:

$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \hat{\mu})^2$$
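These two estimates can be checked numerically by drawing a sample from a known normal distribution and recovering its parameters. A minimal sketch; the true parameters ($\mu = 10$, $\sigma^2 = 4$), the seed, and the sample size are arbitrary choices for illustration:

```python
import random

# Draw 10,000 samples from N(mu=10, sigma^2=4); random.gauss takes the
# standard deviation (2), not the variance
random.seed(42)
data = [random.gauss(10, 2) for _ in range(10_000)]

n = len(data)
mu_hat = sum(data) / n                                  # MLE: sample mean
sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / n   # MLE: divide by n

print(mu_hat, sigma2_hat)  # close to 10 and 4 respectively
```

Note that the MLE for the variance divides by $n$, not $n-1$; it is slightly biased downward, unlike the usual "unbiased" sample variance.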
Note: *A detailed derivation of the final MLE equations above, with proofs, can be seen at this website.*
Next, we shall implement this in Python to get a better understanding of how such estimations work.
In analytics, much real-world data is modeled with a normal distribution (think of the central limit theorem and why this is the case). It is imperative that you develop an intuition for how these distributions are represented and inferred in analysis. A good mathematical understanding of the processes behind data manipulation goes a long way. Let's see how we can translate this understanding into Python in the following lab.