Bayesian methods are currently very popular and used extensively in many fields. However, many people are still unsure what Bayesian analysis is and why it is needed. In this blog post, I will explain Bayesian analysis in the simplest terms possible, starting with the basics.

Bayes' formula was published in 1763, about two years after its creator, Thomas Bayes, died. However, it did not come into widespread use until the end of the twentieth century, because the required calculations are computationally expensive and only became feasible once computing technology had advanced sufficiently.

## Bayes Theorem

Consider an example. Suppose ‘A’ represents the proposition that there was a snowfall today, and ‘B’ represents the evidence that there is snow on the footpath.

The question, “What is the probability that there was a snowfall today, given that there is snow on the footpath?”, can be written as P(snowfall | snow on footpath). By Bayes' theorem,

P(snowfall | snow on footpath) = P(snow on footpath | snowfall) × P(snowfall) / P(snow on footpath)

To answer the question, let us evaluate the right-hand side. Before looking at the footpath, what was the probability of snowfall, P(snowfall)? You can think of this as our prior assumption about the weather. Next, how likely are we to observe snow on the footpath under that assumption? This is P(snow on footpath | snowfall). In this way, we obtain the final probability of snowfall, given the evidence of snow on the footpath, by updating our original belief about the proposition using the evidence.
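The update above can be sketched numerically. The probabilities below are made-up values chosen purely for illustration:

```python
# A minimal sketch of the snowfall example with assumed probabilities.
p_snowfall = 0.2            # prior: P(snowfall), an assumed value
p_snow_given_fall = 0.9     # likelihood: P(snow on footpath | snowfall)
p_snow_given_nofall = 0.05  # snow on the footpath without snowfall (assumed)

# Total probability of the evidence, P(snow on footpath):
p_snow = (p_snow_given_fall * p_snowfall
          + p_snow_given_nofall * (1 - p_snowfall))

# Bayes' theorem: P(snowfall | snow on footpath)
posterior = p_snow_given_fall * p_snowfall / p_snow
print(round(posterior, 3))  # 0.818
```

Seeing snow on the footpath raises the probability of snowfall from the prior 0.2 to roughly 0.82.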

## Prior Belief Distribution

The prior belief distribution represents how strong our beliefs about some parameters are, based on previous experience.

Even when there is no previous experience, mathematicians have devised methods for constructing prior belief distributions; these are called **uninformative priors**. A common choice for representing prior beliefs about a probability is the **beta distribution**, which has some convenient mathematical properties that make it well suited to modelling beliefs about binomial parameters.
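As a sketch, the beta density can be written directly with the gamma function. The parameter values below are illustrative assumptions, not anything prescribed by the post:

```python
import math

def beta_pdf(p, alpha, beta):
    """Density of the Beta(alpha, beta) distribution at p."""
    norm = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return norm * p ** (alpha - 1) * (1 - p) ** (beta - 1)

# Beta(1, 1) is the uniform (uninformative) prior: density 1 everywhere.
print(beta_pdf(0.3, 1, 1))  # 1.0

# Beta(5, 2) encodes a prior belief that the probability is likely high.
print(beta_pdf(0.7, 5, 2) > beta_pdf(0.2, 5, 2))  # True
```

Roughly speaking, alpha and beta behave like pseudo-counts of prior successes and failures.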

The prior distribution plays a significant role in Bayesian inference. It represents the available information about an uncertain parameter, and combining it with the probability distribution of new data yields the posterior distribution. The axioms of decision theory justify the existence of a prior distribution for any unknown quantity. Setting up a prior distribution involves several key questions, such as what information goes into the prior and what properties the resulting posterior distribution has.

If the choices of prior distribution are reasonable, the parameters are well identified, and the sample size is large, the prior has only a minor effect on posterior inferences. What counts as a large, well-identified sample is not precisely defined, but the dependence on the prior can be checked with a sensitivity analysis, that is, by comparing the posterior inferences obtained under several different choices of prior distribution. The prior distribution becomes more important when the sample size is small or when the available data provide only indirect information about a few parameters of interest.
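A sensitivity analysis of this kind can be sketched with conjugate updating. Assume made-up coin-flip data, k successes in n trials; with a Beta(a, b) prior the posterior is Beta(a + k, b + n - k), whose mean is easy to compare across priors:

```python
k, n = 60, 100  # assumed data: 60 successes in 100 trials

means = []
for a, b in [(1, 1), (2, 2), (10, 10)]:  # three candidate priors
    post_mean = (a + k) / (a + b + n)    # mean of Beta(a + k, b + n - k)
    means.append(post_mean)
    print(f"Beta({a},{b}) prior -> posterior mean {post_mean:.3f}")
```

With this sample size the three posterior means differ by less than 0.02, illustrating why the prior matters less when the data are plentiful.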

## Posterior Belief Distribution

In Bayesian analysis, the posterior belief distribution summarizes the current state of knowledge about all the uncertain quantities, including unobservable parameters and missing or latent data. The posterior density is proportional to the product of the prior density and the likelihood.

## Bayesian Analysis

While analysing real data, we are generally interested in some property of the data, such as its mean or variance.

Let us denote the experimental data as ‘Data’ and the property of interest as θ. This property θ can take several possible values.

We want to estimate this property from what is available to us, that is, from P(θ | Data).

Applying Bayes' formula:

P(θ | Data) = P(Data | θ) × P(θ) / P(Data)

Writing the formula with the denominator expanded as an integral over all possible values of θ:

P(θ | Data) = P(Data | θ) × P(θ) / ∫ P(Data | θ) P(θ) dθ

Here, the posterior is a function of the parameter θ. Maximizing this function gives the most probable value of θ. We can also compute the mean and the variance of the parameter, calculate the upper and lower limits within which it lies with a given probability, and derive many other similar insights.
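These steps can be sketched with a simple grid approximation. The data (7 heads in 10 coin flips) and the uniform prior are assumptions made for illustration:

```python
# Grid-based Bayesian inference for a coin's bias theta.
k, n = 7, 10                                      # assumed data: 7 heads in 10 flips
grid = [i / 1000 for i in range(1001)]            # candidate values of theta
prior = [1.0 for _ in grid]                       # uniform prior
like = [t ** k * (1 - t) ** (n - k) for t in grid]  # binomial likelihood (up to a constant)

unnorm = [p * l for p, l in zip(prior, like)]     # prior x likelihood
z = sum(unnorm)                                   # normalizing constant
post = [u / z for u in unnorm]                    # posterior over the grid

map_theta = grid[post.index(max(post))]           # most probable value of theta
mean_theta = sum(t * p for t, p in zip(grid, post))  # posterior mean
print(map_theta, round(mean_theta, 3))
```

With a uniform prior, the most probable value lands at the observed frequency 7/10 = 0.7, while the posterior mean is pulled slightly toward 0.5.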

P(θ | Data) is called the a posteriori (posterior) probability, P(θ) is called the a priori (prior) probability, and P(Data | θ) is called the likelihood function. The posterior probability is calculated from the prior probability and the likelihood function. The likelihood function is determined by the model: a data-collection model is specified that depends on the parameter of interest.

The information known before the analysis is encoded in the prior probability. In the formula above, the denominator is the integral of the numerator over all possible values of the parameter θ. The denominator is therefore a constant that normalizes the posterior probability, so that the posterior integrates to one.
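This normalization can be checked numerically. The flat prior and the small assumed data set (3 heads, 1 tail) are illustrative only:

```python
# The denominator P(Data) is the sum (integral) of prior x likelihood
# over all theta; dividing by it makes the posterior sum to one.
thetas = [i / 100 for i in range(101)]
prior = [1 / len(thetas)] * len(thetas)               # flat prior
likelihood = [t ** 3 * (1 - t) for t in thetas]       # assumed data: 3 heads, 1 tail
evidence = sum(p * l for p, l in zip(prior, likelihood))  # P(Data)
posterior = [p * l / evidence for p, l in zip(prior, likelihood)]
print(round(sum(posterior), 6))  # 1.0
```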

In this blog post, you learnt the basics of Bayesian analysis, a statistical paradigm used across many fields to answer research questions about unknown parameters by using probability.
