
What is Bayesian Thinking? Introduction and Theorem

By Karan Kurani

Updated on Nov 30, 2022 | 8 min read | 6.6k views


A statistical theorem formulated by the English statistician and philosopher Thomas Bayes in the 1700s continues to be a guiding light for scientists and analysts across the world. Today, Bayesian thinking finds application in medicine, science, technology, and several other disciplines, and continues to strongly influence our worldview and the actions that follow from it.

Thomas Bayes’ idea was strikingly simple. According to Bayes, the probability of a hypothesis being true depends on two conditions: how reasonable it is based on what we already know (the prior knowledge) and how well it fits new evidence. Thus, Bayesian thinking differs from traditional hypothesis testing in that the former incorporates prior knowledge before drawing conclusions.

With the preliminary introduction in mind, let us dive into a bit more detail about Bayesian statistics.

Bayesian Statistics

In simple terms, Bayesian statistics applies probability to statistical problems in order to update prior beliefs in light of new data. Here, probability expresses a degree of belief in a specific event.

The degree of belief may be based on prior knowledge about the event, such as personal assumptions or the results of previous experiments. Bayesian statistics uses Bayes’ Theorem to compute probabilities. Bayes’ Theorem, in turn, describes the conditional probability of an event based on new evidence and prior information related to the event.

With that in mind, let us brush up on the fundamental concept of conditional probability before we understand Bayes’ Theorem in depth.

Conditional Probability

Conditional probability can be defined as the likelihood of an event or outcome, given that a previous event or outcome has occurred. Multiplying the probability of the prior event by the conditional probability of the subsequent event gives the joint probability of both events occurring.

Let’s take a look at an example to understand the concept better.

  • Event A is that a family planning an outing will go on a picnic. There is an 80% chance that the family will go on the picnic.
  • Event B is that it will rain on the day the family goes out on a picnic. The weather forecast says that there is a 60% chance of precipitation on the picnic day.
  • Hence, the probability (P) that the family goes on the picnic and it rains is calculated as follows:

P (Picnic and rain) = P (Rain | Picnic) P (Picnic) = (0.60) * (0.80) = 0.48

In the above example, conditional probability looks at the two events A and B in relationship with one another, that is, the probability that the family goes on the picnic and that it also rains on the same day.

Hence, conditional probability differs from unconditional probability because the latter refers to the likelihood of occurrence of an event regardless of whether any other event or events have taken place or any other conditions are present.
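To make the picnic arithmetic concrete, here is a minimal Python sketch; the variable names are ours for illustration, not from the example itself.

p_picnic = 0.80             # P(Picnic): chance the family goes on the picnic
p_rain_given_picnic = 0.60  # P(Rain | Picnic): chance of rain on picnic day

# Multiplication rule: P(Picnic and Rain) = P(Rain | Picnic) * P(Picnic)
p_picnic_and_rain = p_rain_given_picnic * p_picnic
print(round(p_picnic_and_rain, 2))  # 0.48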

The formula for conditional probability

The formula for conditional probability comes from the probability multiplication rule:

P (A and B) or P (A ∩ B) = P (B given A) or P (B | A) * P (A)

In the above equation, P (A and B) is the joint probability, referring to the likelihood of two or more events occurring simultaneously. It is also written as P (A,B).

Here’s how to deduce the conditional probability equation from the multiplication rule:

Step 1: Write down the multiplication rule.

P (A and B) = P (B | A) * P (A) 

Step 2: Divide both sides of the equation by P (A).

P (A and B) / P (A) = P (B | A) * P (A) / P (A)

Step 3: Cancel P (A) on the right side of the equation.

P (A and B) / P (A) = P (B | A)

Step 4: Rewrite the equation with the conditional probability on the left.

P (B | A) = P (A and B) / P (A)

Thus, the formula for conditional probability is given as:

P (B | A) = P (A and B) / P (A)
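As a minimal sketch, the derived formula translates directly into Python; the helper name conditional_probability is ours for illustration.

def conditional_probability(p_a_and_b, p_a):
    """P(B | A) = P(A and B) / P(A), as derived above."""
    if p_a == 0:
        raise ValueError("P(A) must be non-zero to condition on A.")
    return p_a_and_b / p_a

# Recovering P(Rain | Picnic) from the picnic example's joint probability:
print(round(conditional_probability(0.48, 0.80), 2))  # 0.6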

Bayes’ Theorem

Using Bayes’ Theorem, we can update our beliefs and convictions based on new and relevant pieces of evidence. For instance, if we are trying to figure out the probability of a given person having cancer, we would generally assume it to be the percentage of the population that has cancer. However, if we introduce extra evidence, such as the person in question is a regular smoker, we can update our perception (and hence the probability) since the probability of having cancer is higher if an individual is a smoker. Hence, we utilize both our prior knowledge and the additional evidence to improve our estimations.

The formula for Bayes’ Theorem

P (A | B) = P (B | A) * P (A) / P (B)

The above equation is Bayes’ rule. Now, let us look at the stepwise derivation of the Bayes’ Theorem equation.

Step 1: Consider two events, A and B. A is the event whose probability we want to calculate and B is the additional evidence that is related to A.

Step 2: Write down the relationship between the joint probability and conditional probability of events A and B.

P (A,B) = P (A | B) * P (B)

P (B,A) = P (B | A) * P (A)

Since P (A,B) = P (B,A), the two products on the right-hand sides must be equal.

Step 3: Set the two conditional probability terms equal to each other.

P (A | B) * P(B) = P (B | A) * P(A)

Step 4: Divide both sides of the equation by P (B).

P (A | B) * P(B) / P (B) = P (B | A) * P(A) / P (B)

Step 5: Cancel P (B) on the left side of the equation.

P (A | B) = P (B | A) * P(A) / P (B)

Thus, we get the formula of Bayes’ Theorem as follows:

P (A | B) = P (B | A) * P(A) / P (B)

Understanding the terms in the Bayes’ Theorem equation

P (A | B) = P (B | A) * P(A) / P (B)

  • P (A | B) is called the posterior probability or the probability we are trying to estimate. Based on the previous example, the posterior probability would be the probability of the person having cancer, given that the person is a regular smoker.
  • P (B | A) is called the likelihood, referring to the probability of detecting the additional evidence, given our initial hypothesis. In the above example, the likelihood is the probability of the person being a smoker, given that the person has cancer.
  • P (A) is the prior probability or the probability of our hypothesis without any additional evidence or information. In the above example, the prior probability is the probability of having cancer.
  • P (B) is the marginal likelihood or the total probability of observing the evidence. In the context of the above example, the marginal likelihood is the probability of being a smoker.
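Putting these four terms together, here is a minimal Python sketch of Bayes’ rule; the function and parameter names (likelihood, prior, marginal) are ours for illustration.

def bayes_theorem(likelihood, prior, marginal):
    """Posterior P(A | B) = P(B | A) * P(A) / P(B)."""
    if marginal == 0:
        raise ValueError("The marginal likelihood P(B) must be non-zero.")
    return likelihood * prior / marginal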

A Simple Example To Understand Bayes’ Theorem

Using some hypothetical numbers in the previous example, we will see the effect of applying the Bayes’ Theorem.

Suppose the probability of having cancer is 0.06, that is, 6% of the people have cancer. Now, say that the probability of being a smoker is 0.20 or 20% of people are smokers, and 30% of people with cancer are smokers. So, P (Smoker | Cancer) = 0.30.

Initially, the probability of having cancer is simply 0.06 (the prior). But using the new evidence, we can calculate P (Cancer | Smoker) = P (Smoker | Cancer) * P (Cancer) / P (Smoker) = (0.30 * 0.06) / 0.20 = 0.09.
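Plugging these hypothetical numbers into Python reproduces the result; the arithmetic below mirrors the bayes_theorem sketch shown earlier.

# Hypothetical numbers from the example above.
p_cancer = 0.06               # prior: P(Cancer)
p_smoker = 0.20               # marginal likelihood: P(Smoker)
p_smoker_given_cancer = 0.30  # likelihood: P(Smoker | Cancer)

# Bayes' rule: P(Cancer | Smoker) = P(Smoker | Cancer) * P(Cancer) / P(Smoker)
posterior = p_smoker_given_cancer * p_cancer / p_smoker
print(round(posterior, 2))  # 0.09

The extra evidence raises the estimate from 6% to 9%, which is exactly the updating behaviour Bayes’ Theorem is designed to capture.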


Way Forward: Master the Concepts of Statistics for a Career in Data Science or Machine Learning

upGrad’s higher education EdTech platform has impacted over 500,000 working professionals worldwide with its plethora of courses and immersive learning experiences. With a 40,000+ learner base spread across 85+ countries, upGrad’s industry-relevant courses are guaranteed to advance your career in your field of choice.

The Master of Science in Data Science is an 18-month course imparting key skills in Statistics, Predictive Analysis, Machine Learning, Big Data Analytics, Data Visualization, etc.

Program Highlights:

  • Master’s Degree from Liverpool John Moores University and Executive PGP from IIIT Bangalore
  • 500+ hours of content, 60+ case studies and projects, 20+ live sessions, 14+ programming languages and tools
  • Industry networking, doubt resolution sessions, and learning support

The Advanced Certificate Program in Machine Learning and Deep Learning is a rigorous 6-month course with peer networking opportunities, hands-on projects, industry mentorship, and 360-degree career assistance.

Program Highlights:

  • Prestigious recognition from IIIT Bangalore
  • 240+ hours of content, 5+ case studies and projects, 24+ live sessions, coverage of 12 programming languages, tools, and libraries
  • 1:8 group coaching sessions and 1:1 mentorship sessions with industry experts

Conclusion 


Bayesian thinking underpins several areas of human thinking, inquiry, and belief, even though most of us are unaware of it. From cancer screening and global warming to monetary policy, risk assessment, and insurance, Bayesian thinking is fundamental. Even the famous British mathematician Alan Turing is believed to have employed the Bayesian approach to crack the German Enigma code during the Second World War.

Sign up with upGrad and further your knowledge of key statistical concepts and more!

Frequently Asked Questions (FAQs)

1. How can Bayes’ Theorem be used practically?

2. How many terms are required for building a Bayes model?

3. What is the difference between Bayes’ theorem and conditional probability?

