Importance Of Probability In Machine Learning And Data Science

Ojash Shrestha
May 17, 2021

Among the many fields and branches of mathematics, probability plays a significant role in both Artificial Intelligence and Data Science. Today, we’ll cover the basics of what probabilities really are, along with the theorems and real-world examples that show where these tools are used and how.

Read the previous article Statistics For Artificial Intelligence And Data Science to understand the foundation of Statistics that is used by Machine Learning Engineers and Data Scientists.

Photo by Josh Appel on Unsplash

Probabilities

Probability can be defined as the likelihood of something occurring. Every time we need to explain the chance of some outcome or event occurring, we talk in terms of probability.

What is the chance that a head or a tail occurs when we flip a coin? This is probability.

What is the chance that 2 shows up when we roll a die? This can be explained by probability.

When every outcome is equally likely, the probability of an event is calculated as follows:

Probability of Event = number of ways it can happen / Total number of outcomes

For a coin having two sides, the probability that head shows up would be,

Probability of Head = number of ways it can happen / Total number of outcomes

There are two possible outcomes, head (H) and tail (T), and each of them can happen in exactly one way.

Thus,

Example using Coin,

Probability of Head, i.e. P(H) = 1/2

Probability of Tail, i.e. P(T) = 1/2

Example using Dice,

Similarly, a die has 6 sides, numbered 1, 2, 3, 4, 5, and 6, so each side is one of six equally likely outcomes.

Probability of rolling a 2, i.e. P(2) = 1/6

Probability of rolling a 1, i.e. P(1) = 1/6 ≈ 16.7%
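The coin and die calculations above can be sketched in a few lines of Python. This is a minimal illustration that assumes every outcome is equally likely; the function name `classical_probability` is made up for this example.

```python
from fractions import Fraction

def classical_probability(favorable, sample_space):
    """P(event) = number of ways it can happen / total number of outcomes.

    Assumes every outcome in sample_space is equally likely.
    """
    favorable = set(favorable)
    sample_space = set(sample_space)
    if not favorable <= sample_space:
        raise ValueError("favorable outcomes must belong to the sample space")
    return Fraction(len(favorable), len(sample_space))

coin = {"H", "T"}
die = {1, 2, 3, 4, 5, 6}

p_head = classical_probability({"H"}, coin)  # 1/2
p_two = classical_probability({2}, die)      # 1/6

print("P(H) =", p_head)
print("P(2) =", p_two, "which is about", round(float(p_two) * 100, 1), "%")
```

Using `Fraction` keeps the results exact, so 1/6 stays 1/6 instead of becoming a rounded decimal.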

Understanding Likelihood

In statistics, likelihood is not the same as probability, although the two words are used as synonyms in everyday speech. To a statistician, treating them as interchangeable is simply wrong. Probability measures the chance of a specific event or outcome occurring under a fixed distribution. Likelihood runs the other way: the observed outcome is held fixed, and likelihood measures how well a candidate distribution (or a choice of its parameters) explains that outcome. Choosing the distribution under which the observed data are most probable is what it means to maximize the likelihood.
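One way to make the distinction concrete is to hold some observed data fixed and score different candidate distributions against it. The sketch below assumes a made-up observation of 7 heads in 10 coin flips and a binomial model; the data and the function name `likelihood` are illustrative, not from the article.

```python
from math import comb

# Hypothetical observation: 7 heads in 10 flips (assumed for illustration).
n_flips = 10
n_heads = 7

def likelihood(p_head):
    """Likelihood of a coin with P(head) = p_head, given the fixed observation.

    Numerically this is the binomial probability of the observed data,
    but it is read as a function of the parameter p_head, not of the data.
    """
    return comb(n_flips, n_heads) * p_head**n_heads * (1 - p_head)**(n_flips - n_heads)

# Score two candidate coins against the same data.
print("L(p=0.5) =", likelihood(0.5))
print("L(p=0.7) =", likelihood(0.7))

# Search a coarse grid for the candidate that explains the data best.
candidates = [i / 100 for i in range(1, 100)]
best_p = max(candidates, key=likelihood)
print("Best candidate on the grid:", best_p)
```

The data never change here; only the candidate distribution does. The candidate with the highest score is the one under which the observed flips are most probable.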

Probabilities in real life

There are different ways to calculate the probability for the same problem.

Theoretical Probability

Theoretical probability is calculated by reasoning about the possible outcomes rather than by observing them. For a fair coin or die, it gives the exact value that repeated experiments are expected to approach.

Experimental Probability

Experimental probability is calculated by repeating an experiment many times and observing the results: the number of times the event occurred divided by the number of trials. This is a different approach from theoretical probability. Anyone can perform the experiment and calculate the probability.

Law of Large Numbers

The Law of Large Numbers describes how, when an experiment is repeated many times, the results tend to get closer to the expected value. The average of the outcomes obtained from a large number of trials moves closer to the expected value as the number of trials increases. The larger the number of trials, the more accurate the experimental probability tends to be.
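A small simulation can show experimental probability approaching the theoretical value as the number of trials grows. This is a sketch that assumes a fair six-sided die simulated with Python's pseudo-random generator; the seed and the trial counts are arbitrary choices.

```python
import random

def experimental_probability(target, n_rolls, rng):
    """Estimate P(die shows target) as hits / n_rolls from simulated rolls."""
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == target)
    return hits / n_rolls

rng = random.Random(42)  # fixed seed so the run is repeatable
theoretical = 1 / 6

for n_rolls in (10, 100, 1_000, 10_000, 100_000):
    estimate = experimental_probability(2, n_rolls, rng)
    error = abs(estimate - theoretical)
    print(f"{n_rolls:>7} rolls: estimate = {estimate:.4f}, error = {error:.4f}")
```

The error does not shrink on every single step, but it tends to get smaller as the number of rolls increases, which is what the Law of Large Numbers describes.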

Conditional Probability

Conditional probability is the probability of an event occurring given that one or more other events have occurred. It is written P(A | B), read as "the probability of A given B".

E.g. let us suppose, Event A: You will read this article today.

Event B: You will drink a beverage today.

The conditional probability looks at these two events in relationship with each other: P(A | B) is the probability that you will read this article today given that you drink a beverage today. This is different from the joint probability P(A and B), which is the probability of both events happening, i.e. that you drink a beverage and read this article today.

For another instance, let us suppose, Event A: It will rain today.

Event B: You need to go out today.

The conditional probability P(A | B) would be the probability that it rains today given that you need to go out. Multiplying it by P(B) gives the joint probability of both Event A and Event B happening, i.e. that you need to go outside while it is raining, which is the probability of you needing to carry an umbrella today.
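The difference between "both events happen" and "one event given the other" is easiest to see with numbers. The sketch below switches to a single die roll, since the article's examples carry no numbers: Event A is "the roll is 2" and Event B is "the roll is even". These events are chosen only for illustration.

```python
from fractions import Fraction

die = range(1, 7)  # outcomes 1..6, assumed equally likely

def prob(event):
    """Probability that a predicate holds for one roll of a fair die."""
    return Fraction(sum(1 for outcome in die if event(outcome)), len(die))

def is_two(outcome):
    return outcome == 2

def is_even(outcome):
    return outcome % 2 == 0

p_b = prob(is_even)                                # P(B) = 1/2
p_joint = prob(lambda o: is_two(o) and is_even(o)) # P(A and B) = 1/6
p_a_given_b = p_joint / p_b                        # P(A | B) = P(A and B) / P(B)

print("P(A and B) =", p_joint)
print("P(A | B)   =", p_a_given_b)
```

Knowing that the roll is even rules out three of the six outcomes, so the chance of a 2 rises from 1/6 to 1/3.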

Independent Event

An independent event is an event whose occurrence has no relationship with the occurrence of any other event, i.e. its occurrence doesn’t affect the probability of any other event happening.

E.g. rolling a die is an independent event. Whether the die shows 5, 6, or any other value, a prior roll of 5 has nothing to do with the next roll of the die.
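Independence can be checked by listing every pair of outcomes for two rolls and comparing P(second roll is 5) with P(second roll is 5 | first roll is 5). This is a minimal sketch assuming a fair die.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first roll, second roll) pairs.
two_rolls = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability that a predicate holds for a pair of fair die rolls."""
    return Fraction(sum(1 for rolls in two_rolls if event(rolls)), len(two_rolls))

p_first_5 = prob(lambda rolls: rolls[0] == 5)
p_second_5 = prob(lambda rolls: rolls[1] == 5)
p_both_5 = prob(lambda rolls: rolls == (5, 5))

# Conditional probability of a 5 on the second roll, given a 5 on the first.
p_second_given_first = p_both_5 / p_first_5

print("P(second is 5)                =", p_second_5)
print("P(second is 5 | first is 5)   =", p_second_given_first)
print("P(both are 5) = P(5) * P(5)?  ", p_both_5 == p_first_5 * p_second_5)
```

Both probabilities come out the same, so knowing the first roll tells us nothing about the second.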

Dependent Event

Dependent events are events where the probability of one occurring depends upon whether the other has occurred. Thus, we call them dependent.

E.g. there are 100 M&M’s in a jar, a mixture of 8 colors. When you take out one red M&M and do not put it back, this changes the probability of drawing each color from the jar on the next pick. The next outcome is dependent upon the prior one.
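The jar example can be worked through with numbers. The article does not say how many M&M's there are of each color, so the sketch below assumes 20 of the 100 are red purely for illustration.

```python
from fractions import Fraction

total = 100  # from the example: 100 M&M's in the jar
red = 20     # hypothetical count; the article gives no per-color numbers

# First draw: nothing has been removed yet.
p_red_first = Fraction(red, total)

# Second draw, without putting the first M&M back.
p_red_second_given_red = Fraction(red - 1, total - 1)  # one red fewer, one M&M fewer
p_red_second_given_not_red = Fraction(red, total - 1)  # same reds, one M&M fewer

print("P(red on 1st draw)               =", p_red_first)
print("P(red on 2nd | red on 1st)       =", p_red_second_given_red)
print("P(red on 2nd | not red on 1st)   =", p_red_second_given_not_red)
```

The probability of red on the second draw changes with what came out first, which is exactly what makes the two draws dependent.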

Distributions

A probability distribution can be defined as a function that gives the probability of every possible value that a random variable can take within a given range for a particular experiment.

Continuous Distribution

A continuous distribution describes the probabilities of all the values within a given range in a particular experiment. Only a range of values has a non-zero probability: the probability of a continuous random variable equaling any single exact value is always 0. The probability of a range is represented by the area under the curve over that range.
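The "area under the curve" idea can be tried out with Python's standard library. The article does not name a particular continuous distribution, so this sketch assumes a standard normal distribution (mean 0, standard deviation 1) as a stand-in.

```python
from statistics import NormalDist

# Assumption: a standard normal distribution stands in for
# "a continuous distribution"; any other continuous one would do.
dist = NormalDist(mu=0.0, sigma=1.0)

def prob_between(a, b):
    """P(a <= X <= b): the area under the density curve between a and b."""
    return dist.cdf(b) - dist.cdf(a)

p_range = prob_between(-1.0, 1.0)  # a range of values: non-zero area
p_point = prob_between(0.5, 0.5)   # a single exact value: zero-width, zero area

print(f"P(-1 <= X <= 1) = {p_range:.4f}")
print(f"P(X = 0.5)      = {p_point}")
```

A single point has no width, so the area over it is zero, while any interval with some width picks up a positive area.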

Discrete Distribution

A discrete distribution gives the probability of occurrence of every value of a discrete random variable. In a discrete probability distribution, every possible value of the discrete random variable has a non-zero probability, and these probabilities add up to 1. Hence, a discrete probability distribution is mostly represented in tabular form.
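As an illustration of the tabular form, the sketch below builds the distribution of the sum of two fair dice. The choice of experiment is an assumption made for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
rolls = list(product(range(1, 7), repeat=2))

# Count how many outcomes produce each possible sum (2 through 12).
counts = Counter(first + second for first, second in rolls)

# The distribution maps each possible sum to its probability.
distribution = {
    total: Fraction(ways, len(rolls)) for total, ways in sorted(counts.items())
}

print("  sum | prob  | decimal")
for total, p in distribution.items():
    print(f"{total:>5} | {str(p):>5} | {float(p):.3f}")

print("Probabilities add up to:", sum(distribution.values()))
```

Every possible sum appears in the table with a non-zero probability, and the column of probabilities adds up to exactly 1.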

Bayes’ Theorem

Bayes’ Theorem gives a method for finding a conditional probability: P(A | B) = P(B | A) × P(A) / P(B). The theorem is named after the 18th-century British mathematician Thomas Bayes. Recall that conditional probability is the probability of an event occurring given that another event has occurred. This formula is widely used in Machine Learning for modeling hypotheses, classification, and optimization.
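A short worked example shows how the theorem turns P(B | A) into P(A | B). All the numbers below are invented for illustration: A is "it rains today", B is "the morning is cloudy", with an assumed P(rain) of 1/5, P(cloudy | rain) of 9/10, and P(cloudy | no rain) of 3/10.

```python
from fractions import Fraction

def bayes(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A | B) using Bayes' theorem.

    P(B) is expanded with the law of total probability:
    P(B) = P(B | A) * P(A) + P(B | not A) * P(not A)
    """
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1 - prior_a)
    return p_b_given_a * prior_a / p_b

# Hypothetical inputs, not taken from the article.
p_rain = Fraction(1, 5)
p_cloudy_given_rain = Fraction(9, 10)
p_cloudy_given_no_rain = Fraction(3, 10)

p_rain_given_cloudy = bayes(p_rain, p_cloudy_given_rain, p_cloudy_given_no_rain)

print("P(rain)          =", p_rain)
print("P(rain | cloudy) =", p_rain_given_cloudy)
```

Under these assumed numbers, seeing a cloudy morning raises the probability of rain from 1/5 to 3/7. The same update from a prior belief to a revised one is the step that classifiers such as Naive Bayes rely on.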

To Read the Full Article, Check it out at: Importance Of Probability In Machine Learning And Data Science
