Statistics For Artificial Intelligence And Data Science


Statistics is an essential prerequisite to Machine Learning and Data Science. Without a solid grasp of its foundations, solving real-world problems with Machine Learning is very difficult. For anyone interested in Artificial Intelligence, Machine Learning, or Data Science, Statistics is the groundwork for going deeper into these fields. A proper understanding of Statistics makes it much easier to implement regression, classification, and numerous other Machine Learning algorithms, and it is a must for analyzing data as a Data Scientist.


Statistics is a branch of mathematics that deals with collecting, analyzing, interpreting, and presenting data to give insight into what the data actually represents. Statistics can be applied to a wide range of fields, including finance, healthcare, technology, demographics, and business. With proper data, statistics can reveal details and patterns that would otherwise go unnoticed.

Central tendencies

As the name suggests, a measure of Central Tendency gives a single value that describes the central position within a given set of data. It can also be used to find the center of the distribution of the data. Central Tendency can be measured in multiple ways depending on the context. The key measures of Central Tendency are the Mean, Median, and Mode.


The Mean value is often also called the Average. It is a very well-known measure of Central Tendency. The Mean can be used for both discrete and continuous data.

Discrete Data

Discrete data are individual values that can be counted separately. They can be presented graphically with a bar graph.

E.g., Number of readers viewing this article.

Continuous Data

Continuous data are values that are measured rather than counted and can take any value within a range.

E.g., Height of readers viewing this article.

Note that the height of readers viewing this article could range from 3 ft to 6 ft, or even fall outside that range. It is therefore a continuous range of data. A continuous dataset contains measurable values, i.e., values on a scale of measurement such as temperature, length, height, or width, which don’t have to be positive integers and can be decimals or fractions.

Arithmetic Mean

It is calculated by adding all the values of a dataset and dividing the sum by the total number of values in the set.

E.g., For a dataset of values, {100,200,300,400,500}

Here, the total number of values in the dataset is: 5


Arithmetic Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $x_i$ are the values in the dataset and $n$ is the total number of values in the dataset

$\bar{x} = (100+200+300+400+500)/5$

$= 300$
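The calculation above can be reproduced in a few lines of Python. This is only a minimal sketch: the function name `arithmetic_mean` is a hypothetical name chosen for illustration, and the dataset is the one from the example.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        # The mean of an empty dataset is undefined.
        raise ValueError("dataset must contain at least one value")
    return sum(values) / len(values)


data = [100, 200, 300, 400, 500]
print(arithmetic_mean(data))  # 300.0
```

Python's standard library also provides `statistics.mean`, which could be used in place of the hand-written function.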


The median value separates a given dataset into two parts: the lower half and the higher half. For a dataset with an odd number of values, the median is the value from the dataset that sits in the middle.

E.g., For a dataset of values, {1,2,3,6,7,8,9}

Out of the 7 values, the 4th value, i.e., 6, divides the dataset into two equal halves: 3 lower values and 3 higher values.

For an unordered dataset, the data must first be sorted in ascending order before the mid-value/median is calculated.

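For a dataset with an odd number of values (n), the median is calculated by:

$\text{Median} = x_{(n+1)/2}$

For a dataset with an even number of values (n), the median is calculated by:

$\text{Median} = \frac{x_{n/2} + x_{n/2+1}}{2}$

Here, $x_k$ is the k-th value of the dataset once it is sorted in ascending order.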
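The odd and even cases can be sketched in Python as follows. The function name `median_of` is a hypothetical name used for illustration. Note that the formulas count positions from 1, while Python lists are indexed from 0, so the indices are shifted by one.

```python
def median_of(values):
    """Return the middle value of the dataset after sorting it."""
    if not values:
        raise ValueError("dataset must contain at least one value")
    ordered = sorted(values)  # the data must be in ascending order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: 1-based position (n + 1) / 2 is 0-based index n // 2.
        return ordered[mid]
    # Even n: average of 1-based positions n / 2 and n / 2 + 1.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([9, 1, 7, 3, 8, 2, 6]))  # 6
print(median_of([4, 1, 3, 2]))           # 2.5
```

The first call uses the example dataset in shuffled order to show that sorting happens inside the function.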


The mode is the value that appears most frequently in a dataset. The mode can be the same as the Mean or Median, but it doesn’t have to be.

E.g., For a given dataset of {5,6,7,7,8,9}

Mode = 7, because 7 is the most frequently appearing value in the dataset
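Finding the mode amounts to counting how often each value occurs. The sketch below uses a hypothetical helper named `modes`. It returns a list because a dataset can have more than one most frequent value, a case the example above does not cover.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)  # maps each value to its number of occurrences
    if not counts:
        return []
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([5, 6, 7, 7, 8, 9]))  # [7]
print(modes([1, 1, 2, 2, 3]))     # [1, 2]
```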


Dispersion is the degree to which the data in a given dataset is scattered, stretched, or squeezed. Dispersion can be measured in a variety of ways.


Variance is the average of the squared deviations of the values in a dataset from the mean of the dataset. It is the square of the Standard Deviation (SD) and is denoted by Var(X) or σ².
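As a rough illustration, the variance can be computed directly from this definition. The sketch assumes the population variance, which divides by the number of values n. The text does not say which convention is intended, and the sample variance divides by n - 1 instead. The function name `population_variance` is hypothetical.

```python
import math


def population_variance(values):
    """Average of the squared deviations from the mean (divides by n)."""
    n = len(values)
    if n == 0:
        raise ValueError("dataset must contain at least one value")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


data = [100, 200, 300, 400, 500]
variance = population_variance(data)
standard_deviation = math.sqrt(variance)  # SD is the square root of variance

print(variance)                      # 20000.0
print(round(standard_deviation, 2))  # 141.42
```

Python's `statistics.pvariance` and `statistics.variance` implement the population and sample versions respectively.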
