So far, we have assumed that datasets typically follow the normal probability distribution. Recall the marketing campaign dataset. In this video, let’s revisit that dataset and check whether the age data follows a normal distribution.
Let’s hear from Thomas on how you can use Microsoft Excel to calculate the probabilities associated with a normal distribution.
You can use the dataset provided below.
In the video, you learnt that the ‘NORM.DIST’ function in Microsoft Excel can be used to calculate the cumulative probability up to any point in a normal distribution, given its mean and standard deviation and with the cumulative argument set to TRUE. Similarly, the ‘NORM.S.DIST’ function can be used to calculate the cumulative probability up to any point in a standard normal distribution, which has a mean of 0 and a standard deviation of 1. Both functions give the same result once you have converted the point on the normal distribution into its equivalent z-score, that is, the point minus the mean, divided by the standard deviation.
It is often easier to convert the relevant point in a normal distribution into its equivalent z-score and perform the necessary calculations for the standard normal distribution.
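To make the equivalence between the two Excel functions concrete, here is a minimal Python sketch that mirrors the calculation using the standard library's `statistics.NormalDist`. The helper names (`norm_dist_cdf`, `norm_s_dist_cdf`, `z_score`) and the mean, standard deviation, and age values are hypothetical, chosen only for illustration; they are not taken from the marketing campaign dataset.

```python
from statistics import NormalDist


def norm_dist_cdf(x, mean, std_dev):
    """Cumulative probability P(X <= x) for a normal distribution.

    Analogous to Excel's NORM.DIST(x, mean, std_dev, TRUE).
    """
    return NormalDist(mu=mean, sigma=std_dev).cdf(x)


def norm_s_dist_cdf(z):
    """Cumulative probability P(Z <= z) for the standard normal distribution.

    Analogous to Excel's NORM.S.DIST(z, TRUE).
    """
    return NormalDist().cdf(z)  # defaults to mean 0, standard deviation 1


def z_score(x, mean, std_dev):
    """Convert a point x into its equivalent z-score."""
    return (x - mean) / std_dev


# Hypothetical values for illustration only (not from the course dataset).
mean_age = 40.0
std_age = 10.0
age = 50.0

z = z_score(age, mean_age, std_age)
p_direct = norm_dist_cdf(age, mean_age, std_age)  # normal distribution route
p_via_z = norm_s_dist_cdf(z)                      # standard normal route

print(f"z-score: {z}")
print(f"P(age <= {age}) directly:    {p_direct:.4f}")
print(f"P(age <= {age}) via z-score: {p_via_z:.4f}")
```

Both routes should print the same probability up to floating-point rounding, which is why working with the z-score and the standard normal distribution is often the more convenient option.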
To summarise, in this segment, you learnt that, as a rule of thumb, the normal distribution can be assumed for datasets with a sufficient number of data points, typically 30 or more.