Some business applications require you to analyse the behaviour of a target audience or draw conclusions about a very large group. However, collecting data from every member of such a group is often practically impossible.
For instance, if you were asked how many people in a country have ice cream at least once a week, it is practically impossible to ask each and every person in that country about their ice cream eating habits.
How do businesses tackle such problems? Let’s find out in the upcoming video.
In this video, you learnt that a sample is a small part of a population. Often the population is too large, making it infeasible or expensive to collect data from the entire population. This is where inferential statistics comes into the picture. Inferential statistics is the subject area that deals with making inferences or deriving insights about a large population from a small sample.
Typically, the summary measures of a sample (mean, variance, etc.) are calculated using the same formulae as those of a population. There is, however, one major difference: for a sample of size n, the formula for the sample variance has the denominator ‘n - 1’, whereas for a population of size N, the population variance has the denominator ‘N’. The following image contains the formulae and notations related to populations and their samples.
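To make the difference between the two denominators concrete, here is a minimal Python sketch. The function names and the eight data values are made up for illustration and are not taken from the course material. Both functions sum the squared deviations from the mean; they differ only in whether that sum is divided by N or by n - 1.

```python
import statistics


def population_variance(values):
    """Variance when `values` is the whole population: divide by N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def sample_variance(values):
    """Variance when `values` is a sample: divide by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


# Illustrative data only (hypothetical values).
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = population_variance(data)   # 32 / 8 = 4.0
samp_var = sample_variance(data)      # 32 / 7, about 4.57

# The standard library offers the same two calculations.
lib_pop_var = statistics.pvariance(data)
lib_samp_var = statistics.variance(data)

print(pop_var, samp_var)
```

For the same numbers, the sample variance is always slightly larger than the population variance, and the gap shrinks as n grows.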
While it is not explicitly mentioned, it is assumed that when we say we are collecting a sample, we are collecting a random sample. Simply put, a random sample is a sample in which every member of the population has an equal chance of being selected, so the selection is free of bias.
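The idea of a random sample can be sketched in Python using the ice cream example from earlier. Everything in this sketch is an assumption made for illustration: the population is simulated, and the population size (10,000), the underlying share of weekly ice cream eaters (30%), the sample size (500) and the seed are arbitrary choices, not real survey figures.

```python
import random

# A seeded generator so that the sketch gives the same result on each run.
rng = random.Random(42)

# Simulated population: True means the person has ice cream at least
# once a week. The 30% rate is a made-up figure for illustration.
population = [rng.random() < 0.3 for _ in range(10_000)]

# In practice this value is unknown; it is computed here only because
# the population is simulated.
true_share = sum(population) / len(population)

# Simple random sample without replacement: every person has the same
# chance of being chosen.
sample = rng.sample(population, 500)

# The sample proportion serves as an estimate of the population proportion.
estimated_share = sum(sample) / len(sample)

print(f"population share: {true_share:.3f}")
print(f"sample estimate:  {estimated_share:.3f}")
```

Asking 500 randomly chosen people instead of all 10,000 gives an estimate that is usually close to the population value, which is the kind of inference this segment describes.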
Even though you have understood the concept of a sample, some questions are still unanswered, which are as follows:
You will find the answers to these questions as you go through the next segment, which covers the central limit theorem.