Now that you have gone through the mid-session summary, let’s get back to the rest of the session. Earlier, we tried to estimate the mean commute time of 30,000 employees of an office by taking a small sample of 100 employees and finding their mean commute time. This sample’s mean was $\bar{X} = 36.6$ minutes and its standard deviation was S = 10 minutes.
Recall that we also said that the population mean, i.e., the mean daily commute time of all 30,000 employees (μ), equals 36.6 (the sample mean) plus or minus some margin of error.
You can find this margin of error using the CLT (central limit theorem). Now that you know the CLT, let’s see how you can find the margin of error.
To summarise, let’s say that you have a sample with sample size n, mean $\bar{X}$ and standard deviation S. Now, the y% confidence interval (i.e., the confidence interval corresponding to a y% confidence level) for the population mean μ would be given by the range:
Confidence interval = $\left(\bar{X} - \frac{Z^* S}{\sqrt{n}},\ \bar{X} + \frac{Z^* S}{\sqrt{n}}\right)$,
where Z* is the Z-score associated with a y% confidence level. In other words, the population mean and the sample mean differ by a margin of error given by $\frac{Z^* S}{\sqrt{n}}$.
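To make the formula concrete, here is a minimal sketch that applies it to the commute-time sample (n = 100, sample mean 36.6 minutes, S = 10 minutes). It assumes a 95% confidence level, for which Z* is approximately 1.96; the function name `confidence_interval` is only illustrative.

```python
import math

def confidence_interval(sample_mean, sample_std, n, z_star):
    """Return (lower, upper, margin) for the population mean.

    margin of error = Z* * S / sqrt(n)
    """
    margin = z_star * sample_std / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin

# Commute-time example from the text; Z* = 1.96 assumes a 95% confidence level.
lower, upper, margin = confidence_interval(36.6, 10, 100, 1.96)
print(f"Margin of error: {margin:.2f} minutes")
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f}) minutes")
```

Under these assumptions, the margin of error works out to 1.96 minutes, so the interval runs from about 34.64 to 38.56 minutes.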
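Some commonly used Z* values are approximately 1.645 for a 90% confidence level, 1.96 for a 95% confidence level and 2.576 for a 99% confidence level.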
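If you need Z* for a confidence level that is not in a standard table, you can compute it from the standard normal distribution. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `z_star` is an illustrative choice, and it assumes a two-sided interval, so half of the leftover probability sits in each tail.

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Two-sided critical value Z* for a confidence level between 0 and 1."""
    tail = (1 - confidence_level) / 2          # probability in each tail
    return NormalDist().inv_cdf(1 - tail)      # standard normal quantile

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence level: Z* = {z_star(level):.3f}")
```

For 90%, 95% and 99%, this prints values of about 1.645, 1.960 and 2.576, respectively.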
At this point, it is important to address a common misconception. Sampling distributions are just a theoretical exercise; you’re not actually expected to make one in real life. If you want to estimate the population mean, you will just take a sample. You will not create an entire sampling distribution.
You must be wondering why you studied sampling distributions if this is the case. To understand the reason, let’s go through the actual process of sampling. Recall that you are sampling because you want to estimate the population mean, albeit in the form of an interval. The three steps are as follows:
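1. First, take a sample of size n.
2. Then, find the mean $\bar{X}$ and standard deviation S of this sample.
3. Now, you can say that for a y% confidence level, the confidence interval for the population mean μ is given by $\left(\bar{X} - \frac{Z^* S}{\sqrt{n}},\ \bar{X} + \frac{Z^* S}{\sqrt{n}}\right)$.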
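The three steps above can be sketched in code as follows. Note that the population here is synthetic: it is generated only so that there is something to sample from, and it is not the real commute data. The function name `mean_confidence_interval` and the seed are likewise illustrative assumptions.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence_level=0.95):
    """Confidence interval for the population mean from a single sample."""
    n = len(sample)
    x_bar = mean(sample)                 # step 2: sample mean
    s = stdev(sample)                    # step 2: sample standard deviation
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * s / math.sqrt(n)        # step 3: Z* * S / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical population of 30,000 commute times (synthetic, for illustration).
rng = random.Random(0)
population = [rng.gauss(36, 10) for _ in range(30_000)]

sample = rng.sample(population, 100)     # step 1: take one sample of size n
low, high = mean_confidence_interval(sample, 0.95)
print(f"Sample mean: {mean(sample):.2f}")
print(f"95% confidence interval: ({low:.2f}, {high:.2f})")
```

Only one sample is drawn here; no sampling distribution is ever built, which is exactly the point made above.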
However, as you may have seen in the video above, you cannot finish step 3 without the CLT. The CLT lets you assume that the sample mean would be normally distributed, with mean μ and standard deviation $\frac{\sigma}{\sqrt{n}}$ (approx. $\frac{S}{\sqrt{n}}$). Using this assumption, it is possible to find the margin of error, confidence interval, etc.
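If you would like to see this assumption at work, the following simulation is a rough sketch, not something you would do in practice. It builds a sampling distribution from a synthetic, deliberately non-normal (exponential) population and compares its mean and standard deviation with μ and σ/√n. The population, seed and number of repetitions are all assumptions chosen for illustration.

```python
import math
import random
from statistics import mean, pstdev, stdev

rng = random.Random(42)

# Hypothetical skewed population of 30,000 values with mean close to 36.
population = [rng.expovariate(1 / 36) for _ in range(30_000)]
mu = mean(population)
sigma = pstdev(population)               # population standard deviation

n = 100
# Draw many samples of size n and record each sample's mean.
sample_means = [mean(rng.sample(population, n)) for _ in range(2_000)]

print(f"Population mean: {mu:.2f}")
print(f"Mean of sample means: {mean(sample_means):.2f}")
print(f"sigma / sqrt(n): {sigma / math.sqrt(n):.2f}")
print(f"Std dev of sample means: {stdev(sample_means):.2f}")
```

The two pairs of printed numbers should be close to each other, even though the population itself is far from normal.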
Thus, you learnt about sampling distributions so that you could learn more about the CLT and be able to make all the assumptions as stated above.