Imagine sampling 2 units, n = 2, from a population of size N = 7. Population size is
denoted by capital N and sample size by lowercase n. Table ?? contains the unit labels,
numbered 1 to 7, and their associated values. The population mean is
μy = 57.00. Table ?? lists all possible samples of size n = 2 from a simple
random sample with replacement (SRSWR). In a SRSWR, each unit has a probability of
1 - ((N - 1)/N)^n = 1 - (6/7)^2 = 13/49 of being in the sample. Each possible sample in Table ?? is
equally likely with probability 1/49, because there are 49 possible samples. Table ??
illustrates SRSWR with the sampling frame being all units in the target population, or
simply the population. Often when sampling, some units in the target population are not
in the sampling frame and thus have a zero probability of being included in the
sample. Imagine that a six-sided die were used to select the units to be sampled.
Then the units labeled 1 to 6 would each have a probability of 1 - (5/6)^2 = 11/36 of being in the
sample, and unit number 7 would have a zero probability. Table ?? illustrates the latter
situation, with the sampling frame being the units labeled 1 to 6 within the
population.
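
The SRSWR counts and probabilities above can be checked by brute-force enumeration. The following is a minimal sketch, assuming samples are recorded as ordered pairs of unit labels; the function name `srswr_inclusion` is hypothetical and not part of the original text.

```python
from fractions import Fraction
from itertools import product

def srswr_inclusion(frame, n, unit):
    """Return (number of ordered samples, P(unit appears at least once))."""
    samples = list(product(frame, repeat=n))   # every ordered sample of size n
    p_sample = Fraction(1, len(samples))       # each sample is equally likely
    hits = sum(1 for s in samples if unit in s)
    return len(samples), hits * p_sample

# Sampling frame covers the whole population (units 1 to 7).
count_full, pi_full = srswr_inclusion(range(1, 8), 2, unit=5)

# Sampling frame limited to the faces of a six-sided die (units 1 to 6).
count_die, pi_die = srswr_inclusion(range(1, 7), 2, unit=5)
_, pi_unit7 = srswr_inclusion(range(1, 7), 2, unit=7)

print(count_full, pi_full)   # 49 13/49
print(count_die, pi_die)     # 36 11/36
print(pi_unit7)              # 0
```

Exact fractions are used so that the results can be compared directly with 1 - ((N - 1)/N)^n.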
Table ?? lists all possible samples of size n = 2 from a simple random sample
without replacement (SRSWOR). In a SRSWOR, each unit has a probability of n/N = 2/7 of being in
the sample. As in SRSWR, every unit is equally likely to be in the sample, although the
value differs slightly from the SRSWR probability of 1 - (6/7)^2 = 13/49. Each possible sample in Table ?? is equally
likely with probability 1/42, because there are 7 × 6 = 42 possible ordered samples. For each sample of size 2,
there are two ways it could be obtained, considering order. Ignoring the order in which the units are
selected, there are (7 choose 2) = 21 possible different samples. Table ?? illustrates SRSWOR with
the sampling frame being all units in the target population, or simply the population.
Often when sampling, some units in the target population are not in the sampling
frame and thus have a zero probability of being included in the sample. Again
imagine that a six-sided die were used to select the units to be sampled. If the
number obtained on the second roll equaled that of the first roll, the
die would have to be rolled again to obtain two distinct units for the sample
(SRSWOR). Thus the units labeled 1 to 6 would each have a probability of 2/6 = 1/3 of being in the
sample, and unit number 7 would have a zero probability. Table ?? illustrates the latter
situation, with the sampling frame being the units labeled 1 to 6 within the
population.
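
The SRSWOR counts can be verified the same way, and the re-roll procedure with the die can be written out directly. This is a sketch only; `srswor_summary` and `roll_two_distinct` are hypothetical helper names, and the seed is arbitrary.

```python
import random
from fractions import Fraction
from itertools import combinations, permutations

def srswor_summary(frame, n, unit):
    """Count ordered and unordered SRSWOR samples and a unit's inclusion probability."""
    ordered = list(permutations(frame, n))     # order of selection matters
    unordered = list(combinations(frame, n))   # order of selection ignored
    hits = sum(1 for s in unordered if unit in s)
    return len(ordered), len(unordered), Fraction(hits, len(unordered))

def roll_two_distinct(rng):
    """Roll a six-sided die twice, re-rolling the second roll until it differs."""
    first = rng.randint(1, 6)
    second = rng.randint(1, 6)
    while second == first:
        second = rng.randint(1, 6)
    return first, second

print(srswor_summary(range(1, 8), 2, unit=5))   # 42 ordered, 21 unordered, 2/7
print(srswor_summary(range(1, 7), 2, unit=5))   # 30 ordered, 15 unordered, 1/3
print(srswor_summary(range(1, 7), 2, unit=7))   # unit 7 is outside the frame: 0

rng = random.Random(0)                          # arbitrary seed for repeatability
print(roll_two_distinct(rng))
```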
Tables ??, ??, and ?? illustrate unequal probability sampling with and without replacement. Many samples are taken using unequal probability sampling. Again, the majority of basic data analysis techniques assume that equal probability sampling was employed. In general, estimates of the population quantity of interest are much easier to calculate under unequal probability sampling with replacement than without replacement. The author wishes to warn the reader to consider carefully before deciding on an unequal probability sample without replacement. Keep in mind that software is getting better and better at analyzing complicated sampling designs, so in the future this warning may no longer be valid. The main reason for the additional complication with unequal probability sampling without replacement is determining the probability that a specific unit will be in the sample. A general formula for calculating the probability that unit i is in the sample is:
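\[
P(i) = \sum_{s \ni i} P(s),
\]
where the sum runs over every possible sample s that contains unit i. This formula could be used for SRSWR or SRSWOR. For example, for a SRSWR of size n = 2 from 6 out of the 7 units, Table ??, the probability of selecting unit 5 equals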
\[
\begin{aligned}
P(i = 5) &= \sum_{s \ni 5} P(s) \\
&= P(1, 5) + P(2, 5) + P(3, 5) + P(4, 5) + P(5, 5) + P(6, 5) + P(7, 5) \\
&= \tfrac{2}{36} + \tfrac{2}{36} + \tfrac{2}{36} + \tfrac{2}{36} + \tfrac{1}{36} + \tfrac{2}{36} + 0 \\
&= \tfrac{11}{36}
\end{aligned}
\]
Here each P(j, 5) with j ≠ 5 counts both orders of selection, (j, 5) and (5, j).
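
The sum for unit 5 can be reproduced by applying the general formula to an explicit table of sample probabilities. The sketch below assumes that each sample is stored without regard to order, so a pair of distinct units carries probability 2/36 and a repeated unit 1/36; the name `inclusion_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def inclusion_probability(sample_probs, unit):
    """Sum P(s) over every sample s that contains `unit`."""
    return sum((p for s, p in sample_probs.items() if unit in s), Fraction(0))

# SRSWR of size 2 from the frame {1, ..., 6}; unit 7 is not in the frame.
frame = range(1, 7)
sample_probs = {}
for j, k in combinations_with_replacement(frame, 2):
    # (j, j) can occur in one way; (j, k) with j != k can occur in two orders.
    sample_probs[(j, k)] = Fraction(1, 36) if j == k else Fraction(2, 36)

assert sum(sample_probs.values()) == 1          # probabilities form a distribution
print(inclusion_probability(sample_probs, 5))   # 11/36
print(inclusion_probability(sample_probs, 7))   # 0
```

Because the function only needs a mapping from samples to probabilities, the same call works for any design whose sample probabilities are known.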
\[
\begin{aligned}
P(i = 5) &= \sum_{s \ni 5} P(s) \\
&= P(1, 5) + P(2, 5) + P(3, 5) + P(4, 5) + P(6, 5) + P(7, 5) \\
&= 0.021 + 0.057 + 0.010 + 0.064 + 0.010 + 0.092 \\
&= 0.254
\end{aligned}
\]
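
The arithmetic in the last sum, and the reason unequal probability sampling without replacement takes more work, can be sketched as follows. The six terms are the values quoted above; the single-draw selection probabilities `p` are hypothetical and are not taken from the tables, and the draw-by-draw scheme (second draw renormalised over the remaining units) is an assumption made for illustration only.

```python
from fractions import Fraction
from itertools import combinations

# Sample probabilities quoted in the text for the samples containing unit 5.
terms_from_text = [0.021, 0.057, 0.010, 0.064, 0.010, 0.092]
pi_5 = round(sum(terms_from_text), 3)
print(pi_5)   # 0.254

# Hypothetical single-draw selection probabilities (NOT from the text).
p = {1: Fraction(1, 10), 2: Fraction(2, 10), 3: Fraction(3, 10), 4: Fraction(4, 10)}

def pi_with_replacement(p, i, n=2):
    """Unit i is missed on all n draws with probability (1 - p_i)^n."""
    return 1 - (1 - p[i]) ** n

def pair_prob_without_replacement(p, i, j):
    """P{i, j} under the assumed draw-by-draw scheme; both orders are added."""
    return p[i] * p[j] / (1 - p[i]) + p[j] * p[i] / (1 - p[j])

def pi_without_replacement(p, i):
    """General formula: add P(s) over every sample s that contains unit i."""
    return sum(pair_prob_without_replacement(p, i, j) for j in p if j != i)

pairs = {s: pair_prob_without_replacement(p, *s) for s in combinations(p, 2)}
assert sum(pairs.values()) == 1   # the sample probabilities form a distribution

for i in p:
    print(i, pi_with_replacement(p, i), pi_without_replacement(p, i))
```

With replacement, the inclusion probability has a closed form in p_i alone; without replacement, it depends on the selection probabilities of every other unit, which is the complication described above.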