1. Equal Allocation
Formula: n_h = n / H
Where n_h = samples allocated to stratum h, n = total samples, H = number of strata
Each stratum gets the same number of samples regardless of size.
Example: With 60 total samples and 3 groups:
Group 1 (500 items): 20 samples
Group 2 (300 items): 20 samples
Group 3 (200 items): 20 samples
Best for: When all groups are equally important, or when you want balanced representation.
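The formula above can be sketched in a few lines. This is a minimal illustration, not the tool's actual implementation: the function name equal_allocation is hypothetical, and giving any remainder to the first strata when n is not divisible by H is an assumption the text does not specify.

```python
def equal_allocation(n_total, num_strata):
    """Split n_total samples evenly across num_strata strata.

    Assumption: when n_total is not divisible by num_strata, the leftover
    samples go one each to the first strata in order.
    """
    base, remainder = divmod(n_total, num_strata)
    return [base + (1 if i < remainder else 0) for i in range(num_strata)]


print(equal_allocation(60, 3))  # [20, 20, 20]
```

A real implementation would probably also cap each count at the stratum's size, since a small stratum may hold fewer than n / H items.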
2. Proportional Allocation
Formula: n_h = n * (N_h / N)
Where N_h = stratum size, N = total population
Larger strata get more samples, proportional to their size in the population.
Example: With 60 total samples from 1000 items:
Group 1 (500 items, 50%): 30 samples
Group 2 (300 items, 30%): 18 samples
Group 3 (200 items, 20%): 12 samples
Best for: When you want the sample to reflect the population distribution.
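As a rough sketch of the proportional formula, the function below reproduces the 30 / 18 / 12 split from the example. The name proportional_allocation and the largest-remainder rounding rule are assumptions made for illustration; the text does not say how fractional allocations are rounded.

```python
def proportional_allocation(n_total, stratum_sizes):
    """Allocate n_total samples in proportion to each stratum's size.

    Assumption: fractional shares are rounded with the largest-remainder
    method so that the counts always sum to n_total.
    """
    population = sum(stratum_sizes)
    # Integer arithmetic avoids floating-point rounding surprises.
    counts = [n_total * size // population for size in stratum_sizes]
    remainders = [n_total * size % population for size in stratum_sizes]
    leftover = n_total - sum(counts)
    # Give the leftover samples to the strata with the largest remainders.
    order = sorted(range(len(counts)), key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


print(proportional_allocation(60, [500, 300, 200]))  # [30, 18, 12]
```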
3. Optimal (Neyman) Allocation
Formula: n_h = n * (N_h * S_h) / sum(N_j * S_j)
Where S_h = standard deviation in stratum h
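Allocates samples in proportion to N_h * S_h, so strata that are larger or more variable (larger standard deviation) get more samples.
For a fixed total sample size, this minimizes the variance of the estimated population mean, assuming the cost per sample is the same in every stratum.
Example (illustrative split): With 60 total samples, if Group 1 has high variance: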
Group 1 (high variance): 35 samples
Group 2 (medium variance): 15 samples
Group 3 (low variance): 10 samples
Best for: When you want the most statistically efficient sample (lowest variance for a given sample size).
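The Neyman formula can be sketched as follows. The function name neyman_allocation, the largest-remainder rounding, and the standard deviations used in the call are all hypothetical; the example in the text does not give S_h values, so the output here is not meant to match its 35 / 15 / 10 split.

```python
import math


def neyman_allocation(n_total, stratum_sizes, stratum_stds):
    """Allocate n_total samples in proportion to N_h * S_h.

    Assumption: fractional shares are rounded with the largest-remainder
    method so that the counts always sum to n_total.
    """
    products = [size * std for size, std in zip(stratum_sizes, stratum_stds)]
    total = sum(products)
    exact = [n_total * p / total for p in products]
    counts = [math.floor(x) for x in exact]
    leftover = n_total - sum(counts)
    # Hand leftover samples to the strata with the largest fractional parts.
    order = sorted(range(len(exact)), key=lambda i: exact[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


sizes = [500, 300, 200]
stds = [12.0, 8.0, 4.0]  # hypothetical standard deviations, for illustration only
print(neyman_allocation(60, sizes, stds))  # [39, 16, 5]
```

When every stratum has the same standard deviation, the S_h terms cancel and the result matches proportional allocation, which is a handy sanity check. This sketch does not cap a count at the stratum's size, which a real implementation would need to handle.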
4. Fixed Allocation
You specify the exact number of samples for each stratum manually.
Example:
Group 1: 25 samples (you specify)
Group 2: 20 samples (you specify)
Group 3: 15 samples (you specify)
Best for: When you have specific requirements for each group (e.g., need exactly 20 high-frequency and 40 low-frequency words).
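Because fixed counts are entered by hand, it can help to check them before sampling. The sketch below is one possible check, assuming counts and stratum sizes are stored as dictionaries keyed by group number; the function name validate_fixed_allocation and the data layout are hypothetical.

```python
def validate_fixed_allocation(fixed_counts, stratum_sizes):
    """Check hand-specified counts and return the total sample size.

    Raises ValueError if a count is negative or exceeds its stratum's size.
    """
    for stratum, count in fixed_counts.items():
        if count < 0:
            raise ValueError(f"Group {stratum}: count must not be negative")
        if count > stratum_sizes[stratum]:
            raise ValueError(
                f"Group {stratum}: requested {count} samples "
                f"but the stratum has only {stratum_sizes[stratum]} items"
            )
    return sum(fixed_counts.values())


total = validate_fixed_allocation(
    {1: 25, 2: 20, 3: 15},      # counts from the example above
    {1: 500, 2: 300, 3: 200},   # group sizes from the earlier examples
)
print(total)  # 60
```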
Custom Weights
Weights let you override any allocation method. If you set weights {1: 2.0, 2: 1.0, 3: 1.0},
Group 1 gets twice as many samples as Group 2 or Group 3 (for example, 30, 15, and 15 out of 60 total samples).
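One way to read the weights is as relative shares of the total sample. The sketch below follows that reading; it is an assumption, as are the function name weighted_allocation and the largest-remainder rounding, and the actual tool may combine weights with the chosen method differently.

```python
def weighted_allocation(n_total, weights):
    """Allocate n_total samples in proportion to user-supplied weights.

    Assumption: weights are non-negative relative shares, and fractional
    results are rounded with the largest-remainder method.
    """
    total_weight = sum(weights.values())
    exact = {group: n_total * w / total_weight for group, w in weights.items()}
    counts = {group: int(x) for group, x in exact.items()}
    leftover = n_total - sum(counts.values())
    # Hand leftover samples to the groups with the largest fractional parts.
    order = sorted(exact, key=lambda g: exact[g] - counts[g], reverse=True)
    for group in order[:leftover]:
        counts[group] += 1
    return counts


print(weighted_allocation(60, {1: 2.0, 2: 1.0, 3: 1.0}))  # {1: 30, 2: 15, 3: 15}
```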
Reference
Cochran, W.G. (1977). Sampling Techniques (3rd ed.), Chapter 5. Wiley.