Each stratum must have valid cases in at least two clusters. For certain smaller subclasses of the sample, strata may have to be collapsed. See the discussion below on collapsing strata.
In order to create these pseudo-strata, each pair of adjacent clusters (in numeric order) is combined into a stratum. For example, clusters 1 and 2 would be paired as belonging to stratum 1, and clusters 3 and 4 would belong to stratum 2. If there is an odd number of clusters, the last cluster will be combined with the preceding two clusters, to form a stratum with 3 clusters.
It is important to understand that this method of creating strata is done separately for each cell of the table (subclass of the sample). If some cells do not have cases from all of the clusters, the strata in some cells may be quite different from the strata in other cells. For example, if a cell does not have any cases from cluster 2, stratum 1 for that cell would consist of clusters 1 and 3, and stratum 2 would consist of clusters 4 and 5. If there are expected to be substantial differences between the clusters, it may be preferable to create explicit strata yourself, rather than let the program create them automatically.
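The pairing rule can be sketched in a few lines of Python. This is only an illustration, not the program's own code: the function name `pair_clusters` is hypothetical, and attaching the leftover cluster of an odd-sized set to the final pair is an assumption of this sketch.

```python
def pair_clusters(cluster_ids):
    """Pair adjacent clusters (in numeric order) into pseudo-strata.

    cluster_ids: the clusters that have valid cases in ONE cell of the table.
    Returns a dict mapping pseudo-stratum number -> list of cluster ids.
    """
    ordered = sorted(set(cluster_ids))
    if len(ordered) < 2:
        raise ValueError("need at least two clusters with valid cases")

    strata = {}
    # Walk through the ordered clusters two at a time.
    for k in range(0, len(ordered) - 1, 2):
        strata[k // 2 + 1] = ordered[k:k + 2]

    if len(ordered) % 2 == 1:
        # Assumption: a leftover cluster joins the final pair,
        # giving one pseudo-stratum with three clusters.
        strata[max(strata)].append(ordered[-1])
    return strata


# A cell with cases in all five clusters.
full_cell = pair_clusters([1, 2, 3, 4, 5])
# A cell with no cases from cluster 2: the pairs shift.
sparse_cell = pair_clusters([1, 3, 4, 5])
```

Calling the function once per cell, with only the clusters present in that cell, reproduces the behavior described above: the same cluster can land in different pseudo-strata in different cells.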
Once the clusters have been combined into pseudo-strata by the program, the Taylor series method is used to calculate standard errors, just as for the stratified cluster design.
However, if the pairing of clusters into pseudo-strata might result in large differences between clusters that happen to fall into the same pseudo-stratum, it may be preferable to group all the clusters into one large stratum. This will avoid having the clusters paired automatically into pseudo-strata. To do this, create a stratum variable that has the same value for all the cases (the number '1', for instance). Then define that variable as the stratum variable in the HARC file. This procedure will sacrifice any potential gains that might result from the implicit stratification of the clusters (if they have been ordered by some relevant criterion). But it will also avoid the inflation of variance that could result from the pairing up of very different clusters.
What this means is that the proportion of cases falling into each stratum, within each cell of the table, can be expected to vary from one sample to another. In other words, the stratum weights are not fixed but rather are random variables computed from each sample.
As a result, the formula for the variance of subclass means in stratified samples is used instead of the simpler ordinary formula for stratified samples, which assumes fixed stratum weights. (See Kish, pp. 132-136; especially formula 4.5.4 on p. 134.) The finite population correction (1-f) is ignored.
If weights are used for each case, the mean within each subclass is a ratio mean, since the weighted sum of cases is not fixed but would vary from one sample to another. Consequently, the Taylor series formula is used to calculate the sampling variance within each subclass. This Taylor series approximation is then used in the formula for the variance of stratified subclass means.
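To make the idea of a Taylor series variance for a ratio mean concrete, here is a minimal sketch of the generic textbook linearization for a weighted mean under a stratified cluster design, with the finite population correction ignored. It is not necessarily the exact computation the program performs (in particular, it does not reproduce the Kish formula cited above). The function name `ratio_mean_variance` and the record layout are assumptions made for illustration.

```python
from collections import defaultdict


def ratio_mean_variance(records):
    """Weighted mean and its linearized (Taylor series) variance.

    records: iterable of (stratum, cluster, weight, y) tuples for the
             cases in one subclass. Hypothetical layout for this sketch.
    Returns (mean, variance), ignoring the finite population correction.
    """
    # Weighted totals per cluster: [sum of w*y, sum of w].
    totals = defaultdict(lambda: [0.0, 0.0])
    for stratum, cluster, w, y in records:
        t = totals[(stratum, cluster)]
        t[0] += w * y
        t[1] += w

    sum_wy = sum(t[0] for t in totals.values())
    sum_w = sum(t[1] for t in totals.values())
    mean = sum_wy / sum_w  # ratio mean: both numerator and denominator vary

    # Linearized value for each cluster, grouped by stratum.
    by_stratum = defaultdict(list)
    for (stratum, _cluster), (wy, w) in totals.items():
        by_stratum[stratum].append((wy - mean * w) / sum_w)

    # Sum the between-cluster variability within each stratum.
    variance = 0.0
    for z in by_stratum.values():
        n = len(z)
        if n < 2:
            raise ValueError("each stratum needs at least two clusters")
        zbar = sum(z) / n
        variance += n / (n - 1) * sum((v - zbar) ** 2 for v in z)
    return mean, variance


# Two clusters in one stratum, one case each, equal weights.
mean, variance = ratio_mean_variance([(1, "a", 1.0, 1.0), (1, "b", 1.0, 3.0)])
standard_error = variance ** 0.5
```

The check for at least two clusters per stratum mirrors the requirement stated earlier: with a single cluster there is no within-stratum variability to measure.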
In some cells of a table (subclasses of the sample), it is possible that these requirements are not met for some of the strata. In such a case, the program will automatically combine adjacent strata, in order to be able to compute a within-stratum variance.
Since this collapsing of strata is done separately for each cell of the table, standard errors in different cells of the table may be based on different numbers of strata and different stratum definitions. The optional table of diagnostic information reports how many strata were actually used (after collapsing) for calculating standard errors in each cell.
If a stratum variable has been specified, this method of collapsing will preserve some of the gains of stratification, provided that adjacent strata (in numeric order) are more similar to one another than to strata farther removed. In other words, if the strata are part of a broader stratification scheme, it is advantageous to order the strata numerically in accordance with that scheme, so that similar strata are grouped together.
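The following sketch shows one way adjacent strata could be collapsed within a single cell. It is an illustration under stated assumptions, not the program's actual algorithm: the name `collapse_strata` is hypothetical, and the rule used here (merge an incomplete stratum forward into the next one in numeric order, and merge a trailing incomplete stratum backward) is only one plausible reading of "combine adjacent strata".

```python
def collapse_strata(clusters_by_stratum):
    """Collapse adjacent strata until every group has >= 2 clusters.

    clusters_by_stratum: dict mapping stratum id -> iterable of cluster ids
                         that have valid cases in ONE cell of the table.
    Returns a list of (stratum ids in the group, number of clusters).
    """
    groups = []  # each entry: [list of stratum ids, set of (stratum, cluster)]
    for stratum in sorted(clusters_by_stratum):
        # Key clusters by (stratum, cluster) so ids reused across strata stay distinct.
        members = {(stratum, c) for c in clusters_by_stratum[stratum]}
        if not members:
            continue  # stratum has no cases in this cell
        if groups and len(groups[-1][1]) < 2:
            # Assumption: an incomplete group absorbs the next stratum.
            groups[-1][0].append(stratum)
            groups[-1][1] |= members
        else:
            groups.append([[stratum], members])

    # A trailing incomplete group is merged backward into its neighbor.
    if len(groups) > 1 and len(groups[-1][1]) < 2:
        ids, members = groups.pop()
        groups[-1][0].extend(ids)
        groups[-1][1] |= members

    if not groups or len(groups[-1][1]) < 2:
        raise ValueError("fewer than two clusters with valid cases in this cell")
    return [(ids, len(members)) for ids, members in groups]


# Strata 2 and 4 each have only one cluster with cases in this cell.
collapsed = collapse_strata({1: [1, 2], 2: [1], 3: [1, 2], 4: [2]})
```

Because only numerically adjacent strata are merged, the result depends on how the strata are numbered, which is why ordering them according to the broader stratification scheme matters.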
If only a cluster variable has been specified, the strata are always formed by pairing adjacent clusters. If some clusters become empty in small subclasses of the sample, the program will automatically use those clusters that still have cases in them, to create new pairs. This method of creating strata will preserve some of the gains of stratification, provided that adjacent clusters (in numeric order) are more similar to one another than to clusters farther removed. In other words, it is advantageous to order the clusters by some variable that is related to the dependent variable(s), so that the pairs grouped into strata are relatively similar. In that way the implicit stratification, produced by the sorting or ordering of the clusters, will be reflected in the explicit pseudo-strata created in order to calculate standard errors.
SUDAAN has an option to deal with incomplete strata, but it cannot collapse strata. When there are strata with only one PSU for some cells, by default SUDAAN prints an error message and halts. If you use the MISSUNIT option on the NEST statement, SUDAAN will print the same message (as a warning only), and then estimate the variance contribution of that PSU by using the difference between that PSU's value and the overall mean value for the population.
That SUDAAN solution, however, is not optimal. The overall percentage or mean value is not a good substitute for the stratum-specific values, unless the stratum variable has no relationship to the dependent variable. It is usually preferable to collapse strata and calculate the variance contribution of each PSU by using the difference between that PSU and the other(s) within the collapsed stratum, which is what SDA does automatically. However, the consequences of combining PSUs into new strata depend on how the strata are ordered numerically. See the section above on collapsing strata for a discussion of those issues.