Exact statistics


Exact statistics, exemplified by the exact test, is a branch of statistics that was developed to provide more accurate results for statistical testing and interval estimation by eliminating procedures based on asymptotic and approximate statistical methods. The main characteristic of exact methods is that statistical tests and confidence intervals are based on exact probability statements that are valid for any sample size. Exact statistical methods help avoid some of the unreasonable assumptions of traditional statistical methods, such as the assumption of equal variances in classical ANOVA. They also allow exact inference on the variance components of mixed models.

When exact p-values and confidence intervals are computed under a particular distribution, such as the normal distribution, the underlying methods are referred to as exact parametric methods. Exact methods that make no distributional assumptions are referred to as exact nonparametric methods. The latter have the advantage of making fewer assumptions, whereas the former tend to yield more powerful tests when the distributional assumption is reasonable. For advanced methods such as higher-way ANOVA, regression analysis, and mixed models, only exact parametric methods are available.

When the sample size is small, asymptotic results given by some traditional methods may not be valid. In such situations, the asymptotic p-values may differ substantially from the exact p-values. Hence asymptotic and other approximate results may lead to unreliable and misleading conclusions.
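
As a concrete illustration, the following sketch compares the asymptotic chi-squared p-value with Fisher's exact p-value for a small 2×2 contingency table using SciPy; the counts are hypothetical, chosen only to show how far the two can diverge when cells are sparse.

  import numpy as np
  from scipy.stats import chi2_contingency, fisher_exact

  # Hypothetical small 2x2 table (e.g., treatment vs. outcome counts).
  table = np.array([[3, 1],
                    [1, 4]])

  # Asymptotic chi-squared test of independence (no continuity correction).
  chi2, p_asymptotic, dof, expected = chi2_contingency(table, correction=False)

  # Fisher's exact test: p-value from exact hypergeometric probabilities.
  odds_ratio, p_exact = fisher_exact(table)

  print(f"asymptotic chi-squared p-value: {p_asymptotic:.4f}")
  print(f"Fisher's exact p-value:         {p_exact:.4f}")

With counts this small, the chi-squared approximation is poor, which is exactly the situation in which the two p-values disagree.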

The approach

All classical statistical procedures are constructed using statistics that depend only on observable random vectors, whereas the generalized estimators, tests, and confidence intervals used in exact statistics take advantage of both the observable random vectors and their observed values, as in the Bayesian approach, but without having to treat constant parameters as random variables. For example, in sampling from a normal population with mean μ and variance σ², suppose X̄ and S² are the sample mean and the sample variance. Then define

Z = \sqrt{n}\,(\overline{X} - \mu)/\sigma \sim N(0, 1)

and

U = nS^2/\sigma^2 \sim \chi^2_{n-1}.

Now suppose the parameter of interest is the coefficient of variation, ρ = μ/σ. Then we can perform exact tests and construct exact confidence intervals for ρ based on the generalized statistic

R = \frac{\overline{x}\,S}{s\,\sigma} - \frac{\overline{X} - \mu}{\sigma} = \frac{\overline{x}}{s}\,\frac{\sqrt{U}}{\sqrt{n}} - \frac{Z}{\sqrt{n}},

where x̄ is the observed value of X̄ and s is the observed value of S. Exact inferences on ρ based on probabilities and expected values of R are possible because its distribution and its observed value are both free of nuisance parameters.
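
Because the distribution of R is completely known once x̄ and s are observed, it can be simulated directly, and its percentiles form a generalized confidence interval for ρ. The following is a minimal Monte Carlo sketch of that construction; the sample size, the observed values of x̄ and s, and the number of draws are illustrative assumptions.

  import numpy as np

  rng = np.random.default_rng(0)

  # Hypothetical observed summaries from a normal sample of size n.
  n = 10
  x_bar = 5.2   # observed value of the sample mean
  s = 1.3       # observed value of S

  # Simulate R = (x_bar/s) * sqrt(U/n) - Z/sqrt(n), whose distribution is
  # free of unknown parameters: Z ~ N(0,1) and U ~ chi-squared with n-1 df.
  draws = 100_000
  Z = rng.standard_normal(draws)
  U = rng.chisquare(df=n - 1, size=draws)
  R = (x_bar / s) * np.sqrt(U / n) - Z / np.sqrt(n)

  # The observed value of R equals rho = mu/sigma, so the central percentiles
  # of the simulated R give a 95% generalized confidence interval for rho.
  lower, upper = np.percentile(R, [2.5, 97.5])
  print(f"95% generalized confidence interval for rho: ({lower:.3f}, {upper:.3f})")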

Generalized p-values

Classical statistical methods do not provide exact tests for many statistical problems, such as testing variance components and performing ANOVA under unequal variances. To rectify this situation, generalized p-values are defined as an extension of the classical p-values, so that one can perform tests based on exact probability statements valid for any sample size.
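
Continuing the coefficient-of-variation example above, the sketch below computes a one-sided generalized p-value for H₀: ρ ≤ ρ₀ against H₁: ρ > ρ₀ by Monte Carlo; the observed summaries, the null value ρ₀, and the number of draws are illustrative assumptions. Since the observed value of R is ρ, small values of Pr(R ≤ ρ₀) count as evidence against H₀.

  import numpy as np

  rng = np.random.default_rng(0)

  # Hypothetical observed summaries and null value for rho = mu/sigma.
  n = 10
  x_bar = 5.2
  s = 1.3
  rho_0 = 2.0

  # Simulate R exactly as in the confidence-interval sketch above.
  draws = 100_000
  Z = rng.standard_normal(draws)
  U = rng.chisquare(df=n - 1, size=draws)
  R = (x_bar / s) * np.sqrt(U / n) - Z / np.sqrt(n)

  # Generalized p-value for H0: rho <= rho_0 versus H1: rho > rho_0;
  # small values indicate the data favour rho > rho_0.
  p_generalized = np.mean(R <= rho_0)
  print(f"generalized p-value: {p_generalized:.4f}")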

External links

  • XPro, a free software package for exact parametric statistics