Semiparametric model

Type of statistical model

In statistics, a semiparametric model is a statistical model that has parametric and nonparametric components.

A statistical model is a parameterized family of distributions $\{P_\theta : \theta \in \Theta\}$ indexed by a parameter $\theta$.

  • A parametric model is a model in which the indexing parameter $\theta$ is a vector in $k$-dimensional Euclidean space, for some nonnegative integer $k$.[1] Thus, $\theta$ is finite-dimensional, and $\Theta \subseteq \mathbb{R}^k$.
  • With a nonparametric model, the set of possible values of the parameter $\theta$ is a subset of some space $V$, which is not necessarily finite-dimensional. For example, we might consider the set of all distributions with mean 0. Such spaces are vector spaces with topological structure, but may not be finite-dimensional as vector spaces. Thus, $\Theta \subseteq V$ for some possibly infinite-dimensional space $V$.
  • With a semiparametric model, the parameter has both a finite-dimensional component and an infinite-dimensional component (often a real-valued function defined on the real line). Thus, $\Theta \subseteq \mathbb{R}^k \times V$, where $V$ is an infinite-dimensional space.
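As a rough illustration of the three cases, the display below lists one family of each kind. The normal family and the symmetric location family are standard textbook examples introduced here as assumptions, not taken from the text above; only the mean-zero family appears in the definitions. The space $V$ differs from row to row.

```latex
\begin{align*}
\text{parametric:} \quad
  & \{N(\mu,\sigma^2)\},
  & \theta &= (\mu,\sigma^2) \in \mathbb{R}\times(0,\infty) \subseteq \mathbb{R}^2, \\
\text{nonparametric:} \quad
  & \Bigl\{P : \int x \, dP(x) = 0\Bigr\},
  & \theta &= P \in \Theta \subseteq V, \\
\text{semiparametric:} \quad
  & \{f(\cdot - \mu) : f \text{ a density symmetric about } 0\},
  & \theta &= (\mu, f) \in \Theta \subseteq \mathbb{R}\times V.
\end{align*}
```

In the last family the location $\mu$ is the finite-dimensional component and the unknown shape $f$ is the infinite-dimensional component.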

It may appear at first that semiparametric models include nonparametric models, since they have an infinite-dimensional as well as a finite-dimensional component. However, a semiparametric model is considered to be "smaller" than a completely nonparametric model because we are often interested only in the finite-dimensional component of $\theta$. That is, the infinite-dimensional component is regarded as a nuisance parameter.[2] In nonparametric models, by contrast, the primary interest is in estimating the infinite-dimensional parameter. Thus the estimation task is statistically harder in nonparametric models.

Estimation in these models often relies on smoothing techniques such as kernel methods.
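As one illustration of what kernel smoothing can look like, the sketch below implements a Nadaraya-Watson style weighted average with a Gaussian kernel. The choice of estimator, the Gaussian kernel, and the names `gaussian_kernel`, `kernel_smooth`, and `bandwidth` are assumptions made for this example; the text above does not single out any particular smoother.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumed choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_smooth(x0, xs, ys, bandwidth):
    """Estimate an unknown function at x0 as a kernel-weighted average of ys.

    Observations whose xs lie close to x0 (relative to the bandwidth) receive
    more weight. No form is assumed for the function being estimated.
    """
    weights = [gaussian_kernel((x0 - xi) / bandwidth) for xi in xs]
    total = sum(weights)
    # If every observation is extremely far from x0, total can underflow to 0;
    # this sketch does not guard against that case.
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical data, for illustration only.
xs = [-1.0, 0.0, 1.0]
ys = [1.0, 0.0, 1.0]
estimate_at_zero = kernel_smooth(0.0, xs, ys, bandwidth=1.0)
```

A smaller `bandwidth` makes the estimate follow nearby observations more closely, while a larger one averages over a wider neighbourhood.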

Example

A well-known example of a semiparametric model is the Cox proportional hazards model.[3] If we are interested in studying the time $T$ to an event such as death due to cancer or failure of a light bulb, the Cox model specifies the following distribution function for $T$:

$$F(t) = 1 - \exp\left(-\int_0^t \lambda_0(u)\, e^{\beta x}\, du\right),$$

where $x$ is the covariate vector, and $\beta$ and $\lambda_0(u)$ are unknown parameters, so that $\theta = (\beta, \lambda_0(u))$. Here $\beta$ is finite-dimensional and is of interest; $\lambda_0(u)$ is an unknown non-negative function of time (known as the baseline hazard function) and is often a nuisance parameter. The set of possible candidates for $\lambda_0(u)$ is infinite-dimensional.
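The distribution function of the Cox model can be evaluated numerically once a value of $\beta$ and a baseline hazard are supplied. The following is a minimal sketch under stated assumptions: the function name `cox_cdf`, the trapezoidal integration, and the example baseline hazards are all hypothetical choices made for illustration, and $\beta x$ is read as the inner product of the two vectors.

```python
import math


def cox_cdf(t, x, beta, baseline_hazard, n_steps=1000):
    """Evaluate F(t) = 1 - exp(-integral_0^t lambda_0(u) * exp(beta . x) du).

    baseline_hazard is any non-negative function of time (the
    infinite-dimensional component); beta is the finite-dimensional component.
    The integral is approximated with the trapezoidal rule (an assumed choice).
    """
    # exp(beta . x) does not depend on u, so it factors out of the integral.
    relative_risk = math.exp(sum(b * xi for b, xi in zip(beta, x)))

    step = t / n_steps
    total = 0.5 * (baseline_hazard(0.0) + baseline_hazard(t))
    for i in range(1, n_steps):
        total += baseline_hazard(i * step)
    cumulative_baseline_hazard = total * step

    return 1.0 - math.exp(-cumulative_baseline_hazard * relative_risk)


# Hypothetical inputs: a constant baseline hazard and a single covariate.
f_constant = cox_cdf(2.0, x=[1.0], beta=[0.0], baseline_hazard=lambda u: 1.0)

# Same beta and x, but a different (linearly increasing) baseline hazard.
f_increasing = cox_cdf(1.5, x=[1.0], beta=[0.0], baseline_hazard=lambda u: 2.0 * u)
```

Swapping `baseline_hazard` while holding `beta` fixed changes the whole distribution, which is why the baseline hazard counts as a parameter of the model even when only $\beta$ is of interest.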

Notes

  1. Bickel, P. J.; Klaassen, C. A. J.; Ritov, Y.; Wellner, J. A. (2006), "Semiparametrics", in Kotz, S.; et al. (eds.), Encyclopedia of Statistical Sciences, Wiley.
  2. Oakes, D. (2006), "Semi-parametric models", in Kotz, S.; et al. (eds.), Encyclopedia of Statistical Sciences, Wiley.
  3. Balakrishnan, N.; Rao, C. R. (2004). Handbook of Statistics 23: Advances in Survival Analysis. Elsevier. p. 126.

References

  • Bickel, P. J.; Klaassen, C. A. J.; Ritov, Y.; Wellner, J. A. (1998), Efficient and Adaptive Estimation for Semiparametric Models, Springer
  • Härdle, Wolfgang; Müller, Marlene; Sperlich, Stefan; Werwatz, Axel (2004), Nonparametric and Semiparametric Models, Springer
  • Kosorok, Michael R. (2008), Introduction to Empirical Processes and Semiparametric Inference, Springer
  • Tsiatis, Anastasios A. (2006), Semiparametric Theory and Missing Data, Springer
  • Begun, Janet M.; Hall, W. J.; Huang, Wei-Min; Wellner, Jon A. (1983), "Information and asymptotic efficiency in parametric-nonparametric models", Annals of Statistics, 11 (2): 432-452