Nonparametric regression

Nonparametric regression is a category of regression analysis in which the predictor does not take a predetermined form but is constructed according to information derived from the data. That is, no parametric form is assumed for the relationship between the predictors and the dependent variable. Nonparametric regression requires larger sample sizes than regression based on parametric models because the data must supply the model structure as well as the model estimates.

Definition

In nonparametric regression, we have random variables $X$ and $Y$ and assume the following relationship:

$$\mathbb{E}[Y \mid X = x] = m(x),$$

where $m(x)$ is some deterministic function. Linear regression is a restricted case of nonparametric regression where $m(x)$ is assumed to be affine. Some authors use a slightly stronger assumption of additive noise:

$$Y = m(X) + U,$$

where the random variable $U$ is the "noise term", with mean 0. Without the assumption that $m$ belongs to a specific parametric family of functions, it is impossible to get an unbiased estimate for $m$; however, most estimators are consistent under suitable conditions.
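The additive-noise model can be made concrete with a small simulation. The sketch below is illustrative only: the choice of $m(x) = \sin(x)$, the noise level, the sample size, and the helper name `knn_regress` are all assumptions, and the k-nearest-neighbour average is simply one easy nonparametric estimator picked for the comparison. It contrasts an affine fit (the linear regression special case) with an estimate that assumes no parametric form for $m$.

```python
import numpy as np

rng = np.random.default_rng(0)

def m(x):
    # Hypothetical true regression function, chosen only for illustration.
    return np.sin(x)

n = 500
X = rng.uniform(0.0, 2.0 * np.pi, size=n)
U = rng.normal(0.0, 0.3, size=n)  # noise term with mean 0
Y = m(X) + U                      # additive-noise model Y = m(X) + U

# Evaluate both estimates on a grid away from the boundary.
grid = np.linspace(0.5, 2.0 * np.pi - 0.5, 50)

# Parametric fit: assume m is affine, m(x) = a + b * x.
b, a = np.polyfit(X, Y, deg=1)  # coefficients come highest degree first
affine_pred = a + b * grid

def knn_regress(x0, X, Y, k=25):
    """Nonparametric estimate: average Y over the k sample points nearest x0."""
    idx = np.argsort(np.abs(X - x0))[:k]
    return Y[idx].mean()

knn_pred = np.array([knn_regress(x0, X, Y) for x0 in grid])

# Mean squared error of each estimate against the true m on the grid.
affine_mse = np.mean((affine_pred - m(grid)) ** 2)
knn_mse = np.mean((knn_pred - m(grid)) ** 2)
print(f"affine MSE: {affine_mse:.4f}, k-NN MSE: {knn_mse:.4f}")
```

Because the assumed $m$ is far from affine on this interval, the affine fit carries a bias that more data cannot remove, whereas the local average tracks the curve at the cost of needing enough nearby points.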

List of general-purpose nonparametric regression algorithms

This is a non-exhaustive list of nonparametric models for regression.

Examples

Gaussian process regression or Kriging

In Gaussian process regression, also known as Kriging, a Gaussian prior is assumed for the regression curve. The errors are assumed to have a multivariate normal distribution, and the regression curve is estimated by its posterior mode. The Gaussian prior may depend on unknown hyperparameters, which are usually estimated via empirical Bayes. The hyperparameters typically specify a prior covariance kernel. If the kernel is also to be inferred nonparametrically from the data, the critical filter can be used.

Smoothing splines have an interpretation as the posterior mode of a Gaussian process regression.
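As a rough illustration of the posterior computation, the following sketch evaluates the Gaussian process posterior at a set of test points. For a Gaussian posterior the mode coincides with the mean, so the returned mean is the estimated regression curve. The squared-exponential kernel, the fixed hyperparameter values, the synthetic data, and the function names `rbf_kernel` and `gp_posterior` are assumptions made for this example; in particular, the hyperparameters are fixed by hand here rather than estimated via empirical Bayes.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between 1-D point sets a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise_var=0.01, **kernel_args):
    """Posterior mean and covariance of a zero-mean GP at x_test."""
    K = rbf_kernel(x_train, x_train, **kernel_args)
    K = K + noise_var * np.eye(len(x_train))       # add Gaussian error variance
    K_s = rbf_kernel(x_train, x_test, **kernel_args)
    K_ss = rbf_kernel(x_test, x_test, **kernel_args)

    # Solve with a Cholesky factor instead of forming an explicit inverse.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v
    return mean, cov

# Synthetic data (assumed): noisy samples of sin(x).
rng = np.random.default_rng(1)
x_train = np.linspace(0.0, 5.0, 20)
y_train = np.sin(x_train) + rng.normal(0.0, 0.1, size=x_train.size)
x_test = np.linspace(0.0, 5.0, 50)

mean, cov = gp_posterior(x_train, y_train, x_test, noise_var=0.01)
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))    # pointwise posterior spread
```

The array `std` gives a pointwise measure of posterior uncertainty, which grows where the test points lie far from the training data.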

Kernel regression

Figure: Example of a curve (red line) fit to a small data set (black points) with nonparametric regression using a Gaussian kernel smoother. The pink shaded area illustrates the kernel function applied to obtain an estimate of y for a given value of x. The kernel function defines the weight given to each data point in producing the estimate for a target point.

Kernel regression estimates the continuous dependent variable from a limited set of data points by convolving the data points' locations with a kernel function. Roughly speaking, the kernel function specifies how to "blur" the influence of the data points so that their values can be used to predict the value for nearby locations.
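One common form of this idea is the Nadaraya-Watson estimator, which predicts the value at a target point $x$ as a kernel-weighted average of the observed responses: $\hat{m}(x) = \sum_i K((x - x_i)/h)\, y_i \,/\, \sum_i K((x - x_i)/h)$. The sketch below assumes a Gaussian kernel and a hand-picked bandwidth $h$; the function names and the small data set are hypothetical and serve only to show the weighting step.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used as the weighting kernel."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kernel_regress(x_query, x_data, y_data, bandwidth):
    """Nadaraya-Watson estimate at each query point.

    Each data point is weighted by the kernel evaluated at its scaled
    distance from the query, and the weights are normalised to sum to 1.
    """
    x_query = np.atleast_1d(np.asarray(x_query, dtype=float))
    u = (x_query[:, None] - x_data[None, :]) / bandwidth
    w = gaussian_kernel(u)                  # shape (n_query, n_data)
    return (w @ y_data) / w.sum(axis=1)

# Small illustrative data set (assumed values).
x_data = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_data = np.array([0.0, 0.8, 0.9, 0.1, -0.8])

estimate = kernel_regress([1.5, 2.0], x_data, y_data, bandwidth=0.5)
```

The bandwidth controls the amount of blurring: a small value makes the estimate follow individual data points closely, while a large value averages over most of the sample.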

Regression trees

Decision tree learning algorithms can be applied to learn to predict a dependent variable from data.[2] Although the original Classification And Regression Tree (CART) formulation applied only to predicting univariate data, the framework can be used to predict multivariate data, including time series.[3]
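A heavily simplified sketch of a regression tree for one predictor and one response is given below. It chooses each split by minimising the summed squared error of the two resulting groups and predicts with the mean response in each leaf. This is not the full CART procedure (there is no pruning, for instance), and the function names, the dictionary representation of the tree, and the step-function data are all assumptions made for illustration.

```python
import numpy as np

def best_split(x, y, min_leaf):
    """Return (sse, threshold) of the best split, or None if no split is possible."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best = None
    for i in range(min_leaf, len(xs) - min_leaf + 1):
        if xs[i] == xs[i - 1]:
            continue  # cannot place a threshold between equal x values
        left, right = ys[:i], ys[i:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, 0.5 * (xs[i] + xs[i - 1]))
    return best

def fit_tree(x, y, max_depth=3, min_leaf=2):
    """Recursively build a tree stored as nested dictionaries."""
    if max_depth == 0 or len(y) < 2 * min_leaf:
        return {"value": y.mean()}
    split = best_split(x, y, min_leaf)
    if split is None:
        return {"value": y.mean()}
    _, threshold = split
    mask = x <= threshold
    return {
        "threshold": threshold,
        "left": fit_tree(x[mask], y[mask], max_depth - 1, min_leaf),
        "right": fit_tree(x[~mask], y[~mask], max_depth - 1, min_leaf),
    }

def predict_tree(tree, x0):
    """Follow the splits down to a leaf and return its mean response."""
    while "threshold" in tree:
        tree = tree["left"] if x0 <= tree["threshold"] else tree["right"]
    return tree["value"]

# Illustrative data (assumed): a noiseless step function.
x = np.linspace(0.0, 1.0, 40)
y = np.where(x < 0.5, 1.0, 3.0)
tree = fit_tree(x, y, max_depth=1)
```

The fitted function is piecewise constant, so the tree adapts its shape to the data without assuming any global parametric form.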

References

  1. Cherkassky, Vladimir; Mulier, Filip (1994). Cheeseman, P.; Oldford, R. W. (eds.). "Statistical and neural network techniques for nonparametric regression". Selecting Models from Data. Lecture Notes in Statistics. New York, NY: Springer: 383–392. doi:10.1007/978-1-4612-2660-4_39. ISBN 978-1-4612-2660-4.
  2. Breiman, Leo; Friedman, J. H.; Olshen, R. A.; Stone, C. J. (1984). Classification and regression trees. Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software. ISBN 978-0-412-04841-8.
  3. Segal, M. R. (1992). "Tree-structured methods for longitudinal data". Journal of the American Statistical Association. 87 (418). American Statistical Association, Taylor & Francis: 407–418. doi:10.2307/2290271. JSTOR 2290271.

Further reading

  • Bowman, A. W.; Azzalini, A. (1997). Applied Smoothing Techniques for Data Analysis. Oxford: Clarendon Press. ISBN 0-19-852396-3.
  • Fan, J.; Gijbels, I. (1996). Local Polynomial Modelling and its Applications. Boca Raton: Chapman and Hall. ISBN 0-412-98321-4.
  • Henderson, D. J.; Parmeter, C. F. (2015). Applied Nonparametric Econometrics. New York: Cambridge University Press. ISBN 978-1-107-01025-3.
  • Li, Q.; Racine, J. (2007). Nonparametric Econometrics: Theory and Practice. Princeton: Princeton University Press. ISBN 978-0-691-12161-1.
  • Pagan, A.; Ullah, A. (1999). Nonparametric Econometrics. New York: Cambridge University Press. ISBN 0-521-35564-8.

External links

  • HyperNiche, software for nonparametric multiplicative regression.
  • Scale-adaptive nonparametric regression (with Matlab software).