Introduction

NPRLab is a MATLAB toolbox for nonparametric regression. In particular, it implements a variety of "linear estimators". A linear estimator $r_n(x)$ can be written as $$r_n(x) = \sum_{i=1}^{n} l_i(x) Y_i,$$ for each $x$. Defining the vectors $r = (r_n(x_1),\ldots,r_n(x_n))^T$ and $Y = (Y_1,\ldots,Y_n)^T$, one can write these as a matrix-vector product: $$r = LY,$$ with $L$ the smoother matrix and $L_{ij} = l_j(x_i)$.
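As a concrete illustration of the smoother-matrix form, the following Python sketch builds $L$ for the Nadaraya-Watson estimator, whose weights are $l_i(x) = K((x - x_i)/h) / \sum_j K((x - x_j)/h)$, and then computes $r = LY$. It is not NPRLab code and does not use the toolbox's API. The function names, the Gaussian kernel, the bandwidth `h = 0.05` and the noise-free test data are assumptions made for the example.

```python
import math

def gaussian_kernel(u):
    """Standard Gaussian kernel K(u) (an illustrative choice of kernel)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def nw_weights(x, xs, h):
    """Weights l_1(x), ..., l_n(x) of the Nadaraya-Watson estimator at x."""
    k = [gaussian_kernel((x - xi) / h) for xi in xs]
    total = sum(k)
    return [ki / total for ki in k]

def smoother_matrix(xs, h):
    """Smoother matrix L with L[i][j] = l_j(x_i)."""
    return [nw_weights(xi, xs, h) for xi in xs]

def fitted_values(L, ys):
    """Matrix-vector product r = L Y."""
    return [sum(lij * yj for lij, yj in zip(row, ys)) for row in L]

# Hypothetical data: 20 equally spaced design points, noise-free responses.
xs = [i / 19 for i in range(20)]
ys = [math.exp(-32.0 * (x - 0.5) ** 2) for x in xs]

L = smoother_matrix(xs, h=0.05)
r = fitted_values(L, ys)
```

Each row of `L` sums to one for this estimator, so every fitted value is a weighted average of the observed responses.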

NPRLab is built around and requires the LS-SVMlab toolbox to work. The following linear estimators are currently implemented:

  • Nadaraya-Watson,
  • Local polynomial regression,
  • Priestley-Chao,
  • Least Squares Support Vector Machines (through the LS-SVMlab toolbox).

Moreover, other linear estimators can easily be added to the toolbox. Data rescaling and bandwidth tuning are done automatically by the software. The bandwidth is tuned using (i) leave-one-out cross-validation, (ii) generalized cross-validation or (iii) the corrected Akaike information criterion (AICc). Finally, the toolbox also features confidence interval estimation by the volume-of-tube formula in both one-dimensional and multi-dimensional settings (see Sun, J., and Loader, C. R. (1994). Simultaneous confidence bands for linear regression and smoothing. The Annals of Statistics, 1328-1345.). The computation of the multi-dimensional volume-of-tube formula uses multiple CPU cores if the Parallel Computing Toolbox is installed. The toolbox includes a number of examples that demonstrate its usage.
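For a linear estimator, all three tuning criteria can be computed from the smoother matrix $L$ alone, without refitting the model $n$ times. The Python sketch below shows the standard textbook formulas for a Nadaraya-Watson smoother. It is not NPRLab's implementation. The function names, the Gaussian kernel, the simulated data and the candidate bandwidth grid are all assumptions made for the example.

```python
import math
import random

def nw_smoother_matrix(xs, h):
    """Nadaraya-Watson smoother matrix with a Gaussian kernel (illustrative choice)."""
    L = []
    for xi in xs:
        k = [math.exp(-0.5 * ((xi - xj) / h) ** 2) for xj in xs]
        total = sum(k)
        L.append([kj / total for kj in k])
    return L

def criteria(xs, ys, h):
    """Return the (leave-one-out CV, GCV, corrected AIC) scores for bandwidth h."""
    n = len(xs)
    L = nw_smoother_matrix(xs, h)
    r = [sum(lij * yj for lij, yj in zip(row, ys)) for row in L]
    resid = [yi - ri for yi, ri in zip(ys, r)]
    trace = sum(L[i][i] for i in range(n))   # effective degrees of freedom
    mse = sum(e * e for e in resid) / n
    # Leave-one-out shortcut: each residual is inflated by 1 / (1 - L_ii).
    loocv = sum((e / (1.0 - L[i][i])) ** 2 for i, e in enumerate(resid)) / n
    # GCV replaces every L_ii by the average trace(L) / n.
    gcv = mse / (1.0 - trace / n) ** 2
    # Corrected AIC in the form of Hurvich, Simonoff and Tsai (1998).
    aicc = math.log(mse) + 1.0 + 2.0 * (trace + 1.0) / (n - trace - 2.0)
    return loocv, gcv, aicc

# Hypothetical noisy data and candidate bandwidths.
random.seed(0)
xs = [i / 49 for i in range(50)]
ys = [math.exp(-32.0 * (x - 0.5) ** 2) + random.gauss(0.0, 0.1) for x in xs]
hs = [0.02, 0.03, 0.05, 0.08, 0.12, 0.2, 0.3]

scores = {h: criteria(xs, ys, h) for h in hs}
best_loocv = min(hs, key=lambda h: scores[h][0])
best_gcv = min(hs, key=lambda h: scores[h][1])
best_aicc = min(hs, key=lambda h: scores[h][2])
```

Each criterion is minimized over the candidate grid. The three selected bandwidths need not coincide.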

One-dimensional kernel regression

Nadaraya-Watson estimate (red) along with a 95% simultaneous confidence interval and the true underlying function $\exp(-32(X-0.5)^2)$ (green).
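In the one-dimensional case, a commonly used two-term form of the volume-of-tube approximation chooses the critical value $c$ of the band $r_n(x) \pm c\,\hat{\sigma}\,\|l(x)\|$ by solving $2(1 - \Phi(c)) + (\kappa_0/\pi)\,e^{-c^2/2} = \alpha$, where $\kappa_0$ is the length of the curve $T(x) = l(x)/\|l(x)\|$. The Python sketch below illustrates that calculation for Nadaraya-Watson weights. It is not NPRLab's implementation. The function names, the Gaussian kernel, the bisection solver and the polyline approximation of $\kappa_0$ are assumptions made for the example.

```python
import math

def normal_cdf(c):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))

def tail_approx(c, kappa0):
    """Two-term tube approximation to P(sup_x |Z(x)| > c)."""
    return 2.0 * (1.0 - normal_cdf(c)) + (kappa0 / math.pi) * math.exp(-0.5 * c * c)

def critical_value(kappa0, alpha=0.05, lo=0.0, hi=10.0):
    """Solve tail_approx(c) = alpha by bisection (tail_approx decreases in c >= 0)."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if tail_approx(mid, kappa0) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def kappa0_nw(xs, h, grid_size=400):
    """Approximate kappa0 by the length of a polyline through T(x) on a fine grid."""
    a, b = min(xs), max(xs)
    prev, length = None, 0.0
    for g in range(grid_size + 1):
        x = a + (b - a) * g / grid_size
        # The normalizing sum of the weights cancels in l(x) / ||l(x)||.
        k = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in xs]
        norm = math.sqrt(sum(v * v for v in k))
        t = [v / norm for v in k]
        if prev is not None:
            length += math.sqrt(sum((u - v) ** 2 for u, v in zip(t, prev)))
        prev = t
    return length

# Hypothetical design points and bandwidth.
xs = [i / 49 for i in range(50)]
kappa0 = kappa0_nw(xs, h=0.05)
c = critical_value(kappa0, alpha=0.05)
```

With $\kappa_0 = 0$ the equation reduces to the usual pointwise normal quantile, so the simultaneous critical value is always at least as large as the pointwise one.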

Multi-dimensional kernel regression

LS-SVM estimate (red) along with a 95% simultaneous confidence band and the true underlying function (shaded area). The underlying function is a bivariate normal distribution.

Accuracy vs. computation time

The accuracy of the multi-dimensional volume-of-tube approximation is customizable: choose the 3-term, 2-term or 1-term approximation, in order of decreasing accuracy and increasing speed. The figure shows the computation time for a dataset of size N = 576.

Download and install

NPRLab is released under the GNU GPLv3 license.

Installation steps:

  1. Download LS-SVMlab here, unzip into a folder of choice and add the folder to the MATLAB path.
  2. Download and unzip NPRLab into a folder of choice.
  3. Run "install.m" within the NPRLab folder to add the required folders to the MATLAB path.
  4. (Optional) Run one of the examples in the "examples" folder.

A short quickguide to the basic usage is available here. The examples in the "examples" subfolder should also help to get you started. In addition, use the MATLAB 'help' function to get additional information on the functionality.

Known issues/bugs:

  • If dim(X) > 1 and the degree of the local polynomial p > 1, then local polynomial regression gives incorrect results. All other cases work fine.

About the author

Pieter Jan Kerstens is a postdoctoral researcher in economics at the Department of Food and Resource Economics of the University of Copenhagen (Denmark). He obtained a PhD in economics from KU Leuven (Belgium), master's degrees in economics and mathematical engineering from KU Leuven (Belgium) and a bachelor's degree in computer science from Universiteit Antwerpen (Belgium). Feel free to contact him with any bugs/comments/suggestions you have at

I'd love to hear how NPRLab is used, so if you use NPRLab then feel free to drop me a note saying where and how you used it. If you use this toolbox in any work then I kindly ask you to give credit by appropriately citing the quickguide or this website where possible.
