Exploring the use of variable bandwidth kernel density estimators

Variable bandwidth kernel density estimators increase the window width at low densities and decrease it where data concentrate. This is an improvement over fixed bandwidth kernel density estimators. In this article, we explore the use of one implementation of a variable kernel estimator in conjunction with several rules and procedures for bandwidth selection, applied to several real datasets. The examples considered allow us to state that, with tens or a few hundred observations, the least-squares cross-validation bandwidth rarely produces useful estimates; with thousands of observations, this problem can be overcome. The optimal bandwidth and biased cross-validation (BCV) generally oversmooth multimodal densities. The Sheather–Jones plug-in rule produced bandwidths that behave slightly better in this respect. The Silverman test is considered a very sophisticated and safe procedure for estimating the number of modes in univariate distributions; however, similar results could be obtained with the Sheather–Jones rule at a much lower computational cost. As expected, the variable bandwidth kernel density estimates showed fewer modes than those chosen by the Silverman test, especially for distributions in which multimodality was caused by several noisy minor modes. More research on the subject is needed.
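To make the idea of a window that widens at low densities concrete, the following is a minimal Python sketch of one common adaptive scheme: a fixed-bandwidth pilot estimate, followed by local bandwidth factors proportional to the pilot density raised to the power -alpha. It is not the implementation examined in the article. The Gaussian kernel, the rule-of-thumb pilot bandwidth, the value alpha = 0.5, and all function names are assumptions chosen for illustration.

```python
import math
import random
import statistics


def gaussian(u):
    """Standard normal kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def rule_of_thumb_bandwidth(data):
    """Rule-of-thumb bandwidth 0.9 * min(sd, IQR / 1.34) * n^(-1/5).
    Assumption: used here only as a convenient pilot bandwidth."""
    n = len(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    return 0.9 * min(sd, (q3 - q1) / 1.34) * n ** (-0.2)


def fixed_kde(x, data, h):
    """Fixed-bandwidth kernel density estimate at x."""
    return sum(gaussian((x - xi) / h) for xi in data) / (len(data) * h)


def local_factors(data, h, alpha=0.5):
    """Local bandwidth factors (pilot density / geometric mean)^(-alpha).
    Factors exceed 1 where the pilot density is low, so the window widens
    there, and fall below 1 where the data concentrate."""
    pilot = [fixed_kde(xi, data, h) for xi in data]
    g = math.exp(sum(math.log(p) for p in pilot) / len(pilot))
    return [(p / g) ** (-alpha) for p in pilot]


def variable_kde(x, data, h, factors):
    """Variable-bandwidth estimate: observation i gets window h * factors[i]."""
    total = 0.0
    for xi, lam in zip(data, factors):
        w = h * lam
        total += gaussian((x - xi) / w) / w
    return total / len(data)


def count_modes(density, grid):
    """Count local maxima of a density evaluated on a grid."""
    vals = [density(x) for x in grid]
    return sum(1 for i in range(1, len(vals) - 1)
               if vals[i - 1] < vals[i] >= vals[i + 1])


# Illustrative use on simulated data (hypothetical, not from the article).
random.seed(1)
sample = ([random.gauss(0.0, 1.0) for _ in range(150)]
          + [random.gauss(6.0, 0.5) for _ in range(50)])
h = rule_of_thumb_bandwidth(sample)
lam = local_factors(sample, h)
grid = [-5.0 + 0.05 * i for i in range(301)]
print("modes, fixed bandwidth:   ",
      count_modes(lambda x: fixed_kde(x, sample, h), grid))
print("modes, variable bandwidth:",
      count_modes(lambda x: variable_kde(x, sample, h, lam), grid))
```

Because the factors are scaled by the geometric mean of the pilot densities, their own geometric mean is 1, so `h` keeps its role as the overall smoothing level while the factors only redistribute smoothing between sparse and dense regions.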

Publication Type:
Journal Article
DOI and Other Identifiers:
st0036 (Other)
Published in:
Stata Journal, Volume 03, Number 2

 Record created 2017-04-01, last modified 2017-04-27
