# clustvarsel

An R package implementing
*Variable Selection for Gaussian Model-Based Clustering*.

The package performs variable selection for Gaussian model-based
clustering as implemented in the **mclust** package. The method searches
for the (locally) optimal subset of variables in a data set that carry
group/cluster information. Either a greedy or a headlong search can be
used, in a forward-backward or backward-forward direction, with or
without sub-sampling at the hierarchical clustering stage used to
initialise the mclust models. By default the search runs sequentially,
but parallelisation is also available.
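
As a rough illustration of these options, the sketch below runs the
default search and a headlong backward search on simulated data. The
data set and the object names (`X`, `out_fwd`, `out_bwd`) are made up
for this example; `G`, `search`, `direction`, `samp`, `sampsize` and
`verbose` are arguments of `clustvarsel()`.

```r
library(clustvarsel)

# Simulated data (an assumption for illustration only): X1 and X2
# separate two groups, X3 to X5 are pure noise.
set.seed(1)
n <- 200
grp <- rep(1:2, each = n / 2)
X <- data.frame(
  X1 = rnorm(n, mean = c(0, 4)[grp]),
  X2 = rnorm(n, mean = c(0, 4)[grp]),
  X3 = rnorm(n),
  X4 = rnorm(n),
  X5 = rnorm(n)
)

# Defaults: greedy search, forward direction, no sub-sampling
out_fwd <- clustvarsel(X, G = 1:3, verbose = FALSE)

# Headlong search in the backward direction, with sub-sampling at the
# hierarchical clustering stage that initialises the mclust models
out_bwd <- clustvarsel(X, G = 1:3,
                       search = "headlong", direction = "backward",
                       samp = TRUE, sampsize = 100,
                       verbose = FALSE)

# Variables retained by each search
out_fwd$subset
out_bwd$subset
```

Passing `parallel = TRUE` to either call should evaluate the candidate
variables in parallel rather than sequentially; see `?clustvarsel` for
the details of that argument.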

## Installation

You can install the released version of **clustvarsel**
from CRAN using:

`install.packages("clustvarsel")`

## Usage

The papers listed in the References section below describe the main
functions and include several examples.
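
A minimal workflow might look like the sketch below, which uses the
built-in `iris` measurements purely as a convenient example. The object
names (`X`, `out`, `sel`, `mod`) and the choice of `G = 1:5` are
assumptions made for illustration, not recommendations from the package
authors.

```r
library(mclust)
library(clustvarsel)

# Four numeric measurements from the built-in iris data
X <- iris[, 1:4]

# Run the variable selection with the default greedy forward search
out <- clustvarsel(X, G = 1:5, verbose = FALSE)
print(out)  # steps of the search and the selected subset

# Keep only the selected variables and fit a Gaussian mixture to them
sel <- X[, out$subset, drop = FALSE]
mod <- Mclust(sel, G = 1:5)
summary(mod)

# Cross-tabulate the estimated clusters against the known species
table(iris$Species, mod$classification)
```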

For an introduction, see the vignette **A quick tour of
clustvarsel**, which can be opened from R with:

`vignette("clustvarsel")`

The vignette is also available in the *Vignette* section of
the navigation bar at the top of the package’s web page.

## References

Raftery, A. E. and Dean, N. (2006) Variable Selection for Model-Based
Clustering. *Journal of the American Statistical Association*,
101(473), 168-178.

Maugis, C., Celeux, G., and Martin-Magniette, M. (2009) Variable Selection
for Clustering With Gaussian Mixture Models. *Biometrics*, 65(3),
701-709.

Scrucca, L. and Raftery, A. E. (2018) clustvarsel: A Package
Implementing Variable Selection for Gaussian Model-based Clustering in
R. *Journal of Statistical Software*, 84(1), 1-28.