mclust: Normal Mixture Modeling for Model-Based Clustering, Classification, and Density Estimation

An R package for normal mixture modeling fitted via the EM algorithm for model-based clustering, classification, and density estimation, including Bayesian regularization.

clustvarsel: Variable Selection for Model-Based Clustering

An R package implementing variable selection methodology for Gaussian model-based clustering, which finds the (locally) optimal subset of variables in a data set that carry group/cluster information. A greedy or headlong search can be used, in either a forward-backward or a backward-forward direction, with or without sub-sampling at the hierarchical clustering stage used to initialize the MCLUST models. By default the algorithm uses a sequential search, but parallelisation is also available.
  • 'clustvarsel' is available on CRAN.
  • Authors: Nema Dean, Adrian E. Raftery, and Luca Scrucca
  • Scrucca L. and Raftery A. E. (2015) clustvarsel: A Package Implementing Variable Selection for Model-based Clustering in R. Submitted to Journal of Statistical Software. Pre-print available on arXiv.

GA: Genetic Algorithms

This R package provides a flexible, general-purpose set of tools for optimization using genetic algorithms (GAs). GA search is available for both the continuous and the discrete case, whether constrained or not. Users can easily define their own objective function depending on the problem at hand. Several genetic operators are available and can be combined to explore the best settings for the current task. Furthermore, users can define new genetic operators and easily evaluate their performance. GAs can be run sequentially or in parallel.
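
To make the selection, crossover, and mutation operators mentioned above concrete, here is a minimal sketch of a real-valued genetic algorithm. It is written in plain Python for illustration only and is not the API of the 'GA' package; the function name `run_ga`, its parameters, and the particular operators (tournament selection, arithmetic crossover, Gaussian mutation) are assumptions chosen for brevity.

```python
import random


def run_ga(fitness, lower, upper, pop_size=40, generations=60,
           crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Maximize `fitness` over [lower, upper] with a simple real-valued GA.

    Hypothetical helper for illustration; not part of the 'GA' R package.
    """
    rng = random.Random(seed)
    population = [rng.uniform(lower, upper) for _ in range(pop_size)]
    best = max(population, key=fitness)

    def select():
        # Binary tournament selection: the fitter of two random individuals.
        a, b = rng.choice(population), rng.choice(population)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()
            if rng.random() < crossover_rate:
                # Arithmetic crossover: random convex combination of parents.
                w = rng.random()
                child = w * p1 + (1.0 - w) * p2
            else:
                child = p1
            if rng.random() < mutation_rate:
                # Gaussian mutation scaled to the width of the search interval.
                child += rng.gauss(0.0, 0.1 * (upper - lower))
            # Keep the child inside the box constraints.
            children.append(min(max(child, lower), upper))
        population = children
        generation_best = max(population, key=fitness)
        if fitness(generation_best) > fitness(best):
            best = generation_best
    return best


# User-defined objective: a concave function with its maximum at x = 2.
best_x = run_ga(lambda x: -(x - 2.0) ** 2, lower=-10.0, upper=10.0)
print(best_x)
```

Swapping in a different `select`, crossover, or mutation step is the kind of operator customization the package description refers to; the sketch keeps each operator in a few lines so that such changes stay local.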

qcc: Quality Control Charts

Shewhart quality control charts for continuous, attribute and count data. Cusum and EWMA charts. Operating characteristic curves. Process capability analysis. Pareto chart and cause-and-effect chart. The 'qcc' package is available on CRAN.
A short description of the package appeared in the newsletter R News, volume 4/1, June 2004, pp. 11-17.

The 'qcc' package is included in the blog post "10 R packages I wish I knew about earlier".
A nice overview is also given in "Statistical Quality Control in R".
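
As an illustration of what a Shewhart chart for continuous data computes, the sketch below derives the center line and control limits of an x-bar chart from equally sized subgroups. It is plain Python, not the 'qcc' API. The function names `xbar_limits` and `out_of_control` are hypothetical, and estimating the process standard deviation as the square root of the average within-subgroup variance is an assumption made for simplicity (other estimators are commonly used).

```python
from math import sqrt
from statistics import mean, variance


def xbar_limits(subgroups, nsigmas=3.0):
    """Return (LCL, center, UCL) for an x-bar chart.

    Assumes equally sized subgroups and a pooled within-subgroup
    estimate of the process standard deviation (illustrative choice).
    """
    n = len(subgroups[0])
    if any(len(g) != n for g in subgroups):
        raise ValueError("all subgroups must have the same size")
    center = mean([mean(g) for g in subgroups])
    sigma = sqrt(mean([variance(g) for g in subgroups]))
    half_width = nsigmas * sigma / sqrt(n)
    return center - half_width, center, center + half_width


def out_of_control(subgroups, lcl, ucl):
    """Indices of subgroups whose mean falls outside [lcl, ucl]."""
    return [i for i, g in enumerate(subgroups) if not lcl <= mean(g) <= ucl]


# Limits from in-control calibration data, then a check on new subgroups.
calibration = [[10, 12], [11, 13], [10, 12], [11, 13]]
lcl, center, ucl = xbar_limits(calibration)
print(lcl, center, ucl)
print(out_of_control([[10, 11], [20, 22]], lcl, ucl))
```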

msir: Model-based Sliced Inverse Regression

An R package for dimension reduction based on Gaussian finite mixture models, as an extension to sliced inverse regression (SIR). Available on CRAN.
  • Scrucca L. (2011) Model-based SIR for dimension reduction. Computational Statistics & Data Analysis, Vol. 55, Issue 11, pp. 3010-3026.
    Supplemental material.

GAabbreviate: Abbreviating Items Measures using Genetic Algorithms

An R package that uses Genetic Algorithms as an optimization tool for scale abbreviation or subset selection that maximally captures the variance in the original data.

Regularized Sliced Inverse Regression

The archive contains an R package implementing regularization and shrinkage for Sliced Inverse Regression (SIR) as described in
  • Scrucca L. (2006) Regularized sliced inverse regression with applications in classification. In "Data Analysis, Classification and the Forward Search", editors Zani S., Cerioli A., Riani M., Vichi M., Berlin, Springer-Verlag, pp. 59-66.
It also contains R functions to apply the method to DNA microarray data as described in
  • Scrucca L. (2007) Class prediction and gene selection for DNA microarrays using sliced inverse regression. Computational Statistics & Data Analysis, Vol. 52, pp. 438-451.
Note that the archive is password-protected; if you are interested, drop me an e-mail.

Competing Risks Analysis

Functions (based on the cmprsk R package) and datasets for computing the cumulative incidence function in the presence of competing risks, testing equality across groups (Gray's test), and pointwise confidence intervals for competing risks curves. Reference:
Scrucca L., Santucci A., Aversa F. (2007) Competing risks analysis using R: an easy guide for clinicians. Bone Marrow Transplantation, 40, 381-387.
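
For readers unfamiliar with the cumulative incidence function, the following sketch shows the standard nonparametric estimate: at each event time, the probability of still being event-free just before that time (overall Kaplan-Meier survival) is multiplied by the cause-specific hazard and accumulated. It is plain Python for illustration, not the interface of the functions above or of the cmprsk package. The name `cumulative_incidence` and the coding of censoring as 0 are assumptions, and Gray's test and confidence intervals are not covered.

```python
def cumulative_incidence(times, causes, cause):
    """Estimate the cumulative incidence of `cause` at each event time.

    times  : observed follow-up times
    causes : 0 for censored, a positive integer for the event type
    Returns a list of (time, cumulative incidence) pairs.
    Hypothetical helper for illustration only.
    """
    data = sorted(zip(times, causes))
    at_risk = len(data)
    survival = 1.0   # overall event-free survival just before the current time
    incidence = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [c for (u, c) in data if u == t]
        events_all = sum(1 for c in tied if c != 0)
        events_cause = sum(1 for c in tied if c == cause)
        # Increment: P(event-free before t) * cause-specific hazard at t.
        incidence += survival * events_cause / at_risk
        if events_all:
            curve.append((t, incidence))
        # Update overall survival using events of any type.
        survival *= 1.0 - events_all / at_risk
        at_risk -= len(tied)
        i += len(tied)
    return curve


# Four subjects: cause 1 at t=1, cause 2 at t=2, censored at t=3, cause 1 at t=4.
print(cumulative_incidence([1, 2, 3, 4], [1, 2, 0, 1], cause=1))
```

Unlike one minus a Kaplan-Meier curve that treats competing events as censored, this estimate keeps the cause-specific curves summing to one minus the overall event-free survival.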

Functions (based on the cmprsk R package) and dataset for regression modeling of competing risk. Reference:
Scrucca L., Santucci A., Aversa F. (2009) Regression modeling of competing risk using R: an in depth guide for clinicians. To appear in Bone Marrow Transplantation.

dispmod: Dispersion Models

Functions for modelling dispersion in generalized linear models (GLMs). Available on CRAN.

forward: Forward search

Forward search approach to robust analysis in linear and generalized linear regression models. Originally written for S-Plus by Kjell Konis and Marco Riani. Ported to R by Luca Scrucca. Available on CRAN.