Software

mclust: Normal Mixture Modeling for Model-Based Clustering, Classification, and Density Estimation

An R package for normal mixture modeling, fitted via the EM algorithm, for model-based clustering, classification, and density estimation, including Bayesian regularization.

  • mclust is available on CRAN
  • Authors: Chris Fraley, Adrian Raftery and Luca Scrucca
  • Package vignette
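
As a minimal sketch of typical usage (assuming the package is installed from CRAN, and using the built-in iris and faithful data sets purely as illustrations), a Gaussian mixture can be fitted and inspected roughly as follows:

```r
# Assumes install.packages("mclust") has been run beforehand
library(mclust)

# Illustrative data: the four numeric measurements of the iris data set
X <- iris[, 1:4]

# Fit Gaussian mixtures by EM; the number of components and the
# covariance structure are selected by BIC
fit <- Mclust(X)
summary(fit)

# Selected model, number of components, and hard cluster assignments
fit$modelName
fit$G
table(fit$classification, iris$Species)

# Univariate density estimation with the same mixture machinery
dens <- densityMclust(faithful$waiting)
summary(dens)
```

The cross-tabulation against the species labels is only a convenience for this example; the labels are not used in the fit.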



clustvarsel: Variable Selection for Model-Based Clustering

An R package implementing variable selection for Gaussian model-based clustering. It finds the (locally) optimal subset of variables in a data set that carry group/cluster information. A greedy or headlong search can be used, in either a forward-backward or backward-forward direction, with or without sub-sampling at the hierarchical clustering stage used to initialize the MCLUST models. By default the algorithm uses a sequential search, but parallelisation is also available.

  • clustvarsel is available on CRAN
  • Authors: Nema Dean, Adrian E. Raftery, and Luca Scrucca
  • Package vignette
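
The search options described above might be exercised as in the following sketch. The data set and the range of component numbers are illustrative assumptions, not recommendations:

```r
# Assumes the clustvarsel package (and its dependency mclust) is installed
library(clustvarsel)

# Illustrative data: the four numeric iris measurements
X <- iris[, 1:4]

# Greedy forward-backward search, run sequentially (parallel = FALSE)
sel <- clustvarsel(X, G = 1:5,
                   search = "greedy",
                   direction = "forward",
                   parallel = FALSE)

# Indices of the variables retained as carrying clustering information
sel$subset
```

Setting `search = "headlong"`, `direction = "backward"`, `samp = TRUE`, or `parallel = TRUE` switches to the other strategies mentioned in the description.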



GA: Genetic Algorithms

This R package provides a flexible general-purpose set of tools for optimization using genetic algorithms. GA searches are available for both the continuous and the discrete case, whether constrained or not. Users can easily define their own objective function depending on the problem at hand. Several genetic operators are available and can be combined to explore the best settings for the current task. Furthermore, users can define new genetic operators and easily evaluate their performance. GAs can be run sequentially or in parallel.

  • GA is available on CRAN
  • Author: Luca Scrucca
  • Package vignette
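
A minimal sketch of a continuous search with a user-defined objective is given below. The function `f`, the search interval, and the tuning values are hypothetical choices made only for illustration:

```r
# Assumes the GA package is installed from CRAN
library(GA)

# Hypothetical objective with several local maxima; ga() maximizes fitness
f <- function(x) (x^2 + x) * cos(x)

res <- ga(type = "real-valued", fitness = f,
          lower = -10, upper = 10,   # box constraints on x
          popSize = 50, maxiter = 100,
          seed = 123,                # for reproducibility
          monitor = FALSE)           # suppress per-iteration output

summary(res)
res@solution       # best x found
res@fitnessValue   # f evaluated at the best x
```

To minimize a function instead, pass its negative as `fitness`. Adding `parallel = TRUE` to the call requests parallel evaluation of the fitness function.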



qcc: Quality Control Charts

Shewhart quality control charts for continuous, attribute and count data. Cusum and EWMA charts. Operating characteristic curves. Process capability analysis. Pareto chart and cause-and-effect chart. Multivariate control charts.
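
As an illustrative sketch, assuming the CRAN release of qcc (2.x) and its bundled pistonrings data, a Shewhart x-bar chart and an EWMA chart could be set up as follows. The split into 25 calibration samples and 15 new samples is an assumption made for the example:

```r
# Assumes the qcc package (CRAN 2.x series) is installed
library(qcc)

# Example data shipped with the package: piston ring diameters in subgroups
data(pistonrings)
diameter <- qcc.groups(pistonrings$diameter, pistonrings$sample)

# x-bar chart calibrated on the first 25 samples, then applied to the rest
q <- qcc(diameter[1:25, ], type = "xbar",
         newdata = diameter[26:40, ], plot = FALSE)
summary(q)
q$center   # estimated process mean
q$limits   # lower and upper control limits

# EWMA chart on the same calibration data
e <- ewma(diameter[1:25, ], lambda = 0.2, nsigmas = 3, plot = FALSE)
summary(e)
```

Dropping `plot = FALSE` draws the charts on the active graphics device.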



msir: Model-based Sliced Inverse Regression

An R package for dimension reduction based on Gaussian finite mixture models as an extension to sliced inverse regression (SIR).

  • msir is available on CRAN
  • Author: Luca Scrucca
  • Scrucca L. (2011) Model-based SIR for dimension reduction. Computational Statistics & Data Analysis, 55:11, 3010-3026.
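
A small simulated example, sketched under the assumption that the response depends on the predictors only through one linear combination, could look like this. The simulation settings are illustrative and not taken from the paper:

```r
# Assumes the msir package is installed from CRAN
library(msir)

set.seed(1)
n <- 200
p <- 5

# True direction: only the first two predictors matter
b <- as.matrix(c(1, -1, rep(0, p - 2)))
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- exp(0.5 * x %*% b) + 0.1 * rnorm(n)

# Model-based SIR with default settings
fit <- msir(x, y)
summary(fit)
```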



GAabbreviate: Abbreviating Items Measures using Genetic Algorithms

An R package that uses Genetic Algorithms as an optimization tool for scale abbreviation or subset selection that maximally captures the variance in the original data.



Regularized Sliced Inverse Regression

The archive regsir.zip contains an R package implementing regularization and shrinkage for Sliced Inverse Regression (SIR) as described in

  • Scrucca L. (2006) Regularized sliced inverse regression with applications in classification. In Data Analysis, Classification and the Forward Search, editors Zani S., Cerioli A., Riani M., Vichi M., Berlin, Springer-Verlag, pp. 59-66.

It also contains R functions to apply the method to DNA microarray data as described in

  • Scrucca L. (2007) Class prediction and gene selection for DNA microarrays using sliced inverse regression. Computational Statistics & Data Analysis, Vol. 52, pp. 438-451.

Note that the archive is password-protected; if you are interested, drop me an e-mail.


Competing Risks Analysis

Functions based on the cmprsk R package for computing the cumulative incidence function in the presence of competing risks, testing equality across groups (Gray’s test), and pointwise confidence intervals for competing risks curves as described in

  • Scrucca L., Santucci A., Aversa F. (2007) Competing risks analysis using R: an easy guide for clinicians. Bone Marrow Transplantation, 40, 381-387.
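
For orientation, the underlying cmprsk calls can be sketched as below. This uses simulated data and the package's own `cuminc()` and `timepoints()` functions directly, not the wrapper functions from the article. The plain Wald-type intervals are an assumption for illustration and may differ from the construction used in the paper:

```r
# Assumes the cmprsk package is installed from CRAN
library(cmprsk)

# Simulated data (purely illustrative)
set.seed(2)
n <- 200
ftime   <- rexp(n)                           # follow-up times
fstatus <- sample(0:2, n, replace = TRUE)    # 0 = censored, 1 and 2 = competing events
group   <- factor(sample(c("A", "B"), n, replace = TRUE))

# Cumulative incidence by group and cause, with Gray's test across groups
ci <- cuminc(ftime, fstatus, group, cencode = 0)
ci$Tests

# Estimates and variances at selected times, then naive pointwise 95% intervals
tp    <- timepoints(ci, times = c(0.5, 1, 2))
lower <- tp$est - 1.96 * sqrt(tp$var)
upper <- tp$est + 1.96 * sqrt(tp$var)
tp$est
```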

Code and example dataset:

Functions based on the cmprsk R package for regression modeling of competing risk as described in

  • Scrucca L., Santucci A., Aversa F. (2010) Regression modeling of competing risk using R: an in depth guide for clinicians. Bone Marrow Transplantation, 45, 1388-1395.
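
A minimal sketch of the underlying regression call in cmprsk is shown below. It uses simulated data and hypothetical covariates named `age` and `treat`, and calls `crr()` directly instead of the wrapper functions from the article:

```r
# Assumes the cmprsk package is installed from CRAN
library(cmprsk)

# Simulated data (purely illustrative)
set.seed(3)
n <- 200
ftime   <- rexp(n)                           # follow-up times
fstatus <- sample(0:2, n, replace = TRUE)    # 0 = censored, 1 = event of interest, 2 = competing event

# Hypothetical covariates, supplied as a numeric matrix
cov1 <- cbind(age = rnorm(n), treat = rbinom(n, 1, 0.5))

# Proportional subdistribution hazards regression for event type 1
fit <- crr(ftime, fstatus, cov1, failcode = 1, cencode = 0)
summary(fit)
```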

Code and example dataset:


dispmod: Dispersion Models

Functions for modelling dispersion in Generalized Linear Models.

  • dispmod is available on CRAN