[Forum SIS] Christian Hennig seminar (1 December, 15:00)

Roberto Rocci roberto.rocci at uniroma2.it
Sun 27 Nov 2011 03:06:30 CET


Dear members,
the Department of Statistical Sciences of Sapienza University of Rome
and the SEFEMEQ Department of the University of Rome Tor Vergata
invite you to the seminar detailed below, which will be held on
Thursday 1 December at 15:00 in Room 34
of the Department of Statistical Sciences,
Sapienza University of Rome.

Regards,
Roberto Rocci

Title
How to find the best cluster analysis method
(for social stratification based on mixed type data)?

Author
Christian Hennig

Affiliation
Department of Statistical Science,
University College London

Abstract
This presentation will treat two issues.
1. A general approach for the decision about the "best" cluster
analysis method will be sketched. This is based on the idea that a
subjective decision is needed about the cluster concept required in a
particular application, and that there is no such thing as an
objectively true clustering determined by the data alone. For example,
even if data can be perfectly fitted by a Gaussian mixture, this does
not necessarily mean that the mixture components correspond to the
"true" clusters, because in a given application unimodal mixtures of
more than one Gaussian may be considered as a single cluster (see
Hennig 2010). On the other hand, large within-cluster dissimilarities
may not be admissible, in which case certain Gaussian components need
to be split up.
2. The general philosophy is applied to the problem of defining social
strata from data with mixed continuous, ordinal and categorical
variables. Clusterings as obtained from a mixture/latent class
approach (as for example fitted by the LatentGOLD software, Vermunt
and Magidson, 2005) are compared with clusterings obtained from
applying k-medoids (Kaufman and Rousseeuw, 1990) to dissimilarities
balancing the different types of variables in a sensible
user-specified way.
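
As a rough illustration of the second point, the sketch below runs
k-medoids on a weighted dissimilarity over mixed-type variables. It is
not the speaker's method: the Gower-style dissimilarity, the equal
weights, the variable names and the six toy records are all assumptions
made up for this example, and the medoids are found by exhaustive
search (feasible only for tiny data) rather than by the PAM build/swap
procedure of Kaufman and Rousseeuw.

```python
from itertools import combinations

# Toy records, invented for illustration only:
# (income, education rank 0-3, occupation category)
RECORDS = [
    (18.0, 0, "manual"),
    (21.0, 1, "manual"),
    (24.0, 1, "service"),
    (55.0, 3, "professional"),
    (60.0, 2, "professional"),
    (52.0, 3, "service"),
]

# User-specified weights per variable type (an assumption; changing them
# changes what "similar" means and therefore the clustering).
WEIGHTS = {"continuous": 1.0, "ordinal": 1.0, "categorical": 1.0}

INCOME_RANGE = max(r[0] for r in RECORDS) - min(r[0] for r in RECORDS)
MAX_RANK = 3  # highest education rank in this toy coding


def dissimilarity(a, b, weights=WEIGHTS):
    """Weighted mean of per-variable dissimilarities, each scaled to [0, 1]."""
    parts = {
        "continuous": abs(a[0] - b[0]) / INCOME_RANGE,  # range-scaled difference
        "ordinal": abs(a[1] - b[1]) / MAX_RANK,         # rank difference
        "categorical": 0.0 if a[2] == b[2] else 1.0,    # simple mismatch
    }
    total = sum(weights.values())
    return sum(weights[name] * parts[name] for name in parts) / total


def k_medoids(dist, k):
    """Try every set of k medoids and keep the one with the lowest total
    dissimilarity of each point to its nearest medoid."""
    n = len(dist)
    best_cost, best_medoids = None, None
    for medoids in combinations(range(n), k):
        cost = sum(min(dist[i][m] for m in medoids) for i in range(n))
        if best_cost is None or cost < best_cost:
            best_cost, best_medoids = cost, medoids
    # Label each point by the index of its nearest medoid.
    labels = [min(best_medoids, key=lambda m: dist[i][m]) for i in range(n)]
    return best_medoids, labels, best_cost


DIST = [[dissimilarity(a, b) for b in RECORDS] for a in RECORDS]
MEDOIDS, LABELS, COST = k_medoids(DIST, 2)
print("medoids:", MEDOIDS)
print("labels:", LABELS)
```

With equal weights, these toy records split into the first three and
the last three. Raising the categorical weight relative to the others
is one way to see how the user-specified balance can alter the
resulting strata.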

References
HENNIG, C. (2010): Methods for merging Gaussian mixture components.
Advances in Data Analysis and Classification, 4, 3-34.

KAUFMAN, L. and ROUSSEEUW, P. J. (1990): Finding Groups in Data. Wiley,
New York.

VERMUNT, J. K. and MAGIDSON, J. (2005):
Technical Guide for Latent GOLD 4.0: Basic and Advanced. Statistical
Innovations Inc., Belmont, Massachusetts.


