[Forum SIS] Seminar by Prof. Rosanna Verde
Grazia Vicario
grazia.vicario a polito.it
Wed 14 Mar 2018 16:10:53 CET
Seminar announcement:
Friday 6 April 2018, 15:00
Aula Buzano
Department of Mathematical Sciences
Corso Duca degli Abruzzi, 24 – 10129 Torino
Politecnico di Torino
Distributional Data methods for dimensional reduction in Big Data Analysis
Rosanna Verde
Department of Mathematics and Physics
Università della Campania “Luigi Vanvitelli”, Caserta, Italy
Abstract
Distributional Data Analysis (DDA) is a new field of research related to
Symbolic Data Analysis (Bock, Diday, 2000). The statistical units, or
objects, are described by empirical distributions of the values observed for
numerical variables. In some cases, the distributions are summaries or
aggregations of the underlying data, used in order to preserve the
confidentiality of the information. Over the last ten years, several methods
have been proposed to study this kind of data, as extensions of classical
data analysis techniques.
Most of these techniques are based on a suitable distance measure for
comparing distributions: the L2 Wasserstein distance (Rüschendorf, 2001). This
metric is particularly appropriate in DDA, and it satisfies some useful
properties (see Irpino, Verde, 2015), such as the decomposition of the
distance into three components related to the characteristics (position, size
and shape) of the compared distributions.
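As a rough illustration of the decomposition mentioned above, the sketch below computes the squared L2 Wasserstein distance between two empirical distributions and splits it into position, size and shape terms. It is only a minimal sketch under stated assumptions: both samples have the same number of observations (so the empirical quantile functions can be compared value by value after sorting), both have non-zero spread, and the function name `w2_squared_decomposition` is hypothetical, not taken from the cited papers.

```python
import numpy as np

def w2_squared_decomposition(x, y):
    """Squared L2 Wasserstein distance between two equal-size samples,
    with its split into position, size and shape components.
    Assumes len(x) == len(y) and non-constant samples (illustrative only)."""
    a = np.sort(np.asarray(x, dtype=float))  # empirical quantiles of x
    b = np.sort(np.asarray(y, dtype=float))  # empirical quantiles of y
    if a.shape != b.shape:
        raise ValueError("this sketch requires samples of equal size")

    # Squared distance: mean squared difference of the quantile functions.
    total = np.mean((a - b) ** 2)

    mu_a, mu_b = a.mean(), b.mean()
    sd_a, sd_b = a.std(), b.std()  # population standard deviations
    # Correlation between the two quantile functions.
    rho = np.mean((a - mu_a) * (b - mu_b)) / (sd_a * sd_b)

    position = (mu_a - mu_b) ** 2            # difference in location
    size = (sd_a - sd_b) ** 2                # difference in spread
    shape = 2.0 * sd_a * sd_b * (1.0 - rho)  # difference in shape
    return total, position, size, shape

# Usage: a symmetric sample against a skewed one.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 500)
y = rng.gamma(2.0, 1.5, 500)
total, position, size, shape = w2_squared_decomposition(x, y)
print(total, position + size + shape)  # the two values agree up to rounding
```

With equal-size samples the three terms add up to the total exactly (up to floating-point rounding), which makes it easy to see how much of the distance is due to location, spread or shape.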
Starting from the definition of basic statistics for distributional data (DD)
based on the L2 Wasserstein distance, the most relevant methods developed for
DD are:
- Clustering methods, such as k-means and hierarchical methods, to
discover classes of DD according to the characteristics of the distributions
of the values, also taking into account the variability and the shape of the
set of distributions;
- Principal Component Analysis (PCA) for DD (Verde, Irpino, 2016), an
extension of PCA to this kind of data that allows a visualization of the
variability of the data related to the several components of DDA. Moreover,
it provides a new visualization tool that highlights the characteristics of
the distributions in a reduced subspace.
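To make the clustering idea more concrete, here is a hedged sketch of a k-means-style procedure on distributional data. It relies on a known property of one-dimensional distributions: the squared L2 Wasserstein distance is the integrated squared difference of the quantile functions, and the mean (barycenter) of a group is obtained by averaging quantile functions. Each unit is therefore approximated by a vector of quantiles on a common grid. The function names, the grid size `n_levels` and the simulated data are all assumptions made for illustration; this is not the algorithm of the cited authors.

```python
import numpy as np

def quantile_matrix(samples, n_levels=50):
    """Describe each unit by its empirical quantiles on a common grid."""
    levels = (np.arange(n_levels) + 0.5) / n_levels
    return np.vstack([np.quantile(np.asarray(s, dtype=float), levels)
                      for s in samples])

def wasserstein_kmeans(Q, k, n_iter=100, seed=0):
    """Lloyd-style k-means on quantile vectors (rows of Q).
    The mean squared difference between rows approximates the squared
    L2 Wasserstein distance; centers are averaged quantile functions."""
    rng = np.random.default_rng(seed)
    centers = Q[rng.choice(len(Q), size=k, replace=False)].copy()
    labels = np.full(len(Q), -1)
    for _ in range(n_iter):
        # Approximate squared distance of every unit to every center.
        d2 = ((Q[:, None, :] - centers[None, :, :]) ** 2).mean(axis=2)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
        for j in range(k):
            members = Q[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels, centers

# Usage on simulated units: each unit is a sample of observed values.
rng = np.random.default_rng(1)
samples = [rng.normal(0.0, 1.0, 200) for _ in range(10)]
samples += [rng.normal(10.0, 3.0, 300) for _ in range(10)]
Q = quantile_matrix(samples)
labels, centers = wasserstein_kmeans(Q, k=2)
print(labels)
```

Because the centers are themselves quantile functions, each cluster prototype can be read as a distribution, so clusters can differ in position, variability or shape.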
Finally, DDA has strong potential in the analysis of Big Data, in particular
with regard to the Volume of values and their Variability, as a way of
performing a dimensional reduction of the data space.
One proposal is to use DD as suitable summaries in data stream analysis
(Balzanella A., Verde R., 2016).
All interested parties are invited to attend.
For information: grazia.vicario a polito.it
_____________________________
Prof. Grazia Vicario
Department of Mathematical Sciences
Politecnico di Torino
Corso Duca degli Abruzzi 24, 10129 Torino (Italy)
Voice: +39 011 090 7504; Fax: +39 011 090 7599
Email: grazia.vicario a polito.it