
SEMINAR Prof. Peña




==============================================================
			SEMINAR ANNOUNCEMENT
==============================================================

On 8 October next, at 11:00,

in room 34 (4th floor)
of the DIPARTIMENTO DI STATISTICA, PROBABILITA' E STATISTICHE APPLICATE
of the Università "La Sapienza" (P.le A. Moro 5, 00185 - ROMA),

		Prof. Daniel Peña

of the Universidad Carlos III de Madrid

will give a seminar entitled

"The SAR Procedure: A Diagnostic Analysis of Heterogeneous Data".

All those interested are invited to attend.

======================================================================

Abstract

This paper presents a procedure for detecting heterogeneity in a sample
with respect to a given class of models. It can be applied to determine if
a sample of univariate or multivariate data has been generated by different
distributions, or if a regression equation is really a mixture of different
regression lines. The basic idea of the procedure is first to split the
sample into small homogeneous subgroups and then recombine the observations
in the subgroups to form homogeneous clusters. The splitting and
recombining scheme forms the core of the SAR procedure that is implemented
iteratively in three steps. First, in outlier cleaning, an iterative process
of model fitting and outlier detection is applied to the sample to
eliminate isolated outliers. Second, in splitting, the cleaned sample is
split into more homogeneous subgroups by using a measure of association
among the points in the sample given the model. The splitting continues
until groups of minimal size that cannot be split further are found. Third,
in recombining, a model is fitted to each homogeneous minimal group, and the
rest of the observations in the sample are checked iteratively, one by one,
for homogeneity and incorporation into the group. The final result of this
entire iterative process is a set of possible data configurations (PDC) for
the sample. The proposed procedure is exploratory and can be applied to
detect heterogeneity in any statistical modeling problem. The performance of the
procedure is illustrated in univariate, multivariate and linear regression
problems.
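
The three steps described in the abstract can be illustrated with a
minimal univariate sketch. It is not the procedure from the paper: the
abstract does not specify the outlier test, the measure of association
or the homogeneity check, so simple stand-ins are assumed here (a robust
z-score based on the median absolute deviation for cleaning, a split at
the widest gap between sorted values, and a 3-standard-deviation rule
for recombining). All function names, cutoffs and the toy data are
hypothetical.

```python
from statistics import mean, median, stdev


def clean_outliers(xs, cutoff=3.5):
    """Step 1 (stand-in): drop the most extreme point while its robust
    z-score (based on the median absolute deviation) exceeds the cutoff."""
    kept = sorted(xs)
    while len(kept) > 3:
        med = median(kept)
        mad = median([abs(x - med) for x in kept])
        if mad == 0:
            break
        worst = max(kept, key=lambda x: abs(x - med))
        if 0.6745 * abs(worst - med) / mad <= cutoff:
            break
        kept.remove(worst)
    return kept


def split(group, min_size=3):
    """Step 2 (stand-in): split a sorted group at its widest gap, and keep
    splitting until no piece can be divided into two parts of min_size."""
    if len(group) < 2 * min_size:
        return [group]
    cut = max(range(min_size, len(group) - min_size + 1),
              key=lambda i: group[i] - group[i - 1])
    return split(group[:cut], min_size) + split(group[cut:], min_size)


def recombine(seed, sample, cutoff=3.0):
    """Step 3 (stand-in): fit mean and standard deviation to a minimal
    group, then add the nearest remaining observation one at a time,
    refitting after each addition, while it stays within the cutoff."""
    group = list(seed)
    rest = list(sample)
    for x in seed:
        rest.remove(x)
    while rest:
        mu = mean(group)
        sd = max(stdev(group), 1e-12)  # guard against a zero spread
        nearest = min(rest, key=lambda x: abs(x - mu))
        if abs(nearest - mu) / sd > cutoff:
            break
        group.append(nearest)
        rest.remove(nearest)
    return tuple(sorted(group))


def sar_sketch(xs, min_size=3):
    """Run the three stand-in steps and return the distinct groups found."""
    cleaned = clean_outliers(xs)
    seeds = split(cleaned, min_size)
    return sorted({recombine(seed, cleaned) for seed in seeds})


# Toy sample (made up): two clusters plus one isolated outlier.
data = [0.0, 0.2, -0.1, 0.1, 0.3, -0.2,
        10.0, 10.2, 9.9, 10.1, 9.8, 10.3, 100.0]
configs = sar_sketch(data)
```

On this toy sample the sketch is expected to return two groups, one
around 0 and one around 10, with the value 100.0 removed in the cleaning
step. Several minimal groups can grow into the same final group, which
is why the results are deduplicated before being returned.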