[Forum SIS] PhD thesis presentation

Wed 14 Apr 2010 13:21:06 CEST

SAPIENZA UNIVERSITA' DI ROMA
DIPARTIMENTO DI STATISTICA PROBABILITA'
E STATISTICHE APPLICATE
P.LE A. MORO 5, ROMA

PHD PROGRAMME IN METHODOLOGICAL STATISTICS

Thesis Presentation

21 April 2010
9:15, Room 34

9:15
LEARNING EVALUATION IN PATTERN ANALYSIS
ALESSIA MAMMONE, XXI cycle

10:00
STRUCTURAL BREAKS IN DYNAMIC FACTOR MODELS
ANTONIETTA DI SALVATORE, XXII cycle

ALL THOSE INTERESTED ARE INVITED TO ATTEND

Kind regards
fulvio de santis

---------------------------------------

ABSTRACTS

LEARNING EVALUATION IN PATTERN ANALYSIS
ALESSIA MAMMONE
Recently, our capabilities for both generating and collecting data have
increased rapidly. This explosive growth in data has generated an
urgent need for new techniques that can intelligently and automatically
transform the processed data into useful information and knowledge. An
important class of data mining algorithms designed for this purpose
consists of algorithms for Frequent Pattern Mining (FPM): the search for
frequently recurring data elements in data. We have now reached the
point where the real bottleneck of FPM is no longer the efficiency of
the algorithm used to find frequent patterns, but the pruning down of
this massive set of frequent and redundant patterns to a compact but
informative set of patterns, intelligibly presented to a human end-user.
The main challenge here is the formalization of the concept of
"informativeness": it is a highly subjective notion, and a satisfactory
objective formalization is yet to be found. To this end, we will
introduce a general framework to mine informative patterns. This
involves the definition of a pattern in broad generality, our definition
of informativeness, and our approach to deal with redundancy. Then we
will apply our general framework to the particular case of itemset
patterns in transactional databases: we will regard an itemset as
informative if it is non-trivial under some reasonable probabilistic
assumptions about the data. We will then propose a measure of
informativeness for itemsets in transactional databases, together with
a novel, efficient algorithm that filters a possibly large set of
frequent itemsets down to a small list of informative and non-redundant
itemsets, sorted in order of decreasing added informativeness. We
believe that our approach mirrors an intuitively satisfying notion of
informativeness. To highlight the fact that our algorithm finds
informative itemsets, we take textual data as a running example: text
is usually quite intelligible, so the itemsets found are likely to make
sense. Of course, the applicability of our method is not limited to
textual data.
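
As a rough illustration of the idea of scoring frequent itemsets against
a probabilistic baseline, the sketch below ranks word pairs and triples
by how far their observed frequency departs from what item independence
would predict. It is not the measure or the algorithm of the thesis: the
independence assumption, the log-ratio score, the brute-force
enumeration, the toy documents and all function names are assumptions
made only for this example, and no redundancy filtering is attempted.

```python
from itertools import combinations
from math import log, prod


def frequent_itemsets(transactions, min_support, max_size=3):
    """Count every itemset of up to max_size items (brute force) and
    keep those that occur in at least min_support transactions."""
    counts = {}
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s: c for s, c in counts.items() if c >= min_support}


def independence_score(itemset, support, item_counts, n):
    """Log-ratio of the observed relative frequency to the frequency
    expected if the items occurred independently (assumed baseline)."""
    expected = prod(item_counts[item] / n for item in itemset)
    return log((support / n) / expected)


def rank_itemsets(transactions, min_support=2, max_size=3):
    """Return (score, itemset) pairs, highest score first."""
    n = len(transactions)
    frequent = frequent_itemsets(transactions, min_support, max_size)
    # Supports of single items; every item of a frequent itemset is frequent.
    item_counts = {next(iter(s)): c for s, c in frequent.items() if len(s) == 1}
    scored = [
        (independence_score(s, c, item_counts, n), s)
        for s, c in frequent.items()
        if len(s) > 1
    ]
    # Ties are broken alphabetically so the ordering is deterministic.
    scored.sort(key=lambda pair: (-pair[0], sorted(pair[1])))
    return scored


# Toy "documents", each reduced to a set of words (hypothetical data).
docs = [
    {"dynamic", "factor", "model"},
    {"dynamic", "factor", "break"},
    {"frequent", "pattern", "mining"},
    {"frequent", "pattern", "data"},
    {"data", "model"},
    {"data", "break"},
]

ranked = rank_itemsets(docs)
for score, itemset in ranked:
    print(sorted(itemset), round(score, 3))
```

A score near zero means the itemset occurs about as often as the
independence baseline predicts, so it would be "trivial" in this
simplified sense; a large positive score flags a co-occurrence that the
baseline does not explain.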

STRUCTURAL BREAKS IN DYNAMIC FACTOR MODELS
ANTONIETTA DI SALVATORE
Analyzing data sets with a large number of variables and time periods
involves a severe risk that some of the model parameters are subject to
structural breaks. Dynamic factor models may be more affected by this
issue than other models for time series, since factor models rely on
large datasets. Ignoring these breaks leads to inconsistent estimates of
the loadings, a larger dimension of the factor space, and worse
forecasts, particularly in small samples.
In this thesis we develop a methodology to identify a single change
point occurring at some unknown date in dynamic factor models. The
breaks considered are: a break in level, a break in the factor loadings,
and a break in the factor moments. It is known that the number of factors is equal
to the rank of the covariance matrices of the observable variables.
Furthermore, the dynamic factor model can be decomposed by frequency
using spectral techniques and, under some restrictions, much of the
variation of the observable variables at low frequencies can be
attributed to the common factors. We focus on the positive eigenvalues
of the spectral density matrices evaluated at low frequencies
and show that a break may modify their values, increase their
number, or both. The resulting bias in the rank and in the positive
eigenvalues of the spectral density matrix therefore needs to be taken
into account when testing for a change point and for the number of
common factors.
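
To see why a break can increase the number of positive eigenvalues,
consider a simplified static version of the problem. The notation below
(loadings \Lambda_1 and \Lambda_2, break date \tau, factor covariance
\Sigma_f, break fraction \pi) is assumed for this illustration and is
not taken from the thesis, which works with spectral density matrices
rather than the plain covariance used here.

```latex
% r common factors f_t with zero mean and covariance \Sigma_f;
% the loadings change from \Lambda_1 to \Lambda_2 after date \tau.
\[
x_t =
\begin{cases}
\Lambda_1 f_t + e_t, & t \le \tau, \\
\Lambda_2 f_t + e_t, & t > \tau .
\end{cases}
\]
% Stacking the factors of the two regimes gives a model with
% constant loadings but up to 2r factors:
\[
x_t = \begin{bmatrix} \Lambda_1 & \Lambda_2 \end{bmatrix} g_t + e_t ,
\qquad
g_t = \begin{bmatrix} f_t \, \mathbf{1}\{t \le \tau\} \\ f_t \, \mathbf{1}\{t > \tau\} \end{bmatrix} .
\]
% With \pi = \tau / T, the full-sample covariance of the common component is
\[
\Gamma_\chi = \pi \, \Lambda_1 \Sigma_f \Lambda_1'
            + (1 - \pi) \, \Lambda_2 \Sigma_f \Lambda_2' ,
\qquad
r \le \operatorname{rank}(\Gamma_\chi) \le 2r .
\]
```

When the two loading matrices span different column spaces, the rank
exceeds r, so a procedure that ignores the break would count more
factors than are actually present.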