
Study day on: Advances on Statistical Disclosure Limitation Methodologies for Microdata




Wednesday, 5 February 2003
Sala dell’Orologio, Istat, Via C. Balbo 16, Rome
at 11.00

Programme:

11.00 - 11.45  Mario Trottini (Universidad de Valencia, Spain)
Comparing alternative forms of data release: a multiple objective
decision problem
11.45 - 12.00  Discussion

12.00 - 12.45  Silvia Polettini (Istat, Servizio MPS)
An Experience on Simulation for Data Protection
12.45 - 13.00  Discussion

14.30 - 15.15  Jim Burridge (University of Plymouth, U.K.)
Statistical Disclosure Limitation Using “Information Preserving
Statistical Obfuscation” (IPSO): Applications to Discrete Data and
Contingency Tables
15.15 - 15.30  Discussion



ABSTRACTS:
Comparing alternative forms of data release: a multiple objective
decision problem
Mario Trottini (Departamento de Estadistica e Investigacion Operativa,
Universidad de Valencia, Spain)

Data Disclosure Limitation denotes a set of techniques aimed at
protecting confidentiality in the release of statistical data. The goal
is to find forms of data release that are both useful to users
(researchers, policy makers, the public) and safe, in the sense that
they do not violate the privacy and confidentiality of the respondents
represented in the data. This is usually achieved by applying a
transformation to the original data, often referred to as a data mask,
and then releasing the resulting data set. Many different data masks are
currently available, and there is an increasing need to define suitable
criteria that allow for their comparison. Existing solutions are
primarily heuristic and can lead to undesirable answers. In this talk a
general criterion based on multiple objective decision theory is
presented. Properties of the new criterion and comparisons with existing
criteria are illustrated with several examples.
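
As a rough illustration of why comparing data masks is a multiple
objective problem, the sketch below keeps only the masks that are not
dominated on two objectives at once. It is not the criterion presented
in the talk, which the abstract does not specify; the mask names, the
two scores (disclosure risk and information loss, both "lower is
better") and their values are hypothetical.

```python
from typing import List, NamedTuple


class MaskScore(NamedTuple):
    name: str
    risk: float  # disclosure risk, lower is better (hypothetical scale)
    loss: float  # information loss, lower is better (hypothetical scale)


def dominates(a: MaskScore, b: MaskScore) -> bool:
    """True if `a` is at least as good as `b` on both objectives
    and strictly better on at least one."""
    no_worse = a.risk <= b.risk and a.loss <= b.loss
    strictly_better = a.risk < b.risk or a.loss < b.loss
    return no_worse and strictly_better


def pareto_front(scores: List[MaskScore]) -> List[MaskScore]:
    """Return the masks that no other mask dominates."""
    return [s for s in scores if not any(dominates(o, s) for o in scores)]


# Hypothetical scores, for illustration only.
candidates = [
    MaskScore("noise_addition", 0.30, 0.10),
    MaskScore("microaggregation", 0.20, 0.25),
    MaskScore("rank_swapping", 0.35, 0.30),
    MaskScore("synthetic_data", 0.05, 0.40),
]
front = pareto_front(candidates)
print([s.name for s in front])
```

A dominance filter of this kind only discards masks that are worse on
every count; choosing among the remaining ones still requires a
statement of how risk and usefulness are traded off.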


An Experience on Simulation for Data Protection
Silvia Polettini (Istat, Servizio della Metodologia di base per la
produzione statistica, Roma)

A disclosure limitation procedure is proposed that protects microdata by
simulating artificial units drawn from a probability model that is
estimated from the observed data. As the pledge of confidentiality
involves only the actual respondents, the use of simulation, i.e. the
release of synthetic units, should rule out confidentiality concerns. An
intruder wishing to re-identify units via exact matching will be
discouraged from attempting an attack. As to information loss, the
model used for simulation is such that the released data reproduce in
expectation the sample values of a prescribed set of characteristics. In
other words, under such a model these characteristics maintain the
levels that were observed in the real data. An application is provided
to the Italian sample from the Community Innovation Survey (CIS). A
simulation model is defined for these data, although the approach can be
applied more generally. The maximum entropy principle is exploited to
achieve the goal of matching data traits: with this approach we define a
semi-parametric model for the most relevant quantitative variables.
Entropy measures the uncertainty of a distribution; its maximisation
leads to a distribution which is consistent with the given information
but is maximally noncommittal with regard to missing information.
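
The maximum entropy idea can be illustrated with a toy case that is much
simpler than the semi-parametric model of the talk: a distribution on a
small finite support whose only prescribed characteristic is its mean.
The maximum-entropy solution then has the form p(x) proportional to
exp(lambda * x), and lambda can be found by bisection. The function
names, the support and the target means below are assumptions made for
the example.

```python
import math
from typing import List, Sequence


def _tilted_pmf(support: Sequence[float], lam: float) -> List[float]:
    """pmf proportional to exp(lam * x) on the given support."""
    shift = max(lam * x for x in support)  # avoids overflow in exp
    weights = [math.exp(lam * x - shift) for x in support]
    total = sum(weights)
    return [w / total for w in weights]


def maxent_with_mean(support: Sequence[float], target_mean: float) -> List[float]:
    """Maximum-entropy pmf on `support` whose mean equals `target_mean`."""
    if not min(support) < target_mean < max(support):
        raise ValueError("target_mean must lie strictly inside the support range")
    lo, hi = -50.0, 50.0  # search bracket for lambda (assumed wide enough)
    for _ in range(200):  # the mean is increasing in lambda, so bisect
        mid = (lo + hi) / 2.0
        pmf = _tilted_pmf(support, mid)
        mean = sum(x * p for x, p in zip(support, pmf))
        if mean < target_mean:
            lo = mid
        else:
            hi = mid
    return _tilted_pmf(support, (lo + hi) / 2.0)


def entropy(pmf: Sequence[float]) -> float:
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in pmf if p > 0.0)


support = [0, 1, 2, 3, 4]
centred = maxent_with_mean(support, 2.0)  # constraint is uninformative here
skewed = maxent_with_mean(support, 1.0)   # constraint pulls mass towards 0
print(centred, entropy(centred))
print(skewed, entropy(skewed))
```

When the prescribed mean coincides with the midpoint of the support, the
constraint adds no information and the result is the uniform
distribution; any other mean yields a less uniform distribution with
lower entropy.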


Statistical Disclosure Limitation Using “Information Preserving
Statistical Obfuscation” (IPSO): Applications To Discrete Data and
Contingency Tables
Jim Burridge (University of Plymouth)

The seminar will describe a method of perturbing data which explicitly
preserves certain information. The data are assumed to consist of two
sets of information on each respondent: public data and specific survey
data. It is assumed that both sets of data are liable to be released for
a subset of respondents. However, the public data will be altered in
some way to preserve confidentiality, whereas the specific survey data
are to be disclosed without alteration. The method is a model-based
approach to this problem and utilises the information contained in the
sufficient statistics. This seminar will discuss ways of applying the
method to
discrete data with particular emphasis on log-linear models for
contingency tables. The talk will emphasise links to recent work on
“exact” methods of inference for contingency tables and will present a
new method of producing sample contingency tables with given marginal
totals. The method can be applied to multi-way contingency tables when a
decomposable graphical model is assumed for the cell counts. It is
believed that the method is simpler than some other methods described in
the literature.
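
To make "sample contingency tables with given marginal totals" concrete,
here is a minimal sketch for the simplest case only: a two-way table
under the independence model. It uses the standard permutation argument
(pair fixed row labels with randomly shuffled column labels), not the
new method announced in the talk, which the abstract does not describe.
The function name and the example margins are assumptions.

```python
import random
from typing import List, Optional, Sequence


def sample_two_way_table(
    row_totals: Sequence[int],
    col_totals: Sequence[int],
    rng: Optional[random.Random] = None,
) -> List[List[int]]:
    """Draw a two-way table with the given margins.

    Under independence, conditioning on both margins is equivalent to
    pairing the row labels with a uniformly random permutation of the
    column labels and counting the resulting (row, column) pairs.
    """
    if sum(row_totals) != sum(col_totals):
        raise ValueError("row and column totals must have the same sum")
    rng = rng or random.Random()
    # One label per unit: row i appears row_totals[i] times, and so on.
    rows = [i for i, r in enumerate(row_totals) for _ in range(r)]
    cols = [j for j, c in enumerate(col_totals) for _ in range(c)]
    rng.shuffle(cols)
    table = [[0] * len(col_totals) for _ in row_totals]
    for i, j in zip(rows, cols):
        table[i][j] += 1
    return table


row_totals = [3, 2, 5]
col_totals = [4, 6]
table = sample_two_way_table(row_totals, col_totals, random.Random(0))
for row in table:
    print(row)
```

Every table drawn this way reproduces the margins exactly, which are the
sufficient statistics of the independence model; only the interior cell
counts vary from draw to draw.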


Kind regards,

Luisa Franconi

__________________________________________________________________________

Luisa Franconi
Istat
Servizio della Metodologia di base per la produzione statistica
U.O. metodologie e tecniche per la tutela della riservatezza
dell'informazione statistica
Via C. Balbo 16
00184 Roma
Italy
tel +39 06 4673 2306
fax +39 06 4667 8004
e-mail franconi@istat.it