[Forum SIS] Graph sampling - seminar series - Li-Chun Zhang @unipg

Maria Giovanna Ranalli giovanna at stat.unipg.it
Wed 14 Oct 2020 09:43:20 CEST


We are pleased to announce that Li-Chun Zhang (University of Southampton, University of Oslo, Statistics Norway) will give, as Visiting Professor, a series of seminars on Graph Sampling as part of the activities of the PhD programme in Economics of the Università degli Studi di Perugia.

Li-Chun Zhang is Professor of Statistics at the University of Southampton and the University of Oslo, and Senior Researcher at Statistics Norway. His research interests include inference for finite populations, statistical sampling, non-response and measurement errors, small area estimation, imputation and editing, statistical matching, and record linkage. He has recently introduced new methodologies for sampling on graphs (graph sampling), with great potential for application in all contexts where the populations of interest are characterised by links and are defined on networks. He has taken part in numerous European research projects and is an associate editor of several journals, including the Journal of the Royal Statistical Society-Series A and the Harvard Data Science Review.

https://www.southampton.ac.uk/demography/about/staff/lz1n11.page

The series of three seminars will be held in hybrid mode (in person for unipg students and staff, and on the Teams platform for everyone else interested) according to the following schedule:

1. Graph sampling methods for epidemiological studies 
Monday 26 October, 10:00, room 301

2. Versatility and intricacy of graph sampling
Tuesday 27 October, 10:00, room 301

3. Walk sampling from large dynamic graphs
Tuesday 3 November, 10:00, room 301

The presentation of the seminar series and the abstract of each seminar can be found at the end of this email. To attend one or more seminars, please fill in the form at this link
https://forms.gle/uRztbnUpSE17msd3A by Wednesday 21 October.

Kind regards,
M. Giovanna Ranalli



==============


Graph sampling 

Li-Chun Zhang

Representing a finite collection of entities (i.e. a population) by a graph allows one to incorporate the connections (or links) among the entities in addition to the entities themselves. One may be interested in the structure of the connections, or the links may effectively provide access to the part of the population that is of primary interest. Either way, graph sampling provides a statistical approach to studying real graphs as they are. Just like sampling from finite populations, it is based on exploring the variation over all possible sub-graphs, i.e. sample graphs, which can be taken from the given population graph according to a specified method of sampling.
The series of talks is not intended as a systematic exposition of graph sampling theory. The emphasis is on introducing the central concepts and highlighting the versatility of graph sampling across a variety of problems. Hopefully this will stimulate greater interest in future research and applications.


1. Graph sampling methods for epidemiological studies 

The ongoing pandemic reminds one of the need for sound statistical methods in such life-and-death matters. The modelling approach to the estimation of the coefficient R has received much attention. However, suppose one were able to test everyone in the whole population over the period of time required for calculating R: how much would the model estimate differ from the truth? Suppose one were able to keep tracking the population in this way: would the interval estimate of the R number have valid coverage over time?
Unlike modelling, sampling provides valid descriptive inference by construction, regardless of the particular population or phenomenon of interest. Among others, Istat and ONS are conducting prevalence studies based on finite-population sampling methods. Intuitively, a design is likely to be more efficient if the positives have a relatively higher representation in the sample than in the population. Since the virus is transmitted via personal contacts, contact-tracing of the observed positives is likely to generate a higher yield of cases than random sampling from the population. This requires a population-graph representation that includes the contacts between individuals in addition to the individuals themselves. Given such a representation, I will discuss the efficacy of some graph sampling methods for cross-sectional and longitudinal prevalence estimation in epidemiological studies.


2. Versatility and intricacy of graph sampling

Since a graph contains two types of objects, nodes and edges, graph sampling requires an observation procedure in addition to the selection of an initial probability sample of nodes. The observation procedure makes use of the incidence relationship between nodes and edges in order to propagate from the initial sample via the incident edges; depending on the context, it can be multiwave, multistage and/or multilayer. As will be explained, via the articulation of the observation procedure, graph sampling encompasses all the so-called unconventional finite-population sampling methods, such as indirect, network, line-intercept or adaptive cluster sampling.
Moreover, adopting a graph sampling perspective allows one to deal with a multitude of graph parameters, with or without a fixed order, defined for sub-graphs of specific characteristics, which are referred to as motifs, such as mutual relationship (of a dyad), triangle (of a triad), tree (of K nodes), geodesic (of length K), connected component (of K nodes). However, a graph sampling method may be a probability design for some graph parameters but not for others. Not all the observed motifs in the sample graph are eligible for estimation if the inclusion probability cannot be calculated for all of them. On the one hand, due to the various additional complications, formulating a feasible sampling strategy, which consists of a sampling method and an accompanying estimator, for the graph parameter of interest is generally not a trivial issue in graph sampling; on the other hand, exploring the intricacy surrounding feasible strategies can be rewarding in terms of efficiency gains. The versatility and intricacy will be illustrated across a range of settings and problems.
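
As a small illustration of what a feasible strategy for a motif can look like (this sketch is not taken from the talk), assume the simplest possible design: each node is selected independently with probability p (Bernoulli sampling), and the observation procedure reveals only the edges among the selected nodes (the induced sub-graph). Under these assumptions a triangle is observed exactly when its three nodes are all selected, so its inclusion probability is p^3 and a Horvitz-Thompson estimator of the triangle count is available. The toy graph and all function names below are hypothetical.

```python
from itertools import combinations


def triangles(nodes, edges):
    """Return the triangles (as frozensets of 3 nodes) among the given nodes."""
    nodes = sorted(nodes)
    return {
        frozenset(t)
        for t in combinations(nodes, 3)
        if all(frozenset(pair) in edges for pair in combinations(t, 2))
    }


def ht_triangle_estimate(sample_nodes, edges, p):
    """Horvitz-Thompson estimate of the total number of triangles.

    Assumes Bernoulli(p) node sampling with induced-subgraph observation,
    so every observed triangle has inclusion probability p**3.
    """
    return len(triangles(sample_nodes, edges)) / p**3


def expected_estimate(nodes, edges, p):
    """Exact expectation of the estimator over all 2^n possible samples."""
    nodes = list(nodes)
    n = len(nodes)
    total = 0.0
    for r in range(n + 1):
        for sample in combinations(nodes, r):
            prob = p**r * (1 - p) ** (n - r)  # probability of this sample
            total += prob * ht_triangle_estimate(sample, edges, p)
    return total


# Hypothetical population graph with two triangles: {0, 1, 2} and {2, 3, 4}.
NODES = range(6)
EDGES = {
    frozenset(e)
    for e in [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4), (4, 5)]
}

if __name__ == "__main__":
    print("true count:", len(triangles(NODES, EDGES)))
    print("expected estimate:", expected_estimate(NODES, EDGES, 0.5))
```

Enumerating all possible samples, as expected_estimate does, is only practical for toy graphs; it is used here to check design-unbiasedness directly. With a different observation procedure (e.g. one that also reveals edges to unsampled neighbours), the inclusion probability of a triangle changes and p^3 would no longer be the right weight.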



3. Walk sampling from large dynamic graphs

One can envisage a discrete-time walk in a graph as travelling from one city to another via the roads that connect the cities. In a random walk, one randomly takes one of the possible roads out of the current city, repeats the same at the next city, and so on. In a targeted walk, one can modify the transition probabilities by the Metropolis-Hastings method.
A walk gradually reaches its equilibrium if the chance that one visits a given city at a given time depends less and less on the particular starting point. The stationary probability that a walk takes one to a given city is the fraction of times the city is visited when the walk is at equilibrium. The stationary probabilities are generally known up to a proportionality constant for the cities visited, but are unknown for the cities yet to be visited. Random or targeted walks can be related to many well-known graph problems, such as PageRank for internet search engines, core-periphery network structure, and betweenness as a centrality measure. Walk sampling seems especially natural for large graphs, which may be dynamic as well, i.e. evolving over time, provided the walk can be fast-moving. The basis of inference is now the stationary probabilities at equilibrium, instead of the exact sample inclusion probabilities from a fixed population graph. The traditional methods only deal with parameters defined for the nodes. An approach to feasible strategies for graph parameters defined for motifs of arbitrary but fixed order will be developed, which can greatly increase the scope of inference.
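
The two kinds of walk can be sketched on a toy example (this is an illustration only, not the method developed in the talk). It assumes an undirected, connected graph, for which the simple random walk has stationary probabilities proportional to the node degrees, and it uses a Metropolis-Hastings walk whose target is the uniform distribution over nodes, as one possible choice of target. The graph and all function names are hypothetical.

```python
import random
from fractions import Fraction

# Hypothetical toy graph: adjacency lists of an undirected, connected graph.
GRAPH = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}


def random_walk_step(graph, node, rng):
    """Take one of the roads out of the current city uniformly at random."""
    return rng.choice(graph[node])


def mh_uniform_step(graph, node, rng):
    """Metropolis-Hastings step targeting the uniform distribution on nodes."""
    proposal = rng.choice(graph[node])
    accept = min(1.0, len(graph[node]) / len(graph[proposal]))
    return proposal if rng.random() < accept else node


def transition_matrix(graph, targeted):
    """Exact one-step transition probabilities of either walk, as Fractions."""
    P = {u: {v: Fraction(0) for v in graph} for u in graph}
    for u, nbrs in graph.items():
        for v in nbrs:
            move = Fraction(1, len(nbrs))
            if targeted:  # multiply by the MH acceptance probability
                move *= min(Fraction(1), Fraction(len(nbrs), len(graph[v])))
            P[u][v] += move
        P[u][u] += 1 - sum(P[u].values())  # rejected proposals: stay put
    return P


def is_stationary(P, pi):
    """Check that pi P = pi holds exactly."""
    return all(sum(pi[u] * P[u][v] for u in P) == pi[v] for v in P)


def degree_distribution(graph):
    total = sum(len(nbrs) for nbrs in graph.values())
    return {u: Fraction(len(nbrs), total) for u, nbrs in graph.items()}


def uniform_distribution(graph):
    return {u: Fraction(1, len(graph)) for u in graph}


def visit_frequencies(graph, step, start, n_steps, seed=0):
    """Fraction of steps spent at each node along one simulated walk."""
    rng = random.Random(seed)
    counts = dict.fromkeys(graph, 0)
    node = start
    for _ in range(n_steps):
        node = step(graph, node, rng)
        counts[node] += 1
    return {u: c / n_steps for u, c in counts.items()}


if __name__ == "__main__":
    print(visit_frequencies(GRAPH, random_walk_step, "A", 100_000))
    print(visit_frequencies(GRAPH, mh_uniform_step, "A", 100_000))
```

Note that each step uses only the degree of the current city and of the proposed one, which is the sense in which the stationary probabilities are available, up to a constant, for the cities visited without knowing the whole graph. On a long run the visit frequencies of the first walk should be roughly proportional to the degrees and those of the second roughly equal.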








M. Giovanna Ranalli, PhD

~ Associate Professor of Statistics
~ Department of Political Science 
~ University of Perugia (Italy) 

~ Tel +39 075 5855245
~ Fax +39 075 5855950
~ url: https://www.unipg.it/pagina-personale?n=maria.ranalli
~ http://www.sp.unipg.it/surwey/
~ https://scholar.google.it/citations?user=nCqXom0AAAAJ&hl=en




