[Forum SIS] Seminar in Statistics LORENZO MASOERO

Pierpaolo De Blasi pierpaolo.deblasi at unito.it
Fri 10 Dec 2021 18:04:01 CET


SEMINARS IN STATISTICS @ COLLEGIO CARLO ALBERTO
<https://www.carloalberto.org/events/category/seminars/seminars-in-statistics/?tribe-bar-date=2019-09-01>

On Friday, 17 December 2021, at 12:00, the following seminar will be held
at the Collegio Carlo Alberto, Piazza Arbarello 8, Torino:

------------------------------------------------

Speaker: *Lorenzo Masoero* (Amazon)


Title: *Improved prediction and optimal sequencing strategies for genomic
variant discovery via Bayesian nonparametrics*

Abstract: Despite the advent of Big Data, data-gathering in many domains can
still be an expensive process that necessitates careful planning when
operating under a fixed, limited budget. For instance, sequencing new
genomic data is a complex procedure that requires careful tuning:
researchers can spend resources to sequence a greater number of genomes
(quantity), or spend resources to sequence genomes with increased accuracy
(quality). In this talk, I consider the common setting in which scientists
have already conducted a pilot study to reveal variants in a genome and are
contemplating a follow-up study. Spending additional resources has the
potential to reveal new variations in the genome, and thereby new genetic
insights. Therefore, practitioners are interested in (i) predicting how
many new discoveries they will make under different experimental design
choices. In turn, they can leverage these predictions to optimally allocate
available resources in the design of a future experiment, e.g. (ii) to
maximize the number of future discoveries or (iii) to optimize the
usefulness of a future experiment for the task at hand, e.g. the power of
an associated statistical test.
I discuss novel methodologies to solve the problems mentioned above. Our
approach relies on a Bayesian nonparametric formulation that facilitates
(i) prediction for the number of new variants in the follow-up study based
on the pilot study. We show empirically that, when experimental conditions
are kept constant between the pilot and follow-up, our method's prediction
is competitive with the best existing methods. Unlike current methods,
though, our new method allows practitioners to change experimental
conditions between the pilot and the follow-up. We demonstrate how this
distinction allows our method to be used for more realistic predictions and
for optimal allocation of a fixed budget between quality and quantity. In
particular, we first show how, under a fixed budget, our predictions can be
used to maximize (ii) the number of new genomic variants discovered in a
follow-up study. Last, we show how our framework can guide practitioners in
other experimental design problems, and specifically how to achieve (iii)
the highest possible power in statistical tests in the context of rare
variants association studies.

------------------------------------------------

In compliance with anti-Covid regulations, in-person attendance requires
booking through the following online form:
https://forms.gle/jVPLoDCQ1ANSB5dg8

The seminar can also be followed via streaming:
Join Zoom Meeting
<https://us02web.zoom.us/j/88971670320?pwd=VXFseHNlYm5EKzFmZ25oTGdlRWw4QT09>
Meeting ID: 889 7167 0320
Passcode: 362208


The seminar is organized by the "de Castro" Statistics Initiative

www.carloalberto.org/stats

Best regards,
Pierpaolo De Blasi
---
University of Torino & Collegio Carlo Alberto

carloalberto.org/pdeblasi
<https://sites.google.com/a/carloalberto.org/pdeblasi/>