50th Anniv. Seminar Series on 'The power of simple statistical techniques...' by Professor Lei SUN
posted by Department of Statistics and Actuarial Science for HKU and Public
Event Type: Public Lecture/Forum/Seminar/Workshop/Conference/Symposium
Event Nature: Science & Technology
DEPARTMENT OF STATISTICS AND ACTUARIAL SCIENCE
THE UNIVERSITY OF HONG KONG
50th Anniversary Seminar Series
Professor Lei SUN
Department of Statistical Sciences, Faculty of Arts and Science, and
Division of Biostatistics, Dalla Lana School of Public Health
University of Toronto, Canada
will give a talk
THE POWER OF SIMPLE STATISTICAL TECHNIQUES IN THE ERA OF BIG AND COMPLEX DATA: SOME RECENT EXAMPLES FROM GENETIC ASSOCIATION STUDIES
Genetic association studies aim to identify genetic markers associated with a heritable trait or outcome of interest. The data are typically large and challenging: whole-genome studies scan through millions of variables, missing data and measurement errors are often present, individuals from the same family are correlated, and complex genetic etiologies imply complex models. Machine-learning approaches are gaining ground rapidly across many scientific fields. As a complementary approach, in this talk I present some recent examples where we efficiently and reliably extract information from large-scale genetic association studies by reconsidering classical statistical techniques in newer settings. We first revisit the well-known Fisher’s method, commonly used in meta-analyses to combine p-values from the same test applied to K independent samples. Here we propose to use it to combine p-values from different tests applied to the same sample, when analyzing multiple genetic variants simultaneously (Derkach, Lawless and Sun 2014, Statistical Science; 2015, Genetic Epidemiology), or when jointly capturing main and interaction effects (Soave et al. 2015, American Journal of Human Genetics). In both settings, we show that there are two classes of complementary tests that are asymptotically independent of each other under a global null hypothesis; this is a desirable feature for analyzing big data. We then revisit simple linear regression and its celebrated extensions in novel contexts. We first show that Levene’s scale test for variance heterogeneity can be derived from a two-stage regression framework, and this allows us to generalize the test, with ease, to more complex data (Soave and Sun 2017, Biometrics).
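For readers unfamiliar with Fisher's method as mentioned in the abstract, the following is a minimal sketch in Python and not the speaker's implementation. It combines K p-values through the statistic T = -2 Σ log p_i, which follows a chi-square distribution with 2K degrees of freedom under the null only when the p-values are independent; this is why the asymptotic independence of the two test classes matters. The function name `fisher_combine` and the input values in the usage note are illustrative assumptions.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method (illustrative sketch)."""
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("each p-value must lie in (0, 1]")

    # Fisher's statistic is T = -2 * sum(log p_i); we work with T / 2.
    half_t = -sum(math.log(p) for p in p_values)

    # Survival function of a chi-square with even df = 2k has a closed form:
    # P(X > T) = exp(-T/2) * sum_{i=0}^{k-1} (T/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_t / i
        total += term
    return math.exp(-half_t) * total
```

As a usage example with made-up numbers, `fisher_combine([0.04, 0.20])` would combine the p-values of two hypothetical complementary tests computed on the same sample. The result is valid only if those two tests are (at least asymptotically) independent under the null.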
If time permits, I will discuss ongoing work, with graduate student Lin Zhang, on how to use a regression model to test Hardy-Weinberg equilibrium; this leads to a new allele-based association test with theoretical insights on its robustness. We also provide supporting evidence from applications, including genetic association studies of complications related to type 1 diabetes and cystic fibrosis.
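As background for the Hardy-Weinberg equilibrium (HWE) test mentioned above, the following sketch shows the classical Pearson chi-square test for a bi-allelic marker. It is not the regression-based test from the talk, whose details are not given in this announcement. The function name `hwe_chi_square` and its argument names are illustrative assumptions.

```python
import math


def hwe_chi_square(n_hom_ref, n_het, n_hom_alt):
    """Classical 1-df Pearson chi-square test of HWE (illustrative sketch).

    Arguments are the observed genotype counts for a bi-allelic marker.
    Returns (chi-square statistic, p-value).
    """
    n = n_hom_ref + n_het + n_hom_alt
    if n == 0:
        raise ValueError("no genotypes observed")

    # Estimated frequency of the reference allele.
    p = (2 * n_hom_ref + n_het) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("marker is monomorphic; the test is undefined")

    # Genotype counts expected under HWE: n*p^2, 2*n*p*q, n*q^2.
    observed = (n_hom_ref, n_het, n_hom_alt)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Chi-square survival function with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, genotype counts of 25, 50 and 25 match the HWE expectation exactly at an allele frequency of 0.5, so the statistic is 0 and the p-value is 1.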
Venue: Room 301, Run Run Shaw Building, HKU
Registration is not required.