Genome-wide association and linkage studies involving hundreds or thousands of Single Nucleotide Polymorphisms (SNPs) are becoming increasingly common due to the rapid development of biotechnologies. Among the many statistical challenges arising from these studies, the typically limited sample size is of particular concern for three reasons: high genotyping costs put pressure on investigators to limit the number of individuals genotyped; more stringent statistical significance levels are required to guard against too many false positives due to multiple comparisons; and each disease-associated variant allele confers only moderate risk in complex diseases. This application considers two strategies to address this issue: (1) to increase sample size by pooling data obtained from several sources; (2) to devise better statistical and computational tools for more efficient use of the data. Correspondingly, the first aim is to develop estimation and inference procedures for genetic association using data obtained from both population-based case-control and family-based studies, accommodating diverse ascertainment schemes of cases and controls, whereas the second aim is to develop analysis and regularization methods that improve the chance that the disease-associated variants and their interactions can actually be identified.
The second aim is also concerned with the construction of risk prediction models from these SNPs. Highly dense SNP markers also pose problems for the more traditional model-based linkage analysis used in gene discovery, because the methods for this analysis were developed assuming markers in linkage equilibrium, an assumption that is likely violated at the density of these SNPs.
The third aim is to develop and evaluate estimation procedures for multipoint linkage analysis in the presence of linkage disequilibrium among SNP markers.