The ultimate goal is to develop and apply computational and statistical techniques for large-scale analysis of recently emerged genomic data in order to extract optimal and meaningful biological information. Specific aims to reach this goal are threefold. (1) To thoroughly understand the technological and biological aspects of gene mapping and microarray-based experiments and to identify the statistical problems involved in functional genomic studies. This learning phase will make the PI, a mathematical statistician, familiar with the biology of human genetics. (2) To develop methods for efficient association analysis between single nucleotide polymorphisms (SNPs) and specific heritable diseases. For many complex traits, the number of families or affected individuals in a study is smaller (or not much larger) than the number of SNP markers used in a genomic screen. This precludes meaningful multivariate analyses on a genome-wide basis. To select appropriate subsets of markers for further study and to combine information over multiple markers in multivariate analyses, novel statistical bootstrap (resampling-based) methods will be developed. The resulting subset of SNP markers will be further evaluated by logistic or other multiple regression models for risk assessment. (3) To develop statistical analysis methods for gene expression data obtained through microarray-based technologies. Issues such as reproducibility in multiple experiments and signal function frequently confound the analysis of microarray data. A multi-step procedure based on raw data from oligonucleotide expression arrays is proposed, and computer programs will be developed. The approaches developed in the training period will ultimately allow improved analysis of both genomic and gene expression array data.
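The abstract does not specify the resampling procedure planned under aim (2), so the following is only an illustrative sketch of one way bootstrap resampling could be used to pick a marker subset when markers outnumber individuals. It is not the project's method. The function names (`marker_scores`, `bootstrap_selection_frequency`), the score (a squared standardized difference in mean allele count between cases and controls), and the simulated data are all assumptions introduced for illustration.

```python
import numpy as np


def marker_scores(genotypes, status):
    """Per-marker squared standardized difference in mean allele count
    between cases (status == 1) and controls (status == 0).
    This score is an assumption chosen for simplicity."""
    cases = genotypes[status == 1]
    controls = genotypes[status == 0]
    diff = cases.mean(axis=0) - controls.mean(axis=0)
    # Overall per-marker variance scaled by group sizes; the small
    # constant avoids division by zero for monomorphic markers.
    var = genotypes.var(axis=0) * (1.0 / len(cases) + 1.0 / len(controls)) + 1e-12
    return diff ** 2 / var


def bootstrap_selection_frequency(genotypes, status, top_k=5, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each marker ranks
    among the top_k scores."""
    rng = np.random.default_rng(seed)
    n_markers = genotypes.shape[1]
    counts = np.zeros(n_markers)
    case_idx = np.flatnonzero(status == 1)
    ctrl_idx = np.flatnonzero(status == 0)
    for _ in range(n_boot):
        # Resample individuals with replacement within each group so
        # that every resample keeps both cases and controls.
        idx = np.concatenate([
            rng.choice(case_idx, size=len(case_idx), replace=True),
            rng.choice(ctrl_idx, size=len(ctrl_idx), replace=True),
        ])
        scores = marker_scores(genotypes[idx], status[idx])
        top = np.argsort(scores)[-top_k:]
        counts[top] += 1
    return counts / n_boot


# Simulated example (hypothetical data): 100 individuals, 200 SNP markers,
# genotypes coded as allele counts 0/1/2. Only marker 0 is associated.
rng = np.random.default_rng(1)
status = np.repeat([1, 0], [50, 50])
genotypes = rng.binomial(2, 0.3, size=(100, 200))
genotypes[status == 1, 0] = rng.binomial(2, 0.8, size=50)
genotypes[status == 0, 0] = rng.binomial(2, 0.2, size=50)

freq = bootstrap_selection_frequency(genotypes, status)
selected = np.flatnonzero(freq >= 0.5)  # markers kept for follow-up modeling
```

Markers selected this way could then be passed to a logistic regression, as the abstract describes for the risk-assessment step. The 0.5 frequency cutoff and `top_k=5` are arbitrary choices for the example.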

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Mentored Quantitative Research Career Development Award (K25)
Project #
5K25HG000060-05
Application #
6963244
Study Section
Ethical, Legal, Social Implications Review Committee (GNOM)
Program Officer
Brooks, Lisa
Project Start
2002-03-06
Project End
2007-02-28
Budget Start
2005-03-01
Budget End
2006-02-28
Support Year
5
Fiscal Year
2005
Total Cost
$118,967
Indirect Cost
Name
Yale University
Department
Public Health & Prev Medicine
Type
Schools of Medicine
DUNS #
043207562
City
New Haven
State
CT
Country
United States
Zip Code
06520
Wu, Chengqing; Zhang, Hong; Liu, Xiangtao et al. (2009) Detecting essential and removable interactions in genome-wide association studies. Stat Interface 2:161-170
Harris, C R; Dewan, A; Zupnick, A et al. (2009) p53 responsive elements in human retrotransposons. Oncogene 28:3857-65
Klein, Robert J; Zeiss, Caroline; Chew, Emily Y et al. (2005) Complement factor H polymorphism in age-related macular degeneration. Science 308:385-9
Zareparsi, Sepideh; Branham, Kari E H; Li, Mingyao et al. (2005) Strong association of the Y402H variant in complement factor H at 1q32 with susceptibility to age-related macular degeneration. Am J Hum Genet 77:149-53
Hoh, Josephine; Ott, Jurg (2004) Genetic dissection of diseases: design and methods. Curr Opin Genet Dev 14:229-32
Wille, Anja; Hoh, Josephine; Ott, Jurg (2003) Sum statistics for the joint detection of multiple disease loci in case-control association studies with SNP markers. Genet Epidemiol 25:350-9
Hoh, Josephine; Matsuda, Fumihiko; Peng, Xu et al. (2003) SNP haplotype tagging from DNA pools of two individuals. BMC Bioinformatics 4:14
Ott, Jurg; Hoh, Josephine (2003) Set association analysis of SNP case-control and microarray data. J Comput Biol 10:569-74
Yang, Yaning; Zhang, Jingshan; Hoh, Josephine et al. (2003) Efficiency of single-nucleotide polymorphism haplotype estimation from pooled DNA. Proc Natl Acad Sci U S A 100:7225-30
Hoh, J; Jin, S; Parrado, T et al. (2002) The p53MH algorithm and its application in detecting p53-responsive genes. Proc Natl Acad Sci U S A 99:8467-72