Computational methods have become intrinsic to biomedical research. The overall goal is to provide Dr. Ka Yee Yeung (Ph.D. in Computer Science) with mentored training and research experience to transition into an independent multi-disciplinary investigator in biomedical research. A program comprising mentored research, academic coursework, and a research plan has been designed for this purpose. The mentored research component consists of mentors and an advisory committee who are leading experts in molecular biology, proteomics, medical research, bioinformatics and statistics. The academic coursework component will provide Dr. Yeung with a solid background in molecular biology, cancer biology and statistics. The underlying theme of the research plan is the development of methods and software tools to facilitate the extraction of biological meaning from high-throughput data in cancer and disease investigation. The major goals of our research plan are the following:
Specific Aim 1: Development of improved algorithms for class prediction and identification of gene markers on microarray data related to hepatocellular carcinoma (HCC) and hepatitis C virus (HCV)-associated liver disease. The problems of predicting the diagnostic or prognostic category of a given tissue sample (class prediction) and identifying potential gene markers from microarray data have received considerable attention. We will develop improved algorithms for class prediction and identification of potential gene markers by taking advantage of variability over repeated measurements in microarray data.
Specific Aim 2: Development of class prediction and class discovery algorithms for heterogeneous data. We will build on our previous work in cluster analysis and class prediction to develop algorithms that handle data from multiple sources, including microarray data, proteomics data and clinical data.
Specific Aim 3: Development of improved visualization tools. Software tools for visualization will be developed to help biologists apply their biological knowledge and interpret computational results from high-throughput data.
Specific Aim 4: Development of practical guidelines for cluster analysis on microarray data. We will make use of our in-house database of thousands of microarray experiments to conduct empirical studies to develop practical guidelines for cluster analysis.
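
As a rough illustration of the idea behind Specific Aim 1, the sketch below shows one simple way that variability over repeated measurements could inform gene marker ranking. It is not the method proposed or later published under this award. The inverse-variance weighting, the function names `weighted_class_score` and `rank_genes`, the `eps` parameter, and the toy data are all assumptions made for illustration. Each sample's mean expression is weighted by the inverse of its estimated variance, so genes measured consistently across replicates count for more than genes with noisy replicates.

```python
from statistics import mean, variance


def weighted_class_score(replicates, labels, eps=1e-6):
    """Score one gene for separating two classes labelled 0 and 1.

    replicates: list of lists; replicates[i] holds the repeated
                measurements of this gene in sample i (at least 2 each).
    labels:     list of 0/1 class labels, one per sample.
    eps:        hypothetical small constant that avoids division by zero.
    """
    sums = {0: 0.0, 1: 0.0}     # weighted sums of sample means, per class
    weights = {0: 0.0, 1: 0.0}  # total weight, per class
    for reps, label in zip(replicates, labels):
        # Estimated variance of the sample mean from its replicates.
        var_of_mean = variance(reps) / len(reps) + eps
        w = 1.0 / var_of_mean
        sums[label] += w * mean(reps)
        weights[label] += w
    mu0 = sums[0] / weights[0]
    mu1 = sums[1] / weights[1]
    # Standard error of the difference of two inverse-variance weighted means.
    se = (1.0 / weights[0] + 1.0 / weights[1]) ** 0.5
    return (mu1 - mu0) / se


def rank_genes(data, labels):
    """Return gene names sorted by decreasing absolute score.

    data maps a gene name to its per-sample replicate lists.
    """
    scores = {g: weighted_class_score(r, labels) for g, r in data.items()}
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)


if __name__ == "__main__":
    # Toy data (invented): two samples per class, two replicates per sample.
    labels = [0, 0, 1, 1]
    data = {
        "tight_gene": [[1.0, 1.1], [0.9, 1.0], [3.0, 3.1], [2.9, 3.0]],
        "noisy_gene": [[1.0, 3.0], [0.5, 2.5], [1.5, 3.5], [1.0, 3.2]],
    }
    print(rank_genes(data, labels))
```

A score of this kind only ranks genes one at a time. A practical class predictor would also need multi-gene model selection and validation on held-out samples.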

Agency
National Institutes of Health (NIH)
Institute
National Cancer Institute (NCI)
Type
Mentored Quantitative Research Career Development Award (K25)
Project #
1K25CA106988-01
Application #
6765680
Study Section
Subcommittee G - Education (NCI)
Program Officer
Eckstein, David J
Project Start
2004-05-01
Project End
2009-04-30
Budget Start
2004-05-01
Budget End
2005-04-30
Support Year
1
Fiscal Year
2004
Total Cost
$141,533
Indirect Cost
Name
University of Washington
Department
Microbiology/Immun/Virology
Type
Schools of Medicine
DUNS #
605799469
City
Seattle
State
WA
Country
United States
Zip Code
98195
Zarbl, Helmut; Gallo, Michael A; Glick, James et al. (2010) The vanishing zero revisited: thresholds in the age of genomics. Chem Biol Interact 184:273-8
Annest, Amalia; Bumgarner, Roger E; Raftery, Adrian E et al. (2009) Iterative Bayesian Model Averaging: a method for the application of survival analysis to high-dimensional microarray data. BMC Bioinformatics 10:72
Oehler, Vivian G; Yeung, Ka Yee; Choi, Yongjae E et al. (2009) The derivation of diagnostic markers of chronic myeloid leukemia progression from microarray data. Blood 114:3292-8
Bumgarner, Roger E; Yeung, Ka Yee (2009) Methods for the inference of biological pathways and networks. Methods Mol Biol 541:225-45
Chu, Vu T; Gottardo, Raphael; Raftery, Adrian E et al. (2008) MeV+R: using MeV as a graphical user interface for Bioconductor applications in microarray analysis. Genome Biol 9:R118
Gottardo, Raphael; Raftery, Adrian E; Yeung, Ka Yee et al. (2006) Bayesian robust inference for differential gene expression in microarrays with multiple samples. Biometrics 62:10-8
Liu, X; Sivaganesan, S; Yeung, K Y et al. (2006) Context-specific infinite mixtures for clustering gene expression profiles across diverse microarray dataset. Bioinformatics 22:1737-44
Yeung, Ka Yee; Bumgarner, Roger E; Raftery, Adrian E (2005) Bayesian model averaging: development of an improved multi-class, gene selection and classification tool for microarray data. Bioinformatics 21:2394-402
Li, Qunhua; Fraley, Chris; Bumgarner, Roger E et al. (2005) Donuts, scratches and blanks: robust model-based segmentation of microarray images. Bioinformatics 21:2875-82
Yeung, Ka Yee; Medvedovic, Mario; Bumgarner, Roger E (2004) From co-expression to co-regulation: how many microarray experiments do we need? Genome Biol 5:R48