Computational and quantitative methods are now an essential part of biological research, and there is an urgent need for scientists who are expert in both the computational and the life sciences. We propose to continue addressing this need through the Columbia University predoctoral Training Program in Computational Biology. Our main goal is to train young scientists to do pioneering, high-impact research in biology using computational methods. We will accomplish this goal by providing the necessary background in both life and computational sciences and through mentored student research projects. Students will be trained to be experts in one area of computational biology (e.g. systems biology), to have a good knowledge of another area (e.g. structural biology), and to have an in-depth understanding of experimental biology in their area of specialization (e.g. genetics or neuroscience). In the initial funding period, as suggested by the NIGMS training program guidelines, we focused on course and curriculum development with a small group of highly qualified trainees from diverse scientific backgrounds. Students trained by our program have already made exciting scientific breakthroughs and published high-profile articles. In this application we propose to expand the program to accommodate a larger number of qualified students at Columbia, and we request 8 slots. The program coursework will be tailored to students' backgrounds and interests and will follow a 4x2 scheme: at least 2 courses in life sciences (e.g. biochemistry, cell biology), 2 courses in quantitative subjects (e.g. machine learning, statistics), 2 courses in computational biology (e.g. computational systems biology, biophysics), and 2 electives from any of the above areas.
Students will be able to earn doctoral degrees in the C2B2 Graduate Program or in any of the 13 affiliated departments and programs, working in the laboratories of 20 training program faculty who represent both quantitative and life science disciplines. Students will also learn through rotations (including rotations in experimental labs), guided independent study, an oral qualification exam, seminars, retreats, and journal clubs. The program will be directed by Dr. Barry Honig, Director of the Columbia University Center for Computational Biology and Bioinformatics (C2B2), and crucial program decisions will be made by the program's Executive Committee. Our Ph.D. program takes 5 years on average to complete, and we will typically support computational biology students for a 2-3 year period. We will make every effort to recruit and retain students who belong to minority groups, come from underprivileged backgrounds, or have disabilities, and we will continue to pursue this goal by actively recruiting qualified students from these groups. All students will be required to take a one-semester course in the responsible conduct of research.

Public Health Relevance

The relevance of this project to public health is in training students to use computational methods to understand genotype-to-phenotype relationships across the whole range of biological research, from genetics to protein function to cellular networks to clinical phenotypes.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Institutional National Research Service Award (T32)
Study Section
Special Emphasis Panel (TWD)
Program Officer
Ravichandran, Veerasamy
Columbia University (N.Y.)
Schools of Medicine
New York
United States
Riley, Todd R; Slattery, Matthew; Abe, Namiko et al. (2014) SELEX-seq: a method for characterizing the complete repertoire of binding site preferences for transcription factor complexes. Methods Mol Biol 1196:255-78
Vendome, Jeremie; Felsovalyi, Klara; Song, Hang et al. (2014) Structural and energetic determinants of adhesive binding specificity in type I cadherins. Proc Natl Acad Sci U S A 111:E4175-84
Chen, James C; Alvarez, Mariano J; Talos, Flaminia et al. (2014) Identification of causal genetic drivers of human disease through systems-level analysis of regulatory networks. Cell 159:402-14
Thu, Chan Aye; Chen, Weisheng V; Rubinstein, Rotem et al. (2014) Single-cell identity generated by combinatorial homophilic interactions between α, β, and γ protocadherins. Cell 158:1045-59
Ward, Lucas D; Wang, Junbai; Bussemaker, Harmen J (2014) Characterizing a collective and dynamic component of chromatin immunoprecipitation enrichment profiles in yeast. BMC Genomics 15:494
Higgins, Claire A; Chen, James C; Cerise, Jane E et al. (2013) Microenvironmental reprogramming by three-dimensional culture enables dermal papilla cells to induce de novo human hair-follicle growth. Proc Natl Acad Sci U S A 110:19679-88
Jin, Xiangshu; Walker, Melissa A; Felsövályi, Klára et al. (2012) Crystal structures of Drosophila N-cadherin ectodomain regions reveal a widely used class of Ca²⁺-free interdomain linkers. Proc Natl Acad Sci U S A 109:E127-34
Gilman, Sarah R; Iossifov, Ivan; Levy, Dan et al. (2011) Rare de novo variants associated with autism implicate a large functional network of genes involved in formation and function of synapses. Neuron 70:898-907
Shechtman, Caryn F; Henneberry, Annette L; Seimon, Tracie A et al. (2011) Loss of subcellular lipid transport due to ARV1 deficiency disrupts organelle homeostasis and activates the unfolded protein response. J Biol Chem 286:11951-9
Harrison, Oliver J; Jin, Xiangshu; Hong, Soonjin et al. (2011) The extracellular architecture of adherens junctions revealed by crystal structures of type I cadherins. Structure 19:244-56

Showing the most recent 10 out of 11 publications