The genetic landscape of rare and common diseases has emerged as heterogeneous and complex. Researchers and clinicians already face the challenge of discerning pathophysiological mechanisms and treatment opportunities for the hundreds of genetic subtypes that have been identified in rare diseases such as inherited neuropathies (INs) or mitochondrial diseases (MiDs) alone. Still, a large fraction of disease loci remains to be discovered. This is a daunting task, since gene-identification studies often require immense sample sizes that are difficult to achieve, even for more common conditions. Simultaneously, much of the heritability of many disorders appears to be determined by the collective impact of possibly thousands of low-impact variants spread across the genome. Ideally, the impact of a given set of candidate variants could be assessed within a high-throughput framework that accounts for the genetic context of individual patients. Leveraging advanced deep learning algorithms, we have developed an unbiased, scalable method to rapidly identify disease-associated phenotypes in high-resolution, multiplexed fluorescence microscopy images of primary, patient-derived cells. In turn, the discovered phenotypes can be exploited as experimental signals against which the disease relevance of candidate variants can be confirmed through genetic complementation experiments. At the same time, the standardized and scalable nature of our method makes it suitable for testing potential therapeutic interventions, for example to assess the efficacy of a candidate gene therapy or to screen small-molecule libraries, while maintaining patient-specific granularity. The goal of this proposal is to apply our approach to an expanded cohort of patient cells and to refine methods for interpreting both genetic and pharmacological perturbations.
In this work, I will be supported by an exceptional, multidisciplinary team of experts in clinical, molecular, and functional genetics and in computer science, within the world-class scientific environment offered by Columbia University and the Broad Institute. Following a carefully designed development plan, I will finalize my training in machine learning and data science, expand my expertise to single-cell RNA sequencing and other single-cell methods, and acquire the leadership and scholarly skills required for an independent research career. Over the course of this award, I will apply our cellular profiling approach to generate a standardized map of deep, quantitative descriptions of disease-associated cellular phenotypes across a number of INs, MiDs, and neurodegenerative conditions. We will explore the integration of RNA sequencing to enhance our approach. Finally, we will apply our method to the discovery and confirmation of new disease genes and use it to screen a limited number of pharmacological interventions. Together, the proposed development plan and research strategy will foster my ability to lead an independent research program and to establish cellular profiling as a powerful platform to advance genomic and translational medicine.
The functional interpretation of genetic variation in disease faces critical roadblocks because scalable methods are lacking to assess the significance of candidate variants experimentally while accounting for genomic context. This proposal introduces a morphological profiling method that rapidly identifies disease-associated phenotypes in patient cells and offers a cost-efficient, unbiased way to determine the significance of putative pathogenic variants at scale. If successful, this proposal will establish cellular profiling of patient cells as a powerful platform to advance genomic and translational medicine.