While the human genome provides a parts list of more than 20,000 proteins, it is still largely unknown how these proteins assemble into 'molecular machines' to carry out their biological roles. This knowledge is important both for basic characterization of human genes and for understanding the mechanisms underlying most human genetic traits and diseases, which often arise from defects in systems of proteins working together. We focus on the more than 5,000 human proteins that are shared across eukaryotes and date to the last eukaryotic common ancestor. These ancient proteins carry out critical cellular processes, including DNA replication, repair, transcription, splicing, mitochondrial and ciliary processes, and trafficking. They are disproportionately drivers of human disease and are linked to a wide array of disorders, spanning cancers, birth defects, metabolic disorders, Parkinson disease, Huntington disease, amyotrophic lateral sclerosis, intellectual disabilities, and more. More than 750 of these deeply conserved human proteins are still entirely uncharacterized.

A fundamental question is how all of these proteins work together to support cell function. A key limitation, however, is the lack of large-scale data directly interrogating these proteins' expression, interactions, and activation states. Current approaches for quantifying the proteome are only beginning to survey the proteins expressed in mammalian cells to any significant depth, and they consistently suffer from low sensitivity and throughput. These limitations have slowed medical applications such as biomarker discovery, where techniques including mass spectrometry and antibody arrays often lack the sensitivity and quantification accuracy needed to be effective.

We propose research in three broad areas. First, we propose a major effort to biochemically define the main human protein complexes, providing a mechanistic basis for interpreting diverse human genetics and diseases. We will focus primarily on evolutionarily conserved human proteins, because of their critical importance to cellular function, and will leverage studies in other species. Second, we are developing surrogate functional assays for deeply conserved human proteins by systematically humanizing yeast cells, replacing each essential yeast gene in turn with its human version. The resulting strains serve as new physical reagents for studying human genes in a simplified organismal context, opening up simple high-throughput assays of human gene function, of the impact of human genetic variation on gene function, of drug screening and repurposing, and of mechanisms of drug resistance. Finally, we aim to advance a new proteomics technology, single-molecule protein sequencing, which could potentially solve problems currently limiting the field through orders-of-magnitude improvements in sensitivity and throughput. Success in these aims will give new insights into basic human cell biology and biochemistry, laying the foundation for future attempts to intervene, chemically or genetically, with those macromolecules most critical to the functioning of cells.
Deeply conserved human genes are strongly linked to diverse diseases, from cancer to neurodegeneration to cardiovascular disease. We propose to study these critical human genes by (1) biochemically defining how the encoded proteins physically organize into multiprotein assemblies, to help us better understand a variety of human genetic diseases; (2) systematically replacing yeast genes with their human equivalents, and using the humanized yeast to measure experimentally the effects of human genetic variation and gene-drug interactions; and (3) advancing a new proteomics technology, single-molecule protein sequencing, which promises orders-of-magnitude improvements in sensitivity and throughput and would have broad applications across biology and medicine. Success of these efforts will provide new insights into basic human cell biology and biochemistry, laying the foundation for future attempts to intervene, chemically or genetically, with proteins critical to human cell health and disease.